A8 PDF viewer?


Bee

So the one thing I would like to be able to do is view PDFs on my A8.  Is there such a program?  When I search for this I get two kinds of results: a million PDFs about A8s, or nothing at all.  Has this been done, or am I looking for something that has never been made?

 

Thank you


The A8 isn't capable of displaying everything that goes into most PDFs: most PDFs are larger than even the largest memory upgrades for the A8, and then there are all the non-text visuals. So no such viewer exists. However, I'd like to see an A8 app that can strip just the text out to view on an A8. That would be cool.

 

But since PDFs aren't saved as any form of text, but are graphic visual representations of text and images, I doubt an app could be made for the A8 to strip just the text out. The app to do this would need to run on a PC and allow one to save the text as a standard .txt or .doc file. Then you could load the text into a viewer or an Atari word processor (probably something like The Last Word, since it can use up to 320K for text files).

 

I've never tried it, but Adobe's software might have an option to convert a PDF to text only; the resulting text file could then be loaded into The Last Word.
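
As a sketch of that PC-side extraction step (an illustrative assumption, not an existing tool from this thread: it uses Python with the pypdf library, and it only works when the PDF actually carries a text layer rather than scanned page images):

```python
# Minimal sketch: pull the text layer out of a PDF into a .txt file.
# Assumes the third-party pypdf library (pip install pypdf).
from pypdf import PdfReader

def pdf_to_txt(pdf_path: str, txt_path: str) -> None:
    reader = PdfReader(pdf_path)
    with open(txt_path, "w", encoding="utf-8") as out:
        for page in reader.pages:
            # extract_text() yields an empty result for pages
            # that have no text layer (e.g. scanned images)
            out.write((page.extract_text() or "") + "\n")

pdf_to_txt("manual.pdf", "manual.txt")  # "manual.pdf" is a placeholder name
```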

Edited by Gunstar

 I wonder if I can set this up as a script -

 

https://online.pdfconvertertools.com/wim/static/wi/main.html?tp=wi&v=40.8&gnum=15&cid=8594&kw=pdf to text converter&gclid=CjwKCAiAi4fwBRBxEiwAEO8_HkHJCdQdGULpuMon9j0z_Y8A2YmvImqECtVRnqJLq-EbflYyua5dFhoCA2MQAvD_BwE&clickid=77624039990&cachecode=Q2icH4pzxMbJhYrvnvAQqg==&fcid=8404

 

and then dump the result into a shared directory over SIO2PC, with a check for file size, splitting it into parts if it's too big.
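
A minimal sketch of that size check and split, in Python; the 64K per-part limit and the part naming are assumptions, not anything SIO2PC requires:

```python
# Sketch: stage a converted text file into the shared directory,
# splitting it into numbered parts if it exceeds an assumed limit.
import os

MAX_BYTES = 64 * 1024  # assumed per-part limit; tune for your setup

def stage_for_a8(txt_path: str, shared_dir: str) -> None:
    base = os.path.basename(txt_path)
    with open(txt_path, "rb") as src:
        if os.path.getsize(txt_path) <= MAX_BYTES:
            # small enough: copy it over whole
            with open(os.path.join(shared_dir, base), "wb") as dst:
                dst.write(src.read())
            return
        part = 1
        while chunk := src.read(MAX_BYTES):
            # too big: write manual.txt.001, manual.txt.002, ...
            with open(os.path.join(shared_dir, f"{base}.{part:03d}"), "wb") as dst:
                dst.write(chunk)
            part += 1
```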

 

Thanks for the idea.

 


I am currently putting together materials to rasterize a PDF stream for cloud printing from an ESP device.

 

(I say the following as someone with experience writing PostScript by hand:)

The more I dig into PDF, the more completely and utterly horrified I am that it is pushed as a long-term archival format. It commits so many of the cardinal sins that a long-term archival format should NEVER commit:

 

* Mixing of textual and binary data forms

* Direct output of internal object graphs

* No Rosetta stone for decoding the file data from the file itself; you literally have to understand all the implicit internal contexts of PDF parsers

* Mixing of device-independent and device-dependent forms in the same chunks of data

 

It's a clusterfuck.
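
To make the first sin concrete, here is a small sketch (Python, with "sample.pdf" as a placeholder) that dumps the opening bytes of a PDF: the object and dictionary syntax comes out as readable ASCII, while the stream payloads between them are raw, usually compressed, binary:

```python
# Sketch: show the text/binary mix inside a raw PDF file.
with open("sample.pdf", "rb") as f:
    head = f.read(2048)

for line in head.split(b"\n"):
    try:
        # object headers like "5 0 obj" and dictionaries decode cleanly
        print("TEXT   |", line.decode("ascii"))
    except UnicodeDecodeError:
        # stream bodies (typically FlateDecode-compressed) do not
        print("BINARY |", line[:16].hex(), "...")
```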

 

-Thom


@ZylonBane seriously dude, you are an arse.

 

He's referring to the fact that PDF files are not in an easily parseable format, something I'm knee-deep in at the moment and have spent a chunk of my professional career dealing with (data transformation).

 

PDF files are NOT textual. They are a mash of object graphs, some of them text, some of them binary. They are _VERY_ difficult to parse.

 

I am speaking from actual experience.

 

-Thom

Edited by tschak909

17 hours ago, tschak909 said:

I am currently putting together materials to rasterize a PDF stream for cloud printing from an ESP device.

 

(I say the following as someone with experience writing PostScript by hand:)

The more I dig into PDF, the more completely and utterly horrified I am that it is pushed as a long-term archival format. It commits so many of the cardinal sins that a long-term archival format should NEVER commit:

 

* Mixing of textual and binary data forms

* Direct output of internal object graphs

* No Rosetta stone for decoding the file data from the file itself; you literally have to understand all the implicit internal contexts of PDF parsers

* Mixing of device-independent and device-dependent forms in the same chunks of data

 

It's a clusterfuck.

 

-Thom

 

It's always the same when a "standard" gets chosen; it's not always the best that wins. Remember Betamax, VHS, and V2000?

Of the three, VHS was the worst and it won; V2000 was the best, but it was always in third place.


PDF is like a mutant version of PostScript.  An interpreter would be so far beyond slow as to be unusable on a 6502.  Add to that the fact that many people scanning books and magazines scan the pages as images and embed them in a PDF instead of OCRing the text.  Try running CPEGview and looking at a JPG on your 8-bit; a PDF interpreter would make that look fast.

 

Best bet is just to strip the text from the PDF, convert it to ATASCII, and send it on to the 8-bit.  There are plenty of Linux CLI tools that could help there.
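
For the conversion step, a minimal sketch, assuming plain ASCII input; ATASCII matches printable ASCII for most characters but uses $9B as its end-of-line byte:

```python
# Sketch: convert a plain ASCII text file to ATASCII for the A8.
def ascii_to_atascii(src_path: str, dst_path: str) -> None:
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        data = src.read().replace(b"\r\n", b"\n")
        for raw in data.split(b"\n"):
            # codes $20-$7A match ASCII; anything else becomes a space
            line = bytes(b if 0x20 <= b <= 0x7A else 0x20 for b in raw)
            dst.write(line + b"\x9b")  # $9B is the ATASCII EOL
```

On the Linux side, pdftotext from poppler-utils is one of the CLI tools that can produce the plain-text input for this.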


And we come full circle now: I work with people trying to import SVGs into design software.  Often it will not work, because the file is not a real SVG but a bitmap in an SVG wrapper.

 

I'm fine with the processing taking place off-CPU, as long as I can get the text content.  However, I'm seeing limitations that might be a problem in file size alone.

 

I'm looking at a Pi Zero W as a helper CPU for my A8 anyway.

 

Thx

 

 

Edited by Bee

On 12/24/2019 at 11:13 PM, tschak909 said:

He's referring to the fact that PDF files are not in an easily parseable format.

Please, this is Gunstar we're talking about here. He meant no such thing. Look at the full sentence: "But since PDFs aren't saved as any form of text, but are graphic visual representations of text and images...".

 

He obviously thinks text in PDFs is converted to bitmaps or vector outlines or something, discarding the original textual content.


  • 1 year later...

I can speak to this, as I'm one of two people who have worked on the #FujiNet printing framework, which generates a PDF document from scratch (the other, more important individual being @jeffpiep).
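
For a sense of what "from scratch" means here, below is a minimal, hypothetical sketch (not the actual FujiNet code) of hand-building a one-page PDF in Python. The xref table must record the exact byte offset of every object, which is one reason generating or parsing PDF by hand is so unforgiving:

```python
# Sketch: hand-build a minimal one-page PDF. Every object is numbered,
# indexed by absolute byte offset in the xref table, and tied together
# by the trailer; one wrong offset makes the file unreadable.
def minimal_pdf(text: str) -> bytes:
    content = b"BT /F1 12 Tf 72 720 Td (" + text.encode("ascii") + b") Tj ET"
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n%s\nendstream" % (len(content), content),
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]
    out, offsets = bytearray(b"%PDF-1.4\n"), []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # absolute byte offset of this object
        out += b"%d 0 obj\n%s\nendobj\n" % (num, body)
    xref = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(objects) + 1)
    out += b"".join(b"%010d 00000 n \n" % off for off in offsets)
    out += (b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
            % (len(objects) + 1, xref))
    return bytes(out)

# Text containing ( ) or \ would need escaping; omitted in this sketch.
open("hello.pdf", "wb").write(minimal_pdf("Hello from the sketch"))
```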

 

PDF is an absolute nightmare to implement. It is as if the designers of PDF took all the good elements of PostScript, wiped their collective arses with it, and promptly flushed it down the toilet.

 

The biggest mark against PDF being an archival-quality format is the sheer amount of system-specific information encoded in the file to make it display and print correctly on different platforms. On average, 35% to 40% of a file encoded by Acrobat is system-specific information, and much of the font selection and measurement bias data is encoded there, to say nothing of system-specific entities encoded in what are supposed to be standardized sections. THE EXTREME USAGE OF SYSTEM-SPECIFIC DATA, AND THE MIXING OF SYSTEM-SPECIFIC DATA INTO "PORTABLE" SECTIONS OF THE STANDARD, CATEGORICALLY DISQUALIFIES PDF AS A LONG-TERM ARCHIVAL STORAGE FORMAT.

 

(And for those snarky enough to say "PostScript was worse!"... sigh. PostScript never claimed to be an archival document format. It was specifically a printer language built around a FORTH-like interpreter.)

 

-Thom

Edited by tschak909
