Bee

A8 PDF viewer ?


So the one thing I would like to be able to do is view PDFs on my A8. Is there such a program? When searching for this I get two kinds of results: a million PDFs about A8s, or nothing at all. Has this been done, or am I looking for something that has not been made?

 

Thank you


The A8 isn't capable of displaying everything that goes into most PDF's (most PDF's are larger than even the largest memory upgrades for the A8, and then there are all the non-text visuals). So it doesn't exist. However, I'd like to see an A8 app that can strip just the text out to view on an A8. That would be cool.

 

But since PDF's aren't saved as any form of text, but are graphic visual representations of text and images, I doubt an app could be made for the A8 to strip just the text out. The app to do this would need to run on a PC and let you save the text as a standard .txt or .doc file. Then you could load the text into a viewer or Atari word processor (probably something like The Last Word, since it can use up to 320K for text files).

 

I've never tried, but Adobe's software might have an option to change it to text-only, then the text file might be loaded into The Last Word.

Edited by Gunstar


 I wonder if I can set this up as a script -

 

https://online.pdfconvertertools.com/wim/static/wi/main.html?tp=wi&v=40.8&gnum=15&cid=8594&kw=pdf to text converter&gclid=CjwKCAiAi4fwBRBxEiwAEO8_HkHJCdQdGULpuMon9j0z_Y8A2YmvImqECtVRnqJLq-EbflYyua5dFhoCA2MQAvD_BwE&clickid=77624039990&cachecode=Q2icH4pzxMbJhYrvnvAQqg==&fcid=8404

 

and then dump it into a shared directory over SIO2PC with a check for file size and part it out if it's too big.
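A rough sketch of the "check file size and part it out" step, in Python on the PC side. The 16 KB limit, the PARTnn.TXT naming and the idea of a plain output directory are assumptions for illustration, not anything SIO2PC requires, and the PDF-to-text conversion itself is not shown.

```python
from pathlib import Path


def split_text(data: bytes, max_bytes: int) -> list[bytes]:
    """Split data into chunks of at most max_bytes, breaking at line ends when possible."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + max_bytes, len(data))
        if end < len(data):
            # Prefer to cut just after the last newline inside this window.
            newline = data.rfind(b"\n", start, end)
            if newline != -1:
                end = newline + 1
        chunks.append(data[start:end])
        start = end
    return chunks


def write_parts(data: bytes, out_dir: Path, max_bytes: int = 16 * 1024) -> list[Path]:
    """Write data as PART01.TXT, PART02.TXT, ... into out_dir (hypothetical naming)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, chunk in enumerate(split_text(data, max_bytes), start=1):
        path = out_dir / f"PART{number:02d}.TXT"
        path.write_bytes(chunk)
        paths.append(path)
    return paths


if __name__ == "__main__":
    sample = b"line one\nline two\nline three\n"
    for part in split_text(sample, 12):
        print(len(part), part)
```

The 8.3-style names are only there to stay friendly to Atari DOS; a file that already fits under the limit simply comes out as a single part.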

 

Thanks for the idea.

 


I am currently putting together materials to rasterize a PDF stream for output for cloud printing from an ESP device.

 

(I say the following coming from an experience with writing PostScript by hand:)

The more I dig into PDF, the more completely and utterly horrified I am that it is pushed as a long-term archival format. It commits so many cardinal sins that long-term archival formats should NEVER commit:

 

* Mixing of textual and binary data forms

* Direct output of internal object graphs

* No Rosetta stone for decoding the file data from the file itself; you literally have to understand any and all implicit internal contexts of PDF parsers.

* Mixing of device-independent and device-dependent forms in the same chunks of data

 

It's a clusterfuck.

 

-Thom


14 hours ago, Gunstar said:

But since PDF's aren't saved as any form of text...

Completely wrong. PDFs store textual content as text, displayed using an embedded font. That's why you can copy-paste text out of a PDF, and edit it in Acrobat.
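To make that concrete, here is a minimal sketch in Python of pulling strings out of a page content stream. The stream below is hand-written for illustration, not taken from a real file. The sketch only handles the simplest case, a literal string followed by the Tj (show text) operator, and ignores TJ arrays, hex strings, escape sequences and font encodings, which is where real PDFs get painful.

```python
import re
import zlib

# Hypothetical, hand-written content stream: select a font, move, show a string.
RAW_STREAM = b"BT /F1 12 Tf 72 720 Td (Hello A8) Tj ET"


def show_text_strings(stream: bytes) -> list[str]:
    """Return the literal strings passed to the Tj operator in a content stream."""
    # Streams in real files are usually Flate-compressed; inflate if possible,
    # otherwise assume the bytes are already uncompressed.
    try:
        stream = zlib.decompress(stream)
    except zlib.error:
        pass
    # Simplest case only: "(text) Tj" with no nested parentheses or escapes.
    matches = re.findall(rb"\(([^()\\]*)\)\s*Tj", stream)
    return [m.decode("latin-1") for m in matches]


if __name__ == "__main__":
    print(show_text_strings(RAW_STREAM))
    print(show_text_strings(zlib.compress(RAW_STREAM)))
```

So the text is in there as text, but it sits inside (usually compressed) streams and is addressed through fonts, which is why getting it out reliably takes a real parser rather than a regex.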


@ZylonBane seriously dude, you are an arse.

 

He's referring to the fact that PDF files are not in an easily parseable format. That's something I'm knee-deep in at the moment, and I have spent a chunk of my professional career dealing with it (data transformation).

 

PDF files are NOT textual. They are a mash of object graphs, some of them text, some of them binary. They are _VERY_ difficult to parse.

 

I am speaking from actual experience.

 

-Thom

Edited by tschak909

17 hours ago, tschak909 said:

The more I dig into PDF, the more completely and utterly horrified I am that it is pushed as a long term archival format...


It's always the same when a "standard" gets chosen: it's not always the best that wins. Remember Betamax, VHS and V2000?

Of the three, VHS was the worst and it won; V2000 was the best, but was always in third place.


Well, with PDF it's more of a historical issue. It started simple. It didn't stay simple.

Anyway I can't come up with any format less suitable for A8 right now :-D

Edited by R0ger


PDF is like a mutant version of PostScript. An interpreter for it on a 6502 would be far too slow to be usable. On top of that, many people scanning books and magazines save each page as an image embedded in a PDF instead of OCRing the text. Try running CPEGview and looking at a JPG on your 8-bit. A PDF interpreter would make that look fast.

 

Best bet is just to strip the text from the PDF, convert it to ATASCII and send it on to the 8-bit. There are plenty of Linux CLI tools that could help there.
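The ATASCII conversion step might look something like this in Python. It is a sketch under a few assumptions: the text has already been extracted on the PC, ATASCII uses 0x9B (155) as its end-of-line character, and only characters that (as far as I know) share the same code in ASCII and ATASCII are passed through. Everything else, including tabs and accented characters, is replaced with "?" rather than mapped.

```python
ATASCII_EOL = 0x9B  # Atari end-of-line character (155)

# Conservative pass-through set: space through underscore, a-z, and "|".
# Codes such as 0x60 and 0x7B-0x7F are graphics or control characters in
# ATASCII, so they are deliberately left out.
SAFE_CODES = set(range(0x20, 0x60)) | set(range(0x61, 0x7B)) | {0x7C}


def to_atascii(text: str) -> bytes:
    """Convert text to ATASCII bytes, replacing anything unsafe with '?'."""
    out = bytearray()
    # Normalize Windows and old Mac line endings before converting.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    for ch in normalized:
        code = ord(ch)
        if ch == "\n":
            out.append(ATASCII_EOL)
        elif code in SAFE_CODES:
            out.append(code)
        else:
            out.append(ord("?"))
    return bytes(out)


if __name__ == "__main__":
    print(to_atascii("Hello A8\r\nPrice: ~5\n"))
```

A friendlier version could map curly quotes and dashes to plain equivalents instead of "?", since text pulled from PDFs tends to be full of them.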


And we go full circle now - I work with people trying to import SVGs into design software. Often it will not work. This is because the file is not a real SVG but a bitmap in an SVG wrapper.

 

I'm fine with the processing taking place off-CPU, as long as I can get the text content. However, I can see that file size alone might be a problem.

 

I'm looking at a Pi Zero W as a helper CPU for my A8 anyway.

 

Thx

 

 

Edited by Bee

On 12/24/2019 at 11:13 PM, tschak909 said:

He's referring to the fact that PDF files are not in an easily parseable format.

Please, this is Gunstar we're talking about here. He meant no such thing. Look at the full sentence: "But since PDF's aren't saved as any form of text, but are graphic visual representations of text and images...".

 

He obviously thinks text in PDFs is converted to bitmaps or vector outlines or something, discarding the original textual content.
