Bee Posted December 24, 2019

The one thing I would like to be able to do is view PDFs on my A8. Is there such a program? When I search for this I get two kinds of results: a million PDFs about A8s, or nothing. Has this been done, or am I looking for something that has never been made? Thank you.
Gunstar Posted December 24, 2019

The A8 isn't capable of displaying everything that goes into most PDFs: most PDFs are larger than even the largest memory upgrades for the A8, to say nothing of the non-text visuals. So no, it doesn't exist. However, I'd like to see an app that strips just the text out for viewing on an A8. That would be cool. But since PDFs aren't saved as any simple form of text, but as graphic representations of text and images, I doubt such an app could be made for the A8 itself. The tool to do this would need to run on a PC and let you save the text as a standard .txt or .doc file. Then you could load the text into a viewer or an Atari word processor (probably something like The Last Word, since it can use up to 320K for text files). I've never tried it, but Adobe's software might have an option to export text only, and the resulting file might then be loaded into The Last Word.
Bee Posted December 24, 2019 (author)

I wonder if I can set this up as a script - https://online.pdfconvertertools.com/wim/static/wi/main.html?tp=wi&v=40.8&gnum=15&cid=8594&kw=pdf to text converter&gclid=CjwKCAiAi4fwBRBxEiwAEO8_HkHJCdQdGULpuMon9j0z_Y8A2YmvImqECtVRnqJLq-EbflYyua5dFhoCA2MQAvD_BwE&clickid=77624039990&cachecode=Q2icH4pzxMbJhYrvnvAQqg==&fcid=8404 - and then dump the result into a shared directory over SIO2PC, with a check for file size so it can be split into parts if it's too big. Thanks for the idea.
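The size-check-and-split step is easy to do on the PC side. Here is a minimal sketch; the 32 KB chunk size and the output file names are assumptions, not requirements of any particular A8 setup, so adjust them to whatever your SIO2PC configuration and viewer can actually handle:

```python
CHUNK_SIZE = 32 * 1024  # bytes per part (an assumed limit, not a hard rule)

def split_text(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Break extracted text into chunks, splitting only at line boundaries.

    A single line longer than chunk_size is kept whole rather than cut
    mid-line, so chunks can slightly exceed the limit in that edge case.
    """
    parts, current = [], b""
    for line in data.splitlines(keepends=True):
        if len(current) + len(line) > chunk_size and current:
            parts.append(current)
            current = b""
        current += line
    if current:
        parts.append(current)
    return parts

# Hypothetical usage: write PART01.TXT, PART02.TXT, ... into a shared directory.
# with open("manual.txt", "rb") as src:
#     for i, part in enumerate(split_text(src.read()), 1):
#         with open(f"shared/PART{i:02d}.TXT", "wb") as f:
#             f.write(part)
```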
tschak909 Posted December 24, 2019

I am currently putting together materials to rasterize a PDF stream for output to cloud printing from an ESP device. (I say the following as someone with experience writing PostScript by hand.) The more I dig into PDF, the more completely and utterly horrified I am that it is pushed as a long-term archival format. It commits so many cardinal sins that long-term archival formats should NEVER commit:

* Mixing of textual and binary data forms
* Direct output of internal object graphs
* No Rosetta stone for decoding the file data from the file itself; you literally have to understand all the implicit internal contexts of PDF parsers
* Mixing of device-independent and device-dependent forms in the same chunks of data

It's a clusterfuck.

-Thom
Fuji-Man Posted December 25, 2019

Push a text file to the TNFS server, then use enscript followed by ps2pdf?
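That's the reverse direction (A8 text out to PDF), and the two tools chain together directly: enscript turns plain text into PostScript, and Ghostscript's ps2pdf converts the result. A small dry-run helper, sketched below, just builds the two commands; the function name and the derived .ps intermediate file name are my own invention, not part of either tool:

```python
def build_pipeline(txt: str, pdf: str) -> list[list[str]]:
    """Return the two commands for text -> PostScript -> PDF.

    The intermediate PostScript file name is derived from the PDF name.
    Assumes enscript and Ghostscript's ps2pdf are installed on the PC side.
    """
    ps = pdf.rsplit(".", 1)[0] + ".ps"
    return [
        ["enscript", "-p", ps, txt],  # text -> PostScript
        ["ps2pdf", ps, pdf],          # PostScript -> PDF
    ]

# To actually run it (requires both tools installed):
# import subprocess
# for cmd in build_pipeline("notes.txt", "notes.pdf"):
#     subprocess.run(cmd, check=True)
```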
ZylonBane Posted December 25, 2019

14 hours ago, Gunstar said: "But since PDF's aren't saved as any form of text..."

Completely wrong. PDFs store textual content as text, displayed using an embedded font. That's why you can copy and paste text out of a PDF, and why you can edit one in Acrobat.
tschak909 Posted December 25, 2019

@ZylonBane seriously dude, you are an arse. He's referring to the fact that PDF files are not in an easily parseable format. This is something I'm knee-deep in at the moment, and I have spent a chunk of my professional career dealing with it (data transformation). PDF files are NOT textual. They are a mash of object graphs, some of them text, some of them binary. They are _VERY_ difficult to parse. I am speaking from actual experience.

-Thom
TGB1718 Posted December 25, 2019

17 hours ago, tschak909 said: "The more I dig into PDF, the more completely and utterly horrified I am that it is pushed as a long term archival format. [...]"

It's always the same when a "standard" gets chosen; it's not always the best that wins. Remember Betamax, VHS, and V2000? Of the three, VHS was the worst and it won; V2000 was the best, but was always in third place.
R0ger Posted December 26, 2019

Well, with PDF it's more of a historical issue. It started simple. It didn't stay simple. Anyway, I can't come up with a format less suitable for the A8 right now.
snicklin Posted December 26, 2019

And I thought the RTF format was bad...
kogden Posted December 26, 2019

PDF is like a mutant version of PostScript. It would be far too slow to build an interpreter that would be usable on a 6502. Add to that the fact that many people scanning books and magazines do it as an image and embed that in a PDF instead of OCRing the text. Try running CPEGview and looking at a JPG on your 8-bit; a PDF interpreter would make that look fast. The best bet is just to strip the text from the PDF, convert it to ATASCII, and send it on to the 8-bit. There are plenty of Linux CLI tools that could help there.
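For the extraction step, pdftotext from poppler-utils (e.g. `pdftotext book.pdf book.txt`) gets you plain text on the PC. The remaining step is the ATASCII conversion, sketched below. The one solid ATASCII fact here is the end-of-line byte (155, 0x9B, in place of ASCII LF); the rest of the mapping is deliberately simplified, since a few printable characters differ between the two sets and a faithful converter would need a full translation table:

```python
ATASCII_EOL = 0x9B  # ATASCII end-of-line byte, replaces ASCII LF (0x0A)

def ascii_to_atascii(text: str) -> bytes:
    """Convert plain ASCII text to ATASCII bytes (simplified).

    ATASCII matches ASCII for most printable characters, so this sketch
    only remaps the newline, passes through a printable range, and
    substitutes a space for anything else (tabs, control codes, etc.).
    """
    out = bytearray()
    for ch in text:
        if ch == "\n":
            out.append(ATASCII_EOL)
        elif 32 <= ord(ch) < 123:
            out.append(ord(ch))
        else:
            out.append(ord(" "))  # placeholder for untranslatable characters
    return bytes(out)
```

Piping pdftotext output through a function like this before dropping the file on a shared SIO2PC directory would cover the "send it on to the 8-bit" part.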
Bee Posted December 26, 2019 (author)

And now we come full circle: I work with people trying to import SVGs into design software. Often it won't work, because the file is not a real SVG but a bitmap in an SVG wrapper. I'm fine with the processing taking place off-CPU, as long as I can get the text content. However, I'm seeing that file size alone might be a limitation. I'm looking at a Pi Zero W as a helper CPU for my A8 anyway. Thanks.
ZylonBane Posted December 28, 2019

On 12/24/2019 at 11:13 PM, tschak909 said: "He's referring to the fact that PDF files are not in an easily parseable format."

Please, this is Gunstar we're talking about here. He meant no such thing. Look at the full sentence: "But since PDF's aren't saved as any form of text, but are graphic visual representations of text and images...". He obviously thinks text in PDFs is converted to bitmaps or vector outlines or something, discarding the original textual content.
_The Doctor__ Posted January 24, 2021

There is nothing portable about it... a complete misnomer...
tschak909 Posted January 24, 2021

I can speak to this, as I'm one of two people who have worked on the #FujiNet printing framework, which generates a PDF document from scratch (the other, more important individual being @jeffpiep).

PDF is an absolute nightmare to implement. It is as if the designers of PDF took all the good elements of PostScript, wiped their collective arses with it, and promptly flushed it down the toilet.

The biggest mark against PDF as an archival-quality format is the sheer amount of system-specific information encoded in the file to make it display and print correctly on different platforms. On average, 35% to 40% of a file encoded by Acrobat contains system-specific information, and much of the font selection and measurement bias data is encoded there, to say nothing of the system-specific entities embedded in what are supposed to be standardized sections.

THE EXTREME USAGE OF SYSTEM SPECIFIC DATA, AND THE MIXING OF SYSTEM SPECIFIC DATA INTO "PORTABLE" SECTIONS OF THE STANDARD, CATEGORICALLY DISQUALIFIES PDF AS A LONG TERM ARCHIVAL STORAGE FORMAT.

(And for those snarky enough to say "PostScript was worse!"... sigh. PostScript never claimed to be an archival document format. It was specifically a printer language built around a FORTH-style interpreter.)

-Thom