Cleaning and dusting off some old Atari scanned documents.


ldelsarte

I have a "net etiquette" and technical problem. Quite often, I find fantastic original Atari documents scanned and posted on archive.org.

Sadly, some of them are difficult to read because the ink has faded to a very pale shade or the document was xeroxed too many times (pages not straight, binder holes, black dots everywhere, etc.).
So, patiently, I extract all the pages with Adobe Acrobat (or other online tools). Then I try to "clean up" all the pages, one by one, with the assistance of GIMP and Paint.NET. I give them a new life with a clear white background and new, darker ink. Finally, I recreate a clean .PDF.
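
For anyone curious what the "white background, darker ink" step amounts to, here is a minimal sketch of a levels adjustment, roughly the kind of operation a Levels or Curves tool performs. It is an illustration under stated assumptions, not the exact procedure used on these documents: the page is assumed to be 8-bit grayscale held as plain lists of integers, and the function name `levels` and the default `black_point` / `white_point` values are hypothetical and would need tuning per scan.

```python
def levels(pixels, black_point=90, white_point=200):
    """Stretch 8-bit grayscale values: pale ink goes darker, grey paper goes white.

    pixels is a list of rows, each row a list of ints in 0..255.
    black_point and white_point are illustrative defaults (assumptions).
    """
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    span = white_point - black_point
    cleaned = []
    for row in pixels:
        new_row = []
        for value in row:
            # Clamp to the chosen range, then rescale that range to 0..255.
            clamped = min(max(value, black_point), white_point)
            new_row.append(round((clamped - black_point) * 255 / span))
        cleaned.append(new_row)
    return cleaned


# A tiny made-up "page": greyish paper (200+) and faded ink (around 95..120).
page = [[210, 205, 120],
        [95, 230, 180]]
cleaned = levels(page)
```

Everything at or above `white_point` becomes pure white and everything at or below `black_point` becomes pure black, which is why faint specks and a greyish background tend to disappear while the text gains contrast.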

 

The trouble is, I don't know how to contact the original uploaders on archive.org to offer my "cleaner" version for them to publish. I don't want to publish these documents myself: that would be really disrespectful to the original uploader. I'm very grateful for all these documents, and I don't want to offend anyone, but some of these documents really are easier to read when "reworked" a little bit. Any ideas or suggestions?

 

Thank you.

 

To illustrate my point:
Original document "Atari 600XL 1983-07-01 Product Status Meeting Handout" --> https://archive.org/details/AtariA600XLProductStatusMeetingHandout A great document I really enjoyed reading!

Enclosed: My "reworked" version as well as other documents, that I also "reworked".

 

Atari-600XL-1983-07-01-Product-Status-Meeting-Handout-(darker, easier to read).pdf

Atari 810 Disk Peripheral Device Description (darker, easier to read).pdf

Atari Disk Data Structures Tutorial (darker, easier to read).pdf

Atari LOGO A proposed plan (Nov 10, 1982) (darker, easier to read).pdf

Atari Disk File Manager Functional Description (darker, easier to read).pdf

Atari Speech Handler External Reference Specification (darker, easier to read).pdf

John Starkweather about PILOT (Date 23 Nov 1981) (darker, easier to read).pdf

Atari Colleen-Candy RAM Memory Map (Date 07-03-1979, Rev. A) (darker, easier to read).pdf


I'd assume that everyone contributing to the Internet Archive does it to preserve stuff, so if you made it better and easier to use, I can't imagine they'd be put off. You can still add a note to the metadata that you're not the original uploader, to placate your conscience.


Hi @ldelsarte

 

It's fine with me if you clean up and post documents. I'd recommend including links to the original scans, in case someone wants to see something closer to what the original version looks like.

 

Thanks for helping make these old docs more readable.

-Kevin


Just a general question to those experienced in scanning in documents for preservation...

 

I'm working on scanning in all of my Vantari User Group newsletters, but before I post them publicly I'd like to OCR them as best as possible, and ideally with human verification of 'low confidence' words to ensure the best searchability, rather than relying solely on the automatic guesses.
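
To make the "human verification of low-confidence words" idea concrete, here is a minimal sketch of the triage step. It assumes the OCR engine can export a per-word confidence score (Tesseract, for one, can); the function name `split_by_confidence`, the 0..100 score scale, the threshold of 80, and the sample words are all hypothetical.

```python
def split_by_confidence(words, threshold=80):
    """Separate OCR words into accepted words and words needing human review.

    words is a list of (text, confidence) pairs, with confidence assumed
    to be on a 0..100 scale. threshold is an illustrative cut-off.
    """
    accepted, needs_review = [], []
    for text, confidence in words:
        if confidence >= threshold:
            accepted.append((text, confidence))
        else:
            needs_review.append((text, confidence))
    return accepted, needs_review


# Made-up sample resembling a dot-matrix newsletter line.
sample = [("Vantari", 62), ("User", 95), ("Group", 91), ("Newsl3tter", 41)]
accepted, needs_review = split_by_confidence(sample)
for text, confidence in needs_review:
    print(f"check: {text!r} (confidence {confidence})")
```

Only the words in `needs_review` would go to a person, so the amount of manual checking scales with the quality of the scan rather than with the length of the document.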

 

The older documents that were printed on dot matrix printers are especially difficult for OCR, and a very high percentage of the words require corrections.

 

I've so far been using the "Recognize Text" function of Adobe Acrobat, but the interface seems really kludgy. I can't tell it which areas of the page not to recognize, and I can't mark a flagged uncertainty as "not text": I either have to delete the text and press accept, or switch to 'review recognized text' so that I can click on a different word. It would be nice to have a 'skip word' type option...

 

It also seems that even if I go through this effort, when I upload to the Internet Archive, they run their own OCR and throw away the corrections already in the document...

 

Are there better OCR workflows?
