ppelleti Posted April 15, 2019
The "De Re Intellivision" documents are in an old MS-DOS character set (code page 437) and don't display well on modern systems. Here is a command that will convert a "De Re Intellivision" file to UTF-8 encoding, which should be viewable in most modern programs:
    iconv -f cp437 -t utf-8 dri_2.txt | tr '\020\021\036\037' '><^v' > dri_2-utf8.txt
(This should work on Linux and Mac OS X, and probably most UNIX systems. It may also work on Windows if you have Cygwin, MSYS, or WSL installed.)

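The command above handles one file at a time. As a rough sketch, assuming the files sit in the current directory and follow the dri_*.txt naming, the same pipeline could be wrapped in a small function and looped over every file. The function name cp437_to_utf8 is made up for this example and is not part of jzIntv or iconv.

```shell
# Convert one CP437 text file to UTF-8 on stdout, then replace the four
# arrow-head control bytes with ASCII stand-ins (same pipeline as above).
# The function name is hypothetical.
cp437_to_utf8() {
    iconv -f cp437 -t utf-8 "$1" | tr '\020\021\036\037' '><^v'
}

# Convert every dri_*.txt in the current directory, skipping files that
# are already converted output.
for f in dri_*.txt; do
    [ -e "$f" ] || continue                 # glob matched nothing
    case "$f" in *-utf8.txt) continue ;; esac
    cp437_to_utf8 "$f" > "${f%.txt}-utf8.txt"
done
```

Each dri_N.txt then gets a dri_N-utf8.txt next to it, and the originals are left untouched.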
+Lathe26 Posted April 16, 2019
Thanks, I can confirm this works in Cygwin on Windows. If JoeZ sees this post, it might be good to have the jzintv tools update its doc/De_Re_Intellivision folder with the dri_*-utf8.txt files.

intvnut Posted April 19, 2019
I'll ask William for his blessing. I might also consider upgrading to PDFs to ensure a more consistent experience for everyone. When I put those files in there nearly 20 years ago, PDFs were the bogeyman. That was last millennium, though.

Zendocon Posted July 31, 2019
I'll give this a try in Termux. Maybe I can do the same thing with GROM_CHA.txt, which used "Code Page 437" line-drawing characters as well.

Zendocon Posted July 31, 2019
I adapted your command to grom_cha.txt and it worked! Here it is for your UTF-8 viewing pleasure. Curiously, within Termux on my phone it looks fine, but Termux on my tablet wanted to display "***" as " \*", so I replaced all the asterisks with 'X'.
Attached file: grom_cha-utf8.txt

ppelleti Posted January 16, 2022
A slight update: My original command missed a character used in drawing the Centronics parallel port in dri_6.txt. So, my new recommended command is:
    iconv -f cp437 -t utf-8 dri_6.txt | tr '\020\021\036\037\026' '><^v-' > dri_6-utf8.txt

Zendocon Posted January 17, 2022
I used a simpler command. After "-t utf-8" I used "-f cp437", since I already knew how the line-drawing characters were created.

ppelleti Posted January 17, 2022
3 minutes ago, Zendocon said:
    I used a simpler command. After "-t utf-8" I used "-f cp437", since I already knew how the line-drawing characters were created.
Does it really matter which order the "-t" and "-f" arguments come in? Seems like it would be the same either way.

Zendocon Posted January 17, 2022
It shouldn't. -t is "to" and -f is "from".

ppelleti Posted January 17, 2022
4 minutes ago, Zendocon said:
    It shouldn't. -t is "to" and -f is "from".
Then I guess I'm not understanding your post. I'm using "iconv -f cp437 -t utf-8" and it sounded like you were using "iconv -t utf-8 -f cp437".

Zendocon Posted January 17, 2022
Whoops. Don't know how I missed that.

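For anyone who wants to check the option-order question from the exchange above on their own system, here is a small sketch. It feeds two CP437 box-drawing bytes (0xC4 and 0xCD, chosen arbitrarily for the example) through iconv with the options in each order and compares hex dumps of the results. The variable names are made up.

```shell
# Hex-dump the conversion of two CP437 box-drawing bytes with "-f" first
# and with "-t" first. Variable names are hypothetical.
from_first=$(printf '\304\315\n' | iconv -f cp437 -t utf-8 | od -An -tx1)
to_first=$(printf '\304\315\n' | iconv -t utf-8 -f cp437 | od -An -tx1)

if [ "$from_first" = "$to_first" ]; then
    echo "same output"
else
    echo "outputs differ"
fi
```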
Peripheral Posted January 17, 2022
While "iconv -f CP437 -t UTF-8" worked for most things, it seems to have missed a few characters. Well, at least under cygwin on Windows, can't speak for other platforms.
Below is a sed script (dri.sed) that replaces the special characters in the dri_*.txt files with their HTML character entities. With this, you can wrap a whole dri_*.txt file inside an HTML <pre> tag and view it in your browser using, for example:
    echo "<!DOCTYPE html>" > dri_1.htm
    echo "<html lang=\"en\">" >> dri_1.htm
    echo "<head><title>dri_1</title></head><body><pre>" >> dri_1.htm
    LC_ALL=C sed -f dri.sed dri_1.txt >> dri_1.htm
    echo "</pre></body></html>" >> dri_1.htm
----- Here is the dri.sed file -----
# Sed script to translate IBM Extended ascii characters in
# dri_*.txt files to HTML entities.
#
# The dri_*.txt encodings are *mostly* code page 437. However, with cygwin
# on Windows, "iconv -f CP437 -t UTF-8 dri*.txt" seems to ignore the
# following input characters, leaving them (incorrectly) unmodified:
#
#   0x10 -- right arrow head (0x10 is ctrl-P)
#   0x11 -- left arrow head  (0x11 is ctrl-Q)
#   0x1e -- up arrow head    (0x1e is ctrl-^)
#   0x1f -- down arrow head  (0x1f is ctrl-underscore)
#
# And iconv treats the following as CP437 symbols, while dri_1.txt
# seems to use them as the slanted accent symbol above an 'e' or 'a'.
#   0xe1 -- á
#   0xe9 -- é
#
# Strip the CR from CR/LF line endings.
s/[\x0d]//g
# Strip trailing white space.
s/ *$//
# '&', '<' and '>' need to be escaped in HTML.
s/&/\&amp;/g
s/</\&lt;/g
s/>/\&gt;/g
s/[\x10]/\&#9658;/g
s/[\x11]/\&#9668;/g
s/[\x16]/\&#9644;/g
s/[\x1e]/\&#9650;/g
s/[\x1f]/\&#9660;/g
s/[\xb3]/\&#9474;/g
s/[\xb4]/\&#9508;/g
s/[\xb6]/\&#9570;/g
s/[\xb9]/\&#9571;/g
s/[\xba]/\&#9553;/g
s/[\xbb]/\&#9559;/g
s/[\xbc]/\&#9565;/g
s/[\xbd]/\&#9564;/g
s/[\xbf]/\&#9488;/g
s/[\xc0]/\&#9492;/g
s/[\xc1]/\&#9524;/g
s/[\xc2]/\&#9516;/g
s/[\xc3]/\&#9500;/g
s/[\xc4]/\&#9472;/g
s/[\xc5]/\&#9532;/g
s/[\xc6]/\&#9566;/g
s/[\xc8]/\&#9562;/g
s/[\xc9]/\&#9556;/g
s/[\xca]/\&#9577;/g
s/[\xcc]/\&#9568;/g
s/[\xcd]/\&#9552;/g
s/[\xce]/\&#9580;/g
s/[\xcf]/\&#9575;/g
s/[\xd0]/\&#9576;/g
s/[\xd2]/\&#9573;/g
s/[\xd7]/\&#9579;/g
s/[\xda]/\&#9484;/g
s/[\xdb]/\&#9608;/g
s/[\xd8]/\&#9578;/g
s/[\xd9]/\&#9496;/g
s/[\xe1]/\&#225;/g
s/[\xe9]/\&#233;/g
s/[\xf0]/\&#8801;/g
s/[\xf8]/\&#248;/g
#
# The replacements below fix a handful of spelling errors/typos.
#
s/\<accesories\>/accessories/g
s/\<Accomodates\>/Accommodates/g
# When fixing "acknnowledge", we need to add a space near the end
# of the line to keep the enclosing diagram character well aligned.
s/\<acknnowledge\. Followed by IAD\. /acknowledge. Followed by IAD.  /g
s/\<adderess\>/address/g
s/\<addess\>/address/g
s/\<advertizing\>/advertising/g
s/\<abreviation\>/abbreviation/g
s/\<appropirate\>/appropriate/g
s/\<Avalable\>/Available/g
s/\<B-17 Bomer\>/B-17 Bomber/g
s/\<B- 17 Bomber\>/B-17 Bomber/g
s/\<begining\>/beginning/g
s/\<best-remembered Intellivision game\>/best-remembered Intellivision games/g
s/\<bidrectional\>/bidirectional/g
s/\<cartidges\>/cartridges/g
s/\<casettes\>/cassettes/g
s/\<charater\>/character/g
s/\<Christmass\>/Christmas/g
s/\<Comission\>/Commission/g
s/\<Commision\>/Commission/g
s/\<componenet\>/component/g
s/\<componenets\>/components/g
s/\<Componenet\>/Component/g
s/\<Compnents\>/Components/g
s/\<conditons\>/conditions/g
s/\<connecter\>/connector/g
s/\<conprises\>/comprises/g
s/\<conputer\>/computer/g
s/\<consistant\>/consistent/g
s/\<contolled\>/controlled/g
s/\<criple\>/cripple/g
s/\<deliberite\>/deliberate/g
s/\<eariler\>/earlier/g
s/\<eather\>/either/g
s/\<effecitve\>/effective/g
s/\<empolyees\>/employees/g
s/\<ENVIROMENT\>/ENVIRONMENT/g
s/\<equiped\>/equipped/g
s/\<everytime\>/every time/g
s/\<exsisting\>/existing/g
s/\<exstensively\>/extensively/g
s/\<extrodinary\>/extraordinary/g
s/\<facilties\>/facilities/g
s/\<follwoing\>/following/g
s/\<frequencey\>/frequency/g
s/\<Ginini\>/Gimini/g
s/\<hoplessly\>/hopelessly/g
s/\<hrizon\>/horizon/g
s/\<in conjection sith\>/in conjunction with/g
s/\<Intellivison\>/Intellivision/g
s/\<Intellivsion\>/Intellivision/g
s/\<irrelevent\>/irrelevant/g
s/\<medievel\>/medieval/g
s/\<modifyed\>/modified/g
s/\<moring\>/morning/g
s/\<muliplexed\>/multiplexed/g
s/\<playtesting\>/play testing/g
s/\<possiblity\>/possibility/g
s/\<Preceeds\>/Precedes/g
s/\<PRESETNT\>/PRESENT/g
s/\<programable\>/programmable/g
s/\<programmed Jay\>/programmed by Jay/g
s/\<refered\>/referred/g
s/\<reliablility\>/reliability/g
s/\<reluctatly\>/reluctantly/g
s/\<remander\>/remainder/g
s/\<Richocheting\>/Ricocheting/g
s/\<selcted\>/selected/g
s/\<sheilding\>/shielding/g
s/\<SIGNFIY\>/SIGNIFY/g
s/\<souces\>/sources/g
s/\<sucessful\>/successful/g
s/\<sucessfully\>/successfully/g
s/\<successfuly\>/successfully/g
s/\<synchroniztation\>/synchronization/g
s/\<synchronizzation\>/synchronization/g
s/\<thsi\>/this/g
s/\<Tempsest\>/Tempest/g
s/\<to rights to\>/the rights to/g
s/\<the intellivision\>/the Intellivision/g
s/\<visability\>/visibility/g
s/\<volitile\>/volatile/g

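The echo-and-sed sequence in the post above builds one HTML page by hand. As a sketch only, the same steps could be wrapped in a function and applied to every dri_*.txt file, assuming dri.sed and the text files are in the current directory. The function name dri_to_html is invented for this example.

```shell
# Wrap one text file in a minimal HTML page, running it through the
# given sed script on the way. The function name is hypothetical.
# Usage: dri_to_html SED_SCRIPT INPUT.txt > OUTPUT.htm
dri_to_html() {
    title=$(basename "$2" .txt)
    printf '%s\n' '<!DOCTYPE html>' '<html lang="en">' \
        "<head><title>$title</title></head><body><pre>"
    # LC_ALL=C makes sed treat the input as raw bytes, as in the post.
    LC_ALL=C sed -f "$1" "$2"
    printf '%s\n' '</pre></body></html>'
}

# Build dri_N.htm for every dri_N.txt in the current directory.
for f in dri_*.txt; do
    [ -e "$f" ] || continue                 # glob matched nothing
    dri_to_html dri.sed "$f" > "${f%.txt}.htm"
done
```

Taking the sed script as a parameter keeps the function usable with a trimmed-down script, for instance one without the spelling fixes.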
ppelleti Posted January 17, 2022 (edited)
1 hour ago, Peripheral said:
    While "iconv -f CP437 -t UTF-8" worked for most things, it seems to have missed a few characters. Well, at least under cygwin on Windows, can't speak for other platforms.
My experience, on Mac OS X and Linux, with the four dri_n.txt files that come with jzIntv, is that iconv successfully converts all of the CP437 characters that are above the ASCII range (i.e. those with the high bit set). In my experience, iconv does not convert the characters below the printable ASCII range (i.e. control characters), even though those are printable characters in CP437. This is why I pipe the output of iconv into tr:
    iconv -f cp437 -t utf-8 dri_6.txt | tr '\020\021\036\037\026' '><^v-' > dri_6-utf8.txt
The five characters in question are four arrow heads, plus "black rectangle". I replace these with ASCII characters, rather than Unicode characters, because the suggested Unicode replacements (at least the ones given on Wikipedia), namely ►, ◄, ▲, ▼, and ▬, are double-width characters, which would mess up the formatting. ASCII characters are single-width.
1 hour ago, Peripheral said:
    Below is a sed script (dri.sed) that replaces the special characters in the dri_*.txt files with their HTML character entities. With this, you can wrap a whole dri_*.txt file inside an HTML <pre> tag and view it in your browser using, for example:
That's cool! Personally, I prefer having a text file I can view in Emacs, or with "less" on the command line. But I can see how HTML would be a useful format for many people.
Edited January 17, 2022 by ppelleti: pluralize "file"

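Since the control characters are the ones that slip through, it can help to see which of them a given file actually contains before settling on the tr arguments. The following is a sketch under the assumption that tab, LF and CR are the only control bytes that should be ignored. The function name list_control_bytes is made up.

```shell
# Print each distinct control byte in a file (ignoring NUL, tab, LF and
# CR) as a three-digit octal value, preceded by how often it occurs.
# The function name is hypothetical.
list_control_bytes() {
    # Keep only bytes 0x01-0x08, 0x0b, 0x0c and 0x0e-0x1f.
    LC_ALL=C tr -cd '\001-\010\013\014\016-\037' < "$1" |
        od -An -v -t o1 |      # one octal number per byte
        tr -s ' ' '\n' |       # one number per line
        sed '/^$/d' |          # drop empty lines
        sort | uniq -c
}

# Example: list_control_bytes dri_6.txt
```

The octal values it prints can be used directly in a tr set, for example 026 as '\026'.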
Peripheral Posted January 20, 2022
I just realized that the sed translation
    s/[\xf8]/\&#248;/g
in my earlier post's dri.sed script is actually a no-op, as 248 decimal is 0xf8 hex. This is probably OK. But a perhaps more appropriate translation would be
    s/[\xf8]/\&oslash;/g
This gives an 'o' with a slash through it, representing the funky 'o' in Broderbund in dri_1.txt.