newcoleco Posted October 12, 2014 Share Posted October 12, 2014 (edited) For me, graphics in ColecoVision projects is what takes the most space and what can be compressed the most. So, I've been working on data compression from time to time but most seriously during the last month, trying to figure out the secrets of a good compressor and data format based on what others made to save memory space without the need of a too big decompressor and no extra (cpu) memory. My research resulted into 2 realisations : Run Length Encoding is fast but isn't good enough even if you encode it in a clever way, Huffman is good but you need a variation of LZ77 to really perform great compression and it's the base for the popular compressors in the homebrew scene and modern computers compressors including WinRAR, WinZip, GZip and 7Zip. I tried to do my own compression based on LZ77 with the option of an extra bits saving with a fixed Huffman. At first coding the compressor gave me headaches because a greedy LZ77 compression algorithm doesn't give the best compressed size. I went back to the source, reading papers about LZ77, and finaly I understood what I was doing wrong and code an algorithm for getting the best compression every time even if it means taking a long time to process. When the first results came out, I was shocked to discover that this simple change from greedy to a better strategy algorithm made my own compression format going from "as good as Pletter and ZX7" to "a bit better than Exomizer". So before yelling "Victory!", I've done more tests and I started to write a paper based on my research, showing not only what I've learned but also compression tests on real graphics. The following tables are in my paper, the size of the decompressors mentionned are from the ones I've encounter and do decompress directly into VRAM to respect the rule of no extra memory needed. 1st : decompression routine size 2nd : compressed data size of the Smurf Challenge title screen by Youki 3rd : compressed data size of the pixelart Unreal by Rion 4th : compressed data size of Lena (used in many computer science papers, original photo in Playboy) Run Length Encoding 43 1994 14814 14562 PuCrunch 297 1438 6780 9358 Bitbuster 173 1262 6588 9188 APLib (commercial) 289 1227 6556 9138 Pletter 241 1235 6476 9081 ZX7 128 1234 6474 9079 MegaLZ (commercial) 110 1243 6486 8960 Exomizer 250 1130 6236 8694 My own compression 200 1144 6136 8515 These 3 pictures in PC format (PATTERN 6144 bytes + COLOR 6144 bytes) : 3 sample pics in PC format.zip Victory? Not yet, I still have to code a lot including a tool using my own format. Please leave a comment Edited October 12, 2014 by newcoleco 9 Quote Link to comment Share on other sites More sharing options...
digress Posted October 12, 2014 Share Posted October 12, 2014 It clearly is beating the other compression methods. And even though the Exomizer beat it on one image it was overall not as good as your your new technqiue. It is amazing the quality of graphics the colecovision can do. Nice picture samples. Quote Link to comment Share on other sites More sharing options...
alekmaul Posted October 12, 2014 Share Posted October 12, 2014 I agree Daniel, graphics waste too much space in rom , really nice to see you again for such Coleco help for homebrewers. Is your own compression based on your Dan0 compression algorythm ? Can' t wait to see news regarding your new tool about that ! Quote Link to comment Share on other sites More sharing options...
newcoleco Posted October 12, 2014 Author Share Posted October 12, 2014 I agree Daniel, graphics waste too much space in rom , really nice to see you again for such Coleco help for homebrewers. Is your own compression based on your Dan0 compression algorythm ? Can' t wait to see news regarding your new tool about that ! Dan0 is a Run Length Encoding with a bit of my magic composed of a somewhat adaptive Huffman and the possibility of re-using part of another data table to reduce even more the number of bytes to be encoded. It's a bit complicated to explain and my tool based on this format isn't for everyone (no interface, you modify my tool coded in java and use jdk to compile and run it each time) but it works. Essentialy, all raw part of a RLE is a potential for saving more bytes by saying : "hey! I've seen this byte not long ago, let's just encode a reference to it in a few bits instead of encoding it raw". Since it's RLE based first, the decoding process is relatively fast, faster than all the lz77 based ones, which can be a big deal for some projects, Even if Dan0 isn't the best compression, it has a potential benifit for some projects, it depends of the needs. The compression I propose today is based on LZ77 with the option of a fixed Huffman that helps in some cases to get a great compression. Mixing LZ77 with Huffman isn't a new idea, it is used in many formats like Deflate in WinZip. Since my work is based on my ColecoVision projects, this custom compression do give great compression for me, I just assume it should works great for your projects too. Quote Link to comment Share on other sites More sharing options...
youki Posted October 13, 2014 Share Posted October 13, 2014 Very interresting!. It would be interresting you put the comparaison with your Dan0 in that list. I use Dan0 in all my projects since a while , and it does already some miracle for me. But if you put in your dev kit a better compression/decompression tool i will use it for sure. Curious that for Smurf Challenge , Exomizer does better than yours. Because on other the difference is really important. The rion Unreal image is a ZX Spectrum image no? I saw some very impressive images in some ZX Spectrum Demo , it is incridible what Artist manage to do with a so much constraint. Quote Link to comment Share on other sites More sharing options...
artrag Posted October 18, 2014 Share Posted October 18, 2014 (edited) Very good results, when do you plan to release compressor and decompressor ? They could be very handy also on msx. Just for comparison I've tried to compress the sample images using msx-o-mizer (based on exomizer 2.0 beta 6) and I had this: [youki] smurf challenge 1130 bytes [rion] unreal 6426 bytes Lena 8697 bytes Basically the results confirm that msx-o-mizer is derived by exomizer. Edited October 18, 2014 by artrag Quote Link to comment Share on other sites More sharing options...
newcoleco Posted October 27, 2014 Author Share Posted October 27, 2014 Update Error in the previous results. Oops! Corrected results suggest a compression ratio between MegaLZ and Exomizer, or similar to Exomizer ; it depends on the data to compress and the way you approach the case ( "naive or "clever" ). What I mean by "clever" is mostly because it needs human intervention (or maybe highly advanced algo with a lot of computations) to cut the dataset into subsets where the sum of the bytes for each compressed subset is smaller than the the whole dataset compressed as one set. With mode 2 pictures, groups are quite easy to detect and this particular case can be automated (duh!) : PATTERN and COLOR tables. Working on the final format so results may change again in my next post. No evaluation tool for the moment. Another thing. Most of the mode 2 pictures we do use in our projects are relatively simple graphics, with unused data. Many times we get values 00 and FF for the pattern which make one of the nibbles for the color code totaly meaningless (unused). There is also some cases where consecutive color codes may be just alternate version of the same 2 colors if the pattern codes where coded differently. This situation cause the graphic tool to generate arbritary color codes that are not used anywhere else in the picture that creates more variations of bytes possible for color than needed, making the picture harder to compress for no good reason. [youki] smurf challenge (naive) 1155 or (clever) 1136 bytes [rion] unreal (naive) 6329 or (clever) 6204 bytes Lena (naive) 8803 or (clever) 8601 bytes Quote Link to comment Share on other sites More sharing options...
artrag Posted October 30, 2014 Share Posted October 30, 2014 Humm... partitioning the input data in statistically homogeneous subsets would improve the compression rate also of the other compressors. In msx-o-mizer (close relative of exomizer) compressing colors and patterns separately does improve the results a lot. I suspect that if you segment input data in subsets statistically homogeneous and apply the other algorithms, their compression rate would improve outperforming the results of your proposed algorithm. Quote Link to comment Share on other sites More sharing options...
newcoleco Posted October 31, 2014 Author Share Posted October 31, 2014 Humm... partitioning the input data in statistically homogeneous subsets would improve the compression rate also of the other compressors. In msx-o-mizer (close relative of exomizer) compressing colors and patterns separately does improve the results a lot. I suspect that if you segment input data in subsets statistically homogeneous and apply the other algorithms, their compression rate would improve outperforming the results of your proposed algorithm. Well, my current tests shown : if cleverly segmented, yes exomizer do beat my proposed data compression in all the samples I've tried. With that said, I'm happy with the results. And a tool using this compression is coming next month. Quote Link to comment Share on other sites More sharing options...
artrag Posted October 31, 2014 Share Posted October 31, 2014 Consider also that exomizer needs a ram buffer of about 384 bytes IIRC, it is not trivial to fit it in the ram of a plain coleco. If your algorithm does not need ram buffers, it could be the best choice in many cases where the buffer for exomizer simply would not fit in ram Quote Link to comment Share on other sites More sharing options...
newcoleco Posted September 4, 2015 Author Share Posted September 4, 2015 My presentation during ADAMCon summer 2015 was about my work on data compression and my upcoming tool for it. For the moment, the research and development process is still going. When the first beta version of this new is done, I'll let it know here. Quote Link to comment Share on other sites More sharing options...
digress Posted September 7, 2015 Share Posted September 7, 2015 Cool, I'm using pletter for all the graphics. I've setup a batch file which will take the binary uncompressed data , pletter it, and then reoutput to hex data after being plettered. The one step I find still tedious is converting it into binary from the hex data which involves (cropping out the "0x", the "," and the "<spaces>". I do it with 3 mass find & replace searches which isn't too bad really. so ideally if it has the option to output uncompressed binary data which you can then compress with your own preferred method would be nice. I developed (out of necessity) a nice way to compress music. It's not really compressing it really but rather using predefined notes and a shorthand version of the song being played. It then plays the notes 1 at a time and won't play the next note til the delay of say 16 has passed. I noticed that when I analyzed my straight up music data the same notes & lengths would be used over and over again so I decided to just code the note 1 time then use a lookup table whenever that same note was played again. It cut the music data down to about 1/8th the original size due to the same note being used over & over with a custom playback routine that controls the lookup and timing. My presentation during ADAMCon summer 2015 was about my work on data compression and my upcoming tool for it. For the moment, the research and development process is still going. When the first beta version of this new is done, I'll let it know here. Quote Link to comment Share on other sites More sharing options...
Pixelboy Posted September 16, 2015 Share Posted September 16, 2015 Hey Daniel, how come I cannot send you a private message? Is your PM inbox full? I'm trying to contact you... Quote Link to comment Share on other sites More sharing options...
newcoleco Posted June 4, 2016 Author Share Posted June 4, 2016 Hey Daniel, how come I cannot send you a private message? Is your PM inbox full? I'm trying to contact you... Deleted 15 old messages in my AtariAge mailbox. You can send me messages. 1 Quote Link to comment Share on other sites More sharing options...
newcoleco Posted June 4, 2016 Author Share Posted June 4, 2016 Cool, I'm using pletter for all the graphics. I've setup a batch file which will take the binary uncompressed data , pletter it, and then reoutput to hex data after being plettered. The one step I find still tedious is converting it into binary from the hex data which involves (cropping out the "0x", the "," and the "<spaces>". I do it with 3 mass find & replace searches which isn't too bad really. so ideally if it has the option to output uncompressed binary data which you can then compress with your own preferred method would be nice. I developed (out of necessity) a nice way to compress music. It's not really compressing it really but rather using predefined notes and a shorthand version of the song being played. It then plays the notes 1 at a time and won't play the next note til the delay of say 16 has passed. I noticed that when I analyzed my straight up music data the same notes & lengths would be used over and over again so I decided to just code the note 1 time then use a lookup table whenever that same note was played again. It cut the music data down to about 1/8th the original size due to the same note being used over & over with a custom playback routine that controls the lookup and timing. Please tell me more about how to make things less tedious. And I would love to learn about your music format. Send me a message about it. 1 Quote Link to comment Share on other sites More sharing options...
newcoleco Posted June 4, 2016 Author Share Posted June 4, 2016 Consider also that exomizer needs a ram buffer of about 384 bytes IIRC, it is not trivial to fit it in the ram of a plain coleco. If your algorithm does not need ram buffers, it could be the best choice in many cases where the buffer for exomizer simply would not fit in ram I kept my focus on not spending more memory space and I'm quite confident about its efficiency. Expect a first BETA version in a few days. 1 Quote Link to comment Share on other sites More sharing options...
digress Posted June 5, 2016 Share Posted June 5, 2016 ok, PM sent. Please tell me more about how to make things less tedious. And I would love to learn about your music format. Send me a message about it. 1 Quote Link to comment Share on other sites More sharing options...
newcoleco Posted June 5, 2016 Author Share Posted June 5, 2016 ok, PM sent. Got it! Thanks Quote Link to comment Share on other sites More sharing options...
digress Posted June 6, 2016 Share Posted June 6, 2016 cool, So there is a predefiined note table for 1 voice melody only. It plays them based on a lookup data array so yto play a simple scale you could have a data array ={1,2,3,4,5,6,7}; rather than 0x40,0x57,0x63,0x04,0x42,0x57,0x83,0x04,0x24,0x11,0x50, //c3 8th note 0x40,0xFA,0x62,0x04,0x42,0xFA,0x82,0x04,0x24,0x11,0x50, //d3 8th note etc. really cuts down on the reuse becasue you'll use the same c3 note again and again. so to call c3 again you just use {1,1,1} or whatever the base or accompaniemnt is a timed beat that that is repeated. The biggest problem is the predefined notes array/sound array was limited to approx 62 sounds so I could only do it for 1 voice. as for the data fro the ICVGM which is an excellent program. We all appreciate it. If you ever desgin another similar it would be great if I could have an option to have the outputs optionally like so: Currently: byte NAME[] = { 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, }; I have to convert it to before I can compress it: 20202020202020202020202020202020 So maybe just as an extra option to have all the "," "spaces" & "0x" already taken out. I've gotten used to it so it's a nice to have. Perhaps even output to a window so it can just be cut & paste into the final data file. If you are going to build the DAN0 compression option right into it then perhaps it doesn't matter. Got it! Thanks Quote Link to comment Share on other sites More sharing options...
Kiwi Posted June 6, 2016 Share Posted June 6, 2016 You could use Notepad++ feature called, Search>Replace. Highlight the data table you want. Replace 0x with <nothing>. Then replace , with <nothing>. It should provide the table you want. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.