Harry Potter Posted April 2, 2021

I've been using the LZS technique to compress data for a program I'm writing. It seems fast enough. I'm also working on my own compression technique for a package. The latter uses LZ77, adaptive Huffman codes, two alternate methods of compressing literals, a method of shortening repeated LZ77 blocks, and a method of writing data whose range is not a power of 2. It's very large and slow, and I could use a little help with that. You can search Wikipedia for "Stac compression" for information on the LZS (Lempel-Ziv-Stac) technique. As for my technique, I could use a little advice.
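For readers following along, here is a minimal greedy LZ77 round trip in Python. It is only an illustrative sketch: the token layout is invented for clarity and is not Harry Potter's format, LZS, or any real bitstream.

```python
# Illustrative sketch of the LZ77 core: emit ('match', offset, length)
# when a repeat is found in the sliding window, else ('lit', byte).
# Real formats pack these tokens into bits far more tightly.

def lz77_compress(data, window=4096, min_match=3, max_match=255):
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Naive O(n^2) search; real compressors use hash chains here.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            tokens.append(('match', best_off, best_len))
            i += best_len
        else:
            tokens.append(('lit', data[i]))
            i += 1
    return tokens

def lz77_decompress(tokens):
    out = bytearray()
    for t in tokens:
        if t[0] == 'lit':
            out.append(t[1])
        else:
            _, off, length = t
            for _ in range(length):      # byte-by-byte copy allows overlap
                out.append(out[-off])
    return bytes(out)
```

The byte-by-byte copy in the decompressor matters: a match may overlap its own output (offset 1, length 100 encodes a run), which a block copy would get wrong.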
elmer Posted April 2, 2021

16 hours ago, Harry Potter said: I've been using the LZS technique to compress for a program I'm writing. [...]

The core idea of LZS is to first Adaptive-Huffman code the lengths and offsets in hundreds of test files, and then come up with a Static-Huffman coding scheme that averages out the results from all of those files, so that you can avoid the time-consuming Adaptive-Huffman step. That is basically where most of the different LZ-style compressors came from in the late 80s and early 90s; they just used different coding schemes. When you do the testing, you find that short matches, or short runs of unmatched literals, are far more common than long runs. You also find that match offsets are more commonly close to the current data than far away from it. This leads to the kind of encoding schemes that you commonly see today (for 8-bit and 16-bit micros). While I encourage you to have fun experimenting with creating your own format, because you will learn a lot, you will find that you'll have a very hard time beating LZSA1 and LZSA2, which are pretty optimized versions of that scheme. Better compression seems to need a slightly more adaptive solution, and Huffman is one way of doing that, but it has decompression-speed and code-size penalties that usually aren't worth the trouble on old computers.
Elias gamma coding is another scheme that does seem to offer worthwhile benefits, as shown by the aPLib and ZX0 compressors. Improvements beyond those tend to involve some really clever, but slow, techniques that just aren't well suited to running on older hardware. If you are writing your compressor to run on the Atari itself, then some of the compression speedup techniques just won't be available to you because of the limited memory size. If you are writing your compressor to run on a modern PC, then I suggest that you take a look at Jørgen Ibsen's BriefLZ compressor, and note its use of hash chains for finding match repeats, which seems to be the first step in speeding up a simple LZ-style compressor.
elmer Posted April 4, 2021 (edited)

On 3/29/2021 at 1:57 PM, elmer said: I've now written, and experimented with optimizing, a decompressor for ZX0 ... and it's an interesting format.

After taking a look at the ZX0 decompressor in the bitfire project, I decided to do some more optimization work on my ZX0 decompressor, and to optimize the code for the fact that match and literal lengths are almost always less than 256 bytes. The result is an approximately 9% improvement in decompression speed, at a cost of 4 extra bytes of code, so the decompressor is now 196 bytes long. Tobias's bitfire decompressor is between 0% and 3% faster to decompress, depending upon the data file, but it is 15% (30 bytes) longer, so I'm happy with the tradeoff. I have applied the same optimization to my LZSA1 and LZSA2 decompressors, so while the ZX0 decompressor is better than before, it is still 25% slower than LZSA2.

zx0_6502.mad

Edited November 6, 2021 by elmer
Harry Potter Posted April 15, 2021

I am interested in LZSA2. How well does it do compared to Deflate? Where online can I find it? I tried Wikipedia and didn't see it.
xxl Posted April 15, 2021

7 minutes ago, Harry Potter said: I am interested in LZSA2. How well does it do over Deflate?

https://github.com/emmanuel-marty/lzsa/blob/master/pareto_graph.png
https://atariage.com/forums/topic/316628-most-efficient-compression-for-atari/?tab=comments#comment-4741617
elmer Posted November 6, 2021 (edited)

On 4/4/2021 at 1:47 PM, elmer said: I have applied the same optimization to my LZSA1 and LZSA2 decompressors, so while the ZX0 decompressor is better than before, it is still 25% slower than LZSA2.

I have tried to improve my ZX0 decompressor (not DX0, as I was incorrectly saying earlier) a few times over the previous months, but while I can make some small improvements to the speed, the cost in terms of the size increase is never worth it (IMHO). The same happens with the "new" changes to ZX0 (targeted at the Z80 CPU): the 6502 decompressor is both slower and longer if that "new" format is used, so I recommend folks just use the option to write out the "old" or "classic" ZX0 format, and use the assembly-language code that I posted earlier to decompress it. The REALLY BIG news is that Emmanuel Marty released his own much, much, much faster ZX0 compressor a few days ago. This should now (hopefully) make ZX0 a good replacement for general-purpose development, and it will probably relegate LZSA2 to the trash-bin, along with so many other great-in-their-time ideas. You can find Emmanuel Marty's new compressor here: https://github.com/emmanuel-marty/salvador

Edited November 6, 2021 by elmer
elmer Posted November 6, 2021

Some quick testing shows that, yes, the new compressor turns ZX0 into something that is (IMHO) usable for development!

File         Size    Compressor        Time
===========================================
PoPCore.bin  40,164  (uncompressed)
PoPCore.lz2  23,922  lzsa.exe -f2       1.0s
PoPCore.zx0  23,310  zx0.exe -quick     2.5s
PoPCore.zx0  22,655  salvador.exe       1.0s
PoPCore.zx0  22,646  zx0.exe           35.5s
elmer Posted November 20, 2021

Whoops ... there was a small bug in my ZX0 decompressor when handling matches that were a multiple of 256 bytes long, which are so rare that it didn't get caught earlier. Luckily, the fix actually saves a few bytes and makes the decompressor a little faster! Since nobody is complaining about the decompressor size, I've kept it at the same 196 bytes long, and added another 2 branch-unlikely optimizations to make it a tiny bit faster. Here is the fixed decompressor ...

zx0_6502.mad
Harry Potter Posted November 20, 2021

I have some ways to improve the technique, but most of them I want to keep secret for now. Of the techniques I don't mind revealing, only one helped noticeably, and even that one only slightly. Should I reveal it here? I also ask you to try to add adaptive Huffman codes to the technique.
Irgendwer Posted November 20, 2021

20 minutes ago, Harry Potter said: Should I reveal it here?

Are you really sure we are worthy? (Tell it or keep it, but don't make a fuss about it...)
ivop Posted November 20, 2021 (edited)

32 minutes ago, Irgendwer said: Are you really sure we are worthy? (Tell it or keep it but don't make a fuss about it...)

My years of experience with compression tell me that when people want to keep their method secret, it usually doesn't work, or it is an old concept they were not aware of. Perhaps Harry Potter could look into recursive compression. That would be really great to have on the memory-limited Atari!

Edited November 20, 2021 by ivop
Harry Potter Posted November 20, 2021

My technique is simple: if an LZ77 block is found, check the next byte. If it results in a larger LZ77 block size, don't compress the current byte. BTW, I want to hide the technique for only a short while: if it succeeds, I want to reveal it a couple of years after its debut. As for it working, it is in its alpha stages, so it currently does not work.
ivop Posted November 20, 2021

44 minutes ago, ivop said: it usually doesn't work

19 minutes ago, Harry Potter said: so it currently does not work

Point made.
elmer Posted November 20, 2021

1 hour ago, Harry Potter said: My technique is simple: If an LZ77 block is found, check the next byte. If it results in a larger LZ77 block size, don't compress the current byte.

So, you're talking about "lazy" coding of the matches instead of the traditional "greedy" coding. Yes, that definitely helps the compression! This is not new, though, and I'm afraid that you're a couple of decades behind the state of the art. The next step beyond what you're proposing is obvious: look 2 bytes ahead. The cool programmers these days are using "optimal" coding of the compressed stream, but that takes some pretty sophisticated techniques that just make my head hurt!
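The greedy-versus-lazy distinction elmer describes can be sketched in a few lines of Python. find_longest_match is a hypothetical helper written here for illustration; real compressors replace its brute-force scan with hash chains or similar:

```python
# Sketch of lazy match selection: before taking a match at position i,
# peek at i+1; if deferring one byte yields a longer match, emit a
# literal now and take the better match next iteration.

def find_longest_match(data, i, window=4096, max_match=255):
    """Brute-force longest match at i; returns (offset, length)."""
    best_len, best_off = 0, 0
    for j in range(max(0, i - window), i):
        length = 0
        while (length < max_match and i + length < len(data)
               and data[j + length] == data[i + length]):
            length += 1
        if length > best_len:
            best_len, best_off = length, i - j
    return best_off, best_len

def lazy_parse(data, min_match=3):
    tokens, i = [], 0
    while i < len(data):
        off, length = find_longest_match(data, i)
        if length >= min_match:
            # The lazy step: one-byte lookahead.
            _, next_len = find_longest_match(data, i + 1)
            if next_len > length:
                tokens.append(('lit', data[i])); i += 1
                continue
            tokens.append(('match', off, length)); i += length
        else:
            tokens.append(('lit', data[i])); i += 1
    return tokens
```

Dropping the lookahead call turns this back into a greedy parser, which is why lazy parsing roughly doubles match-finding work for a modest compression gain.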
Harry Potter Posted November 20, 2021

At least you like my idea.
+Stephen Posted November 20, 2021

1 hour ago, Harry Potter said: At least you like my idea.

Of course - ideas that have been proven for decades are usually liked.
Harry Potter Posted November 30, 2021

Hi! I just applied MTF to the ZX0 technique and saved one more block. I need to debug it now. Should I reveal my technique here now? Or should I debug it first?
Irgendwer Posted November 30, 2021

1 hour ago, Harry Potter said: Should I reveal my technique here now? Or should I debug it first?

Since I'm sure you don't like to waste our time with half-baked or half-working (only the deflating works, but not the inflating) software, take your time establishing a full round-trip tool chain before publishing. You're welcome!
Harry Potter Posted November 30, 2021

Fiddling gave me another block!
+DjayBee Posted December 1, 2021

What an effort for just a single block. I go to my basement and immediately find lots of blocks and bricks.
Irgendwer Posted December 1, 2021

+?
elmer Posted December 2, 2021

On 11/30/2021 at 3:29 PM, Harry Potter said: Hi! I just applied MTF to the ZX0 technique and saved one more block. I need to debug it now.

So, you applied a move-to-front transform on the input stream, and it improved the compression (just like it was designed to do) ... congratulations! https://en.wikipedia.org/wiki/Move-to-front_transform

Unfortunately, the decompression is now slower than before, and it needs to keep a bunch of extra information around, thus using more memory. Whether the time-vs-space tradeoff is worth it to you is something that only you can decide ... but I'm glad that you're having fun experimenting.
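For reference, a small Python sketch of the move-to-front transform being discussed: a 256-entry table where each byte, once seen, moves to index 0, so locally repeated bytes come out as small indices that a later entropy stage can code cheaply. The extra table is also the memory cost elmer mentions.

```python
# Sketch of MTF over bytes. Encoder and decoder maintain the same
# 256-entry table, so the transform is exactly invertible.

def mtf_encode(data):
    table = list(range(256))
    out = []
    for b in data:
        idx = table.index(b)     # position of this byte in the table
        out.append(idx)
        table.pop(idx)           # move the byte to the front
        table.insert(0, b)
    return out

def mtf_decode(indices):
    table = list(range(256))
    out = bytearray()
    for idx in indices:
        b = table.pop(idx)       # look up the byte, then move to front
        out.append(b)
        table.insert(0, b)
    return bytes(out)
```

Note that MTF on its own saves nothing; the win only appears when the resulting small indices are fed to a coder that favors small values, which matches elmer's point about the tradeoff.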
Harry Potter Posted December 2, 2021

elmer: You're right. Maybe, in the end, I'll make it an option.
Harry Potter Posted December 2, 2021

I tweaked the technique a little further and gained some more points, but not a whole block. Now, I'm down to 19 blocks. BTW, elmer, I probably only need 256 bytes for the MTF buffer, 64 bytes for the data to compress, and no more than 512 bytes for the rest of the MTF code and data. So, everything should fit.
Harry Potter Posted December 2, 2021

Milestone achieved: 18 blocks! I increased the sliding dictionary size by 256 bytes; now it's 3.25K, and I will up it to 3.5K. I have 4K allocated to the LZ buffer, but some of it is unused, as the program only fills the buffer to 3800 bytes.