Most efficient compression for Atari

tane · March 27, 2021

Thinking in the first boy:

c.png.359ad01f3a81719b61b51649cf05a8ac.png

And also thinking in that display list filled with almost every depacker listed in the first post.

dmsc · March 27, 2021

Hi elmer!

23 hours ago, elmer said:

While this is all interesting in that there are definitely some cases where decompression time doesn't matter, I believe that those circumstances are rare, and that users generally don't like to be kept waiting (too much).

From what I can see, nobody has yet mentioned the new-kid-on-the-block ... ZX0.

Current testing show this to be as-fast or faster to decompress than APLIB, but with better compression ... so 2x or 3x as fast as Exomizer.

lzsa: 1811 (lzsa.exe -f2 -r)

apl: 1655

ZX0: 1625

EXO: 1537 (exomizer raw -E)

Ah, I always suspected that adding repeat-offset blocks could increase the compression substantially. I tried to add this to the LZSS format in LZSS-SAP, but it became messy and too slow, so I did not try further. The idea of using one bit to signal between LITERAL/COPY and REPEAT_COPY is a good one. Sadly, by using elias-gamma coding the optimal parsing becomes much more computationally intensive.

I see that the author also has a ZX1 format that stores offsets in a simpler format instead of the elias-gamma format, for faster speed.

What IMHO we could tests: Is really elias-gamma coding the optimal trade-off in speed v/s compression for encoding lengths and offsets?

Obviously it gives better compression that the LZ4 coding, but at a slower decoding speed. But as our computers have a natural limit of 65536 bytes for the lengths and offsets (and you could expect almost all lengths and offsets < 32768), perhaps a coding that limits the number of bits could be faster.

Have Fun!

elmer · March 27, 2021

16 hours ago, dmsc said:

Ah, I always suspected that adding repeat-offset blocks could increase the compression substantially. I tried to add this to the LZSS format in LZSS-SAP, but it became messy and too slow, so I did not try further. The idea of using one bit to signal between LITERAL/COPY and REPEAT_COPY is a good one. Sadly, by using elias-gamma coding the optimal parsing becomes much more computationally intensive.

I see that the author also has a ZX1 format that stores offsets in a simpler format instead of the elias-gamma format, for faster speed.

What IMHO we could tests: Is really elias-gamma coding the optimal trade-off in speed v/s compression for encoding lengths and offsets?

Hi dmsc! ?

IMHO, LZSA-2 is currently the compressor to beat in terms of the compression ratio and decompression speed, being only 20% slower than LZSA-1, but with much better compression ... but it looks like ZX0 may well have beaten it.

I won't be able to really judge the space/speed tradeoff on the 6502/HuC6280 until I have finished optimizing a decompressor for it.

BTW ... if anyone cares, I recently uploaded new versions of my LZSA-1 and LZSA-2 decompressors to the official LZSA project on github. They are both a little bit smaller and faster than before, while still being friendly for decompressing from banked ROM/RAM on the Atari.

LZSA-2 already supports repeat-offset blocks, so I suspect that it really is the elias-gamma coding that is helping ZX0, along with its very-clever rearrangment of the flag/data bits in the elias-gamma coding that makes the coding of LITERAL/COPY essentially free (in comparison to aPLib).

Another really clever idea is its use of the bottom bit of the offset to provide another elias-gamma bit, which keeps the elias-gamma bitstream on bit-pair boundaries, and so removes the need for 50% of the checks for an empty bit-buffer compared to aPLIB.

Apart from that ... the bottom byte of the offset is just stored, and not gamma coded, and that helps a lot with the decompression speed ... which is why the ZX1 format is only a measly 15% faster, and so only interesting because it shows that ZX0 is very well designed.

16 hours ago, dmsc said:

Obviously it gives better compression that the LZ4 coding, but at a slower decoding speed. But as our computers have a natural limit of 65536 bytes for the lengths and offsets (and you could expect almost all lengths and offsets < 32768), perhaps a coding that limits the number of bits could be faster.

Well, everything (except LZSS) beats LZ4 for compression, and things like LZSA-1 are just as fast to decompress, so I'm not sure why LZ4 is even mentioned anymore! ;-)

Both LZSA and ZX0 are specifically designed for 8-bit computers (unlike LZ4), and both are optimized for maximum lengths and offsets of 65535, so I believe that you already have a good answer to your question.

<edit>

One more thing ... ZX0 is horribly slow to compress, even in "quick" mode, but hopefully that will be improved in the future.

Edited March 27, 2021 by elmer

xxl · March 27, 2021

unShrinkler: consumer version now available: https://github.com/atari8xxl/unShrinkler

dmsc · March 27, 2021

Hi!

26 minutes ago, elmer said:

Hi dmsc! ?

IMHO, LZSA-2 is currently the compressor to beat in terms of the compression ratio and decompression speed, being only 20% slower than LZSA-1, but with much better compression ... but it looks like ZX0 may well have beaten it.

Yes, LZMA-2 is really good, but it is a little more complicated to decode. On speed, I can't say, as decoding elias-gamma is not very efficient, specially if you have interleaved bits.

26 minutes ago, elmer said:

I won't be able to really judge the space/speed tradeoff on the 6502/HuC6280 until I have finished optimizing a decompressor for it.

BTW ... if anyone cares, I recently uploaded new versions of my LZSA-1 and LZSA-2 decompressors to the official LZSA project on github. They are both a little bit smaller and faster than before, while still being friendly for decompressing from banked ROM/RAM on the Atari.

Thanks, will take a look.

26 minutes ago, elmer said:

LZSA-2 already supports repeat-offset blocks, so I suspect that it really is the elias-gamma coding that is helping ZX0, along with its very-clever rearrangment of the flag/data bits in the elias-gamma coding that makes the coding of LITERAL/COPY essentially free (in comparison to aPLib).

Yes, that is what I also suspect.

26 minutes ago, elmer said:

Another really clever idea is its use of the bottom bit of the offset to provide another elias-gamma bit, which keeps the elias-gamma bitstream on bit-pair boundaries, and so removes the need for 50% of the checks for an empty bit-buffer compared to aPLIB.

Did not look at the code (only the description), so I did not see this, and looking now it seems standard elias-gamma coding but interleaving the bits (like the original paper from Peter Elias), so encoding numbers from 32768 to 65535 take one more bit than necessary. Reading the bits interleaved seems better for processors without a barrel shifter.

26 minutes ago, elmer said:

Apart from that ... the bottom byte of the offset is just stored, and not gamma coded, and that helps a lot with the decompression speed ... which is why the ZX1 format is only a measly 15% faster, and so only interesting because it shows that ZX0 is very well designed.

I see that the ZX1 don't use elias-gamma for the offset, so it decompress faster but at slightly reduced compression ratio.

26 minutes ago, elmer said:

Well, everything (except LZSS) beats LZ4 for compression, and things like LZSA-1 are just as fast to decompress, so I'm not sure why LZ4 is even mentioned anymore!

Ha ha ha.... for the game "The Lady" I wrote a simple compressor I called "LZ8", using 8 bits for the match/literal length prefix, this is really fast to decode from BASIC (as no shifts are necessary), but compression ratio is lower yet than LZ4

26 minutes ago, elmer said:

Both LZSA and ZX0 are specifically designed for 8-bit computers (unlike LZ4), and both are optimized for maximum lengths and offsets of 65535, so I believe that you already have a good answer to your question.

<edit>

×

One more thing ... ZX0 is horribly slow to compress, even in "quick" mode, but hopefully that will be improved in the future.

IMHO, this is because the implementation is not coded for speed. Modern compressors like LZMA use vastly more complicated techniques and achieve faster compression, so writing a fast compressor for ZX0 should be possible.

Have Fun!

dmsc · March 27, 2021

Hi again @elmer!

7 minutes ago, dmsc said:

Did not look at the code (only the description), so I did not see this, and looking now it seems standard elias-gamma coding but interleaving the bits (like the original paper from Peter Elias), so encoding numbers from 32768 to 65535 take one more bit than necessary. Reading the bits interleaved seems better for processors without a barrel shifter.

So, I read again, and yes, now I see that offsets are stored as two parts: elias-gamma for the MSB and 7 literal bits for the LSB. I suppose that this was giving better results than a full elias-gamma for the offsets. Anyway, lengths are still unlimited.

Have Fun!

tane · March 28, 2021

On 2/1/2021 at 11:35 AM, xxl said:

there may also be a compressor that does not yet have an Atari decompression procedure (but must be more efficient than others)

This website has a ton of compressors to have fun during a spare time:

https://www.sac.sk/files.php?d=7&l=

A conclusion would be that, being that 7zip is very well qualified everywhere, everything near LZMA is a good achievement, so a second factor would be the speed and buffer size.

Edited March 28, 2021 by tane

elmer · March 28, 2021

6 hours ago, dmsc said:

Ha ha ha.... for the game "The Lady" I wrote a simple compressor I called "LZ8", using 8 bits for the match/literal length prefix, this is really fast to decode from BASIC (as no shifts are necessary), but compression ratio is lower yet than LZ4

I guess that you are technically correct, but personally, I classify all of those fixed-size coding variants of "either a literal, or a match within a fixed-size window" as being in the same category as "LZSS".

There were dozens of minor variants of the same scheme used by different development teams in the late 1980's and most of the 1990s.

Yes, LZ4, as an example of variable-size coding, will do better than those ancient compression schemes ... at the cost of slightly slower decompression.

As you have shown with your excellent LZSS-SAP program, even those ancient compression schemes still have their place in modern development on these old computers that we love. ;-)

xxl · March 29, 2021

unSrinkler and another massive acceleration, Fox added a fast multiplication, on the test program acceleration by 201 frames

https://github.com/atari8xxl/unShrinkler

tane · April 1, 2021

xxl · April 1, 2021

which one is good I checked zpaq but the result is poor (2189). If any of them are good, do the packaging test of the picture example from the first post.

R L · April 4, 2021

There is always PackFire v1.2k by Neural. You can find it on Pouet. Slightly worse than Shrinkler on your test .gfx file (1452), but it usually works better than Shrinkler for many other kinds of data.

I think PF is greatly harmed by the closed packer. That thing finishes in moments. I'm sure it could do a better job if it were a more open format, had a better packer, a boatload of depackers, etc...

xxl · April 4, 2021

12 hours ago, R L said:

PackFire v1.2k by Neural

thanks: added 1452

Packing binary file... done.

Input Output Gain % Depacker
-----------------------------------------------------
7680 1452 6228 81.09 Large

Packing binary file... done.

Input Output Gain % Depacker
-----------------------------------------------------
7680 1581 6099 79.41 Tiny

the large version requires a 16KB buffer? I do not see its use ... but the Tiny version does not require a buffer, besides it has disadvantages: the maximum size of the packed file is 32KB

instead, it has a huge advantage: there are sources available for the Z80. https://www.cpcwiki.eu/forum/programming/dumb-question-15839-bitbuster-1-2/msg32069/#msg32069

Edited April 4, 2021 by xxl

xxl · April 4, 2021

ok then we also have a PackFire decompressor Tiny version - conversion of this Z80 code with slight optimizations. it's not fast but it packs well.

in a few days, when the code matures, I will publish the source

pack-fire-6502.obx

xxl · April 4, 2021

decompressor for Atari: https://xxl.atari.pl/packfire-decompressor/

R L · April 5, 2021

Beats me what is the use case of the PF family of algorithms, honestly. It's supposed to be a better Shrinkler, but it sometimes isn't. Marginal benefits, big buffers, sucky Windows-only packer, etc...

Things that are LZMA derivatives, such as Shrinkler and PF take their time to spin up the adaptive part of the encoding. Ironically, the more state you have, the longer it takes for it all to spin up. If you had a bigger testcase it would have done quite a bit better.

I actually kinda have an interest in packers, though in a very unusual context. I am unpacking ARMv7 firmware that runs at GHz speeds. Initially, however, I only have 64 kilobytes of SRAM available, before the megabytes of DDR come online. I am using compression to shorten the time to download the firmware from the SPI flash, while only relying on SRAM for decompression. I needed a very simple format, since I had to give it a streaming interface (unpacking that stops on any byte boundary), which meant making a complete depacker by myself. In this context there is not much more to explore other than Shrinkler, LZSA2 and now possibly ZX0. Here's what they do to my firmware. Shrinkler used here is a hack that has double parity, since ARMv7 code is structured around 32-bit machine words.

Size:

PF: 171691

Shrinkler: 172516

ZX0: 199685

LZSA2: 207889

Original: 406996

Time to compress:

LZSA2: under 1s

PF: 2s

Shrinkler: 1m and a few seconds

ZX0: 22+ minutes. >_<

The one obvious improvement that I am still expecting to be made in compression is a departure from universal coding for sizes. ZX0 uses one of the Elias encodings, but that kind of encoding doesn't have an upper bound. If you are willing to have a bounded top, you might be able to eke out a few more bits here and there. It won't improve Shrinkler though, the encoding used there works very well with the adaptive probabilities.

I'd really like to see Squishy let loose on arbitrary data, but right now it can only do executables: http://logicoma.io/squishy/

ggn · April 5, 2021

4 hours ago, R L said:

Beats me what is the use case of the PF family of algorithms, honestly. It's supposed to be a better Shrinkler, but it sometimes isn't. Marginal benefits, big buffers, sucky Windows-only packer, etc...

PackFire was released many years before shrinkler, so there's that.

R L · April 5, 2021

40 minutes ago, ggn said:

PackFire was released many years before shrinkler, so there's that.

That certainly explains a lot.

I might revisit this field some day, when I'm not fighting the USB spec on a daily basis. I think there is a big amount of things that can be done in the "slower than Shrinkler" category. I'll let you guys know if I ever get something, though writing my own packer is not high on my priority list right now.

xxl · April 5, 2021

Shrinkler vs PackFire is a compressor clash for Amiga vs AtariST ? both are from the same period - updates are from 2020 and 2017 ?

The advantage of unShrinkler with low requirements (in a minimalist form 1KB buffer) it crushes the competition.

PackFire (Tiny) seems to be stronger than the top compressors: Deflater, aPLib, ZX0 - although more testing is needed here, and only Deflater requires a buffer.

Who's gonna beat the Shrinkler?

Evidence!

R L · April 5, 2021

Not Brotli, ending up at 1537. Surprising. Quite a disappointment.

xxl · April 6, 2021

Unlike most general purpose compression algorithms, Brotli uses a pre-defined dictionary, roughly 120 KiB in size, in addition to the dynamically populated ("sliding window") dictionary.

nice try...

LZ4 allows you to use compression with an external dictionary - a properly prepared dictionary for a data type can do wonders. So what, because the decompressor must have access to this dictionary Brotli is just such a "wonderful" child. you have to try harder ;-)

Brotli: 1537 added

ClausB · April 16, 2021

Does anyone know which compression method they used in this?

Edit: the original image data was 144K, just thin lines on black background (my avatar - look left). The XEX, including decompressor, is 27K.

Edited April 16, 2021 by ClausB

xxl · April 16, 2021

I'm pretty sure it's inflate

ClausB · April 16, 2021

I don't see that in your list.

xxl · April 16, 2021

is on the list. is one of the leading decompressors

def: 1,598 --- > decompressor for Atari: https://github.com/pfusik/zlib6502

Most efficient compression for Atari

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation

Recently Browsing 0 members