
Any compressor between RLE and LZ4?



BTW you can also record the converted Pokey values and stream those. Perhaps you can just record a sid2gumby track with Altirra. Don't know if it supports stereo SAP-R.

 

Edit: BTW2: this does not necessarily mean way more streams (16 or 18). AUDCTL is constant and one sid voice is mirrored on the second pokey. Also, IIRC, the AUDC values of one half of the 16-bit channels are constant. But even if that's not the case, you end up with 6 AUDF/AUDC pairs, hence 12 streams of importance. Just need two extra stores for the mirrored sid voice.

Edited by ivop
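
To make the stream counting above concrete, here is a minimal Python sketch. It assumes a raw dump in which every frame stores one byte per register, in a fixed order; the 4-register layout in the example and the function names are made up for illustration and are not the documented SAP-R or Altirra format. The idea is only that constant registers (such as AUDCTL in the case described) do not need a stream of their own.

```python
def split_streams(dump: bytes, regs_per_frame: int) -> list[bytes]:
    """De-interleave a frame-by-frame register dump into one stream per register."""
    if len(dump) % regs_per_frame:
        raise ValueError("dump length is not a whole number of frames")
    return [dump[i::regs_per_frame] for i in range(regs_per_frame)]


def varying_streams(streams: list[bytes]) -> dict[int, bytes]:
    """Keep only the streams whose value changes at least once."""
    return {i: s for i, s in enumerate(streams) if len(set(s)) > 1}


# Hypothetical 3-frame dump with 4 registers per frame; register 3 never changes.
dump = bytes([10, 1, 20, 0x78,
              11, 1, 21, 0x78,
              12, 2, 22, 0x78])
streams = split_streams(dump, 4)
kept = varying_streams(streams)
print(sorted(kept))  # indices of the streams that are worth compressing
```

A constant register can then be written once at init time instead of being streamed every frame.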


I didn't know Sid2Gumby, was only aware of the latest iteration of your AtariSid... Honestly, S2G feels a little dated, especially stacked against AS5 :). Wish I could SAPR those instead... but yeah at 15khz that's not going to happen (although only the first channel seems to vary much during a frame)

I don't know the tech behind AS5, nor do I know anything about Pokey :grin:. But just out of curiosity, I think I read you were doing software mixing, does that mean a Sid voice is not mapped to a Pokey voice ? although according to the PMG visualizer that seems to be the case.


Hi all!

 

Perhaps larger buffer sizes in dmsc's lzss would yield good gains too? It's pretty usable for any tune under 2 minutes right now.

Interesting that stream #3 actually increases in size (probably the drum channel), but stream #1 and #5 decrease in size, with an overall saving of an extra 85 bytes. But it would need an extra operation during decompression.

I just implemented 12 bit matches in the LZSS coder/decoder, this allows using larger window sizes, and produces a big gain in compression ratio, compare:

 

SHADOW.SAP:

 

8 bit match, 16 bytes window: max offset = 16, max len = 17, match bits = 8, ratio: 5675 / 42759 = 13.27%

12 bit match, 128 bytes window: max offset = 128, max len = 33, match bits = 12, ratio: 2201 / 42759 = 5.15%

 

3D_RMT.SAP (from RMT distribution, "3D, Atari version by raster/c.p.u. 2009"):

 

8 bit match, 16 bytes window: max offset = 16, max len = 17, match bits = 8, ratio: 15493 / 53568 = 28.92%

12 bit match, 128 bytes window: max offset = 128, max len = 33, match bits = 12, ratio: 4800 / 53568 = 8.96%
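
The "max offset" and "max len" figures above follow from how the match bits are split between offset and length. A small Python sketch of that arithmetic, assuming a minimum match length of 2 for these two configurations (an inference from the posted numbers, not something taken from the lzss source); the function name is hypothetical:

```python
def lzss_limits(match_bits: int, offset_bits: int, min_match: int) -> tuple[int, int]:
    """Return (max_offset, max_len) for a match token of `match_bits` bits.

    The bits not used for the offset encode the length, biased by the
    minimum match length (assumed, see the note above).
    """
    length_bits = match_bits - offset_bits
    max_offset = 1 << offset_bits
    max_len = (1 << length_bits) + min_match - 1
    return max_offset, max_len


print(lzss_limits(8, 4, 2))    # 8 bit match, 16 byte window
print(lzss_limits(12, 7, 2))   # 12 bit match, 128 byte window
```

Under that assumption the function reproduces the 16/17 and 128/33 limits printed by the tool.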

 

 

As the player is only 180 bytes, in the case of the "3D_RMT" sample the complete file (compressed data plus player) is now smaller than the original RMT version: 4980 vs. 5259 bytes.

 

The new "lzss" program accepts options, use "-8" for original 8 bit matches, "-2" for new 12 bit matches, or any other combination to try other match window sizes.

lzss-sap-20190527.zip

shadows-12.xex

3d_rmt-12.xex

3d_rmt.xex


 

I didn't know Sid2Gumby, was only aware of the latest iteration of your AtariSid... Honestly, S2G feels a little dated, especially stacked against AS5 :). Wish I could SAPR those instead... but yeah at 15khz that's not going to happen (although only the first channel seems to vary much during a frame)

I don't know the tech behind AS5, nor do I know anything about Pokey :grin:. But just out of curiosity, I think I read you were doing software mixing, does that mean a Sid voice is not mapped to a Pokey voice ? although according to the PMG visualizer that seems to be the case.

 

https://github.com/ivop/atarisid

 

It seems you have missed AtariSid 6? The binaries are in the xex directory.

 

You are right that the PMGs do not vary much, but that's because they only show the volume, at 50Hz :) The width is not based on the samples. There simply wasn't enough CPU time to show some sort of oscilloscope when I managed to switch from 7.8kHz to 15.6kHz.

 

Atarisid uses Pokey's first channel as a 15.6kHz timer. Each time it fires, it writes to pokey channels 2, 3 and 4. The whole emulation is a sort of a soft-synth.

Edited by ivop

Hi all!

 

 

 

I just implemented 12 bit matches in the LZSS coder/decoder, this allows using larger window sizes, and produces a big gain in compression ratio, compare:

 

 

 

Tried it on 7 gates of Jambala (110KB SAP).

 

Lzs8: 30KB

Lzs12: 8.5KB

 

The new version can be a tiny bit slower but still several times faster than the original player!

 

Guess it's an all around solution now :-D


Hi!

 

Tried it on 7 gates of Jambala (110KB SAP).

 

Lzs8: 30KB

Lzs12: 8.5KB

 

The new version can be a tiny bit slower but still several times faster than the original player!

 

Guess it's an all around solution now :-D

Main difference is in the RAM usage, as the 12 bit version uses 128 bytes for each stream.

 

I implemented a third version, with 16 bit matches. Using 8 bits for the window size (so 256 bytes for each stream), the compression is much better, tested it with 4 samples (attached original and compressed files):

 

---- shadows.sap -----
 max offset= 16,  max len= 17,  match bits= 8,  ratio:  5675 / 42759 = 13.27%
 max offset= 128, max len= 33,  match bits= 12, ratio:  2201 / 42759 =  5.15%
 max offset= 256, max len= 256, match bits= 16, ratio:  1103 / 42759 =  2.58%

---- 3d_rmt.rsap -----
 max offset= 16,  max len= 17,  match bits= 8,  ratio: 15493 / 53568 = 28.92%
 max offset= 128, max len= 33,  match bits= 12, ratio:  4800 / 53568 =  8.96%
 max offset= 256, max len= 256, match bits= 16, ratio:  3536 / 53568 =  6.60%

---- 4tk35.rsap ------
 max offset= 16,  max len= 17,  match bits= 8,  ratio: 27051 / 106587 = 25.38%
 max offset= 128, max len= 33,  match bits= 12, ratio:  9442 / 106587 =  8.86%
 max offset= 256, max len= 256, match bits= 16, ratio:  6342 / 106587 =  5.95%

---- aurora.rsap -----
 max offset= 16,  max len= 17,  match bits= 8,  ratio: 38741 / 114048 = 33.97%
 max offset= 128, max len= 33,  match bits= 12, ratio: 14495 / 114048 = 12.71%
 max offset= 256, max len= 256, match bits= 16, ratio: 11903 / 114048 = 10.44%

As you can see, "shadows.sap" is now a little more than 1kB. My samples are from the RMT128 distribution, converted to SAP type-R.
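
As a quick cross-check of the ratio column and of the RAM cost mentioned above, here is a small Python sketch. The sizes and window sizes come from the table; the count of 9 streams (one per POKEY register in a mono type-R dump) is an assumption, and the helper names are hypothetical.

```python
STREAMS = 9  # assumption: one stream per POKEY register in a mono SAP type-R dump

ORIGINAL = 42759          # shadows.sap, uncompressed
PACKED = {                # window bytes per stream -> compressed size (from the table)
    16: 5675,
    128: 2201,
    256: 1103,
}


def ratio_percent(packed: int, original: int) -> float:
    """Compressed size as a percentage of the original, rounded like the tool output."""
    return round(100 * packed / original, 2)


def buffer_ram(window: int, streams: int = STREAMS) -> int:
    """Bytes of decompression buffers when every stream keeps its own window."""
    return window * streams


for window, packed in PACKED.items():
    print(f"window {window:3d}: {ratio_percent(packed, ORIGINAL):5.2f}% "
          f"of original, {buffer_ram(window)} bytes of buffers")
```

So the better ratio of the 16 bit variant is paid for with twice the buffer RAM of the 12 bit variant, whatever the real stream count is.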

 

Have Fun!

lzss-sap-20190529.zip

3d_rmt-16.xex

shadows-16.xex

aurora-16.xex

4tk35-16.xex

samples.zip


Hi!

 

great tool DMSC

 

playlzs16.asm (aurora.lz16) plays wrong

Thanks.

 

I'm not near the PC now, but you must use the following command line:

 

  lzss -6 input.rsap test.lz12

This is the same as:

  lzss -b 16 -o 8 -m 1 input.rsap test.lz12

Another idea: turn the compressed data+player into a SAP file again, but this time Type B :D

 

This would allow existing SAP players without SAP-R support to play these songs (i.e. ALL of them, except for Altirra, but that's not strictly a SAP player).


Hi!

 

 

Main difference is in the RAM usage, as the 12 bit version uses 128 bytes for each stream.

 

I implemented a third version, with 16 bit matches. Using 8 bits for the window size (so 256 bytes for each stream), the compression is much better, tested it with 4 samples (attached original and compressed files):

 

 

 

Tried it on 7 gates of Jambala (110KB SAP).

 

Lzs8: 30KB

Lzs12: 8.5KB

 

lzs16: 3.7KB !!

 

Can't wait for the next release ;)


  • 6 months later...
  • 2 weeks later...
On 5/13/2019 at 11:14 PM, xxl said:

LZ4 is very fast

 

gpl3.txt - 35147 bytes

exomizer - 12382 bytes + depacker 1 page =~ 12.3 KB, decompress 128 frames (2.6 sec)

deflate - 11559 bytes + depacker 2 pages =~ 11.8 KB, decompress 179 frames (3.6 sec)

LZ4 - 15622 bytes + depacker <150 bytes =~ 15.3 KB, decompress 55 frames (1.1 sec)

 

 

Even though @dmsc has done such a wonderful job in fulfilling @rensoup's needs for SAP compression/decompression,  I'm resurrecting this old post to go back to the less-specific discussion about file compression on our old 8-bit and 16-bit computers/consoles.

 

For me, LZ4's minimum match length of 4-bytes is a serious problem on these old target machines, although it is absolutely fine on the 32-bit target platforms that the LZ4 algorithm was actually designed for.

 

To show why I think that, I've written an optimized 6502 decompressor for aPLib (to replace the slow 65C02 example that is in aPLib), and done some testing on it.

 

The gpl3.txt text file example that @xxl shows doesn't really match Atari game code or data, but it is interesting to see how aPLib does in comparison to the others ...

 

aplib - 13148 bytes + depacker 1 page =~12.8 KB, decompress 68.5 frames (1.4 sec)

 

So ... 0.3 sec longer than @xxl's LZ4 depacker, for a saving of 2.5KB.
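
For anyone wanting to redo the timing comparison, the frame counts convert to seconds as in this small Python sketch. It assumes the counts were measured on a 50 Hz PAL machine, which is what the quoted seconds imply; the names are hypothetical.

```python
PAL_FPS = 50  # assumption: frame counts were taken on a 50 Hz PAL machine


def seconds(frames: float, fps: int = PAL_FPS) -> float:
    """Convert a number of video frames to seconds."""
    return frames / fps


# Frame counts as posted in this thread for gpl3.txt.
timings = {"exomizer": 128, "deflate": 179, "LZ4": 55, "aPLib": 68.5}
for name, frames in timings.items():
    print(f"{name:9s} {seconds(frames):.2f} s")

extra = seconds(timings["aPLib"]) - seconds(timings["LZ4"])
print(f"aPLib takes about {extra:.2f} s longer than LZ4")
```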

 

Looking at the compressed data, over 30% of the matches are under 4-bytes in length.

 

 

Looking at the "Legend of Xanadu 2" actual game data that I have mentioned before, using aPLib results in 934005 out of 1635397 matches (i.e. 57%) that were less than 4-bytes.

 

This suggests that even if decompression speed is your most important criterion, then someone should be able to come up with a better solution than LZ4 for the kind of data that we see on 8-bit and 16-bit machines.
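
A rough way to check how much a 4-byte minimum match would cost on your own data is to parse it with a shorter minimum and count the short matches. The Python sketch below uses a simple greedy LZ77-style parse, which is an assumption made for brevity: it is not aPLib's parser, so the percentages it gives will differ from the figures above. Function names and default parameters are hypothetical.

```python
def greedy_matches(data: bytes, window: int = 256, min_match: int = 2,
                   max_match: int = 255) -> list[int]:
    """Greedy LZ77-style parse; returns the length of every match emitted."""
    lengths = []
    pos = 0
    while pos < len(data):
        best = 0
        # Try every start position inside the window and keep the longest match.
        for cand in range(max(0, pos - window), pos):
            n = 0
            while (n < max_match and pos + n < len(data)
                   and data[cand + n] == data[pos + n]):
                n += 1
            best = max(best, n)
        if best >= min_match:
            lengths.append(best)
            pos += best
        else:
            pos += 1  # no usable match: emit a literal byte
    return lengths


def short_match_share(lengths: list[int], limit: int = 4) -> float:
    """Fraction of matches that are shorter than `limit` bytes."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n < limit) / len(lengths)


sample = b"abab" + b"xyzxyz" + b"abcdabcd"
lengths = greedy_matches(sample)
print(lengths, short_match_share(lengths))
```

Every match counted as "short" here would have to be stored as literals in a format with a 4-byte minimum match.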

 

For programmers that haven't seen it yet, may I point out Emmanuel Marty's LZSA ... https://github.com/emmanuel-marty/lzsa

 

 

For anyone that is willing to spend a few more cycles to get better compression than LZSA, my 6502 decompressor for aPLib can be found here ... https://github.com/jbrandwood/aplpak

 

Edited by elmer

Hi!

2 hours ago, elmer said:

Even though @dmsc has done such a wonderful job in fulfilling @rensoup's needs for SAP compression/decompression,  I'm resurrecting this old post to go back to the less-specific discussion about file compression on our old 8-bit and 16-bit computers/consoles.

 

For me, LZ4's minimum match length of 4-bytes is a serious problem on these old target machines, although it is absolutely fine on the 32-bit target platforms that the LZ4 algorithm was actually designed for.

Yes, I do believe that the LZ4  format is not the best for our 8-bit machines!

 

2 hours ago, elmer said:

To show why I think that, I've written an optimized 6502 decompressor for aPLib (to replace the slow 65C02 example that is in aPLib), and done some testing on it.

 

The gpl3.txt text file example that @xxl shows doesn't really match Atari game code or data, but it is interesting to see how aPLib does in comparison to the others ...

 

aplib - 13148 bytes + depacker 1 page =~12.8 KB, decompress 68.5 frames (1.4 sec)

 

So ... 0.3 sec longer than @xxl's LZ4 depacker, for a saving of 2.5KB.

 

Looking at the compressed data, over 30% of the matches are under 4-bytes in length.

 

 

Looking at the "Legend of Xanadu 2" actual game data that I have mentioned before, using aPLib results in 934005 out of 1635397 matches (i.e. 57%) that were less than 4-bytes.

 

This suggests that even if decompression speed is your most important criterion, then someone should be able to come up with a better solution than LZ4 for the kind of data that we see on 8-bit and 16-bit machines.

 

For programmers that haven't seen it yet, may I point out Emmanuel Marty's LZSA ... https://github.com/emmanuel-marty/lzsa

Did not know that, a great format indeed!

 

But it got me thinking, perhaps an LZSS-derived format with an optional bit for reusing the last match offset could achieve greater compression with a very small decoder, especially on the 6502, where you can load bits with one instruction. Reusing the match offset is great when compressing frames of an animation, or other data where only some bytes change from the previous frame: all the offsets reference the last frame, so they are the same.
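
To illustrate the idea, here is a minimal Python sketch of a decoder with a "repeat last offset" token. The token layout (tuples instead of packed bits) and all names are hypothetical and only meant to show the mechanism; this is not an existing format.

```python
def decode(tokens) -> bytes:
    """Decode a token list with literals, full matches and repeat-offset matches.

    Tokens (hypothetical layout):
      ("lit", byte)             copy one literal byte
      ("match", offset, length) copy `length` bytes from `offset` bytes back
      ("rep", length)           like "match", but reuse the previous offset
    """
    out = bytearray()
    last_offset = None
    for tok in tokens:
        kind = tok[0]
        if kind == "lit":
            out.append(tok[1])
            continue
        if kind == "match":
            last_offset, length = tok[1], tok[2]
        elif kind == "rep":
            if last_offset is None:
                raise ValueError("rep token before any match")
            length = tok[1]
        else:
            raise ValueError(f"unknown token {kind!r}")
        if last_offset > len(out):
            raise ValueError("offset points before the start of the output")
        for _ in range(length):
            out.append(out[-last_offset])  # byte-wise copy, overlap is allowed
    return bytes(out)


# Two 4-byte "frames"; the second differs from the first in one byte only.
tokens = [("lit", 1), ("lit", 2), ("lit", 3), ("lit", 4),
          ("match", 4, 2),   # first two bytes of frame 2, taken from frame 1
          ("lit", 9),        # the changed byte
          ("rep", 1)]        # last byte, same offset as before
print(list(decode(tokens)))
```

In a real bitstream the "rep" token would need only a flag bit and a length, which is where the saving over a full offset comes from.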

 

Have Fun!

 


9 hours ago, elmer said:

aplib - 13148 bytes + depacker 1 page =~12.8 KB, decompress 68.5 frames (1.4 sec)

 

So ... 0.3 sec longer than @xxl's LZ4 depacker, for a saving of 2.5KB.

An even more efficient LZ4 compressor, smallz4, is available - https://create.stephan-brumme.com/smallz4/#numbers

in tests, the decompression time on atari did not change


16 hours ago, elmer said:

To show why I think that, I've written an optimized 6502 decompressor for aPLib (to replace the slow 65C02 example that is in aPLib), and done some testing on it.

 

The gpl3.txt text file example that @xxl shows doesn't really match Atari game code or data, but it is interesting to see how aPLib does in comparison to the others ...

aPLib looks like it could be interesting for Prince of Persia.

 

I tried to compress a bunch of files and the compression ratio was pretty good for code (not as good as deflate but within 5%) but not for graphics.

 

I could probably use it for loading the main executable which is around 23KB compressed (40+KB uncompressed). inflate takes about 4-5 seconds which is a big pause.

 

Where do you think it sits on the decompression rate axis ? somewhere around LZSA2 ?

 

pareto_graph.png

 

Any chance you could provide the source in MADS format with a decompression example? I quickly hacked it to get it to build but I can't get it to decompress properly (I skipped the 4-byte APK header)

 

 


To follow up on my original question...

 

I was looking to compress 2 types of data and decompress them in realtime:

 

1. music: which @dmsc masterfully solved with LZSS

2. sprite data.

 

There may not be any good solution for 2.

For a single uncompressed 200-byte frame, LZ4 can't do much; it gave almost no compression. I also tried @Irgendwer's autogamy, which compressed slightly better than LZ4. Only deflate gave reasonable results, but the decompression time was astronomical.

I got almost as good results as deflate by simply removing empty bytes and storing an extra byte per line (sprites are at most 8 bytes wide), but the prospect of having to write a sprite routine for this format was too daunting, so I gave up.
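
One possible reading of the "extra byte per line" scheme is a bitmask that says which of the up to 8 bytes in a line are non-zero, followed by only those bytes. The Python sketch below shows that interpretation; it is an assumption about the format, and the function names are hypothetical.

```python
def pack_line(line: bytes) -> bytes:
    """Pack one sprite line as a mask byte followed by its non-zero bytes."""
    if len(line) > 8:
        raise ValueError("a line may hold at most 8 bytes")
    mask = 0
    kept = bytearray()
    for i, value in enumerate(line):
        if value:
            mask |= 1 << i      # bit i set: byte i is stored
            kept.append(value)
    return bytes([mask]) + bytes(kept)


def unpack_line(packed: bytes, width: int = 8) -> bytes:
    """Rebuild a line of `width` bytes from the mask byte and the stored bytes."""
    mask = packed[0]
    out = bytearray()
    src = 1
    for i in range(width):
        if mask & (1 << i):
            out.append(packed[src])
            src += 1
        else:
            out.append(0)       # empty byte, not stored
    return bytes(out)


line = bytes([0, 0, 0x3C, 0x7E, 0, 0, 0, 0])
packed = pack_line(line)
print(len(line), "->", len(packed), "bytes")
```

A sprite routine could test the mask bits directly and skip empty bytes without touching the screen, but as noted above it has to be written specifically for this layout.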

 


7 hours ago, xxl said:

An even more efficient LZ4 compressor, smallz4, is available - https://create.stephan-brumme.com/smallz4/#numbers

in tests, the decompression time on atari did not change

I tried a bunch of LZ4 compressors like the one you mentioned but most of them gave marginal gains at best (especially if you use the best compression option with the original LZ4). LZ4 still has the worst compression ratio.


3 minutes ago, rensoup said:

I tried a bunch of LZ4 compressors like the one you mentioned but most of them gave marginal gains at best (especially if you use the best compression option with the original LZ4). LZ4 still has the worst compression ratio.

 

Yes, the problem isn't in the LZ4 compressor, it's in the LZ4 data format.

 

LZ4 was never designed to produce the best compression ratios, it was specifically designed for both fast compression and decompression in order to reduce the memory usage of databases and other large datasets on modern PCs and servers.

 

Just like smallz4, Emmanuel Marty has also done his own "optimal" LZ4 packer that can get slightly better results than the official LZ4 packer ... https://github.com/emmanuel-marty/lz4ultra

 

Again, the results are marginal improvements, and not major gains.

 


1 hour ago, rensoup said:

Where do you think it sits on the decompression rate axis ? somewhere around LZSA2 ?

 

I've not written an LZSA2 decompressor, so I don't really know.

 

Peter Ferrie's 6502 decompressor that is included in LZSA2 is written very similarly to his 65C02 decompressor for aPLib, and my aPLib code is about 2.5x faster than his.

 

An optimized LZSA2 decompressor should be faster than an aPLib decompressor ... there has to be some size/speed tradeoff for the extra compression that aPLib gets, just like with the LZ4 decompressors.

 

Now that Emmanuel Marty has written an open source compressor for aPLib (https://github.com/emmanuel-marty/apultra), we have the opportunity to tweak the format a tiny bit in order to improve the decompression speed on the 6502.

 

 

1 hour ago, rensoup said:

Any chance you could provide the source in MADS format with a decompression example? I quickly hacked it to get it to build but I can't get it to decompress properly (I skipped the 4-byte APK header)

 

I don't know MADS format, but I could probably take a look at it.  I'm more interested in writing for a banked Atari cartridge rather than a floppy-disc game, so I modified the assembler in HuC to output Atari .car format files.

 

If you've already hacked my source to get it to build, then you've probably already done everything that is needed.

 

I suspect that the problem that you've found with my aPLpak format is because you only skipped the first 4 bytes of header info instead of the 12 bytes that you can skip if you are only compressing a single file.

 

aPLpak is designed to store multiple compressed files/assets, so it starts with a header table that lists where the start/size is of all of the files within the archive.

 

I'll post some example code.

 
