VGM Compression Tool


Tursi

So I wanted to show some progress: I got the TI player running tonight, although it's not a comprehensive example. It does at least show that the direction I was hoping would work has some hope of working. ;)

 

Unfortunately, the player code, being compiled C, couldn't compete with the previous assembly code. It uses 2 bytes more RAM (including the stack space it requires), which isn't bad, but it takes roughly twice the code space and, on average, twice the CPU cycles. (Worst case is about 20% worse instead of 100% worse, but worse wasn't an end goal. ;) )

TISN.dsk

The new code is a bit more flexible at the expense of taking a little more effort to use, but when I get all the examples done that should hide that.

 

I'll take a pass through and see if it's worth hand-optimizing the compiler's code or maybe hand-compiling it. I'm pretty sure I should be able to get it down. Even if I can only get it to parity, the new toolchain is much more useful.

 

Anyway, code's up on gitlab, but here's a quick demo disk to show it working. It's late, I'm tired, and it's neither a pretty demo nor the best song to have used (missing a full channel!), but it's a thing. ;) Boot with XB or EA#5.

 


12 hours ago, GDMike said:

Some song is stuck in my head..tch tch tch cu ch ch

The soundtrack to that game is some of the best on the NES - 

 

 

Edited by Tursi
(proper link, other was a remix)

28 minutes ago, Tursi said:

The soundtrack to that game is some of the best on the NES - 

 

ok.. now this demo and the soundtrack are just way. too. cool.  First time hearing it and I'm already hooked, I really like stage 3!  This comment was very intriguing:   

 

"The usual arrangement for a NES soundtrack was to use the two pulse channels for melody, the triangle channel for the bass and the noise channel for percussion/drums. The DPCM channel wasn't used. Then came SunSoft and turned this entire convention on its head. They delegated the bass to the DPCM channel, the two pulse channels for the melody (either two voices at a time, or using one track as the echo of the other one), and the noise + triangle channels were used together for the drums: the attack portion was done with the noise channel and the decay with the triangle channel, with a downwards pitch bend usually. That's why the kick drum, snare drum and the toms are so realistic (at least in the scope of NES music) By the way, the Simmons SDS-V drums (you know, with the hexagonal pads), used a very similar approach to synthesize their sound. (they had an analog sine wave generator instead of a 4 bit triangle, though)"

 

 


19 hours ago, InsaneMultitasker said:

ok.. now this demo and the soundtrack are just way. too. cool.  First time hearing it and I'm already hooked, I really like stage 3!  This comment was very intriguing:   

Yeah, first time I heard it I thought it had a sound chip, too. That composer understood how to use his hardware. ;)

 


I finished my first pass at hand-optimizing the assembly for playback. Overall GCC did pretty well, but some of the code is pretty spaghetti. Still, it's all documented and obvious improvements were made, although frankly there were few that could be easily added to the compiler.

 

It's closer now... close enough that I'm going to need to switch test songs so I can compare apples to apples with the old one. I'm kind of surprised though, I really thought the new code was going to save a ton of time by removing three streams entirely - but I guess since it's rare you parse every stream every cycle, it doesn't matter as much as I hoped.

 

RAM usage is down to about 90 bytes, and it no longer requires stack or a separate workspace (unless you want it to - it will fully consume whatever workspace you give it. However, this doesn't seem to bother GCC, and even with correct tagging (I think), GCC let it run without feeling the need to cache every register on the stack before calling.) This should make the asm interface easier as well.

 

Code size is about 900 bytes - the old one was about 600. Compared to the C version, it uses about 30% fewer cycles on average and about 60% fewer in the worst case, which makes the worst case faster and more consistent than both the C version and the old version. Average is roughly consistent with the old version, but again, I need to compare the same song on both of them.

 

Going to take one more pass at optimizing the code for size, and see if there's anything to shave a few more cycles. I think it's pretty close to parity though. But I really am surprised, I thought beating that old code would be pretty easy. ;)

 


I didn't have much time tonight, so I just worked on converting the test song over to the old compressor. I found that the difference in compression was significant (the old one got nearly half the size). Since this is a small file, I'm going to take the time to analyze /why/ and see if I can incorporate the lesson into the new tool. Since the new tool beats the old one in most tests, it seems like finding this case where it does NOT would be advantageous.

 

That also let me get better verification of the CPU times. Through the whole song I'm currently seeing this on the hand-tuned GCC assembly:

 

CPU usage over the course of my test song (Silius title):

              GCC code          OLD              NEW
MIN:          2,198 cycles      938              1,534
MAX:          17,310 cycles     9,722            8,924
AVG:          6,208 cycles      3,989            4,221
Scanlines:    11-90 (avg 32)    5-51 (avg 21)    8-47 (avg 22)
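
For anyone wanting to relate the cycle counts to the scanline figures, here is a rough sketch of the conversion. It assumes a 3.0 MHz CPU clock and an NTSC line rate of about 15,734 lines per second. Both are assumptions for illustration, not figures taken from the measurements, and they give roughly 190 cycles per scanline. The function name is made up, and the results only approximately match the table, since the emulator's own accounting may differ.

```python
# Rough cycles -> scanlines estimate (assumed clock and line rate, see above).
CPU_HZ = 3_000_000          # assumption: 3.0 MHz CPU clock
LINES_PER_SECOND = 15_734   # assumption: NTSC horizontal line rate

CYCLES_PER_LINE = CPU_HZ / LINES_PER_SECOND  # roughly 190 cycles per scanline


def cycles_to_scanlines(cycles):
    """Return the approximate number of scanlines a cycle count spans."""
    return cycles / CYCLES_PER_LINE


if __name__ == "__main__":
    # Average cycle counts copied from the table above.
    for label, cycles in [("GCC avg", 6208), ("OLD avg", 3989), ("NEW avg", 4221)]:
        lines = cycles_to_scanlines(cycles)
        print(f"{label}: {cycles} cycles is about {lines:.1f} scanlines")
```

With 262 scanlines in an NTSC frame, this also gives a quick feel for how much of each frame the player eats.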

 

So performance-wise, it's close, though again I kind of want to understand why. Still, should be able to shave a few more cycles and bytes off the assembly code. I suppose "slightly better in all cases" is still worth calling it an update, so I just need to get it down to the "all cases" part. ;)

 

The Coleco version was always in C... so I guess I won't have as big a challenge on that one. ;)

 


  • 2 weeks later...

Ooh, I like that tune just a few seconds in...

 

I worked... a lot on this over the last little while. CPU usage is good... the minimum CPU usage is only a few hundred cycles more than the old, and I don't think there's much I can do about that. The good news is that the maximum CPU usage and average CPU usage are about 10% better than the old player. (This is on the same Silius tune, which is pretty intense - it will vary by song). In terms of scanlines, the old player took from 5-51 scanlines (average 21), and the new one is doing 7-46 (average 19). Roughly.

 

I need to take a step backwards before I say it's solid and work on finishing off the toolset. It's generally compressing better than the old one, but when it's worse, it's /really/ worse. Silius again provided a good sample track - just the noise volume channel is twice as large with the new compressor compared to the old one - and the old one was only 600 bytes. If I can understand WHY and patch the reason into the new compressor, should be able to get the best of both worlds. I need to analyze that anyway - that's the last big part to push.

 

I am disappointed in the results. The new compressor/player will be better than the old, but not /much/ better. Of course, that one was the result of years of continual tinkering - so there's room for this one to improve. I also analyzed /why/ I wasn't seeing the improvement I expected. The old one had 12 compressed streams, after all, and this one only 9, so I expected 25% improvement in CPU right off the top. But the old one was smarter than I remembered - in the idle case it only needs to test the four timestreams and nothing else, while the new one has a timestream /and/ four volume streams to check. That's why the new one is slightly more expensive when the tune is idle. I worked very hard on this in particular, but there's probably still room to get it to par without changing the protocol. I'll also note that it's very frustrating without better pointer support in the CPU to optimize pointer operations. ;) (To load and de-reference a pointer in one operation would be lovely...) The old player also kept its own workspace, so some data was kept in registers which was of course faster.
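
To put numbers on that reasoning, here is a tiny sketch. The stream layouts are one reading of the description above (old: four tone, four volume and four time streams; new: four tone, four volume and one shared timestream), and the names are purely illustrative, not taken from the player source. It only counts streams and idle checks; it says nothing about what each check actually costs in cycles.

```python
# Hypothetical stream layouts inferred from the post, not from the player source.
OLD_STREAMS = {"tone": 4, "volume": 4, "time": 4}   # 12 streams in total
NEW_STREAMS = {"tone": 4, "volume": 4, "time": 1}   # 9 streams in total


def total_streams(layout):
    """Total number of compressed streams in a layout."""
    return sum(layout.values())


def idle_checks(layout, kinds):
    """Number of streams that must be tested on a frame where nothing changes."""
    return sum(layout[kind] for kind in kinds)


old_total = total_streams(OLD_STREAMS)
new_total = total_streams(NEW_STREAMS)
expected_saving = (old_total - new_total) / old_total  # the hoped-for 25%

# Old player: only the four timestreams are tested when idle.
old_idle = idle_checks(OLD_STREAMS, ["time"])
# New player: the single timestream plus the four volume streams.
new_idle = idle_checks(NEW_STREAMS, ["time", "volume"])

print(f"streams: {old_total} -> {new_total} ({expected_saving:.0%} fewer)")
print(f"idle checks per frame: old {old_idle}, new {new_idle}")
```

So even with a quarter fewer streams overall, the idle frame goes from four checks to five, which fits the slightly higher minimum.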

 

I occasionally think I should drop it, but... it's close now. The extra tools will be useful. And slight improvement is still improvement. Although I worry about what the Coleco version is going to eat for CPU... I can't optimize that one much...

 

As usual, current code is up on Github.

 


Well, the deep dive analysis showed I was chasing a ghost - Journey to Silius is a complex tune and the reason it's bigger on the new compressor is mostly because the old one simplified it. Duh.

 

That said, I wrote a new tool to dump the format of a compressed stream so it was easier to examine what the tool decided, and messed a bit more with some of the tuning. I need to re-evaluate the scoring functions, cause I think there's a bug there, but otherwise I think it's decent. I re-compressed some older test tunes and they came out as good as, or better than, the last time I did (markedly better on one). But I'll probably stop playing with the compression code soon.

 

Also fixed some little bugs in the VGM reading code, which wasn't properly handling a channel that was quiet at the beginning of a song. That's all up on Gitlab.

 


Finally had a bit of a win on this... I created some explicit test cases so I could walk through the compressor and fixed a whole pile of string search bugs. It's now both markedly faster and outperforms the old compressor (and previous versions of this one) in final size handily. It even beats the old one on my 680Rock stress test (although that requires a non-default command line ;) ). The old one only won there because it had extra codes specifically for helping the timestreams pack better - this one doesn't need them.

 

So with that, my four test songs compress down like this:

 

LetsPlay (converted from MOD) - 266.98 seconds - OLD: 10,806 bytes, NEW: 7,924 bytes (29.68 bytes/second)

Outrun (converted from MOD) - 174.23 seconds - OLD: 8,536 bytes, NEW: 8,050 bytes (43.70 bytes/second)

Afterburner (converted from Master System) - 84.03 seconds - OLD: 3,116 bytes, NEW: 2,952 bytes (35.13 bytes/second)

680Rock (converted from MIDI) - 307.20 seconds - OLD: 31,608 bytes, NEW: 31,346* bytes (102.04 bytes/second)

 

* the special command is a "deep dive" that expands the search to try all switches on every stream ;).
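
As a quick sanity check on those figures, here is a small sketch that recomputes the bytes/second rate and the size saving from the listed sizes and durations. The function names are made up for illustration. It reproduces the quoted rates for LetsPlay, Afterburner and 680Rock; for Outrun the listed size and duration work out to about 46.2 bytes/second rather than 43.70, so one of those numbers may come from a different run.

```python
# (name, seconds, old_bytes, new_bytes) copied from the results above.
RESULTS = [
    ("LetsPlay", 266.98, 10806, 7924),
    ("Outrun", 174.23, 8536, 8050),
    ("Afterburner", 84.03, 3116, 2952),
    ("680Rock", 307.20, 31608, 31346),
]


def bytes_per_second(size_bytes, seconds):
    """Average data rate of a compressed song."""
    return size_bytes / seconds


def saving_percent(old_bytes, new_bytes):
    """How much smaller the new output is, as a percentage of the old size."""
    return 100.0 * (old_bytes - new_bytes) / old_bytes


if __name__ == "__main__":
    for name, seconds, old, new in RESULTS:
        rate = bytes_per_second(new, seconds)
        saving = saving_percent(old, new)
        print(f"{name}: {rate:.2f} bytes/second, {saving:.1f}% smaller than OLD")
```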

 

So, now I can say with a straight face that the new toolset uses less RAM, less CPU, and compresses better than the old one. I can work on finishing the tools.

 

I've pushed the latest up to Github - and also pushed a rough documentation PDF. It's not pretty, just an edited copy of my working document, but it should help introduce the various tools and the internal formats.

 

https://github.com/tursilion/vgmcomp2/blob/master/dist/Docs.pdf

 

 

 

Edited by Tursi

I made the mistake of typing a teasing comment that reminded me I missed something. Specifically, I joked that I don't need to make an AY version of the TI player because there's no AY board for the TI. Then I remembered Jim is re-launching the SID Blaster.

 

The AY was already slightly crippled by the SN dataset, though it was close enough that you barely noticed. Now I've successfully crippled the SID too. ;) If, for some reason, you just need three more SN channels - be they tone or noise - there's now support through the toolchain and a sample player for it. By crippled I mean it doesn't use any of the SID features like envelopes, mixing or filters. In fact I learned that I needed a bit of a hack to modulate the volume, because ONLY the envelope does that, but the end effect appears to work, at least in emulation (on two different emulators).

 

In theory, it's possible to create a tune specifically for the SID that uses half the envelope - the attack and decay registers are untouched (sustain receives the current volume and release never happens) - indeed using the AD part would save a lot of volume changes, and you can still release manually using volume changes. In addition, the filter registers are also untouched and the waveform select register is under player control, except for the gate bit. But that will require someone with a need for it - I've accomplished my goal for now.
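
To make the "sustain receives the current volume" idea concrete, here is a minimal sketch of how an SN-style attenuation value could be turned into a SID sustain/release register write. This is not the player's actual code: the function names, the straight `15 - attenuation` mapping and the release value of 0 are all assumptions for illustration, and the gate handling (the hack mentioned above) is not shown. Only the register offsets follow the standard SID layout.

```python
# SID register offsets within one voice (standard 6581/8580 layout).
VOICE_STRIDE = 7          # each of the three voices occupies 7 registers
REG_CONTROL = 4           # waveform select; bit 0 is the gate bit
REG_ATTACK_DECAY = 5      # left untouched in the scheme described above
REG_SUSTAIN_RELEASE = 6   # high nibble = sustain level, low nibble = release rate


def sn_attenuation_to_sustain(attenuation):
    """Map SN attenuation (0 = loudest, 15 = silent) to SID sustain (15 = loudest).

    Assumption: a straight inversion; the real player may scale differently.
    """
    if not 0 <= attenuation <= 15:
        raise ValueError("SN attenuation is a 4-bit value")
    return 15 - attenuation


def sustain_write(voice, attenuation, release=0):
    """Return (register offset, byte) that sets one voice's sustain level.

    The release nibble defaults to 0 here purely as a placeholder.
    """
    if voice not in (0, 1, 2):
        raise ValueError("the SID has three voices, numbered 0..2 here")
    value = (sn_attenuation_to_sustain(attenuation) << 4) | (release & 0x0F)
    return voice * VOICE_STRIDE + REG_SUSTAIN_RELEASE, value


if __name__ == "__main__":
    offset, value = sustain_write(voice=1, attenuation=5)
    print(f"write 0x{value:02X} to SID register offset {offset}")
```

The offsets are relative to wherever the SID is mapped, which the sketch deliberately leaves out.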

 

However, I don't have the SID hardware anymore. If someone with a SID Blaster wants to try it, this is a quick and dirty port of the Afterburner Take Off theme - does it play? ;)

 

TISIDPLAY.zip


Otherwise... everything seems stable. I still need to finish the TI SFX player and then the base toolset will be complete. Then there's the Coleco ports and a heaping handful of little tools for manipulating the datastreams to round it off...

Edited by Tursi

I hope to do so also... I was repairing @arcadeshopper's sid99, and along the way found that mine didn't work so well... So I tried a mini-SwinSID in it, and that doesn't work right...

 

We got @arcadeshopper's board fully operational, and with a real SID, Marc's SID player works great... With the SwinSID, 60% of what a real SID would output was missing.

 

So hopefully my curiosity about this will get me off my rear end to dig out a real SID and maybe compare.


Following along with great interest.

 

I thought I'd share some of the SN76489 music I've bookmarked:


This is some really good PSG music (SN76489)
https://tomy.bandcamp.com/album/psg-series-1
from https://www.smspower.org/forums/14380-TomysPSGSeriesSN76489Music

 

Especially "Assembled in 1987"

 

Good attacks on Cotton Candy

 

Glazed Eyes is not too monotonous

 

On Tomy's PSG 5, The Outsider has an interesting low-frequency beat. It sounds like sequential periodic and white noises, with the attack using white noise, then periodic noise in the decay. Sometimes it's periodic noises with a jump in frequency.

https://tomy.bandcamp.com/track/the-outsider-2
 

Tomy records from a Sega Master System.

 

 


I am crying at how good Force Command looks.. I really need to get my system up and running. Hopefully soon.

 

The SwinSID does sound a bit off - I thought the SID clones were solid by now, but the rise and fall times seem slower, although at least my gate trick works and the pitches seem to be right. ;) There's also an occasional high pitched tone that shouldn't be there - but may well be my bug. On the SN and AY you can get "silence" with a very high pitched tone (beyond hearing range), but the entire range of the SID is audible. My code was supposed to mute those cases, but maybe it failed... (alternately, the slower decay time might be why they are audible, if it is indeed slower.)
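
The point about the SN being able to play "silence" while the SID can't is easy to see from the frequency formulas. This is a minimal sketch, assuming the usual 3.579545 MHz SN clock and a SID clock of about 1 MHz (real boards vary, and the clock on any particular SID card is an assumption here). The function names are made up for illustration.

```python
SN_CLOCK_HZ = 3_579_545    # assumption: standard SN76489 input clock
SID_CLOCK_HZ = 1_000_000   # assumption: roughly 1 MHz; the actual clock may differ

HEARING_LIMIT_HZ = 20_000  # rough upper limit of human hearing


def sn_tone_hz(divider):
    """SN76489 tone frequency for a 10-bit divider value (1..1023)."""
    if not 1 <= divider <= 1023:
        raise ValueError("divider must be in 1..1023")
    return SN_CLOCK_HZ / (32 * divider)


def sid_tone_hz(freq_reg):
    """SID oscillator frequency for a 16-bit frequency register value."""
    if not 0 <= freq_reg <= 0xFFFF:
        raise ValueError("frequency register is 16 bits")
    return freq_reg * SID_CLOCK_HZ / (1 << 24)


if __name__ == "__main__":
    print(f"SN highest tone:  {sn_tone_hz(1):,.0f} Hz")
    print(f"SID highest tone: {sid_tone_hz(0xFFFF):,.0f} Hz")
```

Under these assumptions the SN's smallest divider lands well above the hearing limit, while the SID's highest pitch stays around 4 kHz, so those cases have to be muted some other way on the SID.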

 

The real SID sounds closer to emulation (or more correctly, emulation seems to get the real SID pretty well ;) ), though it seemed to cut out a couple of times that I wasn't expecting. My main fear was that it would be clicking all the time due to re-triggering, but I don't hear that. It didn't play long enough for me to hear if the high pitched tones that shouldn't be there were there. ;)

 

Well, it works, thanks for the testing. I kind of want to do a quick test with Castlevania that runs both chips at once, then we can see what the master volume needs to be for them to balance, but I can keep moving forward.
 


5 hours ago, FarmerPotato said:

Tomy records from a Sega Master System.

That guy is crazy good. He and Rushjet1 need to get into a PSG battle. ;)

 

(Edit: LaurenX is my fav, though Witch's Den also made me say "damn" out loud. ;) )

 

Rushjet doesn't have a page of just his PSG tunes, though... I did buy a bunch from him for Super Space Acer. More of his retro chip tunes are NES.

 

https://rushjet1.com/

 

Edited by Tursi
