Rybags Posted January 30, 2015
If using RLE, the obvious solution would be to keep separate arrays per register, as you'll get decent runs for AUDCTL, fair to good ones for AUDF, and probably not so good ones for AUDC if using envelopes. Maybe the ideal would be a combination of RLE and some sort of delta system: often the AUDF and AUDC registers will just change value by 1 up or down.
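A minimal sketch of the "separate arrays per register" idea, written in Python as a PC-side tool rather than Atari code. It assumes the dump is stored frame by frame with 9 bytes per frame; the function names and the demo data are hypothetical and not taken from any dumper in this thread.

```python
def split_registers(dump: bytes, registers: int = 9) -> list[bytes]:
    """De-interleave a frame-by-frame dump into one stream per register."""
    if len(dump) % registers:
        raise ValueError("dump length is not a whole number of frames")
    return [dump[r::registers] for r in range(registers)]


def count_runs(stream: bytes) -> int:
    """Number of runs of equal bytes; fewer runs means RLE packs better."""
    return sum(1 for i, b in enumerate(stream) if i == 0 or b != stream[i - 1])


# Two made-up frames in which no register changes between frames.
demo = bytes(range(9)) * 2
streams = split_registers(demo)
print([count_runs(s) for s in streams])
```

Counting the runs per register on a real dump would show which streams are worth run-length encoding and which are better served by a delta scheme.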
Heaven/TQA (Author) Posted January 30, 2015
Rybags... I don't get it... why should RLE cause any problems? Each stream is handled separately. Maybe delta encoding can help too... or the sampling rate.
Creature XL Posted January 30, 2015
Nice work with the dumper. Packing is a must, as you have pointed out. What Rybags probably meant is that RLE should be perfect for AUDCTL, as it doesn't change much, and therefore you can pack maybe 250 bytes into 2. The 4 AUDC channels most probably use envelopes with the same waveform and different volume, so delta packing the 4 volume bits should be enough. Or at least, RLE is ill-suited there, as you will almost never have the same value two or more times in a row. So it might even be bigger in the end, as you have to escape certain byte values. AUDFx COULD be suitable for RLE as well. So, Heaven, the next test for you is a packed and depacked data stream, to check how many cycles/scanlines the depacker takes.
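To illustrate the remark about escaping byte values, here is a hedged Python sketch of an escape-based RLE. The marker value `ESC = 0xFF`, the minimum run length of 4 and the function names are all assumptions made for this example; nothing in the thread fixes a format.

```python
ESC = 0xFF  # assumed escape marker; any rarely used byte value would do


def rle_escape_encode(data: bytes, min_run: int = 4) -> bytes:
    """Emit ESC, count, value for long runs (and for ESC itself), else literals."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        if run >= min_run or value == ESC:
            out += bytes([ESC, run, value])
        else:
            out += bytes([value]) * run
        i += run
    return bytes(out)


def rle_escape_decode(packed: bytes) -> bytes:
    """Inverse of rle_escape_encode."""
    out = bytearray()
    i = 0
    while i < len(packed):
        if packed[i] == ESC:
            out += bytes([packed[i + 2]]) * packed[i + 1]
            i += 3
        else:
            out.append(packed[i])
            i += 1
    return bytes(out)


steady = bytes(250)                # e.g. an AUDCTL-like stream that never changes
jumpy = bytes([0xFF, 0x0F] * 10)   # no runs, and it contains the escape value
print(len(rle_escape_encode(steady)), len(rle_escape_encode(jumpy)))
```

In this variant the 250 identical bytes shrink to 3 (one more than a plain count/value pair because of the marker), while the stream without runs grows, since every literal `0xFF` has to be written as a three-byte escape sequence.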
Heaven/TQA (Author) Posted January 30, 2015
Nah, the dumper and the streamer are not rocket science... fortunately the setting of the POKEY registers is at the end of RMT, so adding a few lines of "dumb" code was easy...
emkay Posted January 30, 2015
So, if your musicians need to use RMT for creating music, and you have the possibility to change the code and to do optimisation there, someone should do the instruments so that they change "interleaved". You could do that with 2 or even 4 channels. Later you adjust the player to play "Frame a" channels 1 & 3, "Frame b" channels 0 & 2... Avoiding "vibrato" and other automated FX helps to keep the player small.
Tezz Posted January 30, 2015
Great. This is something I've wanted to get around to trying too, also with regard to SID playing. I'm interested in doing that once I've got this game completed, hopefully by the weekend.
miker Posted January 30, 2015
Quote: here is my dumper... <snip>
Could you make it more "universal"? I.e. there are a number of tunes (including BASIC ones) that could only be dumped this way.
Heaven/TQA (Author) Posted January 30, 2015
How? You need to catch the writes... Alternatively, Altirra could do the dump while emulating...
miker Posted January 30, 2015
Oh well, I thought about putting PLA as the first instruction (to allow running it from BASIC) and then making the program exit after each pass. I don't know if it will be very hard to recreate the actual "tempo" of the song. Good idea or waste of time, what do you think?
Edited January 30, 2015 by miker
Heaven/TQA (Author) Posted January 30, 2015
Miker... you can simply alter the BASIC program to dump the POKEs (or recode the SOUND command) directly to disk or memory... but as the POKEY sound registers are write-only... how should my program know what value was written?
ac.tomo Posted January 30, 2015
Quote:
Nah, overkill for me (aside from a great Motorhead track) is to not have any RMT player on the A8 at all. Have the RMT composer app itself output the whole 9-byte dump stream and then:
a) take each byte stream and convert the values to the movement size (note 1)
b) Huffman-analyse the frequencies of the resulting values (note 2)
c) encode each stream using the table produced.
Alternatives would be to make a) optional (leave the values as they are) and/or to replace b) & c) with run-length encoding.
note 1: i.e. new value - old value, but as this only gives us -127 -> +127, -128 (b10000000) could be used as an escape to have the following byte set the new value.
note 2: a bit of extra memory space is wasted by needing to save the table in addition to the encoded song data.
A small playback routine can then get the next byte from each of the 9 streams and pump these to the POKEY registers.
Benefit... more CPU time for your super screen/sprite code. Potential space savings from compressing the raw POKEY data and from the size of the playback routine.
Disadvantages... probably a PITA to implement... even the compressed data may end up larger than the actual RMT song/pattern/instrument data.
Other considerations... sound effects?.... do 50/60Hz need different initial outputs?

Quote:
yeah... look how many cycles per frame a stereo RMT track needs... and a 9-byte RLE-encoded stream could be the way to go. I definitely will try... right now RMT costs me 1 FPS. Oxyron used a similar method on the C64 in some productions, btw. Re: sound FX... right now we don't even have a sound FX library (not RMT)... I have no clue how... Look at how RoF, Koronis Rift, Alley Cat, Star Raiders etc. programmed sound. I am a total newbie in this field... Does anyone have experience in programming "explosions" etc.?

Forgive me if I sound dull, but what are RLE and 9-byte streams? I'm really finding it difficult to understand what is being explained. Don't get me wrong, I am up on POKEY, I just don't know some of these modern terms. I've been away from 8-bit programming for some time and so am a bit rusty; some things are coming back to me, though. I do understand some of what is being said, in that you're trying to update the sound registers at evenly spaced intervals, which of course can be done via a DLI. This is the same for graphics, when you're processing some kind of visual effect that takes multiple frames to process. The sound method merely needs a flag: when it is set, the DLI or VBI stores the values into the AUDio registers. Obviously, you'll need to extract the AUDio storing from the sound monitor's play routine, and when the play routine has finished, set the flag that tells the DLI or VBI to store the sound values. The graphics method is similar: you just switch between 2 different screens, like page flipping, displaying one whilst altering the 2nd, and displaying the 2nd whilst updating the 1st, for nice smooth changes. Off the top of my head, this is probably necessary with sound too, unless the sound play routine takes this into consideration, for nice smooth changes.
Edited January 30, 2015 by ac.tomo
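The delta step with an escape byte from "note 1" in the quote can be sketched as follows. This is a Python illustration under stated assumptions, not the quoted poster's implementation: the start value of 0 for the "previous" register content and the function names are invented for the example.

```python
ESCAPE = 0x80  # -128 as a signed byte, reserved as the escape marker


def delta_encode(stream: bytes, start: int = 0) -> bytes:
    """Store new - old as a signed byte; fall back to ESCAPE + raw value."""
    out = bytearray()
    prev = start
    for value in stream:
        diff = value - prev
        if -127 <= diff <= 127:
            out.append(diff & 0xFF)        # two's-complement signed byte
        else:
            out += bytes([ESCAPE, value])  # step too large: send the raw value
        prev = value
    return bytes(out)


def delta_decode(packed: bytes, start: int = 0) -> bytes:
    """Inverse of delta_encode."""
    out = bytearray()
    prev = start
    i = 0
    while i < len(packed):
        code = packed[i]
        if code == ESCAPE:
            prev = packed[i + 1]
            i += 2
        else:
            prev += code - 256 if code > 127 else code
            i += 1
        out.append(prev)
    return bytes(out)


audf = bytes([10, 11, 12, 11, 200, 0, 255])  # made-up AUDF-like values
packed = delta_encode(audf)
print(packed.hex(" "))
```

Small up/down steps become a handful of distinct byte values, which is what would make the Huffman analysis in steps b) and c) of the quote, or a following RLE pass, more effective.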
Heaven/TQA (Author) Posted January 30, 2015
Basically, what we are trying to do is, instead of using complex music routines, simply use a stream of data which gets fed into the POKEY. Look, RMT creates music with its own patterns and algorithms like instruments etc., but at the end of the day it simply writes values into the 9 POKEY registers 50 or 60 times per second. Now... imagine we record this stream into memory, and instead of running the complex music routines we just store the already recorded ("precalced") values into the POKEY registers. Normally precalculating happens with gfx. Think of sprite animations which get stored in frames. We do the same for audio data. An example of audio data sampling is digitized music. Now... writing data into the 9 audio registers of POKEY 50 times per second costs a lot of memory for storage... in my example nearly 27 KB for 40 sec. Now, when you look into the stream, you recognize that you can pack the data stream to make it shorter. The DLI example was to show how many CPU cycles we save by using a stream instead of the RMT replay routines.
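As a rough model of what the streamer does, the following Python sketch replays a recorded dump one frame at a time. The register names follow the POKEY hardware address order ($D200 to $D208); that the dump stores them in exactly this order is an assumption, and `frames`, `play` and the `write` callback are hypothetical names for illustration.

```python
# POKEY audio registers at $D200-$D208, in hardware address order.
REGISTERS = ("AUDF1", "AUDC1", "AUDF2", "AUDC2",
             "AUDF3", "AUDC3", "AUDF4", "AUDC4", "AUDCTL")


def frames(dump: bytes):
    """Yield one {register: value} dict per recorded frame."""
    n = len(REGISTERS)
    for offset in range(0, len(dump) - n + 1, n):
        yield dict(zip(REGISTERS, dump[offset:offset + n]))


def play(dump: bytes, write) -> int:
    """Replay the dump: one write(name, value) call per register per frame."""
    count = 0
    for frame in frames(dump):
        for name, value in frame.items():
            write(name, value)
        count += 1
    return count


log = []
played = play(bytes(range(18)), lambda name, value: log.append((name, value)))
print(played, log[:3])
```

On the real machine the `write` callback corresponds to a plain store into the register, done once per frame from the VBI or a DLI, which is why the replay cost is so small compared with a full tracker routine.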
ac.tomo Posted January 30, 2015
Ah, right, I got you. In my opinion, this way obviously reduces CPU time but uses memory excessively. This is usually the method for DIGItized SamPLes. Perhaps an extra variable could be included in order to know how many frames to wait before re-doing the AUDio registers.
Heaven/TQA (Author) Posted January 30, 2015
And to reduce the memory needed we can pack the data... And there are several ways, like run-length encoding ("RLE": instead of storing 100x byte 0, why not store only 100,0?), delta encoding (storing only the +/- difference to the previous value and packing that cleverly), or more complex stuff like LZ4.
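The "100,0" example can be written out as a small count/value run-length coder. This is a minimal Python sketch; the pair layout (count first, then value) and the cap of 255 per run are assumptions chosen so that each count fits in one byte.

```python
def rle_encode(data: bytes) -> bytes:
    """Pack data as (count, value) pairs, with counts from 1 to 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(packed), 2):
        out += bytes([packed[i + 1]]) * packed[i]
    return bytes(out)


print(list(rle_encode(bytes(100))))           # 100 zero bytes
print(len(rle_encode(bytes([1, 2, 3, 4]))))   # no runs at all
```

The second call shows the weakness of plain pairs: a stream without repeats doubles in size, which is one reason to choose the packer per register.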
Heaven/TQA (Author) Posted January 30, 2015
Regarding why we're doing this: to say it with TBBT... because we can... maybe because I want the CPU to render gfx instead of spending too much time on playing music.
1NG Posted January 30, 2015
When I put my first RMT in the source code I thought that it was using a 50 Hz update frequency, but it didn't! Streamwise, doing the music at 50 Hz would reduce the quality of RMT music a bit. But of course playing at a constant 50 Hz has a lot of advantages. For example, it can be done in constant time in a raster program.
Creature XL Posted January 31, 2015
The stream method would work with 100 Hz, 200 Hz and all other frequencies. It's just a matter of memory. FYI, for a quick and not really useful test I ZIPped the dat file from Heaven: from 27k to 736 bytes. I don't know how many of those bytes are gzip header. WAIT... maybe not so useless. Maybe a kind of double buffer would work: decoding the next 5 seconds with inflate while playing. I guess Heaven (or someone else but me) needs to produce dump data for a more complex song, which might not pack that well.
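The double-buffer idea can be tried out on a PC before writing any Atari code. The sketch below uses Python's standard `zlib` module as a stand-in for an inflate routine and decompresses the stream in bounded chunks, the way a player could refill one buffer while the other is being played. The data is synthetic, and the chunk size of 450 bytes (9 registers x 50 frames, one second at 50 Hz) is an assumption.

```python
import zlib


def inflate_chunks(packed: bytes, chunk_size: int):
    """Yield decompressed data in pieces of at most chunk_size bytes."""
    d = zlib.decompressobj()
    pending = packed
    while pending:
        block = d.decompress(pending, chunk_size)  # output capped at chunk_size
        pending = d.unconsumed_tail                # input not yet processed
        if block:
            yield block
    rest = d.flush()                               # anything still buffered
    for i in range(0, len(rest), chunk_size):
        yield rest[i:i + chunk_size]


# Synthetic stand-in for a dump: values that change only once per second.
raw = bytes((i // 450) % 16 for i in range(9000))
packed = zlib.compress(raw, 9)
chunks = list(inflate_chunks(packed, 450))
print(len(raw), len(packed), len(chunks))
```

How well a real dump packs, and whether an 8-bit inflate can keep up within the frame budget, still has to be measured on actual song data.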
Heaven/TQA (Author) Posted January 31, 2015
Creature... That sounds promising... You can simply export another song and alter the addresses in my source. Not a big deal. On my list is trying LZ4 for each sound channel stream, as LZ4 is usable for stream depacking... (not sure if it's actually faster, though).
Creature XL Posted January 31, 2015
Quote: You can simply export another song and alter the addresses in my source. Not a big deal.
Too much trouble for a quickie. I have no MADS and no 7z on my Windows machine, so I pass and wait for your experiments.
EDIT: I'd rather try out the ring-buffer thingy. In the end I would prefer that for ease of use.
Edited January 31, 2015 by Creature XL
Tezz Posted January 31, 2015
The crux of the matter for this to be most useful comes down to compression. What I was intending to do was to compress the individual patterns into a list of streams and reconstruct the song from them, so I would have a very simple, minimal player with hopefully not too large compressed streams. When it's been explored we'll have a better idea of what the trade-off of memory for performance will be compared to the play routines, frequency tables and song data.
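One possible reading of the "list of streams" approach is to store each distinct block of register data once and keep a song order list that refers to the blocks. The Python sketch below is only an interpretation: fixed-length blocks stand in for tracker patterns, and `pack_blocks`, `unpack_blocks` and the block length are hypothetical.

```python
def pack_blocks(stream: bytes, block_len: int):
    """Split a stream into blocks, keep each distinct block once, record the order."""
    blocks = []   # unique blocks, in order of first appearance
    index = {}    # block contents -> position in `blocks`
    order = []    # song order: one block number per slot
    for offset in range(0, len(stream), block_len):
        block = stream[offset:offset + block_len]
        if block not in index:
            index[block] = len(blocks)
            blocks.append(block)
        order.append(index[block])
    return blocks, order


def unpack_blocks(blocks, order) -> bytes:
    """Rebuild the original stream from the block list and the order list."""
    return b"".join(blocks[n] for n in order)


stream = b"ABCD" * 3 + b"WXYZ" + b"ABCD"   # made-up data with a repeated pattern
blocks, order = pack_blocks(stream, 4)
print(blocks, order)
```

Each unique block could then be packed individually with RLE, delta coding or LZ4, so a repeated pattern costs only one entry in the order list.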
emkay Posted January 31, 2015
Quote: When it's been explored we'll have a better idea of what the trade off of memory for performance will be compared to the play routines, frequency tables and song data.
What to explore? The A8 has no math co-processor, so any calculation means lost CPU cycles. It's as I wrote above: use a "runtime" that only acts on changes and does nothing when no change is needed. One VBI cycle of difference when starting an instrument on different channels is the way to go. Also, no automated features such as portamento or vibrato are allowed. For something like this, you need some update every 20-40 VBI cycles...
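A runtime that "only acts on changes" can be modelled as an event list: for each frame, store only the registers whose value differs from the previous frame. This Python sketch is an illustration under assumptions; the event tuple layout (frame number, register index, value) and the function names are invented here.

```python
def encode_changes(frames, registers=9):
    """Turn full frames into (frame_no, register, value) events for changes only."""
    state = [None] * registers   # None forces every register out on frame 0
    events = []
    for frame_no, frame in enumerate(frames):
        for reg, value in enumerate(frame):
            if state[reg] != value:
                events.append((frame_no, reg, value))
                state[reg] = value
    return events


def decode_changes(events, frame_count, registers=9):
    """Rebuild the full frames by applying the events frame by frame."""
    state = [0] * registers
    frames = []
    pos = 0
    for frame_no in range(frame_count):
        while pos < len(events) and events[pos][0] == frame_no:
            _, reg, value = events[pos]
            state[reg] = value
            pos += 1
        frames.append(tuple(state))
    return frames


song = [(1, 2, 3), (1, 2, 3), (1, 5, 3)]   # three frames of a 3-register toy chip
events = encode_changes(song, registers=3)
print(events)
```

A frame without events costs the player almost nothing, which matches the idea of doing no work when nothing changes; instruments with vibrato or portamento would generate an event on nearly every frame and lose that benefit.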
Tezz Posted January 31, 2015
Quote: What to explore? The A8 has no math Co-Processor. So any calculation means lost CPU cycles.
Perhaps you misunderstood what I wrote; I meant with regard to compression and memory usage.
Heaven/TQA (Author) Posted January 31, 2015
Here is an example of packing channel 1 of horror.rmt with LZ4... it's a little harder than I thought... I need to alter Fox's unlz4... esp. the store_byte... and I have no time. Basically, my idea is to alter the depacker to write directly into the POKEY register... so we get 9 depack routines, one for each POKEY register. Not sure if that would be any faster... I guess Tezz's way might be the way to go.

        .proc lz4_depacker
unlz4
        jsr lz4_GET_BYTE        ; length of literals
        sta token
        lsr
        lsr
        lsr
        lsr
        beq read_offset         ; there is no literal
        cmp #$0f
        jsr getlength
literals
        jsr lz4_GET_BYTE
        jsr store
        bne literals
read_offset
        jsr lz4_GET_BYTE
        tay
        sec
        eor #$ff
        adc lz4_dest
        sta lz4_src
        tya
        php
        jsr lz4_GET_BYTE
        plp
        bne not_done
        tay
        beq unlz4_done
not_done
        eor #$ff
        adc lz4_dest+1
        sta lz4_src+1
        ; c=1
        lda #$ff
token   equ *-1
        and #$0f
        adc #$03                ; 3+1=4
        cmp #$13
        jsr getLength
@       lda $ffff
lz4_src equ *-2
        inc lz4_src
        bne @+
        inc lz4_src+1
@       jsr store
        bne @-1
        beq unlz4               ; always
store
        sta $ffff
lz4_dest equ *-2
        sta $d200               ; also write the byte to POKEY ($D200 = AUDF1)
        inc lz4_dest
        bne @+
        inc lz4_dest+1
@       dec lenL
        bne @+
        dec lenH
@
unlz4_done
        rts
getLength_next
        jsr lz4_GET_BYTE
        tay
        clc
        adc #$00
lenL    equ *-1
        bcc @+
        inc lenH
@       iny
getLength
        sta lenL
        beq getLength_next
        tay
        beq @+
        inc lenH
@       rts
lenH    .byte $00
lz4_get_byte
        lda $ffff
lz4_source equ *-2
        inc lz4_source
        bne @+
        inc lz4_source+1
@       rts
        .endp

Edited January 31, 2015 by Heaven/TQA
Tezz Posted January 31, 2015
very promising
Heaven/TQA (Author) Posted January 31, 2015
Not sure yet... I have no time and passion today. The above LZ code does not work... just for the record.