Question: Digitized Sound on Atari 2600


26 replies to this topic

#1 chjmartin2 OFFLINE  

chjmartin2

    Moonsweeper

  • 322 posts
  • Location:Massachusetts

Posted Sun Jun 26, 2011 10:18 AM

Has anyone been able to get the Atari 2600 to play digitized sound?

#2 SpiceWare OFFLINE  

SpiceWare

    Draconian

  • 12,158 posts
  • Medieval Mayhem
  • Location:Planet Houston

Posted Sun Jun 26, 2011 10:40 AM

Quadrun


Open Sesame (seconds 2-4, it's a bit hard to understand)


Berzerk, Voice Enhanced


This is my test for Frantic


Frantic's going to be an updated version of Berzerk/Frenzy, complete with 64K room layouts just like the arcade! You can follow the development in my blog entries, and I'll start posting builds in the homebrew forum once it's playable.

Titlescreen (48 pixels with 2 colors per scanline without flicker!)
frantic_harmony.bin_38.png

Berzerk style maze
frantic_harmony.bin_36.png

Frenzy style maze
frantic_harmony.bin_35.png

Edited by SpiceWare, Sun Jun 26, 2011 11:02 AM.


#3 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • 211 posts

Posted Sun Jun 26, 2011 1:35 PM

I've given this problem some thought as well, and there are a couple of ideas:

The simplest is to set AUDC0 (or 1) to zero and put raw 4-bit PCM values in AUDV0 at your sampling rate. You can bump the resolution to 5-bit by using both channels.
You need to update AUDV0 at regular intervals with this method, which is probably going to be tricky. If you only update it in your display kernel you'll get a ring modulator effect (think Dalek) since you're not updating it during VBLANK. Free audio effect or annoying distortion depending on how you look at it :)
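
A minimal sketch of the quantization step this method implies, written in Python because it would run offline on a PC rather than on the 2600. It assumes signed 16-bit input samples; the function name `to_tia_level` is made up for illustration and is not from any existing tool.

```python
def to_tia_level(sample16, levels=16):
    """Map a signed 16-bit PCM sample to an unsigned level in 0..levels-1.

    levels=16 matches a single AUDV register (0..15); levels=31 matches
    the combined range of both channels (0..30).
    """
    if not -32768 <= sample16 <= 32767:
        raise ValueError("expected a signed 16-bit sample")
    # Shift to unsigned 0..65535, then scale down to the target range.
    return (sample16 + 32768) * levels // 65536


print([to_tia_level(s) for s in (-32768, 0, 32767)])
print([to_tia_level(s, levels=31) for s in (-32768, 0, 32767)])
```

Plain truncation like this is the simplest choice; rounding or noise shaping would be separate decisions.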

The method I'd probably shoot for is some kind of vector quantizer. Build a big codebook where each entry corresponds to the spectrum of every combination of the audio registers in one or both channels. Then just do a brute force search for the spectrum of the audio for each frame of your effect. I'd probably stick to one channel in order to avoid having to worry about phase.

For voice you can probably get away with something very simple. If you accompany it with text just getting the volume right with some white noise could suffice.
In the demo world, this is what (core) seems to do - see http://pouet.net/prod.php?which=30236 . Another example that seems to be using a similar technique is Robotic Liberation (http://pouet.net/prod.php?which=10626), though that is on a VIC-20.

#4 SeaGtGruff OFFLINE  

SeaGtGruff

    Quadrunner

  • 5,558 posts
  • Location:Georgia, USA

Posted Sun Jun 26, 2011 6:28 PM

The simplest is to set AUDC0 (or 1) to zero and put raw 4-bit PCM values in AUDV0 at your sampling rate. You can bump the resolution to 5-bit by using both channels.

Thanks! I hadn't thought of using both channels to double the effective amplitude. But wouldn't the maximum effective amplitude be 30? So it's just a tad shy of being true 5-bit sound, more like 2^5-1 sound.

You need to update AUDV0 at regular intervals with this method, which is probably going to be tricky. If you only update it in your display kernel you'll get a ring modulator effect (think Dalek) since you're not updating it during VBLANK.

I think you should (ideally) update on every line with this method, even during VBLANK and VSYNC. And if possible, updating twice per line would be better than updating once per line.

Also, for higher frequencies I think it's necessary to cherry-pick your samples to ensure a better reproduction. I've been researching TIA sound-- and sound/music/synthesis in general-- for a while now, so I've been thinking about the problem of higher frequencies. True, you might not need to worry about that for recreating speech. But suppose you want to be able to output every MIDI note from 0 (C-2) to 127 (G8). Since 60 (C3) is referred to as "middle C" in MIDI, I shift the numbers up an octave-- 0=C-1, 60=C4, and 127=G9. G9 has a frequency of 12543.85Hz, which is almost equal to the scan line rate of 15699.89Hz (based on the 2600's crystal rate of 3579575Hz). So unless you output a sample every half-line, there's really no way to produce G9. If you divide the scan line frequency by G9's frequency, you get about 1.25, so you need to output 4 cycles (of sine waves, square waves, or whatever) for every 5 scan lines (1.25=5/4). Turning it around, 1 scan line correlates to 0.8 (4/5) of G9's frequency cycle.

If you sample once per scan line, you get

0.0
0.8
1.6=0.6
2.4=0.4
3.2=0.2
4.0=0.0

where the values listed above are the fraction of a cycle. If you translate those fractions into indices for a lookup table-- whether it's 16 values per cycle, or 32, or 256-- the values you pull from the table aren't going to produce a G9 note.

Even if you sample twice per scan line, you get

0.0
0.4
0.8
1.2=0.2
1.6=0.6
2.0=0.0
2.4=0.4
2.8=0.8
3.2=0.2
3.6=0.6
4.0=0.0

You do get a better digital reproduction of the G9 sine wave, but you still don't get a good G9 note, because none of the values go up or down to the full amplitude, and the amplitudes for each cycle would be different.
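
The two lists above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic (4 cycles spread over 5 scan lines); the function name `phase_fractions` is hypothetical, and exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction


def phase_fractions(cycles, lines, samples_per_line):
    """Fraction of a wave cycle reached at each sample instant.

    `cycles` wave cycles span `lines` scan lines. The final entry is the
    start of the next repetition, as in the lists above.
    """
    count = lines * samples_per_line
    step = Fraction(cycles, count)  # cycles advanced per sample
    return [(k * step) % 1 for k in range(count + 1)]


once = phase_fractions(4, 5, 1)
twice = phase_fractions(4, 5, 2)
print([float(p) for p in once])
print([float(p) for p in twice])
```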

What you really want to do is cherry-pick the samples so each cycle has a high value and a low value equal to the amplitude, like this:

15
0
15
0
15
0
etc.

except you'd have an extra halfway value (7 or 8) in certain spots between 15 and 0, or between 0 and 15.

Voice reproduction is different-- in some ways maybe easier-- because the exact frequencies aren't as important as for musical notes, and voices have lower frequencies than high musical notes anyway. Check out SpiceWare's Blog to see what he's been doing for "Frantic." :)

Michael

Edited by SeaGtGruff, Sun Jun 26, 2011 6:29 PM.


#5 chjmartin2 OFFLINE  

chjmartin2

    Moonsweeper

  • Topic Starter
  • 322 posts
  • Location:Massachusetts

Posted Mon Jun 27, 2011 6:33 AM

http://www.msx.org/P...wspost3520.html

It would seem to me that a Viterbi search would be your best bet to get the highest fidelity sound. You could utilize all of the channels. You may have to use the entire 2600 to get it right, but it would be a cool demo to get max sound quality. I wonder if the MSX source code would help at all? What is the max frequency for a 4 bit implementation now?

#6 SeaGtGruff OFFLINE  

SeaGtGruff

    Quadrunner

  • 5,558 posts
  • Location:Georgia, USA

Posted Mon Jun 27, 2011 8:50 AM

What is the max frequency for a 4 bit implementation now?

If by "frequency" you mean the number of cycles per second for a sound wave, the scan line rate is about 15700 Hz, so if you update the AUDV registers once per scan line, you get a sample rate of about 15700 Hz. Of course, each sample is really just one half of a full cycle-- at least if you're talking about a sine wave or a square wave for a pure note of music, since to get a complete wave you need at least two values (high and low)-- so the maximum frequency for a pure note would be about 7850 Hz. For voice or multi-note sound, where you have a bunch of different sine waves merged together to create a series of up-and-down values that follow no clearly identifiable sine wave shape, the relevant figure is the sample rate itself, about 15700 samples per second.

If you update the AUDV registers twice per scan line, the frequency would be about 31400 Hz. That's actually the TIA's "native" sound frequency (we're talking strictly NTSC here, since PAL and SECAM have a slightly lower frequency). The TIA generates an A-Phi-1 and an A-Phi-2 audio clock pulse twice per scan line. Their timing is kind of funky-- 9, 10, 18, 20, where those values represent a multiple of 1 "horizontal count" or 4 color clocks, so it's 36, 40, 72, 80 color clocks from A-Phi-1 to A-Phi-2, from A-Phi-2 to A-Phi-1, from A-Phi-1 to A-Phi-2, and from A-Phi-2 to A-Phi-1. However, I believe (but am not certain) that the important thing is the transition from A-Phi-1 to A-Phi-2, in which case the timing is 28, 29 horizontal counts (or 112, 116 color clocks), so if you think of those as two consecutive samples, the odd samples are ever so slightly shorter than the even samples, but average to 114 color clocks each. The TIA takes these audio clock pulses-- with their base frequency of about 31400 Hz-- and divides the frequency by a number from 1 to 32 based on the AUDF setting, then feeds the resulting pulses through a process that generates different kinds of sounds (most of them "noise," although some of them are regular pulse waves) based on the AUDC setting. Finally, the AUDV setting is applied to the resulting pattern of 0 and 1 values to get a wave of the desired amplitude from 0 to 15.

But for digitized sound, you don't want to use the TIA's sound generation, you really want to set AUDC to 0 and modulate AUDV to create your own waves, as Tjoppen said. Setting AUDC to 0 will output a steady stream of 1s, so you can use AUDV to change the amplitude of each sample from 0 to 15 (or, if you combine both channels as Tjoppen suggested, from 0 to 30). There are 76 cycles per scan line, so if you don't care about drawing the video display and just want to play sound, you can update AUDV as often as your code allows-- twice per line (31400 Hz), or three times per line (47100 Hz), or four times per line (62800 Hz), or whatever you can manage. But if you're drawing a display in addition to playing sound, the complexity of the display will affect how many times you can update AUDV. For example, if you're making a lot of graphics changes during the line, you might not have time to update AUDV more than once or twice per line. But if you're doing simplistic or minimal graphics, you might be able to update AUDV three or more times. The maximum number of times will depend on how long it takes you to load each sample and store it to AUDV, not to mention any incrementing, decrementing, addition, subtraction, or branching that needs to be done, so you'd probably be lucky to squeeze in three updates at most. If you're using the DPC+ chip, and set up your sample data so you can load it with an immediate-mode instruction, you should be able to squeeze in several updates-- again, depending on your code, and whether you're drawing a display.
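
The rates quoted above follow directly from the crystal rate and the 228 color clocks per scan line. A rough Python sketch of that arithmetic, using the crystal figure given earlier in the thread; the function names are hypothetical, and the cycle budget is an upper bound that ignores whatever else the kernel has to do.

```python
CRYSTAL_HZ = 3579575         # crystal rate quoted earlier in the thread
COLOR_CLOCKS_PER_LINE = 228  # color clocks per scan line
CPU_CYCLES_PER_LINE = 76     # CPU cycles per scan line

LINE_RATE_HZ = CRYSTAL_HZ / COLOR_CLOCKS_PER_LINE


def playback_rate(updates_per_line):
    """Sample rate in Hz when AUDV is written this many times per line."""
    return LINE_RATE_HZ * updates_per_line


def cycles_per_update(updates_per_line):
    """CPU cycles available between evenly spaced AUDV writes."""
    return CPU_CYCLES_PER_LINE / updates_per_line


for n in (1, 2, 3, 4):
    rate = playback_rate(n)
    # updates per line, sample rate, highest pure tone, cycle budget
    print(n, round(rate), round(rate / 2), round(cycles_per_update(n), 1))
```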

Michael

#7 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • 211 posts

Posted Mon Jun 27, 2011 11:36 AM

The simplest is to set AUDC0 (or 1) to zero and put raw 4-bit PCM values in AUDV0 at your sampling rate. You can bump the resolution to 5-bit by using both channels.

Thanks! I hadn't thought of using both channels to double the effective amplitude. But wouldn't the maximum effective amplitude be 30? So it's just a tad shy of being true 5-bit sound, more like 2^5-1 sound.

Yes, of course. It comes to log2(31) = 4.95 bit which you might as well round up to 5-bit :)

You need to update AUDV0 at regular intervals with this method, which is probably going to be tricky. If you only update it in your display kernel you'll get a ring modulator effect (think Dalek) since you're not updating it during VBLANK.

I think you should (ideally) update on every line with this method, even during VBLANK and VSYNC. And if possible, updating twice per line would be better than updating once per line.

Yes, this is probably doable for a demo. For a game, less so, except maybe a static screen.
Regarding near exact reproduction of certain notes, you could always "not care" like lft suggested in his talk at Revision this year (Elements of Chip Music, VCS mentioned in the middle somewhere).

Voice reproduction is different-- in some ways maybe easier-- because the exact frequencies aren't as important as for musical notes, and voices have lower frequencies than high musical notes anyway. Check out SpiceWare's Blog to see what he's been doing for "Frantic." :)

I already looked a bit, but now I'll have to take a closer look.

#8 GroovyBee OFFLINE  

GroovyBee

    Games Developer

  • 9,800 posts
  • Busy bee!
  • Location:North, England

Posted Mon Jun 27, 2011 11:42 AM

There is also speech on the 7800 in the game Jinks.

#9 SeaGtGruff OFFLINE  

SeaGtGruff

    Quadrunner

  • 5,558 posts
  • Location:Georgia, USA

Posted Mon Jun 27, 2011 12:41 PM

I think you should (ideally) update on every line with this method, even during VBLANK and VSYNC. And if possible, updating twice per line would be better than updating once per line.

Yes, this is probably doable for a demo. For a game, less so, except maybe a static screen.
Regarding near exact reproduction of certain notes, you could always "not care" like lft suggested in his talk at Revision this year (Elements of Chip Music, VCS mentioned in the middle somewhere).

As far as updating two or more times per line, I was thinking more in terms of a music-only or music-mostly program, like a 2600 soft synth (which is actually what I've been interested in lately), or else a program that uses DPC+ and can update the TIA registers more quickly.

Thanks for posting the video link; I haven't watched it yet, but it sounds like exactly the kind of thing I'm interested in right now!

As far as reproduction of musical notes, I don't think it's necessary to be "near-exact," because as I understand it, the "just-noticeable difference" might be anywhere from 6 cents to 10 cents, so I figure that as long as a note is within 3 cents sharp or flat, no two notes will be out of tune from each other by more than 6 cents, which should sound fine to most people. Obviously we'd like to be as exact as possible-- if it's feasible-- but there's probably no need to obsess over it.

What I was saying about the higher frequencies might not even come up in most situations, because a piano keyboard ends at C8, which has a frequency about a third of G9 (which is the highest MIDI note). So even though G9 would require updating AUDV at least twice per line, C8 doesn't-- although what I said about "cherry-picking" the samples would probably still be true for C8 (if you're updating only once per line). I did a spreadsheet last night for G9 and made four charts to illustrate, so I'll post them tonight.

Anyway, I don't know if many pieces of music even go up to C8 and beyond; most music might not even need to go as high as C7. So if you don't use any of the notes in the "nosebleed section," updating AUDV once a line should be adequate, and the need for cherry-picking the samples probably wouldn't apply. The only reason I've spent any time thinking about notes as high as G9 is because I'm interested in the feasibility of having "full" MIDI capability with the 2600-- at least in terms of being able to play MIDI notes 0 through 127. The lower frequencies are no problem, whereas the higher frequencies represent the "worst case scenario"! :)

Michael

#10 SeaGtGruff OFFLINE  

SeaGtGruff

    Quadrunner

  • 5,558 posts
  • Location:Georgia, USA

Posted Mon Jun 27, 2011 7:03 PM

Here are four graphs that show what the waveform for G9 looks like depending on how "frequent" your samples are. Actually, the waveforms aren't for G9 per se, but for a frequency that's close to G9's frequency:

3579575 Hz / 228 = ca. 15699.89 Hz (the 2600's crystal rate divided by the number of color clocks per scan line, which equals the scan line rate)
G9 = ca. 12543.85 Hz
15699.89 Hz / 12543.85 Hz = ca. 1.25 = ca. 5 / 4 (about 4 cycles every 5 scan lines)
15699.89 Hz * 4 / 5 = ca. 12559.91 Hz (about +2.21 cents from G9, well within the "just-noticeable difference")
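
The cents figure can be checked with the usual formula, 1200 times the base-2 logarithm of the frequency ratio. A small Python sketch; the G9 value is the rounded figure from this post, so the last digit of the result may differ slightly from the 2.21 quoted above.

```python
import math


def cents(freq_hz, reference_hz):
    """Interval from reference_hz up to freq_hz, in cents."""
    return 1200 * math.log2(freq_hz / reference_hz)


LINE_RATE_HZ = 3579575 / 228
G9_HZ = 12543.85  # rounded value from the post

approx_g9 = LINE_RATE_HZ * 4 / 5
print(round(approx_g9, 2), cents(approx_g9, G9_HZ))
```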

The graph in the upper left corner shows 4 cycles of a sine wave spanning 5 scan lines, with 16 samples per scan line (a total of 80 samples, plus 1 extra sample). The shape of the sine wave is clearly identifiable.

Going clockwise from there, the graph in the upper right corner shows what we get if we sample the wave once per scan line (5 samples in all, plus the start of 1 extra sample). The actual result will vary depending on where we begin sampling, but the overall effect should still be the same-- we get a wave with only 1 peak and 1 valley, so we end up with a note having 1/4th the frequency of G9, or 2 octaves below G9 (i.e., G7).
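
The two-octave drop is ordinary aliasing: a tone above half the sample rate folds back below it. A short Python sketch of the fold, assuming ideal sampling; `alias_frequency` is a hypothetical helper name.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency after sampling, folded into 0..sample_rate/2."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


LINE_RATE_HZ = 3579575 / 228
tone = LINE_RATE_HZ * 4 / 5  # the near-G9 tone, 4 cycles per 5 lines

heard = alias_frequency(tone, LINE_RATE_HZ)
print(round(heard, 2), round(tone / heard, 3))
```

With one sample per scan line the result is one fifth of the line rate, which is a quarter of the original tone.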

Continuing clockwise, the graph in the lower right corner shows what we get if we sample the wave twice per scan line (10 samples in all, plus the start of 1 extra sample). Now we get 4 peaks and 4 valleys, but some of the peaks and valleys don't extend for the full amplitude of the original wave.

Finally, the graph in the lower left corner shows what we get if we cherry-pick samples of the wave twice per scan line. This is done by analyzing the range covered by each sample. If a peak or a valley falls within the range covered by the sample, we take the peak or valley as the sample. If no peak and no valley fall within the range covered by the sample, we take the middle value (i.e., the sine of 0 degrees or the sine of 180 degrees-- but adding 7.5 to it and rounding, since we're using 15 for the sine of 90 degrees and 0 as the sine of 270 degrees). The resulting graph looks very similar to the previous one, except each peak and each valley extends for the full amplitude of the original wave.
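
One possible reading of that cherry-picking rule, sketched in Python. It assumes a sine that starts at phase 0, so the peaks sit at a quarter cycle and the valleys at three quarters; the names are hypothetical, and the middle value is taken as 8, though 7 would be just as valid.

```python
import math
from fractions import Fraction

PEAK, VALLEY, MIDDLE = 15, 0, 8  # MIDDLE is 7.5 rounded up


def contains_phase(start, end, phase):
    """True if some point n + phase (n an integer) lies in [start, end)."""
    n = math.ceil(start - phase)  # smallest n with n + phase >= start
    return n + phase < end


def cherry_pick(cycles, lines, samples_per_line):
    """Pick one 4-bit value per sample interval, favoring peaks and valleys."""
    count = lines * samples_per_line
    step = Fraction(cycles, count)  # cycles covered by one sample
    picked = []
    for k in range(count):
        start, end = k * step, (k + 1) * step
        if contains_phase(start, end, Fraction(1, 4)):    # sine of 90 degrees
            picked.append(PEAK)
        elif contains_phase(start, end, Fraction(3, 4)):  # sine of 270 degrees
            picked.append(VALLEY)
        else:
            picked.append(MIDDLE)
    return picked


print(cherry_pick(4, 5, 2))
```

For intervals longer than half a cycle a single sample could cover both a peak and a valley; this sketch simply prefers the peak in that case.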

I've created four WAV files to show how each of these sounds. However, I've used 8 samples per scan line for the first WAV file-- which still closely conforms to the shape of the 16-sample wave-- because 16 samples per scan line produced a sound that was more difficult to hear.

The second WAV file is for 1 sample per scan line. It's the "same" note, but it's noticeably lower (shifted downward 2 octaves).

The third WAV file is for 2 samples per scan line. It has the higher pitch as expected, but it sounds quieter because of the reduction in its average amplitude.

The fourth WAV file is for 2 cherry-picked samples per scan line. It has the same higher pitch as the third file, but is noticeably louder.

WARNING: Listening to pure sine waves can damage your hearing much more quickly than listening to mixtures of different sine waves (which includes sawtooth, square, and triangle waves, since they can be constructed by combining multiple sine waves of different harmonics). It is advised that you do NOT listen repeatedly to the first WAV file.

Now that I see (hear) just how dang high G9 is, I'm thinking there's really no good reason for trying to duplicate the full range of MIDI notes, since G9 is so piercingly high, and C-1 is below the threshold of most humans' hearing anyway. I'm thinking it's plenty good enough to duplicate the range of a standard 88-key piano, or from A0 to C8. Since C8 is ca. 4186.01 Hz, that means 1 sample per scan line should be plenty good enough for musical purposes.
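
A quick Python check of those note frequencies against the one-sample-per-line limit. It assumes equal temperament with A4 = 440 Hz at MIDI note 69, and uses the octave numbering from earlier in the thread (60 = C4, so 108 = C8 and 127 = G9). Half the sample rate is only the theoretical ceiling; notes close to it would still suffer the uneven amplitudes described above.

```python
def midi_to_hz(note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note, with note 69 = A4."""
    return a4_hz * 2 ** ((note - 69) / 12)


LINE_RATE_HZ = 3579575 / 228
NYQUIST_HZ = LINE_RATE_HZ / 2  # limit for one sample per scan line

highest = max(n for n in range(128) if midi_to_hz(n) < NYQUIST_HZ)
print(round(midi_to_hz(108), 2), round(midi_to_hz(127), 2), highest)
```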

As for playing digitized sound, I think the best method (the one requiring the fewest CPU cycles per scan line) would be to use the DPC+ chip and set up multiple queues, with up to 256 values (samples) in each queue. For best results, use 5-bit samples as suggested by Tjoppen, sampled at a frequency of 15700 Hz. For example, if the digitized sound data takes up 2000 bytes, you could set up 10 queues, each having 200 values. Then the kernel would read from each queue sequentially, one queue per scan line-- LDA #queue1, STA AUDV0, LDA #queue2, STA AUDV0, LDA #queue3, etc. If the value read from the queue is 15 or less, store it in AUDV0 and set AUDV1 to 0. If the value read from the queue is greater than 15, store it in AUDV0 and set AUDV1 to 15. To make things work out nicer for the kernel, it would probably be best to use a set number of queues-- something that divides evenly into 262, or 260, or 264, or however many scan lines are drawn by the kernel.

Michael


Attached Thumbnails

  • G9.gif

Attached Files



#11 chjmartin2 OFFLINE  

chjmartin2

    Moonsweeper

  • Topic Starter
  • 322 posts
  • Location:Massachusetts

Posted Mon Jun 27, 2011 9:11 PM

As for playing digitized sound, I think the best method (the one requiring the fewest CPU cycles per scan line) would be to use the DPC+ chip and set up multiple queues, with up to 256 values (samples) in each queue. For best results, use 5-bit samples as suggested by Tjoppen, sampled at a frequency of 15700 Hz. For example, if the digitized sound data takes up 2000 bytes, you could set up 10 queues, each having 200 values. Then the kernel would read from each queue sequentially, one queue per scan line-- LDA #queue1, STA AUDV0, LDA #queue2, STA AUDV0, LDA #queue3, etc. If the value read from the queue is 15 or less, store it in AUDV0 and set AUDV1 to 0. If the value read from the queue is greater than 15, store it in AUDV0 and set AUDV1 to 15. To make things work out nicer for the kernel, it would probably be best to use a set number of queues-- something that divides evenly into 262, or 260, or 264, or however many scan lines are drawn by the kernel.

Michael


I am just thinking, but there has to be some way to utilize the built-in functionality of the TIA to go ahead and reconstruct a waveform. You could take each possible output for a given rate and then calculate the wave response and compare that to your target waveform. You could also expand that to the frequency domain over some bit-rate window. You would have to write a player first, one that just sent patterned data to both channels to calculate the maximum rate, then you would use that rate to determine how much time you could play back. If you used bankswitching, then there is some max, which I don't know. (Not that good on Atari 2600, well, er, anything really.) So you have so much data, say 128 KB, and that equals 8 seconds, now you know your bit rate. Then you could figure out which combination of register values comes closest to reproducing your target waveform, by evaluating the waveform over some chosen frame and then evaluating each combination of TIA audio registers in that same period at that same rate.

Crazy?
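
For a sense of scale, the 128 KB and 8 seconds figures are roughly what one byte per scan line would give. A Python sketch of that estimate, assuming one sample per scan line as discussed above; the function name and the packing choices are hypothetical.

```python
LINE_RATE_HZ = 3579575 / 228  # about 15700 samples per second


def playback_seconds(rom_bytes, bits_per_sample, sample_rate_hz=LINE_RATE_HZ):
    """Playing time for sample data packed at the given bit width."""
    samples = rom_bytes * 8 / bits_per_sample
    return samples / sample_rate_hz


rom = 128 * 1024
for bits in (8, 4, 1):  # one byte per sample, packed nibbles, one bit per sample
    print(bits, round(playback_seconds(rom, bits), 1))
```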

#12 batari OFFLINE  

batari

    )66]U('=I;B$*

  • 6,666 posts
  • begin 644 contest

Posted Mon Jun 27, 2011 9:21 PM


I am just thinking, but there has to be some way to utilize the built-in functionality of the TIA to go ahead and reconstruct a waveform. You could take each possible output for a given rate and then calculate the wave response and compare that to your target waveform. You could also expand that to the frequency domain over some bit-rate window. You would have to write a player first, one that just sent patterned data to both channels to calculate the maximum rate, then you would use that rate to determine how much time you could play back. If you used bankswitching, then there is some max, which I don't know. (Not that good on Atari 2600, well, er, anything really.) So you have so much data, say 128 KB, and that equals 8 seconds, now you know your bit rate. Then you could figure out which combination of register values comes closest to reproducing your target waveform, by evaluating the waveform over some chosen frame and then evaluating each combination of TIA audio registers in that same period at that same rate.

Crazy?

Someone tried that. It could replicate some sound effects but it could not replicate any sort of voice.

#13 SeaGtGruff OFFLINE  

SeaGtGruff

    Quadrunner

  • 5,558 posts
  • Location:Georgia, USA

Posted Tue Jun 28, 2011 1:14 PM

If the value read from the queue is 15 or less, store it in AUDV0 and set AUDV1 to 0. If the value read from the queue is greater than 15, store it in AUDV0 and set AUDV1 to 15.

I just realized this isn't quite right. It should be:

If the value read from the queue is greater than 15, increment it (add 1 to it), store it in AUDV0, and set AUDV1 to 15. That means it would be easiest to read from the queue using either the X or Y register:

   LDX #queue    ; 2
   CPX #16       ; 2
   BCS over_15   ; 2++
   LDA #0        ; 2
   JMP store_amp ; 3
over_15
   LDA #15       ; 2
   INX           ; 2
store_amp
   STA AUDV1     ; 3
   STX AUDV0     ; 3
I believe that works out to 17 cycles no matter which path is taken.
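
A Python model of the routine above can confirm both the cycle count and the output. It assumes the volume registers keep only the low 4 bits of what is written (which is why incrementing values above 15 works) and that the taken branch does not cross a page boundary; `split_branch` is a made-up name.

```python
def split_branch(value):
    """Mirror the compare-and-branch routine; returns (AUDV0, AUDV1, cycles)."""
    x = value                # LDX #queue     2 cycles
    cycles = 2 + 2           # plus CPX #16   2 cycles
    if x >= 16:
        cycles += 3          # BCS taken      3 cycles
        a = 15               # LDA #15        2 cycles
        x += 1               # INX            2 cycles
        cycles += 2 + 2
    else:
        cycles += 2          # BCS not taken  2 cycles
        a = 0                # LDA #0         2 cycles
        cycles += 2 + 3      # plus JMP       3 cycles
    cycles += 3 + 3          # STA AUDV1, STX AUDV0
    return x & 0x0F, a, cycles  # registers keep the low 4 bits only


for v in (0, 15, 16, 30):
    print(v, split_branch(v))
```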

Michael

#14 SeaGtGruff OFFLINE  

SeaGtGruff

    Quadrunner

  • 5,558 posts
  • Location:Georgia, USA

Posted Wed Jun 29, 2011 7:43 AM

I don't know why I didn't think of this before, it's only 12 cycles, uses only the accumulator, and requires no branches or jumps:

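   LDA #queue    ; 2  load the 5-bit sample (0-30)
   LSR           ; 2  A = sample / 2, carry = low bit
   STA AUDV0     ; 3
   ADC #0        ; 2  add the carry back in
   STA AUDV1     ; 3  12 cycles total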
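
A Python model of the shift-and-add split, as a sanity check that the two registers always sum to the original sample for values 0 through 30. It assumes the volume registers keep only the low 4 bits; `split_lsr` is a hypothetical name.

```python
def split_lsr(value):
    """Mirror the LSR / ADC #0 routine; returns (AUDV0, AUDV1)."""
    carry = value & 1     # LSR shifts bit 0 into the carry flag
    a = value >> 1        # and leaves A = value // 2
    audv0 = a & 0x0F      # STA AUDV0
    a = a + carry         # ADC #0 adds the carry back in
    audv1 = a & 0x0F      # STA AUDV1
    return audv0, audv1


for v in (0, 1, 15, 16, 30):
    print(v, split_lsr(v))
```

Unlike the branching version, this keeps both channels within one step of each other. A value of 31 would wrap AUDV1 to 0, so the sample data has to stay in the 0 to 30 range.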
Michael




#15 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • 211 posts

Posted Fri Jul 1, 2011 6:30 AM

I don't know why I didn't think of this before, it's only 12 cycles, uses only the accumulator, and requires no branches or jumps:

   LDA #queue
   LSR
   STA AUDV0
   ADC #0
   STA AUDV1
Michael

That's a clever solution - put half in both AUDVx and adjust one of them using the low bit.
My previous ideas were either the same "if (A & 0x10) AUDV1 = 0x0F;" as above, or something like "AUDV1 = -(A >> 4);".
I might steal that snippet for a demo this summer..

I'd also consider dithering, which could increase the quality to something akin to 6- or 7-bit. Haven't done any preliminary calculations for that yet though.

Edited by Tjoppen, Fri Jul 1, 2011 6:31 AM.


#16 SeaGtGruff OFFLINE  

SeaGtGruff

    Quadrunner

  • 5,558 posts
  • Location:Georgia, USA

Posted Fri Jul 1, 2011 11:43 AM

I'd also consider dithering, which could increase the quality to something akin to 6- or 7-bit. Haven't done any preliminary calculations for that yet though.

I don't see how that would work, unless you're playing each sample for an extended period-- such as sampling at a rate of 15700 Hz (once per line), but updating AUDV0 and AUDV1 twice per line, such that the playback rate is 31400 Hz and each sample is played for two playback periods.

If you dither at two times the sample rate, you could presumably get 61 possible values (31 + 30). If you dither at three times the sample rate, you could presumably get 91 possible values (31 + 30 + 30). If you dither at four times the sample rate, you could presumably get 121 possible values (31 + 30 + 30 + 30). But I don't know if the actual results would be a higher-quality sound, or if it would just introduce a bunch of additional harmonics.
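
Those counts can be confirmed by brute force: with n playback periods per sample, each at a level from 0 to 30, the totals run from 0 to 30n. A small Python sketch (`dither_levels` is a made-up name); it only counts distinct averages and says nothing about how the result would sound.

```python
def dither_levels(periods, max_level=30):
    """Count distinct totals from `periods` sub-samples of 0..max_level."""
    totals = {0}
    for _ in range(periods):
        totals = {t + v for t in totals for v in range(max_level + 1)}
    return len(totals)


print([dither_levels(n) for n in (1, 2, 3, 4)])
```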

Besides, if you're going to update AUDV0 and AUDV1 at a faster rate, it might be better to just stick with the "pseudo 5-bit" samples, but sample at the higher frequency, because increasing the sample rate might do more to improve the quality of the sound than keeping the slower rate and increasing the number of bits. Maybe I'll create some WAV files this weekend to test both scenarios and see which sounds better.

Michael

#17 batari OFFLINE  

batari

    )66]U('=I;B$*

  • 6,666 posts
  • begin 644 contest

Posted Fri Jul 1, 2011 1:13 PM

I think Supercat tried some experiments with dithering. The result was that the noise sounded far worse than the digital artifacts it was intended to mask.

Anyway, I think what helps most in regard to improving sound quality is increasing the sample rate and not the bit depth, especially since it's hard to get even a single bit improvement and even that probably won't work well since it's on another channel.

Probably the most fruitful improvement would be figuring out a good audio compression method so the sample rate may be increased and/or more audio data may fit in the binary.

#18 roland p OFFLINE  

roland p

    River Patroller

  • 2,448 posts
  • $23
  • Location:The Netherlands

Posted Sun Jul 3, 2011 1:00 AM

Could the joystick port be used? Setting all lines as outputs and connecting an R-2R ladder DAC would give you 8-bit resolution?
Major downside is that no joysticks could be used, but maybe it's a nice idea for a demo.

#19 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • 211 posts

Posted Sun Jul 3, 2011 1:15 PM

Today I toyed around with a few 5-bit codec ideas. Something DPCM-like seemed like the best idea.
After some experimentation I came up with a table driven ADPCM approach that gets down to a single bit per sample using a 62 byte LUT.
In other words, for every sample value (0-30) one looks up the next value in the table depending on whether the current bit in the stream is set: next_sample = ADPCMTable[(sample << 1) | bit].
The table could probably use some restructuring - I haven't written the decoder yet.

I tried to attach some samples and code, but the forums have way too restrictive rules (seriously, I can't attach a .c file?). I uploaded here instead:
http://www.acc.umu.s...ppen/files/vcs/

Anyway, the encoder does a fairly exhaustive search for values in the LUT.
It obviously can't try every one of the 31^62 combinations, so it starts with a decent guesstimate and does local optimization from there.
Also, it lacks any sort of psychoacoustic model and only does a token attempt at noise shaping.
It expects mono 16-bit PCM WAVs.

Example table:
;62 entries
;1 bits per sample
;rms = 1.71
;each row: next sample for bit=0, bit=1 (rows are current sample 0 to 30)
ADPCMTable
	.byte 3, 16
	.byte 0, 5
	.byte 0, 6
	.byte 0, 7
	.byte 4, 8
	.byte 5, 9
	.byte 6, 11
	.byte 7, 11
	.byte 8, 12
	.byte 9, 13
	.byte 10, 14
	.byte 11, 15
	.byte 9, 13
	.byte 12, 16
	.byte 13, 9
	.byte 12, 16
	.byte 15, 20
	.byte 15, 18
	.byte 17, 20
	.byte 15, 18
	.byte 19, 22
	.byte 19, 22
	.byte 19, 22
	.byte 20, 18
	.byte 22, 26
	.byte 22, 29
	.byte 23, 26
	.byte 24, 30
	.byte 25, 30
	.byte 26, 30
	.byte 27, 30
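
Since the decoder had not been written at the time of this post, here is a rough Python sketch of how the lookup described above could be applied to a bit stream. The starting sample of 15 and the MSB-first bit order are assumptions, not something the post specifies, and the function names are hypothetical.

```python
# The 62-entry example table from the post, indexed by (sample << 1) | bit.
ADPCM_TABLE = [
    3, 16, 0, 5, 0, 6, 0, 7, 4, 8, 5, 9, 6, 11, 7, 11,
    8, 12, 9, 13, 10, 14, 11, 15, 9, 13, 12, 16, 13, 9, 12, 16,
    15, 20, 15, 18, 17, 20, 15, 18, 19, 22, 19, 22, 19, 22, 20, 18,
    22, 26, 22, 29, 23, 26, 24, 30, 25, 30, 26, 30, 27, 30,
]


def unpack_bits(data):
    """Yield the bits of each byte, most significant first (an assumption)."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def decode(bits, start=15):
    """Turn a 1-bit-per-sample stream into 5-bit samples (0..30)."""
    sample = start  # assumed initial state, roughly mid-scale
    out = []
    for bit in bits:
        sample = ADPCM_TABLE[(sample << 1) | bit]
        out.append(sample)
    return out


print(decode(unpack_bits(bytes([0b11100000]))))
```

Every table entry is itself in the 0 to 30 range, so the index never runs past the end of the table.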

Edited by Tjoppen, Sun Jul 3, 2011 1:16 PM.


#20 SeaGtGruff OFFLINE  

SeaGtGruff

    Quadrunner

  • 5,558 posts
  • Location:Georgia, USA

Posted Sun Jul 3, 2011 4:24 PM

I tried to attach some samples and code, but the forums have way too restrictive rules (seriously, I can't attach a .c file?).

If you want to post C source or assembly source, you can rename the extension to .TXT-- or, if you don't want to lose the original extension, add .TXT to it (e.g., FILE.C.TXT, or FILE.ASM.TXT) so it's obvious what the extension should be, then anyone who downloads the source file can just remove the .TXT extension.

Michael

#21 Mrshoujo OFFLINE  

Mrshoujo

    Combat Commando

  • 9 posts

Posted Thu Aug 4, 2011 1:09 PM

Quadrun

Open Sesame (seconds 2-4, it's a bit hard to understand)

Berzerk, Voice Enhanced

This is my test for Frantic

I Want My Momma also has a short digital audio clip.

#22 neotokeo2001 OFFLINE  

neotokeo2001

    River Patroller

  • 3,823 posts
  • Location:Palm Beach

Posted Fri Aug 5, 2011 6:39 PM

Army of Darkness has some voice after the title screen. Programmed by s0c7.



#23 Antichambre OFFLINE  

Antichambre

    Combat Commando

  • 2 posts

Posted Wed Sep 7, 2011 6:55 AM

Thank you for this useful information! :)

#24 philipj OFFLINE  

philipj

    Moonsweeper

  • 427 posts
  • Location:Birmingham, Alabama

Posted Tue Sep 20, 2011 11:11 PM

Very useful indeed. :)

#25 iesposta OFFLINE  

iesposta

    River Patroller

  • 3,664 posts
  • Retro-gaming w/my VCS
  • Location:Pennsylvania

Posted Wed Sep 21, 2011 10:58 PM

I've had fun with this digitizing sound tool:
http://www.atariage....07-wav2ataripl/




