
Voice Playback


Tursi


Some time ago artrag mentioned a voice converter he had created that ran on the MSX and produced really nice voice at 60Hz. He was kind enough to adapt it for the PSG in the TI-99/4A and ColecoVision, and after lots of work-induced delays, I've put together a small package for those interested in using it.

 

With only three voices, the results are not quite as nice as on the MSX, but most voice is still quite intelligible. I've posted a sample on YouTube (it's just a quick and dirty test, but it demonstrates both good samples and not-so-good samples, so you can hear the range). The sample is running on a TI-99/4A, but the ColecoVision results are the same.

 

https://www.youtube.com/watch?v=wkBShy-EFkI

 

Playback takes very little CPU (just unpacking 6 bytes per frame) and a fair amount of memory (360 bytes per second). It's quite good for adding short voice samples!
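To put the memory figure in concrete terms, here is a minimal sketch of the arithmetic. The 6 bytes per frame and the 60 Hz frame rate come from the post; the function name and the example durations are only illustrative.

```python
FRAME_BYTES = 6      # from the post: 6 bytes unpacked per frame
FRAME_RATE_HZ = 60   # from the post: one frame per 60 Hz video frame


def sample_size_bytes(seconds, frame_rate_hz=FRAME_RATE_HZ, frame_bytes=FRAME_BYTES):
    """Approximate storage needed for a voice sample of the given length."""
    frames = round(seconds * frame_rate_hz)
    return frames * frame_bytes


if __name__ == "__main__":
    for seconds in (1, 2.5, 10):
        print(f"{seconds:>4} s -> {sample_size_bytes(seconds)} bytes")
```

So one second costs 360 bytes, and a short phrase of two or three seconds stays around 1 KB.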

 

The actual converter runs under Matlab, so it requires the Matlab runtime and 64-bit Windows to execute. Alternatively, the Matlab script is included so you can run it on your choice of platform if you have the ability to run Matlab scripts. I experimented with Octave and although it didn't run out-of-the-box, I eventually got an early version processing there.

 

For playback, I've included assembly playback code for both the TI-99/4A and the ColecoVision (the ColecoVision code is hand-optimized from SDCC output and runs fine linked into C programs). There's also a VGM converter with C source in case you need VGM audio files (for instance, for my VGM compressor).

 

Anyway, hope you enjoy! Archive is posted on my site:

http://harmlesslion.com/software/artvoice

 

 


I used the exact method in wav2cv to create a voice saying "no no no" in Coleco Reversi; you can hear it when you try to play an invalid move.

 

Oh, excellent! Did you use FFTs? Did you manage to take advantage of the noise channel? I went to your page to try and download, but the devkit is currently showing 404.


 

The noise channel is excluded (because back then I couldn't figure out how to make it work with noise). WAV2CV uses an FFT to extract frequencies from an uncompressed wave file. I did all the coding in Visual Basic 5, I believe, and it was great practice for my computer knowledge at that time.

 

As you know, transforming a wave sample into 3 channels of beeping sounds is not great, but it works well enough for some cases such as sound effects and music, though only barely for voices.

 

I am sure your tool reaches the next step, and WAV2CV is probably available if you want to give it a try. Warning: it formats the conversion result into Marcel deKogel's sound data format, which can be manually converted into other formats; more information is in the documentation about ColecoVision programming I wrote a long time ago.

 

Sorry for the inconvenience of my website losing its zip files, it seems to be a glitch or a new policy of the web hosting I'm using; I should fix that.

Edited by newcoleco

Well, Artrag did all the work of conversion, and a lot of the heavy lifting is buried deep in the "fxpefac" function of the voicebox library for MATLAB. (http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/doc/voicebox/fxpefac.html) It seems to be a pretty clever function, though it's apparently tuned specifically for voice.

 

I worked on my own tool about six months ago, when I got voice by accident. But I didn't get it quite as solid as this tool. I did try to get noise detection going in mine as well, but it never worked for me. :)


The technique of extracting the highest tones from voice works fine and has a solid basis in the literature on voice coding.

Speech segments are classified as voiced or unvoiced.

Voiced segments are almost periodic and can be characterized by a main tone, called the pitch, and its multiples, called formants.

Unvoiced segments are aperiodic, and their formants are not at multiples of a common frequency.

Early codecs were able to classify each segment as voiced or unvoiced. The former was coded by the pitch and the amplitudes of the upper harmonics (formants), the latter by coding every frequency/amplitude pair.
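As a rough illustration of the "extract the highest tones" idea, here is a minimal sketch that picks the three strongest spectral peaks from one analysis frame. It assumes "highest" means highest in amplitude. It is not the converter's code and does not use fxpefac; it uses a plain (slow) DFT from the standard library, and the function names, sample rate and frame length are assumptions chosen for the example.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude of each DFT bin from 0 up to (not including) Nyquist."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc) / n)
    return mags


def strongest_tones(frame, sample_rate, count=3):
    """Return up to `count` (frequency_hz, amplitude) pairs, sorted by frequency."""
    mags = dft_magnitudes(frame)
    n = len(frame)
    # Keep local maxima only, and skip the DC bin.
    peaks = [k for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    peaks.sort(key=lambda k: mags[k], reverse=True)
    return sorted((k * sample_rate / n, 2 * mags[k]) for k in peaks[:count])


if __name__ == "__main__":
    rate, n = 8000, 400  # hypothetical values; bins are 20 Hz wide
    frame = [math.sin(2 * math.pi * 200 * t / rate)
             + 0.5 * math.sin(2 * math.pi * 600 * t / rate)
             + 0.25 * math.sin(2 * math.pi * 1000 * t / rate)
             for t in range(n)]
    for freq, amp in strongest_tones(frame, rate):
        print(f"{freq:7.1f} Hz  amplitude {amp:.2f}")
```

With three channels, each frame would keep three such pairs, one per tone channel. Real speech needs windowing and a smarter peak picker than this.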


fxpefac estimates the pitch and returns the probability that the segment is voiced.

Actually, this latter information is not used: with only 3 channels, all segments are coded in the same manner, regardless of the relationship among the formants.

The main problem the technique suffers from on PSG chips is that they produce square waves, not pure tones.

This generates unwanted harmonics that, for voiced segments, can add destructively.

The SCC experience, where you can generate pure tones, shows the improvement in going from square waves to pure tones. The 2 extra channels also help, but the main difference is square waves versus pure tones.
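To make the square wave versus pure tone point concrete, here is a small sketch comparing the harmonic content of one period of each. It relies on the standard Fourier result that an ideal square wave contains only odd harmonics, with amplitudes falling off as 1/k. The helper name and the 1024-sample period are assumptions for the example.

```python
import cmath
import math


def harmonic_amplitudes(wave, max_harmonic):
    """Amplitude of harmonics 1..max_harmonic for one full period of `wave`."""
    n = len(wave)
    amps = {}
    for k in range(1, max_harmonic + 1):
        acc = sum(wave[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        amps[k] = 2 * abs(acc) / n
    return amps


N = 1024  # samples per period (arbitrary choice for the sketch)
square = [1.0 if t < N // 2 else -1.0 for t in range(N)]
sine = [math.sin(2 * math.pi * t / N) for t in range(N)]

square_h = harmonic_amplitudes(square, 7)
sine_h = harmonic_amplitudes(sine, 7)

if __name__ == "__main__":
    print("harmonic  square   sine")
    for k in range(1, 8):
        print(f"{k:8d}  {square_h[k]:.3f}   {sine_h[k]:.3f}")
```

The sine has energy only at the fundamental, while the square wave adds harmonics at 3, 5, 7 times the frequency, at roughly 1/3, 1/5, 1/7 of the fundamental's amplitude. Those extras are the unwanted harmonics that can land on top of the other two channels.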

Edited by artrag

Anyway, another possible improvement in intelligibility would be to add noise when coding unvoiced segments.

 

fxpefac could tell IF, but the how is something different, and in the end I resolved to code these segments as tones with unrelated frequencies as well.
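Coding a segment as tones ends up as a divider and an attenuation per PSG channel. Below is a sketch of the standard SN76489 register encoding for one (frequency, amplitude) pair. The 3579545 Hz clock is an assumption (the usual NTSC value), the function names are made up for the example, and this is not the package's 6-byte frame format, which is packed differently and not described in this thread.

```python
import math

PSG_CLOCK_HZ = 3579545  # assumption: NTSC-derived clock feeding the SN76489


def tone_divider(freq_hz, clock_hz=PSG_CLOCK_HZ):
    """10-bit divider N such that the output frequency is clock / (32 * N)."""
    n = round(clock_hz / (32 * freq_hz))
    return max(1, min(1023, n))


def attenuation(amplitude):
    """Map a linear amplitude (0..1) to the 4-bit attenuation, 2 dB per step."""
    if amplitude <= 0:
        return 15  # 15 means the channel is off
    steps = round(-20 * math.log10(min(amplitude, 1.0)) / 2)
    return min(15, steps)


def channel_bytes(channel, freq_hz, amplitude):
    """Three bytes to write to the PSG for tone channel 0, 1 or 2."""
    n = tone_divider(freq_hz)
    latch = 0x80 | (channel << 5) | (n & 0x0F)   # 1 cc 0 dddd: low 4 bits
    data = (n >> 4) & 0x3F                       # 0 0 dddddd: high 6 bits
    volume = 0x90 | (channel << 5) | attenuation(amplitude)
    return [latch, data, volume]


if __name__ == "__main__":
    print([hex(b) for b in channel_bytes(0, 440, 1.0)])
```

One consequence of the 10-bit divider: with this clock the lowest tone is about 109 Hz, so anything below that gets clamped.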

Edited by artrag

Here is an in game sample with some digital voice effects that I've put together using the voice playback.

 

https://www.youtube.com/watch?v=-GXnP8PYMnk&feature=youtu.be

 

Cool. For some reason I was expecting "Achtung!" from Castle Wolfenstein 3D

https://youtu.be/lN9Sg3l59-I?t=34

 

Or some from the original Castle Wolfenstein:


Is it possible to have a 32-bit version of the SN76489.EXE file? I'm currently working with a 32-bit version of Windows XP :(

Another question: it seems to be made for 60 Hz. Is it suitable for European PAL 50 Hz consoles?

Edited by alekmaul

I have to see if I can compile it on my Windows XP laptop, but you will have to wait until September.

About 50/60 Hz: you would get a small slowdown of the voice speed, but it does not affect intelligibility or pitch.

If needed, I can make a version with a switch on the command line.

Edited by artrag

Well, regarding the new version of the SN exe, thanks a lot in advance, I will wait for September!

About 50/60 Hz: I already patched my VBL function to avoid slow/fast speed, based on the 0x0069 memory entry. It was just a question about that.

I think that if I update the voice as I do for music, it will be OK.
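For what it's worth, here is one possible way to keep 60 Hz data at the right speed when the update routine is called at 50 Hz. It is a sketch only, not part of the package and not necessarily how the music patch works: a small accumulator decides how many encoded frames to advance on each tick, so six frames are consumed every five ticks (one of them is skipped). All names are hypothetical.

```python
ENCODED_HZ = 60  # from the thread: the voice data is produced for 60 Hz playback


def frames_to_advance(tick_hz, accumulator, encoded_hz=ENCODED_HZ):
    """Return (frames to advance this tick, new accumulator value)."""
    accumulator += encoded_hz
    return accumulator // tick_hz, accumulator % tick_hz


def simulate(ticks, tick_hz):
    """Count how many encoded frames are consumed after `ticks` interrupts."""
    acc = 0
    position = 0
    for _ in range(ticks):
        steps, acc = frames_to_advance(tick_hz, acc)
        position += steps
    return position


if __name__ == "__main__":
    print("frames consumed in 1 s at 60 Hz:", simulate(60, 60))
    print("frames consumed in 1 s at 50 Hz:", simulate(50, 50))
    print("uncompensated stretch at 50 Hz :", ENCODED_HZ / 50)
```

Without compensation, each second of 60 Hz data takes 1.2 seconds at 50 Hz. Pitch is unchanged either way, because the tone dividers written to the PSG are the same.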


Sadly I don't have a Matlab compiler, so we'll have to wait on that. I have started porting the code to C but it will take me a while to get time to finish that task.

No problem Tursi, can wait for that ;)

I think Bagman will have an updated version with your voice playback driver :P


It does not work for me. I can now run it on my Windows XP machine, but I get this message with the wav files I put in the wavs directory:

Num. channels = 3
ans =
file#0 reward_base.wav
Undefined function 'fxpefac' for input arguments of type 'double'.
Error in sn76489 (line 49)
MATLAB:UndefinedFunction

And the size is odd: 294K for the 32-bit version and 5M for the 64-bit version :o !

Edited by alekmaul