bB AtariVox Support, Part 3... More Natural Sounding Voices


2 replies to this topic

#1 RevEng OFFLINE  

RevEng

    River Patroller

  • 3,479 posts
  • bit player
  • Location:Canada

Posted Sun Feb 13, 2011 9:20 AM

In the previous installment, I mentioned that taking output directly from Phrase-a-lator or batari's Speak To Me will produce voice output on par with basic text-to-speech systems, like SAM or Speak'N'Spell.

There are a few reasons why the phoneme codes from these sources contribute to a less-natural sound...

  • Odd Phoneme Selections... Sometimes one of the phoneme selections is a poor fit for the word, e.g. "Atari" may be pronounced like "Ah-tari" instead of the more common "Uh-tari".

    This is especially likely to happen if Speak To Me didn't find the word in the dictionary, and had to make a best guess using its algorithm.
  • Bad Phoneme Pacing... Sometimes the phoneme choices are appropriate, but the length of each phoneme in the word doesn't sound natural.
  • Lack of Pitch Changes... When people say words, the pitch of the words rises and falls. Questions end on a higher pitch, statements end on a lower pitch, etc. Phonemes generated from programs or the dictionary all default to the same monotone pitch.

All of these issues can be eliminated or improved through a bit of trial and error. My favorite tool for this is vdub_bobby's AtariVox Editor, which runs directly on the 2600. Be sure to read Bob's instructions in that entry; they explain how to operate the editor, which can be a bit tricky at first.

Or, if you prefer, you can just replace the codes in the example program from the previous installment with your own codes.

Fixing it


We're going to be using phoneme and control codes from pages 15 and 16 of the SpeakJet manual. You may want to keep that handy for reference.

We'll start by loading up the phonemes in Bob's editor. First we delete all of the codes on the screen, and then enter in the following codes, which are the "game over" phonemes from the previous installment:

 \SLOW		$08
 \GE		$B2
 \EYIY		$9A
 \MM		$8C
 \SLOW		$08
 \OW		$89
 \FAST		$07
 \VV		$A6
 \AXRR		$97
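If you'd rather build the byte list on a PC than type it into the editor, the listing above can be sketched as a small lookup table. This is only an illustration: the table holds just the mnemonics used in this post (with the values shown above), and the `assemble` and `as_hex` helpers are hypothetical names of my own, not part of bB or the AtariVox tools.

```python
# Mnemonic-to-byte table, copied from the "game over" listing above.
# Only the codes used in this post are included.
CODES = {
    "SLOW": 0x08,
    "FAST": 0x07,
    "GE": 0xB2,
    "EYIY": 0x9A,
    "MM": 0x8C,
    "OW": 0x89,
    "VV": 0xA6,
    "AXRR": 0x97,
}

# The untuned "game over" sequence from the previous installment.
GAME_OVER = ["SLOW", "GE", "EYIY", "MM", "SLOW", "OW", "FAST", "VV", "AXRR"]


def assemble(mnemonics):
    """Translate a list of mnemonics into SpeakJet byte values."""
    return [CODES[name] for name in mnemonics]


def as_hex(values):
    """Format byte values in the $XX style used in this post."""
    return ", ".join("$%02X" % value for value in values)


if __name__ == "__main__":
    print(as_hex(assemble(GAME_OVER)))
```

The printed line can then be pasted wherever your program keeps its speech data. An unknown mnemonic raises a `KeyError`, which is a handy way to catch typos.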


Odd Phoneme Selections
In our example, the phonemes chosen sound just fine. But if one was wrong, substitution with other phonemes is just a matter of trial and error.

If you're really stuck, take a look at the SpeakJet dictionary and look for words with parts pronounced the same as the word you're working on.


Bad Phoneme Pacing
The first thing you probably noticed when you entered the phoneme codes were the \SLOW and \FAST codes. We didn't discuss them before, but unsurprisingly these codes instruct the SpeakJet chip that the following phoneme should be said slower or faster than usual.

Start by previewing the sound in the editor. One immediate flaw is that the two words blend together.

You can tell the AtariVox to pause by using the appropriately named PAUSE codes. We'll remedy our blended words by adding a "PAUSE2" code after the \MM phoneme.

The other problem is a bit more subtle; the speed of some of the phonemes isn't quite natural. I can't really give a rule-of-thumb here, other than to say the words a lot to yourself and see what bits are short and what bits are long. After some playing I came up with the following, which sounds more natural to my ear...

 \GE		$B2
 \EYIY		$9A
 \SLOW		$08
 \MM		$8C
 \PAUSE2	$02
 \SLOW		$08
 \OW		$89
 \SLOW		$08
 \VV		$A6
 \AXRR		$97


I was happy with the overall speed, but if you want to adjust it you can insert a SPEED code at the very beginning of the data.
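For anyone editing the data outside the editor, the two pacing tweaks can be sketched as list operations. The PAUSE2, SLOW and phoneme values come from the listings above. The SPEED value is an assumption on my part: I believe it is control code 21 ($15) followed by one value byte in the range 0 to 127, but please confirm that against the control code table in the SpeakJet manual. The helper names and the speed value of 100 are purely illustrative.

```python
PAUSE2 = 0x02  # from the listing above
SLOW = 0x08    # from the listing above
MM = 0x8C      # from the listing above
SPEED = 0x15   # ASSUMPTION: control code 21, verify in the SpeakJet manual


def add_pause_after(data, phoneme, pause=PAUSE2):
    """Return a copy of data with a pause inserted after the first
    occurrence of the given phoneme byte."""
    i = data.index(phoneme)
    return data[:i + 1] + [pause] + data[i + 1:]


def with_speed(data, value):
    """Return a copy of data prefixed with a SPEED command.
    ASSUMPTION: the value byte must be in the range 0..127."""
    if not 0 <= value <= 127:
        raise ValueError("speed value out of assumed range 0..127")
    return [SPEED, value] + data


# The re-paced phonemes from the listing above, minus the pause.
tuned = [0xB2, 0x9A, SLOW, MM, SLOW, 0x89, SLOW, 0xA6, 0x97]

if __name__ == "__main__":
    paused = add_pause_after(tuned, MM)
    # 100 is an arbitrary example value, not a recommendation.
    print(["$%02X" % b for b in with_speed(paused, 100)])
```

Note that `add_pause_after` searches for a raw byte value, so it is only safe on data that doesn't yet contain two-byte commands whose value byte could match the phoneme.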

Lack of Pitch Changes
There are two kinds of codes that change the pitch of a phoneme. The first kind is the STRESS and RELAX codes, which raise or lower the pitch of the phoneme that follows them.

I found these codes didn't have a very strong effect and I wanted something dramatic, so I instead went with the second kind of code... the PITCH code. PITCH allows you to specify very subtle or very strong changes (including singing effects). It's also important to know that the change lasts until the next PITCH code you send.

If you say "Game Over" to yourself a few times like an announcer might, you'll likely find that "Game" starts off at an average pitch and rises near the end, while "Over" starts off at an average pitch and drops near the end.

 PITCH=$48	$16
		$48
 \GE		$B2
 \EYIY		$9A
 PITCH=$56	$16
		$56
 \SLOW		$08
 \MM		$8C
 \PAUSE2	$02
 \SLOW		$08
 PITCH=$48	$16
		$48
 \OW		$89
 \SLOW		$08
 \VV		$A6
 PITCH=$44	$16
		$44
 \AXRR		$97
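Because PITCH is a two-byte command that stays in effect until the next PITCH, it can be hard to see at a glance which pitch each phoneme ends up with. Here is a rough sketch that walks the bytes from the listing above and reports the pitch and pacing in effect for each phoneme. It models only what this post describes: the `annotate` function is hypothetical, it assumes a SLOW or FAST code carries across an intervening PITCH command to the next phoneme, and it doesn't know the chip's default pitch, so it reports `None` until the first PITCH.

```python
PITCH = 0x16  # followed by one value byte
SLOW = 0x08
FAST = 0x07

# Phoneme names for the codes used in this post.
NAMES = {0xB2: "GE", 0x9A: "EYIY", 0x8C: "MM",
         0x89: "OW", 0xA6: "VV", 0x97: "AXRR"}

# The final "game over" data from the listing above.
GAME_OVER = [0x16, 0x48, 0xB2, 0x9A, 0x16, 0x56, 0x08, 0x8C, 0x02,
             0x08, 0x16, 0x48, 0x89, 0x08, 0xA6, 0x16, 0x44, 0x97]


def annotate(data):
    """Return a (name, pitch, rate) tuple for each phoneme in data."""
    result = []
    pitch = None      # unknown until the first PITCH command
    rate = "normal"
    i = 0
    while i < len(data):
        code = data[i]
        if code == PITCH:
            pitch = data[i + 1]  # value byte; persists until next PITCH
            i += 2
            continue
        if code == SLOW:
            rate = "slow"        # applies to the next phoneme only
        elif code == FAST:
            rate = "fast"
        elif code in NAMES:
            result.append((NAMES[code], pitch, rate))
            rate = "normal"
        # Anything else (e.g. PAUSE2) is skipped by this sketch.
        i += 1
    return result


if __name__ == "__main__":
    for name, pitch, rate in annotate(GAME_OVER):
        print("%-5s pitch=$%02X %s" % (name, pitch, rate))
```

Reading the output top to bottom shows the shape described above: "Game" rises from $48 to $56, and "Over" falls from $48 to $44.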


Wrap-Up


At the bottom of this post you'll find an attached mp3 with the 3 versions of the "game over" AtariVox output, so those following along without a Vox can hear the difference.

If any AtariVox veterans have any tips to share, please add to the thread and share your wisdom!

AtariVox is the creation of Richard Hutchinson, and the drivers were created by Alex Herbert. My thanks to these guys for their amazing contribution to the 2600!

Attached File  game over comparisons.mp3   106.88KB   184 downloads

Edited by RevEng, Sun Feb 13, 2011 6:05 PM.


#2 diogoandrei OFFLINE  

diogoandrei

    Chopper Commander

  • 210 posts
  • Location:Brazil

Posted Sun Feb 13, 2011 5:48 PM

I found these codes didn't have a very strong effect and I wanted something dramatic, so I instead went with the second kind of code... the PITCH code. PITCH allows you to specify very subtle or very strong changes (including singing effects). It's also important to know that the change lasts until the next PITCH code you send.


I definitely need AtariVox in my life. I am already dreaming of charting the pitches into music notes and then cross-referencing them with TIA frequencies.

#3 RevEng OFFLINE  

Posted Sun Feb 13, 2011 7:05 PM

You might also be interested in the fact that it's possible to directly program the 5 internal oscillators in the SpeakJet.

The control is pretty basic; you can change the envelope type (saw/sine/triangle/square) for all of them but not individually. The oscillators can't do anything fancy like modulating each other (FM synthesis would have rocked) but on the plus side their frequency can be set from 1 to 3999Hz, and 2 of them can be mixed with noise generation.

I briefly played around with setting up two frequencies to phase against each other; there's definitely some untapped potential for unique sound effects with oscillator control.

You'll also likely be interested in Richard Hutchinson's AtariVox Christmas Carols! (at the bottom of the linked page) :)

Edited by RevEng, Sun Feb 13, 2011 7:40 PM.




