
Speech synthesis


mizapf


F'up here...

 


I know these specs; they do not help much because they only describe the layout of the LPC frames and do not reveal what a particular coefficient value means for the output. My incentive for writing SPEECODER back in 1989 was that you could analyze existing vocabulary and learn what various similar sounds have in common. I never invested the time for that, though.

 

With respect to finding phonemes for German speech, you need not even go as far as "ü"; it already proved difficult to find some way to create the closed "e" as in "geh!" (go!) or "See" (lake) without getting an English "say" or "sea", neither of which is suitable.

 

As I already said, "ü" is the rounded version of "i" (say "kiss" with your lips pushed forward), but how does that show in the coefficients?
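
One way to look into that question is to turn a frame's coefficients into the frequency response of the synthesis filter and see where the resonances (formants) sit; rounding the lips generally shifts the second formant downward. The following is only a minimal sketch under stated assumptions: the names `step_up` and `spectrum_peaks` are made up for illustration, the sign convention of the lattice recursion may differ from the one the speech chip uses, the 8 kHz sample rate is assumed, and the input must be actual reflection coefficient values (the table indices stored in the LPC frames would first have to be translated via the chip's coefficient table). The demo values are synthetic, not taken from any real vocabulary.

```python
import cmath
import math


def step_up(reflection):
    """Turn reflection coefficients k1..kp into predictor polynomial [1, a1..ap].

    Assumed convention: A(z) = 1 + a1*z^-1 + ... + ap*z^-p,
    and the synthesis filter is 1/A(z).
    """
    a = [1.0]
    for k in reflection:
        m = len(a)            # order of the polynomial being built
        prev = a + [0.0]      # pad so that prev[m] exists
        a = [prev[i] + k * prev[m - i] for i in range(m + 1)]
    return a


def spectrum_peaks(reflection, sample_rate=8000.0, points=512):
    """Return frequencies (Hz) of local maxima of |1/A| between 0 and Nyquist."""
    a = step_up(reflection)
    mags = []
    for n in range(points):
        w = math.pi * n / points  # normalized angular frequency
        denom = sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(a))
        mags.append(1.0 / abs(denom))
    peaks = []
    for n in range(1, points - 1):
        if mags[n] > mags[n - 1] and mags[n] >= mags[n + 1]:
            peaks.append(n * sample_rate / (2 * points))
    return peaks


# Synthetic demo: one resonance near 1000 Hz with pole radius 0.95.
r, f, fs = 0.95, 1000.0, 8000.0
theta = 2 * math.pi * f / fs
k2 = r * r
k1 = -2 * r * math.cos(theta) / (1 + k2)
demo_peaks = spectrum_peaks([k1, k2], fs)
print(demo_peaks)
```

Running this on the frames of two similar words (say, one with "i" and one with a rounded vowel) and comparing the peak lists would be one way to see what the sounds have in common, provided the assumptions above hold for the data at hand.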

Edited by mizapf

I've found similar difficulties in using phonemes to reproduce é in French. It is always mispronounced as "eh" in English, but it is actually much closer to the i in "pig" or "dig"...but not quite.

 

And you're right (I likely looked like an idiot, but I confirmed that if you say kiss while puckering for a kiss, it comes out as a near perfect ü)...

 

Of course when it comes to cool sounding speech, there are no cooler sounding words in any language than those containing double vowels spoken in Finnish...it's more akin to singing than speaking!


  • 2 weeks later...

Hi,

 

I hear some more differences from your list; here is just what I got:

 

AADB - Great move, folks

 

ABF9 - How did when I looked, isn't it.

 

A47D - Onward and upward

 

A9B4 - ..........to that one

 

B1E2 - Better then next time.

 

1B63C (missing in the list) - Heeeelp


Interesting that I never heard "Help!" but just wondered about the "p" at the end (thought it was accidentally recorded). I still don't really hear it, maybe because the "H" is very faint.

Edited by mizapf

OK then, I updated http://www.ninerpedia.org/index.php?title=Alpiner, since there are no new messages here; it seems there is some consensus.

 

However, I still fail to hear some of the phrases correctly; I tried different volumes, to no avail. We had a similar discussion some time ago concerning "GOOD", which, in fact, I can actually hear.

 

I have to read the phrase in the intended wording and at the same time listen to the synthesizer to have some success here. This is, to some extent, an acoustical illusion, just like the well-known optical illusions. You hear what you expect to hear.

 

ABF9 - Harder than it look(ed), idn't it? (I get the "isn't" with some effort, but after 10 tries I still don't hear the "s" after "look")

A47D - On word and up word. (Dialectal? I thought both rhyme with "forward", like "fore-wad", so "on-wad" or "up-wad")

B1E2 - ... right next time. (Fail to hear "better" after 10 tries, giving up. Same with "luck": I hear an "i" somewhere, but no "k")


I went over them again and I believe they are correct, with one possible exception: "AADB - Great moves sport!". After listening again, I think it may simply be "Great move sport!".

 

The subtle/missing sounds are tough. I've always attributed them to LPC-10 compression artifacts. The reconstructed speech definitely has problems with certain sounds.

 

Listening to synthesized speech is like getting used to a local accent. After a while you just hear the words and not the sounds. It's only when you stop and listen (or hit the play button over and over and over) that you realize that what you're hearing isn't always as easily recognizable as what you're interpreting it to be.


As for non-English sounds, the Spanish Moonmine edition does offer some examples, but they are not really convincing. Maybe the fault already lay with the speaker, or the phonemes cannot be reproduced well.

 

objeto = [ɔb'χɛtɔ] (χ in IPA is like (a)ch in German or (lo)ch in Scottish)

 

What I hear is [ɔb'ʃɛtɔ] (ʃ is English "sh"). This would not be useful for "ach". Also, there is

 

bajo = ['baχɔ]

 

but you can hear something like ['baho] instead. Here, too, you can get the illusion that you actually heard the correct phonemes.
