
Software Based Speech


snicklin


11 hours ago, Irgendwer said:

Examine the density of the color stripes...

Yes, of course, C is faster, but... I've spent my professional life developing embedded sensor software for a lot of microcontrollers, from legacy Z80 and 8031 parts to 32-bit ARMs... The 6502 is the slowest, in my opinion. I've programmed it in assembler since the early '80s; my first real program was a 2K font editor, written on a sheet of paper, converted to hex with a mnemonic table, and poked into memory with BASIC DATA statements. From time to time I've tried CC65, believing it would do the job, and I've always given up. If there is nothing more powerful than CC65, I prefer to use assembler, even if maintenance and documentation are awful.


1 hour ago, pfeuh said:

Yes, of course, C is faster, but... I've spent my professional life developing embedded sensor software for a lot of microcontrollers, from legacy Z80 and 8031 parts to 32-bit ARMs... The 6502 is the slowest, in my opinion. I've programmed it in assembler since the early '80s; my first real program was a 2K font editor, written on a sheet of paper, converted to hex with a mnemonic table, and poked into memory with BASIC DATA statements. From time to time I've tried CC65, believing it would do the job, and I've always given up. If there is nothing more powerful than CC65, I prefer to use assembler, even if maintenance and documentation are awful.

Could you give some examples of the code that has low performance? We may take a look and propose how to make it fast in CC65.

Generally there are only a few rules for writing effective code with CC65:

- compile with optimization turned on (people often don't turn it on and complain about the speed)

- do not use the stack (use --static-locals to turn local variables into static ones)

- use static memory (do not allocate dynamically)

- avoid arrays of structs

- avoid long data types whenever possible (preferably use unsigned char for everything)

- for indexing arrays of size <256 use unsigned char instead of pointers

- avoid printf function and similar ones (they are huuuge)
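To make the rules above a bit more concrete, here is a minimal sketch. The names (`sprite_x`, `sprite_y`, `move_sprites`, `NUM_SPRITES`) are made up for illustration and not taken from any real project. It uses parallel arrays instead of an array of structs, an unsigned char index, and a static local. Something like `cl65 -t atari -O --static-locals demo.c` should build it, but check the flags against your CC65 version.

```c
/* Sketch only: all names and sizes here are hypothetical. */
#define NUM_SPRITES 8

/* Parallel arrays instead of an array of structs: every element is
   reachable with one 8-bit index, which suits the 6502's indexed
   addressing. File-scope arrays are statically allocated. */
unsigned char sprite_x[NUM_SPRITES];
unsigned char sprite_y[NUM_SPRITES];

/* Move every sprite right by dx pixels. */
void move_sprites(unsigned char dx)
{
    /* A static local lives at a fixed address rather than on the
       software stack; --static-locals does this for all locals. */
    static unsigned char i;

    for (i = 0; i < NUM_SPRITES; ++i) {
        sprite_x[i] += dx;   /* unsigned char arithmetic wraps modulo 256 */
    }
}
```

The same loop written with `int i` and `struct sprite sprites[8]` would still work, it just tends to produce more code on a 6502.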

Edited by ilmenit

After reading old documentation on speech synthesis, I'm convinced that intelligible speech can be done on Pokey without the overhead of sampled phonemes. You need to identify the main frequency components and type of noise (f, s, t sounds) needed and change them at regular intervals through the phoneme. The poly sounds could be really good for getting basic combinations of frequencies too.
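As a rough sketch of what "change them at regular intervals through the phoneme" could look like in code: each phoneme becomes a small table of frames, and one frame is written to the chip per time slice. Everything below is an assumption for illustration: the table values are placeholders (not measured speech data), `demo_phoneme`, `write_frame` and `step_phoneme` are made-up names, and the register block is passed in as a pointer so the code also runs on a host machine. On an Atari the pointer would be the Pokey base at $D200, where AUDF1, AUDC1, AUDF2, AUDC2, AUDF3, AUDC3 sit at offsets 0 to 5.

```c
#define FRAME_BYTES 6   /* AUDF1, AUDC1, AUDF2, AUDC2, AUDF3, AUDC3 */

#define TONE(vol)  (0xA0 | (vol))   /* AUDC: pure tone, volume 0..15 */
#define NOISE(vol) (0x80 | (vol))   /* AUDC: poly-counter noise, volume 0..15 */

/* Placeholder frames: two tone channels plus one noise channel.
   The numbers are invented, not derived from real speech. */
const unsigned char demo_phoneme[3][FRAME_BYTES] = {
    { 0x60, TONE(0), 0x28, TONE(0), 0x04, NOISE(8) },  /* noisy onset */
    { 0x60, TONE(8), 0x28, TONE(6), 0x04, NOISE(2) },  /* transition  */
    { 0x58, TONE(8), 0x2C, TONE(6), 0x04, NOISE(0) }   /* voiced part */
};

/* Copy one frame into the first six sound registers. */
void write_frame(volatile unsigned char *pokey, const unsigned char *frame)
{
    unsigned char i;
    for (i = 0; i < FRAME_BYTES; ++i) {
        pokey[i] = frame[i];
    }
}

/* Call once per time slice (for example once per video frame).
   Writes frame number pos if there is one and returns the next position. */
unsigned char step_phoneme(volatile unsigned char *pokey,
                           const unsigned char (*frames)[FRAME_BYTES],
                           unsigned char count,
                           unsigned char pos)
{
    if (pos < count) {
        write_frame(pokey, frames[pos]);
        ++pos;
    }
    return pos;
}
```

Three channels at 6 bytes per slice is tiny compared to sampled phonemes, which is the whole attraction. Whether the result is intelligible is a separate question.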


17 hours ago, Bryan said:

After reading old documentation on speech synthesis, I'm convinced that intelligible speech can be done on Pokey without the overhead of sampled phonemes. You need to identify the main frequency components and type of noise (f, s, t sounds) needed and change them at regular intervals through the phoneme. The poly sounds could be really good for getting basic combinations of frequencies too.

That's a bold statement. If we could do sine waves and control phase, it would be true. But we can't.

Edited by R0ger

2 hours ago, R0ger said:

That's a bold statement. If we could do sine waves and control phase, it would be true. But we can't.

Aaaa, ooo, uuu, iii, eee can probably be done, even with just four operators (Pokey channels). Think Alley Cat. Whether it would end up as intelligible speech is hard to say, but I think not. Additive sound generators generally work with more waves.

 

 


20 hours ago, Bryan said:

After reading old documentation on speech synthesis, I'm convinced that intelligible speech can be done on Pokey without the overhead of sampled phonemes. You need to identify the main frequency components and type of noise (f, s, t sounds) needed and change them at regular intervals through the phoneme. The poly sounds could be really good for getting basic combinations of frequencies too.

Anyway, I wonder what you have read, as I could find basically nothing.


You're right, but they use the same principle @Bryan proposed. Determine the three most dominant frequencies in a certain time slice and replay those with your soundchip. SID in this case. IIRC their time slice is 20ms, i.e. one PAL frame.

 

Here it is used to replay an a-capella song, but it could just as well be used to play phonemes and concatenate them for words and sentences.

 

Pokey could do four instead of three channels, but with square waves and not sine waves. And phase alignment will be a problem, too.
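For the replay side, each dominant frequency has to be turned into a Pokey divisor. A small sketch, assuming the usual relation for an 8-bit channel on the 64 kHz or 15 kHz base clock, f_out = clock / (2 * (AUDF + 1)). The function name `audf_for_freq` is made up, and the clock is passed in as a parameter because it differs between NTSC and PAL machines (about 63921 Hz for the "64 kHz" clock on NTSC).

```c
/* Nearest 8-bit AUDF value for a wanted output frequency, assuming
   f_out = clock_hz / (2 * (AUDF + 1)). Out-of-range requests are
   clamped to the lowest or highest pitch the channel can produce. */
unsigned char audf_for_freq(unsigned long clock_hz, unsigned long freq_hz)
{
    unsigned long n;

    if (freq_hz == 0) {
        return 255;                      /* lowest possible pitch */
    }
    /* n = round(clock / (2 * freq)), done in integer arithmetic */
    n = (clock_hz + freq_hz) / (2UL * freq_hz);
    if (n < 1) {
        n = 1;
    }
    if (n > 256) {
        n = 256;
    }
    return (unsigned char)(n - 1);
}
```

For example, `audf_for_freq(63921UL, 440UL)` gives 72, which plays back at roughly 438 Hz. The resolution gets coarse at higher pitches, so the three or four "dominant frequencies" will only be approximated.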

 

Edited by ivop
typo

2 hours ago, R0ger said:

Anyway, I wonder what you have read, as I could find basically nothing.

Just the datasheets, app notes and emulation info on old voice chips. Looking at the requirements.

19 minutes ago, ivop said:

You're right, but they use the same principle @Bryan proposed. Determine the three most dominant frequencies in a certain time slice and replay those with your soundchip. SID in this case. IIRC their time slice is 20ms, i.e. one PAL frame.

 

Here it is used to replay an a-capella song, but it could just as well be used to play phonemes and concatenate them for words and sentences.

 

Pokey could do four instead of three channels, but with square waves and not sine waves. And phase alignment will be a problem, too.

 

Well, I believe you can align a voice by:

1. knowing the cycles since Pokey was reset (using WSYNC alignment or maybe triggering an interrupt off a voice?)

2. cramming a short pitch value (perhaps at 0 volume) to delay the next square wave.

Cycle counting would be needed, but at least locking to scan lines makes it deterministic.
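A very rough sketch of step 2, only to show the order of the register writes. C cannot give cycle-exact timing, so the actual delay is left to a caller-supplied routine that would have to be cycle-counted assembler on real hardware. The names `delay_voice1` and `wait_stub` are hypothetical, and the register block is passed as a pointer (Pokey base, $D200 on the Atari) so the code can also be run on a host. It assumes a new AUDF value is only picked up when the channel's counter reloads.

```c
enum { AUDF1 = 0, AUDC1 = 1 };   /* offsets from the Pokey base address */

/* Stand-in for a cycle-counted delay. It does nothing here; a real
   version would burn an exact number of cycles after WSYNC. */
void wait_stub(void)
{
}

/* Shift the phase of voice 1: mute it, run one short period,
   then restore the real pitch and volume. */
void delay_voice1(volatile unsigned char *pokey,
                  unsigned char short_audf,
                  unsigned char audf,
                  unsigned char audc,
                  void (*wait_for_reload)(void))
{
    pokey[AUDC1] = (unsigned char)(audc & 0xF0); /* same distortion, volume 0 */
    pokey[AUDF1] = short_audf;                   /* short divisor to shift the wave */
    wait_for_reload();                           /* let the counter pick it up */
    pokey[AUDF1] = audf;                         /* real pitch from the next reload on */
    pokey[AUDC1] = audc;                         /* volume back on */
}
```

How much phase shift a given `short_audf` buys depends on where the counter was when the write landed, which is exactly why the scan-line locking matters.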

 

I've also thought about the sounds Alley Cat makes. It's possible there are enough poly sounds already in there that would be suitable for a crude robot voice if you could find them all.


12 minutes ago, ivop said:

IIRC Mahoney uses the gate bit extensively. "We" could use STIMER to reset phase.

 

Just thinking here... There may be a way to get some crude alignment on 2 voices with AUDCTL, especially if you can base some pitches off the 15 kHz clock. Load the intended pitch, but flip to the 1.8 MHz clock until you want the wave to start. You'll get ultrasonic (silent) waves until you flip back and then the slow wave starts.

 

EDIT: I guess that's no better than cramming a divisor of 0, though, except that it would affect 2 voices at once. Then again, flipping to the fast clock would quickly run out the timer on the current pitch, and you can't align anything until the counter expires.
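In register terms the AUDCTL idea might look like the sketch below. It assumes the standard AUDCTL layout (bit 6 clocks channel 1 at about 1.79 MHz, bit 5 does the same for channel 3, bit 0 selects the 15 kHz base clock), and the function names are made up. AUDCTL is write-only, so the caller keeps a shadow copy. As before, the register block is a pointer so this runs on a host too, and real use would need cycle-exact assembler around the two calls.

```c
enum { AUDF1 = 0, AUDF3 = 4, AUDCTL = 8 };  /* offsets from the Pokey base */

#define CH1_FAST 0x40   /* AUDCTL bit 6: channel 1 on the ~1.79 MHz clock */
#define CH3_FAST 0x20   /* AUDCTL bit 5: channel 3 on the ~1.79 MHz clock */
#define BASE_15K 0x01   /* AUDCTL bit 0: 15 kHz base clock instead of 64 kHz */

/* Load both pitches, then put channels 1 and 3 on the fast clock.
   With small divisors (roughly below 40 on NTSC) the result is above
   the audible range. Returns the new AUDCTL shadow value. */
unsigned char hold_on_fast_clock(volatile unsigned char *pokey,
                                 unsigned char audctl_shadow,
                                 unsigned char audf1,
                                 unsigned char audf3)
{
    pokey[AUDF1] = audf1;
    pokey[AUDF3] = audf3;
    audctl_shadow |= (CH1_FAST | CH3_FAST);
    pokey[AUDCTL] = audctl_shadow;
    return audctl_shadow;
}

/* Flip both channels back to the slow base clock with a single write,
   so both slow waves start from (nearly) the same moment. */
unsigned char release_to_slow_clock(volatile unsigned char *pokey,
                                    unsigned char audctl_shadow)
{
    audctl_shadow &= (unsigned char)~(CH1_FAST | CH3_FAST);
    pokey[AUDCTL] = audctl_shadow;
    return audctl_shadow;
}
```

The single AUDCTL write is what makes it hit 2 voices at once. The counters are still wherever the fast clock left them at the moment of the flip, so the alignment is only as good as the timing of that write.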
