BlueWizard LPC-file

Nick99 · January 16, 2021

I have played around with BlueWizard and I wonder how to use the LPC data in, say, XB.

After reading all I could find on this forum I still have no clue how to use the data.

pixelpedant · January 16, 2021

Well, if it gives you an LPC speech string which is structured correctly, you can just feed it directly to CALL SAY, the same way you can feed it a value generated by CALL SPGET.

Patterns in this context always begin with 0110000000000000 and end with 1111 somewhere in the last byte, where 01100000 is the Speak-External command and 1111 is the stop code. But the bit order within each byte is reversed, and the values contained within the frames are of variable length and are not aligned to byte boundaries. So the actual speech frames are utterly obscured until you flip the bits of each byte.

I haven't gotten around to playing with BlueWizard either (mainly because I don't have a Mac). Should do, now that it's been ported to Python. But it'd strike me as odd if it weren't already correctly structured for use. I'd just check that it begins with 0110000000000000 and the first non-zero values in the final byte are 1111.

If so, you should be good to go to just feed it to CALL SAY as a string.
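
A minimal Python sketch of that check, for anyone who would rather inspect the bytes on a PC first. It assumes the chip consumes each byte least significant bit first, which is my reading of "flipped" above; the function names are made up for illustration and are not part of BlueWizard.

```python
def has_speak_external_prefix(data: bytes) -> bool:
    """True if the string starts with >60 >00 followed by a length byte."""
    return len(data) >= 3 and data[0] == 0x60 and data[1] == 0x00


def reverse_bits(value: int) -> int:
    """Return an 8-bit value with its bit order reversed (bit 0 <-> bit 7)."""
    return int(f"{value:08b}"[::-1], 2)


def chip_order_bits(data: bytes) -> str:
    """Bits in the order the chip is assumed to read them: LSB of each byte first."""
    return "".join(f"{b:08b}"[::-1] for b in data)


if __name__ == "__main__":
    sample = bytes([96, 0, 2, 128, 254])      # header plus two LPC bytes
    print(has_speak_external_prefix(sample))  # True
    print(chip_order_bits(sample[3:]))        # 0000000101111111
```

Printed this way, the frame fields read left to right, so a stop code shows up as a literal 1111 at the start of a frame.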

Nick99 · January 17, 2021

Thanks @pixelpedant!

The LPC file begins with ee,ad,d9,27,bd,93,84,a7... and ends with ...9b,71,f7,70,db. I guess I could use it in this format in assembly, just as in the QBox tutorial on YouTube: DATA >ee,>ad,>d9 and so on?

Or convert all the data to binary and put it in an XB program. Looks like I have a lot of work to do: convert everything or learn assembly.

pixelpedant · January 17, 2021

Yeah, so that looks like it's just the LPC data itself without the necessary additional values CALL SAY will need. However, those are easy enough to add in.

Assuming this is just the raw LPC data, you'd just need to add three bytes to the start of the pattern. Namely,

CHR$(96)

CHR$(0)

CHR$(X) where X = the number of bytes of LPC data which follow.

So presuming a string variable LPC$ contains the raw LPC pattern,

SAYOUT$=CHR$(96)&CHR$(0)&CHR$(LEN(LPC$))&LPC$
CALL SAY(,SAYOUT$)

should give you your XB output.

It'll still be missing the stop nibble if >DB is the final value in the source pattern as you mention. But this isn't obligatory for successful output of a single string.

Could add CHR$(15) to the end on point of principle, though, so it's there. In this case, the length byte would also need to be modified such that the result is

SAYOUT$=CHR$(96)&CHR$(0)&CHR$(LEN(LPC$)+1)&LPC$&CHR$(15)
CALL SAY(,SAYOUT$)

I really should play with this program myself, though.

+mizapf · January 17, 2021

Please send me the LPC data; maybe I can do something with it in Speecoder (my program for analyzing speech data).

+mizapf · January 17, 2021

Took a bit ... solved with Speecoder, then created a small Extended Basic program to recreate the strings. Note that a string must not exceed 255 characters, so I had to split it.

The process that I used can obviously be abbreviated, and you don't need Speecoder.

You have to split the file in pieces, and add the bytes 0x60, high byte of length, low byte of length. Also, at the ends of the pieces, a stop code should be added. It could suffice to add "255,255".


10 PRINT "READ PART 1" :: RESTORE 1000
20 GOSUB 500
30 S1$=A$
40 PRINT "READ PART 2" :: RESTORE 1500
50 GOSUB 500
60 S2$=A$
70 PRINT "READ PART 3" :: RESTORE 2000
80 GOSUB 500
90 S3$=A$
100 CALL SAY(,S1$,,S2$,,S3$)
110 END
500 READ A,B,C
510 A$=CHR$(A)&CHR$(B)&CHR$(C)
520 FOR I=1 TO C :: READ X :: A$=A$&CHR$(X) :: NEXT I
530 RETURN
1000 DATA 96,0,166
1010 DATA 128,254,230,54,173,50
1020 DATA 75,88,186,155,178,140
1030 DATA 172,133,239,102,210,42
1040 DATA 83,135,127,232,169,240
1050 DATA 72,226,130,233,186,220
1060 DATA 211,137,11,134,153,140
1070 DATA 236,52,238,159,102,43
1080 DATA 60,211,184,96,232,171
1090 DATA 136,74,108,244,169,182
1100 DATA 34,218,113,128,138,105
1110 DATA 181,14,39,7,42,49
1120 DATA 41,170,157,14,40,71
1130 DATA 151,46,119,215,160,106
1140 DATA 199,180,92,137,66,186
1150 DATA 221,52,243,196,4,152
1160 DATA 102,218,160,45,124,69
1170 DATA 70,108,135,150,56,110
1180 DATA 153,142,15,82,204,68
1190 DATA 101,165,57,112,137,19
1200 DATA 81,147,120,64,165,150
1210 DATA 86,59,157,2,85,223
1220 DATA 238,145,74,20,209,245
1230 DATA 148,123,218,70,192,84
1240 DATA 85,8,152,50,19,1
1250 DATA 83,117,50,188,203,73
1260 DATA 179,114,84,200,20,91
1270 DATA 114,199,245,192,83,106
1280 DATA 205,109,53,15
1500 DATA 96,0,203
1510 DATA 1,169,102,84,43,101
1520 DATA 11,168,155,117,141,78
1530 DATA 132,128,101,182,25,52
1540 DATA 236,117,106,166,38,192
1550 DATA 52,59,228,159,102,38
1560 DATA 195,83,11,33,133,150
1570 DATA 216,80,93,248,236,87
1580 DATA 179,83,77,99,115,92
1590 DATA 139,42,213,139,44,126
1600 DATA 212,187,212,36,34,187
1610 DATA 81,171,146,205,132,166
1620 DATA 218,173,50,17,161,135
1630 DATA 155,54,203,196,228,159
1640 DATA 102,218,61,221,160,102
1650 DATA 216,237,176,72,130,254
1660 DATA 110,183,93,59,177,106
1670 DATA 82,104,243,46,85,69
1680 DATA 206,185,197,119,212,12
1690 DATA 177,196,54,155,113,60
1700 DATA 196,236,58,188,218,145
1710 DATA 146,139,110,211,12,89
1720 DATA 68,105,50,93,45,108
1730 DATA 1,194,135,235,12,245
1740 DATA 68,136,26,166,219,212
1750 DATA 29,163,127,154,157,212
1760 DATA 72,141,212,41,43,141
1770 DATA 195,145,113,83,73,233
1780 DATA 14,197,201,78,105,164
1790 DATA 39,149,52,35,167,150
1800 DATA 232,116,28,204,18,70
1810 DATA 180,66,49,10,154,30
1820 DATA 55,143,152,200,28,182
1830 DATA 75,45,18,33,126,248
1840 DATA 73,79,119,252,0
2000 DATA 96,0,131
2010 DATA 168,73,110,196,171,100
2020 DATA 179,38,251,17,159,81
2030 DATA 236,182,18,70,188,71
2040 DATA 85,83,139,93,179,26
2050 DATA 199,78,45,118,76,170
2060 DATA 105,162,187,136,118,177
2070 DATA 146,5,0,8,141,37
2080 DATA 53,58,20,49,63,251
2090 DATA 54,235,82,148,228,236
2100 DATA 71,189,91,177,145,83
2110 DATA 108,174,78,218,72,201
2120 DATA 186,85,61,36,1,160
2130 DATA 109,216,9,139,178,101
2140 DATA 236,150,167,44,50,73
2150 DATA 248,90,158,16,207,36
2160 DATA 133,237,54,67,178,186
2170 DATA 12,174,154,80,239,74
2180 DATA 59,160,226,66,117,58
2190 DATA 113,131,75,118,201,14
2200 DATA 57,78,42,217,162,167
2210 DATA 153,32,36,167,136,153
2220 DATA 100,13,0,240,0
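
The DATA lines above could also be generated on a PC. Here is a rough Python sketch, assuming the piece has already been cut at a sensible place and already ends in a stop code. The function name, the line numbering and the six-values-per-line layout are only illustrative choices that mimic the listing. The 252-byte limit is my assumption: a 255-character XB string minus the three header bytes.

```python
def xb_data_lines(piece, first_line=1000, step=10, per_line=6):
    """Format one piece of LPC data as XB DATA lines with the >60 header."""
    if len(piece) > 252:
        # Assumed limit: 255-character XB string minus 3 header bytes.
        raise ValueError("piece too long for one XB string")
    # Header: Speak-External (96), high byte of length, low byte of length.
    lines = [f"{first_line} DATA 96,{len(piece) >> 8},{len(piece) & 255}"]
    number = first_line + step
    for i in range(0, len(piece), per_line):
        chunk = piece[i:i + per_line]
        lines.append(f"{number} DATA " + ",".join(str(b) for b in chunk))
        number += step
    return lines


if __name__ == "__main__":
    for line in xb_data_lines(bytes([128, 254, 230, 54, 173, 50, 75])):
        print(line)
```

The output can be pasted after line 530 of the loader; the subroutine at 500 only reads the third header value as the count, which is fine as long as each piece stays under 256 bytes.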

laugh.dsk

Edited January 17, 2021 by mizapf

Nick99 · January 17, 2021

Thank you very much @mizapf !

+FarmerPotato · January 20, 2021

On 1/17/2021 at 3:39 AM, pixelpedant said:

It'll still be missing the stop nibble if >DB is the final value in the source pattern as you mention. But this isn't obligatory for successful output of a single string.

Could add CHR$(15) to the end on point of principle, though, so it's there. In this case, the length byte would also need to be modified such that the result is

BlueWizard has a checkbox for "Add stop command". Use it.

Adding a 1111 stop cmd as CHR$(15) is not reliable. The stop cmd must begin in the first free bit. The speech chip will interpret the next 4 bits after the final bit. If it gets all 1s, it stops. If it sees 4 0s, that's a silent frame, and it continues. Otherwise, the frame has more parameters and it eats more bits...

I think bad things happen when you send garbage.

One resource (not the exact chip)

TMS5220 Voice Synthesis Processor (VSP) Data Manual

http://bitsavers.org/components/ti/_dataBooks/TMS5220_Voice_Synthesis_Processor_Data_Manual_-_preliminary_Jun81.pdf

See figure 5 on page 9 for the frame sizes and content.
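
To make "the first free bit" concrete, here is a minimal Python sketch of a frame walker. It is not my lpctos code, just an illustration. The field widths (energy 4, repeat 1, pitch 6, then K1..K10 as 5,5,4,4,4,4,4,3,3,3) are my reading of the data manual and should be treated as an assumption, as should the LSB-first bit order within each byte. The function names are made up.

```python
def bit_at(data, pos):
    """Bit number pos of the stream, taking each byte LSB first."""
    return (data[pos // 8] >> (pos % 8)) & 1


def read_field(data, pos, width):
    """Read width bits starting at pos; the first bit read is the field's MSB."""
    value = 0
    for i in range(width):
        value = (value << 1) | bit_at(data, pos + i)
    return value


def walk_frames(data):
    """Return (byte address, bit, kind) for each complete frame in raw LPC data."""
    frames = []
    pos, total = 0, len(data) * 8
    while pos + 4 <= total:
        energy = read_field(data, pos, 4)
        if energy == 0:
            kind, size = "zero", 4            # silent frame
        elif energy == 15:
            kind, size = "stop", 4            # stop code
        else:
            if pos + 11 > total:
                break                         # header cut off
            repeat = read_field(data, pos + 4, 1)
            pitch = read_field(data, pos + 5, 6)
            if repeat:
                kind, size = "repeat", 11     # reuses previous K values
            elif pitch == 0:
                kind, size = "unvoiced", 29   # K1..K4 only
            else:
                kind, size = "voiced", 50     # K1..K10
        if pos + size > total:
            break                             # frame cut off by a bad split
        frames.append((pos // 8, pos % 8, kind))
        pos += size
        if kind == "stop":
            break
    return frames


if __name__ == "__main__":
    # First 13 data bytes of mizapf's part 1, plus CHR$(15) as a stop byte.
    data = bytes([128, 254, 230, 54, 173, 50, 75, 88, 186, 155, 178, 140, 172, 15])
    for frame in walk_frames(data):
        print(frame)
```

A split is only safe where a frame ends. CHR$(15) works as a stop code only when that point falls on a byte boundary, as it happens to here at byte 13, bit 0.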

+mizapf · January 20, 2021

By the way, does BlueWizard produce TMS5220 or TMS5200 LPC code? Our speech synthesizer is a TMS5200/TMS0285; the output will sound a bit different.

+FarmerPotato · January 20, 2021

Just now, mizapf said:

By the way, does BlueWizard produce TMS5220 or TMS5200 LPC code? Our speech synthesizer is a TMS5200/TMS0285; the output will sound a bit different.

It has the TMS5220 coding table.

I kludged in the table from MAME.

In tms5110r.hxx, there is T0285_2501E_coeff.

When I do that, BlueWizard doesn't work so well at fitting (probably this was never tested!)

I have not heard it on real hardware yet. My one test on js99er.net is with a 5220 coded file. It sounds recognizable, but too faint to tell if it is any good. It sounds fine in BlueWizard playback, but that's a closed loop!

I have a problem where I didn't speak with a level volume. BlueWizard scaled the gain to mostly small numbers 1-7, with highest volume at the start of each word. I can see that in Audacity (where I record), so it's my fault. But there are still a lot of options in BlueWizard that you might tweak.

+FarmerPotato · January 20, 2021

On 1/17/2021 at 12:40 PM, mizapf said:

Took a bit ... solved with Speecoder, then created a small Extended Basic program to recreate the strings. Note that a string must not exceed 255 characters, so I had to split it.

The process that I used can obviously be abbreviated, and you don't need Speecoder.

You have to split the file in pieces, and add the bytes 0x60, high byte of length, low byte of length. Also, at the ends of the pieces, a stop code should be added. It could suffice to add "255,255".


You got lucky with your first split. Here is my frame-by-frame decode. The stop frame begins in the last byte at bit 0 (the LSBit).



*   time frame addr bit type energy  rpt pitch k1 k2 k3 k4 k5 k6 k7 k8 k9 k10
*      0     1    0  0  zero      0
*     25     2    0  4  voiced    1   0   63   22 14 13  9  6 10  9  4  6  4
*     50     3    6  6  voiced    8   0   52   23 14 12 10  6  9  8  4  6  5
*     75     4   13  0  voiced   10   0   15   23 12 12  9  6 10  9  4  5  3
*    100     5   19  2  voiced    8   0   63   24 11 12 10  8  7  8  4  4  4
*    125     6   25  4  voiced    7   0   32   25 14 11 10  7  7  9  3  4  4
*    150     7   31  6  voiced    7   0   32   24 12 12  9  8  9 11  4  5  4
*    175     8   38  0  voiced    7   0   63   25 12 13 10  8  7  9  4  5  4
*    200     9   44  2  voiced    7   0   32   24 11 14 10  8  8 10  4  4  3
*    225    10   50  4  voiced    6   0   23   25 10 13 10  8  8 11  3  4  3
*    250    11   56  6  voiced    8   0    2   20 12 11  5  6 11  8  3  4  4
*    275    12   63  0  voiced   14   0    2   20 17  9  2  8 10 11  3  4  5
*    300    13   69  2  voiced   12   0    2   19 17  7  4 11 10  7  3  5  6
*    325    14   75  4  voiced   11   0    2   21 13 12  6  5 10  7  2  4  4
*    350    15   81  6  voiced    5   0    4   23 13 13  9  6  6  7  4  4  3
*    375    16   88  0  unvoiced  2   0    0   25 12 12 11
*    400    17   91  5  voiced    6   0    5   22 16 15 10  8  9  8  4  3  3
*    425    18   97  7  voiced    7   0    5   20 17 12  7  6  9  9  3  4  3
*    450    19  104  1  voiced   14   0    4   20 12 12  8 10  9 10  4  5  4
*    475    20  110  3  voiced   14   0    3   20 17 12  8  8 10 12  4  4  3
*    500    21  116  5  voiced   12   0    2   20 21 10  5 10 11  7  1  3  4
*    525    22  122  7  voiced   10   0    2   21 15 11  7  7  8  9  2  4  4
*    550    23  129  1  voiced    5   0    8   23 11 12 10  7  7  9  3  3  3
*    575    24  135  3  unvoiced  1   0    0   25 10 10 10
*    600    25  139  0  unvoiced  1   0    0   25  9  9  9
*    625    26  142  5  unvoiced  1   0    0   25 10 11  9
*    650    27  146  2  voiced    3   0    7   23  9 12  9  6  6 10  3  4  2
*    675    28  152  4  voiced   10   0    9   18 17 11  4  9 13 12  3  5  3
*    700    29  158  6  voiced   12   0    7   18 18 11  5  9 13 11  2  5  4
*    725    30  165  0  stop     15
* BITS/SEC: 1765.3

With the next split, the last nybble was 0, which is a safe split when you add 0,240.



*   time frame addr bit type energy  rpt pitch k1 k2 k3 k4 k5 k6 k7 k8 k9 k10
*      0     1    0  0  voiced    1   0   44   18 14 12  4  7 10 10  2  3  3
*     25     2    6  2  voiced    3   0   44   19 15 12  4  7 12 12  2  4  3
*     50     3   12  4  voiced    7   0   54   20 16 12  4  7 11 12  2  5  2
*     75     4   18  6  voiced   11   0   21   20 13 13  6  6 10 12  3  4  3
*    100     5   25  0  voiced    7   0   21   20 13 12  6  4 10 11  1  3  1
*    125     6   31  2  voiced    1   0   59   20  8 11  7  4  6 10  2  3  2
*    150     7   37  4  zero      0
*    175     8   38  0  zero      0
*    200     9   38  4  zero      0
*    225    10   39  0  voiced    1   0    5   17 20  9  5  8 11  8  2  4  2
*    250    11   45  2  voiced    3   0   31   19 15 11  6  6 11 10  2  4  2
*    275    12   51  4  voiced    9   0   19   19 15 12  5  7 11 11  2  4  3
*    300    13   57  6  voiced    6   0   19   18 17 11  3 10 11  9  1  3  3
*    325    14   64  0  voiced    1   0   20   19 11 11  5  5  7  8  2  2  2
*    350    15   70  2  zero      0
*    375    16   70  6  zero      0
*    400    17   71  2  voiced    1   0   54   24 13 12  8  6  8 10  3  3  2
*    425    18   77  4  voiced    6   0   27   22 19 12 10  6  8  9  4  4  4
*    450    19   83  6  voiced    8   0   62   22 19 12  8  4  7  9  4  4  4
*    475    20   90  0  voiced   10   0   13   23 13  9  8  4  9 10  5  6  4
*    500    21   96  2  voiced   12   0   14   21 12  8  5  7 11 10  4  5  5
*    525    22  102  4  voiced   12   0    2   20 14  8  5  5 12 11  4  4  3
*    550    23  108  6  voiced   11   0    3   20 19  7  4  9 11  8  2  3  4
*    575    24  115  0  voiced    7   0   18   20 19  6  8 11 12 11  1  4  4
*    600    25  121  2  voiced    1   0    4   19 18  8  8 12 12  9  1  5  3
*    625    26  127  4  zero      0
*    650    27  128  0  zero      0
*    675    28  128  4  zero      0
*    700    29  129  0  zero      0
*    725    30  129  4  stop     15
* BITS/SEC: 1386.7



*   time frame addr bit type energy  rpt pitch k1 k2 k3 k4 k5 k6 k7 k8 k9 k10
*      0     1    0  0  voiced    8   0    4   21 12 12  5  5 10  9  2  3  3
*     25     2    6  2  voiced    4   0    2   23 12 13  7  5  8 11  4  4  2
*     50     3   12  4  unvoiced  1   0    0   26 12 13 11
*     75     4   16  1  voiced    3   0    2   24 13 14 11  9  5  9  4  5  3
*    100     5   22  3  unvoiced  2   0    0   25 12 13 12
*    125     6   26  0  voiced    2   0   63   25 12 12 12  9  8  7  4  5  3
*    150     7   32  2  voiced    4   0   16   18 16 11  4  8 13  8  2  5  3
*    175     8   38  4  voiced   10   0   15   19 15 13  5  9 11  9  2  5  4
*    200     9   44  6  voiced   11   0   13   19 17 13  6  8 10 10  2  5  3
*    225    10   51  0  voiced   13   0    9   20 15 12  5  7 11 10  2  5  4
*    250    11   57  2  voiced    9   0    8   19 14 12  5  6 10 10  2  3  3
*    275    12   63  4  voiced    3   0   16   22 10 11  7  6 10  9  4  4  2
*    300    13   69  6  voiced    2   0   11   24 14 12 11  6  6  9  4  4  3
*    325    14   76  0  voiced    2   0   63   25 12 12 11  7  7  9  3  5  4
*    350    15   82  2  voiced    1   0   44   24 13 13 11  8  6  8  4  4  4
*    375    16   88  4  voiced    1   0   63   23 13 13 11  7  5 11  4  4  3
*    400    17   94  6  voiced    5   0   44   18 16 11  6  7 11 10  2  5  2
*    425    18  101  0  voiced   10   0   19   19 19 11  4  7 13 12  2  5  4
*    450    19  107  2  voiced   12   0   17   20 17 11  6  6 12 12  3  4  3
*    475    20  113  4  voiced   12   0   17   19 14 11  8  7 10 11  3  4  2
*    500    21  119  6  voiced    5   0   19   20 11 11  6  5  9  8  2  3  2
*    525    22  126  0  voiced    2   0   20   22  9  9  7  5  6  8  3  3  2
*    550    23  132  2  zero      0
*    575    24  132  6  voiced    1   0    7   24 14 11  9  8  5  7  4  4  2
*    600    25  139  0  voiced    1   0   10   24 12 11 11  6  5  7  3  4  3
*    625    26  145  2  voiced    1   0   63   25 12 13 12  9  5  8  4  5  3
*    650    27  151  4  voiced    1   0   21   25  9 10  9  6  3  8  3  4  2
*    675    28  157  6  voiced    6   0   29   18 20  9  4 11 11  8  2  4  3
*    700    29  164  0  voiced    9   0   27   18 18 12  4 11 12  9  2  4  4
*    725    30  170  2  voiced   11   0   24   19 18 11  4  8 11  9  3  4  3
*    750    31  176  4  voiced    8   0   25   20 16 12  4  5 10  8  2  4  3
*    775    32  182  6  voiced    1   0   32   22 11 12  7  6  7  8  4  3  1
*    800    33  189  0  voiced    1   0   25   24 13 11 10  5  6  8  4  4  2
*    825    34  195  2  voiced    1   0   15   24 15 12  9  7  9  7  3  4  3
*    850    35  201  4  stop     15
* BITS/SEC: 1846.9

FYI, 2000 bits/sec is the worst-case if all frames are voiced.
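
The arithmetic behind that, as a small Python sketch. The 25 ms frame period comes from the time column in the decodes; the voiced field widths are my reading of the data manual and are an assumption here. The BITS/SEC lines appear to be total bits (including the stop code) divided by the frame count times 25 ms.

```python
# Field widths of a full voiced frame: energy, repeat, pitch, K1..K10 (assumed).
VOICED_FIELD_BITS = [4, 1, 6, 5, 5, 4, 4, 4, 4, 4, 3, 3, 3]
FRAMES_PER_SECOND = 40  # one frame every 25 ms


def worst_case_bits_per_second():
    """Bit rate if every frame is a full voiced frame."""
    return sum(VOICED_FIELD_BITS) * FRAMES_PER_SECOND


def bits_per_second(total_bits, frames):
    """Average bit rate of a pattern with the given bit and frame counts."""
    return total_bits * FRAMES_PER_SECOND / frames


if __name__ == "__main__":
    print(worst_case_bits_per_second())  # 2000
    # First decode: stop frame at byte 165, bit 0, so 165*8 + 4 bits in 30 frames.
    print(round(bits_per_second(165 * 8 + 4, 30), 1))  # 1765.3
```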

I noticed the pitch values are wild. It could be a bug in my code. Or this sample could be a very bad fit to BlueWizard or the speech model itself.

Nick99 · January 20, 2021

I tested laugh.dsk on my TI and it is very close to what I hear in BlueWizard. The reason for trying a laugh is to see whether the speech synthesizer can produce sounds for my game project (baby language, to be specific).

I may not succeed with the game itself, since I lack the time to re-learn XB. I'll do all I can and ask for help.

GDMike · January 20, 2021

I also ran this. I can hear processing sounds, a noise in the background, as the data is spoken. From what I remember that's normal, but the speech would sound better if that processing "hum" weren't there.

When I adjust the volume of the amplifier down, I actually lose the speech sound, so I found that not too much of an option.

The laugh is very cool.

Edited January 20, 2021 by GDMike

+mizapf · January 20, 2021

Actually, I used my Speecoder tool to place the splits and stop codes, so this was not really a matter of luck.

+mizapf · January 20, 2021

As for Speecoder, it does help a bit, but I had to learn where my tool has its limits. For instance, it refuses to create MERGE files when the resulting string would be longer than 255 bytes, and you have to clip the file by yourself. It might be simple to include a feature which automatically splits the speech strings, but - well - I don't really plan to put hands on my mostly uncommented assembly code from the late 80s.

However, it does its job for assembly source code files, because there is no string length limitation. But in this case, where you already have a text version of the LPC data, it would be much easier to write a Python script or similar to turn the data into BYTE assembly code lines, without a detour through my program.
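
Such a script could look roughly like this. It is only a sketch, assuming the text file holds comma-separated hex values like the ee,ad,d9,... quoted earlier in the thread. The function name, the indentation and the eight-values-per-line layout are arbitrary choices.

```python
INDENT = " " * 7  # arbitrary column for the directive


def to_byte_lines(hex_text, per_line=8):
    """Turn comma- or space-separated hex values into BYTE source lines."""
    values = [int(token, 16) for token in hex_text.replace(",", " ").split()]
    lines = []
    for i in range(0, len(values), per_line):
        chunk = values[i:i + per_line]
        lines.append(INDENT + "BYTE " + ",".join(f">{v:02X}" for v in chunk))
    return lines


if __name__ == "__main__":
    for line in to_byte_lines("ee,ad,d9,27,bd,93,84,a7,9b"):
        print(line)
```

This only reformats the bytes. The >60 header, the length and a properly placed stop code still have to be added separately.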

+FarmerPotato · January 21, 2021

3 hours ago, mizapf said:

As for Speecoder, it does help a bit, but I had to learn where my tool has its limits.

I need to take a look at what you have in Speecoder! Where can I find it? Oh, here

I've added a feature to my C utility, lpctos. It has -b to generate BASIC listings, and -f to take a batch of files at once.
