Nick99 Posted January 16, 2021
I have played around with BlueWizard and wonder how to use the LPC data in, let's say, XB. After reading everything I could find on this forum, I still have no clue how to use the data.
pixelpedant Posted January 16, 2021
Well, if it gives you an LPC speech string which is structured correctly, you can just feed it directly to CALL SAY, the same way you can feed it a value generated by CALL SPGET. Patterns in this context always begin with 0110000000000000 and end with 1111 somewhere in the last byte, where 01100000 is the Speak-External command and 1111 is the stop code. But each byte is inverted, and the actual values contained within the frames are of variable length and are not aligned with the bytes themselves. So the actual speech frames are utterly obscured until you flip the bytes.
I haven't gotten around to playing with BlueWizard either (mainly because I don't have a Mac). Should do, now that it's been ported to Python. But it'd strike me as odd if it weren't already correctly structured for use. I'd just check that it begins with 0110000000000000 and that the first non-zero values in the final byte are 1111. If so, you should be good to just feed it to CALL SAY as a string.
Nick99 Posted January 17, 2021
Thanks @pixelpedant! The LPC file begins with ee,ad,d9,27,bd,93,84,a7... and ends with ...9b,71,f7,70,db. I guess I could use it in this format in assembly, just as in the QBox tutorial on YouTube: DATA >ee,>ad,>d9 and so on? Or I could convert all the data to binary and put it in an XB program. Looks like I have a lot of work to do: convert everything or learn assembly.
pixelpedant Posted January 17, 2021
Yeah, so that looks like it's just the LPC data itself, without the additional values CALL SAY will need. However, those are easy enough to add in. Assuming this is just the raw LPC data, you'd just need to add three bytes to the start of the pattern, namely CHR$(96), CHR$(0), CHR$(X), where X is the number of bytes of LPC data which follow. So presuming a string variable LPC$ contains the raw LPC pattern,
SAYOUT$=CHR$(96)&CHR$(0)&CHR$(LEN(LPC$))&LPC$
CALL SAY(,SAYOUT$)
should give you your XB output. It'll still be missing the stop nibble if >DB is the final value in the source pattern as you mention. But this isn't obligatory for successful output of a single string. You could add CHR$(15) to the end on point of principle, though, so it's there. In this case, the length byte would also need to be modified, such that the result is
SAYOUT$=CHR$(96)&CHR$(0)&CHR$(LEN(LPC$)+1)&LPC$&CHR$(15)
CALL SAY(,SAYOUT$)
I really should play with this program myself, though.
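For anyone preparing the data on a PC first, the three-byte header described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `wrap_for_call_say` and the `add_stop_byte` flag are made up for this example, and the 252-byte payload limit simply follows from a 255-character string minus the three header bytes.

```python
def wrap_for_call_say(lpc, add_stop_byte=False):
    """Prefix raw LPC bytes with 0x60, 0x00 and a length byte.

    Hypothetical helper: mirrors CHR$(96)&CHR$(0)&CHR$(LEN(LPC$))&LPC$.
    """
    payload = bytes(lpc)
    if add_stop_byte:
        # Same idea as appending CHR$(15); the length byte grows by one.
        payload += bytes([15])
    if len(payload) > 252:
        # Assumption: 255-character string limit minus 3 header bytes.
        raise ValueError("payload too long for one string, split it first")
    return bytes([0x60, 0x00, len(payload)]) + payload


raw = bytes([0xEE, 0xAD, 0xD9])          # first three bytes quoted in the thread
print(list(wrap_for_call_say(raw)))      # [96, 0, 3, 238, 173, 217]
```

The result is still just bytes; it has to be turned into DATA statements or a string on the TI side before CALL SAY can use it.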
+mizapf Posted January 17, 2021
Please send me the LPC data; maybe I can do something with it in Speecoder (my program for analyzing speech data).
+mizapf Posted January 17, 2021
Took a bit ... solved with Speecoder, then created a small Extended Basic program to recreate the strings. Note that a string must not exceed 255 characters, so I had to split it. The process that I used can obviously be abbreviated, and you don't need Speecoder. You have to split the file into pieces and prefix each piece with the bytes 0x60, high byte of length, low byte of length. Also, at the ends of the pieces, a stop code should be added. It could suffice to add "255,255".
10 PRINT "READ PART 1" :: RESTORE 1000
20 GOSUB 500
30 S1$=A$
40 PRINT "READ PART 2" :: RESTORE 1500
50 GOSUB 500
60 S2$=A$
70 PRINT "READ PART 3" :: RESTORE 2000
80 GOSUB 500
90 S3$=A$
100 CALL SAY(,S1$,,S2$,,S3$)
110 END
500 READ A,B,C
510 A$=CHR$(A)&CHR$(B)&CHR$(C)
520 FOR I=1 TO C :: READ X :: A$=A$&CHR$(X) :: NEXT I
530 RETURN
1000 DATA 96,0,166
1010 DATA 128,254,230,54,173,50
1020 DATA 75,88,186,155,178,140
1030 DATA 172,133,239,102,210,42
1040 DATA 83,135,127,232,169,240
1050 DATA 72,226,130,233,186,220
1060 DATA 211,137,11,134,153,140
1070 DATA 236,52,238,159,102,43
1080 DATA 60,211,184,96,232,171
1090 DATA 136,74,108,244,169,182
1100 DATA 34,218,113,128,138,105
1110 DATA 181,14,39,7,42,49
1120 DATA 41,170,157,14,40,71
1130 DATA 151,46,119,215,160,106
1140 DATA 199,180,92,137,66,186
1150 DATA 221,52,243,196,4,152
1160 DATA 102,218,160,45,124,69
1170 DATA 70,108,135,150,56,110
1180 DATA 153,142,15,82,204,68
1190 DATA 101,165,57,112,137,19
1200 DATA 81,147,120,64,165,150
1210 DATA 86,59,157,2,85,223
1220 DATA 238,145,74,20,209,245
1230 DATA 148,123,218,70,192,84
1240 DATA 85,8,152,50,19,1
1250 DATA 83,117,50,188,203,73
1260 DATA 179,114,84,200,20,91
1270 DATA 114,199,245,192,83,106
1280 DATA 205,109,53,15
1500 DATA 96,0,203
1510 DATA 1,169,102,84,43,101
1520 DATA 11,168,155,117,141,78
1530 DATA 132,128,101,182,25,52
1540 DATA 236,117,106,166,38,192
1550 DATA 52,59,228,159,102,38
1560 DATA 195,83,11,33,133,150
1570 DATA 216,80,93,248,236,87
1580 DATA 179,83,77,99,115,92
1590 DATA 139,42,213,139,44,126
1600 DATA 212,187,212,36,34,187
1610 DATA 81,171,146,205,132,166
1620 DATA 218,173,50,17,161,135
1630 DATA 155,54,203,196,228,159
1640 DATA 102,218,61,221,160,102
1650 DATA 216,237,176,72,130,254
1660 DATA 110,183,93,59,177,106
1670 DATA 82,104,243,46,85,69
1680 DATA 206,185,197,119,212,12
1690 DATA 177,196,54,155,113,60
1700 DATA 196,236,58,188,218,145
1710 DATA 146,139,110,211,12,89
1720 DATA 68,105,50,93,45,108
1730 DATA 1,194,135,235,12,245
1740 DATA 68,136,26,166,219,212
1750 DATA 29,163,127,154,157,212
1760 DATA 72,141,212,41,43,141
1770 DATA 195,145,113,83,73,233
1780 DATA 14,197,201,78,105,164
1790 DATA 39,149,52,35,167,150
1800 DATA 232,116,28,204,18,70
1810 DATA 180,66,49,10,154,30
1820 DATA 55,143,152,200,28,182
1830 DATA 75,45,18,33,126,248
1840 DATA 73,79,119,252,0
2000 DATA 96,0,131
2010 DATA 168,73,110,196,171,100
2020 DATA 179,38,251,17,159,81
2030 DATA 236,182,18,70,188,71
2040 DATA 85,83,139,93,179,26
2050 DATA 199,78,45,118,76,170
2060 DATA 105,162,187,136,118,177
2070 DATA 146,5,0,8,141,37
2080 DATA 53,58,20,49,63,251
2090 DATA 54,235,82,148,228,236
2100 DATA 71,189,91,177,145,83
2110 DATA 108,174,78,218,72,201
2120 DATA 186,85,61,36,1,160
2130 DATA 109,216,9,139,178,101
2140 DATA 236,150,167,44,50,73
2150 DATA 248,90,158,16,207,36
2160 DATA 133,237,54,67,178,186
2170 DATA 12,174,154,80,239,74
2180 DATA 59,160,226,66,117,58
2190 DATA 113,131,75,118,201,14
2200 DATA 57,78,42,217,162,167
2210 DATA 153,32,36,167,136,153
2220 DATA 100,13,0,240,0
laugh.dsk
Edited January 17, 2021 by mizapf
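Typing that many DATA lines by hand is error-prone, so here is a rough Python sketch that prints one already-split piece in the same layout as the listing above (a header line with 96, 0 and the length, then six values per line). The function name `to_xb_data_lines` and its parameters are invented for this example. It does not decide where to split; the pieces still have to end on a frame boundary with a stop code, as discussed in the thread.

```python
def to_xb_data_lines(payload, first_line=1000, step=10, per_line=6):
    """Format one piece of LPC data as Extended BASIC DATA lines.

    Hypothetical helper: 'payload' is a sequence of byte values (0-255)
    that already ends with its stop code.
    """
    payload = list(payload)
    if len(payload) > 252:
        # Assumption: 255-character string limit minus 3 header bytes.
        raise ValueError("piece too long for one XB string")
    lines = [f"{first_line} DATA 96,0,{len(payload)}"]
    number = first_line + step
    for start in range(0, len(payload), per_line):
        chunk = payload[start:start + per_line]
        lines.append(f"{number} DATA " + ",".join(str(v) for v in chunk))
        number += step
    return lines


# First eight data bytes of part 1, used here only as sample input.
for line in to_xb_data_lines([128, 254, 230, 54, 173, 50, 75, 88]):
    print(line)
```

Calling it once per piece with `first_line` set to 1000, 1500 and 2000 would match the RESTORE targets used in the program.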
Nick99 Posted January 17, 2021
Thank you very much @mizapf !
+FarmerPotato Posted January 20, 2021
On 1/17/2021 at 3:39 AM, pixelpedant said: "It'll still be missing the stop nibble if >DB is the final value in the source pattern as you mention. But this isn't obligatory for successful output of a single string. Could add CHR$(15) to the end on point of principle, though, so it's there. In this case, the length byte would also need to be modified such that the result is"
BlueWizard has a checkbox for "Add stop command". Use it. Adding a 1111 stop command as CHR$(15) is not reliable. The stop command must begin in the first free bit. The speech chip will interpret the next 4 bits after the final bit of the last frame. If it gets all 1s, it stops. If it sees four 0s, that's a silent frame, and it continues. Otherwise, the frame has more parameters and it eats more bits... I think bad things happen when you send garbage.
One resource (not the exact chip): TMS5220 Voice Synthesis Processor (VSP) Data Manual
http://bitsavers.org/components/ti/_dataBooks/TMS5220_Voice_Synthesis_Processor_Data_Manual_-_preliminary_Jun81.pdf
See figure 5 on page 9 for the frame sizes and content.
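To see where the "first free bit" actually falls, the frames can be walked on a PC. The Python sketch below is a rough illustration, not anyone's actual tool. It assumes TMS5220-style field widths (energy 4 bits, repeat 1, pitch 6, K1 to K10 as 5,5,4,4,4,4,4,3,3,3) and assumes the data bytes are consumed starting at bit 0, with each field read most significant bit first. The names `BitReader` and `walk_frames` are made up for this example.

```python
ENERGY_BITS, REPEAT_BITS, PITCH_BITS = 4, 1, 6
K_BITS = [5, 5, 4, 4, 4, 4, 4, 3, 3, 3]  # K1..K10 (assumed 5220-style widths)


class BitReader:
    """Consumes bits from each byte starting at bit 0 (the LSBit)."""

    def __init__(self, data):
        self.data = bytes(data)
        self.pos = 0  # absolute bit position in the stream

    def read(self, width):
        value = 0
        for _ in range(width):
            byte_index, bit_index = divmod(self.pos, 8)
            if byte_index >= len(self.data):
                raise EOFError("ran out of LPC data")
            bit = (self.data[byte_index] >> bit_index) & 1
            value = (value << 1) | bit  # first bit read is the field's MSB
            self.pos += 1
        return value


def walk_frames(data):
    """Return (byte address, bit, frame type) for each frame up to the stop."""
    reader = BitReader(data)
    frames = []
    while True:
        addr, bit = divmod(reader.pos, 8)
        try:
            energy = reader.read(ENERGY_BITS)
            if energy == 0:
                kind = "zero"            # silent frame, 4 bits total
            elif energy == 15:
                kind = "stop"            # stop code, 4 bits total
            else:
                repeat = reader.read(REPEAT_BITS)
                pitch = reader.read(PITCH_BITS)
                if repeat:
                    kind = "repeat"      # reuses the previous K values
                else:
                    kind = "unvoiced" if pitch == 0 else "voiced"
                    count = 4 if pitch == 0 else 10
                    for width in K_BITS[:count]:
                        reader.read(width)
        except EOFError:
            frames.append((addr, bit, "truncated"))
            return frames
        frames.append((addr, bit, kind))
        if kind == "stop":
            return frames


print(walk_frames(bytes([240])))  # [(0, 0, 'zero'), (0, 4, 'stop')]
```

If the last entry comes back as "truncated", the data has no stop code where the chip would look for one, and an appended CHR$(15) only helps when the previous frame happens to end exactly on a byte boundary.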
+mizapf Posted January 20, 2021
By the way, does BlueWizard produce TMS5220 or TMS5200 LPC code? Our speech synthesizer is a TMS5200/TMS0285; the output will sound a bit different.
+FarmerPotato Posted January 20, 2021
It has the TMS5220 coding table. I kludged in the table from MAME: in tms5110r.hxx, there is T0285_2501E_coeff. When I do that, BlueWizard doesn't work so well at fitting (probably this was never tested!). I have not heard it on real hardware yet. My one test on js99er.net is with a 5220-coded file. It sounds recognizable, but too faint to tell if it is any good. It sounds fine in BlueWizard playback, but that's a closed loop!
I have a problem where I didn't speak at a level volume. BlueWizard scaled the gain to mostly small numbers, 1-7, with the highest volume at the start of each word. I can see that in Audacity (where I record), so it's my fault. But there are still a lot of options in BlueWizard that you might tweak.
+FarmerPotato Posted January 20, 2021
On 1/17/2021 at 12:40 PM, mizapf said: "Took a bit ... solved with Speecoder, then created a small Extended Basic program to recreate the strings. Note that a string must not exceed 255 characters, so I had to split it. The process that I used can obviously be abbreviated, and you don't need Speecoder. You have to split the file in pieces, and add the bytes 0x60, high byte of length, low byte of length. Also, at the ends of the pieces, a stop code should be added. It could suffice to add "255,255"."
You got lucky with your first split. Here is my frame-by-frame decode. The stop frame begins in the last byte, bit 0 (the LSBit).
* time frame addr bit type energy rpt pitch k1 k2 k3 k4 k5 k6 k7 k8 k9 k10
* 0 1 0 0 zero 0
* 25 2 0 4 voiced 1 0 63 22 14 13 9 6 10 9 4 6 4
* 50 3 6 6 voiced 8 0 52 23 14 12 10 6 9 8 4 6 5
* 75 4 13 0 voiced 10 0 15 23 12 12 9 6 10 9 4 5 3
* 100 5 19 2 voiced 8 0 63 24 11 12 10 8 7 8 4 4 4
* 125 6 25 4 voiced 7 0 32 25 14 11 10 7 7 9 3 4 4
* 150 7 31 6 voiced 7 0 32 24 12 12 9 8 9 11 4 5 4
* 175 8 38 0 voiced 7 0 63 25 12 13 10 8 7 9 4 5 4
* 200 9 44 2 voiced 7 0 32 24 11 14 10 8 8 10 4 4 3
* 225 10 50 4 voiced 6 0 23 25 10 13 10 8 8 11 3 4 3
* 250 11 56 6 voiced 8 0 2 20 12 11 5 6 11 8 3 4 4
* 275 12 63 0 voiced 14 0 2 20 17 9 2 8 10 11 3 4 5
* 300 13 69 2 voiced 12 0 2 19 17 7 4 11 10 7 3 5 6
* 325 14 75 4 voiced 11 0 2 21 13 12 6 5 10 7 2 4 4
* 350 15 81 6 voiced 5 0 4 23 13 13 9 6 6 7 4 4 3
* 375 16 88 0 unvoiced 2 0 0 25 12 12 11
* 400 17 91 5 voiced 6 0 5 22 16 15 10 8 9 8 4 3 3
* 425 18 97 7 voiced 7 0 5 20 17 12 7 6 9 9 3 4 3
* 450 19 104 1 voiced 14 0 4 20 12 12 8 10 9 10 4 5 4
* 475 20 110 3 voiced 14 0 3 20 17 12 8 8 10 12 4 4 3
* 500 21 116 5 voiced 12 0 2 20 21 10 5 10 11 7 1 3 4
* 525 22 122 7 voiced 10 0 2 21 15 11 7 7 8 9 2 4 4
* 550 23 129 1 voiced 5 0 8 23 11 12 10 7 7 9 3 3 3
* 575 24 135 3 unvoiced 1 0 0 25 10 10 10
* 600 25 139 0 unvoiced 1 0 0 25 9 9 9
* 625 26 142 5 unvoiced 1 0 0 25 10 11 9
* 650 27 146 2 voiced 3 0 7 23 9 12 9 6 6 10 3 4 2
* 675 28 152 4 voiced 10 0 9 18 17 11 4 9 13 12 3 5 3
* 700 29 158 6 voiced 12 0 7 18 18 11 5 9 13 11 2 5 4
* 725 30 165 0 stop 15
* BITS/SEC: 1765.3
With the next split, the last nybble was 0, which is a safe split when you add 0,240.
* time frame addr bit type energy rpt pitch k1 k2 k3 k4 k5 k6 k7 k8 k9 k10
* 0 1 0 0 voiced 1 0 44 18 14 12 4 7 10 10 2 3 3
* 25 2 6 2 voiced 3 0 44 19 15 12 4 7 12 12 2 4 3
* 50 3 12 4 voiced 7 0 54 20 16 12 4 7 11 12 2 5 2
* 75 4 18 6 voiced 11 0 21 20 13 13 6 6 10 12 3 4 3
* 100 5 25 0 voiced 7 0 21 20 13 12 6 4 10 11 1 3 1
* 125 6 31 2 voiced 1 0 59 20 8 11 7 4 6 10 2 3 2
* 150 7 37 4 zero 0
* 175 8 38 0 zero 0
* 200 9 38 4 zero 0
* 225 10 39 0 voiced 1 0 5 17 20 9 5 8 11 8 2 4 2
* 250 11 45 2 voiced 3 0 31 19 15 11 6 6 11 10 2 4 2
* 275 12 51 4 voiced 9 0 19 19 15 12 5 7 11 11 2 4 3
* 300 13 57 6 voiced 6 0 19 18 17 11 3 10 11 9 1 3 3
* 325 14 64 0 voiced 1 0 20 19 11 11 5 5 7 8 2 2 2
* 350 15 70 2 zero 0
* 375 16 70 6 zero 0
* 400 17 71 2 voiced 1 0 54 24 13 12 8 6 8 10 3 3 2
* 425 18 77 4 voiced 6 0 27 22 19 12 10 6 8 9 4 4 4
* 450 19 83 6 voiced 8 0 62 22 19 12 8 4 7 9 4 4 4
* 475 20 90 0 voiced 10 0 13 23 13 9 8 4 9 10 5 6 4
* 500 21 96 2 voiced 12 0 14 21 12 8 5 7 11 10 4 5 5
* 525 22 102 4 voiced 12 0 2 20 14 8 5 5 12 11 4 4 3
* 550 23 108 6 voiced 11 0 3 20 19 7 4 9 11 8 2 3 4
* 575 24 115 0 voiced 7 0 18 20 19 6 8 11 12 11 1 4 4
* 600 25 121 2 voiced 1 0 4 19 18 8 8 12 12 9 1 5 3
* 625 26 127 4 zero 0
* 650 27 128 0 zero 0
* 675 28 128 4 zero 0
* 700 29 129 0 zero 0
* 725 30 129 4 stop 15
* BITS/SEC: 1386.7
* time frame addr bit type energy rpt pitch k1 k2 k3 k4 k5 k6 k7 k8 k9 k10
* 0 1 0 0 voiced 8 0 4 21 12 12 5 5 10 9 2 3 3
* 25 2 6 2 voiced 4 0 2 23 12 13 7 5 8 11 4 4 2
* 50 3 12 4 unvoiced 1 0 0 26 12 13 11
* 75 4 16 1 voiced 3 0 2 24 13 14 11 9 5 9 4 5 3
* 100 5 22 3 unvoiced 2 0 0 25 12 13 12
* 125 6 26 0 voiced 2 0 63 25 12 12 12 9 8 7 4 5 3
* 150 7 32 2 voiced 4 0 16 18 16 11 4 8 13 8 2 5 3
* 175 8 38 4 voiced 10 0 15 19 15 13 5 9 11 9 2 5 4
* 200 9 44 6 voiced 11 0 13 19 17 13 6 8 10 10 2 5 3
* 225 10 51 0 voiced 13 0 9 20 15 12 5 7 11 10 2 5 4
* 250 11 57 2 voiced 9 0 8 19 14 12 5 6 10 10 2 3 3
* 275 12 63 4 voiced 3 0 16 22 10 11 7 6 10 9 4 4 2
* 300 13 69 6 voiced 2 0 11 24 14 12 11 6 6 9 4 4 3
* 325 14 76 0 voiced 2 0 63 25 12 12 11 7 7 9 3 5 4
* 350 15 82 2 voiced 1 0 44 24 13 13 11 8 6 8 4 4 4
* 375 16 88 4 voiced 1 0 63 23 13 13 11 7 5 11 4 4 3
* 400 17 94 6 voiced 5 0 44 18 16 11 6 7 11 10 2 5 2
* 425 18 101 0 voiced 10 0 19 19 19 11 4 7 13 12 2 5 4
* 450 19 107 2 voiced 12 0 17 20 17 11 6 6 12 12 3 4 3
* 475 20 113 4 voiced 12 0 17 19 14 11 8 7 10 11 3 4 2
* 500 21 119 6 voiced 5 0 19 20 11 11 6 5 9 8 2 3 2
* 525 22 126 0 voiced 2 0 20 22 9 9 7 5 6 8 3 3 2
* 550 23 132 2 zero 0
* 575 24 132 6 voiced 1 0 7 24 14 11 9 8 5 7 4 4 2
* 600 25 139 0 voiced 1 0 10 24 12 11 11 6 5 7 3 4 3
* 625 26 145 2 voiced 1 0 63 25 12 13 12 9 5 8 4 5 3
* 650 27 151 4 voiced 1 0 21 25 9 10 9 6 3 8 3 4 2
* 675 28 157 6 voiced 6 0 29 18 20 9 4 11 11 8 2 4 3
* 700 29 164 0 voiced 9 0 27 18 18 12 4 11 12 9 2 4 4
* 725 30 170 2 voiced 11 0 24 19 18 11 4 8 11 9 3 4 3
* 750 31 176 4 voiced 8 0 25 20 16 12 4 5 10 8 2 4 3
* 775 32 182 6 voiced 1 0 32 22 11 12 7 6 7 8 4 3 1
* 800 33 189 0 voiced 1 0 25 24 13 11 10 5 6 8 4 4 2
* 825 34 195 2 voiced 1 0 15 24 15 12 9 7 9 7 3 4 3
* 850 35 201 4 stop 15
* BITS/SEC: 1846.9
FYI, 2000 bits/sec is the worst case, if all frames are voiced. I noticed the pitch values are wild. It could be a bug in my code. Or this sample could be a very bad fit to BlueWizard or the speech model itself.
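The BITS/SEC figures can be cross-checked with a little arithmetic. The Python sketch below assumes 25 ms per frame (taken from the time column) and these frame sizes: 50 bits voiced, 29 bits unvoiced, 11 bits repeat, 4 bits for a zero or stop frame. The names `FRAME_BITS` and `bits_per_second` are made up for this example, and the frame counts are tallied by hand from the first decode.

```python
# Assumed frame sizes in bits (TMS5220-style fields) and frame period.
FRAME_BITS = {"voiced": 50, "unvoiced": 29, "repeat": 11, "zero": 4, "stop": 4}
FRAME_MS = 25


def bits_per_second(counts):
    """Average bit rate for a dict mapping frame type to frame count."""
    total_bits = sum(FRAME_BITS[kind] * n for kind, n in counts.items())
    total_frames = sum(counts.values())
    return total_bits / (total_frames * FRAME_MS / 1000)


# First decode: 1 zero, 24 voiced, 4 unvoiced and 1 stop frame.
first = {"zero": 1, "voiced": 24, "unvoiced": 4, "stop": 1}
print(round(bits_per_second(first), 1))     # 1765.3

# Worst case: every frame voiced, 50 bits every 25 ms.
print(bits_per_second({"voiced": 40}))      # 2000.0
```

The first result is 1324 bits over 0.75 s, which agrees with the 1765.3 shown under the first decode, and the all-voiced case gives the 2000 bits/sec worst case.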
Nick99 Posted January 20, 2021
I tested laugh.dsk on my TI and it is very near what I hear in BlueWizard. The reason for trying with a laugh is to see if the speech synthesizer can produce sounds for my game project (baby language, to be specific). I may not succeed with the game itself due to lack of time to get back into re-learning XB. I will do all I can and ask for help.
GDMike Posted January 20, 2021
I also ran this. I can hear processing noise in the background as the data is spoken. From what I remember, that's normal, but the speech would be better if that processing "hum" weren't there. When I turn the amplifier volume down, I actually lose the speech sound, so that's not much of an option. The laugh is very cool.
Edited January 20, 2021 by GDMike
+mizapf Posted January 20, 2021
Actually, I used my Speecoder tool to place the splits and stop codes, so this was not really a matter of luck.
+mizapf Posted January 20, 2021
As for Speecoder, it does help a bit, but I had to learn where my tool has its limits. For instance, it refuses to create MERGE files when the resulting string would be longer than 255 bytes, and you have to clip the file yourself. It might be simple to include a feature which automatically splits the speech strings, but - well - I don't really plan to lay hands on my mostly uncommented assembly code from the late 80s. However, it does its job for assembly source code files, because there is no string length limitation. But in this case, where you already have a text version of the LPC data, it would be much easier to write a Python script or similar to turn the data into BYTE assembly code lines, without a detour through my program.
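A Python script of the kind suggested here could look roughly like the following. It is only a sketch: the function name `to_byte_lines`, the eight-values-per-line layout and the leading indentation are arbitrary choices for this example, and it assumes the input is comma-separated hex text like the excerpt quoted earlier in the thread.

```python
def to_byte_lines(hex_text, per_line=8):
    """Turn comma-separated hex text into BYTE assembly source lines.

    Hypothetical helper; input such as "ee,ad,d9" is assumed.
    """
    values = [int(token, 16) for token in hex_text.replace(",", " ").split()]
    lines = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        # The TI assembler writes hex constants with a leading '>'.
        lines.append("       BYTE " + ",".join(f">{v:02X}" for v in chunk))
    return lines


# First eight bytes quoted earlier in the thread, used as sample input.
for line in to_byte_lines("ee,ad,d9,27,bd,93,84,a7", per_line=4):
    print(line)
```

The header bytes and the stop code are not added here; they would still need to be placed in front of and behind the data as discussed above.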
+FarmerPotato Posted January 21, 2021
3 hours ago, mizapf said: "As for Speecoder, it does help a bit, but I had to learn where my tool has its limits."
I need to take a look at what you have in Speecoder! Where can I find it? Oh, here.
I've added a feature to my C utility, lpctos. It has -b to generate BASIC listings, and -f to take a batch of files at once.