pixelpedant

What parameters and limitations (in particular, unacceptable values) are associated with characters/bytes contained in XB (or for that matter, Console BASIC) DATA statements?


So there are scenarios where, for the sake of portability, it would be really nice to be able to dump sequences of semi-arbitrary byte/string data (not strictly limited to the set of visible characters) into DATA statements in XB. XB256 does this sort of thing with compressed strings (of screen image data, etc.). My main question, consequently, is: what special characters/values/ranges need to be avoided in this sort of scenario (e.g., values with special significance to the file and DATA statement structure itself)? What byte values are illegal or reserved in an XB DATA statement (or program text in general)? I suppose this option is only really relevant to scenarios where bytes will never take certain illegal values (or the data has been encoded to avoid them). But the one I'm thinking about right now (Text-to-Speech disk allophone data) might happen to be one such case, since only a relatively small subset of values is ever going to be used. I'm actually pretty sure I've seen someone do this before with allophone data, but I can't remember where.

 

I suppose it's also worth asking whether the answer differs in Console BASIC. At the moment I'm dealing with allophone data in XB, but more often I've dealt with it in Console BASIC (because TE2).

1 hour ago, pixelpedant said:

So there are scenarios where it would be really nice for the sake of portability to be able to dump sequences of semi-arbitrary byte/string data not strictly limited to the set of visible characters into DATA statements, in XB.

Just an idea for avoiding any problems with "escape" characters, or whatever you'd like to call the characters outside the "normal" character set (32..95, or at most 32..127):

 

A German TV station (WDR - Westdeutscher Rundfunk) transmitted software in the 80s in a scanline of the TV signal outside the visible area (the way "VideoText" is transmitted, too). To avoid problems with "special" characters they used a translation they called "sixeln" (derived from the number "six"). It was quite easy:

 

Let's assume 3 bytes of data:

>F3, >41, >07

This is in binary notation:

XXXXooXX, oXoooooX, oooooXXX

Now cut this line of 3*8 = 24 bits into 4 parts of 6 bits each:

XXXXoo, XXoXoo, oooXoo, oooXXX

Interpret each 6-bit group as a new 8-bit byte (left-padded with zeros) and add 32 to each value:

ooXXXXoo = >3C, +32 = >5C

ooXXoXoo = >34, +32 = >54

oooooXoo = >04, +32 = >24

oooooXXX = >07, +32 = >27

 

Now you have a sequence of 4 bytes instead of 3, but every byte has a value in the range 32..95 (a 6-bit value of 0..63, plus 32) and will not collide with any restrictions. And decoding is as easy as encoding.
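The scheme above can be sketched on a modern machine in a few lines of Python. This is only an illustration of the 3-bytes-to-4-characters packing described in this post; the function names and the zero-padding of incomplete groups are assumptions, not part of the original WDR format. One caveat: the 32..95 range still contains the double quote (34) and the comma (44), so whether every encoded string survives being typed into a DATA line is worth checking.

```python
def sixel_encode(data: bytes) -> bytes:
    """Pack each group of 3 bytes into 4 characters in the range 32..95."""
    if len(data) % 3:
        # Assumption: pad with zero bytes; the caller must remember the true length.
        data = data + bytes(3 - len(data) % 3)
    out = bytearray()
    for i in range(0, len(data), 3):
        group = int.from_bytes(data[i:i + 3], "big")  # 24 bits
        for shift in (18, 12, 6, 0):                  # four 6-bit slices
            out.append(((group >> shift) & 0x3F) + 32)
    return bytes(out)


def sixel_decode(text: bytes) -> bytes:
    """Reverse sixel_encode: 4 characters back into 3 bytes."""
    if len(text) % 4:
        raise ValueError("encoded length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(text), 4):
        group = 0
        for c in text[i:i + 4]:
            if not 32 <= c <= 95:
                raise ValueError(f"character {c} is outside 32..95")
            group = (group << 6) | (c - 32)
        out += group.to_bytes(3, "big")
    return bytes(out)


if __name__ == "__main__":
    sample = bytes([0xF3, 0x41, 0x07])
    encoded = sixel_encode(sample)
    print(encoded.hex(), sixel_decode(encoded) == sample)
```

Decoding on the TI side would be the same arithmetic in reverse: subtract 32 from each character and reassemble three bytes from every four 6-bit values.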

 

Just my $0.02

 

Michael

 


... which is pretty close to Base64, with the exception that Base64 uses a table for mapping the 6-in-8 bytes to characters.
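To make the comparison concrete, here is a small Python sketch that takes the same 6-bit values and renders them both ways: once by adding 32, and once through the standard Base64 table. The helper name `six_bit_groups` is made up for this illustration.

```python
import base64

# The standard Base64 alphabet: index 0..63 -> character.
B64_ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def six_bit_groups(data: bytes) -> list:
    """Split data (length must be a multiple of 3) into 6-bit values."""
    if len(data) % 3:
        raise ValueError("length must be a multiple of 3 for this sketch")
    values = []
    for i in range(0, len(data), 3):
        group = int.from_bytes(data[i:i + 3], "big")
        values.extend((group >> shift) & 0x3F for shift in (18, 12, 6, 0))
    return values


sample = bytes([0xF3, 0x41, 0x07])
values = six_bit_groups(sample)

offset_form = bytes(v + 32 for v in values)          # "add 32" mapping
table_form = bytes(B64_ALPHABET[v] for v in values)  # Base64 table mapping

if __name__ == "__main__":
    print(values, offset_form, table_form)
    print(table_form == base64.b64encode(sample))
```

The bit slicing is identical in both cases; only the final step (an offset versus a table lookup) differs.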


RXB 2001 through RXB 2015 have CALL PSAVE and CALL PLOAD, which save or load 8K to or from disk or hard drive.

A CALL LOAD can save or load these values to the lower 8K, or, faster, you can use RXB's CALL MOVES to fetch any length of data and move it somewhere else, such as VDP, to be read by the Speech Synthesizer.

 

The answer is yes, you can put any byte into a DATA statement. Of course the trick is getting it there. You can't type it in with the editor so you have to resort to various subterfuges.

Here is a 1 line program:

10 DATA "HELLO"

Here is how it tokenizes:

00 0A            line 10
FF DF            address of code
09               9 bytes of code
93               DATA
C7               "string"
05               length of string
48 45 4C 4C 4F   H E L L O
00               end of line
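As a rough illustration of that layout, the following Python sketch builds and parses the length byte plus code bytes for a one-item DATA line. It is based only on the example above: the token values >93 and >C7 are taken from it, the function names are hypothetical, and the two-byte line number and address are left out because the post does not say how the address is assigned. The 251-byte limit is simply what a one-byte line length implies here, not a documented XB limit.

```python
TOKEN_DATA = 0x93           # DATA token, as shown in the example above
TOKEN_QUOTED_STRING = 0xC7  # quoted-string token, as shown in the example above
MAX_PAYLOAD = 251           # assumption: 255 minus the 4 framing bytes


def tokenize_data_line(payload: bytes) -> bytes:
    """Return the length byte plus code bytes for: DATA "<payload>"."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too long for a one-byte line length")
    code = bytes([TOKEN_DATA, TOKEN_QUOTED_STRING, len(payload)]) + payload + b"\x00"
    return bytes([len(code)]) + code


def parse_data_line(line: bytes) -> bytes:
    """Recover the payload from bytes produced by tokenize_data_line."""
    if len(line) < 5:
        raise ValueError("line too short")
    length, code = line[0], line[1:]
    if length != len(code):
        raise ValueError("length byte does not match the code size")
    if code[0] != TOKEN_DATA or code[1] != TOKEN_QUOTED_STRING or code[-1] != 0:
        raise ValueError("not a single quoted-string DATA line")
    n = code[2]
    if len(code) != n + 4:
        raise ValueError("string length does not match the code size")
    return code[3:3 + n]


if __name__ == "__main__":
    print(tokenize_data_line(b"HELLO").hex(" "))
```

Because the string carries an explicit length byte, the payload bytes themselves are never scanned for a terminator in this layout, which fits the observation that any byte value can sit inside the string.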

 

You can use Classic99's debugger to put any ASCII character you want into the string and it works fine. What I did for XB256 was to come up with a way to create a DATA statement and then save it as part of a MERGE format file.

I believe this could be saved and then loaded with TI BASIC. Remember that TI BASIC keeps the program in VDP memory.

 

I should add that you can find out a lot about XB and BASIC using Classic99's debugger.

Edited by senior_falcon

Thank you for explaining the data structure!  And confirming you can dump just any damn data you feel like into a DATA statement (given the means). 

 

It just seems to me that in a scenario where you've got, say, 120 bytes of allophone data you want to read and use all at once at some point in a program, it's appealing to just have a single DATA line inside the program from which you read it, instead of either

 

1) having an extra supporting data file just for the sake of it

or

2) listing 120 decimal values in DATA statements, which you then CHR$() and concatenate one by one. 
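For comparison, option 2 can at least be generated rather than typed. The Python sketch below turns a block of bytes into decimal DATA lines; the starting line number, step and values per line are arbitrary choices for illustration, and the function name is hypothetical.

```python
def decimal_data_lines(data: bytes, first_line: int = 100,
                       step: int = 10, per_line: int = 16) -> list:
    """Render bytes as BASIC DATA lines holding decimal values."""
    lines = []
    for n, i in enumerate(range(0, len(data), per_line)):
        values = ",".join(str(b) for b in data[i:i + per_line])
        lines.append(f"{first_line + n * step} DATA {values}")
    return lines


if __name__ == "__main__":
    allophones = bytes(range(120))  # placeholder data, not real allophone codes
    for line in decimal_data_lines(allophones):
        print(line)
```

With 16 values per line, 120 bytes become 8 DATA lines, each value costing up to four characters of listing (three digits plus a comma), which is the overhead a single string DATA line avoids.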

 

 


Likewise, I just had to put an assembly program statement together with BYTE instead of DATA. I get confused.
