Variable name/program space relationship

Opry99er · January 4, 2018

Just trying to wrap my head around something here and I can't seem to find the answer via the search function (although I know the information exists here somewhere).

When you have an XB program which contains variables, and those variables each have names, and each of these named variables has a certain number of letters in its name...

How does one calculate the number of bytes a variable takes up in program space?

G=5

vs.

GORILLA=5

So here we have two variables, each containing the same number. Can someone give me the breakdown of the program space used for each of these?

Additionally, if G is referenced 10 times in a program, and GORILLA is referenced 10 times in a program, is the initial allocation of memory for that particular variable the biggest usage, or is program space eaten up each time that variable is referenced?

IIRC, each variable is tokenized and then references to that initial variable take less space than the initial variable allocation... but I can't find the answer I'm looking for.

If I have a variable GORILLA defined in Line 100, but then modify or reference it in lines 160, 170, 180, and 300, am I chewing up unnecessary program space each time it is referenced (vs if it was called G) or do the following references only hit the program space as a token?

(Trying to weigh readability against optimization with a large program I need to downsize.)

Please give me a hand. Thanks in advance!

+adamantyr · January 4, 2018

Generally, shorter variable names are better: 1 to 2 characters at most. For one thing, long variable names mean fewer instructions per line in Extended BASIC.

+OLD CS1 · January 4, 2018

For large programs, I eventually adopted a programming method for variables I picked up while programming on the Commodore 64 in parallel: two-character variables. The shorter the variable names the fewer characters get stored in the BASIC space.

I also started using more arrays. Now, I am not certain if using arrays is any quicker as a function of name search versus index look-up, but it seemed cleaner to me.

Opry99er · January 4, 2018

Thanks very much, guys!

Yes... there is no doubt that keeping variable names short and sweet is preferable. On arrays, those can definitely be useful when your variables have commonality (for me, conceptually).

The primary thing I'm trying to decide here is whether going through and doing a variable re-write will achieve my goal, space-wise. In the above example of "GORILLA":

If the variable GORILLA takes up 7 bytes plus a token (just grasping at straws here), then that variable is taking up 8 bytes, just in the name alone. That would mean that if I defined it once and referenced or updated it 10 times throughout the code, that would be a total of 78 bytes taken up just through the name alone.....

With the example of "G" instead, that would be 12 bytes. (1 byte plus token to start, 1 byte per reference).

78 vs 12 is significant.

If, however, each subsequent reference to that variable only takes up 1 byte (once interpreted), then the gain is not so significant. It would be 18 vs. 12.
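To make the two cases above concrete, here is a rough Python sketch of the arithmetic. It only encodes the guesses from this post (one extra "token" byte on the first occurrence, then either the full name or a single byte for each later reference). The function name and its parameters are made up for illustration, and none of this is a description of how XB actually lays out a program.

```python
def name_bytes(name: str, references: int, tokenized_refs: bool) -> int:
    """Bytes spent on one variable's name under the post's two guesses.

    Assumptions (taken from the post, not from XB documentation):
    - the first occurrence costs len(name) plus 1 "token" byte;
    - each later reference costs len(name) bytes if names are stored
      in full, or 1 byte if references were tokenized.
    """
    first = len(name) + 1
    per_reference = 1 if tokenized_refs else len(name)
    return first + references * per_reference


for name in ("GORILLA", "G"):
    full = name_bytes(name, references=10, tokenized_refs=False)
    tokenized = name_bytes(name, references=10, tokenized_refs=True)
    print(f"{name}: {full} bytes if stored in full, {tokenized} if tokenized")
```

With ten references this reproduces the 78 vs 12 and 18 vs 12 figures, so the whole question comes down to which of the two models matches the interpreter.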

So, now that I've talked my way through that, I guess the question is whether the bytes used up in the physical BASIC code are the PROGRAM space, or is the interpreted GPL "object code" (for lack of a better term) what makes up actual PROGRAM space?

Due to the fact that you must RUN your program prior to getting an accurate reading when using the SIZE command, my inclination is to believe that it is either the interpreted code, or it is a combination of the two.

sometimes99er · January 4, 2018

Opry99er said:

> 78 vs 12 is significant.

> If, however, each subsequent reference to that variable only takes up 1 byte (once interpreted), then the gain is not so significant. It would be 18 vs. 12.

It is like you describe in the first case.

The lines are stored in memory from top and backwards.

Edited January 4, 2018 by sometimes99er

+Lee Stewart · January 6, 2018

sometimes99er said:

> It is like you describe in the first case.

> The lines are stored in memory from top and backwards.

Same in XB, but in high RAM for the program with expansion RAM. The variable name is not tokenized (as shown by @sometimes99er). “GORILLA” is 6 characters longer than ‘G’, so each reference to GORILLA adds 6 bytes to the program size, before and after running it. The variable table in VRAM is, of course, only 6 bytes longer because there is only ever that one entry for the variable.

...lee
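Since every reference stores the full name, the program-space saving from a rename is roughly (old length minus new length) times the number of references. Here is a small Python sketch that estimates this from a program listing saved as text. The helper names are hypothetical, the sample listing is invented, and the scan is deliberately crude: it skips quoted strings and REM lines but does not handle XB's `!` tail comments.

```python
import re


def count_references(listing: str, name: str) -> int:
    """Count whole-name occurrences of a variable outside strings and REMs."""
    # A match must not be glued to other name characters on either side,
    # so G does not match inside GORILLA, and GORILLA does not match GORILLA$.
    pattern = re.compile(
        r"(?<![A-Z0-9_@])" + re.escape(name) + r"(?![A-Z0-9_@$])"
    )
    total = 0
    for line in listing.splitlines():
        code = re.sub(r'"[^"]*"', '""', line)             # blank out string literals
        code = re.split(r"\bREM\b", code, maxsplit=1)[0]  # drop REM text
        total += len(pattern.findall(code))
    return total


def bytes_saved(listing: str, old: str, new: str) -> int:
    """Estimated program bytes saved by renaming old to new everywhere."""
    return (len(old) - len(new)) * count_references(listing, old)


# Invented sample listing, for illustration only.
LISTING = """\
100 GORILLA=5
160 GORILLA=GORILLA+1
170 PRINT "GORILLA";GORILLA
180 REM GORILLA IS THE SCORE
300 IF GORILLA>9 THEN 100
"""

print(count_references(LISTING, "GORILLA"))
print(bytes_saved(LISTING, "GORILLA", "G"))
```

The one-time saving in the variable table comes on top of whatever `bytes_saved` reports.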

+TheBF · January 6, 2018

Lee Stewart said:

> Same in XB, but in high RAM for the program with expansion RAM. The variable name is not tokenized (as shown by @sometimes99er). “GORILLA” is 6 characters longer than ‘G’, so each reference to GORILLA adds 6 bytes to the program size, before and after running it. The variable table in VRAM is, of course, only 6 bytes longer because there is only ever that one entry for the variable.

> ...lee

So that would mean the Basic interpreter has to search for each variable string in the list every time it's referenced in the program?

If that's the case then 1 character variables would be preferred.

That is kind of sad.

+OLD CS1 · January 6, 2018

I always wondered why BASIC languages do a string search for variable names. I would think a good way to store variables would be a hash-table of names with a collision flag.
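For anyone who wants to picture the idea, here is a minimal Python sketch of a fixed-size name table with a per-slot collision flag. Everything in it is an assumption made for illustration (the class name, the 64-slot size, the byte-sum hash, linear probing, no deletion); it is not how any BASIC mentioned in this thread stores its variables.

```python
class NameTable:
    """Open-addressing hash table for variable names with a collision flag."""

    def __init__(self, slots: int = 64):
        self.slots = slots
        self.names = [None] * slots
        self.values = [0.0] * slots
        # collided[i] is True once some other name has probed past slot i.
        self.collided = [False] * slots

    def _hash(self, name: str) -> int:
        # Toy hash: sum of the character codes, folded into the table size.
        return sum(name.encode("ascii")) % self.slots

    def set(self, name: str, value: float) -> None:
        index = self._hash(name)
        for _ in range(self.slots):
            if self.names[index] is None or self.names[index] == name:
                self.names[index] = name
                self.values[index] = value
                return
            self.collided[index] = True          # remember that we skipped this slot
            index = (index + 1) % self.slots
        raise MemoryError("variable table full")

    def get(self, name: str) -> float:
        index = self._hash(name)
        for _ in range(self.slots):
            if self.names[index] == name:
                return self.values[index]
            # An empty slot, or an occupied slot nobody ever probed past,
            # means the name cannot be stored further along.
            if self.names[index] is None or not self.collided[index]:
                break
            index = (index + 1) % self.slots
        raise KeyError(name)


table = NameTable()
table.set("AB", 1.0)
table.set("BA", 2.0)   # same byte sum as "AB", so it lands in the next slot
print(table.get("AB"), table.get("BA"))
```

The flag lets a failed lookup stop at the first slot that never overflowed, instead of walking the whole table. The cost is the point RXB raises further down: the table has a fixed footprint whether or not it is full.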

+adamantyr · January 6, 2018

OLD CS1 said:

> I always wondered why BASIC languages do a string search for variable names. I would think a good way to store variables would be a hash-table of names with a collision flag.

Most other BASICs at the time had a pretty limited variable name size; TI was unique in allowing up to 15 characters. From a usability standpoint, that makes it WAY easier to program. But from an efficiency standpoint it's just terrible.

Another thing lacking in TI BASIC is an integer variable type. Several other BASICs had one, using a % symbol to indicate it.

It goes without saying that a good number of us here could write a WAY more efficient and usable BASIC. Or you can use RXB which our awesome rockstar Rich Gilbertson created.

+mizapf · January 6, 2018

Need not say more ...

+OLD CS1 · January 7, 2018

mizapf said:

> Need not say more ...

Word. I found the integer variable type very useful. I used them to quickly pass information to and from ML routines. In my BBS program I used an IRQ routine to monitor one integer variable for a register number to update with the contents of another integer. For instance, set B1% with the value, then set B0% with the register number. This is useful because some registers are found across different memory addresses, and different bits mean different things, so I do not have to track multiple tables.

But I digress...

1980gamer · January 7, 2018

This is interesting!

I try to shorten vars as much as possible. However, when I look at things I did long long ago... They are not always descriptive enough or they are....

Strange? I find things like FUC. WTF? Oh, Fuel Unit Consumption!

With Classic99 and pasting from Notepad, I use longer vars and then find and replace to shorter vars if needed (and a lot more REMs as well).

I never thought about how much memory I could actually save by shortening vars, but it has saved me in the past.

I now try to reuse them as much as possible too.

RXB · January 7, 2018

OLD CS1 said:

> I always wondered why BASIC languages do a string search for variable names. I would think a good way to store variables would be a hash-table of names with a collision flag.

In GPL SEARCH (especially in XB) the same routine is used for many functions.

1. Variable names (string/numeric), subprograms, definitions (DEF)

2. Similar to modules: when you start the TI99/4A, it looks at the HEADER for Powerup/Cartridge/DSR/Subprograms (XB)/Interrupts/TI Basic CALLs.

TI Intern for example:

XML >16 (Search variable name), leads back to GPL
15D6 06A0  BL   @>15E0        Search name
15D8 15E0
15DA 006A  DATA >006A         Return reset condition bit
15DC 0460  B    @>00CE        Return set condition bit
15DE 00CE
15E0 C120  MOV  @>833E,4      Pointer fetch var list
15E2 833E
15E4 1312  JEQ  >160A         No list, end reset condition bit
15E6 D0E0  MOVB @>8359,3      Fetch length byte
15E8 8359
15EA 04C7  CLR  7
15EC 0584  INC  4
15EE D7E0  MOVB @>83E9,*15    Write VDP address
15F0 83E9
15F2 1000  JMP  >15F4
15F4 D7C4  MOVB 4,*15
15F6 020A  LI   10,>8800      VDP read data
15F8 8800
15FA 90DA  CB   *10,3         Compare length of variable
15FC 1308  JEQ  >160E         Right, check name
15FE D19A  MOVB *10,6         Address next variable
1600 1000  JMP  >1602
1602 D81A  MOVB *10,@>83ED
1604 83ED
1606 C106  MOV  6,4           New address in R4
1608 16F1  JNE  >15EC         Go on
160A C2DB  MOV  *11,11        Fetch return
160C 045B  B    *11           Return
160E D19A  MOVB *10,6         Address next variable
1610 1000  JMP  >1612
1612 D81A  MOVB *10,@>83ED
1614 83ED
1616 1000  JMP  >1618
1618 D15A  MOVB *10,5         Address name of variable
161A D803  MOVB 3,@>83EF      Length byte in R7 Lbyte
161C 83EF
161E D09A  MOVB *10,2
1620 D7C2  MOVB 2,*15         Write address VDP
1622 1000  JMP  >1624
1624 D7C5  MOVB 5,*15
1626 0202  LI   2,>834A       FAC
1628 834A
162A 9C9A  CB   *10,*2+       Compare name
162C 16EC  JNE  >1606         Next variable
162E 0607  DEC  7
1630 15FC  JGT  >162A         Until length end
1632 0604  DEC  4
1634 C804  MOV  4,@>834A      Address on FAC shows to value of variables
1636 834A
1638 046B  B    @>0002(11)    Return +2
163A 0002
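Read at a high level, the routine above walks a linked list of variable entries, compares the length byte first, and only compares the name bytes when the lengths match. The following Python sketch is a simplified model of that control flow. The `Entry` class, its fields, and the comparison counter are my own assumptions for illustration; the real entries live in VDP RAM and are read a byte at a time through the VDP ports.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Entry:
    """One variable-table entry in this simplified model."""
    name: bytes
    value: float = 0.0
    next: Optional["Entry"] = None


def find_variable(head: Optional[Entry], name: bytes) -> Tuple[Optional[Entry], int]:
    """Linear search; returns (entry or None, number of byte comparisons)."""
    compares = 0
    entry = head
    while entry is not None:
        compares += 1                      # compare the length byte first
        if len(entry.name) == len(name):
            matched = True
            for stored, wanted in zip(entry.name, name):
                compares += 1              # then compare the name byte by byte
                if stored != wanted:
                    matched = False
                    break
            if matched:
                return entry, compares
        entry = entry.next                 # follow the link to the next variable
    return None, compares


# Example chain G -> V -> GORILLA (the order is chosen arbitrarily here).
chain = Entry(b"G", next=Entry(b"V", next=Entry(b"GORILLA", 5.0)))
found, cost = find_variable(chain, b"GORILLA")
print(found.value, cost)
```

In this model the cost of a lookup grows with both the position of the variable in the list and the length of any names that share its length, which is one way to read the timing differences reported later in the thread.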

Now, a hash table would be more efficient, but you cannot use a hash table for everything, as it takes up more memory and has to be aligned on even boundaries.

Tokenized commands are more effective at easing the strain on limited memory.

I guess there is a trade-off, but I do not fault TI for the chosen method, as I love GPL.

+TheBF · January 7, 2018

You can see the performance difference with this little program.

It's about 30% slower with the really long names.

100 REM  big vars test
110 PRINT "Short variables..."
120 FOR I=1 TO 1000
130 V=V+I
140 NEXT I
150 PRINT "Done!"
160 PRINT "Long Variables..."
170 FOR LONGINDEXNAME=1 TO 1000
180 LONGVARIABLE=LONGVARIABLE+LONGINDEXNAME
190 NEXT LONGINDEXNAME
200 PRINT "Done!"
210 END

Casey · January 7, 2018

OLD CS1 said:

> I always wondered why BASIC languages do a string search for variable names. I would think a good way to store variables would be a hash-table of names with a collision flag.

Atari BASIC actually did tokenize variables, with the limitation being you could only have 128 unique variables in a program. Another approach to the same problem...

+TheBF · January 7, 2018

Casey said:

> Atari BASIC actually did tokenize variables, with the limitation being you could only have 128 unique variables in a program. Another approach to the same problem...

That was my thought on improving it. Use an index number and a type number to I.D. the variable in the final program.

But heck if you go that far you could just record the memory address of the variable in the program like Forth does.

It would be one 16 bit integer.

All water under the bridge at this stage, but interesting to know when writing TI BASIC programs what to avoid.

Opry99er · January 8, 2018

Thanks for all the detailed replies, folks.

I will be doing Search and Replace to drop it like it's hot.

Should save me..... at least 1K for all references to all variables shortened to single-letter variable names with a nice commented table. I only need to do it one time, and then I'm putting this thing out to pasture.

Much obliged

RXB · January 8, 2018

TheBF said:

> You can see the performance difference with this little program.

> It's about 30% slower with the really long names.

> 100 REM  big vars test
> 110 PRINT "Short variables..."
> 120 FOR I=1 TO 1000
> 130 V=V+I
> 140 NEXT I
> 150 PRINT "Done!"
> 160 PRINT "Long Variables..."
> 170 FOR LONGINDEXNAME=1 TO 1000
> 180 LONGVARIABLE=LONGVARIABLE+LONGINDEXNAME
> 190 NEXT LONGINDEXNAME
> 200 PRINT "Done!"
> 210 END

Did you test XB instead? It uses an XML ROM routine that is much better written than the TI Basic one.

sometimes99er · January 8, 2018

Opry99er said:

> I will be doing Search and Replace to drop it like it's hot.

I think the problem disappears if you compile, but I'm not sure. As I understand it, the variable name, whatever its length, is replaced with a 16-bit location pointer.

Edited January 8, 2018 by sometimes99er

+OLD CS1 · January 8, 2018

Casey said:

> Atari BASIC actually did tokenize variables, with the limitation being you could only have 128 unique variables in a program. Another approach to the same problem...

Neat! Was that also a Microsoft BASIC? Also, what up with your avatar... 3.0??

+mizapf · January 8, 2018

3.0 = 99/8

(I hope no one of my students is lurking here.)

Casey · January 8, 2018

OLD CS1 said:

> Neat! Was that also a Microsoft BASIC? Also, what up with your avatar... 3.0??

No, Atari BASIC was very different from Microsoft BASIC. To make it more confusing, there was an Atari Microsoft BASIC, but it worked just like most other Microsoft BASICs. But the BASIC that came with the machine used a token table for variables. This had its own set of issues. The benefit was that the length of the variable name didn't impact the runtime, but you could only have 128 unique variables and sometimes their place in the variable table mattered. I recall seeing some magazine listings instructing people to LIST their programs to tape or disk, NEW, and then ENTER the program back in so that the variables were in the right place in the table.

And yes, mizapf is correct. At one time many years ago I was fortunate enough to have a 99/8 for a while and I took a screen shot from it back then of its title screen, and that's where my avatar came from.

+TheBF · January 8, 2018

RXB said:

> Did you test XB instead? It uses an XML ROM routine that is much better written than the TI Basic one.

I didn't. When I get back to it I will try it with XB. (or be my guest and try it with XB and RXB and let us know what happens)

Thanks for the insight Rich.

+TheBF · January 9, 2018

TheBF said:

> You can see the performance difference with this little program.

> It's about 30% slower with the really long names.

> 100 REM  big vars test
> 110 PRINT "Short variables..."
> 120 FOR I=1 TO 1000
> 130 V=V+I
> 140 NEXT I
> 150 PRINT "Done!"
> 160 PRINT "Long Variables..."
> 170 FOR LONGINDEXNAME=1 TO 1000
> 180 LONGVARIABLE=LONGVARIABLE+LONGINDEXNAME
> 190 NEXT LONGINDEXNAME
> 200 PRINT "Done!"
> 210 END

Results (SECS) on CLASSIC99:

            Short   Long
--------------------------------------------------------------
TI BASIC    8.5     12.4    48% slower
--------------------------------------------------------------
XB          11.5    14.4    25% slower

So XB's lookup is better, but wow! XB sucks in general speed in this test.
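For anyone who wants to redo the percentages, here is a short Python sketch that recomputes them from the times listed above. The function and variable names are just for illustration, and the per-iteration figure assumes the whole difference is spread evenly over the 1000 loop passes. Note that with the rounded times as posted, the TI BASIC figure comes out nearer 46% than 48%.

```python
def percent_slower(short_secs: float, long_secs: float) -> float:
    """How much slower the long-name run is, relative to the short-name run."""
    return (long_secs - short_secs) / short_secs * 100


# Times in seconds as posted (Classic99), keyed by BASIC dialect.
RESULTS = {"TI BASIC": (8.5, 12.4), "XB": (11.5, 14.4)}
LOOPS = 1000  # iterations in each FOR loop of the test program

for dialect, (short_secs, long_secs) in RESULTS.items():
    extra_ms = (long_secs - short_secs) / LOOPS * 1000
    print(
        f"{dialect}: {percent_slower(short_secs, long_secs):.0f}% slower, "
        f"about {extra_ms:.1f} ms extra per iteration"
    )
```

Each pass of the long-name loop touches the long names several times (the FOR index, both sides of the assignment, and the NEXT), so the extra milliseconds per iteration are shared across those lookups.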

RXB · January 9, 2018

The routine is written in Assembly for TI Basic and XB, but the reason it is slow is that both move it into VDP RAM.

XB has it in upper 24K RAM but moves it into VDP....stupid thinking TI.

TI Basic does everything from VDP and makes it worse by making copies in VDP compounding the speed issues.

Speed-wise, the use of VDP is what makes TI Basic and XB slow; you are attacking the wrong issue here.
