

Variable name/program space relationship


30 replies to this topic

#1 Opry99er

Opry99er

    Quadrunner

  • 9,797 posts
  • Location:Hustisford, WI

Posted Wed Jan 3, 2018 11:43 PM

Just trying to wrap my head around something here and I can't seem to find the answer via the search function (although I know the information exists here somewhere).

 

Say you have an XB program that contains variables, and each of those variables has a name with a certain number of letters in it.

 

 

How does one calculate the number of bytes a variable takes up in program space?

 

 

G=5

 

vs.

 

GORILLA=5

 

 

So here we have two variables, each containing the same number.  Can someone give me the breakdown of the program space used for each of these?

 

Additionally, if G is referenced 10 times in a program, and GORILLA is referenced 10 times in a program, is the initial allocation of memory for that particular variable the biggest usage, or is program space eaten up each time that variable is referenced?

 

IIRC, each variable is tokenized and then references to that initial variable take less space than the initial variable allocation... but I can't find the answer I'm looking for.  

 

 

If I have a variable GORILLA defined in Line 100, but then modify or reference it in lines 160, 170, 180, and 300, am I chewing up unnecessary program space each time it is referenced (vs if it was called G) or do the following references only hit the program space as a token?

 

 

(Trying to weigh readability against optimization in a large program I need to downsize.)

 

Please give me a hand.  Thanks in advance!

 

 

 

 

 

 

 



#2 adamantyr

adamantyr

    Stargunner

  • 1,358 posts

Posted Wed Jan 3, 2018 11:48 PM

Generally shorter variable names are better, 1 to 2 characters at most. For one thing, long variable names mean fewer instructions per line in Extended BASIC.

#3 OLD CS1

OLD CS1

    Technomancer

  • 5,585 posts
  • Technology Samurai
  • Location:Tallahassee, FL

Posted Thu Jan 4, 2018 12:03 AM

For large programs, I eventually adopted a programming method for variables I picked up while programming on the Commodore 64 in parallel: two-character variables.  The shorter the variable names the fewer characters get stored in the BASIC space.

 

I also started using more arrays.  Now, I am not certain if using arrays is any quicker as a function of name search versus index look-up, but it seemed cleaner to me.



#4 Opry99er

Posted Thu Jan 4, 2018 12:43 AM

Thanks very much, guys!

 

Yes... there is no doubt that keeping variable names short and sweet is preferable.  On arrays, those can definitely be useful when your variables have commonality (for me, conceptually).  

 

The primary thing I'm trying to decide here is whether going through and doing a variable re-write will achieve my goal, space-wise.  In the above example of "GORILLA":

 

If the variable GORILLA takes up 7 bytes plus a token (just grasping at straws here), then that variable is taking up 8 bytes in the name alone.  That would mean that if I defined it once and referenced or updated it 10 times throughout the code, the name alone would take up a total of 78 bytes (8 for the definition plus 7 for each of the 10 references).

 

With the example of "G" instead, that would be 12 bytes.  (1 byte plus token to start, 1 byte per reference).

 

78 vs 12 is significant.

 

If, however, each subsequent reference to that variable only takes up 1 byte (once interpreted), then the gain is not so significant.  It would be 18 vs. 12.

 

 

So, now that I've talked my way through that, I guess the question is whether the bytes used up in the physical BASIC code are the PROGRAM space, or is the interpreted GPL "object code" (for lack of a better term) what makes up actual PROGRAM space?

 

 

Due to the fact that you must RUN your program prior to getting an accurate reading when using the SIZE command, my inclination is to believe that it is either the interpreted code, or it is a combination of the two.
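
The two guesses above (78 vs. 12 and 18 vs. 12) can be written out as a small Python sketch. It only reproduces the arithmetic in this post: the one-byte "token" on the definition and the function names are assumptions made for illustration, not a description of how XB actually lays out memory.

```python
def name_bytes_full(name: str, references: int, token_bytes: int = 1) -> int:
    """Guess A: every reference stores the whole name again."""
    definition = len(name) + token_bytes   # name plus the assumed token
    return definition + references * len(name)


def name_bytes_tokenized(name: str, references: int, token_bytes: int = 1) -> int:
    """Guess B: only the definition stores the name; each reference costs 1 byte."""
    definition = len(name) + token_bytes
    return definition + references * 1


for name in ("GORILLA", "G"):
    print(name, name_bytes_full(name, 10), name_bytes_tokenized(name, 10))
```

Under guess A the gap between the two names grows with every reference; under guess B it stays fixed at the 6 extra bytes in the definition.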



#5 sometimes99er

sometimes99er

    River Patroller

  • 4,145 posts

Posted Thu Jan 4, 2018 12:53 AM

78 vs 12 is significant.

 

If, however, each subsequent reference to that variable only takes up 1 byte (once interpreted), then the gain is not so significant.  It would be 18 vs. 12.

 

It is like you describe in the first case.
 

gorilla.png
 
The lines are stored in memory starting from the top and working backwards.


Edited by sometimes99er, Thu Jan 4, 2018 12:57 AM.


#6 Lee Stewart

Lee Stewart

    River Patroller

  • 3,772 posts
  • Location:Silver Run, Maryland

Posted Sat Jan 6, 2018 10:52 AM

 

It is like you describe in the first case.
 

gorilla.png
 
The lines are stored in memory starting from the top and working backwards.

 

Same in XB, but in high RAM for the program with expansion RAM.  The variable name is not tokenized (as shown by sometimes99er).  “GORILLA” is 6 characters longer than ‘G’, so each reference to GORILLA adds 6 bytes to the program size, before and after running it.  The variable table in VRAM is, of course, only 6 bytes longer because there is only ever that one entry for the variable.
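
As a rough way to estimate what a rename buys, here is a minimal Python sketch of that rule: each occurrence of the name in the program costs the difference in length, and the variable table costs that difference once. The function name and the count of 11 occurrences (one definition plus ten references, as in the earlier post) are assumptions for illustration.

```python
def bytes_saved(old_name: str, new_name: str, occurrences: int) -> tuple[int, int]:
    """Return (program bytes saved, variable-table bytes saved) for one rename."""
    per_occurrence = len(old_name) - len(new_name)
    program_saving = per_occurrence * occurrences  # the name is stored at every occurrence
    table_saving = per_occurrence                  # only one table entry per variable
    return program_saving, table_saving


# GORILLA -> G, appearing 11 times in the program text
print(bytes_saved("GORILLA", "G", 11))  # (66, 6)
```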

 

...lee



#7 TheBF

TheBF

    Dragonstomper

  • 761 posts
  • Location:The Great White North

Posted Sat Jan 6, 2018 12:55 PM

 

Same in XB, but in high RAM for the program with expansion RAM.  The variable name is not tokenized (as shown by sometimes99er).  “GORILLA” is 6 characters longer than ‘G’, so each reference to GORILLA adds 6 bytes to the program size, before and after running it.  The variable table in VRAM is, of course, only 6 bytes longer because there is only ever that one entry for the variable.

 

...lee

 

So that would mean the Basic interpreter has to search for each variable string in the list every time it's referenced in the program?

If that's the case then 1 character variables would be preferred. 

 

 

 

That is kind of sad.



#8 OLD CS1

Posted Sat Jan 6, 2018 1:31 PM

I always wondered why BASIC languages do a string search for variable names.  I would think a good way to store variables would be a hash-table of names with a collision flag.



#9 adamantyr

Posted Sat Jan 6, 2018 1:57 PM

I always wondered why BASIC languages do a string search for variable names.  I would think a good way to store variables would be a hash-table of names with a collision flag.

 

Most other BASICs at the time had pretty limited variable name sizes; TI was unique in that regard, allowing up to 15 characters. From a usability standpoint, that makes it WAY easier to program. But from an efficiency standpoint it's just terrible.

 

Another thing lacking in TI BASIC is an integer variable type. Several other BASICs had one, using a % symbol to indicate it.

 

It goes without saying that a good number of us here could write a WAY more efficient and usable BASIC. Or you can use RXB which our awesome rockstar Rich Gilbertson created. ;)



#10 mizapf

mizapf

    River Patroller

  • 3,383 posts
  • Location:Germany

Posted Sat Jan 6, 2018 2:08 PM

Need not say more ...

 

 

Attached Files



#11 OLD CS1

Posted Sat Jan 6, 2018 9:57 PM

Need not say more ...

 

 

 

Word.  I found the integer variable type very useful.  I used them to quickly pass information to and from ML routines.  In my BBS program I used an IRQ routine to monitor one integer variable for a register number to update with the contents of another integer.  For instance, set B1% with the value, then set B0% with the register number.  This is useful because some registers are found across different memory addresses, and different bits mean different things, so I do not have to track multiple tables.

 

But I digress...



#12 1980gamer

1980gamer

    Dragonstomper

  • 955 posts
  • Location:Charlton, MA

Posted Sat Jan 6, 2018 11:10 PM

This is interesting!

 

I try to shorten vars as much as possible.   However, when I look at things I did long long ago...  They are not always descriptive enough or they are....

Strange?   I find things like FUC.   WTF?   Oh,  Fuel Unit Consumption!  

 

With Classic99 and pasting from Notepad, I use longer vars and then find and replace them with shorter vars if needed (and a lot more REMs as well).

 

I never thought about how much memory I could actually save by shortening vars, but it has saved me in the past.

 

I now try to reuse them as much as possible too.



#13 RXB

RXB

    River Patroller

  • 3,322 posts
  • Location:Vancouver, Washington, USA

Posted Sun Jan 7, 2018 12:35 AM

I always wondered why BASIC languages do a string search for variable names.  I would think a good way to store variables would be a hash-table of names with a collision flag.

In GPL SEARCH (especially in XB) the same routine is used for many functions.

1. Variable names string/numeric/Subprograms/Definitions(DEF)

2. Similar to modules: when you start the TI99/4A it looks at the HEADER for Powerup/Cartridge/DSR/Subprograms (XB)/Interrupts/TI BASIC CALLs.

 

TI Intern for example:

XML >16 (Search variable name), leads back to GPL
15D6 06A0 BL @>15E0 Search name
15D8 15E0
15DA 006A DATA >006A Return reset condition bit
15DC 0460 B @>00CE Return set condition bit
15DE 00CE
15E0 C120 MOV @>833E,4 Pointer fetch var list
15E2 833E
15E4 1312 JEQ >160A No list, end reset condition bit
15E6 D0E0 MOVB @>8359,3 Fetch length byte
15E8 8359
15EA 04C7 CLR 7
15EC 0584 INC 4
15EE D7E0 MOVB @>83E9,*15 Write VDP address
15F0 83E9
15F2 1000 JMP >15F4
15F4 D7C4 MOVB 4,*15
15F6 020A LI 10,>8800 VDP read data
15F8 8800
15FA 90DA CB *10,3 Compare length of variable
15FC 1308 JEQ >160E Right, check name
15FE D19A MOVB *10,6 Address next variable
1600 1000 JMP >1602
1602 D81A MOVB *10,@>83ED
1604 83ED
1606 C106 MOV 6,4 New address in R4
1608 16F1 JNE >15EC Go on
160A C2DB MOV *11,11 Fetch return
160C 045B B *11 Return
160E D19A MOVB *10,6 Address next variable
1610 1000 JMP >1612
1612 D81A MOVB *10,@>83ED
1614 83ED
1616 1000 JMP >1618
1618 D15A MOVB *10,5 Address name of variable
161A D803 MOVB 3,@>83EF Length byte in R7 Lbyte
161C 83EF
161E D09A MOVB *10,2
1620 D7C2 MOVB 2,*15 Write address VDP
1622 1000 JMP >1624
1624 D7C5 MOVB 5,*15
1626 0202 LI 2,>834A FAC
1628 834A
162A 9C9A CB *10,*2+ Compare name
162C 16EC JNE >1606 Next variable
162E 0607 DEC 7
1630 15FC JGT >162A Until length end
1632 0604 DEC 4
1634 C804 MOV 4,@>834A Address on FAC shows to value of variables
1636 834A
1638 046B B @>0002(11) Return +2
163A 0002

Now a hash table would be more efficient, but you cannot use a hash table for everything, as they take up more memory and have to sit on even boundaries.

Tokenized commands are more effective at easing the strain on limited memory.

 

I guess there is a trade off but I do not fault TI for the chosen method as I love GPL.
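
For anyone who does not read 9900 assembly, here is a rough Python model of what the listing above appears to do: walk the variable list in order, reject an entry cheaply when the length byte differs, and only then compare the name byte by byte. It is a sketch under assumptions, not a translation: the Python list stands in for the linked list in VDP RAM, and the function name, sample variables and counters are made up for illustration. The `dict` at the end is the hash-table alternative raised earlier in the thread.

```python
def linear_lookup(var_list, name):
    """Return (value, length_checks, byte_compares) for a name searched in order."""
    length_checks = 0
    byte_compares = 0
    for stored_name, value in var_list:
        length_checks += 1
        if len(stored_name) != len(name):
            continue  # cheap reject on the length byte
        matched = True
        for stored_char, wanted_char in zip(stored_name, name):
            byte_compares += 1
            if stored_char != wanted_char:
                matched = False
                break  # mismatch: move on to the next variable
        if matched:
            return value, length_checks, byte_compares
    return None, length_checks, byte_compares


variables = [("I", 1), ("V", 2), ("LONGVARIABLE", 3), ("LONGINDEXNAME", 4)]
print(linear_lookup(variables, "V"))
print(linear_lookup(variables, "LONGINDEXNAME"))

# Hash-table alternative: one probe on average, at the cost of extra memory.
table = dict(variables)
print(table["LONGINDEXNAME"])
```

In this model the cost of a lookup depends on where the variable sits in the list and on how many bytes of the name have to be compared, which is the trade-off being discussed.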



#14 TheBF

Posted Sun Jan 7, 2018 11:04 AM

You can see the performance difference with this little program.

It's about 30% slower with the really long names.

100 REM  big vars test
110 PRINT "Short variables..."
120 FOR I=1 TO 1000
130 V=V+I
140 NEXT I
150 PRINT "Done!"
160 PRINT "Long Variables..."
170 FOR LONGINDEXNAME=1 TO 1000
180 LONGVARIABLE=LONGVARIABLE+LONGINDEXNAME
190 NEXT LONGINDEXNAME
200 PRINT "Done!"
210 END



#15 Casey

Casey

    Moonsweeper

  • 296 posts

Posted Sun Jan 7, 2018 11:43 AM

I always wondered why BASIC languages do a string search for variable names.  I would think a good way to store variables would be a hash-table of names with a collision flag.

 

Atari BASIC actually did tokenize variables, with the limitation being you could only have 128 unique variables in a program.  Another approach to the same problem...
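
A minimal Python sketch of that idea, assuming nothing about Atari BASIC's real table layout: each unique name is stored once and every reference in the program becomes a small index, so the length of the name no longer matters at the point of use. The class and method names are hypothetical; only the 128-variable cap comes from the description above.

```python
class VariableTable:
    """Toy variable-name table: one entry per unique name, referenced by index."""

    MAX_VARIABLES = 128  # cap mentioned in the post

    def __init__(self):
        self._index_by_name = {}
        self._names = []

    def token_for(self, name: str) -> int:
        """Return the index for a name, adding it to the table on first use."""
        if name in self._index_by_name:
            return self._index_by_name[name]
        if len(self._names) >= self.MAX_VARIABLES:
            raise OverflowError("variable table full")
        index = len(self._names)
        self._index_by_name[name] = index
        self._names.append(name)
        return index

    def name_for(self, token: int) -> str:
        """Recover the name for listing the program."""
        return self._names[token]


table = VariableTable()
tokens = [table.token_for(n) for n in ("GORILLA", "G", "GORILLA")]
print(tokens)  # [0, 1, 0]
```

The order in which names are first seen fixes their place in the table, which is one way a table like this can end up depending on entry order.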



#16 TheBF

Posted Sun Jan 7, 2018 2:19 PM

 

Atari BASIC actually did tokenize variables, with the limitation being you could only have 128 unique variables in a program.  Another approach to the same problem...

 

That was my thought on improving it. Use an index number and a type number to I.D. the variable in the final program.

 

But heck if you go that far you could just record the memory address of the variable in the program like Forth does.

It would be one 16 bit integer.

 

All water under the bridge at this stage, but interesting to know when writing TI BASIC programs what to avoid.



#17 Opry99er

Posted Sun Jan 7, 2018 8:03 PM

Thanks for all the detailed replies, folks.  

 

I will be doing Search and Replace to drop it like it's hot.

 

Should save me..... at least 1K for all references to all variables shortened to single-letter variable names with a nice commented table.  I only need to do it one time, and then I'm putting this thing out to pasture.

 

 

Much obliged



#18 RXB

Posted Sun Jan 7, 2018 8:49 PM

 

You can see the performance difference with this little program.

It's about 30% slower with the really long names.

100 REM  big vars test
110 PRINT "Short variables..."
120 FOR I=1 TO 1000
130 V=V+I
140 NEXT I
150 PRINT "Done!"
160 PRINT "Long Variables..."
170 FOR LONGINDEXNAME=1 TO 1000
180 LONGVARIABLE=LONGVARIABLE+LONGINDEXNAME
190 NEXT LONGINDEXNAME
200 PRINT "Done!"
210 END

Did you test XB instead? It uses an XML ROM routine that is much better written than the TI BASIC one.



#19 sometimes99er

Posted Mon Jan 8, 2018 1:52 AM

I will be doing Search and Replace to drop it like it's hot.

 

I think the problem disappears if you compile, but I'm not sure. That is, the variable name, whatever its length, is replaced with a 16-bit location pointer. ;)


Edited by sometimes99er, Mon Jan 8, 2018 1:55 AM.


#20 OLD CS1

Posted Mon Jan 8, 2018 2:41 AM

 

Atari BASIC actually did tokenize variables, with the limitation being you could only have 128 unique variables in a program.  Another approach to the same problem...

 

Neat!  Was that also a Microsoft BASIC?  Also, what up with your avatar... 3.0??



#21 mizapf

Posted Mon Jan 8, 2018 3:48 AM

3.0 = 99/8
(I hope none of my students are lurking here.)

#22 Casey

Posted Mon Jan 8, 2018 6:42 AM

 

Neat!  Was that also a Microsoft BASIC?  Also, what up with your avatar... 3.0??

No, Atari BASIC was very different from Microsoft BASIC.  To make it more confusing, there was an Atari Microsoft BASIC, but it worked just like most other Microsoft BASICs.  But the BASIC that came with the machine used a token table for variables.  This had its own set of issues.  The benefit was that the length of the variable name didn't impact the runtime, but you could only have 128 unique variables and sometimes their place in the variable table mattered.  I recall seeing some magazine listings instructing people to LIST their programs to tape or disk, NEW, and then ENTER the program back in so that the variables were in the right place in the table.  

 

And yes, mizapf is correct.  At one time many years ago I was fortunate enough to have a 99/8 for a while and I took a screen shot from it back then of its title screen, and that's where my avatar came from.



#23 TheBF

Posted Mon Jan 8, 2018 8:12 AM

Did you test XB instead? It uses an XML ROM routine that is much better written than the TI BASIC one.

 

I didn't.  When I get back to it I will try it with XB.  (or be my guest and try it with XB and RXB and let us know what happens)

 

Thanks for the insight Rich.



#24 TheBF

Posted Mon Jan 8, 2018 9:25 PM

 

You can see the performance difference with this little program.

It's about 30% slower with the really long names.

100 REM  big vars test
110 PRINT "Short variables..."
120 FOR I=1 TO 1000
130 V=V+I
140 NEXT I
150 PRINT "Done!"
160 PRINT "Long Variables..."
170 FOR LONGINDEXNAME=1 TO 1000
180 LONGVARIABLE=LONGVARIABLE+LONGINDEXNAME
190 NEXT LONGINDEXNAME
200 PRINT "Done!"
210 END

 

Results (secs) on Classic99:

            Short    Long
TI BASIC     8.5     12.4    46% slower
XB          11.5     14.4    25% slower

 

So XB look up is better but wow!  XB sucks in general speed in this test.
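
For anyone repeating the test, the slowdown can be worked out from the raw seconds with a few lines of Python. This is just the arithmetic, using the timings as posted above; the function name is made up for illustration. Computed this way the posted seconds come out at roughly 46% for TI BASIC and 25% for XB.

```python
def percent_slower(short_secs: float, long_secs: float) -> float:
    """How much longer the long-name loop took, relative to the short-name loop."""
    return (long_secs - short_secs) / short_secs * 100.0


timings = {"TI BASIC": (8.5, 12.4), "XB": (11.5, 14.4)}
for dialect, (short_secs, long_secs) in timings.items():
    print(f"{dialect}: {percent_slower(short_secs, long_secs):.0f}% slower")
```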



#25 RXB

Posted Tue Jan 9, 2018 11:45 AM

The routine is written in Assembly for TI Basic and XB, but the reason it is slow is that both move it into VDP RAM.

XB has it in upper 24K RAM but moves it into VDP....stupid thinking TI.

TI Basic does everything from VDP and makes it worse by making copies in VDP compounding the speed issues.

 

Speed-wise, the issue is that using VDP is what makes TI BASIC and XB slow; you are attacking the wrong issue here.





