
A worse programmers questions


Sid1968


10 hours ago, Sid1968 said:

OK, I "developed" an interesting program with surprising results: this time TI-Basic is the winner (apart from the VIC-20).

 

10 FOR I=1 TO 15000
20 A=I*(2+I)
30 NEXT I
40 PRINT A

 

Result: 225030000

 

                   TI-Basic   Mechatronic Extended Basic   RXB 2015   VIC-20 Basic
----------------------------------------------------------------------------------
Time in seconds:   189        203                          202        91

 

 

The Extended Basic versions are slower. Why? Can someone explain these results on the TI-99/4A to me?

Please double-check these results and add results for other interpreters, compilers, and assembler.

 

Kind Regards

Sid1968

In RXB I have CALL HPUT(row,column,string), which takes the place of PRINT, because PRINT has to scroll the entire screen.

XB has DISPLAY AT(row,col):string, but it does not use the 2 character columns at the left and right edges of the screen, so, like PRINT, it shows only 28 characters per line.

RXB CALL HPUT can display 32 characters per line, and there is also CALL VPUT(row,column,string). These are much better tools than PRINT or DISPLAY AT.

PRINT harks back to the days of mainframes that printed on paper only, which is why it scrolls a line at a time like paper output.

Now, the reason XB's PRINT is slower than TI Basic's is that it has many more optional features than TI Basic's does, and it has to check through that list.

That is why RXB is slower than TI Basic at PRINT.

RXB also has CALL MOVES("$V",32,string-variable,row-by-address), which moves a string of 32 characters from string-variable to a screen address.

As a matter of fact, RXB CALL MOVES can move any type of memory to any type of memory in the TI, e.g. RAM to RAM, RAM to VDP, or GROM to RAM.


Thank you, Richard, for your explanation. In this second BASIC program the TI-99/4A computes for roughly 3 minutes and then prints the result only once. Here Extended Basic was slower than TI-Basic. In the first BASIC program every value of "I" was printed, and there the Extended Basic versions were much faster than TI-Basic. In short, if a BASIC program prints a lot, Extended Basic wins against TI-Basic.

I do not understand the Extended Basic versions' problem with the second BASIC program at all. For most of the 3 minutes the TI-99/4A is only computing additions and multiplications of integers. In my opinion, exactly that should normally run much faster in Extended Basic, with its optimized code (is it assembler?), than in TI-Basic. Why doesn't it? Is it possible that this is simply a kind of bug in the Extended Basic versions, or something that somebody forgot to take care of? Maybe it is possible to fix that in RXB. But first we should find out why this is so.

 

Richard, how would you write these two test programs in RXB (maybe you want to add program code that detects whether 32 KB RAM is available and uses it):

 

Test program 1 (prints a lot; a win for the Extended Basic versions):

10 FOR I=1 TO 1000
20 PRINT I;
30 NEXT I

 

 

Test program 2 (spends all its time adding and multiplying integers and prints the result only once; a win for TI-Basic):

10 FOR I=1 TO 15000
20 A=I*(2+I)
30 NEXT I
40 PRINT A

 

Kind Regards

Sid1968

 

 

PS: I don't want to insult anybody by showing the VIC-20 results.


29 minutes ago, Sid1968 said:

PS: I don't want to insult anybody by showing the VIC-20 results.

The VIC-20 can run a BASIC FOR-NEXT loop and display the result fast, but its graphics look subpar compared to TI graphics.

Can the VIC-20 do this?

 


6 minutes ago, Sid1968 said:

Is that your voice, Richard? That is what I have a TI-99/4A for ;-)

In this thread it is more exciting which language is the fastest on the TI-99/4A.

Yeah, that is me. I also make GPL (Graphics Programming Language) tutorials.

After you click the link, look for:

RXB 2015E 

GPL HOW 2 

Or my Youtube channel:

https://www.youtube.com/channel/UCULwPKqrRFCtNv5_xMuOqQw?view_as=subscriber

 


10 hours ago, Lee Stewart said:

 

The difference, which is not much, likely has to do with the fact that TI Basic ROM routines are located in console ROM and run on the 16-bit bus, whereas ROM routines unique to XB run on the 8-bit bus. The GROMs are also similarly situated, but I do not know whether that makes anywhere near the speed difference the placement of ROMs has.

 

...lee

Nah, all GROMs in the system run at the speed of the slowest GROM on the bus. :)

 


2 hours ago, RXB said:

Everything is in the RXB 2015E package on the Resource page of Atari Age

 

Thank you, Richard. I guess you misunderstood me because of my bad English, so I will try again... ;-)

The BASIC code of the two test programs above is standard BASIC and maybe not optimized for RXB. You wrote about an alternative to the PRINT command, and that code can be put into the 32 KB RAM. I want to ask whether you can optimize the two test programs above to run as well as possible in RXB.

Lee, it would be kind if you would optimize the two test programs for fbForth.

I cannot do this myself, because I have no device to save the programs to, but mainly because I do not know your very interesting languages well enough.

 

Kind Regards

Sid1968


8 hours ago, Tursi said:

Nah, all GROMs in the system run at the speed of the slowest GROM on the bus. :)

 

So, how would you explain the results of the two test programs?

Tursi, would you translate the two BASIC programs into optimized assembler code and share it with us? How long does assembler need to run these programs?


1 hour ago, Sid1968 said:

So, how would you explain the results of the two test programs?

Tursi, would you translate the two BASIC programs into optimized assembler code and share it with us? How long does assembler need to run these programs?

I think likely Lee already covered it, although I'm not an expert on the various BASICs. But the TI BASIC interpreter runs in the console's GPL interpreter, which runs from 16-bit zero wait-state fast ROM. Extended BASIC, however, seems to have its own GPL interpreter, which by necessity runs from the 8 bit cartridge port (this seems to be because the console interpreter has some hard-coded ties to TI BASIC... but again, I'm shaky on my understanding of exactly how XB patches in there).

 

But, if so, code that relies heavily on functions coded in assembly language, such as many of the math functions, will probably run faster in TI BASIC than Extended BASIC.

 

I'm not your guy for porting to assembly, I already stepped out of the benchmarking debates the first time around. But have a look over here for a comparison of various ways of running code: 

 

 


17 minutes ago, Tursi said:

But, if so, code that relies heavily on functions coded in assembly language, such as many of the math functions, will probably run faster in TI BASIC than Extended BASIC.

I guess nobody would say that assembler would solve these problems more slowly than the others. Maybe this is another misunderstanding caused by my bad English. My main concern actually is to compare the code, and maybe RXB can profit from that in the future. At the moment nobody can say exactly what causes the test results. So it could be interesting to see how assembler and fbForth solve it. Perhaps optimized RXB 2015 code would already solve it too? Let's find out with Richard's help.

So I hope I could make my intentions clear: I don't want to disgrace any language, but to help make RXB a little better by showing where it could improve. Don't tar and feather me if I could not express my intentions well enough; most of the day I speak German. ;-)


A remark on my last post:

In my opinion the results of the second test program are caused either by the system architecture of the TI-99/4A or by a weakness in the Extended Basic versions. To find out whether it is the system architecture of the TI-99/4A, it is necessary to try the second BASIC test program in other languages. For that we need the results from assembler (Tursi) and fbForth (Lee). It is also necessary to check whether Extended Basic could do better if the second BASIC test program were rewritten in code optimized for Extended Basic (Richard, because RXB is the only Extended Basic version still in development).

If all languages give more or less long calculation times, it must be a problem of the system architecture of the TI-99/4A. If not, RXB needs an improvement, and the translations of the second BASIC test program to assembler and fbForth could maybe help with that.

 


 

I once made this, as a joke. It's written in 100% assembly, and it's actually scrolling the entire screen. I think the version on the video is waiting for vertical sync, so it's 'only' scrolling the screen 60 times per second. My point is that this test is really about how fast the screen can be scrolled. Counting the variable from 1 to 1000 is insignificant compared to the time it takes to scroll the screen in any language. Printing a number takes longer than counting, but it's only displaying 4 characters (max) compared to moving 768 when it scrolls.  


It is difficult to write an exact duplicate of your calculation loop because all numbers in TI Basic and the XBs are handled as 8-byte (64 bits), radix-100, floating-point numbers—including the index numbers of FOR...NEXT loops. Not so in fbForth, where you must go out of your way to invoke floating point numbers and calculations.
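
To make the radix-100 format a bit more concrete, here is a minimal Python sketch (Python is used only for illustration; this is not code from any of the BASICs). It assumes the commonly described TI-99/4A layout: one exponent byte biased by 64 (>40), followed by seven base-100 digit bytes. It handles positive integers only, and the function name to_radix100 is made up for this example.

```python
def to_radix100(n: int) -> bytes:
    """Encode a positive integer as 8 bytes: biased exponent + 7 base-100 digits.

    Assumption: exponent byte = 0x40 + power of 100 of the leading digit.
    Negative numbers, zero and fractions are deliberately left out of this sketch.
    """
    if n <= 0:
        raise ValueError("this sketch only handles positive integers")
    digits = []
    while n:
        n, d = divmod(n, 100)       # peel off base-100 digits, least significant first
        digits.append(d)
    digits.reverse()                # most significant digit first
    exponent = len(digits) - 1      # power of 100 of the leading digit
    mantissa = (digits + [0] * 7)[:7]   # pad (or truncate) to 7 digit bytes
    return bytes([0x40 + exponent] + mantissa)


# The BASIC result 225030000 is 2.2503 * 100^4 in base 100:
print(to_radix100(225030000).hex(" "))   # 44 02 19 03 00 00 00 00
```

Every value the BASIC loop touches, including the loop index, would be carried around in this 8-byte form, which is why an integer-only Forth loop is not an exact equivalent.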

 

Here are three different fbForth versions of your loop that calculate  I * (I + 2) in the loop. The first simply does the calculations with 16-bit integers and gives the wrong answer because 225000000 overflows 16 bits. It takes 9 seconds:

: SNLOOP  ( -- n )    \ loop with single-cell, 16-bit numbers
   0              \ 0 to stack for first drop in loop
   15000 0 DO     \ loop 15000 times
      DROP        \ drop num on stack
      I 2 +       \ index (I) + 2
      I *         \ I*(I+2)
   LOOP
   CR             \ next line
   .              \ print last loop result
;
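
As a rough illustration of the overflow in the 16-bit version, the following Python sketch computes what is left of the last product when only the low 16 bits are kept. It assumes that the multiplication keeps just the low-order cell and that the result is printed as a signed number; wrap16_signed is a hypothetical helper, not an fbForth word.

```python
def wrap16_signed(value: int) -> int:
    """Keep the low 16 bits and reinterpret them as a signed two's-complement cell."""
    low = value & 0xFFFF
    return low - 0x10000 if low >= 0x8000 else low


last_i = 14999                       # a 15000 0 DO ... LOOP runs I from 0 to 14999
exact = last_i * (last_i + 2)        # the mathematically correct product
print(exact, wrap16_signed(exact))   # 224999999 14911
```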

 

The second calculates a double (32-bit) number for the result using U* , a multiplication operator that multiplies two 16-bit numbers to produce a 32-bit result. U* must be a good bit more efficient (I must take a look at how the two words are coded!) than * because this routine took only 7 seconds:

: DNLOOP  ( -- d )    \ loop with double-cell, 32-bit numbers
   0 0            \ double num 0 for first drop in loop
   15000 0 DO     \ loop 15000 times
      DROP DROP   \ drop double num on stack
      I 2 +       \ index (I) + 2
      I U*        \ I*(I+2)..U* leaves a double num result
   LOOP
   CR             \ next line
   D.             \ print double num from last loop result
;
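
A small Python sketch of why the double-cell version gets the right answer: the last product fits comfortably in 32 bits and can be carried as two 16-bit cells. The helper name split_double is an assumption for illustration, and the sketch makes no claim about the order in which fbForth keeps the two cells on the stack.

```python
def split_double(value: int) -> tuple[int, int]:
    """Split an unsigned 32-bit value into (high cell, low cell)."""
    if not 0 <= value < 2**32:
        raise ValueError("value does not fit in 32 bits")
    return value >> 16, value & 0xFFFF


product = 14999 * 15001              # last product of a loop running I from 0 to 14999
high, low = split_double(product)
print(hex(high), hex(low))           # 0xd69 0x3a3f
```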

 

Finally, here is the floating-point-based routine, which still uses 16-bit integers for the loop processing and the addition within the loop. It takes a whopping 59 seconds:

: FPNLOOP  ( -- f )   \ loop with 4-cell, 64-bit, radix-100 floating point (FP) numbers
   0 0 0 0        \ FP 0 for first drop in loop
   15000 0 DO     \ loop 15000 times
      FDROP       \ drop FP num on stack
      I 2 +       \ index (I) + 2
      S->F        \ convert to FP
      I S->F      \ I to FP
      F*          \ FP multiply I*(I+2)
   LOOP
   CR             \ next line
   F.             \ print FP num from last loop result
;

BTW, the result for the second and third loops is 224999999 because the loops start at 0—a habit of us Forth programmers. Sorry about just realizing that. :-D

 

...lee


31 minutes ago, Sid1968 said:

Thx. In testprogram 2 no scrolling is needed.

 

10 FOR I=1 TO 15000
20 A=I*(2+I)
30 NEXT I
40 PRINT A

 

Can you code that in assembler and test it?

As Lee pointed out, I could code it with 32-bit integers, but it wouldn't be a fair comparison to floating point. And I have no idea how to do this with floating point in assembly. It's something I would never ever use in a game.

 

My guess is that the integer version would take about 2-3 seconds since the forth version took 7. 

 


The probability that Extended BASIC uses the same routines for floating point math as TI BASIC does is high. Other things are most certainly coded in the 8-bit wide ROM in the Extended BASIC module, a ROM which also is bank switched, which takes even more time, when that's needed.

The PME, for example, the virtual processor that runs the p-system, calls the floating point math routines in console ROM for floating point arithmetic. It does not do so for integer math, though.

This is the same approach you would usually take in an assembly-only program, if you need floating point math there. There are two memory locations in scratch pad RAM reserved for two values. They are called FAC (Floating point ACcumulator) and ARG (ARGument). Place two values in them and call FloatADD to get them added together and the result stored in FAC.

There are several other floating point routines available too.

You could write your own floating point routines too, of course. But usually it's better to stay with integers, perhaps with multiple precision, as long as you can, and only resort to floating point numbers when you have a compelling need. Then you should know that the floating point math performed by the TI 99/4A is as good as that on TI's calculators (perhaps it's the same routines), and they are very good. So even if you don't get top speed, you do get very good accuracy.

 

I've posted details about the PME here before. To execute a function in p-code, for which there's a direct replacement in the vocabulary of the TMS 9900, the overhead is seven times. That is, the p-code ADD can be coded in one single machine instruction, A *SP+,*SP. But to figure out what to do, the interpreter executes seven instructions.

For p-codes requiring several machine instructions to be executed, like a subprogram call, the relative overhead is less, since about the same number of instructions is used for decoding, but more useful work is done once the decoding part is finished.

 

Printing on the screen, on the other hand, isn't fast in the p-system. Since it emulates an 80 character wide screen, it has to keep track of both the 24 line 80 column screen memory, and which part of that it should actually show on the 24 line 40 column screen.


Thank you, Lee, for your very accurate tests! fbForth is amazing. Hmm... I am very excited to see Richard's results. Could it be that RXB should be extended with more math routines? I show you here more tests I made. You can see that the difference between TI-Basic and RXB is sometimes up to 40 seconds. To me this does not look like a problem of the system architecture of the TI-99/4A, but like a problem in the XB versions. Maybe the math functions should be rewritten or extended, e.g. one command for integers and another for floating point. And maybe these commands should check whether the 32 KB RAM is available and work faster there.

Here we go... TI-Basic vs. RXB 2015. Only the way "A" is calculated was changed, and all times are in seconds. It starts simple, to see where the problems are. Using brackets leads to a much higher calculation time in TI-Basic, even though it still calculates faster than RXB overall. For that, look at the second test program 4 (A=I+7*I) and test program 5. They have the same result. In the former I use no brackets; in 5 I use brackets, even though they are not needed to get the same result, only to test the bracket effect.

 

Testprogram 3:
---------------
10 FOR I=1 TO 15000
20 A=I*7
30 NEXT I
40 PRINT A

 

Result: 105000
TI-Basic: 122
RXB 2015: 152

 


Testprogram 4:
---------------
10 FOR I=1 TO 15000
20 A=I+7
30 NEXT I
40 PRINT A

 

Result: 15007
TI-Basic: 119
RXB 2015: 149

 

 

Testprogram 4:
---------------
10 FOR I=1 TO 15000
20 A=I+7*I
30 NEXT I
40 PRINT A

 

Result: 120000
TI-Basic: 155
RXB 2015: 193

 

 

Testprogram 5:
---------------
10 FOR I=1 TO 15000
20 A=I+(7*I)
30 NEXT I
40 PRINT A

 

Result: 120000
TI-Basic: 185
RXB 2015: 198

 

 

Testprogram 6:
---------------
10 FOR I=1 TO 15000
20 A=(I+7)*I
30 NEXT I
40 PRINT A

 

Result: 225105000
TI-Basic: 192
RXB 2015: 204
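
For anyone who wants to compare the numbers above more easily, here is a small Python sketch that turns the posted timings into milliseconds per loop iteration and into RXB-minus-TI-Basic differences. The timings are copied from the test programs above; the labels and helper names are only for illustration.

```python
ITERATIONS = 15000

# (label, expression, TI-Basic seconds, RXB 2015 seconds), copied from the post
RUNS = [
    ("3",          "I*7",     122, 152),
    ("4",          "I+7",     119, 149),
    ("4 (second)", "I+7*I",   155, 193),
    ("5",          "I+(7*I)", 185, 198),
    ("6",          "(I+7)*I", 192, 204),
]


def ms_per_iteration(seconds: float) -> float:
    """Average time of one pass through the FOR...NEXT loop, in milliseconds."""
    return seconds * 1000 / ITERATIONS


for label, expr, ti, rxb in RUNS:
    print(f"{label:<11} {expr:<8} TI {ms_per_iteration(ti):5.2f} ms  "
          f"RXB {ms_per_iteration(rxb):5.2f} ms  RXB-TI {rxb - ti:3d} s")

# Extra time caused by the redundant brackets: I+(7*I) versus I+7*I
bracket_cost_ti = 185 - 155     # seconds over 15000 iterations in TI-Basic
bracket_cost_rxb = 198 - 193    # seconds over 15000 iterations in RXB 2015
print(bracket_cost_ti, bracket_cost_rxb)
```

With these figures the gap between the two interpreters is largest for the bracket-free expressions and shrinks once brackets are involved, because the brackets cost TI-Basic far more than they cost RXB.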

 


On 9/22/2019 at 5:05 AM, Lee Stewart said:

 

The difference, which is not much, likely has to do with the fact that TI Basic ROM routines are located in console ROM and run on the 16-bit bus, whereas ROM routines unique to XB run on the 8-bit bus. The GROMs are also similarly situated, but I do not know whether that makes anywhere near the speed difference the placement of ROMs has.

 

...lee

As stated by Tursi, no, this is not true: GPL does not execute at different speeds in a cartridge GROM or a console GROM.

As I have explained in other posts and at other times, the reason some routines in TI Basic are faster than in TI Extended Basic (or RXB or SuperXB or XB2.7) is that Extended Basic has DISPLAY and DISPLAY AT(row,col) added to the routines of PRINT; as more options have to be checked, this slows the XB version of PRINT.

It is like your boat: the more weight you add, the slower you can row it.

Seeing as there is NO DISPLAY AT in TI Basic, it has a hell of a lot shorter list to scan in total, and each command has many fewer options that it needs to look for...

For 20 years I have written GPL for TI Basic and XB (RXB), along with other GPL projects like ET at Sea, where I created the first usable GPL code and released it.

(Thanks to Tursi and some others for refining that code into the much more usable form we see today.)

A perfect example is RND in TI Basic, XB, and RXB. I put the simpler, less complicated RND from TI Basic into RXB.

Extended Basic had this huge, complicated RND generator that used floating point up to 9 digits, so RND in XB was just so freaking slow.

XB suffers from a lot of this, but then it also has sprites and other features that make TI Basic look so primitive and bad.


 


 

 

Can you show me where in the XB ROMs this interpreter is, please?

What I see is routines to deal with an OS ROM that is designed for TI Basic and EA, not XB.

Thus what I see is that the XB ROMs take a different route to get results, and must use different EQUATES to pull this off.

And the proof is what you see in the EA manual for using TI Basic vs. XB.

SOURCE1.txt SOURCE2.txt


2 hours ago, Asmusr said:

As Lee pointed out, I could code it with 32-bit integers, but it wouldn't be a fair comparison to floating point. And I have no idea how to do this with floating point in assembly. It's something I would never ever use in a game.

 

My guess is that the integer version would take about 2-3 seconds since the forth version took 7. 

 

At the time, hardware people and software people were preoccupied with speed and accuracy, if you look at the magazines and articles.

Some, like TI, were fixated on accuracy over speed, as they were selling TI calculators, while others only cared about speed, not accuracy.

Some computers were very accurate with floating point and numbers in general, while others were crap at accuracy but faster as a result.

The results are no mystery if you look at the times and the controversies. It 100% explains all the differences.


7 minutes ago, Sid1968 said:

Thank you, Richard. Is there no way to fix the speed problems in RXB?

I have been working on it for years.

If you use Classic99, try this test:

10 A=RND
20 C=C+1
30 GOTO 10

Run this in TI Basic for 1 hour, then in XB for 1 hour, and then in RXB for 1 hour, and compare the values of C.

You will find XB is the very slowest, TI Basic second, and RXB just eats both for breakfast.

Sadly, you cannot run all three at the same time in Windows 10 anymore, as Windows 10 only runs 1 program in the foreground. Yes, Microsoft SUCKS!
