Sid1968 Posted September 22, 2019

I always thought that the Extended Basic versions were much faster, because some or all of their routines are written in assembler. Can you please run that program in fbForth and give us the program code?

Edited September 22, 2019 by Sid1968
RXB Posted September 22, 2019

10 hours ago, Sid1968 said:

    OK, I "developed" an interesting program with amazing results, because this time TI-Basic is the winner (except against the VIC-20).

        10 FOR I=1 TO 15000
        20 A=I*(2+I)
        30 NEXT I
        40 PRINT A

    Result: 225030000

                         TI-Basic   Mechatronic Extended Basic   RXB 2015   VIC-20 Basic
    Time in seconds:     189        203                          202        91

    The Extended Basic versions are slower. Why? Can someone explain these results on the TI-99/4A? Please double-check them and add results for other interpreters, compilers and assemblers.

    Kind Regards
    Sid1968

In RXB I have CALL HPUT(row,column,string). It takes the place of PRINT, because PRINT has to scroll the entire screen. XB has DISPLAY AT(row,col):string, but it does not use the 2 characters at the left and right edges of the screen, so like PRINT it shows only 28 characters per line. RXB CALL HPUT can display 32 per line, and there is also CALL VPUT(row,column,string). These are much better tools than PRINT or DISPLAY AT.

PRINT harkens back to the days of mainframes that used paper only, which is the reason it scrolls each line like paper output.

Now, the reason PRINT in XB is slower than in TI Basic is that it has many more optional features than TI Basic does, and it has to check off that list. That is why RXB is slower than TI Basic at PRINT.

RXB also has CALL MOVES("$V",32,string-variable,row-by-address), which would move a string of 32 characters from string-variable to a screen address. As a matter of fact, RXB CALL MOVES can move any type of memory to any type of memory in the TI, i.e. RAM to RAM, RAM to VDP, GROM to RAM...
Sid1968 Posted September 22, 2019

Thank you, Richard, for your explanation.

In this second Basic program the TI-99/4A just computes for roughly 3 minutes and then prints the result once. Here Extended Basic was slower than TI-Basic. In the first Basic program every value of "I" was printed, and there the Extended Basic versions were much faster than TI-Basic. In short, if a Basic program prints a lot, Extended Basic wins against TI-Basic.

I do not understand at all why the XBs have problems with the second Basic program. For most of the 3 minutes the TI-99/4A is only adding and multiplying integers. In my opinion exactly that should normally run much faster in Extended Basic with its optimized code (is it assembler?) than in TI-Basic. Why doesn't it? Is it possible that this is simply a kind of bug in the Extended Basic versions, or something that somebody forgot to take care of? Maybe it is possible to fix that in RXB. But first we should find out why this is so.

Richard, how would you write these two test programs in RXB (maybe you want to add program code that detects whether 32 KB RAM is available and uses it)?

Testprogram 1 (prints a lot; victory for the Extended Basic versions):

    10 FOR I=1 TO 1000
    20 PRINT I;
    30 NEXT I

Testprogram 2 (spends all its time adding and multiplying integers and prints the result only once; victory for TI-Basic):

    10 FOR I=1 TO 15000
    20 A=I*(2+I)
    30 NEXT I
    40 PRINT A

Kind Regards
Sid1968

PS: I don't want to insult anybody by showing the VIC-20 results.

Edited September 22, 2019 by Sid1968
RXB Posted September 22, 2019

29 minutes ago, Sid1968 said:

    Thank you, Richard, for your explanation.

    In this second Basic program the TI-99/4A just computes for roughly 3 minutes and then prints the result once. Here Extended Basic was slower than TI-Basic. In the first Basic program every value of "I" was printed, and there the Extended Basic versions were much faster than TI-Basic. In short, if a Basic program prints a lot, Extended Basic wins against TI-Basic.

    I do not understand at all why the XBs have problems with the second Basic program. For most of the 3 minutes the TI-99/4A is only adding and multiplying integers. In my opinion exactly that should normally run much faster in Extended Basic with its optimized code (is it assembler?) than in TI-Basic. Why doesn't it? Is it possible that this is simply a kind of bug in the Extended Basic versions, or something that somebody forgot to take care of? Maybe it is possible to fix that in RXB. But first we should find out why this is so.

    Richard, how would you write these two test programs in RXB (maybe you want to add program code that detects whether 32 KB RAM is available and uses it)?

    Testprogram 1 (prints a lot; victory for the Extended Basic versions):

        10 FOR I=1 TO 1000
        20 PRINT I;
        30 NEXT I

    Testprogram 2 (spends all its time adding and multiplying integers and prints the result only once; victory for TI-Basic):

        10 FOR I=1 TO 15000
        20 A=I*(2+I)
        30 NEXT I
        40 PRINT A

    Kind Regards
    Sid1968

    PS: I don't want to insult anybody by showing the VIC-20 results.

The VIC-20 can run a FOR-NEXT loop in Basic and display the result fast, but its graphics look subpar next to TI graphics.

Can the VIC-20 do this?
Sid1968 Posted September 22, 2019

Is that your voice?

Richard... for that I have a TI-99/4A. In this thread the more exciting question is which language is the fastest on the TI-99/4A.

Kind Regards
Sid1968

Edited September 22, 2019 by Sid1968
RXB Posted September 22, 2019

6 minutes ago, Sid1968 said:

    Is that your voice?

    Richard... for that I have a TI-99/4A. In this thread the more exciting question is which language is the fastest on the TI-99/4A.

Yes, that is me. I also make GPL (Graphics Programming Language) tutorials. Look for them after you click the link.

RXB 2015E GPL HOW 2

Or my YouTube channel:
https://www.youtube.com/channel/UCULwPKqrRFCtNv5_xMuOqQw?view_as=subscriber
Sid1968 Posted September 22, 2019

Just subscribed to your YouTube channel. Can you please give us the RXB Basic code for the two programs above?
Tursi Posted September 22, 2019

10 hours ago, Lee Stewart said:

    The difference, which is not much, likely has to do with the fact that TI Basic ROM routines are located in console ROM and run on the 16-bit bus, whereas ROM routines unique to XB run on the 8-bit bus. The GROMs are also similarly situated, but I do not know whether that makes anywhere near the speed difference the placement of ROMs has.

    ...lee

Nah, all GROMs in the system run at the speed of the slowest GROM on the bus.
RXB Posted September 23, 2019

7 hours ago, Sid1968 said:

    Just subscribed to your YouTube channel. Can you please give us the RXB Basic code for the two programs above?

Everything is in the RXB 2015E package on the Resource page of Atari Age.
Sid1968 Posted September 23, 2019

2 hours ago, RXB said:

    Everything is in the RXB 2015E package on the Resource page of Atari Age.

Thank you, Richard. I guess you misunderstood me because of my bad English, so I will try again. The code of the two test programs above is standard Basic and maybe not optimized for RXB. You wrote about an alternative to the PRINT command, and that code can be put into the 32 KB RAM. I want to ask whether you can optimize the two test programs above so that they run as well as possible in RXB.

Lee, it would be kind if you would optimize the two test programs for fbForth.

I cannot do this myself, because I have no device to save the programs to, but mainly because I do not know your very interesting languages well enough.

Kind Regards
Sid1968

Edited September 23, 2019 by Sid1968
Sid1968 Posted September 23, 2019

8 hours ago, Tursi said:

    Nah, all GROMs in the system run at the speed of the slowest GROM on the bus.

So, how would you explain the results of the two test programs? Tursi, would you translate the two Basic programs into optimized assembler code and share it with us? How long does assembler need to run those programs?

Edited September 23, 2019 by Sid1968
Tursi Posted September 23, 2019

1 hour ago, Sid1968 said:

    So, how would you explain the results of the two test programs? Tursi, would you translate the two Basic programs into optimized assembler code and share it with us? How long does assembler need to run those programs?

I think Lee likely already covered it, although I'm not an expert on the various BASICs. The TI BASIC interpreter runs in the console's GPL interpreter, which runs from 16-bit, zero wait-state fast ROM. Extended BASIC, however, seems to have its own GPL interpreter, which by necessity runs from the 8-bit cartridge port (this seems to be because the console interpreter has some hard-coded ties to TI BASIC, but again, I'm shaky on my understanding of exactly how XB patches in there). If so, code that relies heavily on functions coded in assembly language, such as many of the math functions, will probably run faster in TI BASIC than in Extended BASIC.

I'm not your guy for porting to assembly; I already stepped out of the benchmarking debates the first time around. But have a look over here for a comparison of various ways of running code:
Sid1968 Posted September 23, 2019

17 minutes ago, Tursi said:

    If so, code that relies heavily on functions coded in assembly language, such as many of the math functions, will probably run faster in TI BASIC than in Extended BASIC.

I guess nobody would claim that assembler solves these problems more slowly than the others. Maybe this is another misunderstanding caused by my bad English. My main concern is actually to compare the code, and maybe RXB can profit from that in the future. At the moment nobody can say exactly what causes the test results, so it could be interesting to see how assembler and fbForth handle them. Perhaps optimized RXB 2015 code would already solve it too? Let's find out with Richard's help.

I hope I could make my intentions clear: I don't want to disgrace any language, but to help a little to make RXB better by showing the points where it could improve. Don't tar and feather me if I could not express my intentions well enough; most of the day I speak German.

Edited September 23, 2019 by Sid1968
Sid1968 Posted September 23, 2019

A remark on my last posting:

In my opinion the results of the second test program are caused either by the system architecture of the TI-99/4A or by a weakness in the Extended Basic versions.

To find out whether it is the system architecture of the TI-99/4A, it is necessary to try the second Basic test program in other languages. For that we need the results from assembler (Tursi) and fbForth (Lee).

Furthermore, it is necessary to check whether Extended Basic can do better if the second Basic test program is rewritten in code optimized for Extended Basic (Richard, because RXB is the only Extended Basic version still in development).

If all languages show more or less long calculation times, it must be a problem of the system architecture of the TI-99/4A. If not, RXB needs an improvement, and the translations of the second Basic test program to assembler and fbForth could maybe help with that.

Edited September 23, 2019 by Sid1968
Asmusr Posted September 23, 2019

I once made this, as a joke. It's written in 100% assembly, and it's actually scrolling the entire screen. I think the version in the video is waiting for vertical sync, so it's 'only' scrolling the screen 60 times per second.

My point is that this test is really about how fast the screen can be scrolled. Counting the variable from 1 to 1000 is insignificant compared to the time it takes to scroll the screen in any language. Printing a number takes longer than counting, but it's only displaying 4 characters (max) compared to moving 768 when it scrolls.
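The ratio described above (4 characters printed versus 768 cells touched per scroll) can be sketched in Python. This is only a minimal model under stated assumptions: the screen is treated as a plain 32x24 byte buffer rather than the real VDP screen table, and `scroll_up` and `print_number` are hypothetical helper names, not TI routines.

```python
COLS, ROWS = 32, 24          # 32 x 24 = 768 character cells
BLANK = ord(" ")

def scroll_up(screen: bytearray) -> int:
    """Scroll the buffer up one row in place; return the number of cells written."""
    assert len(screen) == COLS * ROWS
    # Move rows 1..23 up by one row (736 cells) ...
    screen[0:COLS * (ROWS - 1)] = screen[COLS:COLS * ROWS]
    # ... then blank the bottom row (32 cells).
    screen[COLS * (ROWS - 1):] = bytes([BLANK]) * COLS
    return COLS * ROWS

def print_number(screen: bytearray, row: int, col: int, n: int) -> int:
    """Write the decimal digits of n at (row, col); return the number of cells written."""
    text = str(n).encode("ascii")
    start = row * COLS + col
    screen[start:start + len(text)] = text
    return len(text)

screen = bytearray([BLANK]) * (COLS * ROWS)
written = print_number(screen, ROWS - 1, 0, 1000)   # print on the bottom row
moved = scroll_up(screen)                           # then scroll once, as PRINT would
print(f"cells written by the print: {written}, by the scroll: {moved}")
```

On the real machine every one of those cells has to be read from and written back to VDP memory, so the gap in work per PRINT is, if anything, larger than this model suggests.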
Sid1968 Posted September 23, 2019

Thx. In Testprogram 2 no scrolling is needed.

    10 FOR I=1 TO 15000
    20 A=I*(2+I)
    30 NEXT I
    40 PRINT A

Can you code that in assembler and test it?

Edited September 23, 2019 by Sid1968
Lee Stewart Posted September 23, 2019

It is difficult to write an exact duplicate of your calculation loop because all numbers in TI Basic and the XBs are handled as 8-byte (64-bit), radix-100, floating-point numbers, including the index numbers of FOR...NEXT loops. Not so in fbForth, where you must go out of your way to invoke floating point numbers and calculations. Here are three different fbForth versions of your loop that calculate I * (I + 2) in the loop.

The first simply does the calculations with 16-bit integers and gives the wrong answer because 225000000 overflows 16 bits. It takes 9 seconds:

    : SNLOOP ( -- n )   \ loop with single-cell, 16-bit numbers
       0                \ 0 to stack for first drop in loop
       15000 0 DO       \ loop 15000 times
          DROP          \ drop num on stack
          I 2 +         \ index (I) + 2
          I *           \ I*(I+2)
       LOOP
       CR               \ next line
       .                \ print last loop result
    ;

The second calculates a double (32-bit) number for the result using U* , a multiplication operator that multiplies two 16-bit numbers to produce a 32-bit result. U* must be a good bit more efficient (I must take a look at how the two words are coded!) than * because this routine took only 7 seconds:

    : DNLOOP ( -- d )   \ loop with double-cell, 32-bit numbers
       0 0              \ double num 0 for first drop in loop
       15000 0 DO       \ loop 15000 times
          DROP DROP     \ drop double num on stack
          I 2 +         \ index (I) + 2
          I U*          \ I*(I+2)..U* leaves a double num result
       LOOP
       CR               \ next line
       D.               \ print double num from last loop result
    ;

Finally, here is the floating-point-based routine, which still uses 16-bit integers for the loop processing and the addition within the loop. It takes a whopping 59 seconds:

    : FPNLOOP ( -- f )  \ loop with 4-cell, 64-bit, radix-100 floating point (FP) numbers
       0 0 0 0          \ FP 0 for first drop in loop
       15000 0 DO       \ loop 15000 times
          FDROP         \ drop FP num on stack
          I 2 +         \ index (I) + 2
          S->F          \ convert to FP
          I S->F        \ I to FP
          F*            \ FP multiply I*(I+2)
       LOOP
       CR               \ next line
       F.               \ print FP num from last loop result
    ;

BTW, the result for the second and third loops is 224999999 because the loops start at 0, a habit of us Forth programmers. Sorry about just realizing that.

...lee
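Why the single-cell version gives the wrong answer can be sketched in Python. This is a minimal illustration, assuming that a 16-bit multiply simply keeps the low 16 bits of the product and that the result is read as a signed cell; `wrap16_signed` and `wrap32_unsigned` are hypothetical helper names, not fbForth words.

```python
def wrap16_signed(n: int) -> int:
    """Keep the low 16 bits of n and reinterpret them as a signed 16-bit cell."""
    n &= 0xFFFF
    return n - 0x10000 if n & 0x8000 else n

def wrap32_unsigned(n: int) -> int:
    """Keep the low 32 bits of n, as a double-cell result would."""
    return n & 0xFFFFFFFF

last_i = 14999                      # 15000 0 DO ... LOOP ends with I = 14999
exact = last_i * (last_i + 2)       # the mathematically exact product

print("exact:        ", exact)
print("16-bit cell:  ", wrap16_signed(exact))
print("32-bit double:", wrap32_unsigned(exact))
```

Under these assumptions the exact product fits comfortably in 32 bits, which is consistent with the double-cell and floating-point loops agreeing on 224999999, while the 16-bit cell keeps only a remainder of it.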
Asmusr Posted September 23, 2019

31 minutes ago, Sid1968 said:

    Thx. In Testprogram 2 no scrolling is needed.

        10 FOR I=1 TO 15000
        20 A=I*(2+I)
        30 NEXT I
        40 PRINT A

    Can you code that in assembler and test it?

As Lee pointed out, I could code it with 32-bit integers, but it wouldn't be a fair comparison to floating point. And I have no idea how to do this with floating point in assembly. It's something I would never ever use in a game. My guess is that the integer version would take about 2-3 seconds, since the Forth version took 7.
apersson850 Posted September 23, 2019

The probability that Extended BASIC uses the same routines for floating point math as TI BASIC does is high. Other things are most certainly coded in the 8-bit wide ROM in the Extended BASIC module, a ROM which is also bank switched, which takes even more time when that's needed.

The PME, for example, the virtual processor that runs the p-system, calls the floating point math routines in console ROM for floating point arithmetic. It does not do so for math with integers, though.

This is the same approach you would usually take in an assembly-only program, if you need floating point math there. There are two memory locations in scratch pad RAM reserved for two values. They are called FAC (Floating point ACcumulator) and ARG (ARGument). Place two values in them and call FloatADD to get them added together and the result stored in FAC. There are several other floating point routines available too.

You could write your own floating point routines too, of course. But usually it's better to stay with integers, perhaps with multiple precision, as long as you can, and only resort to floating point numbers when you have a compelling need. Then you should know that the floating point math performed by the TI 99/4A is as good as that on TI's calculators (perhaps it's the same routines), and they are very good. So even if you don't get top speed, you do get very good accuracy.

I've posted details about the PME here before. To execute a function in p-code for which there's a direct replacement in the vocabulary of the TMS 9900, the overhead is a factor of seven. That is, the p-code ADD can be coded in one single machine instruction, A *SP+,*SP. But to figure out what to do, the interpreter executes seven instructions. For p-codes requiring several machine instructions to be executed, like a subprogram call, the relative overhead is less, since about the same number of instructions is used for decoding, but more useful work is done once the decoding part is finished.

Printing on the screen, on the other hand, isn't fast in the p-system. Since it emulates an 80-character-wide screen, it has to keep track of both the 24-line, 80-column screen memory and which part of that it should actually show on the 24-line, 40-column screen.

Edited September 23, 2019 by apersson850
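To make the 8-byte, radix-100 values that FAC and ARG hold a little more concrete, here is a minimal Python sketch. It rests on assumptions not spelled out in the thread: one exponent byte biased by 0x40 followed by seven base-100 digits, handling only non-negative integers (sign handling and rounding are not modelled). `to_radix100` and `from_radix100` are hypothetical names, not console routines.

```python
def to_radix100(n: int) -> bytes:
    """Encode a non-negative integer as 8 bytes: exponent byte + 7 base-100 digits."""
    if n < 0:
        raise ValueError("negative values are not modelled in this sketch")
    if n == 0:
        return bytes(8)
    digits = []
    while n:
        n, d = divmod(n, 100)       # peel off base-100 digits, least significant first
        digits.append(d)
    digits.reverse()                # most significant base-100 digit first
    if len(digits) > 7:
        raise ValueError("more than 7 base-100 digits; rounding is not modelled")
    exponent = len(digits) - 1      # value = d0.d1d2... * 100**exponent
    digits += [0] * (7 - len(digits))
    return bytes([0x40 + exponent] + digits)

def from_radix100(raw: bytes) -> int:
    """Decode the output of to_radix100 back to an integer."""
    if raw[0] == 0:
        return 0
    exponent = raw[0] - 0x40
    value = 0
    for d in raw[1:exponent + 2]:   # only the first exponent+1 digits are integral
        value = value * 100 + d
    return value

for n in (1, 15000, 225030000):
    print(n, to_radix100(n).hex(" "))
```

Every addition or multiplication in the BASIC loop has to work digit by digit on values of this shape, which is one plausible reason the floating-point loop is so much slower than the integer ones.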
Sid1968 Posted September 23, 2019

1 hour ago, Lee Stewart said:

    It is difficult to write an exact duplicate of your calculation loop because all numbers in TI Basic and the XBs are handled as 8-byte (64-bit), radix-100, floating-point numbers, including the index numbers of FOR...NEXT loops. Not so in fbForth, where you must go out of your way to invoke floating point numbers and calculations. Here are three different fbForth versions of your loop that calculate I * (I + 2) in the loop.

    The first simply does the calculations with 16-bit integers and gives the wrong answer because 225000000 overflows 16 bits. It takes 9 seconds:

        : SNLOOP ( -- n )   \ loop with single-cell, 16-bit numbers
           0                \ 0 to stack for first drop in loop
           15000 0 DO       \ loop 15000 times
              DROP          \ drop num on stack
              I 2 +         \ index (I) + 2
              I *           \ I*(I+2)
           LOOP
           CR               \ next line
           .                \ print last loop result
        ;

    The second calculates a double (32-bit) number for the result using U* , a multiplication operator that multiplies two 16-bit numbers to produce a 32-bit result. U* must be a good bit more efficient (I must take a look at how the two words are coded!) than * because this routine took only 7 seconds:

        : DNLOOP ( -- d )   \ loop with double-cell, 32-bit numbers
           0 0              \ double num 0 for first drop in loop
           15000 0 DO       \ loop 15000 times
              DROP DROP     \ drop double num on stack
              I 2 +         \ index (I) + 2
              I U*          \ I*(I+2)..U* leaves a double num result
           LOOP
           CR               \ next line
           D.               \ print double num from last loop result
        ;

    Finally, here is the floating-point-based routine, which still uses 16-bit integers for the loop processing and the addition within the loop. It takes a whopping 59 seconds:

        : FPNLOOP ( -- f )  \ loop with 4-cell, 64-bit, radix-100 floating point (FP) numbers
           0 0 0 0          \ FP 0 for first drop in loop
           15000 0 DO       \ loop 15000 times
              FDROP         \ drop FP num on stack
              I 2 +         \ index (I) + 2
              S->F          \ convert to FP
              I S->F        \ I to FP
              F*            \ FP multiply I*(I+2)
           LOOP
           CR               \ next line
           F.               \ print FP num from last loop result
        ;

    BTW, the result for the second and third loops is 224999999 because the loops start at 0, a habit of us Forth programmers. Sorry about just realizing that.

    ...lee

Thank you, Lee, for your very accurate tests! fbForth is amazing.

Hmm... I am very excited to see Richard's results. Could it be that RXB should be expanded with more mathematical routines?

Here are more tests I made. You can see that the difference between TI-Basic and RXB is sometimes up to 40 seconds. To me this does not look like a problem of the system architecture of the TI-99/4A, but like a problem in the XB versions. Maybe the mathematical functions should be rewritten or extended, e.g. one command for integers and another for floating point. And maybe these commands should check whether 32 KB RAM is available so they can work faster there.

Here we go: TI-Basic vs RXB 2015. Only the way "A" is calculated was changed. It starts simple, to see where the problems are. Using brackets leads to a much higher calculation time in TI-Basic, even though it still calculates faster than RXB overall. For that, look at Testprograms 4b and 5. They have the same result. In 4b I use no brackets; in 5 I use brackets, even though they are not needed to get the result of 4b, only to test the bracket effect.

Testprogram 3:
---------------
    10 FOR I=1 TO 15000
    20 A=I*7
    30 NEXT I
    40 PRINT A

Result: 105000
TI-Basic: 122 seconds
RXB 2015: 152 seconds

Testprogram 4a:
---------------
    10 FOR I=1 TO 15000
    20 A=I+7
    30 NEXT I
    40 PRINT A

Result: 15007
TI-Basic: 119 seconds
RXB 2015: 149 seconds

Testprogram 4b:
---------------
    10 FOR I=1 TO 15000
    20 A=I+7*I
    30 NEXT I
    40 PRINT A

Result: 120000
TI-Basic: 155 seconds
RXB 2015: 193 seconds

Testprogram 5:
---------------
    10 FOR I=1 TO 15000
    20 A=I+(7*I)
    30 NEXT I
    40 PRINT A

Result: 120000
TI-Basic: 185 seconds
RXB 2015: 198 seconds

Testprogram 6:
---------------
    10 FOR I=1 TO 15000
    20 A=(I+7)*I
    30 NEXT I
    40 PRINT A

Result: 225105000
TI-Basic: 192 seconds
RXB 2015: 204 seconds

Edited September 23, 2019 by Sid1968
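The printed results and the size of the gap between the two interpreters can be cross-checked with a short Python sketch. It only re-runs the arithmetic and tabulates the timings exactly as posted in the thread; it says nothing about why the interpreters differ. The names `TESTS`, `final_value`, `results` and `gaps` are made up for this illustration.

```python
# (expression, function, TI-Basic seconds, RXB 2015 seconds), as posted in the thread
TESTS = [
    ("A=I*(2+I)", lambda i: i * (2 + i), 189, 202),
    ("A=I*7",     lambda i: i * 7,       122, 152),
    ("A=I+7",     lambda i: i + 7,       119, 149),
    ("A=I+7*I",   lambda i: i + 7 * i,   155, 193),
    ("A=I+(7*I)", lambda i: i + (7 * i), 185, 198),
    ("A=(I+7)*I", lambda i: (i + 7) * i, 192, 204),
]

def final_value(f, last=15000):
    """Run the FOR I=1 TO last loop and return the last value assigned to A."""
    a = None
    for i in range(1, last + 1):
        a = f(i)
    return a

results = {name: final_value(f) for name, f, _, _ in TESTS}
gaps = {name: rxb - ti for name, _, ti, rxb in TESTS}

for name in results:
    print(f"{name:10} A={results[name]:<10} RXB minus TI-Basic: {gaps[name]} s")
```

Laid out this way, the posted numbers show the gap shrinking for the two bracketed expressions, mostly because the TI-Basic times rise while the RXB times barely move.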
RXB Posted September 23, 2019

On 9/22/2019 at 5:05 AM, Lee Stewart said:

    The difference, which is not much, likely has to do with the fact that TI Basic ROM routines are located in console ROM and run on the 16-bit bus, whereas ROM routines unique to XB run on the 8-bit bus. The GROMs are also similarly situated, but I do not know whether that makes anywhere near the speed difference the placement of ROMs has.

    ...lee

As stated by Tursi, no, this is not true: GPL does not execute at different speeds from a cartridge GROM or a console GROM.

As I have explained in other posts and at other times, the reason some routines in TI Basic are faster than in TI Extended Basic (or RXB or SuperXB or XB2.7) is that Extended Basic has DISPLAY and DISPLAY AT(row,col) added to the routines of PRINT, so more options have to be checked, and this slows down the XB version of PRINT.

It is like your boat: the more weight you add, the slower you can row it. Seeing as there is NO DISPLAY AT in TI Basic, it has a much shorter list to scan in total, and each command has many fewer options that it needs to look for.

For 20 years I have written GPL for TI Basic and XB (RXB), along with other GPL projects like ET at Sea, where I created the first usable GPL code and released it. (Thanks to Tursi and some others for refining that code into the much more usable form we see today.)

A perfect example is RND in TI Basic, XB and RXB. I put the simple, less complicated RND from TI Basic into RXB. Extended Basic had a huge, complicated RND generator that used floating point with up to 9 digits, so RND in XB was just so freaking slow.

XB suffers from a lot of this, but then it also has sprites and other features that make TI Basic look primitive by comparison.

Edited September 23, 2019 by RXB
RXB Posted September 23, 2019

Can you please show me where in the XB ROMs this interpreter is?

What I see are routines that deal with an OS ROM designed for TI Basic and EA, not XB. So what I see is that the XB ROMs take a different route to get results and must use different EQUATES to pull this off.

The proof is what you see in the EA manual for using TI Basic vs XB.

SOURCE1.txt
SOURCE2.txt

Edited September 23, 2019 by RXB
RXB Posted September 23, 2019

2 hours ago, Asmusr said:

    As Lee pointed out, I could code it with 32-bit integers, but it wouldn't be a fair comparison to floating point. And I have no idea how to do this with floating point in assembly. It's something I would never ever use in a game. My guess is that the integer version would take about 2-3 seconds, since the Forth version took 7.

If you look at the magazines and articles of the time, hardware people and software people were arguing about speed versus accuracy.

Some, like TI, were fixated on accuracy over speed because they were selling TI calculators, while others only cared about speed, not accuracy.

Some computers were very accurate with floating point and numbers in general, while others were poor at accuracy but faster as a result.

The results are no mystery if you look at the period and its controversies. It 100% explains all the differences.
Sid1968 Posted September 23, 2019

Thank you, Richard. Is there no way to fix the speed problems in RXB?
RXB Posted September 23, 2019

7 minutes ago, Sid1968 said:

    Thank you, Richard. Is there no way to fix the speed problems in RXB?

I have been working on it for years. If you use Classic99, try this test:

    10 A=RND
    20 C=C+1
    30 GOTO 10

Run this in TI Basic for 1 hour, then in XB for 1 hour, and then in RXB for 1 hour.

You will find XB is the very slowest, TI Basic is second, and RXB just eats both for breakfast.

Sadly, you cannot run all three at the same time in Windows 10 anymore, as Windows 10 only runs 1 program in the foreground. Yes, Microsoft SUCKS!