A worse programmers questions

Sid1968 · September 24, 2019

54 minutes ago, RXB said:

Look 80% of RXB is exactly the same as XB, I just added some routines and some work slightly faster and most are exactly the same speed.

I do not get this confusion of why XB and RXB are somehow magically different when over 80% is exactly the same?

All Extended BASIC versions have speed issues. I only mention RXB because it is still the only XB version in development, so we have a chance to improve it. So this thread is really about improving RXB. You yourself suggested rewriting the ROMs. Rich, don't take constructive criticism personally. And Stargunner, this thread would benefit if your criticism didn't get personal.

Let's calm down and stay friends.

Do you people remember this? Make peace, not war.

Edited September 24, 2019 by Sid1968

Asmusr · September 24, 2019

9 minutes ago, Sid1968 said:

Do you people remember this? Make peace, not war.

But it is a war. If not a 100 years war then at least a 30 years war. The war of the BASIC dialects. These discussions go on over and over again on this forum, and can be quite entertaining to follow, but no conclusion or agreement is ever reached. ;-)

Sid1968 · September 24, 2019

4 minutes ago, Asmusr said:

But it is a war. If not a 100 years war then at least a 30 years war. The war of the BASIC dialects. These discussions go on over and over again on this forum, and can be quite entertaining to follow, but no conclusion or agreement is ever reached.

Here we are a community. We discuss but do not fight against each other. Stay relaxed, buddy. This is our hobby... we all want to have fun. It would be a pleasure if your posts were more constructive.

Edited September 24, 2019 by Sid1968

apersson850 · September 24, 2019

I tried Pascal.

The code which BASIC requires around 200 seconds to execute, but Forth does in 9 seconds, executes in 27.878 seconds in Pascal. But then it has the same issue as the integers in Forth: The result is limited to 16 bits, and thus incorrect.

Pascal doesn't have any double precision integers. It does have long integers, though, with up to 36 digits. I've rarely used them, but they are intended for things like calculations with money, where you can't allow round-off errors. Since they offer variable length (declare what you need) and prioritize precision, they seem slow as molasses.

var
  i: integer;
  long, a: integer[10];

begin
  for i := 1 to 15000 do
  begin
    long := i;
    a := long+(2*long);
  end;
  writeln(a);
end.

That code ran in 8 minutes, 56.273 seconds. So it's obvious that the priority for long integers was precision, not speed. They are probably stored one digit per byte, and thus even less efficient than the radix 100 format used for floating point numbers in the 99/4A.
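For readers who have not met the radix 100 format mentioned above, here is a minimal Python sketch of how a non-negative integer could be packed into the 8-byte layout: one excess-64 exponent byte (a power of 100) followed by seven base-100 mantissa digits. The function name `to_radix100` is made up for this illustration, and negative numbers and fractions are deliberately left out.

```python
def to_radix100(n: int) -> bytes:
    """Pack a non-negative integer into an 8-byte radix-100 value (illustration only)."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if n == 0:
        return bytes(8)  # zero is stored as all zero bytes in this sketch
    digits = []
    while n:
        n, d = divmod(n, 100)  # peel off one base-100 digit at a time
        digits.append(d)
    digits.reverse()  # most significant base-100 digit first
    exponent = 64 + len(digits) - 1  # excess-64 power of 100
    mantissa = (digits + [0] * 7)[:7]  # pad or truncate to seven digits
    return bytes([exponent] + mantissa)


print(to_radix100(1).hex())
print(to_radix100(225030000).hex())
```

Each mantissa byte holds two decimal digits, which is why this layout is denser than a hypothetical one-digit-per-byte long integer.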

But using either real (same as the floating point numbers used by BASIC) or long integers suffers from the need to do not only the math, but also the conversion from the integer loop variable to a real, or long integer, before the math can be done.

This particular task simply suits Forth better. If I had to implement this in a real Pascal program, and speed was an issue, I'd write a small assembly routine to handle this particular part of the task.

I noticed in a previous post that Forth runs faster with a 32 bit result than with a 16 bit result. I presume that's because the 32 bit result is unsigned? The TMS 9900 does provide the MPY instruction, which will do an unsigned multiply of two 16 bit integers, to render a 32 bit result. But the normal * operation in Forth is probably signed, which requires a few more instructions for the TMS 9900. The TMS 9995 does have a signed multiply, but that doesn't help the 99/4A.
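To make the 16-bit limit concrete, here is a small Python sketch that mimics an unsigned 16 x 16 multiply with a 32-bit result split over two words, the way MPY leaves it in a register pair. It uses the thread's benchmark expression I*(2+I) for the last loop pass, I = 15000. The helper names `mpy` and `as_signed16` are invented for this example.

```python
MASK16 = 0xFFFF


def mpy(a: int, b: int) -> tuple[int, int]:
    """Unsigned 16 x 16 -> 32-bit multiply, returned as (high word, low word)."""
    product = (a & MASK16) * (b & MASK16)
    return product >> 16, product & MASK16


def as_signed16(word: int) -> int:
    """Reinterpret a 16-bit word as a two's-complement signed integer."""
    return word - 0x10000 if word & 0x8000 else word


i = 15000
high, low = mpy(i, i + 2)
print(high, low)            # the two result words
print((high << 16) | low)   # the full 32-bit result: 225030000
print(as_signed16(low))     # what a signed 16-bit result keeps: -20624
```

The correct answer needs both words. Keeping only the low word is what makes the 16-bit Forth and Pascal results wrong.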

Edited September 24, 2019 by apersson850

Sid1968 · September 24, 2019

12 minutes ago, apersson850 said:
I tried Pascal.

The code which BASIC requires around 200 seconds to execute, but Forth does in 9 seconds, executes in 27.878 seconds in Pascal. But then it has the same issue as the integers in Forth: The result is limited to 16 bits, and thus incorrect.

Pascal doesn't have any double precision integers. It does have long integers, though, with up to 36 digits. I've rarely used them, but they are intended for things like calculations with money, where you can't allow round-off errors. Since they offer variable length (declare what you need) and prioritize precision, they seem slow as molasses.
var
  i: integer;
  long, a: integer[10];

begin
  for i := 1 to 15000 do
  begin
    long := i;
    a := long+(2*long);
  end;
  writeln(a);
end.
That code ran in 8 minutes, 56.273 seconds. So it's obvious that the priority for long integers was precision, not speed. They are probably stored one digit per byte, and thus even less efficient than the radix 100 format used for floating point numbers in the 99/4A.

But using either real (same as the floating point numbers used by BASIC) or long integers suffer from the need to not only do the math, but also the conversion from the integer loop variable to a real, or long integer, before the math can be done.

This particular task simply suits Forth better. If I would have to implement this in a real Pascal program, and speed was an issue, I'd do a small assembly routine to handle this particular part of the task.

I noticed in a previous post that Forth runs faster with a 32 bit result than with a 16 bit result. I presume that's because the 32 bit result is unsigned? The TMS 9900 does provide the MPY instruction, which will do an unsigned multiply of two 16 bit integers, to render a 32 bit result. But the normal * operation in Forth is probably signed, which requires a few more instructions for the TMS 9900. The TMS 9995 does have a signed multiply, but that doesn't help the 99/4A.

Thank you! The results for all the languages tested show that the calculation time depends not on the system architecture of the TI-99/4A but on the software. Good news: an improvement of XB (RXB) should be possible!

apersson850 · September 24, 2019

I haven't keyed it in to run it, but this assembly code would do the same thing, if I'm not mistaken.

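       LI    R0,15000
       CLR   R1
LOOP   INC   R1          10 cycles
       MOV   R1,R2       14 cycles
       INCT  R2          10 cycles
       MPY   R1,R2       52 cycles
       DEC   R0          10 cycles
       JNE   LOOP        10 cycles
* PRINT R2 and R3 (pseudocode: output the 32-bit result held in R2 and R3)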

I've included the required clock cycles for each instruction. They assume the program is running in 16 bit wide RAM, no wait states.

One loop is 106 cycles, 15000 loops then 1590000 cycles, which is equivalent to 0.53 seconds. A few more cycles would be needed to execute the start of the loop and the final print of the result, but I didn't care about them. The print procedure is more complex than the loop, but runs only once.
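The arithmetic above can be checked with a few lines of Python. The cycle counts are the ones given in the listing. The 3 MHz clock is an assumption about the console's TMS 9900 and is not stated in the post.

```python
# Cycle counts per instruction, as given in the listing above.
CYCLES = {"INC": 10, "MOV": 14, "INCT": 10, "MPY": 52, "DEC": 10, "JNE": 10}
CLOCK_HZ = 3_000_000  # assumed 3 MHz CPU clock
LOOPS = 15000

cycles_per_loop = sum(CYCLES.values())
total_cycles = cycles_per_loop * LOOPS
seconds = total_cycles / CLOCK_HZ

print(cycles_per_loop)  # 106
print(total_cycles)     # 1590000
print(seconds)
```

MPY alone accounts for 52 of the 106 cycles, so the multiply dominates the loop even in assembly.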

Sid1968 · September 24, 2019

Good work!

apersson850 · September 24, 2019

As a comparison, the more efficient TMS 9995 CPU, used in the Myarc Geneve, would do the same thing in 0.28 seconds if both workspace and code were in external RAM. If the CPU were allowed to have both workspace (realistic) and code (a bit less realistic, but certainly doable if needed) on chip, it would execute the loop in 0.19 seconds.

And although it's true that the result depends on the software, that's true for the application software, not for the language itself. The test loop here happens to be very adapted to the instruction set of the TMS 9900, even if that probably was just a coincidence. As can be seen in my code example above, it's very efficient. Not a lot of extra instructions moving things around. Since I knew exactly what to do, I could use the register layout in the best way, to make all operations count, without any need to move things an extra time.

But realizing this is something a programmer must do. When you use machines with such low throughput, compared to today's architectures, you simply have to look for and understand what's the best thing to use. Like realizing that the Forth operator U* in this case is more efficient than the * operator. You also have to realize that using BASIC to run through 15000 loops, where you use 13 digits of precision, plus exponent, for each operation, instead of using only 16 bit integers for everything except one instruction (which still is an integer, and even an unsigned one), is the wrong approach, if you need maximum speed.

So no, improving BASIC in this case isn't the right way to go. And as Rich has already explored, giving BASIC an integer data type (which still wouldn't help, as it would have to be a 32-bit integer as well) is far too complex to be worth it. Learning how to integrate assembly into the program you use, learning Forth, using the p-system, or even making a clone peripheral so you can use the p-system without the original p-code card (which is now approaching the rarity of hen's teeth) is effort better spent.

+Lee Stewart · September 24, 2019

2 hours ago, apersson850 said:

I noticed in a previous post that Forth runs faster with a 32 bit result than with a 16 bit result. I presume that's because the 32 bit result is unsigned?

Yup.

...lee

senior_falcon · September 25, 2019

8 hours ago, RXB said:

Dude everything including NUMBERS are running in TI Basic, this is a fact you seem to not understand that I know full well as I know GPL and you do not.

Hell this is even mentioned in EA manual, and I do not get why you are so hostile at me.

If TI Basic is so much faster how come no one is using it over XB or any other XB versions? (Considering for 20 years everyone call TI Basic slow?)

By the way XB and RXB are the same except for some GPL modifications I have done. But that just indicates another attack on my character huh?

How can numbers run in TI BASIC? This makes no sense. If you are running a program in TI BASIC, then it logically follows that everything making up the program runs in BASIC and we all know that it is all kept in VDP memory. You seem to believe that numeric variables are part of the garbage collection or "flush and restart" as you call it. Knowledge of GPL does not automatically make someone an expert on TI BASIC. So to repeat: numeric variables are not part of a garbage collection. What would the point be? They are always 8 bytes long and the interpreter reserves space for them in the prescan. Strings are a different matter, as they can be 1 to 255 bytes long and that can make garbage collection necessary. If you are so inclined you can easily verify this yourself with Classic99.

Get into TI BASIC, open up the debugger and select CPU ram. This will let you look at the scratchpad ram. Enter this program:

10 A$="HELLO WORLD"
20 GOTO 10

Then run it. Look at >831A, called by Heiner Martin "Pointer to end of RAM space used for strings" (Intern, page 130)

It will count down from >37?? to >07??, then pause while the garbage collection is done, and then start over at >37??.

Now try your demo for random numbers:

10 X=RND
20 C=C+1
30 GOTO 10

Run it and you will see >831A sit there at >379B with no change. This is proof there is no garbage collection in this program.
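The behaviour described above can be pictured with a toy model in Python. This is only a sketch of the idea, not of the real interpreter: the class `ToyStringSpace`, its method names, and the round addresses 0x3700 and 0x0700 (standing in for the >37?? and >07?? seen in the debugger) are all assumptions, and the per-string overhead the real interpreter keeps is ignored.

```python
TOP = 0x3700    # placeholder for the ">37??" start value
FLOOR = 0x0700  # placeholder for the ">07??" value where collection kicks in


class ToyStringSpace:
    """Toy model: strings are allocated downward, numbers never allocate."""

    def __init__(self) -> None:
        self.pointer = TOP
        self.collections = 0
        self.live = {}  # variable name -> string currently referenced

    def collect(self) -> None:
        # Keep only strings that are still referenced and restart near the top.
        self.collections += 1
        self.pointer = TOP - sum(len(s) for s in self.live.values())

    def assign_string(self, name: str, text: str) -> None:
        if self.pointer - len(text) < FLOOR:
            self.collect()
        self.pointer -= len(text)  # a fresh copy; the old copy becomes garbage
        self.live[name] = text

    def assign_number(self, name: str, value: float) -> None:
        # Numbers sit in fixed 8-byte slots reserved during the prescan,
        # so the string-space pointer does not move.
        pass


strings = ToyStringSpace()
for _ in range(5000):
    strings.assign_string("A$", "HELLO WORLD")
print(strings.collections, hex(strings.pointer))

numbers = ToyStringSpace()
for count in range(5000):
    numbers.assign_number("X", 0.5)
    numbers.assign_number("C", count + 1)
print(numbers.collections, hex(numbers.pointer))
```

In the string loop the pointer keeps walking down and the collector runs repeatedly. In the numeric loop the pointer never moves, which matches the unchanged value observed at >831A.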

As far as hostility goes, I am not hostile to you personally. However, I think you should stick to the facts and not claim to be an expert, then make bogus claims and statements.

Asmusr · September 25, 2019

8 hours ago, Sid1968 said:

Here we are a community. We discuss but do not fight against each other. Stay relaxed, buddy. This is our hobby... we all want to have fun. It would be a pleasure if your posts were more constructive.

I'm perfectly relaxed. I was making a joke to lighten the mood, and to explain to an outsider what's usually going on in this community.

Edited September 25, 2019 by Asmusr

wierd_w · September 25, 2019

Nothing gets hardcore program nerds red-faced faster than arguing the teensy details of performance enhancements in language interpreters or compilers. Some will assert "Look, that long integer division with floating point on a processor that lacks a dedicated FPU is gonna be hella expensive, M'kay? That's why I used integer math!" and another will assert "But you did it WRONG!!" or "it breaks my compression program!", or "there's a bug in it!" (because it naturally lacks the same precision as the floating point math routine).

Then there are the arguments over using one language over another. Whoo, that gets ugly once it gets started.

Just let them dish it out to each other, it will end up OK in the end. You just gotta remember that labors of love result in people being attached to their work, and thus when it gets criticized, it can feel like personal attacks. Throw into that the often co-morbid social interaction deficits that hardcore nerddom has, and what appears on one side as raging flame, is really "No, but really-- it's not factually correct!" on the other.

This is what I am picking up here. We're adults, we can handle it.

Sid1968 · September 25, 2019

2 hours ago, senior_falcon said:

As far as hostility goes, I am not hostile to you personally. However, I think you should stick to the facts and not claim to be an expert, then make bogus claims and statements.

Stargunner, since I have no FDD or HDD until I get the SDD 99, I cannot try your fantastic BASIC version.

Could you translate BASIC test program 2 to work best in your BASIC version and give us the calculation time in seconds and the program?

10 FOR I=1 TO 15000
20 A=I*(2+I)
30 NEXT I
40 PRINT A

What would you, as a practitioner of BASIC, suggest to improve the speed issues of RXB (XB)? Would you be so kind as to help improve Rich's fantastic project RXB?

Kind Regards

Sid

Edited September 25, 2019 by Sid1968

+Lee Stewart · September 25, 2019

25 minutes ago, Sid1968 said:

Stargunner, ...

FYI, “Stargunner” is @senior_falcon’s AtariAge rank based on his number of posts, not his handle or user name. The user name is in the header bar for the post. That placement can be visually confusing until you get accustomed to it.

...lee

Sid1968 · September 25, 2019

1 hour ago, Lee Stewart said:

FYI, “Stargunner” is @senior_falcon’s AtariAge rank based on his number of posts, not his handle or user name. The user name is in the header bar for the post. That placement can be visually confusing until you get accustomed to it.

...lee

Sorry, I should have known that. So I am asking not "Stargunner" but senior_falcon.

Edited September 25, 2019 by Sid1968

Sid1968 · September 25, 2019

On the VIC-20 and C64 we have the "RUN/STOP" key to stop a running program. On the TI-99/4A I have only found key combinations that reset the computer. Does the TI-99/4A have a "RUN/STOP" key too?

Sid1968 · September 25, 2019

Finally found it: FCTN + 4 ;-)

apersson850 · September 25, 2019

7 hours ago, senior_falcon said:

How can numbers run in TI BASIC?

Isn't he trying to say that the floating point math routines used by TI BASIC are also used by other languages? I presume TI Extended BASIC uses them, I presume Forth uses them, and I know for sure that the PME (p-system) does use them for its floating point math routines.

senior_falcon · September 25, 2019

7 hours ago, Sid1968 said:

Stargunner, since I have no FDD or HDD until I get the SDD 99, I cannot try your fantastic BASIC version.

You need 32K, a disk drive and an XB cartridge (or some other XB) to use this on a real TI99. I always use Classic99 which will work fine and is the recommended way, mainly due to CPU overdrive which helps development time a lot.

Could you translate BASIC test program 2 to work best in your BASIC version and give us the calculation time in seconds and the program?

10 FOR I=1 TO 15000
20 A=I*(2+I)
30 NEXT I
40 PRINT A

No translation is necessary. In XB256 it will run at the same speed as XB/RXB. When compiled the program takes about 11 seconds. But because the compiler uses integer arithmetic the result is wrong, so for that benchmark this compiler is not the proper tool.

What would you, as a practitioner of BASIC, suggest to improve the speed issues of RXB (XB)? Would you be so kind as to help improve Rich's fantastic project RXB?

As I see it, there are two ways to speed up XB and rewriting the ROMs is not one of them.

As discussed earlier in this thread, using integer arithmetic would greatly speed things up. But, unless you find a way to use both integers and floating point in the same program, you would lose a lot of the versatility of XB. Using just integers would be fastest, but then you'd have the restrictions associated with that.

The best would be to get rid of GPL and write XB totally in assembly. I am not bashing GPL, it is just a fact that GPL needlessly slows everything down. More of XB is written in assembly and this is why it usually is considerably faster than TI BASIC. I understand there is a Myarc XB that is completely in assembly and that it runs around 3x faster than standard XB, but it requires additional memory. It may be possible to adapt this to run with the new cartridges that are available or perhaps SAMS.

Over the years, I have spent way too much time working on XB issues. With the compiler, I have found a way to make XB programs run 30x faster, and with XB256 you can use all 256 characters. I have exactly zero interest in working on any projects of this type in the future. As Roberto Duran famously said at the end of his fight with Sugar Ray Leonard, "No mas."

Kind Regards

Sid

Edited September 25, 2019 by senior_falcon

Sid1968 · September 25, 2019

30 minutes ago, senior_falcon said:

What would you, as a practitioner of BASIC, suggest to improve the speed issues of RXB (XB)? Would you be so kind as to help improve Rich's fantastic project RXB?

As I see it, there are two ways to speed up XB and rewriting the ROMs is not one of them.

As discussed earlier in this thread, using integer arithmetic would greatly speed things up. But, unless you find a way to use both integers and floating point in the same program, you would lose a lot of the versatility of XB. Using just integers would be fastest, but then you'd have the restrictions associated with that.

The best would be to get rid of GPL and write XB totally in assembly. I am not bashing GPL, it is just a fact that GPL needlessly slows everything down. More of XB is written in assembly and this is why it usually is considerably faster than TI BASIC. I understand there is a Myarc XB that is completely in assembly and that it runs around 3x faster than standard XB, but it requires additional memory. It may be possible to adapt this to run with the new cartridges that are available or perhaps SAMS.

Over the years, I have spent way too much time working on XB issues. With the compiler, I have found a way to make XB programs run 30x faster, and with XB256 you can use all 256 characters. I have exactly zero interest in working on any projects of this type in the future. As Roberto Duran famously said at the end of his fight with Sugar Ray Leonard, "No mas."

Thank you, senior_falcon, for that good analysis. Writing more code in assembler would have been my first suggestion. Writing the whole of RXB in assembler seems to be the best solution. But how to get rid of GPL? What should Rich use instead?

GDMike · September 25, 2019

RXB can almost go GUI, but it is limited by bus speed, I think.

(Edit) Meaning there are so many "commands" that could actually be graphical: drag and drop, clickable, however you want to define it. All these commands for screen manipulation make RXB so flexible.

Edited September 25, 2019 by GDMike

Sid1968 · September 25, 2019

3 minutes ago, GDMike said:

Rxb can almost go GUI but limited to bus speed I think.

Could you explain this in more detail?

Edited September 25, 2019 by Sid1968

Sid1968 · September 25, 2019

Myarc Extended BASIC II could really serve as a model, because it is written in assembler and uses its RAM expansion to speed things up. If an RXB remastered version used the new SDD 99, it could use up to 32 MB!

Edited September 25, 2019 by Sid1968

+Lee Stewart · September 25, 2019

4 hours ago, apersson850 said:

Isn't he trying to say that the floating point math routines used by TI BASIC are also used by other languages? I presume TI Extended BASIC uses them, I presume Forth uses them, and I know for sure that the PME (p-system) does use them for its floating point math routines.

Actually, fbForth 2.0 does not use the console math routines. A few years ago (with permission), I ported the MDOS L10 Floating Point Library (FPL) to fbForth 2.0 to replace them. The FPL consumes 5632 bytes (more than 2/3) of bank 3 of the fbForth 2.0 cartridge and is, of course, all ALC, albeit on the 8-bit bus. I did this for a number of reasons, among them are

The GPL-based transcendental functions are slow.
The GPL-based transcendental functions use the VDP rollout area at VRAM >03C0, which is right in the middle of the screen area for text and bitmap-graphics modes. This is not a problem for the normal operating graphics mode of TI Basic and the TI Extended Basics because the rollout area is not on screen.
The formatting routine for printing numbers in E-notation does not print the exponent for 3-digit exponents because the exponent field only allows for 2 digits, printing 2 asterisks instead.

It was probably the E-notation-formatting reason that finally drove me to attempt the ALC port of the FPL. The only drawback is that the non-transcendental functions in the console run in ALC on the 16-bit bus, making them clearly faster than fbForth 2.0’s 8-bit-bus versions. One day, I may change the FPL to use the console ALC for all but the GPL-based functions, but this would be a daunting task. I do not doubt I can do it—it just may take more energy than I want to expend.

By the way, the GPL-based floating-point math routines include

Convert Number to String
Greatest Integer (INT)
Raise Number to Power
Square Root
Inverse of Natural Logarithm (EXP)
Natural Logarithm (LOG)
Cosine
Sine
Tangent
Arctangent

...lee

+FarmerPotato · September 25, 2019

22 hours ago, wierd_w said:

Perhaps a revisiting of Carmack's "magic number" based approach would be a good compromise?

https://blog.dave.io/0x5f3759df-a-true-magic-number/

Did you mean fixed-point arithmetic? Fixed point has been used in many TI-99/4A applications which roll their own library. In particular Mandelbrot set generators...
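For anyone who has not rolled their own, here is a minimal fixed-point sketch in Python. The Q4.12 layout (12 fraction bits) and the helper names are assumptions chosen for illustration, not taken from any particular TI-99/4A program.

```python
FRAC_BITS = 12          # assumed layout: 12 fraction bits (Q4.12 in a 16-bit word)
ONE = 1 << FRAC_BITS    # the fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to the nearest fixed-point integer."""
    return int(round(x * ONE))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values.

    The raw product carries 2 * FRAC_BITS fraction bits, so shift it back down.
    """
    return (a * b) >> FRAC_BITS


def to_float(a: int) -> float:
    """Convert a fixed-point integer back to a float for display."""
    return a / ONE


a, b = to_fixed(1.5), to_fixed(-0.25)
print(a, b)                         # 6144 -1024
print(to_float(fixed_mul(a, b)))    # -0.375
```

On a 16-bit CPU the intermediate product is the 32-bit value a 16 x 16 multiply produces, which is why this approach suits integer-only hardware.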

This is a link to Carmack's 1/sqrt(x) hack of unknown origin. Not a general purpose idea.

The hack is amazing and somehow akin to the longhand square root computation, because it shifts one bit of the exponent into the mantissa. I remember this in his series of articles on the Quake engine around 1999.

Natively, sqrt() should use log and exp.
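Both ideas can be tried side by side in Python. This is a sketch for positive inputs only. The function names are invented here, and the bit trick is the widely circulated 32-bit version with one Newton-Raphson step, not code from any TI-99/4A library.

```python
import math
import struct


def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x using the 0x5f3759df bit trick."""
    i = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 bits as an integer
    i = 0x5F3759DF - (i >> 1)                         # magic-number first guess
    y = struct.unpack("<f", struct.pack("<I", i))[0]  # back to a float
    return y * (1.5 - 0.5 * x * y * y)                # one Newton-Raphson step


def sqrt_via_log(x: float) -> float:
    """Compute sqrt(x) as exp(ln(x) / 2) for positive x."""
    return math.exp(0.5 * math.log(x))


for value in (2.0, 4.0, 100.0):
    print(value, fast_inv_sqrt(value), 1.0 / math.sqrt(value), sqrt_via_log(value))
```

The bit trick depends on the binary IEEE 754 layout, so it would not carry over directly to the radix 100 floats used on the 99/4A. The exp/log route only needs the two transcendental functions.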
