
π spigot benchmark results


vol

I have a project: a multi-platform π calculator.  A version is available for the TI-99/4A with the Editor/Assembler cartridge and the 32 KB RAM expansion.  The results surprised me a bit.  TI-99/4A Basic is likely the slowest among home computers, but the processor at 3 MHz performs very well.  It beats the Z80 @ 6 MHz, the 6502 @ 4 MHz, and even the VAX-11/730!  I am not an experienced TMS9900 coder, so my code can probably be improved a bit.  So let us try to make it faster.  We only need to optimize 33 LOC:

!l4:   s 12,7
       ...
       jne -!l4

It would also be interesting to get results from real hardware.  IMHO emulators are a bit faster than hardware.
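
For readers who have not met the term: a spigot algorithm produces the digits of π a few at a time from an array of small integers, which is why it suits 16-bit machines. Below is a minimal Python sketch of one well-known variant, the compact loop that emits four digits per pass. It is an assumption that the calculator uses this particular variant; the name `pi_spigot` is illustrative, and the TMS9900 inner loop referred to above is not reproduced here.

```python
def pi_spigot(n_digits):
    """Return the first n_digits decimal digits of pi as a string ("3141...").

    Illustrative sketch only: four digits per outer pass and 14 array
    cells per group of four digits.
    """
    base = 10000                      # each pass emits one base-10000 "digit"
    groups = (n_digits + 3) // 4      # number of 4-digit groups to produce
    c = groups * 14                   # length of the working array
    f = [base // 5] * c + [0]         # f[0..c-1] = 2000, f[c] = 0
    carry = 0                         # low part kept from the previous pass
    out = []
    while c > 0:
        d = 0
        g = 2 * c                     # divisors run 2c-1, 2c-3, ..., 1
        b = c
        while True:                   # the hot inner loop
            d += f[b] * base
            g -= 1
            f[b] = d % g              # remainder stays in the array
            d //= g                   # quotient moves on to the next cell
            g -= 1
            b -= 1
            if b == 0:
                break
            d *= b
        out.append("%04d" % (carry + d // base))
        carry = d % base
        c -= 14
    return "".join(out)[:n_digits]


if __name__ == "__main__":
    print(pi_spigot(800))
```

The inner loop is one multiply-add, one division with remainder, and a few decrements per array cell, so a CPU with hardware multiply and divide instructions has an advantage in this kind of benchmark.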

Edited by vol

4 hours ago, vol said:

I have a project: a multi-platform π calculator.  A version is available for the TI-99/4A with the Editor/Assembler cartridge and the 32 KB RAM expansion.  The results surprised me a bit.  TI-99/4A Basic is likely the slowest among home computers, but the processor at 3 MHz performs very well.  It beats the Z80 @ 6 MHz, the 6502 @ 4 MHz, and even the VAX-11/730!  I am not an experienced TMS9900 coder, so my code can probably be improved a bit.  So let us try to make it faster.  We only need to optimize 33 LOC:


!l4:   s 12,7
       ...
       jne -!l4

It would also be interesting to get results from real hardware.  IMHO emulators are a bit faster than hardware.

IMHO the Classic99 emulator has faster file I/O but seems very close to real hardware in instruction execution time.

When do we get to see the code?  :) 

 

 


7 hours ago, TheBF said:

IMHO the Classic99 emulator has faster file I/O but seems very close to real hardware in instruction execution time.

When do we get to see the code?  :)

The Classic99 DSR is simulated, so yeah, file access through that will be much, much faster than reality.

 

If you run the TI DSR (supported for test only), only the actual sector read is simulated - it will still be faster since there's no spinning disk or data transfer, but not as much.

 


On 3/2/2021 at 10:52 AM, vol said:

I have a project: a multi-platform π calculator.  A version is available for the TI-99/4A with the Editor/Assembler cartridge and the 32 KB RAM expansion.  The results surprised me a bit.  TI-99/4A Basic is likely the slowest among home computers, but the processor at 3 MHz performs very well.  It beats the Z80 @ 6 MHz, the 6502 @ 4 MHz, and even the VAX-11/730!  I am not an experienced TMS9900 coder, so my code can probably be improved a bit.  So let us try to make it faster.  We only need to optimize 33 LOC:

 

I am not the world's expert on Assembly Language, but I think the scroll routine is sub-optimal. In my experience this really affects long benchmark timings on the TI-99, because scrolling takes a significant part of the run time.

Looks like you are moving one byte at a time?  

 

Typically it is better to read at least one line into a RAM buffer because the VDP can auto-increment the address for us.

Then write the RAM buffer back to VDP, again leveraging the VDP auto-increment feature.

At the extreme, the optimal approach is to read lines 2 to 24 into one big RAM buffer and then write them back to lines 1 to 23.

 

One can also improve the address selection code, as was taught to me by others here, by using the fact that registers are in memory.

This lets you use the odd address of a register as a source rather than using SWPB twice.

 

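       li 5,>4000      *scroll
       li 3,cols       ; R3 = source address (start of line 2)
!loop: swpb 3
       limi 0          ; no interrupts while the VDP address is set up
       movb 3,@vdpwa   ; source address, low byte
       swpb 3
       movb 3,@vdpwa   ; source address, high byte (read mode)
       movb @vdprd,0   ; read one screen byte
       swpb 5
       movb 5,@vdpwa   ; destination address, low byte
       swpb 5
       movb 5,@vdpwa   ; destination address, high byte (>40 bit = write)
       movb 0,@vdpwd   ; write the byte one line higher
       limi 2
       inc 3
       inc 5
       ci 3,cols*rows  ; repeat until the end of the screen
       jne -!loop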
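
To put a rough number on the difference between the byte-at-a-time loop above and a buffered scroll, here is a toy Python model that only counts accesses to the VDP ports. Everything in it is an assumption for illustration: the `VDP` class, the function names, and the simplification that every port access costs the same. It ignores CPU loop overhead and real VDP timing, so it shows a trend and is not a cycle-accurate measurement.

```python
COLS, ROWS = 40, 24   # text mode, as in the thread


class VDP:
    """Toy VDP RAM with an auto-incrementing address register."""

    def __init__(self, size=0x4000):
        self.ram = bytearray(size)
        self.addr = 0
        self.port_accesses = 0

    def set_addr(self, addr):
        # Setting the address takes two byte writes to the address port.
        self.addr = addr
        self.port_accesses += 2

    def read(self):
        value = self.ram[self.addr]
        self.addr = (self.addr + 1) % len(self.ram)   # auto-increment
        self.port_accesses += 1
        return value

    def write(self, value):
        self.ram[self.addr] = value
        self.addr = (self.addr + 1) % len(self.ram)   # auto-increment
        self.port_accesses += 1


def scroll_bytewise(vdp):
    """Set both addresses again for every single byte moved."""
    for src in range(COLS, COLS * ROWS):
        vdp.set_addr(src)
        byte = vdp.read()
        vdp.set_addr(src - COLS)
        vdp.write(byte)


def scroll_buffered(vdp, lines_per_chunk=1):
    """Read a chunk into a CPU RAM buffer, then write it one line higher."""
    chunk = COLS * lines_per_chunk
    for src in range(COLS, COLS * ROWS, chunk):
        n = min(chunk, COLS * ROWS - src)
        vdp.set_addr(src)
        buf = [vdp.read() for _ in range(n)]   # address set only once
        vdp.set_addr(src - COLS)
        for byte in buf:
            vdp.write(byte)


if __name__ == "__main__":
    for label, scroll in (
        ("bytewise", scroll_bytewise),
        ("1-line buffer", scroll_buffered),
        ("23-line buffer", lambda v: scroll_buffered(v, ROWS - 1)),
    ):
        vdp = VDP()
        scroll(vdp)
        print(label, vdp.port_accesses)
```

In this model the byte-wise scroll needs 5520 port accesses, a one-line buffer 1932, and a 23-line buffer 1844, so most of the gain already comes from buffering a single line.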

 


Update on the scroll.

I re-wrote your code in Forth Assembler and compared it with my scroll, which uses Forth for looping (quite slow) and ASM routines to read/write VDP RAM, but with a 2-line buffer.

The video and the screen capture show the results on Classic99, measured with a coarse elapsed timer.  SCROLL-SLOW is almost 2X slower than mine, even though mine uses a lot of Forth.

NEEDS MOV FROM DSK1.ASM9900
\ NEEDS .S  FROM DSK1.TOOLS

DECIMAL
24   CONSTANT ROWS
40   CONSTANT COLS  \ text mode

HEX
8800 CONSTANT VDPRD
8802 CONSTANT VDPSTS
8C00 CONSTANT VDPWD
8C02 CONSTANT VDPWA

CODE SCROLL-SLOW ( -- )
     R5 4000 LI,
     R3 COLS LI,
     BEGIN,
        R3 SWPB,          \ !loop:
        0 LIMI,
        R3 VDPWA @@ MOVB,
        R3 SWPB,
        R3 VDPWA @@ MOVB,
        VDPRD @@ R0 MOVB,
        R5 SWPB,
        R5 VDPWA @@ MOVB,
        R5 SWPB,
        R5 VDPWA @@ MOVB,
        R0 VDPWD @@ MOVB,
        2 LIMI,
        R3 INC,
        R5 INC,
        R3 COLS ROWS *  CI,
     EQ UNTIL,            \ jne -!loop
     NEXT,
ENDCODE

 

SLOWSCROLL.png


23 hours ago, TheBF said:

Update on the scroll.

I re-wrote your code in Forth Assembler and compared it with my scroll, which uses Forth for looping (quite slow) and ASM routines to read/write VDP RAM, but with a 2-line buffer.

The video and the screen capture show the results on Classic99, measured with a coarse elapsed timer.  SCROLL-SLOW is almost 2X slower than mine, even though mine uses a lot of Forth.

Thank you.  What a nice idea to use a Forth-like assembler!  IMHO other platforms missed this idea. :( I'm a big RPN fan.
Indeed, my scrolling routine could be faster, but for my project that is not very important.  Scrolling speed noticeably affects only the 100-digit results.  For the ER calculation, only CPU timings (without I/O) are used.  BTW, I had to bend some rules in favor of the TI-99/4A: my rule is to use only a system routine for displaying characters, but the TI-99/4A doesn't have such a routine. :( IMHO my routine is faster than Basic PRINT anyway.

My primary interest is to make the π calculation faster using every trick available on each system.


OK, I wondered how much scrolling was involved. I saw some code that did 2000+ digits, so then it would become a problem on the TI-99.

 

Forth has had an assembler since its inception. It typically takes 150 to 200 lines of Forth to write an RPN assembler. And Forth's compiling means macros come along for free.

It's pretty cool to be able to test code snippets interactively at the command line before committing :) 


  • 4 weeks later...

I have made a version of the pi calculator for Extended Basic (it is available for download in the same place).  It works fine under Classic99.  However, I tried several other emulators and, surprisingly, the timing results differed.
Here are the results for 1000 digits:
JS99 - 66.5 s
Classic99 - 69.1 s
MAME - 127.3 s
So the results are almost identical for JS99 and Classic99, but MAME shows a completely different number.  I assume that MAME doesn't properly emulate the faster access to scratchpad RAM.  I also checked the results with a stopwatch: every emulator prints the same timing that my stopwatch shows.  BTW, I failed to run V9t9; its code seems to have problems with modern Java.
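
As a quick check of how far apart these numbers are, the three timings can be normalized to the Classic99 result. This is a throwaway Python sketch that uses only the values listed above; the choice of Classic99 as the reference is arbitrary.

```python
# 1000-digit timings in seconds, as reported above.
timings_s = {"JS99": 66.5, "Classic99": 69.1, "MAME": 127.3}

reference = timings_s["Classic99"]
ratios = {name: t / reference for name, t in timings_s.items()}

for name, ratio in ratios.items():
    print(f"{name:10s} {timings_s[name]:6.1f} s  x{ratio:.2f}")
```

JS99 comes out about 4% faster than Classic99, while the MAME run takes about 1.84 times as long.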
I've got another problem.  I tried various XB cartridges with MAME:

extended_basic_100.rpk
extended_basic_27.rpk
extended_basic_plus.rpk
extended_basic.rpk
I can successfully use only the first of them.  All the others give I/O ERROR 00 after
OLD DSK1.PIXB
I use the following line to start MAME:
mess ti99_4a -cart extended_basic_100.rpk -ioport peb -ioport:peb:slot2 32kmem -ioport:peb:slot8 hfdc -flop1 pi.dsk
Why don't other XB variants work?  Any hint?  Thank you in advance.


3 hours ago, vol said:

So the results are almost identical for JS99 and Classic99, but MAME shows a completely different number.  I assume that MAME doesn't properly emulate the faster access to scratchpad RAM.  I also checked the results with a stopwatch.

I would be highly interested in how this could possibly happen. Timing precision is the primary goal for MAME, and scratchpad access is certainly emulated with 0 wait states and 16 bits wide. This is the largest deviation that I have ever heard of.
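
To make the wait-state point concrete, here is a rough cost model in Python. It rests on commonly cited TI-99/4A figures that should be treated as assumptions here: a 16-bit access outside the scratchpad (and console ROM) goes through the 8-bit multiplexer and costs 4 extra clock cycles, and a register-to-register MOV takes 14 cycles with 4 memory accesses. The helper names are made up for this sketch, and nothing in it is taken from MAME's source.

```python
CPU_HZ = 3_000_000        # "processor at 3 MHz", as stated in the first post
WAIT_STATES_8BIT = 4      # assumed penalty per access over the 8-bit bus


def instruction_cycles(base_cycles, slow_accesses):
    """Cycles for one instruction, given how many of its memory accesses
    go to memory behind the 8-bit multiplexer."""
    return base_cycles + WAIT_STATES_8BIT * slow_accesses


def microseconds(cycles):
    return cycles * 1e6 / CPU_HZ


MOV_BASE, MOV_ACCESSES = 14, 4   # assumed figures for MOV Rx,Ry

# Code and workspace both in scratchpad: no penalty.
all_fast = instruction_cycles(MOV_BASE, 0)
# Code in the 32 KB expansion, workspace in scratchpad: only the fetch is slow.
code_slow = instruction_cycles(MOV_BASE, 1)
# Code and workspace both in the 32 KB expansion: every access is slow.
all_slow = instruction_cycles(MOV_BASE, MOV_ACCESSES)

if __name__ == "__main__":
    for label, cycles in (("all fast", all_fast),
                          ("code slow", code_slow),
                          ("all slow", all_slow)):
        print(f"{label:10s} {cycles:3d} cycles  {microseconds(cycles):.2f} us")
```

Under these assumptions the same instruction costs 14, 18, or 30 cycles depending on where the code and the workspace live, so an emulator that got the scratchpad timing wrong would indeed show a large slowdown.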

 

I suppose you are using a current MAME release.

 

Edit: Also, please use the ZIP cartridges primarily, not the RPKs, unless there is no ZIP. That is, you have

 

exbasm            Mechatronic/PS Extended Basic
exbasic           Extended Basic
exbas25           Extended Basic v2.5
exbaspl           Extended Basic Plus
exbasic1          Extended Basic v100


and exbasic is guaranteed to work, as I am always using it.

 

Edit2: Please post the pi.dsk file (or send it to me by private message) so that I can test it.

Edited by mizapf

Thank you very much.

14 hours ago, mizapf said:

I would be highly interested in how this could possibly happen. Timing precision is the primary goal for MAME, and scratchpad access is certainly emulated with 0 wait states and 16 bits wide. This is the largest deviation that I have ever heard of.

 

I suppose you are using a current MAME release.

 

Edit: Also, please use the ZIP cartridges primarily, not the RPKs, unless there is no ZIP. That is, you have

and exbasic is guaranteed to work, as I am always using it.

 

Edit2: Please post the pi.dsk file (or send it to me by private message) so that I can test it.

I use MAME 0.229.  My experience with other platforms (Commodore, Atari, ...) is that MAME is sometimes very poor for software that requires accurate timing, so I usually use other, more finely tuned emulators.  For the TI-99/4A, however, MAME proves to be a very accurate emulator in this case.
The problem was in Extended Basic v100.  It seems to use some kind of long custom interrupt routine, which affects the timings dramatically.  All other XB variants show the same result under MAME, 68.4 s, which matches the results of the other emulators.  I tested the cartridges editass, exbasic, exbas25, exbaspl, exbasm, and superxb.  Only the exbasic1 cartridge shows a very slow calculation.
I had problems with some XB variants because I entered the filename in lowercase. :) It is interesting that XB v100 fixes the keyboard in uppercase mode.

13 hours ago, Tursi said:

I'd like to run it against hardware as well... but I wasn't able to determine what I needed to pull. The DSK image will cover everyone for testing :)

I have attached the disk image.  Type OLD DSK#.PIEA for standard Basic with the E/A cartridge, or OLD DSK#.PIXB for Extended Basic.  It would still be interesting to get results from the real iron.  I need timings for the 100-, 1000-, and 3000-digit runs.  Thank you in advance.

pi.zip


Good to hear that you found the culprit. I already guessed it could be Extended Basic 1.00.

 

You may know that MAME is actually a framework for many emulations, and as you see, you cannot judge one emulation in MAME by another. It is now 14 years since I took over the development of the TI family emulations, and a lot of effort over the years went into bringing the emulation closer to the real hardware, so that result would have been a blatant failure. More information about the MAME development is on my website, https://www.mizapf.de/en/ti99/mame

 


2 hours ago, Tursi said:

XB 100 existed before the TI99 had a shift key or lowercase in the character set.

 

Not even joking. :)

 

<old man voice> When I was a boy we had 5-bit BAUDOT code on the 80M band. Capital letters is all anybody ever needs, gol'dangit.


On 4/3/2021 at 3:00 AM, Tursi said:

XB 100 existed before the TI99 had a shift key or lowercase in the character set.

 

Not even joking. :)

 

:) I was also confused by Classic99, which fixes uppercase characters when someone works with its Basics (standard or Extended).  MAME doesn't do so.


6 hours ago, vol said:

:) I was also confused by Classic99, which fixes uppercase characters when someone works with its Basics (standard or Extended).  MAME doesn't do so.

I don't know what you mean by "fixes uppercase"... it doesn't do anything of the sort.

 

But if you are talking about disk access -- that's because it has a different disk controller. MAME emulates hardware disk controllers only. Classic99 has an emulation disk controller that allows it to read Windows files. Since it was emulated anyway, I encoded both uppercase and lowercase versions of the disk devices in the ROM. Again, no translation is taking place, the device controller simply recognizes both. ;)
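
The "recognizes both, translates nothing" idea can be pictured with a small Python sketch. This is not Classic99's actual code: the table, the number of drives, and the function name are hypothetical and only illustrate a lookup where each device is registered under two spellings and matched exactly.

```python
# Hypothetical device table: every disk device is registered twice,
# once in uppercase and once in lowercase.
DEVICE_TABLE = {}
for drive in (1, 2, 3):
    for name in (f"DSK{drive}", f"dsk{drive}"):
        DEVICE_TABLE[name] = drive


def find_drive(path):
    """Return the drive number for a path like 'DSK1.PIXB', or None.

    The device part is matched exactly as typed; no case conversion
    takes place anywhere.
    """
    device, _, _filename = path.partition(".")
    return DEVICE_TABLE.get(device)


if __name__ == "__main__":
    for path in ("DSK1.PIXB", "dsk1.pixb", "Dsk1.PIXB"):
        print(path, "->", find_drive(path))
```

In this sketch both `DSK1.PIXB` and `dsk1.pixb` resolve to drive 1, while a mixed-case `Dsk1.PIXB` is not found, because only the two registered spellings exist. A controller with an uppercase-only table would reject the lowercase form as well.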

 

 


10 hours ago, Tursi said:

.. looks like I might need a copy of XB100 to test this, too, can't seem to locate one. ;)

 

http://ftp.whtech.com/Cartridges/MAME/zip/exbasic1.zip

10 hours ago, Tursi said:

I don't know what you mean by "fixes uppercase"... it doesn't do anything of the sort.

 

But if you are talking about disk access -- that's because it has a different disk controller. MAME emulates hardware disk controllers only. Classic99 has an emulation disk controller that allows it to read Windows files. Since it was emulated anyway, I encoded both uppercase and lowercase versions of the disk devices in the ROM. Again, no translation is taking place, the device controller simply recognizes both. ;)

 

 

Maybe I missed something, but under Classic99 XB I get the same letter whether the Shift key is pressed or released.  I can type a lowercase letter only in Caps Lock mode.  Under MAME a lowercase letter is typed when Shift is not pressed and an uppercase one when it is pressed.


Could it be that Alpha Lock is pressed in Classic99 and not in MAME?

 

One thing that obviously surprises many people who are new to the TI-99 is that uppercase is the normal case, lowercase is used rarely. For example, the Editor/Assembler defines all instructions as uppercase, and BASIC keywords are listed as uppercase only.

 

For MAME I suggest mapping Alpha Lock to a key other than Shift Lock; I'm using the left Windows key. The problem is that the state of the PC Shift Lock and the emulated Alpha Lock may get out of sync when you leave the emulation with Shift Lock active.


Yep. The normal state of the TI-99/4A is Alpha Lock DOWN - that is, you should do most operations in uppercase.

 

Classic99 inverts the Caps Lock key because switching back and forth while multi-tasking with Windows is annoying and stupid. This is all in the manual. Except the "stupid" opinion. ;) 

 
