Ahl's Benchmark?


141 replies to this topic

#51 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Sun Dec 9, 2007 11:09 AM

Some emulators are fairly rudimentary in their implementation and timing.

But, WinVICE and Atari800Win+ are cycle-exact, and should return results extremely close to the real thing, provided the machine they run on can maintain 100% speed.

It still gets an asterisk until it's run on the real thing.

#52 Larry OFFLINE  

Larry

    River Patroller

  • Topic Starter
  • 3,931 posts
  • Location:U.S. -- Midwest

Posted Sun Dec 9, 2007 3:50 PM

The difference in accuracy makes me nervous. They *should* be the same.


Dunno. That was the only difference, run on a real 64.

I've already got the stuff put away in the garage. Perhaps someone who uses the 64 emulator can replicate these.

-Larry

#53 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Sun Dec 9, 2007 11:40 PM

Apple IIgs 2.8MHz, hand timed
Time 42.5
Accuracy 1.04141235E-03
Random 8.42399192

There are aftermarket accelerators clocked at up to 18MHz.
One of those should be able to run it in about 6.5 seconds
*without* using native 65c816 instructions.
<edit>
I'm not sure if this is from the same benchmark
but the standard gs had a result of 42.8...
so the other benchmarks should be pretty close.
(hand timing could have been off that much)
TransWarp 11.25 MHz Time 8.4s
<edit> 5.4 estimated with A=A*A
http://home.swbell.n.../R026GSEMUS.htm
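
The 18MHz guess above looks like plain clock scaling of the hand-timed 42.5 second result. A minimal sketch of that arithmetic, assuming run time is inversely proportional to clock speed (it ignores memory wait states and anything else that doesn't speed up with the CPU). The helper name scaleByClock is made up for illustration.

```javascript
// Estimate run time at a new clock speed, assuming time scales as 1/clock.
// scaleByClock is a hypothetical helper, not part of any benchmark listing.
function scaleByClock(seconds, fromMHz, toMHz) {
  return seconds * fromMHz / toMHz;
}

// Hand-timed IIgs result from this post: 42.5 s at 2.8 MHz.
const estimate18MHz = scaleByClock(42.5, 2.8, 18);
console.log(estimate18MHz.toFixed(2) + " s");
```

That works out to a little over 6.6 seconds, the same ballpark as the "about 6.5" quoted above. Real accelerators may land above or below a straight-line estimate like this.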

Edited by JamesD, Mon Dec 10, 2007 11:29 PM.


#54 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Mon Dec 10, 2007 8:12 PM

After looking at some results people are getting I think A*A seems to be consistently faster than A^2.
Using that method cut results on the CoCo 3. I'll see if someone can rerun the benchmarks for it to improve times.
I'll have to check out the Apples again.

<edit>
The CoCo3 benchmark improved by over 38%.
The IIgs benchmark improved by 40% +/- hand timing error.


I think it's safe to say that an FPGA based machine with a 25MHz CPU could run this in under 1 second on many machines.

Edited by JamesD, Mon Dec 10, 2007 11:36 PM.


#55 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Tue Dec 11, 2007 2:25 PM

The benchmark over on the C128 forum:

10 PRINT TIME$
20 A=2.71826
30 B=3.14159
40 C=1
50 FOR I=1 TO 5000
60 C=C*A
70 C=C*B
80 C=C/A
90 C=C/B
100 NEXT I
110 PRINT TIME$
120 END


Then they added these lines for a larger benchmark:
25 A$="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
35 B$="1234567890"
65 C$=LEFT$(A$,INT(A))
75 C$=LEFT$(B$,INT(B))
85 C$=RIGHT$(A$,INT(A))
95 C$=RIGHT$(B$,INT(B))



To give you an idea of how fast the SuperCPU 65c816 upgrade is:
PAL SuperCPU128 128 mode: 16 or 17
PAL SuperCPU128 64 mode: 10

CoCo 3 Emulator Fast clock: 141.6
NTSC C128 Fast: 220

#56 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Wed Dec 12, 2007 10:39 PM

On systems where A=A*A and A=A^2 give very different results, it's because A=A*A is more accurate than the other method. If A=A*A appears less accurate, it's only because SQR() is just as inaccurate as A=A^2.

Now, even though A=A*A is significantly faster, its use doesn't test the efficiency of the math library functions and kind of defeats the purpose of the original benchmark. If you are using much larger exponents, ^ can be faster, by the way.

If this were something other than a benchmark you could unroll the loop and the number of multiplies would be reduced to a fraction of what the loop does. The calculations could actually be executed in under a second.
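
One way to see what the squaring method does to the accuracy figure is to run the square-root/square round trip with the squaring step swapped out. A minimal sketch in JavaScript rather than BASIC. It assumes ^ is computed through log and exp, uses the host's double precision math rather than any 8-bit math pack, and leaves out the RND calls. The name roundTripError is made up for illustration.

```javascript
// Sketch: Ahl's accuracy figure with a pluggable squaring step.
// Assumption: "^" is modeled as exp(2*log(a)); real BASICs differ in detail.
function roundTripError(square) {
  let s = 0;
  for (let n = 1; n <= 100; n++) {
    let a = n;
    for (let i = 0; i < 10; i++) a = Math.sqrt(a); // ten square roots
    for (let i = 0; i < 10; i++) a = square(a);    // ten squarings
    s += a;                                        // ideally adds exactly n
  }
  return Math.abs(1010 - s / 5);                   // 1+2+...+100 = 5050, 5050/5 = 1010
}

const errMultiply = roundTripError(a => a * a);
const errExpLog = roundTripError(a => Math.exp(2 * Math.log(a)));
console.log("A*A :", errMultiply);
console.log("A^2 :", errExpLog, "(modeled as exp/log)");
```

On a modern machine both errors are tiny, so this only shows the mechanism. The gap people are seeing on the old machines comes from their much shorter floating point formats and their particular log/exp routines.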

#57 carlsson OFFLINE  

carlsson

    Metagalactic Mule

  • 5,238 posts
  • Location:Västerås, Sweden

Posted Mon Dec 17, 2007 6:09 AM

Anyone got a BBC Model B?

I own the following Acorn computers: Electron, BBC Model B, BBC Master 128, BBC Master Compact. I can benchmark them all tonight (or at least before Christmas).

I also own the very rare computers vTech Laser 2001 (2 MHz 6502, compatible with creatiVision) and COMX-35 (RCA 1802, supposedly one of the slowest Basics ever). I could benchmark these as well. In particular the COMX would be interesting to benchmark compared to ZX-81 in SLOW mode, TI-99/4A, Atari 400/800 etc.

Furthermore, I have a Philips VG-8235 MSX2 to benchmark. Commodore (VIC-20, C64, C128) and Atari (800XL, 130XE) computers are already catered for, it seems.

While we're discussing benchmarks, do you remember that Byte Magazine published a suite of eight simple benchmarks as early as 1977 (or was it 1979?)? I think David Ahl's benchmark was published as an extension to these eight benchmarks. I think many years ago I posted on the newsgroup comp.sys.oric about these benchmarks, and it might be interesting to run them all for a whole set of comparisons. Some Basics are lightning fast when they run an empty FOR-NEXT loop, but get dog slow as soon as you fill the loop with some content like variable assignments, IF statements etc.

Edited by carlsson, Mon Dec 17, 2007 6:12 AM.


#58 richwilsonaus OFFLINE  

richwilsonaus

    Combat Commando

  • 1 posts
  • Location:Tasmania, Australia

Posted Mon Dec 17, 2007 10:30 AM

* Amstrad CPC 6128 Plus Locomotive BASIC 1.1

Accuracy 3.50475E-05
Random 3.07867885
Time 38.9666667 sec


* benchmarked on WinAPE


I performed this benchmark on WinAPE, after adding the following lines:

5 t=TIME

...

120 PRINT "Time: ";(TIME - t) / 300

WinAPE is cycle accurate, and it's using the internal TIME function. The TIME is incremented every 64 * 52 = 3328 microseconds, so the calculation for display should actually be (TIME - t) * 3328 / 1E6 = 38.904 sec. I will run this on my real Amstrad Plus today.
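
The correction above can be checked with a few lines. A minimal sketch, assuming the run took 11690 TIME ticks (that count is inferred from the reported 38.9666667 seconds times 300, it was not printed by the program). The function names are made up for illustration.

```javascript
// Convert an Amstrad TIME difference (in ticks) to seconds two ways.
const TICK_MICROSECONDS = 64 * 52;   // 3328 us per tick, per the post

// What line 120 prints: assumes exactly 300 ticks per second.
function nominalSeconds(ticks) {
  return ticks / 300;
}

// Using the actual tick length instead.
function exactSeconds(ticks) {
  return ticks * TICK_MICROSECONDS / 1e6;
}

// 11690 ticks is an inferred figure (38.9666667 s x 300), not a measured one.
const ticks = 11690;
console.log(nominalSeconds(ticks).toFixed(3), exactSeconds(ticks).toFixed(3));
```

The two figures differ by about 0.06 seconds, which is well inside what anyone could hand time anyway.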

I can also run this benchmark on the BBC Model B. It could be at least twice as fast as the C64 since its 6502 is clocked at 2MHz, and it doesn't lose any CPU cycles to video access like the C64 does.

#59 carlsson OFFLINE  

carlsson

    Metagalactic Mule

  • 5,238 posts
  • Location:Västerås, Sweden

Posted Mon Dec 17, 2007 1:58 PM

Acorn Electron (0.6 - 2 MHz variable rate 6502)
Accuracy: 1.28746033E-5
Random: 5.17578411
Time: 28.5 seconds

Acorn BBC Model B (2 MHz 6502, test rerun twice)
Accuracy: 1.28746033E-5
Random: 10.0480092, 0.309859753
Time: 21 seconds, 20 seconds

Acorn BBC Master Compact (2 MHz R65C102, dunno if it matters)
Accuracy: 2.21729279E-5
Random: 28.1809998
Time: 7.5 seconds - yes, I double checked this benchmark result!

vTech Laser 2001 (2 MHz 6502, I believe)
Accuracy: 3.2722295E-04
Random: 11.8781874
Time: 45 seconds

COMX-35 (2 MHz RCA CDP-1802, like it matters)
Accuracy: 0.32489
Random: 1.26404
Time: approx 7.5 minutes!!! (I didn't time it to the exact second)

All times were hand measured using a wrist watch. I was unable to measure my BBC Master 128, as it has a partly malfunctioning keyboard (unable to enter the equals sign, among others). The Laser 2001 doesn't seem to have a ^ function, so I replaced it with A*A.

In any case, the COMX doesn't seem any slower than the original Atari Basic, which surprises me a bit. Of course, the COMX-35 was introduced in late 1983, by which time Atari computers had moved on to the XL series and should no longer suffer from a dog-slow Basic. This is a candidate for another series of tests, to see which one really has the slowest Basic and should be poked fun at for it.

Edited by carlsson, Mon Dec 17, 2007 2:15 PM.


#60 Tursi OFFLINE  

Tursi

    River Patroller

  • 4,851 posts
  • HarmlessLion
  • Location:BUR

Posted Wed May 22, 2013 10:28 AM

Adding to an old thread, since it came up in the TI group... :) Testing against the TI using Classic99 emulator (which is close but not 100%). Timed with clock device, time ends before results are printed. Porting to TI Extended BASIC required using '::' as the multiple statement operator and removing the argument from RND. Porting to TI BASIC further required breaking up the multiple statement lines, as multiple statements are not supported there.

Texas Instruments 99/4A * (3MHZ TMS9900)

TI BASIC
Accuracy: 0.00000011
Random: 1.928494715
Time: 4m31s

TI Extended BASIC (with 32k memory)
Accuracy: 0.00000011
Random: 6.27999115
Time: 3m55s

The random factor varies a fair bit depending on the random seed. Both of those values are on the low end.

#61 thorfdbg OFFLINE  

thorfdbg

    Dragonstomper

  • 746 posts

Posted Wed May 22, 2013 2:41 PM

For reference: The following is (still) with the Atari 8K basic, but with the (improved) Mathpack that comes with Os++:

Accuracy = 1.18E-3 (that is, better by a factor of ten; note that "accuracy" should read "error")
Random = 4.985358 (on this run)
Time = 230secs (measured on the machine, not with a stop-watch, i.e. by counting cycles which should be precise)

Thus, an only slightly more careful implementation is an order of magnitude more accurate, and faster by a factor of 2 here. Unfortunately, there is not enough ROM space to unroll the loop as TBXL did, so there is no chance to improve this much further.

And, finally, if the math is all done on the host (i.e. in an emulation), we get:

Accuracy: 0.014125
Random: 3.94
Time: 17.36 (also, machine time).

The accuracy is worse since numbers are converted back and forth between the Atari BCD format and the machine's native IEEE format, but the execution speed is of course unbeatable. But that also means that only 17 seconds of the overall native >400 seconds are due to the poor loop management of Atari Basic.

If you ask me, my best bet as to what improved this is that: a) multiplication is a bit faster, but b) log and exp, as used by Atari Basic for the power operator, are quite a bit faster due to a smarter and more precise choice of the approximating polynomials.

(That said, Atari Basic rev. B and C still have a stupid bug that wants me to believe that 4^4 = 257. But that holds only for the Os++ mathpack, and is due to the outright stupid rounding mechanism of the BASIC and not due to the math itself. Basic A gives an unrounded result that is very close to the right one.)

#62 slx OFFLINE  

slx

    Stargunner

  • 1,204 posts
  • Location:Vienna, Austria

Posted Sun May 26, 2013 12:27 AM

Out of curiosity I tested this on an HP-35s calc and got 2:45 with 2x10^-8 'accuracy' and 12.0067 'randomness'. I can't compare with contemporary calculators as my HP-41 doesn't have a built-in random number function and my 15C (which has one) is a modern-day clone that is much faster than the original. Like Atari BASIC they use BCD (apparently with more significant digits) but I suppose the programs are closer to assembler than to BASIC.

#63 Faicuai OFFLINE  

Faicuai

    Dragonstomper

  • 701 posts
  • Location:Florida, U.S.A.

Posted Sun May 26, 2013 7:20 AM

Out of curiosity I tested this on an HP-35s calc and got 2:45 with 2x10^-8 'accuracy' and 12.0067 'randomness'. I can't compare with contemporary calculators as my HP-41 doesn't have a built-in random number function and my 15C (which has one) is a modern-day clone that is much faster than the original. Like Atari BASIC they use BCD (apparently with more significant digits) but I suppose the programs are closer to assembler than to BASIC.


You beat me to the punch... I was wondering how my 48GX (+Metakernel/Erable/Alg48) would perform. I will run this one today (or tomorrow), and check execution timings and precision.

I am preparing a special ROM-build for my Incognito (which will correct the FP-code of the included Atari 10K roms) and also compare with modified fast-FP rom for XL, and also with 48GX results...

#64 Larry OFFLINE  

Larry

    River Patroller

  • Topic Starter
  • 3,931 posts
  • Location:U.S. -- Midwest

Posted Sun May 26, 2013 10:07 AM

I wonder how Ahl might run at 14 MHz? ;)

-Larry

#65 slx OFFLINE  

slx

    Stargunner

  • 1,204 posts
  • Location:Vienna, Austria

Posted Sun May 26, 2013 1:10 PM

I am preparing a special ROM-build for my Incognito (which will correct the FP-code of the included Atari 10K roms) and also compare with modified fast-FP rom for XL, and also with 48GX results...

Hope you'll share that incognito ROM, also waiting for a board. (And if you PM your 48 code, I'd like to try it on a 48G and 49G+, I still didn't grasp enough of RPL to do it myself. Maybe I'll try in the hpmuseum forum for someone to run it on a HP-15 as that was the calc I used in senior high during my "Atari prime".)

#66 Faicuai OFFLINE  

Faicuai

    Dragonstomper

  • 701 posts
  • Location:Florida, U.S.A.

Posted Mon May 27, 2013 8:28 AM

Hope you'll share that incognito ROM, also waiting for a board. (And if you PM your 48 code, I'd like to try it on a 48G and 49G+, I still didn't grasp enough of RPL to do it myself. Maybe I'll try in the hpmuseum forum for someone to run it on a HP-15 as that was the calc I used in senior high during my "Atari prime".)


Here you go:

Even though this is code pertaining to a different platform, I will post it here (anyway) because it certainly serves us as a solid reference for evaluating speed and accuracy of Atari's own results:


HP 48GX Results Summary (code samples shown further below):
  • HW Config:
    • 128K MAIN ram, 128K RAM Card Slot 1, 1MB RAM Card Slot 2
    • HP SATURN 4bits/20bits/64bits CPU, running @ 3.9 MHz on 3x AAA 850mAh Eneloop batts (!!!)
  • SW Config: Erable 3.2, ALG48, Java, 106K free MAIN ram.
  • Final Results:
    • Test Version #1 (UserRPL code interpreter): 14.95 secs, Delta: 0.00000002 (2x10^-8, 12 digits), Random: 23.52
    • Test Version #2 (UserRPL code interpreter): 21.39 secs, Delta: 0.00000002 (2x10^-8, 12 digits), Random: 16.35.
    • Test Version #3 (UserRPL code interpreter): 44.59 secs, Delta: 0.00000000 ("infinity" precision), Random: 3.60.
    • NOTES: "Random" sum varies from as little as 1.3 to 30+. SCREEN refresh turned-off, for full CPU-processing power.
Here is the code for each Test Version:

Test Version #1 (unrolled inner-loops & optimized for on-Stack/RPL performance, noticeably faster than local variables, timed with Erable's TEVAL):

<<0 0 1 100 FOR n n

SQRT SQRT SQRT SQRT SQRT SQRT SQRT SQRT SQRT SQRT

SQ SQ SQ SQ SQ SQ SQ SQ SQ SQ + SWAP

RAND RAND RAND RAND RAND RAND RAND RAND RAND RAND

RAND RAND RAND RAND RAND RAND RAND RAND RAND RAND

+ + + + + + + + + + + + + + + + + + + + SWAP

NEXT

5 / 1010 - ABS SWAP 1000 - ABS >>


Test Version #2 (run-of-the-mill FOR-NEXT inner loops, but control variables "S" and "R" still tallied on-STACK, timed with Erable's TEVAL):

<<0 0 1 100 FOR n n

1 10 FOR i SQRT SWAP RAND + SWAP NEXT

1 10 FOR i SQ SWAP RAND + SWAP NEXT

3 ROLL + SWAP NEXT

1000 - ABS SWAP 5 / 1010 - ABS >>
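
For anyone who doesn't read RPL, the stack juggling in Test Version #2 can be followed with a small model. This is a sketch in JavaScript, not real UserRPL: the array st stands in for the calculator stack, the helper names are made up, and the random source is passed in so a run can be made repeatable.

```javascript
// Sketch: a tiny stack model of Test Version #2, written in JavaScript.
// Not real UserRPL; rand is passed in so the run can be made repeatable.
function runVersion2(rand) {
  const st = [0, 0];                        // bottom: S (sum), top: R (random tally)
  const swap = () => { const a = st.pop(), b = st.pop(); st.push(a, b); };
  const add = () => st.push(st.pop() + st.pop());
  const apply = f => st.push(f(st.pop()));
  // One pass of "f SWAP RAND + SWAP": update a, then add a random to R.
  const step = f => { apply(f); swap(); st.push(rand()); add(); swap(); };

  for (let n = 1; n <= 100; n++) {
    st.push(n);                                        // FOR n n
    for (let i = 1; i <= 10; i++) step(Math.sqrt);     // SQRT SWAP RAND + SWAP
    for (let i = 1; i <= 10; i++) step(x => x * x);    // SQ SWAP RAND + SWAP
    st.push(st.splice(st.length - 3, 1)[0]);           // 3 ROLL brings S to the top
    add();                                             // +    (S + a)
    swap();                                            // SWAP (S below R again)
  }
  const random = Math.abs(st.pop() - 1000);            // 1000 - ABS
  const accuracy = Math.abs(st.pop() / 5 - 1010);      // SWAP 5 / 1010 - ABS
  return { accuracy, random, depth: st.length };
}

const result = runVersion2(Math.random);
console.log(result);
```

Passing a fixed source such as `() => 0.5` makes the random tally come out to exactly 1000, which is a quick way to check that nothing is left behind on the stack.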


Test Version #3 (On-Stack Algebraic solve of "10th Power of 10th Root of n", unrolled inner-loops, 12-digits RAND sum, requires ALG48 "ASIM" or similar):

<<0 0 1 100 FOR n

'XROOT(10,n)^10' ASIM EVAL + SWAP

RAND RAND RAND RAND RAND RAND RAND RAND RAND RAND

RAND RAND RAND RAND RAND RAND RAND RAND RAND RAND

+ + + + + + + + + + + + + + + + + + + + SWAP

NEXT

5 / 1010 - ABS SWAP 1000 - ABS >>


As for the Incognito special ROM build, I will certainly share it. Please keep in mind that you will need a RELIABLE (notice the word) Flash/PLCC burner, so you do not get temporarily stuck (like I did 8-). It will be finished by Wed/Thursday this week. All ROMs (XL and legacy) will be updated from "binary-correct" versions of each, both RevA and RevB 10K ROMs will be updated with Newell's FP code, and the Fast-FP XL ROM will also be included (the latest version of SDX will be there, too). Hopefully, it will be a nuts-and-bolts package.

Enjoy!

#67 AtariGeezer OFFLINE  

AtariGeezer

    River Patroller

  • 2,538 posts
  • Location:Santee, CA

Posted Mon May 27, 2013 4:07 PM

As for the Incognito special ROM build, I will certainly share it. Please keep in mind that you will need a RELIABLE (notice the word) Flash/PLCC burner, so you do not get temporarily stuck (like I did 8-).


It won't be long before the Incognito Flasher is complete. I estimate in the next week that it will be done :)
Just have a couple of kinks to work out, then I'll ask the beta team to test it before a public release...

#68 bob1200xl OFFLINE  

bob1200xl

    River Patroller

  • 2,497 posts

Posted Tue May 28, 2013 1:52 PM

Probably 70 seconds on Atari BASIC.

Something like 6.25 seconds on TBXL.

Bob



I wonder how Ahl might run at 14 MHz? ;)

-Larry



#69 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Thu May 30, 2013 12:40 AM

Adding to an old thread, since it came up in the TI group... :) Testing against the TI using Classic99 emulator (which is close but not 100%). Timed with clock device, time ends before results are printed. Porting to TI Extended BASIC required using '::' as the multiple statement operator and removing the argument from RND. Porting to TI BASIC further required breaking up the multiple statement lines, as multiple statements are not supported there.

Texas Instruments 99/4A * (3MHZ TMS9900)

TI BASIC
Accuracy: 0.00000011
Random: 1.928494715
Time: 4m31s

TI Extended BASIC (with 32k memory)
Accuracy: 0.00000011
Random: 6.27999115
Time: 3m55s

The random factor varies a fair bit depending on the random seed. Both of those values are on the low end.

Well, the TI-99/4 beats the Sinclair Spectrum and standard Atari BASIC.
It's even close to the speed of Atari BASIC using the improved math pack.
However, it's not even 1/3 the speed of the faster machines.


I never did run the benchmark on the JR-200 or NEC TREK and my Thomson TO-8 needs repairs before I could test it.
I would expect the JR-200 to do well. The TREK would probably do ok in text mode but in hi-res graphics modes it would probably slow down.
FWIW, in the benchmarks I ran on the Tandy MC-10 it was faster than the C64 but I didn't run this benchmark.

#70 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Fri May 31, 2013 3:32 PM

For reference: The following is (still) with the Atari 8K basic, but with the (improved) Mathpack that comes with Os++:

Accuracy = 1.18E-3 (that is, better by a factor of ten; note that "accuracy" should read "error")
Random = 4.985358 (on this run)
Time = 230secs (measured on the machine, not with a stop-watch, i.e. by counting cycles which should be precise)

Thus, an only slightly more careful implementation is an order of magnitude more accurate, and faster by a factor of 2 here. Unfortunately, there is not enough ROM space to unroll the loop as TBXL did, so there is no chance to improve this much further.

I wonder what that would do for BASIC XL. 220 secs or less I would guess. (I have a BASIC XL cart).

#71 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Sat Jun 1, 2013 3:02 PM

I hand timed an MC-10 emulator. The emulator clearly isn't cycle accurate or my timing is off. Probably both but it seems about right vs the C64 times given other benchmarks I've seen.
Time: 120-123 secs (timed multiple times)
Accuracy: 5.96284867E-04
Random: Yup, it's random

#72 nanochess OFFLINE  

nanochess

    River Patroller

  • 4,974 posts
  • Coding something good
  • Location:Mexico City

Posted Sat Jun 1, 2013 3:33 PM

I translated Ahl's benchmark to Javascript just out of curiosity.

<script>
// Ahl's simple benchmark in Javascript
var a, n, i, r, s, t, m;
t = new Date();   // start time
m = 0;            // number of completed runs
do {
  r = 0;
  s = 0;
  for (n = 1; n <= 100; n++) {
    a = n;
    // take the square root ten times...
    for (i = 1; i <= 10; i++) {
      a = Math.sqrt(a);
      r = r + Math.random();
    }
    // ...then square ten times to get back to n
    for (i = 1; i <= 10; i++) {
      a = Math.pow(a, 2);
      r = r + Math.random();
    }
    s = s + a;
  }
  m++;
} while (new Date() - t < 1000);   // repeat until one second has passed
document.writeln("Times " + m);
document.writeln("Accuracy " + Math.abs(1010 - s / 5));
document.writeln("Random " + Math.abs(1000 - r));
</script>
Put it inside a file named benchmark.html and run it with your web browser.

As modern machines are too fast, the test runs until a whole second has passed and it shows how many times it ran.

In case anyone hasn't noticed, the accuracy value shows how many digits of the floating point core are exact, so it should be a very low value. (The trick is that it takes a value, takes the square root ten times and then squares it ten times, so the value should theoretically come back the same, but can't because of the floating point implementation.)
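
To see how much the number format matters, the same round trip can be run at two precisions. A minimal sketch, assuming Math.fround (which rounds to IEEE single precision) is a fair stand-in for "a shorter float format" in general. It does not imitate the math routines of any particular BASIC, it squares with a*a rather than Math.pow, and it leaves out the random tally. The name ahlAccuracy is made up for illustration.

```javascript
// Sketch: the square-root/square round trip at two precisions.
// Math.fround rounds to IEEE single precision; this imitates a narrower
// float format in general, not the math pack of any particular machine.
function ahlAccuracy(round) {
  let s = 0;
  for (let n = 1; n <= 100; n++) {
    let a = n;
    for (let i = 0; i < 10; i++) a = round(Math.sqrt(a)); // ten square roots
    for (let i = 0; i < 10; i++) a = round(a * a);        // ten squarings
    s = round(s + a);
  }
  return Math.abs(1010 - s / 5);
}

const doubleError = ahlAccuracy(x => x);        // native double precision
const singleError = ahlAccuracy(Math.fround);   // rounded to single at each step
console.log("double:", doubleError, "single:", singleError);
```

The single precision figure should come out many orders of magnitude worse than the double precision one, which is the same effect that separates the machines in this thread.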

Edited by nanochess, Sat Jun 1, 2013 3:34 PM.


#73 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Sat Jun 1, 2013 7:05 PM

I think it's safe to say that just about any modern computer (including many cell phones) could probably run this faster than a single VBLANK interrupt could be used to time.
When you switch to other languages it probably just makes matters worse.

#74 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Sat Jun 1, 2013 10:13 PM

I just timed the NEC Trek/PC-6001 on the Virtual NEC Trek emulator by hand.
I don't know how accurate the emulator timing is but it's pretty interesting.
I knew the video hardware wasn't isolated from the CPU and added wait states to the CPU, but I didn't know it was this bad.

BTW, I mentioned in another post that hi-res mode would be worse than text but after some thought I realize the VDG has to read the same number of bytes from RAM in text mode as hi-res. The VDG doesn't remember what characters were read from one scan line to the next and there are the same number of horizontal bytes and vertical rows in both modes. The modes that allowed more colors on screen by switching palettes or modes would require more reads from RAM though and slow this down further.

NEC Trek/PC-6001 (4MHz PD780C-1 CPU, a Z80 Clone)
Time: 496 secs (8:16)
Accuracy: 4.78744507E-04

If that's correct, we have a new bottom of the heap by 46 secs.
I don't have an Extended BASIC ROM image to see if it alters the benchmark.

#75 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,859 posts
  • Location:Flyover State

Posted Sat Jun 1, 2013 11:03 PM

The MSX Turbo R
Time: 30 secs (Hand timed)
Accuracy: 2.058E-07

That's slightly faster than the Apple IIc+
That only leaves the Acorn machines as potentially faster but I'd like to make sure the square function was used and not A*A before I hand Acorn the speed title for standard speed machines.



