Jump to content
IGNORED

BASIC Interpreter Speed Comparison: Prime Number Generation


Darklantha

Recommended Posts

I checked the disassembly of the Plus/4 ROM. It's pretty much as I thought.
Every byte of the program BASIC reads requires executing 4 extra instructions.
It's an extra 10 clock cycles per byte.
But the code is in RAM so it should be patchable without a new ROM.
Someone supposedly did it at one point but nobody seems to know where the program is

Link to comment
Share on other sites

not to use the poke in the program itself... only in writing a program to cram 4 or so characters in... then save it... no poke within the actual lines themselves... and only when you are really pressed to get the extra characters in a basic line....

Edited by _The Doctor__
  • Like 1
Link to comment
Share on other sites

not to use the poke in the program itself... only in writing a program to cram 4 or so characters in... then save it... no poke within the actual lines themselves... and only when you are really pressed to get the extra characters in a basic line....

I've written exactly one program in assembly for the Atari before this, and maybe one BASIC program, so the Atari port is a strait copy of the plainest code combined with google.

Someone that knows the Atari would probably do better than I can. FWIW, I don't think 4 characters would matter on any version I've written.

Link to comment
Share on other sites

Updated Atari code and numbers.
Atari BASIC
Limit 250 = 2.88333333

Limit 10000 = 125.916667
Altirra
Limit 250 = 1.08333333

Limit 10000 = 45.2666666

20 K=0:I=0:T=0:P=0
50 PRINT CHR$(125)
100 PRINT "Prime Number Generator"
110 PRINT "Upper Limit";:INPUT N
120 POKE 19,0:POKE 20,0
130 T = (N-3)/2
140 DIM A(T+1)

160 FOR I=0 TO T:A(I)=1:NEXT I
200 FOR I=0 TO T:IF A(I)=0 THEN PRINT"..";: NEXT I:GOTO330
240 P=I+I+3:PRINT P;".";:K=I+P
250 IF K<=T THEN FOR K=K TO T STEP P:A(K)=0:NEXT K:NEXT I:GOTO330
256 NEXT I

330 ETIME=(PEEK(19)*256+PEEK(20))/60
340 PRINT
350 PRINT "Total: ";ETIME
360 END
Edited by JamesD
Link to comment
Share on other sites

I was curious, so I wrote the lines on my "real" ZX Spectrum+ BASIC and the Acorn Electron BBC BASIC.

ZX Spectrum+ (48k)

10 CLS : CLEAR
20 LET K=0 : LET I=0 : LET T=0 : LET P=0
30 PRINT "Prime Number Generator"
40 INPUT "Upper Limit :";N
50 LET T=INT ((N-3)/2)
60 DIM A(T+1)
70 FOR I=1 TO T: LET A(I)=1: NEXT I
80 FOR I=1 TO T
90 IF A(I)=1 THEN GO TO 110
100 PRINT "..";: GO TO 140
110 LET P=I+I+3: PRINT P;".";:LET K=I+P
120 IF K>T THEN GO TO 140
130 FOR K=K TO T STEP P: LET A(K)=0: NEXT K
140 NEXT I

Note: cannot assign A(0). Must start with I=1
Couldn't find a timer function. Did it manually.
Limit 250: 5,5 - 5,6
Limit 10000: 262,2 (text scrolling is quite slow)

 

 

 

ACORN Electron with BBC BASIC 2 (32K)

10 CLS: CLEAR
20 K=0:I=0:T=0:P=0
100 PRINT "Prime Number Generator"
110 INPUT "Upper Limit";N
120 ETIME=TIME
130 T=(N-3)/2
140 DIM A(T+1)
160 FOR I=0 TO T:A(I)=1:NEXT
200 FOR I=0 TO T:IF A(I) THEN P=I+I+3:PRINT;P;".";:K=I+P:ELSE PRINT"..";:NEXT:GOTO 330
250 IF K<=T FOR K=K TO T STEP P:A(K)=0:NEXT
260 NEXT
330 ETIME=(TIME-ETIME)/60
340 PRINT:PRINT"Total: ";ETIME
360 END

Limit 250: 3,38333
cannot run with limit 10000. Not enough DIM space
Limit 8200: 107,9333

 

 

 

 

 

 

Link to comment
Share on other sites

What configuration are you using to get those numbers? With an NTSC computer, XL/XE OS version 2, and Atari BASIC rev. C, I'm getting about 4.17 seconds on both real hardware and emulation, not 2.88.

Altirra 2.71

I have drive accelleration enabled but nothing else speed wise.

Attach Cartridge: Basic (Rev. C).bin

6502 CPU

Speed options Match Hardware

Hardware is set to 600XL...

Memory Config Rambo

Video NTSC No Artifacting Standard video

 

Unless the system is somehow using a fast math ROM, I don't see anything that would impact speed.

And I don't even see a setting for that.

Edited by JamesD
Link to comment
Share on other sites

turbo basic or fastbasic.. :)

All I can think of is something is that the BASIC ROM is a patched version (I downloaded it just for the tests) or the math ROM isn't standard and I see no setting for that.

 

 

*edit*

I may have found it. Even though I wasn't using Altirra BASIC, the default system ROM may be the problem.

Nothing I select gives the numbers that were returned on real hardware, but I was able to get higher numbers than for that run.

I'll have to go through all the ROM setting and see if I can get something that isn't accelerated

Edited by JamesD
Link to comment
Share on other sites

I was curious, so I wrote the lines on my "real" ZX Spectrum+ BASIC and the Acorn Electron BBC BASIC.

 

ZX Spectrum+ (48k)

10 CLS : CLEAR
20 LET K=0 : LET I=0 : LET T=0 : LET P=0
30 PRINT "Prime Number Generator"
40 INPUT "Upper Limit :";N
50 LET T=INT ((N-3)/2)
60 DIM A(T+1)
70 FOR I=1 TO T: LET A(I)=1: NEXT I
80 FOR I=1 TO T
90 IF A(I)=1 THEN GO TO 110
100 PRINT "..";: GO TO 140
110 LET P=I+I+3: PRINT P;".";:LET K=I+P
120 IF K>T THEN GO TO 140
130 FOR K=K TO T STEP P: LET A(K)=0: NEXT K
140 NEXT I

Note: cannot assign A(0). Must start with I=1

Couldn't find a timer function. Did it manually.

Limit 250: 5,5 - 5,6

Limit 10000: 262,2 (text scrolling is quite slow)

 

 

 

ACORN Electron with BBC BASIC 2 (32K)

10 CLS: CLEAR
20 K=0:I=0:T=0:P=0
100 PRINT "Prime Number Generator"
110 INPUT "Upper Limit";N
120 ETIME=TIME
130 T=(N-3)/2
140 DIM A(T+1)
160 FOR I=0 TO T:A(I)=1:NEXT
200 FOR I=0 TO T:IF A(I) THEN P=I+I+3:PRINT;P;".";:K=I+P:ELSE PRINT"..";:NEXT:GOTO 330
250 IF K<=T FOR K=K TO T STEP P:A(K)=0:NEXT
260 NEXT
330 ETIME=(TIME-ETIME)/60
340 PRINT:PRINT"Total: ";ETIME
360 END

Limit 250: 3,38333

cannot run with limit 10000. Not enough DIM space

Limit 8200: 107,9333

The Spectrum is kinda what I expected, but I thought the BBC would be faster.

8200 Has the modified MC-10 turning in a time of about 90, but given some how some other numbers have been off... who knows?

But still... completely different emulators.

*edit*

Try switching the logic so you are testing for A(I)=0 as the default (not A(i)?) on line 200.

You want the most executed case to be in the THEN. Otherwise it has to scan until ELSE which means reading more characters.

It won't be a massive difference, but it should save a few seconds on the large limit.

Edited by JamesD
Link to comment
Share on other sites

All I can think of is something is that the BASIC ROM is a patched version (I downloaded it just for the tests) or the math ROM isn't standard and I see no setting for that.

 

 

*edit*

I may have found it. Even though I wasn't using Altirra BASIC, the default system ROM may be the problem.

Nothing I select gives the numbers that were returned on real hardware, but I was able to get higher numbers than for that run.

I'll have to go through all the ROM setting and see if I can get something that isn't accelerated

 

It sounds like you have the high-level emulation (HLE) OS active. This is the one you got in older versions of Altirra by default, if you hadn't set up ROM images. Newer versions (2.90+) have removed the HLE OS and default to AltirraOS, which is written in pure 6502 and runs the exact same way it would on hardware. However, it still has a faster math pack and screen editor than the Atari OS, so it'll still skew your timings -- though they're still 'real' in that you could get those times on hardware with it. The benchmark spends some 10-30% of the time printing to the screen with bigger limits, so the speed of the E: driver matters. AltirraOS + Altirra BASIC runs 250 in 1.53 seconds and 10K in 67.5s.

 

The other option to watch out for is System / Acceleration / Fast Math... but that's pretty obvious and not on by default.

 

Once you have matching ROMs between the emulator and the hardware, the timings should match too.

  • Like 1
Link to comment
Share on other sites

*edit*

Try switching the logic so you are testing for A(I)=0 as the default (not A(i)?) on line 200.

You want the most executed case to be in the THEN. Otherwise it has to scan until ELSE which means reading more characters.

It won't be a massive difference, but it should save a few seconds on the large limit.

You mean A(I)=1 not 0

 

With A(I)=1 it is a bit slower with 250. 3,5 instead of 3,38333

With the large limit: 111,85 instead of 107,933

 

These changes are more efficient on the ACORN Electron:

10 CLS: CLEAR
20 K%=0:I%=0:T%=0:P%=0
100 PRINT "Prime Number Generator"
110 INPUT "Upper Limit";N%
120 ETIME=TIME
130 T%=INT((N%-3)/2)
140 DIM A%(T%+1)
160 FOR I%=0 TO T%:A%(I%)=1:NEXT
200 FOR I%=0 TO T%:IF A%(I%) THEN P%=I%+I%+3:PRINT;P%;".";:K%=I%+P%:ELSE PRINT"..";:NEXT:GOTO 330
250 IF K%<=T% FOR K%=K% TO T% STEP P%:A%(K%)=0:NEXT
260 NEXT
330 ETIME=(TIME-ETIME)/60
340 PRINT:PRINT"Total: ";ETIME
360 END

Limit 250: 2,43

Limit 8200: 74,16

Edited by tecci06
Link to comment
Share on other sites

You mean A(I)=1 not 0

 

Give this change a try, it's faster on my machine, but my system is giving different numbers on your code so there may be a problem with my computer.

This change may impact other computers as well.

 

Part of the speed difference is removal of the spaces, and removal of THEN in line 200, but this makes sure the print for non-primes is the default since it is the most executed.

Reversing the logic in the A%() array still allows the faster logical test to be used.

 

140 DIMA%(T%+1)
160 FORI%=0TOT%:A%(I%)=0:NEXT
200 FORI%=0TOT%:IFA%(I%)PRINT"..";:NEXT:GOTO330
250 P%=I%+I%+3:PRINT;P%;".";:K%=I%+P%:IFK%<=T%FORK%=K%TOT%STEPP%:A%(K%)=1:NEXT
260 NEXT

Edited by JamesD
Link to comment
Share on other sites

This version runs in standard Microcolor BASIC (MC-10) and seems to be faster than the old code.
8200 is 84 seconds in Microcolor BASIC
8200 is 96 seconds in my "optimized" BASIC. Yes, it's slower. Supporting ELSE had a cost. It doesn't show up with heavy use of floating point because of the huge speed difference there.
250 is well under 3 seconds in Microcolor BASIC
10000 is 103 in Microcolor BASIC. I think the previous number 100 was supposed to be 110, and I just forgot where I started timing. I wasn't using the stopwatch on my phone.

20 K=0:I=0:T=0:P=0
30 CLS
100  PRINT "Prime Number Generator"
110  INPUT "Upper Limit";N
130  T=(N-3)/2
140  DIMA(T+1)
160 FORI=0TOT:A(I)=0:NEXT
200 FORI=0TOT:IFA(I)THENPRINT"..";:NEXT:GOTO340
220 P=I+I+3:PRINTP;".";:K=I+P:IFK<=T THENFORK=KTOTSTEPP:A(K)=1:NEXT:NEXT:GOTO340
320 NEXT
340  PRINT
350  PRINT "Done: "
360 END

This is faster than the last CoCo version.

I made the PCLEAR 1 part of the program instead of having to type it before running the code.
Program and hand timed are basically the same.
10000 is 61.5333333 seconds.
8200 is 50.45 seconds.

0 POKE65495,0:PCLEAR1
20 K=0:I=0:T=0:P=0
30 CLS
100  PRINT "Prime Number Generator"
110  INPUT "Upper Limit";N
120 TIMER=0
130  T=(N-3)/2
140  DIMA(T+1)
160 FORI=0TOT:A(I)=0:NEXT
200 FORI=0TOT:IFA(I)THENPRINT"..";:NEXT:GOTO330
220 P=I+I+3:PRINTP;".";:K=I+P:IFK<=T THENFOR K=K TOT STEPP:A(K)=1:NEXT:NEXT:GOTO330
320 NEXT
330 ETIME=TIMER/60
340  PRINT
350  PRINT "Done: ";ETIME
360 END

Link to comment
Share on other sites

are these all going to standardize across the machines or will the range and limits be different for everything, what use is timing it if were doing different limits... Have we left the path?

Well, 250 worked with the original code which was slow, but how do you tell the difference between 1.9 and 2.2 seconds with the latest code using hand timing?

I tried using 10000 but the BBC Micro and TI-99/4A can't use that large of a number, and I'm sure there will be something else that will have issues with it.

tecci06 used 8200 with the BBC, but the TI-99/4A seems to only handle 6000.

And what about the inevitable system than can't handle 6000?

 

We have a general idea of what machines are faster than others now, but honestly, we are still optimizing the code and we may need to revisit all the times anyway.

Should we have multiple limits, and fill in NA for larger limits that machines can't handle?

What limits shall we use?

The faster machines are finishing in under 10 seconds using 1000, and hand timing still isn't very accurate.

2500? 5000? 7500? 10000?

 

Link to comment
Share on other sites

New VZ code. Definitely faster than the old code.

10 10 DEFINT K,I,T,P,A
20 K=0:I=0:T=0:P=0
30 CLS
100  PRINT "Prime Number Generator"
110  INPUT "Upper Limit";N
130  T=(N-3)/2
140  DIMA(T+1)

150 FORI=0TOT:A(I)=0:NEXT
200 FORI=0TOT:IFA(I)PRINT"..";:NEXT:GOTO340
210P=I+I+3:PRINTP;".";:K=I+P:IFK<=TFORK=KTOTSTEPP:A(K)=1:NEXT
260 NEXT

340  PRINT
350  PRINT "DONE"
360 END

Link to comment
Share on other sites

A faster Plus/4 version

20 K=0:I=0:T=0:P=0
30 SCNCLR
100  PRINT "Prime Number Generator"
110  INPUT "Upper Limit";N
120  eTime=TIME
130  T = (N-3)/2
140  DIM A(T+1)

160 FORI=0TOT:A(I)=0:NEXT
200 FORI=0TOT:IFA(I)THENPRINT"..";:NEXT:GOTO330
210P=I+I+3:PRINTP;".";:K=I+P:IFK<=TTHENFORK=KTOTSTEPP:A(K)=1:NEXT:NEXT:GOTO330
260 NEXT

330  eTime=(TIME-eTime)/60
340  PRINT
350  PRINT "Total: ";eTime
360 END

Link to comment
Share on other sites

A few things to remember.
A lot of these optimizations won't matter with a compiler.
Some BASICs print extra spaces around numbers and others don't, so that means some machines are doing more work than others.
Screen sizes are different. Screens with more characters take longer to scroll, and screens with fewer characters take less time to scroll but have to scroll more times.

Link to comment
Share on other sites

This will speed up the Plus/4 version a bit. It disables the screen.
I don't think the 2nd line is required because PRINT seems to turn it back on, but I included it just in case.
The screen is blanked while initializing the A array though.

115 POKE65286,PEEK(65286)AND239
335 POKE65286,PEEK(65286)OR16
Link to comment
Share on other sites

The Plus/4's slow execution is offset by the size of an array it can handle.
It's max upper limit is 24000. Almost 3 times what the BBC Micro can handle, and 4 times the TI.
I haven't check the MC-10 and CoCo, but I wouldn't be surprised if it's double them.
It takes 236.816667 seconds with the screen blanking code, and 355.4 seconds without it.
The largest prime found was 23981.

Link to comment
Share on other sites

 

 

Give this change a try, it's faster on my machine, but my system is giving different numbers on your code so there may be a problem with my computer.

This change may impact other computers as well.

 

Well, with A%(I%)=0 - only "dots" are printed. No prime numbers.

If I use A%(I%)=1 see post #88. So i did leave it with IF A%(I%) THEN P%=I%+I%+3:....

 

I get a bit more speed after removing all spaces and disabling the screen during the output.

 

Limit 250: 2,266

Limit 8200: 67,13

 

BTW: It is not a BBC Micro, it is an Arcorn Electron.

 

From the Wiki: Clock rate: variable. CPU runs at 2 MHz when accessing ROM and 1 MHz or 0.5897 MHz when accessing RAM (depending on graphics mode) due to sharing memory access with the video display circuits. The Electron is widely misquoted as operating at 1.79 MHz after measurements derived from speed testing against the thoroughly 2 MHz BBC Micro...

Edited by tecci06
Link to comment
Share on other sites

Well, with A%(I%)=0 - only "dots" are printed. No prime numbers.

If I use A%(I%)=1 see post #88. So i did leave it with IF A%(I%) THEN P%=I%+I%+3:....

That's not the same as the code changes I posted, which were tested on MAME using the BBC Micro Emulation.

Trust me, it's faster and the same change was made to every updated version that followed.

 

I get a bit more speed after removing all spaces and disabling the screen during the output.

 

Limit 250: 2,266

Limit 8200: 67,13

We just have to be careful to test it with and without the screen turned off.

That would work for some applications, but not others.

If this were some program you'd start and walk away for several hours, it wouldn't matter, but you have to be able to read the screen if there is an error.

 

ON ERROR GOTO would be quite useful if you were using this in an application.

 

 

BTW: It is not a BBC Micro, it is an Arcorn Electron.

 

From the Wiki: Clock rate: variable. CPU runs at 2 MHz when accessing ROM and 1 MHz or 0.5897 MHz when accessing RAM (depending on graphics mode) due to sharing memory access with the video display circuits. The Electron is widely misquoted as operating at 1.79 MHz after measurements derived from speed testing against the thoroughly 2 MHz BBC Micro...

That would explain the difference in speed I saw under emulation. I saw BBC BASIC and missed the electron part, my bad.

I can't test the code on the MAME Electron emulation without completely retyping the code by hand because someone decided not to handle upper and lower case the same as the BBC emulation.

The BBC Micro can handle 10000 for a limit, and turns in a time of 62.0666667 seconds.

A limit of 8200 results in 51.0333333

A limit of 250 results in 1.7.

12000 seems to be that largest limit the BBC Micro can handle

 

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...