+Lee Stewart Posted July 27, 2017 Share Posted July 27, 2017 Lee, I can't figure out how you got FB-Forth to go faster than Turbo Forth in the optimized version. Can you double check it when you have nothing better to do? I am sure the reason is that TurboForth's V! calls vsbw in TurboForth, whereas my port of the same code has all of the relevant ALC within V! . The “bl @vsbw” and return would do it, I think. ...lee 1 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted August 13, 2017 Share Posted August 13, 2017 I thought I would expand this conversation by using a classic benchmark, the Sieve of Erathosenes. I know it's an old chestnut, but it is commonly used to see how a language does a few things. I tried it first with TI-BASIC. I used the MSX version from Rosetta code and fixed a few things so it would run on the TI-99 100 REM Eratosthenes Sieve 110 REM for TI BASIC 120 REM ****************** 130 REM MSX BRRJPA 140 INPUT "Search until: ":L 150 DIM P(1500) 160 FOR N=2 TO SQR(L+1000) 170 IF P(N)<>0 THEN 210 180 FOR K=N*N TO L STEP N 190 LET P(K)=1 200 NEXT K 210 NEXT N 220 FOR N=2 TO L 230 IF P(N)<>0 THEN 250 240 PRINT N; 250 NEXT N I use 1000 for the input value because it takes along time. The big shock for me was that XB was slower than TI-BASIC. Then of course I did a version for Forth and tested it on Turbo Forth and Camel99 a couple of ways. Here is that code which is also from Rosetta Code but slightly modified by me. I don't agree with the way they did the looping in the WHILE statement. It's a slow thing to have multiplication to calculate a loop I will see if I can make it more like the C version... \ Sieve of Erathosenes in Forth \ Tested with TurboForth and CAMEL99 Forth \ array calculators : []@ ( n addr -- ? ) + C@ ; : []! ( n addr -- ) + C! ; : ERASE ( addr n -- ) 0 FILL ; HEX : FREEMEM ( -- n) FF00 HERE - ; : ?MEM ( n -- ) FREEMEM OVER < ABORT" Out of memory" ; \ byte array uses unallocated memory at 'HERE' DECIMAL : PRIMES ( n -- ) ?MEM CR ." Running..." HERE OVER ERASE 1 0 HERE []! \ mark as prime like 'C' version 1 1 HERE []! 2 \ start at 2 BEGIN 2DUP DUP * > WHILE DUP HERE []@ 0= IF 2DUP DUP * DO 1 I HERE []! DUP +LOOP THEN 1+ REPEAT CR ." Complete." CR DROP ." Primes: " 2 DO I HERE []@ 0= IF I . THEN LOOP ; And here are the timings. BASIC can't allocate enough memory for more than about 1500 elements because of the big floating point numbers so I limited it to 1000. 1000 primes X BASIC 14.3 secs TI BASIC 11.9 10,000 primes TF 8.0 CAMEL99 ITC 10.28 CAMEL99 DTC 7.25 I would be interested in seeing the GCC results using the Rosetta code version or something similar. #include <stdlib.h> #include <math.h> char* eratosthenes(int n, int *c) { char* sieve; int i, j, m; if(n < 2) return NULL; *c = n-1; /* primes count */ m = (int) sqrt((double) n); /* calloc initializes to zero */ sieve = calloc(n+1,sizeof(char)); sieve[0] = 1; sieve[1] = 1; for(i = 2; i <= m; i++) if(!sieve[i]) for (j = i*i; j <= n; j += i) if(!sieve[j]){ sieve[j] = 1; --(*c); } return sieve; } Quote Link to comment Share on other sites More sharing options...
RXB Posted August 13, 2017 Share Posted August 13, 2017 (edited) I use 1000 for the input value because it takes along time. The big shock for me was that XB was slower than TI-BASIC. I side by side compared RXB and TI Basic and RXB beat TI Basic easy. Edited August 13, 2017 by RXB Quote Link to comment Share on other sites More sharing options...
+TheBF Posted August 13, 2017 Share Posted August 13, 2017 I have no doubt about it Rich. Do you have an approximate number of secs? I did not make it clear, but I only time the loop that does the calculation. The print out of the results I ignored. I also did some testing and realized the Forth example is pretty silly in one other way. The HERE word reads an address from a variable and is used as the base address of the data array. So that's not at all like using a fixed memory address for the data array. When I replaced a Forth version of HERE with an Assembly language version of HERE the ITC benchmark improved 20%! "HERE" is a convenient memory location for throw away calculations where you don't need to create a variable, but not appropriate for a speed benchmark IMHO. So I will change that and republish the results for all the Forth examples. Quote Link to comment Share on other sites More sharing options...
+TheBF Posted August 13, 2017 Share Posted August 13, 2017 (edited) So I changed the benchmark to use a VALUE to hold the array address. It improved TF and CAMEL99 ITC version, but it actually slowed down the directed threaded version a little. The DTC version has HERE written in Assembler so this only proves that my code for CONSTANT is slower than fetching the contents of a variable in ASM. Anyway here is result. \ Sieve of Erathosenes in Forth using a VALUE as array pointer \ Tested with TurboForth and CAMEL99 Forth 0 VALUE SIEVE[] \ holds array address \ array calculators : SIEVE[]@ ( n addr -- ? ) SIEVE[] + C@ ; : SIEVE[]! ( n addr -- ) SIEVE[] + C! ; : ERASE ( addr n -- ) 0 FILL ; HEX : FREEMEM ( -- n) FF00 HERE - ; : ?MEM ( n -- n ) FREEMEM OVER < ABORT" Out of memory" ; \ byte array uses unallocated memory at HERE DECIMAL : PRIMES ( n -- ) ?MEM HERE TO SIEVE[] \ assign address to the array CR ." Running..." SIEVE[] OVER ERASE 1 0 SIEVE[]! \ mark as prime like 'C' version 1 1 SIEVE[]! 2 \ start at 2 BEGIN 2DUP DUP * > WHILE DUP SIEVE[]@ 0= IF 2DUP DUP * DO 1 I SIEVE[]! DUP +LOOP THEN 1+ REPEAT CR ." Complete." CR DROP ." Primes: " 2 DO I SIEVE[]@ 0= IF I . THEN LOOP ; 10,000 primes W/HERE W/VALUE TF 8.0 7.7 CAMEL99 ITC 10.28 8.4 CAMEL99 DTC 7.25 7.5 Edited August 14, 2017 by TheBF Quote Link to comment Share on other sites More sharing options...
+Lee Stewart Posted August 13, 2017 Share Posted August 13, 2017 Cut-and-paste did this, I’ll bet: : SIEVE[]@ ( n addr -- ? ) SIEVE[] + C@ ; should be : SIEVE[]@ ( addr -- ? ) SIEVE[] + C@ ; ...lee Quote Link to comment Share on other sites More sharing options...
+TheBF Posted August 13, 2017 Share Posted August 13, 2017 Yup! And in fact in should be : SIEVE[]@ ( n -- ? ) SIEVE[] + C@ ; I merged the address into these new definitions to make it read nicer. B Quote Link to comment Share on other sites More sharing options...
+Lee Stewart Posted August 13, 2017 Share Posted August 13, 2017 I would say that the stack effects for ?MEM should be “( n -- n )” because n survives execution. Also, why did you define SQRT ? You don’t use it. ...lee Quote Link to comment Share on other sites More sharing options...
+TheBF Posted August 13, 2017 Share Posted August 13, 2017 I was originally working replicating the C version A forgot to remove SQRT from this file. Had to go socialize. Wife's orders B 2 Quote Link to comment Share on other sites More sharing options...
Willsy Posted August 14, 2017 Share Posted August 14, 2017 Nice! Does the sieve just produce primes? If so there's a primes finder on the TF disk. Just type DEMOS after it boots and it'll tell you which block it's on. Run it with 1000 PRIMES and it should do its stuff. It can find primes to 1000 in around 2.5 seconds if I remember correctly. If I remember correctly I adapted it from a Brodie program in Starting Forth. It was one of the first programs TF ever ran :-) Quote Link to comment Share on other sites More sharing options...
+TheBF Posted August 14, 2017 Share Posted August 14, 2017 Nice! Does the sieve just produce primes? If so there's a primes finder on the TF disk. Just type DEMOS after it boots and it'll tell you which block it's on. Run it with 1000 PRIMES and it should do its stuff. It can find primes to 1000 in around 2.5 seconds if I remember correctly. If I remember correctly I adapted it from a Brodie program in Starting Forth. It was one of the first programs TF ever ran :-) Yes this one just finds PRIME numbers. It look like it was done by Anton, but I can't be sure. So if we scale the Starting Forth version to 10000 it would take 25 seconds, so this one is quite a bit faster. With this version in TF it should search to over 20,000 because it uses all available memory in the expansion card. Quote Link to comment Share on other sites More sharing options...
apersson850 Posted August 23, 2017 Share Posted August 23, 2017 (edited) I entered the Pascal code as it is on the Rosetta page, using a set to keep track of the primes. 1000 primes X BASIC 14.3 secs TI BASIC 11.9 Pascal 2.869 s 10,000 primes TF 8.0 CAMEL99 ITC 10.28 CAMEL99 DTC 7.25 It seems the sets aren't too efficiently implemented in Pascal. Probably because they are represented by a bit in a packed field, so there's a lot of packing and unpacking. I changed from using a set to an array of booleans instead. That reduced the time from 10.558 s to 4.663 s.Then I suspected some time could be saved in setting up the array, so I used the faster system intrinsic moveleft to do that. Now the time is 2.869 seconds.With this solution, a boolean bit is stored as an integer, so 1000 elements require 2000 bytes. Created at Aug 23, Wed, 21:50.27 RAMDISK:BENCH8.TEXT program sieve; uses support, realtime; const primelimit = 1000; var primes: array[1..primelimit] of boolean; n,k: integer; needcomma: boolean; timer: timerid; elapsed: ttime; ch: char; begin writeln('Sieve starts running'); tmrnew(timer); tmrreset(timer); tmrstart(timer); (* Calcuate the primes *) primes[1] := false; primes[2] := true; moveleft(primes[2],primes[3],primelimit-4); for n := 1 to trunc(sqrt(primelimit)) do begin if primes[n] then begin k := n*n; while k<primelimit do begin primes[k] := false; k := k+n; end; end; end; tmrstop(timer); tmrread(timer,elapsed); (* Output the primes *) needcomma := false; for n := 1 to primelimit do if primes[n] then begin if needcomma then write(', '); write(n); needcomma := true; end; writeln; with elapsed do begin write('Time ',hour,':'); if minute<10 then write('0'); write(minute,':'); if second<10 then write('0'); write(second,','); if fract<100 then write('0'); if fract<10 then write('0'); writeln(fract); end; read(ch); end. Edited August 24, 2017 by apersson850 1 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted August 23, 2017 Share Posted August 23, 2017 That's great optimization. Thanks telling us about it. Quote Link to comment Share on other sites More sharing options...
senior_falcon Posted August 24, 2017 Share Posted August 24, 2017 1000 primes X BASIC 14.3 secs TI BASIC 11.9 10,000 primes TF 8.0 CAMEL99 ITC 10.28 CAMEL99 DTC 7.25 After compiling the BASIC program in post 127 I got the following results: For 1000 primes, compiled XB takes about 1 second. For 8,500 primes it takes 10 seconds. So the speeds are similar but somewhat slower. Over the winter I will do some rewriting on this so it can load a larger program. Quote Link to comment Share on other sites More sharing options...
RXB Posted August 24, 2017 Share Posted August 24, 2017 Pretty sure XB has more accurate Primes to 10 decimal places then TI Basic does, they did put a ton of Floating Point into XB for more accuracy over TI Basic. Quote Link to comment Share on other sites More sharing options...
apersson850 Posted August 24, 2017 Share Posted August 24, 2017 Hahaha, that was a good one! 1 Quote Link to comment Share on other sites More sharing options...
sometimes99er Posted August 24, 2017 Share Posted August 24, 2017 It can find primes to 1000 in around 2.5 seconds if I remember correctly. So if we scale the Starting Forth version to 10000 it would take 25 seconds, so this one is quite a bit faster. So you just multiplied both values with ten ? Quote Link to comment Share on other sites More sharing options...
sometimes99er Posted August 24, 2017 Share Posted August 24, 2017 Pretty sure XB has more accurate Primes to 10 decimal places then TI Basic does, they did put a ton of Floating Point into XB for more accuracy over TI Basic. If I may say so, - "more accurate Primes" is actually rather important. 1 Quote Link to comment Share on other sites More sharing options...
Willsy Posted August 24, 2017 Share Posted August 24, 2017 (edited) If I may say so, - "more accurate Primes" is actually rather important. Boo! You're just being prejudiced against inaccurate primes. WHAT DO WE WANT? EQUAL RIGHTS FOR INACCURATE PRIMES WHEN DO WE WANT IT? SOME TIME IN THE NOT-TOO-DISTANT FUTURE PLUS OR MINUS AN INDETERMINATE AND VARIABLE AMOUNT OF TIME! :-) Edited August 27, 2017 by Willsy 1 1 Quote Link to comment Share on other sites More sharing options...
apersson850 Posted August 24, 2017 Share Posted August 24, 2017 As long as we stay within the range of signed integers, as we do here, it doesn't make any difference. Quote Link to comment Share on other sites More sharing options...
+TheBF Posted August 24, 2017 Share Posted August 24, 2017 So you just multiplied both values with ten ? Yes. I did not run your code with 10000 primes Quote Link to comment Share on other sites More sharing options...
apersson850 Posted August 27, 2017 Share Posted August 27, 2017 I didn't expect it to work, but I tried making the array large enough to accomodate numbers up to 10000. But the p-system responded "Can't allocate global data", which is what I expected. Quote Link to comment Share on other sites More sharing options...
+TheBF Posted August 27, 2017 Share Posted August 27, 2017 (edited) I didn't expect it to work, but I tried making the array large enough to accomodate numbers up to 10000. But the p-system responded "Can't allocate global data", which is what I expected. It makes me wonder how big a "Boolean" type is in the P-Code system. Any idea? Oops re-read your earlier post completely. 16 bits for a Boolean. How big can you go before it hits the limit? Could you try ARRAY of CHAR ? Edited August 27, 2017 by TheBF Quote Link to comment Share on other sites More sharing options...
JamesD Posted August 27, 2017 Share Posted August 27, 2017 I didn't expect it to work, but I tried making the array large enough to accomodate numbers up to 10000. But the p-system responded "Can't allocate global data", which is what I expected. Boolean is probably 8 bit, so that would only be 10K. The P-Machine may not be able to deal with arrays that cross memory pages. Quote Link to comment Share on other sites More sharing options...
RXB Posted August 28, 2017 Share Posted August 28, 2017 I suppose I could write a RXB version using up to 10 digits (999999999) and make them strings 10 digits in length and store in SAMS. SAMS memory 960 Meg (1966080 bytes) to store 196608 ten digit primes. So this would allow for up to 6 digit primes to be solved vs the TI Basic version that can only store 1000 primes or XB that could go as high as 3500 Primes. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.