Jump to content
IGNORED

Benchmarking Languages


Tursi

Recommended Posts

Lee,

I can't figure out how you got FB-Forth to go faster than Turbo Forth in the optimized version.

Can you double check it when you have nothing better to do?

 

I am sure the reason is that TurboForth's V! calls vsbw in TurboForth, whereas my port of the same code has all of the relevant ALC within V! . The “bl @vsbw” and return would do it, I think.

 

...lee

  • Like 1
Link to comment
Share on other sites

  • 3 weeks later...

I thought I would expand this conversation by using a classic benchmark, the Sieve of Erathosenes.

I know it's an old chestnut, but it is commonly used to see how a language does a few things.

 

I tried it first with TI-BASIC. I used the MSX version from Rosetta code and fixed a few things so it would run on the TI-99

100 REM  Eratosthenes Sieve 
110 REM    for TI BASIC 
120 REM  ****************** 
130 REM  MSX BRRJPA
140 INPUT "Search until: ":L
150 DIM P(1500)
160 FOR N=2 TO SQR(L+1000)
170 IF P(N)<>0 THEN 210
180 FOR K=N*N TO L STEP N
190 LET P(K)=1
200 NEXT K
210 NEXT N
220 FOR N=2 TO L
230 IF P(N)<>0 THEN 250
240 PRINT N;
250 NEXT N

I use 1000 for the input value because it takes along time. The big shock for me was that XB was slower than TI-BASIC. :?

 

Then of course I did a version for Forth and tested it on Turbo Forth and Camel99 a couple of ways.

Here is that code which is also from Rosetta Code but slightly modified by me.

I don't agree with the way they did the looping in the WHILE statement. It's a slow thing to have multiplication to calculate a loop

I will see if I can make it more like the C version...

\ Sieve of Erathosenes in Forth
\ Tested with TurboForth and CAMEL99 Forth

\ array calculators
: []@ ( n addr -- ? )  + C@  ;
: []! ( n addr -- )    + C! ;

: ERASE  ( addr n -- ) 0 FILL ;

HEX
: FREEMEM  ( -- n) FF00 HERE - ;
: ?MEM     ( n -- )  FREEMEM OVER < ABORT" Out of memory" ;

\ byte array uses unallocated memory at 'HERE'
DECIMAL
: PRIMES ( n -- )
        ?MEM
        CR ." Running..."
        HERE OVER ERASE
        1 0 HERE []!       \ mark as prime like 'C' version
        1 1 HERE []!
        2                  \ start at 2
        BEGIN
           2DUP DUP * >
        WHILE
           DUP HERE []@ 0=
           IF  2DUP DUP *
               DO
                  1 I HERE []!
               DUP +LOOP
           THEN
           1+
        REPEAT
        CR ." Complete."
        CR
        DROP
        ." Primes: " 2 DO  I HERE []@ 0= IF I . THEN  LOOP
;

And here are the timings. BASIC can't allocate enough memory for more than about 1500 elements because of the big floating point numbers so I limited it to 1000.

1000 primes
X BASIC         14.3 secs
TI BASIC        11.9

10,000 primes
TF               8.0
CAMEL99 ITC     10.28
CAMEL99 DTC      7.25

I would be interested in seeing the GCC results using the Rosetta code version or something similar.

 

 

 

#include <stdlib.h>
#include <math.h>
 
char*
eratosthenes(int n, int *c)
{
	char* sieve;
	int i, j, m;
 
	if(n < 2)
		return NULL;
 
	*c = n-1;     /* primes count */
	m = (int) sqrt((double) n);
 
	/* calloc initializes to zero */
	sieve = calloc(n+1,sizeof(char));
	sieve[0] = 1;
	sieve[1] = 1;
	for(i = 2; i <= m; i++)
		if(!sieve[i])
			for (j = i*i; j <= n; j += i)
				if(!sieve[j]){
					sieve[j] = 1; 
					--(*c);
				}
  	return sieve;
}

 

 

Link to comment
Share on other sites

I use 1000 for the input value because it takes along time. The big shock for me was that XB was slower than TI-BASIC. :?

 

 

I side by side compared RXB and TI Basic and RXB beat TI Basic easy.

Edited by RXB
Link to comment
Share on other sites

I have no doubt about it Rich.

 

Do you have an approximate number of secs?

I did not make it clear, but I only time the loop that does the calculation.

The print out of the results I ignored.

 

I also did some testing and realized the Forth example is pretty silly in one other way.

The HERE word reads an address from a variable and is used as the base address of the data array.

So that's not at all like using a fixed memory address for the data array.

When I replaced a Forth version of HERE with an Assembly language version of HERE the ITC benchmark improved 20%!

 

"HERE" is a convenient memory location for throw away calculations where you don't need to create a variable, but not appropriate for a speed benchmark IMHO.

 

So I will change that and republish the results for all the Forth examples.

Link to comment
Share on other sites

So I changed the benchmark to use a VALUE to hold the array address.

It improved TF and CAMEL99 ITC version, but it actually slowed down the directed threaded version a little.

The DTC version has HERE written in Assembler so this only proves that my code for CONSTANT is slower than fetching the contents of a variable in ASM.

 

Anyway here is result.

\ Sieve of Erathosenes in Forth using a VALUE as array pointer
\ Tested with TurboForth and CAMEL99 Forth

0 VALUE SIEVE[]           \ holds array address

\ array calculators
: SIEVE[]@ ( n addr -- ? ) SIEVE[] + C@  ;
: SIEVE[]! ( n addr -- )   SIEVE[] + C! ;

: ERASE  ( addr n -- ) 0 FILL ;

HEX
: FREEMEM  ( -- n) FF00 HERE - ;

: ?MEM     ( n -- n )  FREEMEM OVER < ABORT" Out of memory" ;

\ byte array uses unallocated memory at HERE
DECIMAL
: PRIMES ( n -- )
        ?MEM
        HERE TO SIEVE[]       \ assign address to the array

        CR ." Running..."
        SIEVE[] OVER ERASE
        1 0 SIEVE[]!          \ mark as prime like 'C' version
        1 1 SIEVE[]!
        2                     \ start at 2
        BEGIN
           2DUP DUP * >
        WHILE
           DUP SIEVE[]@ 0=
           IF  2DUP DUP *
               DO
                  1 I SIEVE[]!
               DUP +LOOP
           THEN
           1+
        REPEAT
        CR ." Complete."
        CR
        DROP
        ." Primes: " 2 DO  I SIEVE[]@ 0= IF I . THEN  LOOP
;
10,000 primes    W/HERE   W/VALUE  
TF               8.0       7.7
CAMEL99 ITC     10.28      8.4
CAMEL99 DTC      7.25      7.5
Edited by TheBF
Link to comment
Share on other sites

Nice!

 

Does the sieve just produce primes? If so there's a primes finder on the TF disk. Just type DEMOS after it boots and it'll tell you which block it's on.

 

Run it with 1000 PRIMES and it should do its stuff.

 

It can find primes to 1000 in around 2.5 seconds if I remember correctly.

 

If I remember correctly I adapted it from a Brodie program in Starting Forth. It was one of the first programs TF ever ran :-)

Link to comment
Share on other sites

Nice!

 

Does the sieve just produce primes? If so there's a primes finder on the TF disk. Just type DEMOS after it boots and it'll tell you which block it's on.

 

Run it with 1000 PRIMES and it should do its stuff.

 

It can find primes to 1000 in around 2.5 seconds if I remember correctly.

 

If I remember correctly I adapted it from a Brodie program in Starting Forth. It was one of the first programs TF ever ran :-)

 

 

Yes this one just finds PRIME numbers. It look like it was done by Anton, but I can't be sure.

So if we scale the Starting Forth version to 10000 it would take 25 seconds, so this one is quite a bit faster.

 

With this version in TF it should search to over 20,000 because it uses all available memory in the expansion card.

Link to comment
Share on other sites

  • 2 weeks later...

I entered the Pascal code as it is on the Rosetta page, using a set to keep track of the primes.

1000 primes
X BASIC         14.3 secs
TI BASIC        11.9
Pascal           2.869 s

10,000 primes
TF               8.0
CAMEL99 ITC     10.28
CAMEL99 DTC      7.25

It seems the sets aren't too efficiently implemented in Pascal. Probably because they are represented by a bit in a packed field, so there's a lot of packing and unpacking.

I changed from using a set to an array of booleans instead. That reduced the time from 10.558 s to 4.663 s.
Then I suspected some time could be saved in setting up the array, so I used the faster system intrinsic moveleft to do that. Now the time is 2.869 seconds.
With this solution, a boolean bit is stored as an integer, so 1000 elements require 2000 bytes.

 

                                        Created at Aug 23, Wed, 21:50.27
                                        RAMDISK:BENCH8.TEXT

program sieve;

uses
  support,
  realtime;
  
const
  primelimit = 1000;
   
var
  primes: array[1..primelimit] of boolean;
  n,k: integer;
  needcomma: boolean;
  timer: timerid;
  elapsed: ttime;
  ch: char;
      
begin
  writeln('Sieve starts running');

  tmrnew(timer);
  tmrreset(timer);
  tmrstart(timer);
  
  (* Calcuate the primes *)
  primes[1] := false;
  primes[2] := true;
  moveleft(primes[2],primes[3],primelimit-4);
  for n := 1 to trunc(sqrt(primelimit)) do
  begin
    if primes[n] then
    begin
      k := n*n;
      while k<primelimit do
      begin
        primes[k] := false;
        k := k+n;
      end;
    end;
  end;

  tmrstop(timer);
  tmrread(timer,elapsed);

  (* Output the primes *)
  needcomma := false;
  for n := 1 to primelimit do
    if primes[n] then
    begin
      if needcomma then
        write(', ');
      write(n);
      needcomma := true;
    end;

  writeln;
  with elapsed do
  begin
    write('Time ',hour,':');
    if minute<10 then
      write('0');
    write(minute,':');
    if second<10 then
      write('0');
    write(second,',');
    if fract<100 then
      write('0');
    if fract<10 then
      write('0');
    writeln(fract);
  end;
  read(ch);
end.

 

 

Edited by apersson850
  • Like 1
Link to comment
Share on other sites


1000 primes
X BASIC         14.3 secs
TI BASIC        11.9

10,000 primes
TF               8.0
CAMEL99 ITC     10.28
CAMEL99 DTC      7.25

 

After compiling the BASIC program in post 127 I got the following results:

For 1000 primes, compiled XB takes about 1 second. For 8,500 primes it takes 10 seconds. So the speeds are similar but somewhat slower. Over the winter I will do some rewriting on this so it can load a larger program.

Link to comment
Share on other sites

 

If I may say so, - "more accurate Primes" is actually rather important. :thumbsup:

Boo! You're just being prejudiced against inaccurate primes.

 

WHAT DO WE WANT?

EQUAL RIGHTS FOR INACCURATE PRIMES

WHEN DO WE WANT IT?

SOME TIME IN THE NOT-TOO-DISTANT FUTURE PLUS OR MINUS AN INDETERMINATE AND VARIABLE AMOUNT OF TIME!

 

:-)

Edited by Willsy
  • Like 1
  • Haha 1
Link to comment
Share on other sites

I didn't expect it to work, but I tried making the array large enough to accomodate numbers up to 10000. But the p-system responded "Can't allocate global data", which is what I expected.

 

It makes me wonder how big a "Boolean" type is in the P-Code system. Any idea?

 

Oops re-read your earlier post completely. 16 bits for a Boolean.

How big can you go before it hits the limit?

 

Could you try ARRAY of CHAR ?

Edited by TheBF
Link to comment
Share on other sites

I didn't expect it to work, but I tried making the array large enough to accomodate numbers up to 10000. But the p-system responded "Can't allocate global data", which is what I expected.

Boolean is probably 8 bit, so that would only be 10K.

The P-Machine may not be able to deal with arrays that cross memory pages.

 

Link to comment
Share on other sites

I suppose I could write a RXB version using up to 10 digits (999999999) and make them strings 10 digits in length and store in SAMS.

 

SAMS memory 960 Meg (1966080 bytes) to store 196608 ten digit primes.

 

So this would allow for up to 6 digit primes to be solved vs the TI Basic version that can only store 1000 primes or XB that could go as high as 3500 Primes.

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...