Jump to content
Willsy

The 7's Problem

Recommended Posts

6 hours ago, moulinaie said:

Hi all,

 

I used my My Little Compiler to solve the problem:

It solves the problem in 14 seconds (but only 7 seconds to calculate, the first 7 seconds are used to load the compiler from disk and compile the DATA lines).

 

here is the source for the Pre-compiler:

100 CALL CLEAR
$MLC F 110 10 3000
300 CALL LINK("SEVEN",N,A$)
310 PRINT "7 TO THE POWER OF ";N; " IS"
320 PRINT A$
330 END
$SEVEN 0
	DIMTABLE F 256			; F points to the special 256 bytes buffer
	DIMTABLE G 256			; G points to 256 bytes (for returning string)
	STARTDATA
	   BYTES 0,7,4,1,8,5,2,9,6,3
           BYTES 0,0,1,2,2,3,4,4,5,6
        ENDDATA	E			; E points to this multiplication table
	LET P 1				; current power = 1
	PUTTABLE F 0 7			; current value is 7 at first byte
	LET Z 0				; current LEN-1 of number in digits
	LET X 0				; clear word for byte operations
	LET A 0				; flag for "found"
	REPEAT
		LET N 0				; max occurences of 7
		LET C 0				; current carry is zero
		FOR I 0 Z
			GETTABLE F I X		; X = digit
			GETTABLE E X Y		; multiplied by 7
			ADD Y C			; plus carry
			ADD X 10		; points to tenths
			GETTABLE E X C		; get new carry
			COMPARE Y 10		; if more than a digit
			IF>=
        	        	SUB Y 10	; then reports... 
	                	INC C		; ...one tenth on carry
	    		ENDIF
			PUTTABLE F I Y		; store new digit
			COMPARE Y 7		; a "7" found?
			IF=
				INC N		; one more
				COMPARE N 6	; six "7" found?
				IF=
					INC A	; flag for found
				ENDIF
			ELSE
				LET N 0		; not a "7", reset counter
			ENDIF
		NEXT
		COMPARE C 0			; if carry remains
		IF<>
			INC Z				; then one more digit
			PUTTABLE F Z C			; and stored
   		ENDIF
		INC P	
	COMPARE A 0
	UNTIL<>
	LET J Z
	FOR I 0 Z
		GETTABLE F I X			; one digit from F table
		ADD X 48			; in ASCII
		PUTTABLE G J X			; one character in G string
		DEC J
	NEXT
	INC Z					; correct len
	LET U 2
	PUTTABLE U 0 G				; set A$ to G string with Z characters
        PUTPARAM 1 P				; set N to the current power
$$
$END

 

This is reduced to those few lines in XB :

 

100 CALL CLEAR
110 CALL INIT::PRINT "Loading MLC Compiler..."::CALL LOAD("DSK1.NEWFLO")::CALL LINK("NEWFLO")
120 IO(1)=3000::CALL LINK("COMPIL",IO(),S$(),C$())::If IO(1) THEN PRINT "Error ";IO(1)::END
130 PRINT "Compilation OK!"
300 CALL LINK("SEVEN",N,A$)
310 PRINT "7 TO THE POWER OF ";N; " IS"
320 PRINT A$
330 END
3000 DATA P
3010 DATA SEVEN
3020 DATA "TRF256TRG256_ABAT<TH10.00070401080502090603TH10.00000102020304040506TELAT>E=P1TPF0.7=Z0=X0=A0LA=N0=C0=I0LBTGFIXTGEXY+YC+X10TGEXC"
3030 DATA "CY10_C?<C-Y10ICLCTPFIYCY7_C!=CINCN6_D!=DIALD_DBDLC=N0LDIICIZ!>BCC0_B?=BIZTPFZCLBIPCA0?=A=JZ=I0LATGFIX+X48TPGJXDJIICIZ!>AIZ=U2"
3040 DATA "TPU0GP1P"
3050 DATA ""
3060 DATA ""

 

Guillaume.

 

mlc.jpg

Very clever!

Willsy's original challenge required scrolling...

If I do it this way I should expect very different results for sure!

 

 

 

Share this post


Link to post
Share on other sites

So simply removing the need to scroll was a big improvement.  It also meant that I could scan the array for integer sevens rather than converting to a text string and scanning the string. Without improving the calculator the results in indirect threaded Forth are 33 Seconds.

 

This is a reasonable ratio for Assembler to Indirect Threaded Forth. 3 to 5 times slower is typical.

I will try Guillaumes calculation method and see if that buys me something.

It may not because the UM/MOD ( "un-signed mixed division and modulo")  routine is just a native DIV instruction that simultaneously computes 2 results.

The overhead in Forth of  shifting values and doing a subtract might actually be no improvement. We shall see.

 

For complete comparison the compile time is also about 7 seconds. 

( I have another compiler that compiles about 20% faster, but I broke it making some changes so it's offline) :)

 

Spoiler
\ lucien2 Version of FIG Forth PORTED to CAMEL99 Forth with re-work
\ ** non-scrolling version **

\ NEEDS DUMP   FROM DSK1.TOOLS
NEEDS ELAPSE FROM DSK1.ELAPSE
NEEDS ()@,   FROM DSK1.CODEMACROS

DECIMAL

180 CONSTANT SIZE
\ variable         fast fetchers
VARIABLE i        MACRO [email protected]   i @, ;MACRO
VARIABLE X        MACRO [email protected]   X @, ;MACRO
VARIABLE POWER
VARIABLE LENGTH

CREATE A1   SIZE CELLS ALLOT
CREATE A2   SIZE CELLS ALLOT

CREATE PAD  256 ALLOT

\ Integer Arrays that is indexed addressing
MACRO ][email protected] ( ndx -- n)   A1 ()@,  ;MACRO
MACRO ][email protected] ( ndx -- n)   A2 ()@,  ;MACRO

MACRO ]A1! ( ndx -- n)   A1 ()!,  ;MACRO
MACRO ]A2! ( ndx -- n)   A2 ()!,  ;MACRO

\ Note: Used UM/MOD,  un-signed division because it's faster
: A1*7->A2 ( -- )
           i OFF
           X OFF
           BEGIN
               [email protected] ][email protected] DUP 0=
               [email protected] LENGTH @ >  AND
               [email protected] 0=   AND
           0= WHILE
               7 * [email protected] + S>D  10 UM/MOD  X !  [email protected] ]A2!
               i 1+!
           REPEAT
           DROP
           [email protected] LENGTH ! ;

\ number converion helpers
: <#      ( -- ) PAD HP ! ;        \ set HP to buffer for number conversion
: #>      ( -- pad length ) PAD HP @ OVER - ;
: HOLD   ( char -- )  HP @ C!  HP 1+! ;  \ hold digit in pad, bump pointer

: A2>$   ( -- addr len)  \ array to text string
       <#
       -1 LENGTH @ 1-
        DO
           I ][email protected] [CHAR] 0 +  HOLD
        -1 +LOOP
        #>                            \ compute length
       [CHAR] 0 SKIP ;                \ skip leading zeros

: SEVENS? ( -- ?)
        0
        LENGTH @ 1- 0
        DO
           1+ I ][email protected] 7 = AND
           DUP 4 > IF LEAVE THEN
        LOOP
;

: INTRO
  PAGE ." The 5 Sevens Problem"
  CR
  CR   ." Press a key to start"  KEY DROP
;

: CALCULATOR
   CR ." Working ..."
   BEGIN
        A1*7->A2
        A2 A1  LENGTH @ CELLS CMOVE
        POWER 1+!
        SEVENS?
     UNTIL ;

: ERASE  ( address length --) 0 FILL ;
: EMPTY  ( array --) SIZE CELLS ERASE ;

: SETUP
     A1 EMPTY
     A2 EMPTY
     7 A1 !    2 POWER !   LENGTH OFF
     PAGE
     TICKER OFF
;

: RESULTS
     CR
     CR
     CR
     CR ." The Answer is 7 ^" POWER @ 1- .
     CR .ELAPSED
     CR
;

: RUNFASTER     INTRO   SETUP   CALCULATOR  RESULTS  A2>$ TYPE ;

\ CAMEL99 time  33

 

 

sevensnoscroll.png

SEVENSCOMPILETIME.png

Share this post


Link to post
Share on other sites

Just for curiosity, the same code compiled with Direct threaded Forth gets us to 27.4 seconds, a 20% improvement.

Compile time 5.23 seconds.

(*with a bug in the string conversion. I don't use this compiler these days. Code is bigger)

DTCSEVENS.png

dtcsevenscompile.png

Share this post


Link to post
Share on other sites

Honestly if you want to delay everything with Basic or XB just use PRINT and scroll the entire screen.

DISPLAY AT(row,col) is much faster then having to move the entire screen every time it adds a new power to display.

RXB CALL HPUT(row,col,string or number) is very slightly faster  then DISPLAY AT(row,col) (4 minute test is about 1 second faster, 21 minute test it is almost 3 seconds faster)

My main point is SCROLL is insanely slow to use and always is slow, you are after all rewriting the entire screen each pass.

 

  • Like 2

Share this post


Link to post
Share on other sites
On 10/21/2019 at 5:57 AM, moulinaie said:

 

Knowing that 7 = 8 - 1, you can say that :

value * 7 =  shift-left (value,3) - value.

That can be fast in assembly. But I don't know if your FORTH version has the binary shift instructions.

 

Guillaume.

I added a shift-left by 3 operator to my FORTH way back, to handle character patterns faster. It's called 8* so this is simple to try.

 

I have cleaned up the computation section so for reference the '*" operator/                  Time 27.33 seconds

 

I tried the shift method  in hi-level Forth and it is a bit slower that using the '*" operator.  Time= 29.53 seconds

: 7*  ( n -- n') DUP 8* SWAP - ;

This is because of the stack juggling operations needed and the overhead of running each of those Forth instructions.

(The address interpreter is 3 instructions that run each time a Forth word ends)

 

I tried removing the extra call to 7* by making a text macro. 

: 7*  S"  DUP 8* SWAP -" EVALUATE ; IMMEDIATE

Time =28.41

 

 

I removed all Forth overhead with my INLINE[ ] compiler. 

CODE 7*    INLINE[ DUP 8* SWAP - ] NEXT, ENDCODE

TIME = 27.13  (20mS faster) :)

 

 

So I re-wrote it in Forth Assembler. 

CODE 7*  ( n -- n')
           TOS R1 MOV,
           TOS 3 SLA,
           R1 TOS SUB,
           NEXT,
           ENDCODE

Time= 26.48  (3% better)

 

The real bottle neck is probably the DIV instruction. And also Forth looping is much slower than assembler.

I will have to implement your algorithm.

 

--------------------

FYI for anyone following this thread,  I have simplified the computation code quite a bit:

: A1*7->A2 ( -- )
           X OFF
           0           \ index on stack
           BEGIN
              DUP  ][email protected] 7* [email protected] + S>D  10 UM/MOD  X !
              OVER ]A2!
              1+
              DUP [email protected] >  [email protected] 0=  AND
           UNTIL
           LENGTH ! ;

 

 

 

Share this post


Link to post
Share on other sites
16 hours ago, TheBF said:

 

So I re-wrote it in Forth Assembler. 

CODE 7*  ( n -- n')
           TOS R1 MOV,
           TOS 3 SLA,
           R1 TOS SUB,
           NEXT,
           ENDCODE

 

 

Using assembler is not FORTH... 😉

 

In this case, MLC can do assembler too, there is an integrated assembler into the PreCompiler:

 

mlc.jpg.2b1d6e5eeb1e2d3a185f531902557c47.jpg

 

Here is the source code for SEVENA (A for assembler) : (you can see that Assembler can exchange data with the variables of MLC as A to Z are predefined into the assembler) 

 

100 CALL CLEAR
$MLC F 110 10 3000
300 CALL LINK("SEVEN",N,A$)
310 PRINT "7 TO THE POWER OF ";N; " IS"
320 PRINT A$
330 END
$SEVEN
	DIMTABLE F 256			; F points to the special 256 bytes buffer
	DIMTABLE G 256			; G points to 256 bytes (for returning string)
	STARTDATA
	   BYTES 0,7,4,1,8,5,2,9,6,3
           BYTES 0,0,1,2,2,3,4,4,5,6
        ENDDATA	E			; E points to this multiplication table
	LET P 1				; current power = 1
	PUTTABLE F 0 7			; current value is 7 at first byte

	; R0 = pointer on F table
	; R1 = pointer to E table
	; R2 = 7
	; R3 = one digit and pointer into E
	; R4 = one digit after multiplication
	; R5 = carry
	; R6 = number of digits-1 into F table
	; R7 = number of consecutive "7" currently found
	; R8 = flag for FOUND
	; R9 = loop counter on digits while mul*7
	; R10 = 10	

	$[
	LI R6,1				; current LEN in digits in F table
	CLR R8				; flag for "found"
	LI R10,>A00			; R10 = byte 10
	LI R2,>700			; R2 = byte 7
	MOV @E,R1			; mutliplication table E
	; start of MAIN loop (a852)
		CLR R7
		CLR R5
		MOV R6,R9
		MOV @F,R0
		; DEBUT start of multiply loop (a85C)
			CLR R3			; for byte operation
			MOVB *R0,R3		; new digit
			SWPB R3			; to word
			A R1,R3			; E+digit -> points to the units of table*7
			MOVB *R3,R4		; get units
			AB R5,R4		; plus carry
			MOVB @10(R3),R5		; E+digit+10 -> points to tenths of table*7, new carry
			CB R4,R10		; more than a digit?
			JLT +3 			; to AA 
				SB R10,R4	; if so, reports ten...
				AI R5,>100	; ...on the carry (byte)
			; AA
			MOVB R4,*R0+		; store new digit
			CB R2,R4		; is it a "7" ?
			JNE +6 			; to BB
				INC R7		; if so, one more found
				CI R7,6		; six "7" ?
				JNE +3 		; to CC
					INC R8	; yes!!! end of search !
				JMP +1 		; to CC
			; BB
				CLR R7		; not a "7"... reset counter
			; CC
		DEC R9				; loop counter
		JNE -25				; no, return to DEBUT
		CI R5,0				; a carry remains?
		JEQ +2 				; to DD
			INC R6			; yes! One more digit
			MOVB R5,*R0		; and store it
		; DD
		INC @P				; power+1
	CI R8,0					; flag for found
	JEQ -40					; still zero... back to MAIN
	DEC R6					; to return LEN-1
	MOV R6,@Z				; else, update number of digits
	$]	
		
			
			
	LET J Z
	FOR I 0 Z
		GETTABLE F I X			; one digit from F table
		ADD X 48			; in ASCII
		PUTTABLE G J X			; one character in G string
		DEC J
	NEXT
	INC Z					; correct len
	LET U 2
	PUTTABLE U 0 G				; set A$ to G string with Z characters
        PUTPARAM 1 P				; set N to the current power
$$
$END

 

And the resulting X-BASIC program :

 

100 CALL CLEAR
110 CALL INIT::PRINT "Loading MLC Compiler..."::CALL LOAD("DSK1.NEWFLO")::CALL LINK("NEWFLO")
120 IO(1)=3000::CALL LINK("COMPIL",IO(),S$(),C$())::If IO(1) THEN PRINT "Error ";IO(1)::END
130 PRINT "Compilation OK!"
300 CALL LINK("SEVEN",N,A$)
310 PRINT "7 TO THE POWER OF ";N; " IS"
320 PRINT A$
330 END
3000 DATA P
3010 DATA SEVEN
3020 DATA "TRF256 TRG256 _A BA T< TH10.00070401080502090603 TH10.00000102020304040506 TE LA T>E =P1 TPF0.7 TT518 TT1 TT1224 TT522 TT2560"
3030 DATA "TT514 TT1792 TT-16288 TT-24536 TT1223 TT1221 TT-15802 TT-16352 TT-24534 TT1219 TT-12080 TT1731 TT-24383 TT-12013 TT-20219"
3040 DATA "TT-11933 TT10 TT-28028 TT4355 TT28938 TT549 TT256 TT-9212 TT-28414 TT5638 TT1415 TT647 TT6 TT5635 TT1416 TT4097 TT1223 TT1545"
3050 DATA "TT5863 TT645 TT0 TT4866 TT1414 TT-11259 TT1440 TT-24514 TT648 TT0 TT5080 TT1542 TT-14330 TT-24494 =JZ =I0 LA TGFIX +X48 TPGJX DJ"
3060 DATA "II CIZ !>A IZ =U2 TPU0G P1P"
3070 DATA ""
3080 DATA ""

 

Again, here is the ZIP file with everything in it and the DSK1 forlder to test it under CLASSIC99.

 

OLD DSK1.SEVENA
RUN

That's it !

 

Guillaume.

 

 

seven_mlc.zip

Share this post


Link to post
Share on other sites

Well... the line between Forth and Assembler is very thin.  Notice, I did not write the entire program in Assembler just three instructions.

And after I do that a new word is in the Forth language called "7*". :) 

So is it Forth?  All 100+ primitive operations in the language are exactly the same. 

I will leave it up to you.

 

Thank you for the Assembler version.  I will re-write it in Forth Assembler as an exercise. 

If I do it correctly, I should be able to use your code for the calculation section and keep the rest in Forth just to see what happens to performance.

 

So MLC is a very slick system.  It integrates with BASIC very well.  I think more BASIC programmers should take advantage of it for time critical stuff.

Something I notice is that it is very much like Forth only it uses PREFIX notation.  I was playing with emulating MLC in Forth by redefining Forth.

 

Here is how it might look (NOT FINISHED, NOT TESTED)  It would not run as fast as MLC of course, but it would work.

But you get an idea of what I mean from the example below.

INCLUDE DSK1.DATABYTE     \ BFOX library for TI99 BYTE DATA directives
: DIMTABLE    BUFFER: ;
: LET       ! ;
: STARTDATA     HERE ; 
: ENDDATA   CONSTANT ; 
: PUTABLE   SWAP CELLS + ! ; 
\ I need define variables before usage,  
( I think MLC is using array indexed by the character name no?)

VARIABLE A VARIABLE B VARIABLE C VARIABLE D ETC...
: ;      \  ;    ; Comment is redefined to ';' :-)   

; now we write reverse polish notation MLC
    256 DIMTABLE F			; F points to the special 256 bytes buffer
	256 DIMTABLE G 			; G points to 256 bytes (for returning string)
	STARTDATA
	       BYTE 0,7,4,1,8,5,2,9,6,3
           BYTE 0,0,1,2,2,3,4,4,5,6
        ENDDATA	E		; E points to this multiplication table
	1 P LET				; current power = 1
	7 0 F PUTTABLE		; current value is 7 at first byte
	0 E LET 			; current LEN-1 of number in digits
	0 X LET				; clear word for byte operations
	0 A LET 			; flag for "found"

 

 

 

Share this post


Link to post
Share on other sites
19 minutes ago, TheBF said:

[]

 

Thank you for the Assembler version.  I will re-write it in Forth Assembler as an exercise. 

If I do it correctly, I should be able to use your code for the calculation section and keep the rest in Forth just to see what happens to performance.

 

[]

 

Here is how it might look (NOT FINISHED, NOT TESTED)  It would not run as fast as MLC of course, but it would work.

But you get an idea of what I mean from the example below.

 

 

 

Hi !

 

I think if you translate from my to your assembler, the speed will be the same.

 

I understand what you mean by emulating MLC ! 

I remember that when I wrote my FORTH compiler for the Atari ST, I used the same method as you to implement assembler. I turned it into prefixed notation.

It's a very funny language!

 

Guillaume.

  • Like 1

Share this post


Link to post
Share on other sites

Here is my solution in XB , no multiplication, no division, only integers (but the BASIC treat them as float...)

I'd be curious to know what the speed is with a Basic Compiler.

 

100 DIM F(256)
110 FOR I=0 TO 9:: READ D(I),U(I)::NEXT I::P=1::F(0)=7::Z=0::A=0
120 C=0::N=0
130 FOR I=0 TO Z::X=F(I)::Y=U(X)+C::C=D(X)::IF Y>=10 THEN Y=Y-10::C=C+1
140 F(I)=Y::IF Y<>7 THEN N=0::GOTO 160
150 N=N+1::IF N=6 THEN A=1
160 NEXT I
170 IF C THEN Z=Z+1::F(Z)=C
180 P=P+1::IF A=0 THEN GOTO 120
300 PRINT "7 to the power of";P;"is"
310 FOR I=Z TO 0 STEP -1::PRINT CHR$(F(I)+48);::NEXT I
315 CALL SOUND(4000,440,0)
320 END
1000 DATA 0,0,0,7,1,4,2,1,2,8,3,5,4,2,4,9,5,6,6,3

 

It takes about 11 min 34 sec to find that the solution is 7^175 !

 

But it works.

 

(I added four seconds of sound at the end to "wake me up"! Then , you just have to subtract 4 seconds to your chrono at the end of the sound.)

 

With Option CPU Overdrive, it takes 1 min 20sec to complete.

 

Guillaume.

 

Edited by moulinaie
  • Like 3

Share this post


Link to post
Share on other sites

Ah, so you know Forth well. 

I am "preaching to the choir" as we say in English.

 

Atari ST... I always wanted one of those machines and a copy of Dragon Forth.  

 

Les memoires.... 😊

  • Like 1

Share this post


Link to post
Share on other sites
47 minutes ago, TheBF said:

Ah, so you know Forth well. 

I am "preaching to the choir" as we say in English.

 

Atari ST... I always wanted one of those machines and a copy of Dragon Forth.  

 

Les memoires.... 😊

In French we say "je prèche des convaincus" (preaching conviced people) !

 

What's cool with FORTH is that you can extend it so easely.

In my version, I wanted to organize memory my way and I didn't work with addresses but I tokenised the words into one single 16 bits word. So I was able to treat 65536 different instructions, looks enough. (including the user words).

A text line was read and tokenised giving a serie of 16 bits words (with a pre-interpretation for IF...THEN...DO ..LOOP jumps, etc)

Then the tokens were executed one by one.

I still use the FORTH on the ST/TT.

 

Guillaume.

 

PS: In this case, you should use "les souvenirs" instead of "les mémoires".

Share this post


Link to post
Share on other sites

Thank you for the French lesson.  I am an "Anglophone" here in Canada and so I don't get to use French very much. I love the language.

 

So that method is commonly called a "byte code" interpreter, however you used integers.

There is a system called Open-firmware that uses 255 byte codes for the low level operations. This makes very small programs.

I don't remember how they index the user's words. I think they used banks of 255 bytes, but not sure anymore.

Open-firmware booted Solaris and Apple Power PC machines and I believe it is used for booting some Linux distros. (not up to date on that)

 

Thanks again for your input here. It's very interesting to see other solutions.

Share this post


Link to post
Share on other sites

Been dragging my a** for a week with the flu.  But I took a run at re-writing this program in a what I consider to be Forth style, which I would define as factoring out the program into easily digestible pieces for easy debugging and also to add some clarity to what can be a hard language to read. :)

 

This version compiles on Turbo Forth and CAMEL99 Forth. The run times are virtually identical 32.3 / 32.7 with Turbo Forth ahead due to it's big stash of code sitting in scratchpad RAM. But CAMEL99 performs respectably.

 

I removed the double condition that ended the multiplier loop by using two loops that return to the same place. This made a slightly faster multiply step.

 

One of the interesting bits is counting the sevens with no variable and no IF statement.  The counter is on the data stack and IF is replaced by the AND operator.

Where does the extra time get used up versus Assembler? For reference, writing the SEVENS? routine in Assembler results in 17 second timing.

As shown earlier we can speed this up by 3% by replacing the multiply operator with a three instruction routine.

 

I have yet to get the big integer multiplier code working in assembler to compare this method to Guillaume's method.

I am missing something simple. 😡 

So much code so little time.

\ Sevens problem re-written in a factored Style.
\ Turbo Forth compatible version

DECIMAL
180 CONSTANT SIZE

VARIABLE X
VARIABLE POWER
VARIABLE LENGTH

CREATE A1  SIZE CELLS ALLOT
CREATE A2  SIZE CELLS ALLOT
CREATE PAD  256 ALLOT       \ re-defined for best speed

\ Expose memory BLOCKS as indexed arrays
: ]A1 ( ndx -- n) CELLS A1 +  ;
: ]A2 ( ndx -- n) CELLS A2 + ;

: A1*7->A2 ( -- )
         0          \ index on stack
         0 X !      \ remainder storage
         BEGIN
            BEGIN
              DUP  ]A1 @ 7 * X @ + 0  10 UM/MOD X !
              OVER ]A2 !
              1+
              DUP LENGTH @ >
            UNTIL
            X @ WHILE ( "while there is a remainder")
         REPEAT       ( Do it again)
         LENGTH !
;

\ BIG number converion based on Forth internal method (modified)

48 CONSTANT '0'
VARIABLE HP     \ "HOLD" pointer. Where to put DIGIT in the string

: <#    ( -- ) PAD HP ! ;
: #>    ( -- pad length ) PAD HP @ OVER - ;
: HOLD  ( char -- )  HP @ C!  1 HP +! ;  \ hold digit in pad, bump pointer
: DIGIT ( n -- char) '0' + ;
: A2>#S    ( -- pad length ) 0 LENGTH @ DO   I ]A2 @ DIGIT  HOLD    -1 +LOOP ;

\ These are not in Turbo Forth kernel
: /STRING   ( c-addr1 u1 n -- c-addr2 u2 ) ROT OVER + -ROT - ;

: SKIP  ( addr len char -- addr' len')
        >R
        BEGIN  
          OVER [email protected] [email protected] =
         WHILE
           1 /STRING
        REPEAT
        R> DROP  ;

: A2$  ( -- addr len ) <#  A2>#S  #>  '0' SKIP ;

: SEVENS? ( -- ?)
        0                      \ flag/counter on stack
        LENGTH @ 1- 0
        DO
           1+                  \ bump counter
           I ]A2 @ 7 =         \ test for a '7'. TRUE= -1, FALSE=0
           AND                 \ AND flag with count (replaces IF)
           DUP 5 =             \ is count equal to 5?
           IF LEAVE THEN       \ if so, leave the loop, return flag
        LOOP
;

: INTRO
  PAGE ." The 5 Sevens Problem"
  CR
  CR   ." Find the power of 7 with more than"
  CR   ." 5 sequential sevens"
  CR
  CR   ." Press key to start"  KEY DROP
;

: EMPTY   CELLS 0 FILL ;

: INITS ( -- )
     A1 SIZE EMPTY
     A2 SIZE EMPTY
     7 A1 !
     2 POWER !
     1 LENGTH !
;

: CALCULATOR
      A2 A1
      BEGIN
        A1*7->A2
        2DUP LENGTH @ CELLS CMOVE \ copy A2->A1
        1 POWER +!
        SEVENS?
      UNTIL
      2DROP ;

: RUN
     INTRO
     INITS
     CR
     CR ." Working..."  CALCULATOR
     CR
     CR ." The Answer is 7 ^" POWER @ 1- .
     CR A2$ TYPE
;

\ CAMEL99 V2.5        0:32.7
\ TURBO Forth         0:32.3

 

  • Like 1

Share this post


Link to post
Share on other sites
On 3/25/2011 at 7:56 AM, lucien2 said:

Maybe versus Weiand's Forth:

Compiled-Basic: 1m40s

Weiand-Forth: 3m

 

 

 

My first version runs in 3m with TF (5m with Weiand), so the new one must run in ~1m50s (I don't have TF to test it).

We can't really compare, since it's not exactly the same algorithm.

I'm new to forth, and I'm impressed with the performance against interpreted basic. icon_smile.gif

 

Running Turboforth with its fast screen I/O  the version that works like the Compiled BASIC, using my re-written version ran in 1:28~

 

turbosevens.png

Share this post


Link to post
Share on other sites
8 hours ago, TheBF said:

Been dragging my a** for a week with the flu.  But I took a run at re-writing this program in a what I consider to be Forth style, which I would define as factoring out the program into easily digestible pieces for easy debugging and also to add some clarity to what can be a hard language to read. :)

 

This version compiles on Turbo Forth and CAMEL99 Forth. The run times are virtually identical 32.3 / 32.7 with Turbo Forth ahead due to it's big stash of code sitting in scratchpad RAM. But CAMEL99 performs respectably.

 

I removed the double condition that ended the multiplier loop by using two loops that return to the same place. This made a slightly faster multiply step.

 

One of the interesting bits is counting the sevens with no variable and no IF statement.  The counter is on the data stack and IF is replaced by the AND operator.

Where does the extra time get used up versus Assembler? For reference, writing the SEVENS? routine in Assembler results in 17 second timing.

As shown earlier we can speed this up by 3% by replacing the multiply operator with a three instruction routine.

 

I have yet to get the big integer multiplier code working in assembler to compare this method to Guillaume's method.

I am missing something simple. 😡 

So much code so little time.

\ Sevens problem re-written in a factored Style.
\ Turbo Forth compatible version

DECIMAL
180 CONSTANT SIZE

VARIABLE X
VARIABLE POWER
VARIABLE LENGTH

CREATE A1  SIZE CELLS ALLOT
CREATE A2  SIZE CELLS ALLOT
CREATE PAD  256 ALLOT       \ re-defined for best speed

\ Expose memory BLOCKS as indexed arrays
: ]A1 ( ndx -- n) CELLS A1 +  ;
: ]A2 ( ndx -- n) CELLS A2 + ;

: A1*7->A2 ( -- )
         0          \ index on stack
         0 X !      \ remainder storage
         BEGIN
            BEGIN
              DUP  ]A1 @ 7 * X @ + 0  10 UM/MOD X !
              OVER ]A2 !
              1+
              DUP LENGTH @ >
            UNTIL
            X @ WHILE ( "while there is a remainder")
         REPEAT       ( Do it again)
         LENGTH !
;

\ BIG number converion based on Forth internal method (modified)

48 CONSTANT '0'
VARIABLE HP     \ "HOLD" pointer. Where to put DIGIT in the string

: <#    ( -- ) PAD HP ! ;
: #>    ( -- pad length ) PAD HP @ OVER - ;
: HOLD  ( char -- )  HP @ C!  1 HP +! ;  \ hold digit in pad, bump pointer
: DIGIT ( n -- char) '0' + ;
: A2>#S    ( -- pad length ) 0 LENGTH @ DO   I ]A2 @ DIGIT  HOLD    -1 +LOOP ;

\ These are not in Turbo Forth kernel
: /STRING   ( c-addr1 u1 n -- c-addr2 u2 ) ROT OVER + -ROT - ;

: SKIP  ( addr len char -- addr' len')
        >R
        BEGIN  
          OVER [email protected] [email protected] =
         WHILE
           1 /STRING
        REPEAT
        R> DROP  ;

: A2$  ( -- addr len ) <#  A2>#S  #>  '0' SKIP ;

: SEVENS? ( -- ?)
        0                      \ flag/counter on stack
        LENGTH @ 1- 0
        DO
           1+                  \ bump counter
           I ]A2 @ 7 =         \ test for a '7'. TRUE= -1, FALSE=0
           AND                 \ AND flag with count (replaces IF)
           DUP 5 =             \ is count equal to 5?
           IF LEAVE THEN       \ if so, leave the loop, return flag
        LOOP
;

: INTRO
  PAGE ." The 5 Sevens Problem"
  CR
  CR   ." Find the power of 7 with more than"
  CR   ." 5 sequential sevens"
  CR
  CR   ." Press key to start"  KEY DROP
;

: EMPTY   CELLS 0 FILL ;

: INITS ( -- )
     A1 SIZE EMPTY
     A2 SIZE EMPTY
     7 A1 !
     2 POWER !
     1 LENGTH !
;

: CALCULATOR
      A2 A1
      BEGIN
        A1*7->A2
        2DUP LENGTH @ CELLS CMOVE \ copy A2->A1
        1 POWER +!
        SEVENS?
      UNTIL
      2DROP ;

: RUN
     INTRO
     INITS
     CR
     CR ." Working..."  CALCULATOR
     CR
     CR ." The Answer is 7 ^" POWER @ 1- .
     CR A2$ TYPE
;

\ CAMEL99 V2.5        0:32.7
\ TURBO Forth         0:32.3

 

 

Hi, cool version!

 

I like your "SEVENS?" procedure with the AND. 

Compared to the assembler, sure that this FORTH version is time consuming, for example in the use of ]A1 and ]A2, for every access to the table, the index is added to the base address. If there was something like a pointer to parse the table sequentially, it would be faster.

Another idea, is it possible to write two procedures, the one to compute A1*7 into A2 and the second te compute A2*7 into A1. Then you won't have to CMOVE the table at each iteration.

 

Guillaume.

 

  • Like 1

Share this post


Link to post
Share on other sites

Hello,

 

I worked again on the MLC program with assembler.

By default, MLC uses >A000 as its Workspace Pointer for registers R0-R15.

 

The program modifies temporary this to use >83E0 as the workspace pointer (the zone used by GPL!). This is much faster now !

 

I run once the program (to get my "SEVEN" subprogram in memory), and then (on real hardware)

FOR I=1 TO 30::CALL LINK("SEVEN",N,A$)::NEXT I

With the old assembler version it took 60 seconds, so one calculation* was only 60/30 = 2 seconds. 

With the new version, it takes 42 seconds, so one calculation* is only 42/30 = 1,4 seconds.

 

Waow! Sure, there is no display, but N (=power) and A$ (=result string) are correctly filled.

 

Guillaume.

 

(*) one calculation means to compute every power of seven until 175 to find the correct answer

 

PROGRAM to paste into CLASSIC99 (if you already have MLC on your disk) :

 

100 CALL CLEAR
110 DIM IO(3)
120 CALL INIT::PRINT "Loading MLC Compiler..."::CALL LOAD("DSK1.NEWFLO")::CALL LINK("NEWFLO")
130 IO(1)=3000::CALL LINK("COMPIL",IO(),S$(),C$())::If IO(1) THEN PRINT "Error ";IO(1)::END
140 PRINT "Compilation OK!"
300 CALL LINK("SEVEN",N,A$)
310 PRINT "7 TO THE POWER OF ";N; " IS"
320 PRINT A$
330 END
3000 DATA P
3010 DATA SEVEN
3020 DATA "TRF256 TRG256 TRH22 _A BA T< TH10.00070401080502090603 TH10.00000102020304040506 TE LA T>E =P1 TPF0.7 TM-31776.22H TT736"
3030 DATA "TT-31776 TT518 TT1 TT1224 TT522 TT2560 TT514 TT1792 TT-16288 TT-24536 TT1223 TT1221 TT-15802 TT-16352 TT-24534 TT1219 TT-12080"
3040 DATA "TT1731 TT-24383 TT-12013 TT-20219 TT-11933 TT10 TT-28028 TT4355 TT28938 TT549 TT256 TT-9212 TT-28414 TT5638 TT1415 TT647 TT6"
3050 DATA "TT5635 TT1416 TT4097 TT1223 TT1545 TT5863 TT645 TT0 TT4866 TT1414 TT-11259 TT1440 TT-24514 TT648 TT0 TT5080 TT-14330 TT-24494"
3060 DATA "TT-16352 TT-24534 TT-16288 TT-24532 TT514 TT12288 TT-24506 TT-12048 TT-20286 TT1537 TT-11197 TT1542 TT5882 TT736 TT-24576"
3070 DATA "TMH22-31776 =U2 TPU0G P1P"
3080 DATA ""
3090 DATA ""

Source program:

; this program searches for the first power of 7 
; that contains six consecutive "7"

100 CALL CLEAR
$MLC F 110 10 3000
300 CALL LINK("SEVEN",N,A$)
310 PRINT "7 TO THE POWER OF ";N; " IS"
320 PRINT A$
330 END
$EQU
	\gplreg		PADE0		; >83E0
	\mlcreg		&HA000
$$
$SEVEN
	DIMTABLE F 256			; F points to a 256 bytes buffer
	DIMTABLE G 256			; G points to 256 bytes (for returning string)
	DIMTABLE H 22			; to save GPL registers R0-R10 (11*2)
	STARTDATA
	   BYTES 0,7,4,1,8,5,2,9,6,3
       BYTES 0,0,1,2,2,3,4,4,5,6
    ENDDATA	E				; E points to this multiplication table
	LET P 1					; current power = 1
	PUTTABLE F(0) 7			; current value is 7 at first byte

	; R0 = pointer on F table
	; R1 = pointer to E table
	; R2 = 7
	; R3 = one digit and pointer into E
	; R4 = one digit after multiplication
	; R5 = carry
	; R6 = number of digits-1 into F table
	; R7 = number of consecutive "7" currently found
	; R8 = flag for FOUND
	; R9 = loop counter on digits while mul*7
	; R10 = 10	
	
	BMOVE \gplreg 22 H	; save GPL registers R0-R10 into H

	$[
	LWPI \glpreg		; use fast registers !
	LI R6,1				; current LEN in digits in F table
	CLR R8				; flag for "found"
	LI R10,>A00			; R10 = byte 10
	LI R2,>700			; R2 = byte 7
	MOV @E,R1			; mutliplication table E
	MAIN_LOOP:
		CLR R7
		CLR R5
		MOV R6,R9
		MOV @F,R0
		MULTIPLY_LOOP:
			CLR R3			; for byte operation
			MOVB *R0,R3		; new digit
			SWPB R3			; to word
			A R1,R3			; E+digit -> points to the units of table*7
			MOVB *R3,R4		; get units
			AB R5,R4		; plus carry
			MOVB @10(R3),R5		; E+digit+10 -> points to tenths of table*7, new carry
			CB R4,R10		; more than a digit?
			JLT AA:
				SB R10,R4	; if so, reports ten...
				AI R5,>100	; ...on the carry (byte)
		AA:
			MOVB R4,*R0+		; store new digit
			CB R2,R4		; is it a "7" ?
			JNE BB:
				INC R7		; if so, one more found
				CI R7,6		; six "7" ?
				JNE CC:
					INC R8	; yes!!! end of search !
				JMP CC:
		BB:
				CLR R7		; not a "7"... reset counter
		CC:
		DEC R9				; loop counter
		JNE MULTIPLY_LOoP:	; no, return to DEBUT
		CI R5,0				; a carry remains?
		JEQ DD:
			INC R6			; yes! One more digit
			MOVB R5,*R0		; and store it
		DD:
		INC @P				; power+1
	CI R8,0					; flag for found
	JEQ MAIN_LOOP:			; still zero... back to MAIN
	MOV R6,@Z				; else, update number of digits
	MOV @F,R0				; source F ascending
	MOV @G,R1				; dest G descending
	LI R2,&H3000			; "0" in high byte
	A R6,R1					; R1 points to the end of STRING G
	
	EE:
		MOVB *R0+,R3		; one digit from F
		AB R2,R3			; to ASCII
		DEC R1
		MOVB R3,*R1			; character in G string
	DEC R6
	JNE EE:
	
	lwpi \mlcreg			; back to my registers
	$]	
		
	BMOVE H 22 \gplreg			; restore GPL values
			
	LET U 2
	PUTTABLE U 0 G				; set A$ to G string with Z characters
    PUTPARAM 1 P				; set N to the current power
$$
$END

 

seven_mlc.zip

Edited by moulinaie
  • Like 1

Share this post


Link to post
Share on other sites
1 hour ago, moulinaie said:

 

Hi, cool version!

 

I like your "SEVENS?" procedure with the AND. 

Compared to the assembler, sure that this FORTH version is time consuming, for example in the use of ]A1 and ]A2, for every access to the table, the index is added to the base address. If there was something like a pointer to parse the table sequentially, it would be faster.

Another idea, is it possible to write two procedures, the one to compute A1*7 into A2 and the second te compute A2*7 into A1. Then you won't have to CMOVE the table at each iteration.

 

Guillaume.

 

If there isn't "something like a pointer" it's my fault. It's Forth. :) 

I have been thinking that as well.  Managing two pointers on the stack will be slower ( I think) because I have to swap them back and forth to update them but I will give it a try. But I could make a couple of  variables to hold the pointers and some auto incrementing fetch and store operators... that might work.

 

The CMOVE is not a big slowdown compared to the computation.  To test it I made MOVE16 word in Assembler that is 2X faster (moves 16bit words) and the speedup was very little.  Of course each little part that is re-written in Forth Assembler makes a difference as we saw with your suggestion for 7*.  Typically that is how you use these indirect threaded Forth systems.  Write the program, find the bottlenecks and re-code the little pieces in Assembler. 

 

I really want to get the Assembler version of A1*7->A2 working as CODE to get a sense of what happens. I need to dig in to find my error.

 

My last resort is to get my Native99 Forth compiler working well enough to compile the program like you are doing with MLC and see what happens.

That will take a little more time  since I left that project a while back and it will take me some time to re-familiarize myself and add what extensions I need to this program.

 

Using the scratchpad RAM really makes a difference as you have noted.  It is so sad we don't have more RAM on the 16bit buss. :(

Share this post


Link to post
Share on other sites

I really like the 64 KBytes internal 16-bit wide RAM I put into my console. It's relieving to not have to worry where code, and workspace, is located, as all is equally fast.

  • Like 1

Share this post


Link to post
Share on other sites
8 minutes ago, TheBF said:

Using the scratchpad RAM really makes a difference as you have noted.  It is so sad we don't have more RAM on the 16bit buss. :(

 

I think that most of you know this page, but in case... This is how to add the 32k on the 16 bits bus.

 

http://www.mainbyte.com/ti99/32K16/32k16.html

 

 

3 minutes ago, apersson850 said:

I really like the 64 KBytes internal 16-bit wide RAM I put into my console. It's relieving to not have to worry where code, and workspace, is located, as all is equally fast.

How did you proceed?

 

Guillaume.

Edited by moulinaie

Share this post


Link to post
Share on other sites

Well, I added 64 KBytes of static RAM, covering all the address range of the TMS 9900. By default, only 2000H-3FFFH and A000H-FFFFH is enabled. By using CRU bits (base address 400H), I can enable the remaining RAM, 8 KBytes at a time. I can also disable the RAM at the normal memory expansion addresses, which will allow a standard memory expansion, like a card in the PEB, to become visible.

So it's possible to get full 64 K contiguous RAM, as well as use either the fast internal 32 K RAM or the standard speed 32 K RAM, as you like. This gives a total of 96 K RAM available, under software control.

Share this post


Link to post
Share on other sites
1 minute ago, apersson850 said:

Well, I added 64 KBytes of static RAM, covering all the address range of the TMS 9900. By default, only 2000H-3FFFH and A000H-FFFFH is enabled. By using CRU bits (base address 400H), I can enable the remaining RAM, 8 KBytes at a time. I can also disable the RAM at the normal memory expansion addresses, which will allow a standard memory expansion, like a card in the PEB, to become visible.

So it's possible to get full 64 K contiguous RAM, as well as use either the fast internal 32 K RAM or the standard speed 32 K RAM, as you like. This gives a total of 96 K RAM available, under software control.

 

That's impressive !

Is it a difficult modification of the console? Have you got some pictures?

 

Guillaume.

Share this post


Link to post
Share on other sites
5 hours ago, apersson850 said:

I didn't have any easy way of making a PCB at that time, so everything is piggy-backed on exisiting circuits.

Picture of computer's board

Wiring diagram

Block schematics

Slight modification to the design took place after the diagram was done.

 

But this is way off topic in this thread.

I will just stick with the SAMS 1 Meg card or the newer 4 Meg card.

 

  • Like 1

Share this post


Link to post
Share on other sites
7 hours ago, apersson850 said:

Well, I added 64 KBytes of static RAM, covering all the address range of the TMS 9900. By default, only 2000H-3FFFH and A000H-FFFFH is enabled. By using CRU bits (base address 400H), I can enable the remaining RAM, 8 KBytes at a time. I can also disable the RAM at the normal memory expansion addresses, which will allow a standard memory expansion, like a card in the PEB, to become visible.

So it's possible to get full 64 K contiguous RAM, as well as use either the fast internal 32 K RAM or the standard speed 32 K RAM, as you like. This gives a total of 96 K RAM available, under software control.

If something like what you described could be implemented on a card that would fit in the PEBox rather than modifying a console, you would have some pretty powerful features at your fingertips programmers could easily use.  With the Geneve Operating system source code as a base, the Math, Video, and Keyboard XOP's could be tweaked as is.  Probably wouldn't be much of a hurdle to even implement the memory management routines as well.

 

With just the 64K, you would really just be capped at the video and keyboard routines which may work out well for the F18A MK2.

 

Now, back to the topic of this thread.............

  • Thanks 1

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...

  • Recently Browsing   0 members

    No registered users viewing this page.

×
×
  • Create New...