Jump to content
TheBF

Machine Forth OMG

Recommended Posts

And more proof that it's best to remove code from the inner loop.

 

Just changing OVER + SWAP  to the optimized word BOUNDS dropped the time to 22 seconds.

DECIMAL
: FIB
     0 #  1 #
     ROT 0 #
     DO
       \ OVER + SWAP
       BOUNDS
     LOOP
     DROP
;

PROG: MAIN
     1000 # 0 #
     DO
        I FIB DROP
     LOOP
     NEXT,      \ Return to Forth
END.

The difference in the code inside the loop is:

 

OVER + SWAP 

       TOS DPUSH,  
2 (SP) TOS MOV,
  *SP+ TOS ADD,
    TOS R0 MOV,  
  *SP  TOS MOV,  
    R0 *SP MOV,

BOUNDS 

  *SP R1 MOV, 
 TOS *SP ADD, 
  R1 TOS MOV,

Chuck's machine Forth does not have SWAP. 

 

  • Like 2

Share this post


Link to post
Share on other sites

Tail Call Optimization Oversight

 

I have been working on making some solid VDP library routines in this weird language and when I started applying my fancy new tail call optimization I discovered something I completely missed.

 

Example Sub-routine with tail-call optimized

: PAGE   ( -- )  0 # 768 # 32 # VFILL -;

This code clears the screen in 32 column mode.  It is just a wrapper for VFILL and three parameters to fill VDP RAM at address 0 with 768,  ASCII 32 characters.

Notice we end with -;  and not the normal ;   

 

This is how you end a sub-routine when you know the last word is another sub-routine and so you can eliminate the "return from sub-routine" code.

 

-;  looks back in the code and  removes the normal BL @VFILL  and replaces it with  B @VFILL.  Still blows my mind that this works. :)

 

BUT!   every sub-routine in this system has a preamble of two instructions to save R11 on the return stack like this:

DECT R7
MOV R11,*R7

Since we are not "calling VFILL" but using Branch we don't need to make room on the stack and save R11. 

This means we don't want to enter at the normal entry address but 2 CELLS after.  (CELL is Forth speak for an integer memory location) 

 

So the definition for  -;   has been changed to the code below and it seems to work perfectly now. (at least until I find the next bug) :)

\ tail call optimizing semi-colon
H: -; ( --  )
    LOOKBACK   ( addr )   \ get entry address of sub-routine
    2 CELLS +  ( addr' )  \ move past sub-routine RPUSH instructions
    -04 TALLOT            \ erase BL @addr
    ( addr')  @@ B,       \ compile a branch to the NEW sub-routine
;H

Edit: 

The next thing I should do is look back before the sub-routine address and make sure there is BL instruction.

This could error out  if not or I could make tail-call removal automatic.

 

  • Like 2
  • Thanks 1

Share this post


Link to post
Share on other sites

How to choose the best addressing mode with a very "dumb" compiler

 

I have been struggling with keeping the Machine Forth as "normal" as possible because my old brain needs that.

So as mentioned I decided that data like variables and constants simple pushed themselves onto the data stack.

This keeps it normal but consumes three instructions for each argument. ( two for the push and one for LI instruction) 

9900 has that wonderful symbolic mode where you can move things memory to memory. So how best to write that when compiler is too stupid to figure it out?

Constants are a not issue because at the moment they only generate code when you use them in the program.

Variables are a real memory address in your program with the data held there. I made this word to retrieve the DATA field of a variable (the address) 

: [email protected]  ( <NAME> -- addr)  ' >BODY @  REL>TARG ;

It looks up the variable, fetches the contents and converts the address from relative to the real target address in the program image.

 

Then we can create 

: LET   [email protected]  ; \ get address of a variable
: :=    [email protected]  ( addr1 addr2) SWAP @@ ROT @@ MOV, ;

Using these words we can create:

VARIABLE X
VARIABLE Y

LET X := Y   \ :-))) 

LOL.  This compiles  to:

 

MOV @X,@Y 

 

Not very Forthy but it's the best I can come up with.   :) 

 

  • Like 4

Share this post


Link to post
Share on other sites

Changed my mind on syntax for symbolic addressing.

 

To stay more in the Forth syntax I am re-purposing ' (tick) when you are in the TARGET compiler mode. 

 

In machine Forth tick will just be an alias for [email protected] and return the data field address of a variable in the program image. 

This is the common factor.

 

EDIT:   Changed  :=   to '!    Seems more consistent.

: '!    ( addr1 addr2) SWAP @@ ROT @@ MOV, ;

 

When I want to do a memory to memory move  the syntax becomes:

VARIABLE X
VARIABLE Y

' X ' Y  '! 

 

It becomes very clear why compilers use types for data so the compiler can decide how to handle things. 

I will avoid going there for now to see what can be accomplished this way.

  • Like 4

Share this post


Link to post
Share on other sites
8 hours ago, TheBF said:

In machine Forth tick will just be an alias for [email protected] and return the data field address of a variable in the program image.

 

Aha! So back to FigForth’s definition of tick.

 

...lee

  • Like 1

Share this post


Link to post
Share on other sites
1 hour ago, Lee Stewart said:

 

Aha! So back to FigForth’s definition of tick.

 

...lee

lol.  Yes indeed. I hadn't thought of that.   :)

Great catch.  

 

This thing is remarkably hard to get everything working right.  As you know well, I have a unique ability to miss the details in my enthusiasm ... 🧐

@Tursi is my saviour because I can open 2 Classic99 windows. One has the compiler running in over-drive and the other opens the binary programs.

Two debugger windows lets me see the Code generated and size of code generated under different conditions and the other debugger lets me step through the code.

I can't imagine how much longer this would take on real iron.

 

I have written a simple VDP driver to test the optimizer and the tail-call optimizer.  It's starting to feel stable.  I found a bunch of minutia that I had not considered over the last few days.

I am trying to use Machine Forth rather than ALC when the code generated is the same but ALC gives nice things like accessing the 2nd item on the stack seamlessly so I use that stuff in the primitives where is simply better.

 

Here is what seems to be a solid VDP driver. It is light-weight  with no scroll or safety net and I have not done a VMBR yet. 

It compiles with 384 bytes.  The POP/PUSH optimizer has not impact because there are no big interfaces to other words.

However anything that ends with DROP will be optimized in a program that calls something that starts with DUP. 

I am slowly getting used to this coding style but it is not any easier however the speed it fun.

 

Using the tail-call optimization makes you try and organize things so that words end with another word and not inline code.  That's another new thing to think about.

 

Spoiler
\ VDPDRVR.FTH

TARGET                 \ Use TARGET wordlist (to compile code)

HEX
8800 EQU VDPRD
8802 EQU VDPSTS
8C00 EQU VDPWD
8C02 EQU VDPWA

: VDPA! ( Vaddr -- ) \ set vdp address (read mode)
          0 LIMI,
          TOS SWPB,
          TOS VDPWA @@ MOVB,
          TOS SWPB,
          TOS VDPWA @@ MOVB,
          DROP
;

: [email protected]    ( addr -- c)
          VDPA!
          VDPRD @@ TOS MOVB,
          TOS 8 SRL,
;

: VC! ( c Vaddr -- )  4000 #OR VDPA!  VDPWD #C!  ;

HEX
\ * VDP write to register. Kept the TI name
: VWTR   ( c reg -- ) \ Usage: 5 7 VWTR
           ><              \ register goes to left side
          *SP+ TOS ADD,
           8000 #OR VDPA!
           DROP
;

: VFILL ( Vaddr cnt char -- ) \ Mixing ASM & machine forth
           TOS SWPB,
           TOS R1 MOV,
           R0  DPOP,
           DROP               \ Vaddr now in TOS
           4000 #OR VDPA!     \ set the write address and DROP TOS. (clean up)
           R3 VDPWD LI,       \ VDP port faster in a register
           BEGIN
              R1 R3 ** MOVB,
              R0 DEC,
           -UNTIL
;

HEX
: TYPE+  ( addr len ) \ Common factor:
\ ** Set VDP address (VDPA!) which disables interrupts BEFORE use
         *SP+ AREG MOV,
          R3 8C00 LI,    \ 12% faster to use a register
          BEGIN
            *AREG+  R3 ** MOVB,
            1-
         -UNTIL
          DROP ;

: VWRITE ( addr Vaddr len -- ) SWAP 4000 #OR VDPA! TYPE+ -;

\ ____( Simple Screen I/O)_______
\ *There is no protection for running off the screen! :-)
DECIMAL
   VARIABLE C/L     \ characters per line
   VARIABLE COL
   VARIABLE ROW
32 CONSTANT BL      \ space character

HEX
: AT-XY  ( x y --) ' COL #!  ' ROW #! ;  \ 2 instructions MOV TOS,@COL etc.
: CR     ( -- ) ' C/L  ' ROW  '+!  ;     \ 1 instruction. ADD @C/L,@ROW

\ COMPUTE video screen address from ROW & COL
: VPOS   ( -- Vaddr)  ROW @  C/L @ *  COL @  +  ;

HEX
: TYPE   ( addr len -- ) VPOS  OVER  ' COL #+!  4000 #OR VDPA! TYPE+ -;
: EMIT   ( char -- ) VPOS  COL 1+!  VC!  -;
: SPACE  ( -- )      BL EMIT -;

DECIMAL
\ Graphic mode clear screen
: PAGE   ( -- )
     COL OFF   ROW OFF
     VPOS  768 # 32 # VFILL -;

 

 

Here is a little test program used to debug the VDP driver

\ hello world in machine Forth Demo     Sept 23 2021  Fox
\ compiles to 128 bytes

COMPILER              \ Use compiler wordlist (for interpreted words)
   NEW.
   HEX A000 ORIGIN.
   OPT-OFF

   INCLUDE DSK2.VDPDRVR

TARGET
\ --------
CREATE TXT  S" Hello World! " S,

PROG: MAIN
HEX
      8300 WORKSPACE    \ Fast ram for registers
      83BE RSTACK       \ and return stack
      83FC DSTACK       \ and Data stack
      17 # 7 # VWTR

DECIMAL
      32 # C/L !        \ init chars/line
      BEGIN
        PAGE
        CHAR 1 # DUP EMIT 1+ DUP EMIT 1+  EMIT

        3 # 0 # AT-XY
        9 # FOR
            TXT COUNT TYPE  SPACE
        NEXT

        13 # 13 # AT-XY
        TXT 1+ VPOS 5 # VWRITE
     AGAIN
END.

COMPILER SAVE DSK2.HELLO5

The video shows the program running.

 

 

If you have nothing else to do here is the current Compiler.

Spoiler
\ MACHINE FORTH COMPILER     Dec 14, 2020  Brian Fox, Kilworth Ontario
\ This file compiles with Camel99 Forth V2.6x in DSK1.

\ *** ADDED PENDING DROP   ?DROP   concept to optimizer ***

\ The Machine Forth disk should be in DSK2. 
\ To build the cross compiler your need Editor/Assembler Cartridge
\ 1.  Select menu option 5 "Run Program File"
\ 2.  At the prompt type: DSK1.CAMEL99 
\ 3.  When Camel Forth is loaded type:  INCLUDE DSK2.MACHFORTH
\ 4.  The machine forth compiler is ready when the black screen appears 

\ *To cross-compile a program
\  With the MACHFORTH ready type: INCLUDE <DSK2.MYPROGRAM>

\ 4.  See the demo programs provided for examples
\ 5.  After your program compiles type SAVE <DSK1.PROGNAME>
\              -OR-
\     You can place the SAVE command at the bottom of your
\     program to save it automatically after compiling.
\
\ 6.  Type FORTH BYE to return to TI-99 Title screen
\ 7.  Use E/A menu 5 and RUN your saved program.
\ ________________________________________________________

NEEDS LOAD-FILE FROM DSK1.LOADSAVE
NEEDS .S        FROM DSK1.TOOLS
NEEDS DEFER     FROM DSK1.DEFER
NEEDS ELAPSE    FROM DSK1.ELAPSE
NEEDS U.R       FROM DSK1.UDOTR

NEEDS FORTH     FROM DSK1.WORDLISTS
NEEDS ORG       FROM DSK2.MFORTHASM

CR .( Machine Forth special registers & modes...)
ONLY FORTH ALSO XASSEMBLER DEFINITIONS
4 CONSTANT TOS
: (TOS)  TOS () ;
: *TOS   TOS ** ;
: *TOS+  TOS *+ ;

6 CONSTANT SP
: (SP)   SP () ;
: *SP    SP ** ;
: *SP+   SP *+ ;

7 CONSTANT RP
: (RP)   RP () ;
: *RP    RP ** ;
: *RP+   RP *+ ;

8 CONSTANT I  \ loop index register
: (I)    I ()  ;
: *I     I **  ;
: *I+    I *+  ;

9 CONSTANT AREG  ( "address register")
: (AREG)   AREG () ;
: *AREG    AREG ** ;
: *AREG+   AREG *+ ;

: *R10      R10 ** ; \ Forth NEXT while testing. Scratch in binary program
: *R11      R11 ** ; \ sub-routine linkage

\ PUSH & POP macros for DATA stack
: DPUSH,   ( src -- )  SP DECT,  *SP  MOV, ;
: DPOP,    ( dst -- )  *SP+      SWAP MOV, ;

\ PUSH & POP macros for RETURN stack
: RPUSH,  ( src -- ) RP DECT,  *RP   MOV, ;
: RPOP,   ( dst -- ) *RP+      SWAP  MOV, ;

\ _________________________________________________
CR .( CROSS-COMPILER Name Spaces)
FORTH DEFINITIONS
VOCABULARY MFORTH     \ for mforth compiler words
VOCABULARY TARGETS   \ for words in the compiled program

: HOST     ONLY FORTH DEFINITIONS ;

: COMPILER
      ONLY FORTH ALSO XASSEMBLER
      ALSO MFORTH DEFINITIONS ;

: TARGET
      ONLY XASSEMBLER ALSO MFORTH
      ALSO TARGETS ALSO TARGETS DEFINITIONS ;


\ __________________[ pop/push optimizer ]____________________
COMPILER
HEX
 C136 CONSTANT 'DROP'  \ machine code for DROP
VARIABLE DROPS         \ count when DROP, is executed
                       \ means TOS has been refilled ie: in USE
VARIABLE DUPS          \ # of DUPS made
VARIABLE OPTS          \ # of optimizations made

: DUP,     ( n -- n n)  TOS DPUSH, DUPS 1+! ;  \ normal dup
: DROP,    TOS DPOP,  DROPS 1+! ; \ count each drop

: LOOKBACK ( -- u)  THERE 2- @ ; \ fetch previous instruction code

: OPT-DUP, ( n -- n ?n)   \ SMART dup
  LOOKBACK 'DROP' =     \ look back for DROP
  IF -2 TALLOT          \ move target dictionary back 1 cell
      OPTS 1+!          \ count the optimization
   ELSE
        DUP,            \ we must DUP,
   THEN ;

\ SMARTDUP,
\ At the end of many words there is a DROP to refill the TOS register
\ This is one 9900 instruction. (>C136)
\ SMARTDUP uses this instruction as a cue that TOS register is free to use
\ and entire DROP/DUP code can be omitted.
\ *Where possible end your code with DROP rather than putting it earlier
\ so that SMARTDUP can take advantage of it.

DEFER SMARTDUP,     ' DUP, IS SMARTDUP,   \ default is unoptimized

\ _______________________________________________
\ SWAP BYTE optimizer
: ?SWPB,  ( n -- n)
         LOOKBACK 0984 =   \ look back for "SRL R4,8"
         IF -2 TALLOT      \ remove SRL
         ELSE  TOS SWPB,   \ we need SWPB
         THEN
;

\ _______________________________________________
COMPILER
\ Rename HOST FORTH version of colon/semi-colon.
\ This let's us create Assembler macros as Forth words
\ and later to define target compiler colon & semi-colon.
: ;H    POSTPONE ;  ;  IMMEDIATE
: H:    :  ;H

\ tail call optimizing semi-colon
H: -; ( --  )
    LOOKBACK   ( addr )   \ get entry address of sub-routine
    2 CELLS +  ( addr' )  \ move past sub-routine RPUSH instructions
    -04 TALLOT            \ erase BL @addr
    ( addr')  @@ B,       \ compile a branch to the NEW sub-routine
;H

\ Here we steal Words from Forth so the Mforth COMPILER has them also, like
\ comments etc.  When we invoke TARGET we are cut-off from Forth VOCABULARY.
H: \    1 PARSE 2DROP ; IMMEDIATE
H: (    POSTPONE (  ;H  IMMEDIATE  ( borrowed from Forth )
H: COMPILER   COMPILER  ;H
H: HOST       HOST      ;H
H: TARGET     TARGET    ;
H: [HC]       HOST   ;H  IMMEDIATE
H: [TC]       TARGET ;H  IMMEDIATE
H: [CC]       COMPILER ;H IMMEDIATE
H: >R         >R ;H
H: R>         R> ;H
H: CR         CR ;H
H: HEX        HEX ;H
H: DECIMAL    DECIMAL ;H
H: SWAP       SWAP ;H
H: THERE      THERE ;H
H: INCLUDE    INCLUDE  ;H

\ COMPILER  COLON is the just the Host forth's colon
H: ;    POSTPONE ;  ;H  IMMEDIATE
H: :    :  ;H

\ Cross-compiler variables and constants
HOST HEX
VARIABLE LORG     2000 LORG !   \ LORG is TI-99 load address
VARIABLE XSTATE

       2000 CONSTANT CDATA  \ CODE compiles to this buffer
CDATA CELL+ CONSTANT 'BOOT  \ holds program boot address

\ compute diff. between actual load-address and image buffer
: OFFSET   ( -- n)  LORG @ CDATA -  ;
: REL>TARG  ( relocated -- target)  OFFSET - ;

\ run the program at target memory
CODE (RUN)  ( addr -- ) 0454 ,  NEXT,  ENDCODE \  *TOS B, DROP,

COMPILER
: RUN   ( addr --)
       DUP 3FFE 2004 WITHIN ABORT" ORG not in Low RAM"
      (RUN) DROP ;

\ cross-compiler directives
: NEW.
   1000  2000  0 VFILL     \ erase VDP RAM 8K block
   CDATA 2000 FF FILL      \ fill 8K program space with FFFF
   CDATA  ORG              \ program begins after the intial branch & address
   FFFF @@ B,              \ Compile 1ST instruction. branch to bad address
  ['] TARGETS >BODY OFF    \ erase words in TARGETS wordlist
   DROPS OFF  OPTS OFF
   DUPS  OFF
;

.( .)
: ORIGIN.   ( addr ) DUP 2000 U< ABORT" ORG < 2000"  LORG ! ;

: EVEN.   TDP @ ALIGNED TDP ! ;

\ target image fetch and store take relocated addresses and convert them
: T!      ( n relocated -- ) REL>TARG ! ;
: [email protected]      ( relocated -- n)  REL>TARG @ ;

: [email protected] ( <NAME> -- addr)  ' >BODY @  REL>TARG ;

\ NEEDS SAVE FROM DSK2.MFORTHSAVE
\ ______________________________________________________________
\ data structure creation
COMPILER
\ optimizing this DUP is hard. Many times you must free up the TOS.
: LIT,  ( n -- )
         DUP,  TOS SWAP LI,  ;

\ Label creator returns relocated TARGET address when executed
: L:    THERE REL>TARG  CONSTANT ;   ( this Host Forth CONSTANT)

: EQU   ( n -- <NAME>) CONSTANT ;  \ just give the compiler the value

: CONSTANT ( n -- n)  \ create the compiler's constant
      CREATE   ,              \ remember the value
      DOES> ( pfa )
           @  LIT, ; \ compile constant as a literal no.

\ *IDEA*  make a TAG variable with coded values for data types.
: CREATE ( -- addr)
        CREATE THERE ,     \ remember the target address
        DOES> @  REL>TARG  LIT,  ;   \ pushes address onto stack

: ADDR ( n -- n)  \ Special kind of constant goes into AREG
      CREATE   ,
      DOES> ( pfa ) @   AREG SWAP LI,  ;

\ mforth data structure words
: VARIABLE  ( -- addr) CREATE  0000 T, ;

: ALLOT    ( n --) TALLOT ;

\ compile text into the TARGET program image. Use THERE
: S, ( c-addr u -- ) THERE OVER 1+ TALLOT PLACE  EVEN. ;
: S"  ( -- addr len)  [CHAR] " PARSE ;

\ ______________________________________________________________
\ colon definitions
COMPILER
\ Machine Forth colon/semi-colon (creates nestable sub-routines)
: M:
\ compile time action
   CREATE  !CSP
           THERE ,       \ remember the relocated address
           R11 RPUSH,   \ compile "enter sub-routine" code

 \ Runtime action: (when we invoke this word)
         DOES> @  REL>TARG ( addr) @@ BL, ; \ fetch address, compile BL to address

: ;M      R11 RPOP,  RT,  ;  \ compile exit sub-routine code

\ ______________________________________________________________
\ Forth virtual machine setup directives
: WORKSPACE ( addr --) LWPI, ;       \ Forth workspace
: DSTACK    ( addr --) SP SWAP LI, ; \ data stack
: RSTACK    ( addr --) RP SWAP LI, ; \ return stack

: PROG: ( <label> -- taddr)
  CREATE !CSP
     THERE REL>TARG DUP ,  \ record program entry in this word and...
     'BOOT !               \ store relocated address in program header
     XSTATE ON
  DOES> @
;

: END.  ?CSP
        TDP @ 3FFF > ABORT" Prog > 8K"
        XSTATE OFF ;

\ _________________________________________________
HOST CR .( Structured branching and looping )
COMPILER
\ compile an offset byte into the 2nd byte of a JUMP instruction
: OFFSET, ( byte --)  2- 2/ SWAP 1+ C! ;
: <BACK   ( addr addr' -- ) TUCK -  OFFSET, ;

: -IF    ( -- $$)   THERE 0 JLE, ; \ goto THEN if TOS>0
: IF     ( -- $$)   THERE 0 JEQ, ; \ goto THEN if TOS=0
: THEN   ( addr --) THERE OVER -  OFFSET, ;
: ELSE   ( -- $$ )  THERE 0 JMP,  SWAP THEN ;

COMPILER
: BEGIN   THERE ;  \ REMEMBER: THERE is the relocated address
: WHILE    IF SWAP ;  \ loop while TOS <> 0
: -WHILE  -IF SWAP ;  \ loop while TOS = 0
: AGAIN  ( addr --) THERE  0 JMP, <BACK ;
: UNTIL  ( addr --) THERE  0 JEQ, <BACK ; \ jump back until TRUE
: -UNTIL ( addr --) THERE  0 JNE, <BACK ; \ jump back until FALSE
: REPEAT ( addr -- ) AGAIN THEN ;

\ ______________________________________________________________
HOST CR .( Forth Intrinics)
\ These inline code generators are in the actual compiler
COMPILER
HEX
\  Moore's Machine Forth + a few extras
: DROP  ( n -- )  DROP,  ;
: DUP             SMARTDUP,  ;

: OPT-OFF   ['] DUP,     IS SMARTDUP, ;
: OPT-ON    ['] OPT-DUP, IS SMARTDUP, ;

: #     ( n -- )  DUP  TOS SWAP LI, ;
: #@    ( addr -- n)    DUP  @@ TOS MOV, ;
: @     ( TOSaddr -- n) *TOS TOS MOV, ;
: (@)   ( ndx addr)     (TOS) TOS MOV, ;

\ added byte operations: BFox
: #[email protected]   ( addr -- c)   DUP  @@ TOS MOVB,  TOS 8 SRL, ;
: [email protected]    ( TOSaddr -- c) *TOS TOS MOVB,  TOS 8 SRL, ;

: #!    ( n variable -- ) TOS SWAP @@ MOV,  DROP ; \ 6 bytes
: !     ( n TOSaddr --)  *SP+ *TOS MOV, DROP ;

: #C!   ( c addr --) TOS SWPB,  TOS SWAP @@ MOVB, DROP ; \ symbolic addressing
: C!    ( c addr --) 1 (SP) *TOS MOVB,  SP INCT,  DROP ;

[CC] .( .)
: 2*    ( n -- n ) TOS TOS ADD, ;
: 2/    ( n -- n)  TOS 1 SRA, ;
: INVERT ( n -- )  TOS INV, ;  ( ANS Forth name )
: AND,  ( n mask -- n) *SP INV, *SP+ TOS SZC, ; \ and stack values
: #AND  ( n n -- n ) TOS SWAP ANDI, ;  \ and TOS with literal number
: XOR   ( n n -- n)  *SP+ TOS XOR, ;   \ (option 1)
: +     ( n n -- n)  *SP+ TOS ADD, ;   \ (option 1) add on stack
: #+    ( n n -- )    TOS SWAP AI, DROP ;  \ (option 2) TOS + literal number

\ Naming convention for A register has been changed so it
\ uses the features of the TMS9900 CPU. It's dumb not to.

\ These Forth conventions are kept: @ means fetch and ! means store.
\ '*' suffix indicates the source register (A I or R) are indirect
\ memory access.
\ The A register can also be post-incremented after fetching natively
\ We have added indexed addressing as well with (A)@

\ @-A Dpush(T) A=A-cell T=[A]
\ !-A A=A-cell [A]=T Dpop(T)

: [email protected]    ( -- n)  SMARTDUP,   AREG  TOS MOV, ;  \ Dpush(T) T=A
: *[email protected]   ( -- n)  SMARTDUP,  *AREG  TOS MOV, ;  \ Dpush(T) T=*A
: *[email protected]+  ( -- n)  SMARTDUP,  *AREG+ TOS MOV, ;  \ Dpush(T) T=*A  A=A+cell
: (A)@  ( u --)  SMARTDUP,  (AREG) TOS MOV, ;  \ Dpush(T) [email protected](A)

: #A!   ( addr --)  AREG SWAP LI, ; \ load A with literal number BF addition
: A!    ( addr -- ) TOS  AREG  MOV, DROP ;  \ A!   A=T  Dpop(T)
: *A!   ( addr)     TOS *AREG  MOV, DROP ;  \ !A  [A]=T Dpop(T)
: *A!+  ( n --)     TOS *AREG+ MOV, DROP ;  \ !A+ [A]=T A=A+cell Dpop(T)
: (A)!  ( n --)     TOS  SWAP (AREG) MOV, DROP ;

\ More optimal 9900 instructions for A register BFox
: A+!   ( n -- )  TOS AREG ADD,  DROP ;
: A1+!  ( -- )    AREG INC, ;
: A2+!  ( -- )    AREG INCT, ;
: A1-!  ( -- )    AREG DEC, ;
: A2-!  ( -- )    AREG DECT, ;

\ added byte operations. BFox
: *[email protected]   ( -- 0c00) SMARTDUP,  *AREG  TOS MOVB, TOS 8 SRL, ;
: *[email protected]+  ( -- 0c00) SMARTDUP,  *AREG+ TOS MOVB, TOS 8 SRL, ;
: *AC!   ( 0c00 --)  1 (TOS) *AREG MOVB,  DROP ;
: *AC!+  ( 0c00 --)  1 (TOS) *AREG+ MOVB, DROP ;

.( .)
\ return stack operators ........................
\ Names changed to better reflect 9900 Assembler
\ * Indicates indirect addressing mode
\ @ means fetch to TOS register
\ ! means store TOS register to Return stack
: [email protected]    ( -- n )  SMARTDUP, *RP TOS MOV, ;  \ G*Dpush(T) T=[R] C
\ post incrementing
: [email protected]+   ( -- n )  [email protected]  *RP INCT, ;   \ G*  Dpush(T) T=[R] R=R+cell
\ pre decrementing
: [email protected]   ( -- n )  [email protected]  *RP DECT, ;   \ G*  Dpush(T) R=R-cell T=[R]

: R!   ( n -- )  TOS *RP MOV,  DROP ;   \  *R->T  DROP(T)
\ post incrementing
: R!+  ( n -- )  R!   *RP INCT, ;      \ !R+ [R]=T R=R+cell Dpop(T)
\ pre decrementing
: -R!  ( n -- )  *RP DECT, R!   ;      \ R=R-cell [R]=T Dpop(T)

\ Single instructions on 9900. No brainer optimizing
: RDROP ( -- )    RP INCT, ;
: DUP>R ( n -- n) TOS *RP MOV, ;

: >R    ( n -- )  TOS *RP MOV,  DROP  ;        \ G*  Rpush(R) R=T Dpop(T)
: R>    ( -- n)   SMARTDUP,   *RP+ TOS MOV, ;  \ G*  Dpush(T) T=R Rpop(R)

\ For/next loop uses return stack for index. Counts down only.
\ Use A register for other indices.
: [email protected]   ( -- n)  SMARTDUP,  I TOS MOV, ;   \ [email protected] Dpush(T) T=I
: I!   ( n --)  TOS I MOV, DROP ;        \ I=T  Dpop(T)
: I+!  ( n --)  TOS I ADD, DROP ;        \ BFox word
: I1-! ( -- )   I DEC, ;

\ now we can define the FOR NEXT LOOP
: FOR  ( n --)  I RPUSH,   I!  BEGIN ;
: NEXT ( -- )   I1-!  THERE 0 JOC, <BACK  I RPOP, ;

: CALL  ( addr --)  *TOS BL, DROP ;
: RET     *R11 BL, ;

\ _____________[ Chuck Moore Machine Forth Ends ]____________

\ ********* ANS/ISO Forth, 9900 Specific operators *********
: OVER  ( n1 n2 -- n1 n2 n1) TOS DPUSH,  2 (SP) TOS MOV, ;
: NIP   ( n1 n2 -- n2)  SP INCT, ;

\ ___________________________________________________________________
\ __[ Unique TMS9900 primitives ]__
: 1+   ( n -- n')   TOS INC, ;
: 2+   ( n -- n')   TOS INCT, ;
: 1-   ( n -- n')   TOS DEC, ;
: 2-   ( n -- n')   TOS DECT, ;
: 4*   ( n -- n')   TOS 2 SLA, ;
: 8*   ( n -- n')   TOS 3 SLA, ;

\ X -> Y    takes var from TOS to symbolic address
\ Less optimized
: ->  ( var1 -- )   [email protected]  *TOS SWAP @@ MOV, DROP  ;  \ assign var1 to var2

\ Accessing MEM2MEM feature of 9900 CPU saves memory big-time
: '     [email protected]  ;
\ X @ Y !  is 10 instructions, 14 bytes
\ ' X ' Y '!  is 1 instruction, 6 bytes. WINNER!
: '!    ( addr1 addr2) SWAP @@ ROT @@ MOV, ;
: '+!   ( addr1 addr2) SWAP @@ ROT @@ ADD, ;

: LSHIFT ( n -- n)  TOS SWAP SLA, ;
: RSHIFT ( n -- n)  TOS SWAP SRL, ;

\ inc/dec variables with direct addressing
: 1+! ( addr -- ) *TOS  INC,  DROP ;
: 2+! ( addr -- ) *TOS INCT,  DROP ;
: 1-! ( addr -- ) *TOS  DEC,  DROP ;
: 2-! ( addr -- ) *TOS DECT,  DROP ;
: +!  ( n addr --) *SP+ *TOS ADD,  DROP ; \ 6 bytes  \ 5 # X +!
: #+! ( n [addr] -- )  TOS SWAP @@ ADD,  DROP ;

: NOP    ( -- )     0 JMP, ;
: ABS    ( n -- n') TOS ABS, ;
: NEGATE ( n --n')  TOS NEG, ;
: OR     ( w w -- w ) *SP+ TOS SOC, ;
: #OR    ( -- )    TOS SWAP ORI, ;

\ 2 and 3 instruction words for ANS/ISO compatibility
: -     ( n n -- n') *SP+ TOS SUB, TOS NEG, ;
: UM*   ( n n -- d)  *SP  TOS MPY, R5  *SP MOV, ;
: *     ( n n -- n)  *SP+ R3 MOV,  TOS R3 MPY, ;
: ALIGNED ( n -- n)   TOS INC,  TOS -2 ANDI, ;
: AND   ( w w -- w ) *SP INV,  *SP+ TOS SZC, ;
: [email protected]    ( addr -- d) 2 (TOS) DPUSH,   *TOS TOS MOV, ; \ fetch double integer
: EXIT  ( -- )       R11 RPOP,  RT, ;

\ _______________________________________________________________
\ hi-level Forth additions for people less pure than Chuck Moore
: NOT  ( -- )  R1 STST,  R1 2000 ANDI, ;  \ invert EQ flag status
: ON   ( variable --) *TOS SETO, DROP ;
: OFF  ( variable --) *TOS CLR,  DROP ;
: ><   ( n -- )  TOS SWPB, ;

: BOUNDS ( adr len -- adr2 adr1) *SP R1 MOV, TOS *SP ADD, R1 TOS MOV, ;
: SWAP   ( n1 n2 -- n2 n1)      TOS R0 MOV,  *SP  TOS MOV,  R0 *SP MOV, ;

\ CUT n characters from left side of STRING (addr,len)
: /STRING ( addr len n -- addr' len' ) TOS *SP SUB,  TOS 2 (SP) ADD,  DROP ;
: ?TERMINAL ( -- ?)  DUP  0020 @@ BL,  NOT ; \ TI-99 ROM sub-routine

: COUNT  ( Caddr -- addr len )
         DUP               \ make a copy of addr
        *SP INC,           \ inc. past the count byte
        *TOS TOS MOVB,     \ put [email protected] inline to save space
         TOS 8 SRL, ;

\ Rename target number compilers to normal forth names
: ,     T, ;
: C,    TC, ;

\ Last definitions make the TARGET compiler.
\ Alias mforth compiler colon/semi-colon words into TARGET vocabulary
\ so they look like ':' ';' by using Camel Forth's H: and ;H
\ Yes it's confusing. Welcome to Forth cross-compilers.
COMPILER
H: :    M:  ;H
H: ;   ;M   ;H

HOST NEEDS SAVE  FROM DSK2.SAVEIMG

HEX
: COLD  PAGE
       ." Machine Forth Compiler V2  BFox"
       E1 7 VWTR
       COMPILER
       ABORT  ;

COLD

 

 

 

  • Like 3

Share this post


Link to post
Share on other sites

One day, MACHFORTH will be a thing 

 

I have been doing some tail chasing with this project since I started out just trying to make it compile runnable code.

However now I want it to be a useable Forth programming system for TI-99. 

There is no standard way for cross-compiled Forth to operate so I am feeling my way along. 

 

I spent some time (ok a lot of time) re-thinking how to make this Machine Forth look more like normal Forth.

The problem is mostly about interpreting literal numbers correctly during interpretation and  during compilation.

In version 1, I used the # operator to compile the code to put a literal into the code.  I think Chuck Moore resorted to this as well.

(But it looks kind of ugly)

 

The solution was to create a "state" variable for the cross-compiler as Forth does internally. (DUH!) 

The thing that needed changing in the Forth REPL was the code that compiles a literal number.

Since Forth interprets and compiles it needs to either:

  1. Compile some code into the program image that safely puts a number onto the top of the Forth data stack when the program is running.
  2. Interpret the number which for Forth means put in onto the top of compiler's data stack and just leave it there.  

The determining control is the state variable.  In Camel99 Forth I have vectored the REPL  (read evaluate print loop) to use the LISPer's term.

So I re-wrote the REPL loop and changed one line that deals with compiling the literal number. 

VARIABLE XSTATE  \ CROSS-COMPILER state variable

: LIT,  ( n -- )
    XSTATE @ IF      \ IF we are compiling machine Forth code
       TOS DPUSH,    \ *SmartDUP does not work here.
       TOS SWAP LI,  \ compile n as a literal number into R4.
    THEN  ;

: <MFINTERP> ( i*x c-addr u -- j*x ) \ machforth compiler interpret loop
         'SOURCE 2!  >IN OFF
          BEGIN
              BL WORD DUP [email protected] ( -- addr len)
          WHILE
              FIND ?DUP
              IF ( it's a word)
                   1+ STATE @ 0= OR
                   IF   EXECUTE
                   ELSE COMPILE,
                   THEN
              ELSE ( it's a number)
                   COUNT NUMBER? ABORT" MF: Not recognized"
                   LIT,  ( This is the only real difference )
              THEN
              DEPTH 0< ABORT" MF: Compiler stack underflow"
          REPEAT
          DROP ;

I can vector the REPL loop from normal Forth numbers or  for cross-compiling numbers.

The place to do that is in the directives  COMPILER which needs normal Forth numbers and TARGET which is used when we are compiling to the "target" image. 

 

But of course that leads to a minor complication when writing Assembly language inside a colon definition because Forth assembler is interpreted. (No it really is) :) 

This is easily remedied with the [   ]   operators which turn off  and turn on the compiler like this:

: BYE
      [ 0 LIMI,
        83C4 @@ CLR,
        0000 @@ BLWP, ]
;

So that's not too hard to endure.

 

Another issue that came up is when I want to create Assembler macros but I think have a solution since I now have a command FORTH-REPL  so I can over-ride the MACHFORTH-REPL anytime I need to. 

 

So here is what a test program looks like now.

Spoiler
\ MACHFORTH DEMO #1B
\ On Classic99 debugger you will see R4 counting up and rolling over

\ This demo shows how to:
\  1. compile to 2000 origin
\  2. set NEW workspace and stacks in 16bit scratch-pad RAM
\  3. how to exit program to TI-99 title PAGE
\  4. save a finished program that can RUN from E/A Option 5
\  *PRESS FCTN 4 to return to title screen

COMPILER
   HEX
   NEW.
   2000 ORIGIN.

TARGET
: BYE ( -- )              \ make a sub-routine
      [ 0 LIMI,           \ use inline assembler
        83C4 @@ CLR,
        0000 @@ BLWP, ]
;

PROG: DEMO1B
\ assembler setup commands must run in the interpreter
    [  HEX          \ switch off compiler
       0 LIMI,
       8300 WORKSPACE
       83FE DSTACK
       83BE RSTACK
    ]               \ turn on the compiler
       0BEEF        \ >BEEF -> DATA stack
     [ DECIMAL ]    \ change radix in interpreter mode 
       BEGIN
          22 - 7 *   \ do some math with literal numbers 
         ?TERMINAL  \ test for Break key
       UNTIL        \ MACHFORTH UNTIL does not consume the stack parameter
       DROP         \ clean up the stack
       BYE          \ Go home

COMPILER
SAVE DSK2.DEMO1B

 

 

 

 

 

 

 

 

 

 

 

 

  • Like 2

Share this post


Link to post
Share on other sites

Here is an example where you really see the impact of the extra work the 9900 has to do to maintain a stack.

Using the stack to transfer a register to a variable takes a lot of extra time using the Forth stack.

 

Having inline Assembler is the solution is so handy.

 

Spoiler
\ MACHFORTH DEMO #1E   FOR NEXT loop index test

\ I is R8 renamed, used as the loop index for FOR NEXT loops
\ [email protected]  (I fetch) pushes R8 (loop index) onto Forth data stack

COMPILER
   HEX
   NEW.
   2000 ORIGIN.

   VARIABLE X

TARGET
\ FORTH fetch and store 5 seconds
\ ASM   using 9900 mem2mem  1.5 seconds
PROG: DEMO1E
        FFFF FOR
\          [email protected]  X !           \ loop index -> X
         [ I  X @@ MOV, ]    \ Assembler  R8 -> X
        NEXT

      NEXT,      \ Return to Forth
END.

 

 

 

 

  • Like 3

Share this post


Link to post
Share on other sites

The code is beginning to look a little more like normal Forth.

It still needs a few magic incantations but it's getting closer to normal Forth

(if there is such a thing)  :) 

\ fibonacci benchmark in Camel Forth

COMPILER \ Set up environment
   NEW.
   HEX 2000 ORIGIN.
   OPT-OFF

\ Machine Forth does not have DO LOOP so we use a library
INCLUDE DSK2.DOLOOP

\ Machine Forth doesn't normally have ROTate, we have to create one.
TARGET
: ROT  ( n1 n2 n3 --  n2 n3 n1)
     [  2 (SP)   R0 MOV,
       *SP   2 (SP) MOV,
        TOS     *SP MOV,
        R0      TOS MOV, ]
;

: FIB ( n -- )
     0  1
     ROT 0
     DO
        OVER + SWAP
     LOOP
     DROP
;

\ 42.6 seconds using Expansion RAM stacks
\ 35.7 SECONDS using FAST RAM stacks
\ 174 bytes
PROG: MAIN
\ Assembler instructions and directives must be interpreted
[   0 LIMI,
HEX 8300 WORKSPACE
    83EA RSTACK
    8400 DSTACK

DECIMAL
]  \ turn compiler back on
     1000 0
     DO
       I FIB DROP
     LOOP
     [ 0 @@ BLWP, ]
END.

COMPILER
SAVE DSK2.FIBONACCI

 

  • Like 2
  • Thanks 2

Share this post


Link to post
Share on other sites

Out of curiosity I wanted to see if I could write a VDP driver in MachForth using mostly MachForth.

It makes sense to use ALC inside loops because it can really slow things down with the extra stack instructions.

 

Some of the code is similar to what you would hand code yourself.

Like this:

: VDPA! ( Vaddr -- )   \ set vdp address (read mode)
        [ 0 LIMI, ]
        VDPWA (C!)
        VDPWA (C!)
        DROP ;

Compiled to this:

2004  0647  dect R7           * enter nested sub-routine   
2006  C5CB  mov  R11,*R7        

2008  0300  limi >0000        
200C  06C4  swpb R4           
200E  D804  movb R4,@>8c02             
2012  06C4  swpb R4                    
2014  D804  movb R4,@>8c02             
2018  C136  mov  *R6+,R4      * DROP (refill R4 from stack memory)

201A  C2F7  mov  *R7+,R11     * exit nested sub-routine          
201C  045B  b    *R11              

Where things are not as efficient are when we have the arguments in the wrong order for optimal Forth.

I opted to use >R and R>  for these cases but registers would be faster and arguably I could just change the argument order for these functions but that would confuse me too much.

 

We can look at  VWRITE which has the least Rstack pushing and popping.

: VWRITE ( addr Vaddr len -- )
          >R
          4000 OR VDPA!
          R> 1- FOR   [ *TOS+ VDPWD @@ MOVB, ]  NEXT
        [ 2 LIMI, ]
          DROP ;
\ VWRITE 
20EE  0647  dect R7          * enter nested sub-routine  
20F0  C5CB  mov  R11,*R7    

20F2  0647  dect R7          * >R
20F4  C5C4  mov  R4,*R7      
20F6  C136  mov  *R6+,R4     * DROP
20F8  0646  dect R6          * DUP 
20FA  C584  mov  R4,*R6                
20FC  0204  li   R4,>4000    * LITERAL NO.          
2100  E136  soc  *R6+,R4     * 4000 OR    
2102  06A0  bl   @>2004      * CALL VDPA!
2106  0646  dect R6          * DUP  
2108  C584  mov  R4,*R6      
210A  C137  mov  *R7+,R4     * R> 
210C  0604  dec  R4          * 1-
210E  0647  dect R7          * FOR  
2110  C5C8  mov  R8,*R7      * save loop counter register to rstack           
2112  C204  mov  R4,R8       * Load loop counter from data stack           
2114  C136  mov  *R6+,R4     * DROP           
2116  D834  movb *R4+,@>8c00 * write byte to VDP port           
211A  0608  dec  R8          * dec loop counter 
211C  18FC  joc  >2116       * NEXT           
211E  C237  mov  *R7+,R8     * restore Loop counter register           
2120  0300  limi >0002                 
2124  C136  mov  *R6+,R4               

2126  C2F7  mov  *R7+,R11              
2128  045B  b    *R11        * exit nested sub-routine                  

Clearly this can be optimized. I can see one spot where my simple optimizer would remove  6 bytes at >20F6.

But it is still reasonably quick however I will re-write these in tighter ALC.

This was interesting to try.

 

The entire driver and the test program.

Spoiler
\ VDP DRIVER in MACHFORTH

COMPILER  \ Preamble
  NEW.
  HEX 2000 ORIGIN.

\ TINYVDP.FTH library for MACHFORTH V2.1
COMPILER
HEX
8800 EQU VDPRD
8802 EQU VDPSTS
8C00 EQU VDPWD
8C02 EQU VDPWA

TARGET
: VDPA! ( Vaddr -- )   \ set vdp address (read mode)
        [ 0 LIMI, ]
        VDPWA (C!)
        VDPWA (C!)
        DROP ;

: [email protected]    ( addr -- )
          VDPA!
          VDPRD [email protected]
        [ 2 LIMI, ]  ;

: VC! ( c Vaddr -- )
        [ 4000 ] #OR VDPA!
          VDPWD (C!)
        [ 2 LIMI, ]
          DROP  ;

HEX
\ * VDP write to register. Kept the TI name
: VWTR   ( c reg -- )   \ Usage: 5 7 VWTR
           ><           \ TOS SWPB,
           +            \ SP+ TOS ADD,
          [ 8000 ] #OR VDPA!
          [ 2 LIMI, ] ;

: VFILL ( Vaddr cnt char -- )
           >< SWAP >R >R  ( r: cnt char)
         [ 4000 ] #OR VDPA!
           R> ( -- char)
           R> 1- FOR  [ TOS VDPWD @@ MOVB, ]  NEXT
         [ 2 LIMI, ]
           DROP ;

: VREAD ( Vaddr addr n --)
           >R >R  ( r: n addr)
           VDPA!
           R> R>  ( -- addr n)
           1- FOR  [ VDPRD @@  *TOS+ MOVB, ]  NEXT
         [ 2 LIMI, ]
           DROP ;

: VWRITE ( addr Vaddr len -- )
          >R
          4000 OR VDPA!
          R> 1- FOR   [ *TOS+ VDPWD @@ MOVB, ]  NEXT
        [ 2 LIMI, ]
          DROP ;

: PAGE    0 3C0 BL VFILL ;

CREATE A$  S" VDP Routines in MachForth" S,

PROG: DEMOVDP
       PAGE
       18 7 VWTR
       A$ COUNT 0 SWAP VWRITE
       0 3000 20 VREAD  \ read VDP screen in VDP RAM 
       NEXT,            \ return to Camel99 Forth
END.

 

 

  • Like 2

Share this post


Link to post
Share on other sites
3 hours ago, GDMike said:

Good, I'll use this sometime

My hope is that somebody else might be able to use it sometime.

What has amazed me is how you can go around in circles when you are creating a language. There is always another way to implement things that might be better. 

So you make a change. Then all the library code is wrong. So you fix that and realize you didn't think about some other thing and around you go again...

 

This is starting to feel stable now so I have two big todos:

1. Write a manual

2. Figure out how to save a complete binary program that uses wordlists/vocabularies. 

(I was going to procrastinate on that problem, but I put it off) :)

 

  • Like 2

Share this post


Link to post
Share on other sites

.. and I found there are ways of going the LONG WAY.. but I learn a bit along that way.

Bwhaha haha I'll keep at it, maybe I'll end up going backwards..

  • Like 2

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...

  • Recently Browsing   0 members

    No registered users viewing this page.

×
×
  • Create New...