Machine Forth OMG

+Lee Stewart · March 11, 2017

So although machine Forth is nice to play with I am still longing to create a threaded system that uses the BL instruction based calling system with the return stack keeping old R11 values.

I know it will get bigger than ITC so what about "indirect sub-routine threading" Where we keep the Forth instruction pointer but DOCOL is looks more like:

( I am making this up here)

MOV R11 *RP

MOV *IP+ W

BL *W

I don't have the details clear in my head but does it inspire any thoughts from you guys?

BF

Here is fbForth’s inner interpreter (sans DODOES):

DOCOL DECT R

MOV IP,*R

MOV W,IP

$NEXT MOV *IP+,W

DOEXEC MOV *W+,R1

B *R1

$SEMIS MOV *R+,IP

MOV *IP+,W

MOV *W+,R1

B *R1

Obviously, this is indirect threading, but I show it for comparison because it is going to take me awhile to wrap my head around direct threading and machine Forth (mainly, because I have never thought about them seriously). At the risk of stating the obvious, B *NEXT (NEXT contains $NEXT most of the time) ends ALC bodies and the cfa of ; ends all : word bodies, causing execution to continue at $SEMIS. Pardon any ineptitude in my attempts to understand what you are doing. I promise to try harder.

...lee

+TheBF · March 11, 2017

Same here—though I am not sure what you mean by “ms timing value”. If you mean the 4.096 value, that was just >1000/1000 = 4096/1000 = 4.096, which should be the comparison factor for how Mark first ran the program in TF.

...lee

Ahh... 4096 = 2^12 DUH

BF

+TheBF · March 11, 2017

I think direct threaded might be better suited to the 9900:

Let R0 be instruction pointer. So, you'd have:
next:
    b *r0+
That's it.

High level and low-level Forth words would appear the same. No difference. You'd still have DOCOL and EXIT but they would just be the addresses of the routines, just like DUP etc.

There would be no CFA. The first cell of a definition is the address (direct address) of some executable code.

So direct threading would add 2 bytes to every header for a Branch instruction vs ITC right?

That is definitely smaller than push R11 and BL in the header.

I should be able to coax the cross-compiler to do that once I get my head around it...

And why R0 for IP versus any other register?

BF

+TheBF · March 11, 2017

Here is fbForth’s inner interpreter (sans DODOES):

DOCOL DECT R

MOV IP,*R

MOV W,IP

$NEXT MOV *IP+,W

DOEXEC MOV *W+,R1

B *R1

$SEMIS MOV *R+,IP

MOV *IP+,W

MOV *W+,R1

B *R1

Obviously, this is indirect threading, but I show it for comparison because it is going to take me awhile to wrap my head around direct threading and machine Forth (mainly, because I have never thought about them seriously). At the risk of stating the obvious, B *NEXT (NEXT contains $NEXT most of the time) ends ALC bodies and the cfa of ; ends all : word bodies, causing execution to continue at $SEMIS. Pardon any ineptitude in my attempts to understand what you are doing. I promise to try harder.

...lee

I am playing games really. I am creating assembler macros to simulate Forth words and tieing it all together with yet another macro called "CALL"

and ending each sub-routine with RT, . I was just amazed at how much faster native code goes.

CALL    RP DECT
        MOV R11 *RP
        BL  @> "some code"
        MOV *RP+ R11

The fun thing is that the Forth Assembler is so flexible that you can do this with little effort.

The other piece that you need is a target compiler.

It works just like the Forth compiler.

And for your interest a basic target compiler starts like this:

VARIABLE TARGMEM   64k ALLOT
VARIABLE TDP          \ target dictionary pointer

: THERE    ( -- addr)    TARGMEM TDP @ + ;  \  Target memory so not HERE but THERE :-)
: TALLOT   ( u -- )      TDP +!  ;
: T!       ( n addr -- ) TARGMEM +  ! ;
: TC!      ( n addr -- ) TARGMEM + C! ;
: T,       ( n addr -- ) THERE  !   2 TALLOT ;  \ like "comma"
: TC,      ( c addr -- ) THERE  C!  1 TALLOT ;  \ like C,

That's about it. That's the basic framework for a target compiler.

Rewrite the basic parts of Forth but use T, and TC, and you have a system in a new memory space.

*edit* Actually re-write your FORTH assembler first, and using T, and TC, etc...

At MPE they name these things like this: ,(t) C,(t) HERE(t) ALLOT(t)

Pretty, but more typing.

B

Edited March 11, 2017 by TheBF

Willsy · March 11, 2017

Obviously, this is indirect threading, but I show it for comparison because it is going to take me awhile to wrap my head around direct threading and machine Forth (mainly, because I have never thought about them seriously).

All explained here, Lee: http://www.bradrodriguez.com/papers/moving1.htm

+TheBF · March 15, 2017

I spent some time reworking the my cross-compiler to compile the little code fragments into the word headers (B DOVAR B DOCOL etc) for DIRECT threaded code, but it's not running yet.

The Kernel compiles and loads but in the debugger I can see that I am not pulling the correct addresses yet as the code travels through next.

Which is only 2 instructions now! It did make the kernel with tools installed ,go from 7022 bytes to 7124 (or so),

I am away from home at the moment so I don't have the exact numbers in front of me.

Thanks Willsy, now I have ANOTHER project to complete.

But thanks for real because it forced me to re-organize the cross-compiler code.

QUESTION:

This cross compiler is running on DOS because I am using my almost ANS Forth update of HsForth for DOS.

If I was to port this for general usage should I go with GForth or Win32Forth or just release it to the world for DOSBOX?

BF

Edited March 15, 2017 by TheBF

Opry99er · March 15, 2017

GForth is cool...

DOSBox is cooler.

Willsy · March 15, 2017

I'd say VFX Forth. There's a free version.

+TheBF · March 16, 2017

LOL!

So the answer is ALL of the above + VFX. OK. Sounds like for some, I could release it for DOSBOX first.

(VFX is pretty amazing. I always liked MPE products but they are bleeding edge of Forth these days)

BF

Willsy · March 16, 2017

Well it's a cross-compiler, right? Runs on a PC? In that case I'd target MPE's free version of VFX and only write it once ;-)

Willsy · March 16, 2017

There again, if it's written in ANS Forth then it *should* run on all of the big-name compilers.

*cough cough* :-)

+TheBF · March 22, 2017

There again, if it's written in ANS Forth then it *should* run on all of the big-name compilers.

*cough cough* :-)

Thanks be to G_d we are not on comp.lang.forth :-)))

Battle plans would be in progress.

B

D-Type · May 1, 2020

What is the state of the art for '99 Forth cross compilers regarding macro inlining of small code words?

I was thinking about how to improve the speed of my 6809 Vectrex/Camel Forth and came to a similar conclusion as this thread i.e. instead of rewriting the compiler as STC (not enough time, never going to happen) I could make it make it STC-ish by inclining code and reducing the call overhead.

(I remembered the inlining thread that came after this one and searched for it, but first came across this thread - will reread the inlining thread next. Simple inlining was actually what I was thinking about using initially, but of course the mind wanders...)

+TheBF · May 1, 2020

1 hour ago, D-Type said:

What is the state of the art for '99 Forth cross compilers regarding macro inlining of small code words?

I was thinking about how to improve the speed of my 6809 Vectrex/Camel Forth and came to a similar conclusion as this thread i.e. instead of rewriting the compiler as STC (not enough time, never going to happen) I could make it make it STC-ish by inclining code and reducing the call overhead.

(I remembered the inlining thread that came after this one and searched for it, but first came across this thread - will reread the inlining thread next. Simple inlining was actually what I was thinking about using initially, but of course the mind wanders...)

In-lining Assembly language in indirect threaded code is pretty space wasteful. I wrote a way to do it and here is how it goes.

You need to create the indirect links to move from threaded code to native machine code. That adds 4 bytes to enter the code.

After the machine runs the instruction pointer has moved forward but the Forth interpreter doesn't know about it so you have to move Forth IP register forward to which I did by compiling more machine code. That added 4 more bytes. Then you compile a way to run the Forth inner interpreter which in my case is another 2 bytes.

So all together you add 10 bytes just to enter and exit machine code from within ITC Forth code.

Not great.

\ put inline ASM in colon definitions
: ASM[
           HERE CELL+ ,            \ compile a pointer to the next cell
           HERE CELL+ ,            \ which is the CFA of the inline code
           [  ;  IMMEDIATE         \ switch to interpreter mode

: ]ASM     0209 ,  HERE 2 CELLS + , \ macro:  LI R9,HERE+4
                                    \ moves Forth IP reg.)
           NEXT,
           ] ;   IMMEDIATE          \ switch ON compiler

Better to just write some code words because at least they are re-useable elsewhere.

However for 1 use the size of the dictionary entry is probably bigger than using ASM[ ]ASM so … maybe it's useful.

Now in a sub-routine threaded system it is beautiful. In fact on some processors like the 9900 the Forth instruction and machine code are the same size as the CALL <ADDRESS> combination.

So for STC systems it's best to inline most of the Forth intrinsic instructions. ( + - @ ! etc.)

D-Type · May 1, 2020

Thanks for the explanation. It all seems so simple, but I don't yet understand the inner workings enough to really judge ?

10 bytes doesn't really seem much of an overhead...does anyone care about memory usage these days? Maybe it's a problem on the '99, I know it has some strange architectural challenges, maybe that's one of them.

I read also the Inlining thread, it wasn't how I remembered it, but it's food for thought for my own Vectrex future enhancements!

Currently I'm working on interfacing the Vectrex BIOS routines from Forth i.e. creating an API. Nothing public yet, but I'll be putting v1 on Github eventually. It actually already is on Github, but Private.

+TheBF · May 1, 2020

Typically the ASM code you need is not very big so 10 bytes may or may not be important. Speed/size. It's the classic trade-off.

BTW on 6809 is might be smaller because of 8 bit op-codes in some cases. However the ASM[ will be pretty much the same for any ITC Forth. (I think)

+Lee Stewart · May 2, 2020

7 hours ago, TheBF said:

In-lining Assembly language in indirect threaded code is pretty space wasteful. I wrote a way to do it and here is how it goes.

<snip>

So all together you add 10 bytes just to enter and exit machine code from within ITC Forth code.

Not great.

I might need to translate that to fbForth. I suspect it will cost me more memory real estate, however—what with vocabularies and all. |:)

...lee

+TheBF · May 2, 2020

It's a neat little hack. I think you just need the macro to use your IP register instead of R9 to make it work. ??

+TheBF · May 2, 2020

16 hours ago, Lee Stewart said:

I might need to translate that to fbForth. I suspect it will cost me more memory real estate, however—what with vocabularies and all.

...lee

In case you haven't tried this yet, I was mistaken. Very sorry.

The only time I used these two words was in a bigger program. It worked fine.

The concept is valid but I think it needs to separate the '[' ']' words to operate on their own. More testing needed to use it independently.

Here is the original usage.

\ inline.fth  a simple speedup for ITC FORTH July 2017  B Fox

\ Premis:
\ An indirect threaded code (ITC) system can spend up to 50% of its time 
\ running the Forth thread interperpreter, typically called NEXT.
\ The ITC NEXT routine is three instructions on the TMS9900.
\ The Forth Kernel contains many words called primitives, that are coded
\ in Assembler.
\ Many of these primitives are only 1 or 2 instructions.
\ INLINE[ ... ] copies the code from a primitive and compiles it in a new 
\ definition but removes the call to NEXT at the end of each primitive.
\ This can double the speed of chains of CODE words.

\ **not portable Forth code**  Uses TMS9900/CAMEL99 CARNAL Knowledge

\ INCLUDE DSK1.CODE

HEX
\ TEST for CODE word
\ CFA of a code word contains the address of the next cell
: ?CODE ( cfa -- ) DUP @ 2- - ABORT" Not code word" ;

\ scan MACHINE code looking for the NEXT, routine.
\ abort if NEXT is not found after 256 bytes. This is an arbitrary size
\ but most Forth code words are much smaller than 256 bytes.
: TONEXT ( adr --  adr2 )
           0                \ flag that falls thru if we don't succeed
           SWAP
          ( ADR) 80         \ max length of code word is $80 CELLS
           BOUNDS
           DO
             I @  045A   =   \ test each CELL for CAMEL99 NEXT (B *R10)
             IF   DROP I LEAVE
             THEN
           2 +LOOP
           DUP 0= ABORT" can't find NEXT" ;

\ : RANGE  ( cfa -- addr cnt )
\         >BODY DUP TONEXT OVER  -  ;  \ calc.  start and length of code

\ put inline ASM in colon definitions
: ASM[
           HERE CELL+ ,            \ compile a pointer to the next cell
           HERE CELL+ ,            \ which is the CFA of the inline code
           [  ;  IMMEDIATE         \ switch to interpreter mode

: ]ASM     0209 ,  HERE 2 CELLS + , \ macro:  LI R9,HERE+4
                                    \ moves Forth IP reg.)
           NEXT,
           ] ;    IMMEDIATE          \ switch ON compiler

\ create code words using primitives
: CODE[    BEGIN
             BL PARSE-WORD PAD PLACE
             PAD CHAR+ C@ [CHAR] ] <>
           WHILE
             PAD FIND 0= ABORT" not found"
             DUP ?CODE
             >BODY DUP TONEXT OVER  -     \ calc. start and len. of code
             HERE OVER ALLOT SWAP CMOVE   \ transcribe the code to HERE
           REPEAT ; IMMEDIATE

\ embed  a literal number as machine code  *HUGE* 8 bytes!!
\ equivalent: TOS PUSH,  LI TOS ( n ) , ;
: :ARG   ( n -- ) 0646 , C584 , 0204 , ( n) ,  ;  IMMEDIATE

\ compile primitives inline inside a colon definition
: INLINE[
           POSTPONE ASM[
           POSTPONE CODE[
           POSTPONE ]ASM ;  IMMEDIATE
\ ===================================


\ EXAMPLES
\ CODE 1+!  ASM[ *TOS INC,  TOS POP, ]ASM  NEXT,

\ CREATE Q  20 ALLOT
\ CODE ]Q    CODE[ 2* ]  Q :ARG  CODE[ + ] NEXT, END-CODE

 : *+       INLINE[ * + ]    ;

 : DUPC@    INLINE[ DUP C@ ] ;
\ : DUP>R    INLINE[ DUP >R ] ;
 : ^2       INLINE[ DUP *  ] ;

+Lee Stewart · May 2, 2020

14 hours ago, TheBF said:

It's a neat little hack. I think you just need the macro to use your IP register instead of R9 to make it work. ??

Unfortunately, fbForth (as does TI Forth) requires a vocabulary change to ASSEMBLER to expose the Forth Assembler and a return to the FORTH vocabulary when done with in-lining the ALC. ASM: and ;ASM do this for defining words, but invoking the ASSEMBLER vocabulary would need to be added to ASM[ to enable the Assembler words. ]ASM already covers the return to the FORTH vocabulary because it contains NEXT, , which does that for both fbForth and TI Forth.

...lee

+Lee Stewart · May 2, 2020

5 minutes ago, TheBF said:

In case you haven't tried this yet, I was mistaken. Very sorry.

The only time I used these two words was in a bigger program. It worked fine.

The concept is valid but I think it needs to separate the '[' ']' words to operate on their own. More testing needed to use it independently.

Probably only need to [COMPILE] [ (immediate) but not ] (not immediate). The following works in fbForth:

HEX
: ASM[   \ Begin Forth Assembly Code within high-level Forth
   HERE 2+ ,            \ compile a pointer to the next cell
   HERE 2+ ,            \ which is the CFA of the inline code
   [COMPILE] [          \ switch to interpreter mode  
   [COMPILE] ASSEMBLER  \ switch to ASSEMBLER vocabulary
;  IMMEDIATE

: ]ASM   \ Back to high-level Forth
   020D , HERE 4 + ,    \ LI R13,HERE+4  (move Forth IP to after NEXT,)
   ASSEMBLER NEXT,      \ NEXT, in ASSEMBLER vocabulary in kernel
   FORTH ]              \ back to FORTH vocabulary and switch ON compiler
;    IMMEDIATE
DECIMAL

...lee

+TheBF · April 6, 2021

MACHFORTH: A 9900 Machine Forth Almost Ready for prime-time:

I started this project a while back but in light of work needed on my core kernel I had put it on the sidelines.

For anyone who might be interested, Chuck Moore, the author of Forth, created this type of compiler when he began designing his own CPUs.

Machine Forth uses a regular Forth interpreter to translate little code macros that become the keywords of the Forth language

Machine Forth compiles the code into a separate memory chunk as an executable program.

There are no headers or text labels in the code. It is a genuine binary program.

Features

In my implementation you can save your finished program as an E/A5 program.

Another interesting feature is that you can RUN the machine code program then return to Forth where you can dump memory or interrogate or alter variables and run it again.

You can also freely use ASM instructions (RPN) inside a program if you know Forth's register usage.

* If this is of any interested to someone else I will write up some instructions and put together a package.

I have no delusions about Forth's popularity.

I have added the unique instruction features of the 9900 to the language in an equivalent Forth syntax. Of course you are free to add anything you want. It's Forth!

I have also taken the approach of some ideas from the Forth Programmer's Handbook (Conklin&Rather) and MACHFORTH more like a Forth Cross-compiler with separate name-spaces for different commands.

The system is still green but it does work. I have to review the library code I have for VDP I/O and I need to write some file I/O.

Chuck is wild minimalist so below is something like what his entire compiler would look like for the 9900, written with RPN assembler.

Chucks versions for his CPUs were all machine code so he didn't even need to load an assembler.

Of course he memorized all the machine instructions thus the name.

MACHFORTH is significantly bigger than this version below.

\ compile an offset byte into the 2nd byte of a JUMP instruction
: OFFSET, ( byte --)  2- 2/ SWAP 1+ C! ;
: <BACK   ( addr addr' -- ) TUCK -  OFFSET, ;

\ I have expanded IF to make use of the 9900 comparisons
\ **USE Assembler's THERE  which is not relocated
: -IF   ( -- $$ )  THERE 0 JLE, ; \ goto THEN if TOS>0
: IF               THERE 0 JEQ, ; \ goto THEN if TOS=0
: THEN   ( addr --) THERE OVER -  OFFSET, ;
: ELSE   ( -- $$ )  THERE 0 JMP,  SWAP THEN ;

COMPILER
: BEGIN   THERE ;  \ REMEMBER: THERE is the relocated address
: WHILE    IF SWAP ;  \ loop while TOS <> 0
: -WHILE  -IF SWAP ;  \ loop while TOS = 0
: AGAIN  ( addr --) THERE  0 JMP, <BACK ;
: UNTIL  ( addr --) THERE  0 JEQ, <BACK ; \ jump back until TRUE
: -UNTIL ( addr --) THERE  0 JNE, <BACK ; \ jump back until FALSE
: REPEAT ( addr -- ) AGAIN THEN ;

\ ______________________________________________________________
HOST CR .( Forth Intrinics)
\ These inline code generators are in the COMPILER name space
COMPILER
HEX
\  Moore's Machine Forth + a few extras
: -;    ( addr --)  @@ B, ;   \ Jump to word ie: tail call optimization
: DUP   ( n -- n n)  TOS PUSH, ;
: DROP  ( n -- )  *SP+ TOS MOV, ;
: #     ( n -- ) LIT, ;  \ 8 bytes w/optimizer off

: #FOR  ( n --) R15 RPUSH,  R15 SWAP  LI,  BEGIN ;  \ 8 bytes
: FOR           R15 RPUSH,  TOS R15 MOV,  TOS POP,  BEGIN ; \ 6 bytes
: NEXT ( -- )  R15 DEC,  THERE 0 JOC, <BACK R15 RPOP, ; \ 6 bytes
: I    ( -- n) TOS PUSH,  R15 TOS MOV, ;

: #@    ( addr -- n) TOS PUSH,   @@ TOS MOV, ;
: @     ( TOSaddr -- n) *TOS TOS MOV, ;
: #C@   ( addr -- c) TOS PUSH,   @@ TOS MOVB,  TOS 8 SRL, ;
: C@    ( TOSaddr -- c) *TOS TOS MOVB,  TOS 8 SRL, ;

: #!    ( n variable -- ) TOS SWAP @@ MOV,  DROP ; \ 6 bytes
: !     ( n TOSaddr --)  *SP+ TOS MOV, DROP ;
\ Additions for 9900. Needs to go here before SWAP is re-defined
: #C!   ( addr --)  TOS SWPB, TOS SWAP @@ MOVB, DROP ;
: C!    ( c TOSaddr --) 1 (SP) *TOS MOVB, SP INCT, DROP ;
[CC] .( .)
: 2*    ( n -- n ) TOS 1 SLA, ;
: 2/    ( n -- n)  TOS 1 SRA, ;

: INVERT ( n -- )  TOS INV, ;  ( ANS Forth name )
: AND,  ( n mask -- n) *SP INV, *SP+ TOS SZC, ; \ and stack values
: #AND  ( n n -- n ) TOS SWAP ANDI, ;  \ and TOS with literal number
: XOR   ( n n -- n)  *SP+ TOS XOR, ;   \ (option 1)
: +     ( n n -- n)  *SP+ TOS ADD, ;  \ (option 1) add on stack
: #+    ( n n -- )    TOS SWAP AI, ;  \ (option 2) TOS + literal number

\ return stack operators
: >R    ( n -- ) TOS RPUSH, TOS POP, ;   \ PUSH in original Machine Forth
: R>    ( -- n)  TOS PUSH,  TOS RPOP, ;  \ POP  in original Machine Forth
: R@    ( -- n ) TOS PUSH, *RP  TOS MOV, ;
: RDROP ( -- )   RP INCT, ;
: DUP>R ( n -- n) TOS RPUSH, ;
\ : !R    ( n -- )     TOS *RP+ MOV, ;  \ undocumented ??
.( .)
\ Address register (R9) Use >R R> syntax for clarity
: #>A   ( addr --)   AREG SWAP LI, ; \ load A with literal address
: >A    ( addr -- addr)  TOS AREG MOV, DROP ;  \ Load A from TOS
: A>    ( -- addr)   TOS PUSH,   AREG  TOS MOV, ;

: A@    ( -- n)  TOS PUSH,  *AREG  TOS MOV, ;
: A@+   ( -- n)  TOS PUSH,  *AREG+ TOS MOV, ;
: AC@   ( -- c)  TOS PUSH,  *AREG  TOS MOVB,  TOS 8 SLA,  ;
: AC@+  ( -- c)  TOS PUSH,  *AREG+ TOS MOVB,  TOS 8 SLA,  ;
: A!    ( addr -- ) TOS *AREG MOV, DROP ;
: A!+   ( n -- )    TOS  *AREG+ MOV, ; \ store to *A auto-increment

: OVER  ( n1 n2 -- n1 n2 n1) TOS PUSH,  2 (SP) TOS MOV, ;
: NIP   ( n1 n2 -- n2)  SP INCT, ;

Below is what a simple demo program looks like in source code with the resulting code below.

COMPILER  \ names space that has compiler directives
   NEW.
   HEX 2000 ORIGIN.

TARGET    \ Makes code compile to TARGET memory
PROG: DEMO1
      0 #           \ 0 -> DATA stack
      BEGIN
          1+         \ inc # on top of data stack
         ?TERMINAL   \ test for Break key
      UNTIL         \ MACHINE FORTH uses native status register
      DROP          \ remove number from data stack
      NEXT,         \ Return to Forth console
END.

;  MACHFORTH DEMO1 Code with comments 

2004  0646  dect R6         ; make room on the data stack
2006  C584  mov  R4,*R6     ; push accumulator onto data stack            
2008  0204  li   R4,>0000   ; load accumulator with zero            
      0000
200C  0584  inc  R4         ; inc. accumulator            
200E  06A0  bl   @>0020     ; call break key sub-routine           
      0020
2012  02C1  stst R1         ; get status register           
2014  0241  andi R1,>2000   ; mask the EQ bit           
      2000
2018  13F9  jeq  >200c      ; jump to top of loop 
201A  C136  mov  *R6+,R4    ; drop number off data stack            
201C  045A  b    *R10       ; run Forth interpreter

+Lee Stewart · April 6, 2021

13 hours ago, TheBF said:

For anyone who might be interested, Chuck Moore, the author of Forth, created this type of compiler when he began designing his own CPUs.

This is really something! It will be some time before I will grok what you have done here, but, at first blush, the use of JLE, in -IF seems problematic with the “logical lower than” rather than “arithmetic less than” meaning of the ‘L’ of ‘JLE’.

...lee

Edited April 6, 2021 by Lee Stewart
clarification

+TheBF · April 6, 2021

Thanks. It is a weird dialect. I have not tested everything so this could be a problem area.

Chuck didn't use, or even create in his CPU, all the different condition flags. There is IF and -IF. (not IF)

These were the closest instructions I could see to match the ARM version done by Sam Falvo. I think I wrote something to see if they worked but I can't remember just now.

I will get back to you with the results of a demo program using IF and -IF.

Also to allow the programmer to fully control things. IF and -IF do not consume the stack parameter. You DROP it yourself or not depending on your need. This eliminates the over head of DUP IF or ?DUP IF.

Part of the challenge is that it is pretty big now and it takes about a minute to compile.

It is a perfect application to save as a Binary program however I didn't understand how to tame my newly hatched vocabulary system and make it wake up intact after loading an E/A5 program with wordlists and search order.

So much new stuff, that I built, that needs learning by ME.

Anyway I got it figured out yesterday so it became possible to talk about releasing it.

+TheBF · April 7, 2021

Native Code Forth versus ITC Forth

Here is an old test program that I made to see how my early kernel performed.

I re-tested it today with my latest threaded Forth kernel, TurboForth and MachForth after I added DO/LOOP.

(Chuck Moore has abandoned DO/LOOP in favour of a simple down-counter loop called FOR NEXT.

Most of the Forth primitives are coded inline because they are one to three instructions.

Calling overhead with a return stack on 9900 is 4 instructions/8 bytes! So it makes sense to just inline .

The programmer is free to wrap anything in a colon/sem-colon sub-routine if it makes sense.

And since arguments are implicitly on the data stack it is simpler than making sub-routines with local variables.

Below you can see that I added rotate as a sub-routine because it's not built into the compiler.

You can seamlessly use Assembler code inside MachForth.

Colon in MachForth pushes R11 onto the return stack, semi-colon pops R11 and does RT.

The video shows us leaving Forth and running a the program in Low RAM and returning to Forth.

We can even use the Forth elapsed timer.

The finished program size is 152 bytes.

Spoiler


\ primitive bench to compare ITC Forth to Machine Forth

COMPILER
 NEW.
 HEX 2000 ORIGIN.

TARGET
: ROT ( a b c == b c a) \ add ROT instruction to MachForth
     2 (SP)    W MOV,
     *SP   2 (SP) MOV,
     TOS     *SP MOV,
     W       TOS MOV,
;

PROG: TEST
          3000 #  0 #
          DO
              AAAA #
               DUP
               SWAP
               OVER
               ROT
               DROP
               DUP AND
               DUP OR
               DUP XOR
               1+
               1-
               2+
               2-
               2*
               2/
               NEGATE
               ABS
               +
               2 # *
              DROP
         LOOP
         NEXT,
END.

\ Camel 2.67          12.11 secs
\ TurboForth 1.2.1:5  12.69 secs
\ MachForth            4.68 secs

Machine Forth OMG

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation

Recently Browsing 0 members