Jump to content
Lee Stewart

fbForth—TI Forth with File-based Block I/O [Post #1 UPDATED: 02/23/2021]

Recommended Posts

1 hour ago, TheBF said:

Lee we had discussed creating faster dictionary searches using a hashing method like PolyForth.  I found this book by Dr. Ting called inside Forth83, which is a fantastic little Forth for DOS by Laxen & Perry. It used a 4 way hash on the dictionary which is described herein. Here is the link to the book: http://www.forth.org/OffeteStore/1003_InsideF83.pdf. Page 73 shows real code on how they Laxen and Perry did it ....

 

Sounds interesting. I must check it out. Thanks.

 

...lee

Share this post


Link to post
Share on other sites

The comments by @Tursi and @Asmusr regarding methods of calculating square roots sparked a Google search, which netted the algorithm I used in the following fast routine for the square root of an unsigned double number (from Craig McQueen’s response to “Looking for an efficient integer square root algorithm for ARM Thumb2” on stackoverflow.com) :

 

Spoiler
HEX
\ Registers:   R0,R1 = udh,udl
\              R2,R3 = root (nh,nl)
\              R4,r5 = floating 1 (f1h,f1l)
\              R6,R7 = wkh,wkl
ASM: UDSQRT    ( ud -- n )
   *SP+ R0 MOV,      \ pop udh to R0
   *SP R1 MOV,       \ udl to R1
   R2 CLR,           \ clear running..
   R3 CLR,           \ ..root (nh,nl)
   R5 CLR,           \ set floating 1..
   R4 4000 LI,       \ ..to 4000 0000
   
   \ get highest power of 4 <= square (udh,udl)
   BEGIN,
      R4 R0 C,       \ f1h:udh?
      EQ IF,
         R5 R1 C,    \ f1l:udl?
      THEN,
   H WHILE,
      R4 R5 C,          \ f1h:f1l?
      H IF,             \ bit in f1h?
         R4 2 SRL,      \ yes..shift f1h right 2 bits
         EQ IF,         \ bit shifted out?
            R5 4000 LI, \ set f1l to shifted-out bit
         THEN,
      ELSE,             \ no..bit is in f1l
         R5 2 SRL,      \ shift f1l right 2 bits
      THEN, 
   REPEAT,

   \ calculate square root
   BEGIN,
      R4 R5 C,       \ f1h:f1l?
   NE WHILE,         \ f1h,f1l > 0?
      \ set up wkh,wkl = nh,nl + f1h,f1l to compare to udh,udl
      R2 R6 MOV,     \ nh to wkh
      R3 R7 MOV,     \ nl to wkl
      R5 R7 A,       \ f1l + wkl
      OC IF,         \ carry?
         R6 INC,     \ yes..increment wkh
      THEN,
      R4 R6 A,       \ f1h + wkh
      \ check if need to update running values of udh,udl & nh,nl
      R0 R6 C,       \ udh:wkh?
      EQ IF,         \ udh = wkh?
         R1 R7 C,    \ udl:wkl?
      THEN,
      HE IF,         \ udh,udl >= wkh,wkl?
         \ udh,udl = udh,udl - wkh,wkl
         R7 R1 S,    \ udl - wkl
         NC IF,
            R0 DEC,  \ reduce udh on no carry
         THEN,
         R6 R0 S,    \ udh - wkh
         \ nh,nl = wkh,wkl + f1h,f1l
         R5 R7 A,    \ wkl + f1l
         OC IF,      \ carry?
            R6 INC,  \ yes..increment wkh
         THEN,
         R4 R6 A,    \ wkh + f1h
         R6 R2 MOV,  \ wkh to nh
         R7 R3 MOV,  \ wkl to nl
      THEN,
      \ f1h,f1l >> 2 
      R4 R5 C,          \ f1h:f1l?
      H IF,             \ bit in f1h?
         R4 2 SRL,      \ yes..shift f1h right 2 bits
         EQ IF,         \ bit shifted out?
            R5 4000 LI, \ set f1l to shifted-out bit
         THEN,
      ELSE,             \ no..bit is in f1l
         R5 2 SRL,      \ shift f1l right 2 bits
      THEN,
      \ nh,nl >> 1
      R3 1 SRL,         \ shift nl right 1 bit
      R2 1 SRL,         \ shift nh right 1 bit
      OC IF,            \ carry?
         R3 8000 ORI,   \ set MSb of nl
      THEN,
   REPEAT,
   R3 *SP MOV,          \ return n on stack
;ASM

 

 

This routine (UDSQRT) is 128 bytes long, but is very fast. Even for the highest unsigned double number possible, FFFF FFFF16 (4,294,967,29510), it is a mere hesitation—quicker than I could manage a stopwatch. It uses a binary version of the paper-and-pencil method of working out square roots we learned in grade school. The result is truncated rather than rounded.

 

The basis for the above code is this C routine:

Spoiler
/**
 * \brief    Fast Square root algorithm
 *
 * Fractional parts of the answer are discarded. That is:
 *      - SquareRoot(3) --> 1
 *      - SquareRoot(4) --> 2
 *      - SquareRoot(5) --> 2
 *      - SquareRoot(8) --> 2
 *      - SquareRoot(9) --> 3
 *
 * \param[in] a_nInput - unsigned integer for which to find the square root
 *
 * \return Integer square root of the input value.
 */
uint32_t SquareRoot(uint32_t a_nInput)
{
    uint32_t op  = a_nInput;
    uint32_t res = 0;
    uint32_t one = 1uL << 30; // The second-to-top bit is set: use 1u << 14 for uint16_t type; use 1uL<<30 for uint32_t type

    // "one" starts at the highest power of four <= than the argument.
    while (one > op)
    {
        one >>= 2;
    }
    while (one != 0)
    {
        if (op >= res + one)
        {
            op = op - (res + one);
            res = res +  2 * one;
        }
        res >>= 1;
        one >>= 2;
    }
    return res;
}

 

 

...lee

Edited by Lee Stewart
Inclusion of Source Credits
  • Like 2

Share this post


Link to post
Share on other sites

Albert Van der Horst did a pile of work on this for his ciForth system. It's on comp.lang.forth.

I will see if I can find a Forth version for comparison.

Share this post


Link to post
Share on other sites

I found it but it needs a pile of library code from his system and I think it is for 64 bit integers so it will choke on our favourite CPU.

 

https://groups.google.com/g/comp.lang.forth/c/j_8Bdl7-c_s/m/j-J3E_NjAQAJ

 

Incomprehensible to me.

 

Spoiler
1,000,000,013 CONSTANT p

WANT SQ SQRT FACTOR GCD PRIME?
WANT REGRESS

\ For N return FLOOR of the square root of n.
VARIABLE T 1000 T !
: SQRT1 >R
    T @ [email protected] OVER / + 2/ \ Minimize iterations.
    BEGIN 
      [email protected] OVER / OVER + 2/ .S 2DUP > 
    WHILE
      NIP 
    REPEAT 
    DROP RDROP DUP T ! ;

 REGRESS 1001 SQRT S: 31
 REGRESS 81 SQRT S: 9

\ For N return :" n IS a square"
VARIABLE T_s 1000 T_s !
: SQ? >R
    T_s @ [email protected] OVER / + 2/ \ Minimize iterations.
    BEGIN
      [email protected] OVER / OVER + 2/ .S 2DUP > 
    WHILE
      NIP 
    REPEAT 
    DUP T_s ! * R> = ;

 REGRESS 1001 SQ? S: 0
 REGRESS 81 SQ? S: -1

: hart 
  DUP 1 
  DO
    KEY? IF RDROP LEAVE THEN
    DUP I M* THROW SQRT1 1+ DUP >R  \ s
    SQ OVER MOD \ m
    DUP SQ? .S IF
        SQRT \ t
        R> SWAP - OVER GCD LEAVE
    ELSE
        R> 2DROP
    THEN
    .S 
  LOOP ;

 

 

Share this post


Link to post
Share on other sites
56 minutes ago, Lee Stewart said:

The comments by @Tursi and @Asmusr regarding methods of calculating square roots sparked a Google search, which netted the algorithm I used in the following fast routine for the square root of an unsigned double number (I need to search again to give due credit):

 

  Hide contents

HEX
\ Registers:   R0,R1 = udh,udl
\              R2,R3 = root (nh,nl)
\              R4,r5 = floating 1 (f1h,f1l)
\              R6,R7 = wkh,wkl
ASM: UDSQRT    ( ud -- n )
   *SP+ R0 MOV,      \ pop udh to R0
   *SP R1 MOV,       \ udl to R1
   R2 CLR,           \ clear running..
   R3 CLR,           \ ..root (nh,nl)
   R5 CLR,           \ set floating 1..
   R4 4000 LI,       \ ..to 4000 0000
   
   \ get highest power of 4 <= square (udh,udl)
   BEGIN,
      R4 R0 C,       \ f1h:udh?
      EQ IF,
         R5 R1 C,    \ f1l:udl?
      THEN,
   H WHILE,
      R4 R5 C,          \ f1h:f1l?
      H IF,             \ bit in f1h?
         R4 2 SRL,      \ yes..shift f1h right 2 bits
         EQ IF,         \ bit shifted out?
            R5 4000 LI, \ set f1l to shifted-out bit
         THEN,
      ELSE,             \ no..bit is in f1l
         R5 2 SRL,      \ shift f1l right 2 bits
      THEN, 
   REPEAT,

   \ calculate square root
   BEGIN,
      R4 R5 C,       \ f1h:f1l?
   NE WHILE,         \ f1h,f1l > 0?
      \ set up wkh,wkl = nh,nl + f1h,f1l to compare to udh,udl
      R2 R6 MOV,     \ nh to wkh
      R3 R7 MOV,     \ nl to wkl
      R5 R7 A,       \ f1l + wkl
      OC IF,         \ carry?
         R6 INC,     \ yes..increment wkh
      THEN,
      R4 R6 A,       \ f1h + wkh
      \ check if need to update running values of udh,udl & nh,nl
      R0 R6 C,       \ udh:wkh?
      EQ IF,         \ udh = wkh?
         R1 R7 C,    \ udl:wkl?
      THEN,
      HE IF,         \ udh,udl >= wkh,wkl?
         \ udh,udl = udh,udl - wkh,wkl
         R7 R1 S,    \ udl - wkl
         NC IF,
            R0 DEC,  \ reduce udh on no carry
         THEN,
         R6 R0 S,    \ udh - wkh
         \ nh,nl = wkh,wkl + f1h,f1l
         R5 R7 A,    \ wkl + f1l
         OC IF,      \ carry?
            R6 INC,  \ yes..increment wkh
         THEN,
         R4 R6 A,    \ wkh + f1h
         R6 R2 MOV,  \ wkh to nh
         R7 R3 MOV,  \ wkl to nl
      THEN,
      \ f1h,f1l >> 2 
      R4 R5 C,          \ f1h:f1l?
      H IF,             \ bit in f1h?
         R4 2 SRL,      \ yes..shift f1h right 2 bits
         EQ IF,         \ bit shifted out?
            R5 4000 LI, \ set f1l to shifted-out bit
         THEN,
      ELSE,             \ no..bit is in f1l
         R5 2 SRL,      \ shift f1l right 2 bits
      THEN,
      \ nh,nl >> 1
      R3 1 SRL,         \ shift nl right 1 bit
      R2 1 SRL,         \ shift nh right 1 bit
      OC IF,            \ carry?
         R3 8000 ORI,   \ set MSb of nl
      THEN,
   REPEAT,
   R3 *SP MOV,          \ return n on stack
;ASM

 

 

This routine (UDSQRT) is 128 bytes long, but is very fast. Even for the highest unsigned double number possible, FFFF FFFF16 (4,294,967,29510), it is a mere hesitation—quicker than I could manage a stopwatch. It uses a binary version of the paper-and-pencil method of working out square roots we learned in grade school. The result is truncated rather than rounded.

 

...lee

That's a thing of beauty. :) 

  • Thanks 1

Share this post


Link to post
Share on other sites

I finally got around to testing a pre-release of fbForth 2.0:13 on real iron with a 4 MiB SAMS card in a PEB. It was not an exhaustive test, but it did correctly determine that its capacity is 4 MiB.

 

The only remaining item before release is whether I should change SPRDIST and SPRDISTXY to calculate the actual distance (rather than its square) by including UDSQRT (see my last post). It looks like I may be able to squeeze it into the remaining space in bank #1. Not only would it make sense to use actual distances, but it would also allow using all distances that are possible, including those that are too large to be represented by a signed, 16-bit integer. The square of a distance between 182 and 318 pixels (all possible) currently must be forced to 32767 (>7FFF).  One hesitation is that it would break compatibility with TI Forth and old fbForth code that relies on the present distance representation. Another is that it might present a problem for coincidence calculations for small distances because UDSQRT truncates the result. I might have enough room to rectify that, however. Thoughts?

 

...lee

  • Like 1
  • Thanks 1

Share this post


Link to post
Share on other sites
10 hours ago, Lee Stewart said:

I finally got around to testing a pre-release of fbForth 2.0:13 on real iron with a 4 MiB SAMS card in a PEB. It was not an exhaustive test, but it did correctly determine that its capacity is 4 MiB.

 

The only remaining item before release is whether I should change SPRDIST and SPRDISTXY to calculate the actual distance (rather than its square) by including UDSQRT (see my last post). It looks like I may be able to squeeze it into the remaining space in bank #1. Not only would it make sense to use actual distances, but it would also allow using all distances that are possible, including those that are too large to be represented by a signed, 16-bit integer. The square of a distance between 182 and 318 pixels (all possible) currently must be forced to 32767 (>7FFF).  One hesitation is that it would break compatibility with TI Forth and old fbForth code that relies on the present distance representation. Another is that it might present a problem for coincidence calculations for small distances because UDSQRT truncates the result. I might have enough room to rectify that, however. Thoughts?

 

...lee

Could you make the new distance calculator a block on the disk and people can use it if they need it?

 

Share this post


Link to post
Share on other sites
3 minutes ago, TheBF said:

Could you make the new distance calculator a block on the disk and people can use it if they need it?

 

Certainly.

 

...lee

Share this post


Link to post
Share on other sites
1 minute ago, Lee Stewart said:

 

Certainly.

 

...lee

Does that way not appeal to you?

  • Like 1

Share this post


Link to post
Share on other sites
12 hours ago, TheBF said:

Does that way not appeal to you?

 

At this point, I would call it my second choice. It just grates on me to use the square of the distance as the distance—and a flawed (limited) value, at that. The compatibility issue is really all that is holding me back. I will likely take your suggestion, in the end.

 

...lee

  • Like 1

Share this post


Link to post
Share on other sites

Gettin’ there!  This test did just fine on real iron with a 4 MiB SAMS card:

HEX 
EAF0 S0&TIB! DROP     \ S0 and TIB to >EAF0
\ Map SAMS to >F000.
\ Write the SAMS bank # to >F800 for every 16th bank,
\  starting with bank #>000F (pass-through).
: SAMSW  ( -- )
   SAMS? 00F DO
      I F000 >MAP    \ map next SAMS bank
      I F800 !       \ store current bank #
   010 +LOOP
;
\ Read back what SAMSW wrote.
: SAMSR  ( -- )
   SAMS? 0F DO
      I F000 >MAP    \ map next SAMS bank
      F800 ?         \ print value stored by SAMSW
   010 +LOOP
;

\ Do it...
SAMSW SAMSR

...lee

  • Like 1

Share this post


Link to post
Share on other sites
58 minutes ago, TheBF said:

Did you skip every by 16 banks just to save some time?

 

Yes and no. I wanted to see it all on one screen without having to write in a pause or write and run a true memory test, which would have taken quite a while. You see, I have no convenient way to transfer code to diskette from PC. Composing and testing on the 4A would have been a chore. I may do that tomorrow, though. It would be nice to run a full test. Nonetheless, I did accomplish my main goal of proving the SAMS card was operational and that the SAMS words were operating as they should.

 

...lee

  • Like 1

Share this post


Link to post
Share on other sites

I have been using magic file manipulator.  It's a little tedious but it works ok.

Bought an off the shelf RS232 cable and it worked first time. Never had that happen before.

Share this post


Link to post
Share on other sites

I have fbForth 2.0:13 ready to be ZIPped up and will try to get it posted tomorrow. I have FBLOCKS (19FEB2021) ready to go, as well. I put UDSQRT in block 57, See post #1652ff for details. ( I am still thinking of changing fbForth 2.0 in a later build to use actual distances instead of squares of distances for SPRDIST SPRDISTXY DXY COINC COINCXY after more discussion regarding breaking code ported from TI Forth that might use those words. After all, that is the only reason I hesitate doing it—and that just might not be a realistic consideration. )

 

...lee

  • Like 3

Share this post


Link to post
Share on other sites
On 2/19/2021 at 11:01 PM, Lee Stewart said:

I have fbForth 2.0:13 ready to be ZIPped up and will try to get it posted tomorrow. I have FBLOCKS (19FEB2021) ready to go, as well. I put UDSQRT in block 57, See post #1652ff for details. ( I am still thinking of changing fbForth 2.0 in a later build to use actual distances instead of squares of distances for SPRDIST SPRDISTXY DXY COINC COINCXY after more discussion regarding breaking code ported from TI Forth that might use those words. After all, that is the only reason I hesitate doing it—and that just might not be a realistic consideration. )

 

...lee

Does the distance computation affect the coincidence performance much or was this more an academic exercise to show it could be done?

 

Share this post


Link to post
Share on other sites
3 hours ago, TheBF said:

Does the distance computation affect the coincidence performance much or was this more an academic exercise to show it could be done?

 

That depends on a programmer’s use of coincidence. It surely does not affect the coincidence performance much (or at all) because coincidence is probably never used for anything very far from the object of interest, i.e., the coincidence tolerance likely never comes anywhere near 181 pixels, which is the limit of the current calculations because dx2, dy2 and dx2+dy2 are all limited to the highest positive integer, 32767 (1812 = 32761). The mathematical limit is 318 pixels, corner to corner (object1 [0,0] and object2 [255,191] or object1 [0,191] and object2 [255,0]).

 

Right now, SPRDIST is used by COINC and SPRDISTXY is used by COINCXY and both use DXY for their calculations. In fact, the existence of DXY is solely as an intermediary for SPRDIST and SPRDISTXY calculations. DXY only appears in the glossary. Its use is not discussed anywhere else in the TI Forth and fbForth manuals. If I can cram it all into the ROM, I would change the calculations to use actual distances.

 

The other alternative would be to let the user change a flag to use whichever of the two methods they desire—much as I do now with S|F to determine whether M/ uses SM/REM (symmetric division) or FM/MOD (floored division). Again, if I can cram it all into the ROM, this is the way I will do it.

 

For the moment, though, I must focus on packaging fbForth 2.0:13 stuff for release for all of you programmers champing at the bit to get your hands on it. |:)

 

...lee

  • Like 4

Share this post


Link to post
Share on other sites

:waving:  Finally!!   :o

 

 Post #1 has been updated with all of the files for the latest build, fbForth 2.0:13. I will update my website ere long. I will post changes for this build later—gotta get to bed. :sleep:

 

...lee

  • Like 4

Share this post


Link to post
Share on other sites

Post #1 has become pretty cluttered with older files, which most folks will probably never need, and I have been meaning to clean it up, but have put it off until now, when I was made aware of the confusion it has caused. If anyone needs older files, I will provide them upon request.

 

...lee

  • Like 3

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...

  • Recently Browsing   0 members

    No registered users viewing this page.

×
×
  • Create New...