TheBF (Author), posted March 13, 2022:

I did a little "googling" to supplement my poor math skills and found this page: http://www.azillionmonkeys.com/qed/sqroot.html

Section 5 is interesting and describes what I think the TI Forth engineers were using. Quote used without permission. Mea culpa.

"A common application in computer graphics, is to work out the distance between two points as √(Δx²+Δy²). However, for performance reasons, the square root operation is a killer, and often, very crude approximations are acceptable. So we examine the metrics (1 / √2)*(|x|+|y|), and max(|x|,|y|)"

TI-FORTH limited the values to 32K. My version shows that we can go unsigned and expand the range, and with a 32-bit accumulator we get a good "out-of-range" flag. This might be adequate for a wide range of applications, would significantly improve on the TI-FORTH range of measurement, and as a CODE word it would be very fast.

Keeps me occupied and out of trouble.
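The two metrics in that quote are easy to try in Forth. The sketch below is only an illustration, not the TI-FORTH or Camel99 code: the names DIST-MAX, DIST-SUM and DIST-EST are made up here, and 7071/10000 stands in for 1/√2. Neither metric ever exceeds the true distance, so taking the larger of the two gives the tighter estimate.

```forth
DECIMAL
\ Hypothetical names. 7071 10000 */ approximates multiplying by 1/sqrt(2);
\ */ keeps a double-cell intermediate, so the product does not overflow.
: DIST-MAX ( dx dy -- n )  ABS SWAP ABS MAX ;
: DIST-SUM ( dx dy -- n )  ABS SWAP ABS +  7071 10000 */ ;
: DIST-EST ( dx dy -- n )  2DUP DIST-MAX  -ROT DIST-SUM  MAX ;

3 4 DIST-MAX .    \ prints 4  (true distance is 5)
3 4 DIST-SUM .    \ prints 4  (7 * 0.7071 = 4.9497, truncated)
```

No multiplication of coordinates and no square root is involved, which is why these are attractive when a rough distance is enough.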
TheBF (Author), posted March 13, 2022:

DIST^2 is a usable word. It's not super speedy at 1.6 milliseconds, but it's not bad and it handles squared distances out to 65535.

    DECIMAL
    : ^2 ( n -- n^2) S" DUP *" EVALUATE ; IMMEDIATE

    \ 1.6 milli-seconds
    : DIST^2 ( spr1 spr2 -- n ?) \ ? = 0, range is valid
        POSITION ROT POSITION ( -- x y x2 y2)
        ROT - -ROT -          ( -- diffy diffx)
        ^2 SWAP ^2            ( -- dx^2 dy^2)
        0 ROT 0 D+ ;          \ convert to unsigned doubles and add

Combined with a 16-bit square root word it works like this:

    : SQRT ( n -- n ) -1 TUCK DO 2+ DUP +LOOP 2/ ;
    : DISTANCE ( spr1 spr2 -- n) DIST^2 IF DROP TRUE EXIT THEN SQRT ;

If you get a -1 the sprite distance is out of range.

The whole thing adds 134 bytes to the system. Edit: if we remove the text macro for ^2 it uses 114 bytes.
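The one-line SQRT above is terse. It appears to rely on the fact that n² is the sum of the first n odd numbers, so counting how many odd numbers fit into the argument gives the floor of the root. A spelled-out sketch of that idea follows; ODD-SQRT is a hypothetical name, and it uses a signed comparison, so on a 16-bit system it assumes arguments up to 32767.

```forth
DECIMAL
\ Hypothetical, readable version of the "sum of odd numbers" square root.
: ODD-SQRT ( n -- root )          \ assumes n >= 0
    0 SWAP 1                      ( root n odd )
    BEGIN 2DUP < 0= WHILE         \ loop while n >= odd
        TUCK - SWAP 2 +           ( root n-odd odd+2 )
        ROT 1+ -ROT               ( root+1 n' odd' )
    REPEAT
    2DROP ;

10 ODD-SQRT .    \ prints 3
16 ODD-SQRT .    \ prints 4
```

The loop runs once per unit of the result, which is why this linear method gets slow for large arguments.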
TheBF (Author), posted March 13, 2022:

I had a memory of some work done by Albert Van der Horst on square roots on comp.lang.forth, so I went looking. He made a version using Newton's method. You can play games to find the best seed, but even using a seed of 1 the results are amazing. It is over 10 times faster!

Unfortunately the current version dies on negative numbers, so I am back in the trench. But it's a good start.

    \ By Albert Van der Horst, comp.lang.forth, Aug 29, 2017
    \ For n return FLOOR of the square root of n.
    VARIABLE seed   1 seed !

    \ While calculating roots near each other, the seed can be kept.
    \ Otherwise this can be used to save 10 iterations.
    : init-seed DUP 10 RSHIFT 1024 MAX seed ! ;

    : SQRT ( n -- )
        DUP
        IF  >R
            seed @
            R@ OVER / OVER + 2/ NIP ( DUP . ) \ debug viewing
            BEGIN
                R@ OVER / OVER + 2/ ( DUP .)
                2DUP >
            WHILE
                NIP
            REPEAT
            \ seed !
            R> DROP
        THEN ;

    : TESTROOTAV TMR@ SWAP SQRT TMR@ SWAP . CR - 213 10 */ . ." uS" ;
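For readers who want the arithmetic behind that loop, here is the iteration it seems to perform, written out as a sketch. The symbols x_k and n are notation introduced here, not taken from Albert's post. The first step from the seed is taken unconditionally, and after that the loop keeps the new estimate only while it is smaller than the old one.

```latex
x_{k+1} = \left\lfloor \frac{x_k + \lfloor n / x_k \rfloor}{2} \right\rfloor ,
\qquad \text{stop when } x_{k+1} \ge x_k \text{ and return } x_k .

% Trace for n = 100 with seed x_0 = 1:
%   x_1 = (1 + 100) / 2 = 50
%   x_2 = (50 + 2) / 2  = 26
%   x_3 = (26 + 3) / 2  = 14
%   x_4 = (14 + 7) / 2  = 10
%   x_5 = (10 + 10) / 2 = 10   (not smaller than x_4, so the result is 10)
```

A seed of 1 overshoots to roughly n/2 on the first step and then roughly halves on each pass until it gets near the root, which is where a better seed can save iterations.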
TheBF (Author), posted March 14, 2022:

Bugs R Us

The stuff you find when you try to do serious work with homemade code. I really should be more professional and run the HAYES test suite on this Forth.

The good news: it was simple to add an unsigned division word to Forth, because the 9900 has an instruction for it, in order to make Albert's SQRT work on unsigned numbers.

The bad news: I found that, for the unsigned SQRT to work, the word 2/ needs to be a logical shift, not an arithmetic shift. I never... actually, like, ummm... read the spec. Oops.

2/ was simple to fix, but it means that Camel99 Forth has an official BUG in the wild. 🐛 (That's a caterpillar. We don't have a bug emoji.)

If you need to use 2/ with numbers that have the top bit set, add this new definition to your program until I get a new release out.

    HEX
    CODE 2/ ( n -- n)
        0914 , \ TOS 1 SRL, \ **BUG FIX** was SRA. DUH!
        NEXT,
    ENDCODE
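A short sketch of the difference between the two shifts may help here. Note that Forth-83 and ANS Forth both describe 2/ as leaving the most-significant bit unchanged, which is an arithmetic shift, while RSHIFT fills vacated bits with zeroes. So on a standard system the halving that an unsigned square root needs can be had from RSHIFT. The name U2/ below is hypothetical, and the second result assumes 16-bit cells.

```forth
DECIMAL
: U2/ ( u -- u/2 )  1 RSHIFT ;   \ hypothetical name: logical shift, top bit becomes 0

-2 2/  .     \ a standard 2/ keeps the sign bit: prints -1
-2 U2/ U.    \ logical shift of >FFFE gives >7FFF: prints 32767 with 16-bit cells
```

Seen this way, SRA and SRL on the 9900 correspond to signed and unsigned halving respectively, and which one is "right" depends on whether the value is meant to be signed.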
TheBF (Author), posted March 14, 2022:

After thinking about it I decided that DISTANCE was a good general-purpose function, so it can stand on its own. If you want to compute the DISTANCE between two sprites it is now trivial.

    : SP.DIST ( spr spr -- n) POSITION ROT POSITION DISTANCE ;

So here is how the DISTANCE library file looks now. The only thing I might optimize as CODE is DXY. There are a lot of ROTs in code when you are manipulating x,y coordinates, so removing two in an intermediate word makes me feel better.

    \ DISTANCE.FTH compute distance between 2 coordinates Mar 14 2022 Brian Fox
    \ Max range is 255 pixels with "out of range" flag.
    HERE
    \ machine code is same size as Forth
    HEX
    \ : U/ ( u1 u2 -- u3 ) 0 SWAP UM/MOD NIP ;
    CODE U/ ( u1 u2 -- u3 ) \ unsigned division
        C004 , \ TOS R0 MOV,  \ divisor->R0
        04C4 , \ TOS CLR,     \ high word in TOS = 0
        C176 , \ *SP+ R5 MOV, \ MOVE low word to r5
        3D00 , \ R0 TOS DIV,
        NEXT,
    ENDCODE

    \ SQRT by Albert Van der Horst, comp.lang.forth, Aug 29, 2017
    \ Newton's method. ~10X faster than linear method
    \ Returns FLOOR of the square root of n.
    DECIMAL
    : SQRT ( n -- n')
        DUP
        IF  >R
            1 \ 1st seed
            R@ OVER U/ OVER + 2/ NIP ( DUP . ) \ debug viewing
            BEGIN
                R@ OVER U/ OVER + 2/ ( DUP .)
                2DUP >
            WHILE
                NIP
            REPEAT
            DROP
            R> DROP
        THEN ;

    DECIMAL
    : DXY      ( x y x2 y2 -- dy dx) ROT - -ROT - ;
    : SUMSQR   ( n1 n2 -- d) DUP * SWAP DUP * 0 ROT 0 D+ ;
    : DISTANCE ( x y x2 y2 -- n) DXY SUMSQR IF DROP TRUE EXIT THEN SQRT ;
    HERE SWAP - . ( 170 bytes)
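One detail behind the "Max range is 255 pixels" comment: with 16-bit cells, `DUP *` keeps only the low 16 bits of each square, so a difference of 256 squares to 0 before the double-cell add ever sees it. The out-of-range flag therefore covers the sum, not the individual squares. That is fine for sprite coordinates, which stay within 0..255. If larger differences were ever needed, a variant built on the standard word UM* could keep full 32-bit squares. This is only a sketch under that assumption, and SUMSQR32 is a hypothetical name.

```forth
DECIMAL
\ Hypothetical variant: UM* ( u1 u2 -- ud ) returns the full double-cell product,
\ so each axis difference may exceed 255 without wrapping.
: SUMSQR32 ( n1 n2 -- ud )  ABS DUP UM*  ROT ABS DUP UM*  D+ ;

300 400 SUMSQR32 D.    \ prints 250000 on a system that provides D.
```

A DISTANCE built on this would still need a square root that accepts a double-cell argument, which the 16-bit SQRT above does not.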
FarmerPotato, posted March 14, 2022:

Hi, so far you have concentrated on Euclidean distance. I like the square root approximations if they are needed for, say, computing gravitational attraction. But for games (imagine Asteroids) you want to detect coincidence between any or all sprites, every loop. As a first pass, you want to know if square boundaries could intersect.

The algorithm I learned is from Preparata & Shamos, Computational Geometry (pretty old now), and probably Knuth before that. Keep the sprite list sorted by Y coordinate at all times. (An X sort is not terribly useful on top of that.) As sprite positions typically change 1 pixel at a time, a bubble sort can be adequate.

I like to have the sprite list in CPU RAM for updating, and to write the whole thing to VDP in each vertical interrupt interval (watch the VDPSTA bit).

For coincidence, iterate over the list, keeping a window of sprites within 8, 16 or 32 Y pixels (depending on magnification). You need to test the Ith sprite against the last few in this window (if any). You can reject any with too large an X distance. Any slow math or slower pixel-wise comparison can be done on just these few sprites.
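The windowed scan described above can be sketched in a few lines of Forth. Everything here is made up for illustration: the sample coordinates, the box size SIZE, and the word names. The Y array is assumed to be sorted ascending already; keeping it sorted (the bubble sort step) is not shown.

```forth
DECIMAL
8 CONSTANT SIZE                          \ assumed sprite box size in pixels
5 CONSTANT #SPR                          \ assumed number of sprites
CREATE YS  10 , 12 , 40 , 44 , 90 ,      \ Y coordinates, sorted ascending
CREATE XS  50 , 55 , 20 , 100 , 30 ,     \ matching X coordinates

: Y@ ( i -- y )  CELLS YS + @ ;
: X@ ( i -- x )  CELLS XS + @ ;
: NEAR? ( a b -- ? )  - ABS SIZE < ;
: .HIT ( i j -- )  SWAP . . ." overlap" CR ;

: SWEEP ( -- )
    #SPR 0 ?DO                           \ J below is this outer index (sprite i)
        #SPR I 1+ ?DO                    \ I is the inner index (sprite j > i)
            I Y@ J Y@ - SIZE < 0= IF LEAVE THEN   \ past the Y window: stop scanning
            I X@ J X@ NEAR? IF J I .HIT THEN      \ close in X as well: report pair
        LOOP
    LOOP ;

SWEEP    \ prints: 0 1 overlap
```

Because the list is sorted, the inner loop can stop at the first sprite that is too far away in Y, so most pairs are never compared at all.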
Lee Stewart, posted March 14, 2022:

1 hour ago, FarmerPotato said: "As a first pass, you want to know if square boundaries could intersect."

If I understand what you mean here*, this is how I manage fbForth COINC and COINCXY (but not SPRDIST and SPRDISTXY), so it is much quicker than calculating Δx²+Δy² or √(Δx²+Δy²).

____________
*What I understand this to mean is that Δx and Δy are computed and each compared to a tolerance.

...lee

(Edited March 15, 2022 by Lee Stewart: clarification)
Lee Stewart, posted March 14, 2022:

On 3/13/2022 at 4:17 PM, TheBF said:

    \ While calculating roots near each other, the seed can be kept.
    \ Otherwise this can be used to save 10 iterations.
    : init-seed DUP 10 RSHIFT 1024 MAX seed ! ;

How is init-seed used?

...lee
TheBF (Author), posted March 14, 2022:

That's interesting stuff. I am not sure maintaining the sort would be able to keep up on our old girl here if the number of SPRITEs got too high, but it's a neat idea just the same. I think it would have to be CODE and not Forth to work really well, albeit reading and writing blocks of VDP RAM proceeds at machine speed.

In the past I had re-worked the old TI-Forth code to test the square boundaries like Lee is doing, but then I found I could make a faster routine that simply computes the difference between the x,y coordinates of two sprites and compares it to a tolerance. It's brute force but it seems to take less time than what I had.

Here is the Forth version I had, which runs in 1.4 mS:

    : COINC ( spr#1 spr#2 tol -- ?)
        >R
        POSITION ROT POSITION ( -- x1 y1 x2 y2 )
        ROT - ABS R@ <
        -ROT - ABS R> <
        AND ;

And here is the slightly improved version from my recent work, which runs in 1.2 mS:

    HEX
    CODE DXY ( x y x2 y2 -- dx dy)
        *SP+ R1 MOV,  \ x2
        *SP+ TOS SUB, \ y2=y2-y
        TOS ABS,
        R1 *SP SUB,   \ x=x-x2
        *SP ABS,
        NEXT,
    ENDCODE

    : COINC ( spr#1 spr#2 tol -- ?)
        >R
        POSITION ROT POSITION ( -- x1 y1 x2 y2 )
        DXY R@ < SWAP R> <
        AND ;

An important part of using these is to call COINCALL in the primary loop, which is very fast, being just a byte fetch.

However, what I like about your idea is that it could probably handle more sprite coincidences simultaneously. In a game like Asteroids, for example, your method probably performs better.

I also do something different for getting at the SPRITE table. I have an integer fetch routine for VDP called V@, so I read x,y at once. Then I have a SPLIT word that splits that into two bytes. For other situations I have turned the SPRITE attribute table into 4 fast arrays so I can read each field independently.

    : TABLE4: ( Vaddr -- ) \ create a table of 4 byte records
        CREATE , \ compile base address into this word
        ;CODE ( n -- Vaddr') \ RUN time
            0A24 , \ TOS 2 SLA, ( tos = n x 4 )
            A118 , \ *W TOS ADD,
            NEXT,
    ENDCODE

    SAT      TABLE4: SP.Y
    SAT 1+   TABLE4: SP.X
    SAT 2+   TABLE4: SP.PAT
    SAT 3 +  TABLE4: SP.COLR

With this, POSITION is defined as (limit tests removed):

    : POSITION ( sprt# -- dx dy ) ( ?NDX) S" SP.Y V@ SPLIT" EVALUATE ; IMMEDIATE
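For anyone not fluent in ;CODE, the same defining word can be sketched in high-level standard Forth with CREATE ... DOES>. This is an illustrative equivalent rather than the Camel99 source: the name HL-TABLE4: and the record name REC are hypothetical, and the base address >300 is only an assumed example value.

```forth
HEX
\ Hypothetical high-level equivalent of the ;CODE version above.
: HL-TABLE4: ( base -- )
    CREATE ,                        \ store the base address in the new word
    DOES> ( n -- addr' )  @ SWAP 4 * + ;   \ run time: base + n*4

300 HL-TABLE4: REC                  \ >300 is an assumed base address
2 REC .                             \ prints 308 (hex): record 2 starts 8 bytes in
```

The ;CODE version does the same multiply-by-4 and add in two machine instructions, which is where its speed comes from.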
TheBF (Author), posted March 15, 2022:

6 minutes ago, Lee Stewart said: "How is init-seed used? ...lee"

I never did figure that out. But from reading the topic, Albert indicated that it could save 10 iterations if you compute a good initial seed. For our application with sprites there were not many iterations required, so I just locked it to 1.

Here is the topic. Maybe you can glean something from it.

Faster integer square roots through a seed (google.com)
TheBF (Author), posted March 15, 2022:

Here is a test program that I am using to work on this. Even with this loop, which is only polling for edges and coincidence, it still misses a collision every now and then.

I am thinking about trying your idea, Erik, but running the sprite table read/write, the Y sort and the collision detection in a separate process. Between auto-motion running on the interrupt and a separate process for collision detection, it frees up the main program to run the game itself.

I have some very simple mailboxes for inter-task communication, so the collision detector sends a message to the game. The game just polls the mailbox and reads the message. Only then does it deal with the sprites. One thing that might (?) improve things is stopping the motion of sprites that collided and waiting for a reply message from the game to re-start them. This prevents auto-motion from messing up your universe.

Lots to think about. Thanks Erik.

    \ Sprite COINC and TRAP Test
    NEEDS DUMP       FROM DSK1.TOOLS
    NEEDS SPRITE     FROM DSK1.DIRSPRIT
    NEEDS AUTOMOTION FROM DSK1.AUTOMOTION
    NEEDS HZ         FROM DSK1.SOUND
    NEEDS MARKER     FROM DSK1.MARKER
    NEEDS RND        FROM DSK1.RANDOM

    MARKER /TEST

    DECIMAL
    : BOUNCE.X ( spr# --) ]SMT.X DUP VC@ NEGATE SWAP VC! ;
    : BOUNCE.Y ( spr# --) ]SMT.Y DUP VC@ NEGATE SWAP VC! ;
    : BOUNCE   ( spr# --) DUP BOUNCE.X BOUNCE.Y ;

    : TINK GEN1 1500 HZ -6 DB 40 MS ;
    : BONK GEN2  120 HZ -4 DB 50 MS ;

    : TRAPX ( spr# -- )
        DUP SP.X VC@ 239 0 WITHIN IF BOUNCE.X TINK EXIT THEN DROP ;
    : TRAPY ( spr# -- )
        DUP SP.Y VC@ 185 0 WITHIN IF BOUNCE.Y TINK EXIT THEN DROP ;
    : TRAP ( spr# -- ) DUP TRAPX TRAPY ;

    DECIMAL
    : SPRITES ( -- ) \ makes 3 sprites
        ( char colr x y sp# )
        [CHAR] 0 3 100 90 0 SPRITE
        [CHAR] 1 5 100 90 1 SPRITE
        [CHAR] 2 9 100 90 2 SPRITE ;

    : RNDV  ( -- n) 70 RND 10 + 20 - ;
    : RNDXY ( -- dx dy) RNDV RNDV ;

    : RUN ( -- )
        15 SCREEN 1 MAGNIFY
        PAGE ." CAMEL99 Forth"
        CR ." Trap/Coinc Test with Automotion"
        CR
        SPRITES
         25  27 0 MOTION
        -31 -33 1 MOTION
        -13  25 2 MOTION
        AUTOMOTION
        BEGIN
            0 TRAP 1 TRAP 2 TRAP
            0 1 7 COINC IF 0 BOUNCE 1 BOUNCE BONK THEN
            0 2 7 COINC IF 0 BOUNCE 2 BOUNCE BONK THEN
            1 2 7 COINC IF 1 BOUNCE 2 BOUNCE BONK THEN
            GEN1 MUTE GEN2 MUTE
            ?TERMINAL
        UNTIL
        STOPMOTION
        \ DELALL
        8 SCREEN ;

    CR .( Type RUN to start demo)

(Attached video: TRAP and COINC.mp4)
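The `239 0 WITHIN` test in TRAPX can look odd because the lower limit is larger than the upper one. The standard WITHIN compares with wrap-around arithmetic, so with those arguments it is true exactly when the value lies at or beyond 239 and false for 0..238. For a byte read from VDP (0..255) that means "the sprite has reached the right-hand edge". A quick check at the command line, assuming a standard WITHIN:

```forth
DECIMAL
240 239 0 WITHIN .    \ prints -1 : past the edge, so TRAPX bounces
239 239 0 WITHIN .    \ prints -1 : exactly on the limit
100 239 0 WITHIN .    \ prints 0  : inside the playfield
```

One comparison therefore replaces a pair of limit tests, which helps in a loop that polls every sprite on every pass.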
FarmerPotato, posted March 15, 2022:

I hear you about the sprite auto-motion interrupt. That complicates things. But it's not hard to (disable auto-motion and) roll your own in a user interrupt routine. (If you want to be exact, find the ISR source code in, say, TI Intern.) There are just too many VDP reads and writes involved in the console routine for my liking.

For position and motion, I prefer fixed-point 16-bit X.x, where byte X is the screen coordinate, byte x is a fraction, and the velocity is simply added (a signed 16-bit quantity, but preferably -256 to +256).

The sorting is done by swapping indices, not the actual sprite data. Because sprite #1 should always be the player, with bullets having next priority, the sprites are written to VDP in original order (nothing to do with the sorted indices).

Another reason to write the whole sprite table to VDP on each interrupt is that animating the sprite pattern can happen too. I also update the player sprite pattern definition for rotation (it's expensive to keep ALL the patterns loaded for just one sprite).

(The most recent time I used this sprite engine was in "parsec2020", which was only a demo of your ship and a map. But I tested auto-motion with a bunch of asteroids flying around.) I didn't get to the coinc code.

pseudocode: [Spoiler]

(Edited March 15, 2022 by FarmerPotato)
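The 8.8 fixed-point scheme described above can be sketched for one axis as follows. The variable and word names are hypothetical and the numbers are made up: the high byte of XPOS is the pixel, the low byte is the fraction, and the velocity is just added once per frame.

```forth
DECIMAL
VARIABLE XPOS                    \ hypothetical: 8.8 fixed-point X position
VARIABLE XVEL                    \ hypothetical: signed 8.8 velocity per frame

: >PIXEL ( fixed -- pixel )  8 RSHIFT 255 AND ;   \ keep only the screen byte
: STEP   ( -- )  XVEL @ XPOS +! ;                 \ one frame of motion

100 256 * XPOS !                 \ start at pixel 100, fraction 0
64 XVEL !                        \ 64/256 = a quarter pixel per frame
STEP STEP STEP STEP
XPOS @ >PIXEL .                  \ prints 101 : four quarter-steps make one pixel
```

With 16-bit cells the addition wraps naturally at the screen edge, and a velocity between -256 and +256 moves the sprite at most one pixel per frame.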
TheBF (Author), posted March 15, 2022:

9 hours ago, FarmerPotato said: "I hear you about the sprite auto-motion interrupt. That complicates things. But it's not hard to (disable automation and) roll your own in a user interrupt routine. (if you want to be exact, find the ISR source code in say TI Intern.) There's just too many VDP reads and writes involved in the console routine, for my liking."

I might even do that with a process. Since it's cooperative in my system, the sprites will never be out of control. When I read the motion code it surprised me how involved it was. I like using it because it saves space in RAM.

Quote: "For position and motion, I prefer fixed point 16-bit X.x where byte X is the screen coordinate, byte x is a fraction, and the velocity is simply added (a signed 16-bit quantity but preferably -256 to +256.)"

Ooo. I like the sound of that. Edit: Wait... I think that's what happens in the ROM code.

Quote: "The sorting is done by swapping indices - not the actual sprite data. Because sprite #1 should always be the player, with bullets having next priority, the sprites are written to VDP in original order (nothing to do with the sorted indices.)"

Another great idea.

Quote: "Another reason to write the whole sprite table to VDP on each interrupt, is that animating the sprite pattern can happen, too. I also update the player sprite pattern definition, for rotation (it's expensive to keep ALL the patterns loaded for just one sprite.)"

This depends on the application, I guess. If a sprite just needs to change between 2 states very fast, two different characters are the way to go, but yes, you can blit patterns in fast enough for most needs.

Quote: "(the most recent time I used this sprite engine was in "parsec2020" which was only a demo of your ship and a map. But I tested automation with a bunch of asteroids flying around.) I didn't get to the coinc code"

You have given me lots to chew on. Thanks.
FarmerPotato, posted March 15, 2022:

On square roots, I was once fascinated by the long division algorithm, which was an appendix to TI's Basic Electricity AC/DC Circuits textbook. (Community college level.)

Here's a web version: https://byjus.com/maths/square-root-long-division-method/

I have a hunch that this can be applied to 16-bit numbers, where a digit is just 2 bits, and operations are mostly 1-bit shifts and JOC.

My intuition is that the square root in binary has half the number of significant digits. In other words, the square root of a 16-bit number is an 8-bit number.
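That hunch matches a well-known binary form of the long-division method, often called shift-and-subtract: the argument is consumed two bits at a time and the root grows one bit per step, with no multiply or divide. The sketch below is one way to write it in standard Forth for 16-bit unsigned arguments. BITSQRT and its variables are hypothetical names, and this is not claimed to be how any routine mentioned in this thread works.

```forth
DECIMAL
VARIABLE NUM   VARIABLE RES   VARIABLE BIT     \ hypothetical working variables

: BITSQRT ( u -- root )          \ u is a 16-bit unsigned value
    NUM !  0 RES !  16384 BIT !  \ 16384 = highest power of 4 in 16 bits
    BEGIN BIT @ WHILE
        RES @ BIT @ +            ( trial )
        DUP NUM @ SWAP U< 0=     ( trial flag )   \ true if NUM >= trial
        IF    NEGATE NUM +!      \ subtract the trial value from NUM
              RES @ 1 RSHIFT BIT @ + RES !        \ root gains a 1 bit
        ELSE  DROP
              RES @ 1 RSHIFT RES !                \ root gains a 0 bit
        THEN
        BIT @ 2 RSHIFT BIT !     \ move on to the next pair of bits
    REPEAT
    RES @ ;

25 BITSQRT .       \ prints 5
65535 BITSQRT .    \ prints 255
```

The loop always runs 8 times for a 16-bit argument, one pass per bit of the 8-bit result, which fits the "half the number of significant digits" intuition.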
TheBF (Author), posted March 15, 2022:

Just now, FarmerPotato said: "On square roots, I was once fascinated by the long division algorithm, which was an appendix to TI's Basic Electricity AC/DC Circuits textbook. (Community college level.) Here's a web version: https://byjus.com/maths/square-root-long-division-method/ I have a hunch that this can be applied to 16-bit numbers, where a digit is just 2 bits, and operations are mostly 1-bit shifts and JOC. My intuition is that the square root in binary has half the number of significant digits. In other words, the square root of a 16-bit number is an 8-bit number."

Your intuition is good. 256^2=65536
FarmerPotato, posted March 15, 2022:

12 minutes ago, TheBF said: "Your intuition is good. 256^2=65536"

Thanks?
TheBF (Author), posted March 15, 2022:

24 minutes ago, TheBF said: "Your intuition is good. 256^2=65536"

Looking at that method, I believe this is it in Forth:

    : SQRT ( n -- n ) -1 TUCK DO 2+ DUP +LOOP 2/ ;
TheBF (Author), posted March 15, 2022:

@Lee Stewart I played with this and got it to work for 16-bit numbers, but it is still probably not optimal for small numbers. In the case of the biggest numbers we get about a 2x improvement using the ROOTS test word.

    64516 5000 ELAPSE ROOTS
    seed=1    : 31.1 secs
    INIT-SEED : 16.4 secs

But doing 9 5000 ELAPSE ROOTS takes 9.7 seconds with seed=1 and 11.0 seconds with INIT-SEED, which gives a seed of 8.

Here is the file I am using to play around.

    \ integer square root in Forth. Not too fast but small
    \ *WARNING* The 16 bit limit is: 65000 SQRT . 254
    \ This is 10x faster than linear method
    \ : SQRT ( n -- n ) -1 TUCK DO 2+ DUP +LOOP 2/ ;

    \ INCLUDE DSK1.TOOLS
    INCLUDE DSK1.ELAPSE

    \ : U/ ( u1 u2 -- u3 ) 0 SWAP UM/MOD NIP ;
    \ machine code is same size as Forth
    HEX
    CODE U/ ( u1 u2 -- u3 ) \ unsigned division
        C004 , \ TOS R0 MOV,  \ divisor->R0
        04C4 , \ TOS CLR,     \ high word in TOS = 0
        C176 , \ *SP+ R5 MOV, \ MOVE low word to r5
        3D00 , \ R0 TOS DIV,
        NEXT,
    ENDCODE

    \ By Albert Van der Horst, comp.lang.forth, Aug 29, 2017
    \ For n return FLOOR of the square root of n.
    DECIMAL
    : INIT-SEED ( n -- n n') DUP 10 RSHIFT 8 MAX ; \ for 16 bits only

    : SQRT ( n -- n')
        DUP
        IF  DUP >R
            \ INIT-SEED ( optimized seed value) \ 64516 SQRT : 5000x 16.4 seconds
            1 ( default seed value )            \ 64516 SQRT : 5000x 31.1 seconds
            R@ OVER U/ OVER + 2/ NIP ( DUP . )  \ debug viewing
            BEGIN
                R@ OVER U/ OVER + 2/ ( DUP .)
                2DUP >
            WHILE
                NIP
            REPEAT
            DROP NIP
            R> DROP
        THEN ;

    : ROOTS ( n cnt -- ) 0 ?DO DUP SQRT DROP LOOP DROP ;
Lee Stewart, posted March 16, 2022:

On 3/13/2022 at 4:17 PM, TheBF said:

    : SQRT ( n -- )
        DUP
        IF  >R
            seed @
            R@ OVER / OVER + 2/ NIP ( DUP . ) \ debug viewing
            BEGIN
                R@ OVER / OVER + 2/ ( DUP .)
                2DUP >
            WHILE
                NIP
            REPEAT
            \ seed !
            R> DROP
        THEN ;

This change was made before your update to using U/, etc. It is more compact and about 6% faster:

    : SQRT ( n1 -- n2 )
        DUP
        IF  >R
            seed @
            R@ OVER / + 2/ ( DUP . ) \ debug viewing
            BEGIN
                R@ OVER / OVER + 2/ ( DUP .)
                SWAP OVER >
            WHILE
            REPEAT
            \ seed !
            R> DROP
        THEN ;

However, my UDSQRT is more than 8 times faster, i.e., the original routine takes 142 seconds for 10,000 iterations; the above change, 134 seconds; UDSQRT, 17 seconds. That is probably because there are 2 divisions in each of the first two and none in UDSQRT.

...lee
TheBF (Author), posted March 16, 2022:

5 minutes ago, Lee Stewart said: "This change was made before your update to using U/, etc. It is more compact and about 6% faster: However, my UDSQRT is more than 8 times faster, i.e., the original routine takes 142 seconds for 10,000 iterations; the above change, 134 seconds; UDSQRT, 17 seconds, probably because there are 2 divisions in each of the first two and none in UDSQRT. ...lee"

Good to know. 8 times is in line with what we see going from ITC Forth to code for many routines, so that sounds right.

I didn't really take the time to understand the algorithm you are using. It looks very clever. With all the shifting, it makes me wonder if the divisions in Albert's version would net out to similar speed on the 9900. I suppose to have a good comparison between the two methods, I have to convert Albert's code to ALC.

I wonder if I could write it in Machine Forth quicker? Might try that too.
TheBF (Author), posted March 16, 2022:

1 hour ago, Lee Stewart said: "However, my UDSQRT is more than 8 times faster, i.e., the original routine takes 142 seconds for 10,000 iterations; the above change, 134 seconds; UDSQRT, 17 seconds, probably because there are 2 divisions in each of the first two and none in UDSQRT. ...lee"

I re-did my tests to do 10000 iterations and I get these results.

    \ 1 as seed value:
    \ Forth:   64516 SQRT -> 10000x 62.2 seconds
    \ Inlined: 64516 SQRT -> 10000x 42.6 seconds

So we are a bit faster by inlining the stuff between the loop words, but still far off 17 seconds.

I also did Forth "hand optimization" by replacing DUP >R with DUP>R. (The inliner chokes on > because it doesn't end in NEXT. I should fix that.)

    : SQRT ( n -- )
        DUP
        IF  INLINE[ DUP>R 1 ]
            INLINE[ R@ OVER U/ OVER + 2/ NIP ]
            BEGIN
                INLINE[ R@ OVER U/ OVER + 2/ 2DUP ] >
            WHILE
                NIP
            REPEAT
            INLINE[ DROP NIP R> DROP ]
        THEN ;
Lee Stewart, posted March 16, 2022:

12 hours ago, Lee Stewart said: "UDSQRT, 17 seconds"

I forgot to account for the loop without UDSQRT, which is 3+ seconds, so UDSQRT itself takes ~14 seconds for 10,000 iterations, or ~1.4 ms for a single execution of UDSQRT.

...lee
TheBF (Author), posted March 16, 2022:

Now you are just showing off. Truth be told, I am totally impressed with how you converted the C program. It's above my pay grade.
TheBF (Author), posted March 18, 2022:

I just did a reality check and found that I have 63Mbytes of source code in my LIB.ITC folder. Yikes. That's a lot to "maintain".

Anyway, while looking things over I found a limitation in my BLOCK implementation for SAMS. It works, but I was not using my WINDOWS data correctly when testing if a SAMS bank was in memory or not. Indexed addressing for the win! I love this processor.

Here is the corrected code with better comments. It has not been fully vetted but it works as expected at the command line.

    \ BLOCK using 2 pages of SAMS memory in Low RAM Mar 18 2022 Brian Fox
    NEEDS .S      FROM DSK1.TOOLS
    NEEDS MOV,    FROM DSK1.ASM9900
    NEEDS SAMSINI FROM DSK1.SAMSINI \ *NEW* common code for SAMS card

    \ Note:
    \ I realized that I was not using the WINDOWS array as the source
    \ of data for the 1st tests. With this code the two windows can be anywhere

    \ For reference this is the data to manage what banks are in RAM
    HEX
    VARIABLE USE                 \ index of the last bank# used
    CREATE BLK#S   0 , 0 ,       \ SAMS bank# in the windows
    CREATE WINDOWS 2000 , 3000 , \ array of windows in CPU RAM

    CODE BLOCK ( bank# -- buffer)
    \ FAST test if we already have the bank# in one of windows
        W CLR,                   \ W is index register = 0
        BLK#S (W) TOS CMP,       \ do we have the requested bank#
        EQ IF,                   \ yes we do
            WINDOWS (W) TOS MOV, \ use WINDOWS(0) ie: >2000
            NEXT,                \ Return to Forth
        ENDIF,
        W INCT,                  \ bump index to next "cell"
        BLK#S (W) TOS CMP,
        EQ IF,
            WINDOWS (W) TOS MOV, \ use WINDOWS(2) ie: >3000
            NEXT,                \ Return to Forth
        ENDIF,
    \ ** bank# is not in RAM. Get it
    \ whatever blk# was last used, switch to the other one
        W 0001 LI,               \ init W to 1
        USE @@ W XOR,            \ toggle it with the last buffer we used
        W USE @@ MOV,            \ update the USE variable. Can only be 1 or 0
        W W ADD,                 \ "do 2*" It now has the index we will use
        TOS BLK#S (W) MOV,       \ store the NEW bank# in BLK#S array
        WINDOWS (W) R1 MOV,      \ get the window to use
    \ compute address of SAMS card register for this window
        R1 0B SRL,               \ divide by 2048
        R1 4000 AI,              \ Add base address of SAMS registers
        R12 1E00 LI,             \ select CRU address of SAMS card
        0 SBO,                   \ SAMS card on
        TOS SWPB,                \ swap bytes on bank value
        TOS R1 ** MOV,           \ load bank into SAMS card register
        0 SBZ,                   \ SAMS card off
        WINDOWS (W) TOS MOV,     \ return buffer on TOS
        NEXT,
    ENDCODE

    SAMSINI CR .( SAMS card initialized)
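The register-address arithmetic in the middle of BLOCK ("divide by 2048, add >4000") can be checked on its own in high-level Forth. The word name SAMS-REG is hypothetical, used only for this sketch: each 4K window maps to a 2-byte mapper register, so shifting the window address right by 11 bits gives the byte offset of its register.

```forth
HEX
\ Hypothetical helper mirroring the "R1 0B SRL,  R1 4000 AI," pair above.
: SAMS-REG ( window-addr -- reg-addr )  0B RSHIFT 4000 + ;

2000 SAMS-REG .    \ prints 4004 (hex): register for the window at >2000
3000 SAMS-REG .    \ prints 4006 (hex): register for the window at >3000
```

Since the address is derived from the WINDOWS entry itself, moving a window only means changing the value stored in the array.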
TheBF (Author), posted March 23, 2022:

I was never happy with the way I created the ability to load code in temporary memory and then re-link the dictionary. It seemed buggy and not clear to me, and I wrote it!

It always makes my head spin when I have to re-work dictionary links. Camel Forth uses LINK->NAME->LINK linkage. It's not as easy to hold in my head as LINK->LINK->LINK.

With a small sketch I was able to get a better mental picture, and that helped simplify the code. (But it is still hard to understand.)

I also removed the input argument to TRANSIENT. It now just uses the H variable. Set that where you need it to be. H = >2000 when the system boots.

    \ transient compilation Mar 19 2022 Brian Fox
    \ modified to default to use H @ for TRANSIENT definitions memory
    \ INCLUDE DSK1.TOOLS \ for debugging
    CR .( Compile transient code in LOW RAM and remove it later)
    CR .( Remove temporary words with: DETACH )
    HEX
    VARIABLE OLDDP   \ remember the dictionary pointer
    VARIABLE OLDH    \ remember the HEAP (low RAM)
    VARIABLE OLDLINK \ link field of a dummy word after PERMANENT

    : TRANSIENT ( -- )
        H @ DUP>R OLDH !
        HERE OLDDP !  \ save the dictionary pointer.
        R> DP !       \ Point DP to transient memory
    ;

    : PERMANENT ( Marks end of transient definitions )
        HERE H !                   \ update heap pointer (LOW RAM)
        S" " HEADER,               \ DUMMY word is blank. Can't be found
        LATEST @ NFA>LFA OLDLINK ! \ Remember LFA of DUMMY
        OLDDP @ DP !               \ restore normal dictionary
        OLDDP OFF
    ;

    \ removes everything from TRANSIENT to this definition
    : DETACH [ LATEST @ ] LITERAL OLDLINK @ ! ;

DETACH is my new name for the word that "detaches" the TRANSIENT dictionary from the main dictionary. It replaces ERADICATE. Seemed like a better name.

So far it works as expected, although it's not nestable. So you use it to get the assembler, compile the code, then DETACH. Then you could do that again for another file. That's not a real hardship, since mostly it's there to get the assembler into the system without taking up memory space.

It seems to be a great use of the SUPERCART memory, especially if you are compiling programs to create EA5 executables. It means you don't have to convert all the Assembler code to machine code to make room for your program, but you can still test with real data in LOW RAM if needed.

    CR .( SUPERTOOLS: utilities in SUPER Cart RAM Mar 22 2022)
    CR
    NEEDS TRANSIENT FROM DSK1.TRANSIENT

    CR .( Compile Tools in LOW RAM)
    HEX 6000 H ! ( put heap in SUPER CART)
    TRANSIENT
    INCLUDE DSK1.WORDLISTS
    ONLY FORTH DEFINITIONS
    INCLUDE DSK1.ELAPSE
    INCLUDE DSK1.TOOLS
    VOCABULARY ASSEMBLER
    ALSO ASSEMBLER DEFINITIONS
    INCLUDE DSK1.ASM9900
    PERMANENT

    HEX 2000 H ! \ restore heap to normal low ram
    .FREE
    DECIMAL
    ONLY FORTH DEFINITIONS ALSO ASSEMBLER
    ORDER