Asmusr Posted April 28, 2017 Share Posted April 28, 2017 Differential calculations are an efficient way to reduce the time it takes to do complex calculations on old computers. On modern computers this doesn't matter much, and perhaps not in BASIC either where all the calculations are floating point, but in an assembly programs this can make a lot of difference. For the first example, let's say we need to calculate x squared from 1 to 10. An obvious way to do this is: 10 for x=1 to 10 20 print x*x 30 next x But we don't like the multiplication since this takes much more time than addition or subtraction. Let's investigate how the value of x*x changes when x changes from x to x+1: dx = (x+1)*(x+1) - x*x = x*x + x + x + 1 - x*x = 2*x + 1 That's not bad since 2*x can be done as a binary shift in assembly, and +1 is an INC instruction, which together are faster than a multiplication. In BASIC it looks like this: 10 x=1 20 for n=1 to 10 30 print x 40 x=x+2*n+1 50 next n But we can go further to reduce the complexity of the calculations. Let's investigate what happens to dx=2*x+1 as x changes from x to x+1: ddx = 2*(x+1)+1 - (2*x+1) = 2*x + 2 + 1 - 2*x - 1 = 2 We like that : no more multiplications. For each loop we just need to add 2 to dx, and dx to x: 10 x=1 20 dx=1 30 for n=1 to 10 40 print x 50 dx=dx+2 60 x=x+dx 70 next n So we have ended up calculating a series of squares using additions only. In assembly dx=dx+2 can simply be written as INCT @DX which is even faster than a general addition. This was a very simple example. Let's take another where differential calculations really matter. We want to rotate all the pixels on the screen by a certain angle (a). The formulas for rotating a single point (x, y) are: x1 = x * cos(a) - y * sin(a) y1 = y * cos(a) + x * sin(a) That's pretty complex, but as we move through a scanline x always changes from x to x+1. Let's investigate what happens to x1 and y1 in this case: dx1 = (x+1) * cos(a) - y * sin(a) - (x * cos(a) - y * sin(a)) = (x+1) * cos(a) - y * sin(a) - x * cos(a) + y * sin(a) = (x+1) * cos(a) - x * cos(a) = cos(a) dy1 = y * cos(a) + (x+1) * sin(a) - (y * cos(a) + x * sin(a)) = y * cos(a) + (x+1) * sin(a) - y * cos(a) - x * sin(a) = (x+1) * sin(a) - x * sin(a) = sin(a) So as we move a pixel to the right, dx1=dx1+cos(a) and dy1=dy1+sin(a). Since we want to rotate the entire image by the same angle a, cos(a) and sin(a) are constants, so again we have reduced the complex calculations to simple additions. If we do the same math when we move one scanline down (y -> y + 1) we get dx1=dx1-sin(a) and dy1=dy1+cos(a), so we can rotate an entire image using just additions and subtractions. I hope this shows how differential calculations can be useful. If you have other examples, please post them here. 13 Quote Link to comment Share on other sites More sharing options...
matthew180 Posted April 28, 2017 Share Posted April 28, 2017 Amazing information, as usual. Very cool. 1+1=2 That's the best I can offer. 2 Quote Link to comment Share on other sites More sharing options...
Asmusr Posted April 28, 2017 Author Share Posted April 28, 2017 The calculation of a jump that was discussed the other day is also a differential example. In general, for moving something around in 2D with acceleration ddx, ddy and velocity dx, dy we have: dx = dx + ddx x = x + dx dy = dy + ddy y = y + dy Even with constant ddx and ddy these simple formulas can produce many interesting trajectories, but if we want to simulate something like space gravity it gets a bit more complex. 1 Quote Link to comment Share on other sites More sharing options...
+Vorticon Posted April 28, 2017 Share Posted April 28, 2017 Rasmus, this is going to be extremely handy for a future project I have in mind. Thank you for the crystal clear explanation! Oh and Matt, could you please elaborate on that equation a bit further? It's a little muddy... 1 Quote Link to comment Share on other sites More sharing options...
senior_falcon Posted April 28, 2017 Share Posted April 28, 2017 You are probably thinking 1 AND 1 and wondered how Matthew got 2 out of that. I can see how you might be confused. 1 Quote Link to comment Share on other sites More sharing options...
+Vorticon Posted April 28, 2017 Share Posted April 28, 2017 You are probably thinking 1 AND 1 and wondered how Matthew got 2 out of that. I can see how you might be confused. That it!!! Quote Link to comment Share on other sites More sharing options...
matthew180 Posted April 29, 2017 Share Posted April 29, 2017 Since our posts have now degraded (thanks to me), it might be interesting to observe that: 1&0 == 2 In a certain context. Quote Link to comment Share on other sites More sharing options...
BJGuillot Posted May 5, 2017 Share Posted May 5, 2017 I love math "hacks". Thanks for the clear examples. Also, thanks for a nice real-life use of the INCT instruction. I rarely find myself using it and always thought it was a bit odd to be included in the CPU. Quote Link to comment Share on other sites More sharing options...
+jedimatt42 Posted May 5, 2017 Share Posted May 5, 2017 I always imagined the INCT instruction would show itself useful when playing with addressing. To iterate across a chunk of ram performing 16 bit data operations, INCT makes sense. We just don't get to do a lot of 16 bit work given that the peripherals vdp/sound all want 8 bit data. --- For the rotation example, I am assuming the reference point is (0,0) ? -M@ Quote Link to comment Share on other sites More sharing options...
Asmusr Posted May 5, 2017 Author Share Posted May 5, 2017 I always imagined the INCT instruction would show itself useful when playing with addressing. To iterate across a chunk of ram performing 16 bit data operations, INCT makes sense. We just don't get to do a lot of 16 bit work given that the peripherals vdp/sound all want 8 bit data. --- For the rotation example, I am assuming the reference point is (0,0) ? -M@ I use INCT to iterate through lists of pointers when it's not possible to use post-increment. I use DECT to decrement the counter in CPU RAM to CPU RAM copy loops, although I could often use DEC with half the initial value. Yes the reference point is (0,0) for the rotation. So if you rotate around another point (x0,y0) you need to subtract that from (x,y) before calculating the initial values of (x1,y1) for the top left corner of the image to rotate. However, once that's done the incremental calculations are the same. 2 Quote Link to comment Share on other sites More sharing options...
+Lee Stewart Posted May 5, 2017 Share Posted May 5, 2017 In the fbForth 2.0 source code, I have used INCT 36 times and DECT 525 times. Most of the DECT uses are for reserving 16-bit stack space (parameter and return stacks grow downward). There are probably an equal number of indirect 16-bit increments via *SP+ or *R+ for popping the stacks. ...lee Quote Link to comment Share on other sites More sharing options...
BJGuillot Posted May 6, 2017 Share Posted May 6, 2017 In the fbForth 2.0 source code, I have used INCT 36 times and DECT 525 times. Most of the DECT uses are for reserving 16-bit stack space (parameter and return stacks grow downward). There are probably an equal number of indirect 16-bit increments via *SP+ or *R+ for popping the stacks. Interesting. I just checked my QR Code generator source, and it looks like I used INCT just 2 times (and 1 of those wasn't code I wrote, but a copy of GPLLNK routine), and used DECT 9 times. That reminds me that I probably should put up the assembly language code for my QR Code generator up on GitHub soon before I have a disk crash or something. I've been meaning to do it, but was a bit embarrassed by how spaghetti-like it is. Though I just just go ahead and put it up as-is this weekend. Quote Link to comment Share on other sites More sharing options...
sometimes99er Posted May 6, 2017 Share Posted May 6, 2017 (edited) I have deducted by observing and experimenting (using high-level programming languages on modern PCs), rather than going through the math. Which I apparently should. I think one can quite easily calculate any precision sine wave table (in Assembly), but then the output rather asks for ROM than RAM. If your time progression is linear then one can avoid sine, but for jumping in time, a table is more or less essential. Interesting. I just checked my QR Code generator source, and it looks like I used INCT just 2 times (and 1 of those wasn't code I wrote, but a copy of GPLLNK routine), and used DECT 9 times. Yes, with auto-increment, one would perhaps expect to see more DECT than INCT. I ran through my 100+ 8K Assembly ditties. I use one DECT in the general "return using stack" segment. Otherwise sometimes one more DECT. The use of INCT ranges from 0 to 8. Operating on stacks, lists and structures, and probably more or less coincidentally when on word boundaries. That reminds me that I probably should put up the assembly language code for my QR Code generator up on GitHub soon before I have a disk crash or something. I've been meaning to do it, but was a bit embarrassed by how spaghetti-like it is. Though I just just go ahead and put it up as-is this weekend. Since accidentally hitting FCTN QUIT in 1982, I think I've done a good deal of backups. Edited May 7, 2017 by sometimes99er Quote Link to comment Share on other sites More sharing options...
RXB Posted May 6, 2017 Share Posted May 6, 2017 XB ROMs use: INCT 29 times DECT19 times Example of ROM Page 2: * Move 26 Characters from TOPSTK to VDP ROLLOUT LN7A90 LI R1,TOPSTK *# LN7A94 LI R3,VDPROA * VDP ROLL OUT AREA >O3CO LN7A98 MOVB @LR3,*R15 * VDP Address LN7A9C ORI R3,>4000 LN7AA0 MOVB R3,*R15 * VDP Address LN7AA2 LI R0,>001A * Put 26 in BYTE COUNTER LN7AA6 MOVB *R1+,@VDPWD *# VDP WRITE DATA LN7AAA DEC R0 * BYTE COUNTER-1 LN7AAC JNE LN7AA6 * 0? NO LOOP LN7AAE CLR @FAC8 LN7AB2 INCT @SUBSTK *# SUBSTACK+2 LN7AB6 MOVB @SUBSTK,R9 LN7ABA SRL R9,8 * Strip High byte from word LN7ABC AI R9,VAR0 LN7AC0 MOV R10,*R9 LN7AC2 B *R11 * RETURN * * Move 26 Characters from VDP ROLLOUT to TOPSTK LN7AC4 LI R1,TOPSTK *# LN7AC8 MOVB @>7A97,*R15 * >C0 to VDP Address LN7ACC MOVB @>7A96,*R15 * >03 to VDP Address LN7AD0 LI R0,>001A * BYTE COUNTER (26=>001A) LN7AD4 MOVB @VDPRD,*R1+ *# VDP READ DATA LN7AD8 DEC R0 * BYTE COUNTER-1 LN7ADA JNE LN7AD4 * 0? NO LOOP LN7ADC CLR @FAC8 LN7AE0 MOVB @SUBSTK,R9 *# LN7AE4 SRL R9,8 * Strip High byte from word LN7AE6 AI R9,VAR0 LN7AEA MOV *R9,R11 LN7AEC DECT @SUBSTK * SUBSTACK-2 LN7AF0 B *R11 * RETURN Quote Link to comment Share on other sites More sharing options...
+TheBF Posted May 8, 2017 Share Posted May 8, 2017 (edited) Using this little script here is what I count in console ROMs ( find INCT & DECT ) VARIABLE ICNT ( counter variables) VARIABLE DCNT : CELLS 2* ; : INCT? ( n -- ) 0FF0 AND 05C0 = IF 1 ICNT +! THEN ; : DECT? ( n -- ) 0FF0 AND 0640 = IF 1 DCNT +! THEN ; : COUNTEM ( addr cnt -- ) OVER + SWAP ( convert to address range) DO I @ INCT? ( read each address) I @ DECT? 2 +LOOP ( advance by 2 bytes) DECIMAL CR ." INCT : " ICNT ? CR ." DECT : " DCNT ? CR ." done" ; ( test data ) CREATE T 05C0 , 05C0 , 05C0 , 05C0 , 05C0 , 05C0 , 0640 , 0640 , 0640 , 0640 , 0640 , 0640 , DECIMAL T 12 CELLS COUNTEM ( returns 6 & 6 ) HEX 0 2000 CELLS COUNTEM INCT : 34 DECT : 27 done ok Edited May 8, 2017 by TheBF 2 Quote Link to comment Share on other sites More sharing options...
RXB Posted May 9, 2017 Share Posted May 9, 2017 Using this little script here is what I count in console ROMs ( find INCT & DECT ) VARIABLE ICNT ( counter variables) VARIABLE DCNT : CELLS 2* ; : INCT? ( n -- ) 0FF0 AND 05C0 = IF 1 ICNT +! THEN ; : DECT? ( n -- ) 0FF0 AND 0640 = IF 1 DCNT +! THEN ; : COUNTEM ( addr cnt -- ) OVER + SWAP ( convert to address range) DO I @ INCT? ( read each address) I @ DECT? 2 +LOOP ( advance by 2 bytes) DECIMAL CR ." INCT : " ICNT ? CR ." DECT : " DCNT ? CR ." done" ; ( test data ) CREATE T 05C0 , 05C0 , 05C0 , 05C0 , 05C0 , 05C0 , 0640 , 0640 , 0640 , 0640 , 0640 , 0640 , DECIMAL T 12 CELLS COUNTEM ( returns 6 & 6 ) HEX 0 2000 CELLS COUNTEM INCT : 34 DECT : 27 done ok This has to be for GROM and VDP access as that is exactly why you see it in the XB ROMs. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.