Jump to content

Photo

Differential calculations


15 replies to this topic

#1 Asmusr OFFLINE  

Asmusr

    River Patroller

  • 2,270 posts
  • Location:Denmark

Posted Fri Apr 28, 2017 12:47 PM

Differential calculations are an efficient way to reduce the time it takes to do complex calculations on old computers. On modern computers this doesn't matter much, and perhaps not in BASIC either where all the calculations are floating point, but in an assembly programs this can make a lot of difference. 
 
For the first example, let's say we need to calculate x squared from 1 to 10. An obvious way to do this is:
 
10 for x=1 to 10
20 print x*x
30 next x
 
But we don't like the multiplication since this takes much more time than addition or subtraction. Let's investigate how the value of x*x changes when x changes from x to x+1:
 
dx = (x+1)*(x+1) - x*x = x*x + x + x + 1 - x*x = 2*x + 1
 
That's not bad since 2*x can be done as a binary shift in assembly, and +1 is an INC instruction, which together are faster than a multiplication.
 
In BASIC it looks like this: 
 
10 x=1
20 for n=1 to 10
30 print x
40 x=x+2*n+1
50 next n
 
But we can go further to reduce the complexity of the calculations. Let's investigate what happens to dx=2*x+1 as x changes from x to x+1:
 
ddx = 2*(x+1)+1 - (2*x+1) = 2*x + 2 + 1 - 2*x - 1 = 2
 
We like that :) : no more multiplications. For each loop we just need to add 2 to dx, and dx to x:  
 
10 x=1
20 dx=1
30 for n=1 to 10
40 print x
50 dx=dx+2
60 x=x+dx
70 next n
 
So we have ended up calculating a series of squares using additions only. In assembly dx=dx+2 can simply be written as INCT @DX which is even faster than a general addition. 
 
This was a very simple example. Let's take another where differential calculations really matter. We want to rotate all the pixels on the screen by a certain angle (a). The formulas for rotating a single point (x, y) are:
 
x1 = x * cos(a) - y * sin(a) 
y1 = y * cos(a) + x * sin(a)
 
That's pretty complex, but as we move through a scanline x always changes from x to x+1. Let's investigate what happens to x1 and y1 in this case: 
 
dx1 = (x+1) * cos(a) - y * sin(a) - (x * cos(a) - y * sin(a)) = (x+1) * cos(a) - y * sin(a) - x * cos(a) + y * sin(a) = (x+1) * cos(a) - x * cos(a) = cos(a)
dy1 = y * cos(a) + (x+1) * sin(a) - (y * cos(a) + x * sin(a)) = y * cos(a) + (x+1) * sin(a) - y * cos(a) - x * sin(a) = (x+1) * sin(a) - x * sin(a) = sin(a)
 
So as we move a pixel to the right, dx1=dx1+cos(a) and dy1=dy1+sin(a). Since we want to rotate the entire image by the same angle a, cos(a) and sin(a) are constants, so again we have reduced the complex calculations to simple additions.
 
If we do the same math when we move one scanline down (y -> y + 1) we get dx1=dx1-sin(a) and dy1=dy1+cos(a), so we can rotate an entire image using just additions and subtractions.
 
I hope this shows how differential calculations can be useful. If you have other examples, please post them here.


#2 matthew180 OFFLINE  

matthew180

    River Patroller

  • 2,302 posts
  • Location:Castaic, California

Posted Fri Apr 28, 2017 2:30 PM

Amazing information, as usual.  Very cool.

 

1+1=2

 

That's the best I can offer. ;-)



#3 Asmusr OFFLINE  

Asmusr

    River Patroller

  • Topic Starter
  • 2,270 posts
  • Location:Denmark

Posted Fri Apr 28, 2017 3:01 PM

The calculation of a jump that was discussed the other day is also a differential example. In general, for moving something around in 2D with acceleration ddx, ddy and velocity dx, dy we have:

 

dx = dx + ddx

x = x + dx

 

dy = dy + ddy

y = y + dy

 

Even with constant ddx and ddy these simple formulas can produce many interesting trajectories, but if we want to simulate something like space gravity it gets a bit more complex. 



#4 Vorticon OFFLINE  

Vorticon

    River Patroller

  • 2,593 posts
  • Location:Eagan, MN, USA

Posted Fri Apr 28, 2017 3:01 PM

Rasmus, this is going to be extremely handy for a future project I have in mind. Thank you for the crystal clear explanation! 

Oh and Matt, could you please elaborate on that equation a bit further? It's a little muddy...  :D



#5 senior_falcon OFFLINE  

senior_falcon

    Dragonstomper

  • 828 posts
  • Location:Lansing, NY, USA

Posted Fri Apr 28, 2017 3:13 PM

 You are probably thinking 1 AND 1 and wondered how Matthew got 2 out of that.  I can see how you might be confused. :)



#6 Vorticon OFFLINE  

Vorticon

    River Patroller

  • 2,593 posts
  • Location:Eagan, MN, USA

Posted Fri Apr 28, 2017 4:04 PM

 You are probably thinking 1 AND 1 and wondered how Matthew got 2 out of that.  I can see how you might be confused. :)

 

That it!!! 



#7 matthew180 OFFLINE  

matthew180

    River Patroller

  • 2,302 posts
  • Location:Castaic, California

Posted Fri Apr 28, 2017 6:08 PM

Since our posts have now degraded (thanks to me), it might be interesting to observe that:

 

1&0 == 2

 

In a certain context.



#8 BJGuillot OFFLINE  

BJGuillot

    Star Raider

  • 51 posts

Posted Fri May 5, 2017 7:44 AM

I love math "hacks". Thanks for the clear examples.

Also, thanks for a nice real-life use of the INCT instruction. I rarely find myself using it and always thought it was a bit odd to be included in the CPU.

#9 jedimatt42 OFFLINE  

jedimatt42

    Stargunner

  • 1,121 posts
  • Location:Beaverton, OR

Posted Fri May 5, 2017 9:02 AM

I always imagined the INCT instruction would show itself useful when playing with addressing.

To iterate across a chunk of ram performing 16 bit data operations, INCT makes sense.

We just don't get to do a lot of 16 bit work given that the peripherals vdp/sound all want 8 bit data.

---

For the rotation example, I am assuming the reference point is (0,0) ?

-M@

#10 Asmusr OFFLINE  

Asmusr

    River Patroller

  • Topic Starter
  • 2,270 posts
  • Location:Denmark

Posted Fri May 5, 2017 9:37 AM

I always imagined the INCT instruction would show itself useful when playing with addressing.

To iterate across a chunk of ram performing 16 bit data operations, INCT makes sense.

We just don't get to do a lot of 16 bit work given that the peripherals vdp/sound all want 8 bit data.

---

For the rotation example, I am assuming the reference point is (0,0) ?

-M@

 

I use INCT to iterate through lists of pointers when it's not possible to use post-increment. I use DECT to decrement the counter in CPU RAM to CPU RAM copy loops, although I could often use DEC with half the initial value.

 

Yes the reference point is (0,0) for the rotation. So if you rotate around another point (x0,y0) you need to subtract that from (x,y) before calculating the initial values of (x1,y1) for the top left corner of the image to rotate. However, once that's done the incremental calculations are the same.



#11 Lee Stewart ONLINE  

Lee Stewart

    River Patroller

  • 3,149 posts
  • Location:Silver Run, Maryland

Posted Fri May 5, 2017 11:58 AM

In the fbForth 2.0 source code, I have used INCT 36 times and DECT 525 times.  Most of the DECT uses are for reserving 16-bit stack space (parameter and return stacks grow downward).  There are probably an equal number of indirect 16-bit increments via *SP+ or *R+ for popping the stacks.

 

...lee



#12 BJGuillot OFFLINE  

BJGuillot

    Star Raider

  • 51 posts

Posted Fri May 5, 2017 8:35 PM

In the fbForth 2.0 source code, I have used INCT 36 times and DECT 525 times.  Most of the DECT uses are for reserving 16-bit stack space (parameter and return stacks grow downward).  There are probably an equal number of indirect 16-bit increments via *SP+ or *R+ for popping the stacks.

 

Interesting.  I just checked my QR Code generator source, and it looks like I used INCT just 2 times (and 1 of those wasn't code I wrote, but a copy of GPLLNK routine), and used DECT 9 times.

 

That reminds me that I probably should put up the assembly language code for my QR Code generator up on GitHub soon before I have a disk crash or something.  I've been meaning to do it, but was a bit embarrassed by how spaghetti-like it is.  Though I just just go ahead and put it up as-is this weekend.



#13 sometimes99er OFFLINE  

sometimes99er

    River Patroller

  • 3,806 posts
  • Location:Denmark

Posted Sat May 6, 2017 12:18 AM

I have deducted by observing and experimenting (using high-level programming languages on modern PCs), rather than going through the math. Which I apparently should. ;)
 
I think one can quite easily calculate any precision sine wave table (in Assembly), but then the output rather asks for ROM than RAM. If your time progression is linear then one can avoid sine, but for jumping in time, a table is more or less essential.
 

Interesting. I just checked my QR Code generator source, and it looks like I used INCT just 2 times (and 1 of those wasn't code I wrote, but a copy of GPLLNK routine), and used DECT 9 times.

 

Yes, with auto-increment, one would perhaps expect to see more DECT than INCT.
 
I ran through my 100+ 8K Assembly ditties.
 
I use one DECT in the general "return using stack" segment. Otherwise sometimes one more DECT.
 
The use of INCT ranges from 0 to 8. Operating on stacks, lists and structures, and probably more or less coincidentally when on word boundaries.
  

That reminds me that I probably should put up the assembly language code for my QR Code generator up on GitHub soon before I have a disk crash or something. I've been meaning to do it, but was a bit embarrassed by how spaghetti-like it is.  Though I just just go ahead and put it up as-is this weekend.

 

Since accidentally hitting FCTN QUIT in 1982, I think I've done a good deal of backups. ;)


Edited by sometimes99er, Sun May 7, 2017 3:24 AM.


#14 RXB OFFLINE  

RXB

    River Patroller

  • 2,513 posts
  • Location:Vancouver, Washington, USA

Posted Sat May 6, 2017 11:33 AM

XB ROMs use:

INCT 29 times

DECT19 times

 

Example of ROM Page 2:

* Move 26 Characters from TOPSTK to VDP ROLLOUT                
LN7A90 LI   R1,TOPSTK      *#            
LN7A94 LI   R3,VDPROA      * VDP ROLL OUT AREA >O3CO            
LN7A98 MOVB @LR3,*R15      * VDP Address             
LN7A9C ORI  R3,>4000                   
LN7AA0 MOVB R3,*R15        * VDP Address             
LN7AA2 LI   R0,>001A       * Put 26 in BYTE COUNTER              
LN7AA6 MOVB *R1+,@VDPWD    *# VDP WRITE DATA             
LN7AAA DEC  R0             * BYTE COUNTER-1             
LN7AAC JNE  LN7AA6         * 0? NO LOOP              
LN7AAE CLR  @FAC8                     
LN7AB2 INCT @SUBSTK        *# SUBSTACK+2              
LN7AB6 MOVB @SUBSTK,R9                  
LN7ABA SRL  R9,8           * Strip High byte from word            
LN7ABC AI   R9,VAR0                    
LN7AC0 MOV  R10,*R9                    
LN7AC2 B    *R11           * RETURN
*
* Move 26 Characters from VDP ROLLOUT to TOPSTK               
LN7AC4 LI   R1,TOPSTK      *#            
LN7AC8 MOVB @>7A97,*R15    * >C0  to VDP Address            
LN7ACC MOVB @>7A96,*R15    * >03  to VDP Address            
LN7AD0 LI   R0,>001A       * BYTE COUNTER (26=>001A)            
LN7AD4 MOVB @VDPRD,*R1+    *# VDP READ DATA             
LN7AD8 DEC  R0             * BYTE COUNTER-1             
LN7ADA JNE  LN7AD4         * 0? NO LOOP               
LN7ADC CLR  @FAC8                     
LN7AE0 MOVB @SUBSTK,R9     *#             
LN7AE4 SRL  R9,8           * Strip High byte from word            
LN7AE6 AI   R9,VAR0                    
LN7AEA MOV  *R9,R11                    
LN7AEC DECT @SUBSTK        * SUBSTACK-2              
LN7AF0 B    *R11           * RETURN


#15 TheBF OFFLINE  

TheBF

    Chopper Commander

  • 164 posts
  • Location:The Great White North

Posted Mon May 8, 2017 9:43 AM

Using this little script here is what I count in console ROMs

( find INCT & DECT )
VARIABLE ICNT        ( counter variables)
VARIABLE DCNT
: CELLS  2* ;
: INCT?   ( n -- )   0FF0 AND   05C0 =  IF   1 ICNT +! THEN ;
: DECT?   ( n -- )   0FF0 AND   0640 =  IF   1 DCNT +! THEN ;
: COUNTEM ( addr cnt -- )
          OVER + SWAP        ( convert to address range)
          DO
              I @  INCT?     ( read each address)
              I @  DECT?
          2 +LOOP            ( advance by 2 bytes)
          DECIMAL
          CR ." INCT : "  ICNT ?
          CR ." DECT : "  DCNT ?
          CR ." done"
;
( test data )
CREATE T  05C0 , 05C0 , 05C0 , 05C0 , 05C0 , 05C0 ,
          0640 , 0640 , 0640 , 0640 , 0640 , 0640 ,
DECIMAL
T  12 CELLS COUNTEM  ( returns 6 & 6 )


HEX 0 2000 CELLS COUNTEM

INCT   : 34

DECT :  27

done ok

 

 

 


Edited by TheBF, Mon May 8, 2017 9:45 AM.


#16 RXB OFFLINE  

RXB

    River Patroller

  • 2,513 posts
  • Location:Vancouver, Washington, USA

Posted Mon May 8, 2017 10:06 PM

Using this little script here is what I count in console ROMs

( find INCT & DECT )
VARIABLE ICNT        ( counter variables)
VARIABLE DCNT
: CELLS  2* ;
: INCT?   ( n -- )   0FF0 AND   05C0 =  IF   1 ICNT +! THEN ;
: DECT?   ( n -- )   0FF0 AND   0640 =  IF   1 DCNT +! THEN ;
: COUNTEM ( addr cnt -- )
          OVER + SWAP        ( convert to address range)
          DO
              I @  INCT?     ( read each address)
              I @  DECT?
          2 +LOOP            ( advance by 2 bytes)
          DECIMAL
          CR ." INCT : "  ICNT ?
          CR ." DECT : "  DCNT ?
          CR ." done"
;
( test data )
CREATE T  05C0 , 05C0 , 05C0 , 05C0 , 05C0 , 05C0 ,
          0640 , 0640 , 0640 , 0640 , 0640 , 0640 ,
DECIMAL
T  12 CELLS COUNTEM  ( returns 6 & 6 )


HEX 0 2000 CELLS COUNTEM

INCT   : 34

DECT :  27

done ok

 

 

 

This has to be for GROM and VDP access as that is exactly why you see it  in the XB ROMs.






0 user(s) are browsing this forum

0 members, 0 guests, 0 anonymous users