# Differential calculations

15 replies to this topic

### #1 AsmusrOFFLINE

Asmusr

River Patroller

• 2,580 posts
• Location:Denmark

Posted Fri Apr 28, 2017 12:47 PM

Differential calculations are an efficient way to reduce the time it takes to do complex calculations on old computers. On modern computers this doesn't matter much, and perhaps not in BASIC either where all the calculations are floating point, but in an assembly programs this can make a lot of difference.

For the first example, let's say we need to calculate x squared from 1 to 10. An obvious way to do this is:

```10 for x=1 to 10
20 print x*x
30 next x```

But we don't like the multiplication since this takes much more time than addition or subtraction. Let's investigate how the value of x*x changes when x changes from x to x+1:

`dx = (x+1)*(x+1) - x*x = x*x + x + x + 1 - x*x = 2*x + 1`

That's not bad since 2*x can be done as a binary shift in assembly, and +1 is an INC instruction, which together are faster than a multiplication.

In BASIC it looks like this:

```10 x=1
20 for n=1 to 10
30 print x
40 x=x+2*n+1
50 next n```

But we can go further to reduce the complexity of the calculations. Let's investigate what happens to dx=2*x+1 as x changes from x to x+1:

`ddx = 2*(x+1)+1 - (2*x+1) = 2*x + 2 + 1 - 2*x - 1 = 2`

We like that  : no more multiplications. For each loop we just need to add 2 to dx, and dx to x:

```10 x=1
20 dx=1
30 for n=1 to 10
40 print x
50 dx=dx+2
60 x=x+dx
70 next n```

So we have ended up calculating a series of squares using additions only. In assembly dx=dx+2 can simply be written as INCT @DX which is even faster than a general addition.

This was a very simple example. Let's take another where differential calculations really matter. We want to rotate all the pixels on the screen by a certain angle (a). The formulas for rotating a single point (x, y) are:

```x1 = x * cos(a) - y * sin(a)
y1 = y * cos(a) + x * sin(a)```

That's pretty complex, but as we move through a scanline x always changes from x to x+1. Let's investigate what happens to x1 and y1 in this case:

dx1 = (x+1) * cos(a) - y * sin(a) - (x * cos(a) - y * sin(a)) = (x+1) * cos(a) - y * sin(a) - x * cos(a) + y * sin(a) = (x+1) * cos(a) - x * cos(a) = cos(a)
dy1 = y * cos(a) + (x+1) * sin(a) - (y * cos(a) + x * sin(a)) = y * cos(a) + (x+1) * sin(a) - y * cos(a) - x * sin(a) = (x+1) * sin(a) - x * sin(a) = sin(a)

So as we move a pixel to the right, dx1=dx1+cos(a) and dy1=dy1+sin(a). Since we want to rotate the entire image by the same angle a, cos(a) and sin(a) are constants, so again we have reduced the complex calculations to simple additions.

If we do the same math when we move one scanline down (y -> y + 1) we get dx1=dx1-sin(a) and dy1=dy1+cos(a), so we can rotate an entire image using just additions and subtractions.

I hope this shows how differential calculations can be useful. If you have other examples, please post them here.

### #2 matthew180OFFLINE

matthew180

River Patroller

• 2,450 posts
• Location:Castaic, California

Posted Fri Apr 28, 2017 2:30 PM

Amazing information, as usual.  Very cool.

1+1=2

That's the best I can offer.

### #3 AsmusrOFFLINE

Asmusr

River Patroller

• Topic Starter
• 2,580 posts
• Location:Denmark

Posted Fri Apr 28, 2017 3:01 PM

The calculation of a jump that was discussed the other day is also a differential example. In general, for moving something around in 2D with acceleration ddx, ddy and velocity dx, dy we have:

dx = dx + ddx

x = x + dx

dy = dy + ddy

y = y + dy

Even with constant ddx and ddy these simple formulas can produce many interesting trajectories, but if we want to simulate something like space gravity it gets a bit more complex.

### #4 VorticonONLINE

Vorticon

River Patroller

• 2,991 posts
• Location:Eagan, MN, USA

Posted Fri Apr 28, 2017 3:01 PM

Rasmus, this is going to be extremely handy for a future project I have in mind. Thank you for the crystal clear explanation!

Oh and Matt, could you please elaborate on that equation a bit further? It's a little muddy...

### #5 senior_falconONLINE

senior_falcon

Stargunner

• 1,048 posts
• Location:Lansing, NY, USA

Posted Fri Apr 28, 2017 3:13 PM

You are probably thinking 1 AND 1 and wondered how Matthew got 2 out of that.  I can see how you might be confused.

### #6 VorticonONLINE

Vorticon

River Patroller

• 2,991 posts
• Location:Eagan, MN, USA

Posted Fri Apr 28, 2017 4:04 PM

You are probably thinking 1 AND 1 and wondered how Matthew got 2 out of that.  I can see how you might be confused.

That it!!!

### #7 matthew180OFFLINE

matthew180

River Patroller

• 2,450 posts
• Location:Castaic, California

Posted Fri Apr 28, 2017 6:08 PM

Since our posts have now degraded (thanks to me), it might be interesting to observe that:

1&0 == 2

In a certain context.

### #8 BJGuillotOFFLINE

BJGuillot

Star Raider

• 52 posts

Posted Fri May 5, 2017 7:44 AM

I love math "hacks". Thanks for the clear examples.

Also, thanks for a nice real-life use of the INCT instruction. I rarely find myself using it and always thought it was a bit odd to be included in the CPU.

### #9 jedimatt42OFFLINE

jedimatt42

Stargunner

• 1,400 posts
• Location:Beaverton, OR

Posted Fri May 5, 2017 9:02 AM

I always imagined the INCT instruction would show itself useful when playing with addressing.

To iterate across a chunk of ram performing 16 bit data operations, INCT makes sense.

We just don't get to do a lot of 16 bit work given that the peripherals vdp/sound all want 8 bit data.

---

For the rotation example, I am assuming the reference point is (0,0) ?

-M@

### #10 AsmusrOFFLINE

Asmusr

River Patroller

• Topic Starter
• 2,580 posts
• Location:Denmark

Posted Fri May 5, 2017 9:37 AM

I always imagined the INCT instruction would show itself useful when playing with addressing.

To iterate across a chunk of ram performing 16 bit data operations, INCT makes sense.

We just don't get to do a lot of 16 bit work given that the peripherals vdp/sound all want 8 bit data.

---

For the rotation example, I am assuming the reference point is (0,0) ?

-M@

I use INCT to iterate through lists of pointers when it's not possible to use post-increment. I use DECT to decrement the counter in CPU RAM to CPU RAM copy loops, although I could often use DEC with half the initial value.

Yes the reference point is (0,0) for the rotation. So if you rotate around another point (x0,y0) you need to subtract that from (x,y) before calculating the initial values of (x1,y1) for the top left corner of the image to rotate. However, once that's done the incremental calculations are the same.

### #11 Lee StewartOFFLINE

Lee Stewart

River Patroller

• 3,554 posts
• Location:Silver Run, Maryland

Posted Fri May 5, 2017 11:58 AM

In the fbForth 2.0 source code, I have used INCT 36 times and DECT 525 times.  Most of the DECT uses are for reserving 16-bit stack space (parameter and return stacks grow downward).  There are probably an equal number of indirect 16-bit increments via *SP+ or *R+ for popping the stacks.

...lee

### #12 BJGuillotOFFLINE

BJGuillot

Star Raider

• 52 posts

Posted Fri May 5, 2017 8:35 PM

In the fbForth 2.0 source code, I have used INCT 36 times and DECT 525 times.  Most of the DECT uses are for reserving 16-bit stack space (parameter and return stacks grow downward).  There are probably an equal number of indirect 16-bit increments via *SP+ or *R+ for popping the stacks.

Interesting.  I just checked my QR Code generator source, and it looks like I used INCT just 2 times (and 1 of those wasn't code I wrote, but a copy of GPLLNK routine), and used DECT 9 times.

That reminds me that I probably should put up the assembly language code for my QR Code generator up on GitHub soon before I have a disk crash or something.  I've been meaning to do it, but was a bit embarrassed by how spaghetti-like it is.  Though I just just go ahead and put it up as-is this weekend.

### #13 sometimes99erOFFLINE

sometimes99er

River Patroller

• 4,082 posts
• Location:Denmark

Posted Sat May 6, 2017 12:18 AM

I have deducted by observing and experimenting (using high-level programming languages on modern PCs), rather than going through the math. Which I apparently should.

I think one can quite easily calculate any precision sine wave table (in Assembly), but then the output rather asks for ROM than RAM. If your time progression is linear then one can avoid sine, but for jumping in time, a table is more or less essential.

Interesting. I just checked my QR Code generator source, and it looks like I used INCT just 2 times (and 1 of those wasn't code I wrote, but a copy of GPLLNK routine), and used DECT 9 times.

Yes, with auto-increment, one would perhaps expect to see more DECT than INCT.

I ran through my 100+ 8K Assembly ditties.

I use one DECT in the general "return using stack" segment. Otherwise sometimes one more DECT.

The use of INCT ranges from 0 to 8. Operating on stacks, lists and structures, and probably more or less coincidentally when on word boundaries.

That reminds me that I probably should put up the assembly language code for my QR Code generator up on GitHub soon before I have a disk crash or something. I've been meaning to do it, but was a bit embarrassed by how spaghetti-like it is.  Though I just just go ahead and put it up as-is this weekend.

Since accidentally hitting FCTN QUIT in 1982, I think I've done a good deal of backups.

Edited by sometimes99er, Sun May 7, 2017 3:24 AM.

### #14 RXBOFFLINE

RXB

River Patroller

• 2,904 posts
• Location:Vancouver, Washington, USA

Posted Sat May 6, 2017 11:33 AM

XB ROMs use:

INCT 29 times

DECT19 times

Example of ROM Page 2:

```* Move 26 Characters from TOPSTK to VDP ROLLOUT
LN7A90 LI   R1,TOPSTK      *#
LN7A94 LI   R3,VDPROA      * VDP ROLL OUT AREA >O3CO
LN7A98 MOVB @LR3,*R15      * VDP Address
LN7A9C ORI  R3,>4000
LN7AA0 MOVB R3,*R15        * VDP Address
LN7AA2 LI   R0,>001A       * Put 26 in BYTE COUNTER
LN7AA6 MOVB *R1+,@VDPWD    *# VDP WRITE DATA
LN7AAA DEC  R0             * BYTE COUNTER-1
LN7AAC JNE  LN7AA6         * 0? NO LOOP
LN7AAE CLR  @FAC8
LN7AB2 INCT @SUBSTK        *# SUBSTACK+2
LN7AB6 MOVB @SUBSTK,R9
LN7ABA SRL  R9,8           * Strip High byte from word
LN7ABC AI   R9,VAR0
LN7AC0 MOV  R10,*R9
LN7AC2 B    *R11           * RETURN
*
* Move 26 Characters from VDP ROLLOUT to TOPSTK
LN7AC4 LI   R1,TOPSTK      *#
LN7AC8 MOVB @>7A97,*R15    * >C0  to VDP Address
LN7ACC MOVB @>7A96,*R15    * >03  to VDP Address
LN7AD0 LI   R0,>001A       * BYTE COUNTER (26=>001A)
LN7AD8 DEC  R0             * BYTE COUNTER-1
LN7AE0 MOVB @SUBSTK,R9     *#
LN7AE4 SRL  R9,8           * Strip High byte from word
LN7AE6 AI   R9,VAR0
LN7AEA MOV  *R9,R11
LN7AEC DECT @SUBSTK        * SUBSTACK-2
LN7AF0 B    *R11           * RETURN
```

### #15 TheBFOFFLINE

TheBF

Moonsweeper

• 484 posts
• Location:The Great White North

Posted Mon May 8, 2017 9:43 AM

Using this little script here is what I count in console ROMs

```( find INCT & DECT )
VARIABLE ICNT        ( counter variables)
VARIABLE DCNT
: CELLS  2* ;
: INCT?   ( n -- )   0FF0 AND   05C0 =  IF   1 ICNT +! THEN ;
: DECT?   ( n -- )   0FF0 AND   0640 =  IF   1 DCNT +! THEN ;
: COUNTEM ( addr cnt -- )
OVER + SWAP        ( convert to address range)
DO
I @  DECT?
2 +LOOP            ( advance by 2 bytes)
DECIMAL
CR ." INCT : "  ICNT ?
CR ." DECT : "  DCNT ?
CR ." done"
;
( test data )
CREATE T  05C0 , 05C0 , 05C0 , 05C0 , 05C0 , 05C0 ,
0640 , 0640 , 0640 , 0640 , 0640 , 0640 ,
DECIMAL
T  12 CELLS COUNTEM  ( returns 6 & 6 )

```

HEX 0 2000 CELLS COUNTEM

INCT   : 34

DECT :  27

done ok

Edited by TheBF, Mon May 8, 2017 9:45 AM.

### #16 RXBOFFLINE

RXB

River Patroller

• 2,904 posts
• Location:Vancouver, Washington, USA

Posted Mon May 8, 2017 10:06 PM

Using this little script here is what I count in console ROMs

```( find INCT & DECT )
VARIABLE ICNT        ( counter variables)
VARIABLE DCNT
: CELLS  2* ;
: INCT?   ( n -- )   0FF0 AND   05C0 =  IF   1 ICNT +! THEN ;
: DECT?   ( n -- )   0FF0 AND   0640 =  IF   1 DCNT +! THEN ;
: COUNTEM ( addr cnt -- )
OVER + SWAP        ( convert to address range)
DO
I @  DECT?
2 +LOOP            ( advance by 2 bytes)
DECIMAL
CR ." INCT : "  ICNT ?
CR ." DECT : "  DCNT ?
CR ." done"
;
( test data )
CREATE T  05C0 , 05C0 , 05C0 , 05C0 , 05C0 , 05C0 ,
0640 , 0640 , 0640 , 0640 , 0640 , 0640 ,
DECIMAL
T  12 CELLS COUNTEM  ( returns 6 & 6 )

```

HEX 0 2000 CELLS COUNTEM

INCT   : 34

DECT :  27

done ok

This has to be for GROM and VDP access as that is exactly why you see it  in the XB ROMs.

#### 0 user(s) are browsing this forum

0 members, 0 guests, 0 anonymous users