multiply by .0375


20 replies to this topic

#1 tschak909 OFFLINE  

tschak909

    River Patroller

  • 3,108 posts
  • Location:USA

Posted Wed Jan 16, 2019 1:09 PM

Does anyone have an equivalent 9900 asm routine for this?

_mul0375:
        sta ptr2        ; save original value
        stx ptr2+1
        stx ptr1+1      ; msb of shifted value

        asl     ; double it
        rol ptr1+1

        clc     ; get * 3
        adc ptr2
        sta ptr2

        lda ptr1+1
        adc ptr2+1

        lsr     ; now divide by 8
        ror ptr2
        lsr
        ror ptr2
        lsr
        ror ptr2

        tax
        lda ptr2
        rts

-Thom



#2 FarmerPotato OFFLINE  

FarmerPotato

    Moonsweeper

  • 272 posts
  • Location:Austin, TX

Posted Wed Jan 16, 2019 1:28 PM

Here's one way

 

* R1: input value (signed); keep |R1| <= 21845 so the sum below fits in 16 bits
* R2: output is R1 * 3/8
 mov r1,r2
 sra r2,1      shift right arithmetic (signed divide by 2)
 a   r1,r2     sum is 1.5 times r1
 sra r2,2      now divide by 4 (signed)

I'm not familiar with the C calling conventions, maybe you can patch that into your code. (awesome progress on PLATOterm by the way!)
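
For anyone following along in C, the same shift-and-add idea might look roughly like the sketch below. It assumes unsigned 16-bit inputs, and the name `scale_3_8` is made up for illustration; it is not taken from PLATOterm.

```c
#include <stdint.h>

/* Hypothetical helper: returns floor(x * 3 / 8) using one add and two shifts.
 * x + (x >> 1) is floor(1.5 * x); shifting that right by 2 divides it by 4.
 * The 16-bit sum only holds for x <= 43690; larger inputs would wrap. */
static uint16_t scale_3_8(uint16_t x)
{
    uint16_t one_and_half = (uint16_t)(x + (x >> 1)); /* 1.5 * x */
    return (uint16_t)(one_and_half >> 2);             /* divide by 4 */
}
```

With this sketch, an input of 511 comes out as 191.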



#3 tschak909 OFFLINE  

tschak909

    River Patroller

  • Topic Starter
  • 3,108 posts
  • Location:USA

Posted Wed Jan 16, 2019 1:43 PM

oh cool, thanks! and thanks for the compliment. :)

-Thom



#4 ChildOfCv OFFLINE  

ChildOfCv

    Chopper Commander

  • 184 posts

Posted Wed Jan 16, 2019 1:44 PM

Actually 0.0375 is 3/80.  You missed a zero :)



#5 TheBF ONLINE  

TheBF

    Dragonstomper

  • 923 posts
  • Location:The Great White North

Posted Wed Jan 16, 2019 1:46 PM

An alternative can be to make a subroutine that leverages hardware divide and multiply in the 9900.
It will be slower but very versatile.
 
The theory is multiply by 3 keeping a 32 bit result (2 registers) and then divide by 8.
 
This operation is resident in Forth and is called */   because it multiplies then divides the product.
It is very handy for this type of fractional multiplication.


I realized my code is too Forth specific to be of real value as is.

I believe you could code this up in C using a long for the product, but I have never done it.
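
A minimal C sketch of that idea, untested on real hardware: it uses `uint32_t` from `<stdint.h>` rather than a plain `long` so the intermediate width is explicit, and the name `u_star_slash` is only an illustration, not a standard function.

```c
#include <stdint.h>

/* Hypothetical C counterpart of an unsigned "star-slash":
 * computes (n * mul) / div with a 32-bit intermediate product.
 * The caller must ensure div is non-zero and the quotient fits in 16 bits. */
static uint16_t u_star_slash(uint16_t n, uint16_t mul, uint16_t div)
{
    uint32_t product = (uint32_t)n * mul; /* 16 x 16 always fits in 32 bits */
    return (uint16_t)(product / div);
}
```

Calling `u_star_slash(511, 3, 8)` gives 191, and other ratios only need different arguments.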

Edited by TheBF, Wed Jan 16, 2019 1:52 PM.


#6 adamantyr ONLINE  

adamantyr

    Stargunner

  • 1,451 posts

Posted Wed Jan 16, 2019 1:51 PM

On the TI, you can also use the MPY and DIV operations to do this in a tighter fashion:

* Multiply by 0.375 (Which is what I think the original poster meant...)
* R1 - Contains 2-byte value (0 to 65535)
* Returns result in R1 (R0 and R2 are overwritten)
MP0375 LI   R0,3             * Set R0 to 3
       MPY  R0,R1            * Multiply target value by 3
       LI   R0,8             * Set R0 to value 8
       DIV  R0,R1            * Divide (value*3) by 8, quotient is in R1, remainder in R2
       RT


#7 adamantyr ONLINE  

adamantyr

    Stargunner

  • 1,451 posts

Posted Wed Jan 16, 2019 1:57 PM

Actually 0.0375 is 3/80.  You missed a zero :)

 

Actually, based on the original code, I think the post topic was supposed to be 0.375. So a zero was added. :)



#8 FarmerPotato OFFLINE  

FarmerPotato

    Moonsweeper

  • 272 posts
  • Location:Austin, TX

Posted Wed Jan 16, 2019 2:02 PM

Actually 0.0375 is 3/80.  You missed a zero :)

 

My 6502 is pretty poor, but I read the original code as finding 3x the 16 bit value then shifting 3 times to divide by 8.

You can't do a 3/80 with just a few adds and bit shifts.



#9 ChildOfCv OFFLINE  

ChildOfCv

    Chopper Commander

  • 184 posts

Posted Wed Jan 16, 2019 2:13 PM

Yeah I think adamantyr is correct, that the OP added the extra zero.

 

Multiplying by 3/80 is a bit trickier, but multiplying by 2457 into a 32-bit product and chopping the low 16 bits gets you close.  Not that this is relevant to the OP's apparent intent...
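
As a rough illustration of that trick in C (the function name is made up, and this is a sketch rather than tuned code): 3/80 of 65536 is 2457.6, so 2457 is that factor rounded down, and keeping only the high 16 bits of the product divides by 65536.

```c
#include <stdint.h>

/* Approximates x * 3 / 80 as (x * 2457) / 65536.
 * Since 2457 is slightly below 3/80 * 65536 = 2457.6,
 * results can come out a little low. */
static uint16_t mul_3_80_approx(uint16_t x)
{
    uint32_t product = (uint32_t)x * 2457u; /* 32-bit product */
    return (uint16_t)(product >> 16);       /* chop the low 16 bits */
}
```

Because the constant is rounded down, an input of 80 yields 2 rather than the exact 3, which is the "close" part.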


Edited by ChildOfCv, Wed Jan 16, 2019 2:26 PM.


#10 tschak909 OFFLINE  

tschak909

    River Patroller

  • Topic Starter
  • 3,108 posts
  • Location:USA

Posted Wed Jan 16, 2019 3:08 PM

it's 0.375, effectively scaling a value from 0-511 to 0-191

 

-Thom



#11 TheBF ONLINE  

TheBF

    Dragonstomper

  • 923 posts
  • Location:The Great White North

Posted Wed Jan 16, 2019 3:17 PM

 

On the TI, you can also use the MPY and DIV operations to do this in a tighter fashion:

* Multiply by 0.375 (Which is what I think the original poster meant...)
* R1 - Contains 2-byte value (0 to 65535)
* Returns result in R1 (R0 and R2 are overwritten)
MP0375 LI   R0,3             * Set R0 to 3
       MPY  R0,R1            * Multiply target value by 3
       LI   R0,8             * Set R0 to value 8
       DIV  R0,R1            * Divide (value*3) by 8, quotient is in R1, remainder in R2
       RT

 

Thank you.  That's what I was gettin' on about.  The only difference with the Forth "star-slash" operation is that the three arguments are popped off a stack so you can do arbitrary scaling.

Easily done with a set of input registers.



#12 adamantyr ONLINE  

adamantyr

    Stargunner

  • 1,451 posts

Posted Wed Jan 16, 2019 3:19 PM

 

Thank you.  That's what I was gettin' on about.  The only difference with the Forth "star-slash" operation is that the three arguments are popped off a stack so you can do arbitrary scaling.

Easily done with a set of input registers.

 

Yeah, if I were making such a routine for general use in my applications, I'd make it a BLWP to isolate it and allow you to set any ratio you want.



#13 TheBF ONLINE  

TheBF

    Dragonstomper

  • 923 posts
  • Location:The Great White North

Posted Wed Jan 16, 2019 3:46 PM

Here is a working version of "star-slash" for my system. It's not standards compliant for Forth these days to do it like this, so I just wrote it now.
Seems to work as planned. It's called  "unsigned-star-slash".  It works like adamantyr's code but uses variable arguments.
(sorry about the instructions being on the wrong side) :-)
 
EDIT2: Never publish quickly written code. I really messed this up the first time. Found the bugs trying to time it today.
            It now correctly takes three 16-bit args. The 32-bit result is internal only.  :-)

\ unsigned scaling routine with 32 bit intermediate product

NEEDS MOV, FROM DSK1.ASM9900
 
CODE U*/ ( n n n -- n )
       R4    R0 MOV,          \ move TOS cache register R4 to R0 (divisor)
      *R6+   R1 MOV,          \ POP multiplier to R1
      *R6+   R4 MOV,          \ multiplicand -> TOS
       R1    R4 MPY,          \ 32 bit multiply, product in R4:R5
       R5    R3 MOV,          \ low word of product to R3
       R0    R4 DIV,          \ unsigned division, quotient in R4
               NEXT,          \ return to Forth
            ENDCODE


Edited by TheBF, Thu Jan 17, 2019 3:29 PM.


#14 mizapf OFFLINE  

mizapf

    River Patroller

  • 3,506 posts
  • Location:Germany

Posted Wed Jan 16, 2019 4:14 PM

When reading the 6502 code I wondered what these repeated shift operations were used for ... until I saw that the accumulator is only 8 bits wide, and the repeated LSR / ROR is handing over the bits shifted out on the right side.
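
To make that hand-over of bits concrete, here is a small C model of one LSR / ROR pair on a value held as two separate bytes. The function name is made up and this is only a sketch of the carry behaviour, not a full 6502 emulation.

```c
#include <stdint.h>

/* Shift a 16-bit value, stored as a high and a low byte, right by one bit,
 * the way LSR on the high byte followed by ROR on the low byte does. */
static void shift_right_16(uint8_t *hi, uint8_t *lo)
{
    uint8_t carry = (uint8_t)(*hi & 1u);        /* bit shifted out by LSR */
    *hi = (uint8_t)(*hi >> 1);                  /* LSR: zero enters bit 7 */
    *lo = (uint8_t)((*lo >> 1) | (carry << 7)); /* ROR: carry enters bit 7 */
}
```

Three calls in a row divide the 16-bit value by 8, which is what the three LSR / ROR pairs in the original routine do.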

 

How glad I am to have learned assembly language programming on a 16-bit platform. ;)



#15 sometimes99er OFFLINE  

sometimes99er

    River Patroller

  • 4,205 posts
  • Location:Denmark

Posted Thu Jan 17, 2019 5:47 AM

it's 0.375, effectively scaling a value from 0-511 to 0-191
 
-Thom


Ahh. You can scale a random number by dividing by the top of range, and then the remainder is in range. So nice.

#16 TheBF ONLINE  

TheBF

    Dragonstomper

  • 923 posts
  • Location:The Great White North

Posted Thu Jan 17, 2019 8:29 AM

Ahh. You can scale a random number by dividing by the top of range, and then the remainder is in range. So nice.

 

That is absolutely true.  The modulus operation is very handy for that in many languages and the 9900 gives it to us for free in hardware.

That works more like a range limiter so numbers never exceed a maximum value.

 

I think tschak909 is looking for true scaling, where you change the range of numbers from some input range to a smaller range but the distances between units are still in proportion to the original input range. So that's real division.

 

It's more challenging in integer math to scale by a specific fraction (e.g. .375 or 3/8), which is why using the 9900 MPY, taking the 32-bit result (0..~4,000,000,000), and then using DIV to divide that 32-bit number by a 16-bit divisor lets us scale to a very wide range of values.
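
The difference between the two approaches can be sketched in C as follows. Both function names are invented for this example, and the 32-bit intermediate stands in for the MPY / DIV register pair.

```c
#include <stdint.h>

/* Range limiter: wraps r into 0..limit-1; proportions are not preserved.
 * limit must be non-zero. */
static uint16_t limit_mod(uint16_t r, uint16_t limit)
{
    return (uint16_t)(r % limit);
}

/* True scaling: maps 0..in_max proportionally onto 0..out_max
 * through a 32-bit product, like MPY followed by DIV.
 * in_max must be non-zero. */
static uint16_t scale_range(uint16_t r, uint16_t in_max, uint16_t out_max)
{
    return (uint16_t)(((uint32_t)r * out_max) / in_max);
}
```

For an input of 500, `limit_mod(500, 192)` wraps around to 116, while `scale_range(500, 511, 191)` lands near the top of the output range at 186.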

 

As mentioned the 9900 made this kind of thing sooo much easier than it is on a 6502.

 

Apologies if I am preaching to the choir here.  It's my morning coffee talking.



#17 sometimes99er OFFLINE  

sometimes99er

    River Patroller

  • 4,205 posts
  • Location:Denmark

Posted Thu Jan 17, 2019 9:35 AM

Ahh. You can scale a random number by dividing by the top of range, and then the remainder is in range. So nice.

 

That works more like a range limiter so numbers never exceed a maximum value.


Well, it's not a bit cutter.
 

... which is why using the 9900 MPY, taking the 32-bit result (0..~4,000,000,000), and then using DIV to divide that 32-bit number by a 16-bit divisor lets us scale to a very wide range of values.


Just remember ... from the Editor/Assembler Manual ...
 

When the source operand is greater than the first word of the destination operand, normal division occurs. If the source operand is less than or equal to the first word of the destination operand, normal division results in a quotient that cannot be represented in a 16-bit word. In this case, the computer sets the overflow status bit, leaves the destination operand unchanged, and cancels the division operation.
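
The rule in that passage can be modelled in C roughly like this. It is a sketch of the quoted behaviour only, with a made-up function name, and it does not model any other status bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of DIV as described in the manual text above.
 * hi:lo is the 32-bit destination operand, divisor is the source operand.
 * On overflow it returns true and leaves hi:lo unchanged; otherwise it
 * stores the quotient in *hi and the remainder in *lo and returns false. */
static bool div9900(uint16_t *hi, uint16_t *lo, uint16_t divisor)
{
    if (divisor <= *hi) {
        return true; /* quotient would not fit in a 16-bit word */
    }
    uint32_t dividend = ((uint32_t)*hi << 16) | *lo;
    *hi = (uint16_t)(dividend / divisor);
    *lo = (uint16_t)(dividend % divisor);
    return false;
}
```

A zero divisor also takes the overflow path here, since zero can never be greater than the first word of the destination.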



#18 ChildOfCv OFFLINE  

ChildOfCv

    Chopper Commander

  • 184 posts

Posted Thu Jan 17, 2019 10:04 AM

Of course the other Achilles' heel of the MPY/DIV instructions is the number of clocks they take.  If you can accomplish what you want with a couple of adds and bit shifts, then you'll likely beat the performance of the full multiply/divide by at least half.  If time is of the essence, you always look for the optimal case.



#19 adamantyr ONLINE  

adamantyr

    Stargunner

  • 1,451 posts

Posted Thu Jan 17, 2019 10:15 AM

Of course the other Achilles' heel of the MPY/DIV instructions is the number of clocks they take.  If you can accomplish what you want with a couple of adds and bit shifts, then you'll likely beat the performance of the full multiply/divide by at least half.  If time is of the essence, you always look for the optimal case.

 

It depends on the context of the use. If it's part of thousands of repetitive operations, performance is definitely a concern.

 

If I had a performance-critical operation that depended upon determining the value of a fraction of a number, I wouldn't even use shifts and adds. I would generate a look-up table to instantly fetch the value needed.
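
As a sketch of what such a table could look like in C for the 0-511 to 0-191 scaling mentioned earlier in the thread: the names and the choice to build the table at start-up are assumptions for illustration, and on a real cartridge the table might instead be precomputed into ROM.

```c
#include <stdint.h>

#define TABLE_SIZE 512 /* inputs 0..511 */

static uint8_t scale_table[TABLE_SIZE];

/* Fill the table once with floor(i * 3 / 8); the largest entry is 191. */
static void init_scale_table(void)
{
    for (uint16_t i = 0; i < TABLE_SIZE; i++) {
        scale_table[i] = (uint8_t)((i * 3u) / 8u);
    }
}

/* The lookup replaces the arithmetic; x must be below TABLE_SIZE. */
static uint8_t scale_lookup(uint16_t x)
{
    return scale_table[x];
}
```

The trade-off is 512 bytes of table in exchange for a single indexed read per value.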



#20 ChildOfCv OFFLINE  

ChildOfCv

    Chopper Commander

  • 184 posts

Posted Thu Jan 17, 2019 10:27 AM

 

It depends on the context of the use. If it's part of thousands of repetitive operations, performance is definitely a concern.

 

If I had a performance-critical operation that depended upon determining the value of a fraction of a number, I wouldn't even use shifts and adds. I would generate a look-up table to instantly fetch the value needed.

 

 

Well, even that can have its drawbacks.  In order to perform the lookup, you have to set an address base into a register, add in the lookup value (possibly with a shift), and then incur the wrath of a memory access.  If bank switching is a thing, that's another potential pothole.  If your calculation only requires, say, 3 bit twiddles, the lookup table may actually be a loss.  Then the other issue is that if you have a domain of, say 500 items, in a ROM space of 2K, that's a big f'n deal.

 

So on older systems, lookup tables tend to be reserved for more complex calculations such as trig functions.



#21 adamantyr ONLINE  

adamantyr

    Stargunner

  • 1,451 posts

Posted Thu Jan 17, 2019 12:37 PM

 

 

Well, even that can have its drawbacks.  In order to perform the lookup, you have to set an address base into a register, add in the lookup value (possibly with a shift), and then incur the wrath of a memory access.  If bank switching is a thing, that's another potential pothole.  If your calculation only requires, say, 3 bit twiddles, the lookup table may actually be a loss.  Then the other issue is that if you have a domain of, say 500 items, in a ROM space of 2K, that's a big f'n deal.

 

So on older systems, lookup tables tend to be reserved for more complex calculations such as trig functions.

 

Yes, I use a look-up table for my integer-based trig functions in my CRPG for creating circling sprites in fact.

 

Another area I was going to use a look-up table for was damage by player type in my Gauntlet clone. If you recall, the various classes have 0, 10%, 20% and 30% damage reduction. Instead of calculating damage on the fly, I would have a look-up table for each monster type and class type to just fetch the exact amount of health damage caused. Since it's an action game, speed would be essential.
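
A table like that might be laid out as below. This is only a C sketch: the 0/10/20/30% reductions come from the post, but the number of monster types and their base damage values are invented for the example.

```c
#include <stdint.h>

enum { NUM_CLASSES = 4, NUM_MONSTERS = 3 };

/* Damage reduction per player class, in percent. */
static const uint8_t reduction_pct[NUM_CLASSES] = { 0, 10, 20, 30 };

/* Made-up base damage per monster type, purely for illustration. */
static const uint8_t base_damage[NUM_MONSTERS] = { 10, 20, 50 };

static uint8_t damage_table[NUM_MONSTERS][NUM_CLASSES];

/* Precompute every monster/class combination once, so the game loop
 * only has to index the table. Fractions are truncated. */
static void init_damage_table(void)
{
    for (int m = 0; m < NUM_MONSTERS; m++) {
        for (int c = 0; c < NUM_CLASSES; c++) {
            damage_table[m][c] =
                (uint8_t)(base_damage[m] * (100 - reduction_pct[c]) / 100);
        }
    }
}
```

During play the damage is then just `damage_table[monster][player_class]`, with no multiply or divide in the hot path.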





