tschak909 Posted January 16, 2019

Does anyone have an equivalent 9900 asm routine for this?

    _mul0375:
            sta ptr2        ; save original value
            stx ptr2+1
            stx ptr1+1      ; msb of shifted value
            asl             ; double it
            rol ptr1+1
            clc             ; get * 3
            adc ptr2
            sta ptr2
            lda ptr1+1
            adc ptr2+1
            lsr             ; now divide by 8
            ror ptr2
            lsr
            ror ptr2
            lsr
            ror ptr2
            tax
            lda ptr2
            rts

-Thom
+FarmerPotato Posted January 16, 2019

> Does anyone have an equivalent 9900 asm routine for this?

Here's one way

    * R1: input value, -32768 to 32767 (signed)
    * R2: output is R1 * 3/8
           mov  r1,r2
           sra  r2,1      shift right arithmetic (signed divide by 2)
           a    r1,r2     sum is 1.5 times r1,
           sra  r2,2      now divide by 4 (signed)

I'm not familiar with the C calling conventions, maybe you can patch that into your code.

(awesome progress on PLATOterm by the way!)
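For anyone who wants to check the shift-and-add sequence without a TI handy, here is a rough Python model of it. This is only a sketch: `to_signed16` and `mul_3_8_shift_add` are names made up for illustration, and the model assumes `A` simply wraps on 16-bit overflow. Under those assumptions the result matches floor(3x/8) as long as 1.5 times the input still fits in a signed 16-bit register, which works out to roughly -21845 to 21845.

```python
def to_signed16(value):
    """Wrap an integer into the signed 16-bit range of a 9900 register."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def mul_3_8_shift_add(r1):
    """Model of: MOV R1,R2 / SRA R2,1 / A R1,R2 / SRA R2,2 (hypothetical helper)."""
    r1 = to_signed16(r1)
    r2 = r1 >> 1                 # SRA R2,1: arithmetic shift, floor(r1 / 2)
    r2 = to_signed16(r2 + r1)    # A R1,R2: about 1.5 * r1, wraps on overflow
    return r2 >> 2               # SRA R2,2: floor divide by 4


if __name__ == "__main__":
    print(mul_3_8_shift_add(511))     # small input, no overflow
    print(mul_3_8_shift_add(30000))   # 1.5 * 30000 does not fit in 16 bits
```

The second call shows the limit: the intermediate sum wraps, so the answer is no longer 3/8 of the input. For small inputs that never comes up.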
tschak909 Posted January 16, 2019

oh cool, thanks! and thanks for the compliment.

-Thom
ChildOfCv Posted January 16, 2019

Actually 0.0375 is 3/80. You missed a zero.
+TheBF Posted January 16, 2019 (edited)

An alternative can be to make a sub-routine that leverages hardware divide and multiply in the 9900. It will be slower but very versatile.

The theory is multiply by 3 keeping a 32 bit result (2 registers) and then divide by 8.

This operation is resident in Forth and is called */ because it multiplies then divides the product. It is very handy for this type of fractional multiplication.

I realized my code is too Forth specific to be of real value as is. I believe you could code this up in C using a long for the product, but I have never done it.

Edited January 16, 2019 by TheBF
+adamantyr Posted January 16, 2019

On the TI, you can also use the MPY and DIV operations to do this in a tighter fashion:

    * Multiply by 0.375 (Which is what I think the original poster meant...)
    * R1 - Contains 2-byte value (0 to 65535)
    * Returns in same value
    MP0375 LI   R0,3      * Set R0 to 3
           MPY  R0,R1     * Multiply target value by 3
           LI   R0,8      * Set R0 to value 8
           DIV  R0,R1     * Divide (value*3) by 8, quotient is in R1, remainder in R2
           RT
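The multiply-then-divide idea can be sketched in Python to see what ends up in each register. The helper names `mpy`, `div` and `mp0375` are invented for this illustration, and `div` here only models the case where the quotient fits in 16 bits.

```python
def mpy(src, dst):
    """Model of MPY: unsigned 16 x 16 multiply, product returned as (high, low) words."""
    product = (src & 0xFFFF) * (dst & 0xFFFF)
    return product >> 16, product & 0xFFFF


def div(src, high, low):
    """Model of DIV for the non-overflow case: 32-bit dividend / 16-bit divisor."""
    dividend = (high << 16) | low
    return dividend // src, dividend % src   # (quotient, remainder)


def mp0375(r1):
    """Model of the MP0375 routine: R1 * 3 / 8 with a 32-bit intermediate product."""
    high, low = mpy(3, r1)                    # product lands in R1 (high) and R2 (low)
    quotient, _remainder = div(8, high, low)  # quotient in R1, remainder in R2
    return quotient


if __name__ == "__main__":
    print(mp0375(511))
    print(mp0375(65535))
```

Because the intermediate product is 32 bits wide, the full 0 to 65535 input range works here, at the cost of the slower MPY and DIV instructions.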
+adamantyr Posted January 16, 2019

> Actually 0.0375 is 3/80. You missed a zero

Actually, based on the original code, I think the post topic was supposed to be 0.375. So a zero was added.
+FarmerPotato Posted January 16, 2019

> Actually 0.0375 is 3/80. You missed a zero

My 6502 is pretty poor, but I read the original code as finding 3x the 16 bit value then shifting 3 times to divide by 8. You can't do a 3/80 with just a few adds and bit shifts.
ChildOfCv Posted January 16, 2019 (edited)

Yeah I think adamantyr is correct, that the OP added the extra zero.

Multiplying by 3/80 is a bit trickier, but multiplying by 2457 into a 32-bit product and chopping the low 16 bits gets you close. Not that this is relevant to the OP's apparent intent...

Edited January 16, 2019 by ChildOfCv
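The 2457 comes from 3/80 × 65536 = 2457.6: multiply by the scaled constant, then keep only the high word of the 32-bit product. A small Python sketch of that trick follows. The function name `scale_3_80` is made up, and the alternative constant 2458 (rounding 2457.6 up instead of truncating) is my own variation rather than something proposed above.

```python
def scale_3_80(x, factor=2458):
    """Approximate x * 3 / 80 as the high word of a 16 x 16 -> 32-bit multiply.

    factor is 3/80 * 65536 = 2457.6, either truncated (2457) or rounded up (2458).
    """
    return ((x & 0xFFFF) * factor) >> 16   # dropping the low 16 bits divides by 65536


if __name__ == "__main__":
    for x in (80, 160, 511):
        print(x, scale_3_80(x, 2457), scale_3_80(x, 2458), (3 * x) // 80)
```

With the truncated constant the result comes out one low at exact multiples of 80, which is the "gets you close" part. With the rounded constant the results agree with floor(3x/80) for inputs below 2048 and start to drift high somewhere beyond that.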
tschak909 Posted January 16, 2019

it's 0.375, effectively scaling a value from 0-511 to 0-191

-Thom
+TheBF Posted January 16, 2019

> On the TI, you can also use the MPY and DIV operations to do this in a tighter fashion: [...]

Thank you. That's what I was gettin' on about.

The only difference with the Forth "star-slash" operation is that the three arguments are popped off a stack so you can do arbitrary scaling. Easily done with a set of input registers.
+adamantyr Posted January 16, 2019

> Thank you. That's what I was gettin' on about. The only difference the Forth "star-slash" operation is that the three arguments are popped off a stack so you can do arbitrary scaling. Easily done with a set of input registers.

Yeah, if I was making such a routine for general use in my applications, I'd make it a BLWP to isolate it and allow you to set any ratio you want.
+TheBF Posted January 16, 2019 (edited)

Here is a working version of "star-slash" for my system. It's not standards compliant for Forth these days to do it like this, so I just wrote it now. Seems to work as planned.

It's called "unsigned-star-slash". It works like adamantyr's code but uses variable arguments. (sorry about the instructions being on the wrong side) :-)

EDIT2: Never publish quickly written code. I really messed this up the first time. Found the bugs trying to time it today. It now correctly takes three 16bit args. 32bit result is internal only. :-)

    \ unsigned scaling routine with 32 bit intermediate product
    NEEDS MOV, FROM DSK1.ASM9900

    CODE U*/ ( n n n -- n )
         R4 R0 MOV,      \ move TOS cache register R0 (divisor)
        *R6+ R1 MOV,     \ POP multiplier to R1
        *R6+ R4 MOV,     \ multiplicand -> TOS
         R1 R4 MPY,      \ 32 bit multiply
         R5 R3 MOV,      \ low order of multiplicand to R3
         R0 R4 DIV,      \ unsigned division, quotient in R4
         NEXT,           \ return to Forth
    ENDCODE

Edited January 17, 2019 by TheBF
+mizapf Posted January 16, 2019

When reading the 6502 code I wondered what these repeated shift operations were used for ... until I saw that the accumulator is only 8 bits wide, and the repeated LSR / ROR is handing over the bits shifted out on the right side.

How glad I am to have learned assembly language programming on a 16-bit platform.
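To make the hand-over explicit, here is a small Python sketch of the LSR / ROR pair working on a 16-bit value held as two bytes. The helper names (`lsr`, `ror`, `shift_right_16`, `div8_bytes`) are invented for this illustration, and only the carry flag is modelled.

```python
def lsr(byte):
    """6502 LSR: shift right, 0 enters bit 7, old bit 0 goes to carry."""
    return byte >> 1, byte & 1


def ror(byte, carry):
    """6502 ROR: shift right, old carry enters bit 7, old bit 0 becomes the new carry."""
    return (carry << 7) | (byte >> 1), byte & 1


def shift_right_16(high, low):
    """One LSR on the high byte followed by one ROR on the low byte."""
    high, carry = lsr(high)     # bit falling out of the high byte...
    low, _ = ror(low, carry)    # ...is handed into bit 7 of the low byte
    return high, low


def div8_bytes(value):
    """Divide a 16-bit value by 8 using three LSR / ROR pairs."""
    high, low = (value >> 8) & 0xFF, value & 0xFF
    for _ in range(3):
        high, low = shift_right_16(high, low)
    return (high << 8) | low


if __name__ == "__main__":
    print(div8_bytes(3 * 511))   # 3x the input, then divided by 8
```

On the 9900 the same three pairs collapse into a single 16-bit shift by 3.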
sometimes99er Posted January 17, 2019

> it's 0.375, effectively scaling a value from 0-511 to 0-191 -Thom

Ahh. You can scale a random number by dividing by the top of range, and then the remainder is in range. So nice.
+TheBF Posted January 17, 2019

> Ahh. You can scale a random number by dividing by the top of range, and then the remainder is in range. So nice.

That is absolutely true. The modulus operation is very handy for that in many languages and the 9900 gives it to us for free in hardware. That works more like a range limiter so numbers never exceed a maximum value.

I think Tschak909 is looking for true scaling where you change the range of numbers from some input range to a smaller range but the distances between units are still in proportion to the original input range. So that's real division.

It's more challenging in integer math to do division by a specific fraction (e.g. .375 or 3/8), so that's why using 9900 MPY, taking the 32 bit result (0..~4,000,000,000) and then using DIV to divide the 32bit number by a 16 bit divisor, lets us scale to a very wide range of values.

As mentioned the 9900 made this kind of thing sooo much easier than it is on a 6502.

Apologies if I am preaching to the choir here. It's my morning coffee talking.
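The difference between the two approaches is easy to see side by side. A quick Python sketch, with the function names `limit` and `scale` chosen purely for illustration:

```python
def limit(value, top):
    """Range limiter: the remainder after dividing by the top of the range."""
    return value % top


def scale(value, numerator, denominator):
    """Proportional scaling: multiply first, then divide, using integer math."""
    return value * numerator // denominator


if __name__ == "__main__":
    for x in (0, 100, 200, 400, 511):
        print(x, limit(x, 192), scale(x, 3, 8))
```

Both keep a 0-511 input inside 0-191, but the remainder wraps around (200 and 8 land on the same output), while the scaled values keep their order and relative spacing.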
sometimes99er Posted January 17, 2019

> Ahh. You can scale a random number by dividing by the top of range, and then the remainder is in range. So nice.

> That works more like a range limiter so numbers never exceed a maximum value.

Well, it's not a bit cutter.

> ... so that's why using 9900 MPY, taking the 32 bit result ( 0..~4,000,000,000) and then using DIV to divide the 32bit number by a 16 bit divisor, lets us scale to a very wide range of values.

Just remember ... from the Editor/Assembler Manual ...

When the source operand is greater than the first word of the destination operand, normal division occurs. If the source operand is less than or equal to the first word of the destination operand, normal division results in a quotient that cannot be represented in a 16-bit word. In this case, the computer sets the overflow status bit, leaves the destination operand unchanged, and cancels the division operation.
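The rule quoted from the manual can be written out as a short Python sketch. `div9900` is a made-up name, and returning the overflow status as a third tuple element is just a convenience of this model.

```python
def div9900(divisor, high, low):
    """Model of DIV per the manual text above.

    high/low are the two words of the 32-bit destination operand.
    Returns (first word, second word, overflow flag).
    """
    divisor &= 0xFFFF
    if divisor <= high:
        # Quotient would not fit in 16 bits: set overflow, leave destination unchanged.
        return high, low, True
    dividend = (high << 16) | low
    return dividend // divisor, dividend % divisor, False


if __name__ == "__main__":
    print(div9900(8, 2, 65533))   # 3 * 65535 divided by 8: normal division
    print(div9900(2, 2, 0))       # 0x20000 / 2 needs 17 bits: overflow
```

For scaling with MPY followed by DIV this means the divisor has to be larger than the high word of the product. That always holds when the multiplier is smaller than the divisor, as with 3 and 8.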
ChildOfCv Posted January 17, 2019

Of course the other achilles heel of the MPY/DIV instructions is the number of clocks it takes. If you can accomplish what you want with a couple of adds and bit shifts, then you'll likely beat the performance of the full multiply/divide by at least half. If time is of the essence, you always look for the optimal case.
+adamantyr Posted January 17, 2019

> Of course the other achilles heel of the MPY/DIV instructions is the number of clocks it takes. If you can accomplish what you want with a couple of adds and bit shifts, then you'll likely beat the performance of the full multiply/divide by at least half. If time is of the essence, you always look for the optimal case.

It depends on the context of the use. Part of thousands of repetitive operations, performance is definitely a concern.

If I had a performance-critical operation that depended upon determining the value of a fraction of a number, I wouldn't even use shifts and adds. I would generate a look-up table to instantly fetch the value needed.
ChildOfCv Posted January 17, 2019

> It depends on the context of the use. Part of thousands of repetitive operations, performance is definitely a concern. If I had a performance-critical operation that depended upon determining the value of a fraction of a number, I wouldn't even use shifts and adds. I would generate a look-up table to instantly fetch the value needed.

Well, even that can have its drawbacks. In order to perform the lookup, you have to set an address base into a register, add in the lookup value (possibly with a shift), and then incur the wrath of a memory access. If bank switching is a thing, that's another potential pothole. If your calculation only requires, say, 3 bit twiddles, the lookup table may actually be a loss.

Then the other issue is that if you have a domain of, say 500 items, in a ROM space of 2K, that's a big f'n deal. So on older systems, lookup tables tend to be reserved for more complex calculations such as trig functions.
+adamantyr Posted January 17, 2019

> Well, even that can have its drawbacks. In order to perform the lookup, you have to set an address base into a register, add in the lookup value (possibly with a shift), and then incur the wrath of a memory access. If bank switching is a thing, that's another potential pothole. If your calculation only requires, say, 3 bit twiddles, the lookup table may actually be a loss. Then the other issue is that if you have a domain of, say 500 items, in a ROM space of 2K, that's a big f'n deal. So on older systems, lookup tables tend to be reserved for more complex calculations such as trig functions.

Yes, I use a look-up table for my integer-based trig functions in my CRPG for creating circling sprites in fact.

Another area I was going to use a look-up table for was damage by player type in my Gauntlet clone. If you recall, the various classes have 0, 10%, 20% and 30% damage reduction. Instead of calculating damage on the fly, I would have a look-up table for each monster type and class type to just fetch the exact amount of health damage caused. Since it's an action game, speed would be essential.
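For the 0-511 to 0-191 scaling that started the thread, the look-up table trade-off can be sketched in Python as follows. The names `SCALE_3_8` and `scale_lookup` are hypothetical, and the size estimate assumes one byte per entry.

```python
# Precompute floor(3 * x / 8) once for every possible input.
SCALE_3_8 = [(3 * x) // 8 for x in range(512)]


def scale_lookup(x):
    """Fetch the scaled value instead of computing it each time."""
    return SCALE_3_8[x]


if __name__ == "__main__":
    print(scale_lookup(511))
    # Every entry fits in one byte, so the table would cost one byte per input value.
    print(len(SCALE_3_8), max(SCALE_3_8))
```

A table with 512 one-byte entries is exactly the kind of size that hurts in a 2K ROM, which is why the two-shift version is probably the better fit for a calculation this simple.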