While I was posting my Unsigned Integer Division Routines on NesDev, a new member there said he was looking for a divide by 40 and a mod 40. I wrote a few routines, but this one really stands out as a neat idea.
;Divide by and Mod 40 combined
;38 bytes, 45 cycles
;Y = value to be divided

InterlacedMultiplyByFortyTable:
    lda #0                  ; dummy load, #0 used in LUT
    lda #0
    cpy #40
    adc #0
    cpy #80
    adc #0
    cpy #120
    adc #0
    cpy #160
    adc #0
    cpy #200
    adc #0
    cpy #240
    adc #0
    sta divideResult        ; Integer divide 40
    asl
    asl
    tax
    tya
    sec
    sbc InterlacedMultiplyByFortyTable+1,X  ; A = Mod 40
The basic idea is to re-use the immediate values compared in the code as a look-up table at the end. Since they are 4 bytes apart, I simply multiply the quotient by 4 to index the bytes in the code. I also added a dummy LDA #0 at the beginning to supply the 0 entry of the table. The dummy load can be reduced to 1 byte if the routine is jumped or branched to.
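To make the table trick concrete, here is a rough Python model of it (a sketch, not 6502 code). It lays out the routine's first 28 bytes using the standard immediate-mode opcodes for LDA, CPY and ADC, then reads the operand bytes back as the multiples-of-40 table. The helper names build_code_image and divmod40 are made up for this illustration.

```python
# Model of the "code doubles as a look-up table" idea.
# Opcodes are the 6502 immediate-mode encodings; helper names are hypothetical.
LDA_IMM, CPY_IMM, ADC_IMM = 0xA9, 0xC0, 0x69

def build_code_image():
    """Return the first 28 bytes of the routine as a list of ints."""
    code = [LDA_IMM, 0x00,   # dummy load: its operand is table entry 0
            LDA_IMM, 0x00]   # real load: A = 0
    for threshold in range(40, 241, 40):
        # cpy #threshold / adc #0 : four bytes per step
        code += [CPY_IMM, threshold, ADC_IMM, 0x00]
    return code

def divmod40(y, code):
    """Mimic the routine for an 8-bit value y (0..255)."""
    # Each cpy sets carry when y >= threshold; adc #0 adds that carry to A.
    quotient = sum(1 for t in range(40, 241, 40) if y >= t)
    x = quotient * 4                  # asl, asl, tax
    remainder = y - code[1 + x]       # sec / sbc Table+1,X
    return quotient, remainder

if __name__ == "__main__":
    code = build_code_image()
    print([code[1 + 4 * q] for q in range(7)])
    print(divmod40(123, code))
```

The first print shows the operand bytes at offsets 1, 5, 9, ... which are 0, 40, 80, ... 240, and divmod40(123, code) gives (3, 3).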
I don't know how useful the routine is, but I always like to do things in creative ways.
Here are the other divide by 40 and mod 40 routines I came up with:
;Divide by 40
;19 bytes, 34 cycles

    lsr
    sta temp
    lsr
    adc temp
    ror
    lsr
    lsr
    adc temp
    ror
    adc temp
    ror
    lsr
    lsr
    lsr
    lsr
Alternate routine, uses 1 byte more but saves 2 cycles:
;Divide by 40
;20 bytes, 32 cycles

    sta temp
    lsr
    adc temp
    ror
    lsr
    lsr
    adc temp
    ror
    adc #1
    adc temp
    and #$C0
    rol
    rol
    rol
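A rough way to sanity-check the 20-byte routine above without an emulator is to model the accumulator and carry flag in Python. This is only a sketch: it assumes decimal mode is off, models just the four instructions the routine needs, and the helper names (lsr, ror, rol, adc, div40) are chosen for this illustration. Ignoring rounding, the shift-and-add chain works out to roughly n × 51/2048, which is close to n/40; the carries and the adc #1 take care of the rounding.

```python
# Minimal model of the 6502 accumulator + carry flag (binary mode assumed).
# Each helper takes (a, c) and returns the new (a, c).

def lsr(a, c):
    return a >> 1, a & 1                    # bit 0 falls into carry

def ror(a, c):
    return (c << 7) | (a >> 1), a & 1       # old carry enters bit 7

def rol(a, c):
    return ((a << 1) & 0xFF) | c, a >> 7    # old carry enters bit 0

def adc(a, c, m):
    total = a + m + c
    return total & 0xFF, total >> 8         # carry out is bit 8 of the sum

def div40(n):
    """Step through the 20-byte divide-by-40 routine for n in 0..255."""
    temp = n                                # sta temp
    a, c = n, 0                             # entry carry is irrelevant: lsr overwrites it
    a, c = lsr(a, c)
    a, c = adc(a, c, temp)
    a, c = ror(a, c)
    a, c = lsr(a, c)
    a, c = lsr(a, c)
    a, c = adc(a, c, temp)
    a, c = ror(a, c)
    a, c = adc(a, c, 1)                     # adc #1
    a, c = adc(a, c, temp)
    a &= 0xC0                               # and #$C0 (carry untouched)
    for _ in range(3):                      # rol x3 moves carry:bit7:bit6 into bits 2..0
        a, c = rol(a, c)
    return a

if __name__ == "__main__":
    mismatches = [n for n in range(256) if div40(n) != n // 40]
    print("mismatches:", mismatches)
```

The three rol instructions at the end bring the carry and the top two bits of A down into bits 2..0, so the result is effectively the 9-bit sum shifted right by 6.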
;Mod 40

    sta temp
    lsr
    adc #13
    adc temp
    ror
    lsr
    lsr
    adc temp
    ror
    adc temp
    ror
    and #$E0
    sta temp2               ;x32
    lsr
    lsr                     ;x8
    adc temp2               ;x40
    sbc temp
    eor #$FF
Combined divide and mod 40:
    sta temp
    lsr
    adc temp
    ror
    lsr
    lsr
    adc temp
    ror
    adc #1
    adc temp
    and #$C0
    rol
    rol
    rol
    sta divideResult        ; divide 40 result... TAY, TAX, PHA could be used
    asl
    asl
    asl
    sta temp2               ;x8
    asl
    asl                     ;x32
    adc temp2               ;x40
    sbc temp
    eor #$FF                ; mod 40 result
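The tail of the combined routine turns the quotient back into a remainder without a sec: it builds quotient × 40 as ×8 plus ×32, does an sbc with the carry still clear (which subtracts one extra), and then inverts the result with eor #$FF. Here is a small Python sketch of just that arithmetic, assuming 8-bit wraparound and a carry that is clear going into the sbc; the function name mod40_from_quotient is made up for this illustration.

```python
def mod40_from_quotient(n, q):
    """Model the last steps of the combined routine.

    n is the original value (temp), q is the divide-by-40 result (0..6).
    """
    temp2 = (q << 3) & 0xFF        # asl x3, sta temp2 : q * 8
    a = (q << 5) & 0xFF            # two more asl      : q * 32
    a = (a + temp2) & 0xFF         # adc temp2, carry clear : q * 40 (at most 240)
    a = (a - n - 1) & 0xFF         # sbc temp with carry clear subtracts n + 1
    return a ^ 0xFF                # eor #$FF : 255 - a, which equals n - q * 40

if __name__ == "__main__":
    for n in (0, 39, 40, 123, 255):
        print(n, n // 40, mod40_from_quotient(n, n // 40))
```

The inversion works because flipping all bits of an 8-bit value x gives -x - 1 (mod 256), so flipping q×40 - n - 1 lands exactly on n - q×40.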