
shortest 8bx16b multiplication by 16



Edit: Forget this, it's your "mul2 routine", but I wonder why you 'CLC' before 'ASL'ing...?

 

Original post: 

Does this qualify?

lda val
pha
asl
asl
asl
asl
sta result
pla
lsr
lsr
lsr
lsr
sta result+1
rts

(Extra candy: X and Y are not touched....)
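As a quick cross-check, the routine above can be modelled in Python. This is only a sketch and not part of the original post; `mul16_split` is a hypothetical name. The four ASLs move the low nibble of val into the high nibble of the low byte, and the four LSRs move the high nibble of val into the low nibble of the high byte.

```python
def mul16_split(val):
    """Model of the PHA / 4x ASL / PLA / 4x LSR routine for an 8-bit val."""
    lo = (val << 4) & 0xFF   # four ASLs: bits shifted out of A are lost
    hi = val >> 4            # four LSRs on the copy saved with PHA
    return lo, hi

# Exhaustive check against a plain 16-bit multiplication.
for v in range(256):
    lo, hi = mul16_split(v)
    assert (hi << 8) | lo == v * 16
```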

Edited by Irgendwer

Perhaps I'm missing something crucial, but it seems more concise to rotate the memory contents directly:

 

	lda #2
	sta val
	jsr mul4

...

.proc mul4
	lda val
	sta result
	lda #0
	sta result+1
	ldy #3
Loop:
	asl result
	rol result+1
	dey
	bpl Loop
	rts
.endp

Note I'm initialising the upper 8 bits of the result; lda #0/sta result+1 can be removed if that's not needed. It makes sense to pass value in A as well, which saves more space:

 

	lda #2
	jsr mul5

...

.proc mul5
	sta result
	lda #0
	sta result+1
	ldy #3
Loop:
	asl result
	rol result+1
	dey
	bpl Loop
	rts
.endp

Down to 15 bytes using absolute addresses if you get rid of the upper 8 bit initialisation.
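To see what dropping the upper-byte initialisation costs, here is a rough Python model of the ASL/ROL loop. It is a sketch under the assumption of plain 8-bit registers; `asl`, `rol` and `mul16_loop` are hypothetical helper names, not anything from the thread.

```python
def asl(byte):
    """ASL: shift left, return (result, carry_out)."""
    return (byte << 1) & 0xFF, byte >> 7

def rol(byte, carry_in):
    """ROL: rotate left through carry, return (result, carry_out)."""
    return ((byte << 1) | carry_in) & 0xFF, byte >> 7

def mul16_loop(val, hi=0):
    """Model of the loop; hi is whatever result+1 held on entry."""
    lo = val
    for _ in range(4):        # ldy #3 ... dey / bpl runs for Y = 3, 2, 1, 0
        lo, carry = asl(lo)   # asl result
        hi, _ = rol(hi, carry)  # rol result+1
    return lo, hi

for v in range(256):
    lo, hi = mul16_loop(v)
    assert (hi << 8) | lo == v * 16
```

In this model, if result+1 is not cleared first, its old low nibble ends up in the high nibble of the high byte, so the shorter version is only safe when the caller knows that nibble is zero or masks it afterwards.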


3 minutes ago, flashjazzcat said:

Down to 15 bytes using absolute addresses if you get rid of the upper 8 bit initialisation.

Compared to the version I posted, which would also be 15 bytes without the initial 'LDA', yours needs the Y register, is bigger if result is not on the zero page, and is slower too.


	ldx val
	lda lsbtab,x
	sta result
	lda msbtab,x
	sta result+1

 

12 bytes if val and result are on ZP. You said shortest code :D

 

But you need 512 bytes of LUT.
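The two tables could be generated offline, for example with a few lines of Python. This is a minimal sketch; only the names `lsbtab` and `msbtab` come from the post, and how the bytes are then fed to the assembler is left open.

```python
# One entry per possible 8-bit val: low and high byte of val * 16.
lsbtab = bytes((v * 16) & 0xFF for v in range(256))
msbtab = bytes((v * 16) >> 8 for v in range(256))

# Together the tables take the 512 bytes mentioned above.
assert len(lsbtab) + len(msbtab) == 512

def lookup(val):
    """Model of: ldx val / lda lsbtab,x / lda msbtab,x."""
    return lsbtab[val], msbtab[val]
```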

Edited by ivop

Or this one:

    asl
    rol result+1
    asl
    rol result+1
    asl
    rol result+1
    asl
    rol result+1
    sta result


Shorter, but trashes X:

	ldx #3
loop
	asl
	rol result+1
	dex
	bpl loop
	sta result

Enter with value in A and result+1 set to 0.

 

And a way to set res+1 to 0 cheaply. Still trashes X.

 

    ldx #0
    stx res+1
    lda val
loop
    asl
    rol res+1
    inx
    cpx #4
    bne loop
    sta res

 

Edited by ivop

A different approach. It's not smaller, but it might help others think about this problem :)

 

; swap nibbles
    asl  
    adc  #$80
    rol  
    asl
    adc  #$80
    rol

; split and store result
    pha
    and #$f0
    sta res
    pla
    and #$0f
    sta res+1
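Why the swap works may not be obvious: each ASL / ADC #$80 / ROL triple rotates A left by two bits, so doing it twice swaps the nibbles. The Python sketch below simulates the instructions step by step, assuming binary (non-decimal) mode; `swap_nibbles` and `mul16_swap` are hypothetical names.

```python
def swap_nibbles(a):
    """Simulate asl / adc #$80 / rol, executed twice, on an 8-bit A."""
    c = 0
    for _ in range(2):
        c, a = a >> 7, (a << 1) & 0xFF        # asl: bit 7 -> carry
        s = a + 0x80 + c                       # adc #$80: old bit 7 lands in bit 0
        c, a = s >> 8, s & 0xFF                # carry out is the next bit down
        c, a = a >> 7, ((a << 1) | c) & 0xFF   # rol: that bit is rotated in as well
    return a

def mul16_swap(val):
    """Swap the nibbles, then split them as the pha/and/pla/and code does."""
    swapped = swap_nibbles(val)
    return swapped & 0xF0, swapped & 0x0F      # (res, res+1)

for v in range(256):
    assert swap_nibbles(v) == ((v << 4) | (v >> 4)) & 0xFF
```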

 

Edited by ivop

4 minutes ago, ilmenit said:

That's exactly one of my original attempts in my first post.

Haha, sorry. Missed that somehow :)

 

Edit: oh, I never looked at your asm file, only at the quoted code. It was mul1 :)

Edited by ivop

Okay, how about this one? It trashes the value though. 16 bytes with val on ZP.

    lda #0
    asl val
    rol
    asl val
    rol
    asl val
    rol
    asl val
    rol
    sta val+1

Or trashing X, too:

    lda #0

    ldx #3
loop
    asl val
    rol
    dex
    bpl loop

    sta val+1

12 bytes.
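The byte counts for the two val-trashing routines can be tallied mechanically from the standard 6502 instruction lengths. The Python sketch below does that; it assumes val is on the zero page, and the `SIZE` table and list names are purely illustrative.

```python
# Instruction length in bytes by addressing mode (standard 6502 encoding).
SIZE = {"imm": 2, "zp": 2, "abs": 3, "acc": 1, "imp": 1, "rel": 2}

# lda #0, then asl val / rol four times, then sta val+1
unrolled = [("lda", "imm")] + [("asl", "zp"), ("rol", "acc")] * 4 + [("sta", "zp")]

# lda #0 / ldx #3 / asl val / rol / dex / bpl loop / sta val+1
looped = [("lda", "imm"), ("ldx", "imm"), ("asl", "zp"), ("rol", "acc"),
          ("dex", "imp"), ("bpl", "rel"), ("sta", "zp")]

def size(program):
    """Sum the encoded length of a list of (mnemonic, mode) pairs."""
    return sum(SIZE[mode] for _, mode in program)

print(size(unrolled), size(looped))
```

Swapping "zp" for "abs" in the lists gives the cost when val is not on the zero page.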

 

Edited by ivop

2 minutes ago, ilmenit said:

Good ideas, but I will need the value.

Yeah, I guessed so. Combining the incredibly neat trick that barrym95838 introduced (rolling #$10 four times) with the val-trashing code results in this:

 

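    lda #$10        ; bit 4 doubles as the loop counter

loop
    asl val         ; top bit of val -> carry
    rol             ; carry -> A; the counter bit falls out after 4 passes
    bcc loop

    sta val+1       ; A now holds the original high nibble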

9 bytes.

 

Perhaps you can keep track of the original val somewhere else?
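For anyone puzzling over why the 9-byte loop stops after exactly four passes: the set bit in #$10 is a sentinel that reaches the carry flag on the fourth ROL, at which point BCC falls through. A small Python simulation (a sketch only; `mul16_sentinel` is a hypothetical name) makes this visible.

```python
def mul16_sentinel(val):
    """Simulate: lda #$10 / loop: asl val / rol / bcc loop / sta val+1."""
    a = 0x10                                   # sentinel bit sits in bit 4
    iterations = 0
    while True:
        c, val = val >> 7, (val << 1) & 0xFF   # asl val
        c, a = a >> 7, ((a << 1) | c) & 0xFF   # rol (accumulator)
        iterations += 1
        if c:                                  # bcc loop is not taken once carry is set
            break
    return val, a, iterations                  # (low byte, high byte, loop passes)

for v in range(256):
    lo, hi, n = mul16_sentinel(v)
    assert n == 4 and (hi << 8) | lo == v * 16
```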


I made one that uses the OS and requires placement at a special location. It is 12 bytes and does not destroy val:

	org $3
result_hi .ds[1]
result_lo .ds[1]
val .byte 121 	
	org $2000
.proc os_mul16
	lda val
	sta result_lo
	ldx #0
	stx result_hi
	jsr $DBED
	rts
.endp	

And one that destroys val and is 8 bytes:

	org $3
result_hi .ds[1]
result_lo:
val .byte 121 	
	org $2000
.proc os_mul16_destr_val
	ldx #0
	stx result_hi
	jsr $DBED
	rts
.endp	

I think it's good enough compared to the initial 21-25 bytes.

Edited by ilmenit