Hm. I'll take a peek. This is what the IntyBASIC manual says about division:

A=A/B Note it does division by repeated substraction (can be slow)

Division by 2/4/8/16 is internally enhanced as logical shift.

Division by 256 is internally optimized.

"can be slow" could be interpreted as "9 subtractions is slower than a single shift". Or it could mean horrendously slow when operating on big numbers (think five digits). The latter is what I see when trying to break down a value into its component digits. It can literally take almost a full second to work out the digits in a score once the value gets high enough (tested on code that does nothing but this and printing the result to the screen).

Because I'm a purist I was avoiding any inline ASM, but I'm gonna take a peek at these routines. What I want is completely infeasible otherwise. Thanks!

Also I think that EVERY div/mult by ANY power of 2 should be optimized as a shift by a compiler, but that's just me

Well, it appears IntyBASIC implements general division by subtracting the divisor from the dividend over and over and counting the subtractions, like so:

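; B = 1
MVII #1,R0        ; R0 = 1
MVO R0,V1         ; store it to B (V1)
; A = 30000 / B
MVII #30000,R0    ; R0 = dividend
MVI V1,R4         ; R4 = divisor (B)
MOVR R0,R5        ; R5 = working copy of the dividend
TSTR R4           ; is the divisor zero?
BEQ T1            ; yes: skip the loop (R0 still holds the dividend)
MVII #-1,R0       ; quotient starts at -1 because the loop increments first
T2:
INCR R0           ; count one pass
SUBR R4,R5        ; R5 = R5 - R4
BC T2             ; carry set means no borrow: keep subtracting
T1:
MVO R0,V2         ; store the quotient to A (V2)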

This particular division would compute 30000/1 as 30000 separate subtracts in a 21-cycle loop. That's 630,000 cycles, which would take roughly two-thirds of a second. Ouch!
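
To make the cost concrete, here is a rough C model of the loop in the listing above. It is only a sketch: the names slow_div and slow_div_result are made up for illustration, and the 21-cycle figure is the one quoted above, not something the code measures.

```c
#include <stdint.h>

/* Hypothetical C model of the compiled division loop (illustrative names).
   It counts loop passes so the cost can be estimated. */
typedef struct {
    uint16_t quotient;
    uint32_t passes;    /* times the INCR/SUBR/BC loop body runs */
} slow_div_result;

slow_div_result slow_div(uint16_t dividend, uint16_t divisor)
{
    /* Divisor of zero: the listing skips the loop and stores R0,
       which still holds the dividend. */
    slow_div_result r = { dividend, 0 };
    if (divisor == 0)
        return r;

    uint16_t q = 0xFFFF;                   /* MVII #-1,R0 */
    uint16_t rem = dividend;               /* MOVR R0,R5  */
    int no_borrow;
    do {
        q++;                               /* INCR R0 */
        no_borrow = (rem >= divisor);      /* carry after SUBR: set when no borrow */
        rem = (uint16_t)(rem - divisor);   /* SUBR R4,R5 */
        r.passes++;
    } while (no_borrow);                   /* BC T2 */

    r.quotient = q;
    return r;
}
```

In this model the loop body runs quotient + 1 times, so slow_div(30000, 1) makes 30001 passes. At 21 cycles per pass that is where the 630,000 or so cycles come from.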

What my PRNUM routine does is repeated subtraction with various powers of 10. The number 99999 would only require 50 "subtract" steps, 10 steps for each digit. My subtraction steps are more costly than 21 cycles each, but still, it manages to get a number on the display pretty quickly: hundreds of cycles, not hundreds of thousands.
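
The idea of subtracting powers of 10 can be sketched in C like this. To be clear, this is not the PRNUM source, just an illustration of the technique; the function name digits_by_subtraction is made up, and it assumes the value is at most 99999 so that five digits are enough.

```c
#include <stdint.h>

/* Illustrative sketch only, not the SDK-1600 PRNUM code.
   Assumes value <= 99999. Writes exactly 5 ASCII digits (leading zeros
   kept) plus a terminating NUL into out, and returns the number of
   subtract attempts that were made. */
unsigned digits_by_subtraction(uint32_t value, char out[6])
{
    static const uint32_t powers[5] = { 10000, 1000, 100, 10, 1 };
    unsigned steps = 0;

    for (int i = 0; i < 5; i++) {
        char digit = '0';
        for (;;) {
            steps++;                    /* one subtract attempt */
            if (value < powers[i])      /* would borrow: this digit is done */
                break;
            value -= powers[i];
            digit++;
        }
        out[i] = digit;
    }
    out[5] = '\0';
    return steps;
}
```

Each digit costs at most 9 successful subtractions plus the one that fails, which is where the worst case of 10 steps per digit and 50 steps for 99999 comes from. A value like 30000 needs only 8.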

SDK-1600 also includes DIVI and DIVU routines for signed and unsigned division. IntyBASIC is free to use these routines if nanochess wants to pick them up. That 30000/1 divide earlier would execute in about 1200 cycles or less, about 500x faster.
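
For anyone wondering why a proper divide routine is so much cheaper, here is a generic shift-and-subtract (restoring) division in C. This is a textbook technique and an assumption about the general kind of algorithm involved; it is not a transcription of DIVU, and the name shift_sub_divu is hypothetical.

```c
#include <stdint.h>

/* Generic restoring (shift-and-subtract) unsigned division.
   Not SDK-1600's DIVU, just the textbook technique.
   Returns the quotient and stores the remainder through rem.
   The result is not meaningful when divisor is 0. */
uint16_t shift_sub_divu(uint16_t dividend, uint16_t divisor, uint16_t *rem)
{
    uint32_t r = 0;     /* running remainder; 32 bits so the shift cannot overflow */
    uint16_t q = 0;

    for (int bit = 15; bit >= 0; bit--) {
        r = (r << 1) | ((uint32_t)(dividend >> bit) & 1u);  /* bring down the next dividend bit */
        q = (uint16_t)(q << 1);
        if (r >= divisor) {             /* divisor fits: subtract and set this quotient bit */
            r -= divisor;
            q = (uint16_t)(q | 1u);
        }
    }
    *rem = (uint16_t)r;
    return q;
}
```

The point is that the loop always runs 16 times for a 16-bit dividend, no matter how large the quotient is, whereas the repeated-subtraction loop runs once per unit of quotient.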

Ironically, I don't have a multiply routine directly in SDK-1600, probably because multiplies are usually by constants, and it's better to reduce those to the minimal sequence of shifts/adds inline than call a library function. Attached is a macro file with multiplies from 1 to 127, and C code that was used to generate it. Anyone is welcome to incorporate this into their code. nanochess: If you want to include this algorithm in IntyBASIC (even this C code), go for it. I'm putting it in the public domain.
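
As a small illustration of reducing a constant multiply to shifts and adds, here is a C sketch. It is not the attached generator: mul_const_shift_add is a made-up name and uses plain binary expansion (one add per set bit of the constant), which is not guaranteed to be the shortest sequence. mul10 shows a hand-reduced case.

```c
#include <stdint.h>

/* Multiply x by a constant k using only shifts and adds, one add per set
   bit of k. Plain binary expansion; not necessarily the minimal sequence
   and not the attached generator. Results wrap modulo 2^16. */
uint16_t mul_const_shift_add(uint16_t x, unsigned k)
{
    uint16_t result = 0;
    uint16_t shifted = x;               /* x shifted left by the current bit position */

    while (k != 0) {
        if (k & 1u)
            result = (uint16_t)(result + shifted);
        shifted = (uint16_t)(shifted << 1);
        k >>= 1;
    }
    return result;
}

/* Hand-reduced example: x * 10 = (x << 3) + (x << 1). */
uint16_t mul10(uint16_t x)
{
    return (uint16_t)(((uint32_t)x << 3) + ((uint32_t)x << 1));
}
```

A smarter reduction can also use subtraction (15x as 16x minus x, for example), which is the sort of thing a generator hunting for the minimal sequence would pick up and this sketch does not.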