Maybe it's been discussed before, but is this ECS BASIC interpreter slow because it's written poorly or because the Intellivision is too slow?
A little of both.
An 895kHz CP-1610 pulls about 100,000 - 110,000 instructions per second in the context of an Intellivision with a typical instruction mix. (This includes cycle stealing during active display.) A 1MHz 6502 pulls closer to 250,000 - 300,000 instructions per second. So, on raw instruction rates you're behind the curve. The CP-1610 does have 16-bit arithmetic, but that only helps in places where you actually need 16-bit math, and most of the BASIC interpreter doesn't. So, you start out with a factor of 2 or more slowdown going from 6502 to CP-1610.
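To put a number on "a factor of 2 or more", here is the arithmetic on the rates quoted above as a quick Python sketch. The variable names are mine, and the inputs are just the rough ranges from this post, not measurements.

```python
# Instruction-rate ranges quoted above (instructions per second).
cp1610_ips = (100_000, 110_000)   # 895kHz CP-1610 in an Intellivision
m6502_ips = (250_000, 300_000)    # 1MHz 6502

# Smallest gap: fastest CP-1610 figure against slowest 6502 figure.
min_slowdown = m6502_ips[0] / cp1610_ips[1]
# Largest gap: slowest CP-1610 figure against fastest 6502 figure.
max_slowdown = m6502_ips[1] / cp1610_ips[0]

print(f"6502 advantage: {min_slowdown:.1f}x to {max_slowdown:.1f}x")
```

This only compares raw instruction counts; it says nothing about how much work each instruction does on either CPU.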
The ECS BASIC implementation itself is designed to be extensible, and so it has a number of hooks in highly trafficked code paths. While that's nice and flexible, it also slows it down. The hooks could have been implemented much more efficiently than they were. You have code that periodically checks for the existence of a ROM (possibly page-flipping), directly in high-traffic code paths, when really it could have checked once and then configured a handful of lower-overhead vector addresses in RAM.
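A rough sketch of the difference, in Python rather than CP-1610 assembly so it's easy to run. Everything here (`Machine`, `rom_present`, the handler names) is made up for illustration and is not how ECS BASIC is actually structured; it just contrasts "probe on every call" with "probe once, then jump through a stored vector".

```python
class Machine:
    """Toy stand-in for the system; rom_present() models the costly probe."""
    def __init__(self, has_extension_rom):
        self.has_extension_rom = has_extension_rom
        self.probe_count = 0

    def rom_present(self):
        self.probe_count += 1
        return self.has_extension_rom


def builtin_handler(token):
    return ("builtin", token)


def extension_handler(token):
    return ("extension", token)


def dispatch_probe_every_time(machine, token):
    # Hook style described above: the ROM check sits in the hot path.
    if machine.rom_present():
        return extension_handler(token)
    return builtin_handler(token)


def install_vector(machine):
    # Alternative: probe once at startup and keep the chosen handler
    # around, like a vector address stored in RAM.
    return extension_handler if machine.rom_present() else builtin_handler


tokens = range(1000)

slow = Machine(has_extension_rom=False)
slow_results = [dispatch_probe_every_time(slow, t) for t in tokens]

fast = Machine(has_extension_rom=False)
handler = install_vector(fast)
fast_results = [handler(t) for t in tokens]

print(slow.probe_count, fast.probe_count)  # 1000 probes vs. 1
```

Both paths produce the same results; the only difference is how many times the probe runs.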
It also has some straight-up WTFs, such as zeroing out a 42-byte "instruction buffer" and then copying each statement into that buffer, for each statement it executes. There's no good reason for this--it should just execute the instruction directly from the tokenized program store.
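As a toy model of that overhead (Python, with a made-up program layout: the `(start, length)` statement table and the checksum "interpreter" are purely illustrative), compare zero-then-copy against running straight out of the program store:

```python
BUFFER_SIZE = 42  # size of the "instruction buffer" mentioned above


def execute(store, start, length):
    # Stand-in for the interpreter: just sums the statement's bytes.
    return sum(store[start:start + length])


def run_with_copy(program, statements):
    buf = bytearray(BUFFER_SIZE)
    writes = 0
    results = []
    for start, length in statements:
        for i in range(BUFFER_SIZE):      # zero the whole buffer
            buf[i] = 0
            writes += 1
        for i in range(length):           # copy the statement in
            buf[i] = program[start + i]
            writes += 1
        results.append(execute(buf, 0, length))
    return results, writes


def run_in_place(program, statements):
    # Execute directly from the tokenized store: no buffer writes at all.
    results = [execute(program, s, n) for s, n in statements]
    return results, 0


program = bytes([1, 2, 3, 4, 5, 6, 7])
statements = [(0, 3), (3, 4)]   # hypothetical statement table

copy_results, copy_writes = run_with_copy(program, statements)
direct_results, direct_writes = run_in_place(program, statements)
print(copy_results, copy_writes, direct_writes)
```

In this model every statement costs 42 writes plus its own length before any real work happens, and the in-place version gets the same answers with none of them.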
Or, this massively crappy shift loop. If they'd put any thought into their number representation--e.g. used little endian because it matches the machine better--they could have implemented this floating-point accumulator shift far, far faster. This loop looks like someone glanced at the 6502 version of the code and wrote a slavish reproduction of it.
        PSHR    R5              ; E1FD Save return address
        MVII    #L_47D5, R1     ; E1FF Point to FACC
        CLRR    R5              ; E202
L_E203: MVI@    R1, R2          ; E203 Read a byte
        SLR     R2, 1           ; E204 Right shift it 1
        XORR    R5, R2          ; E205 XOR in bit from prev
        MOVR    R2, R5          ; E206 Copy to R5
        MVI@    R1, R2          ; E207 Reload original byte
        MVO@    R5, R1          ; E208 Store out updated byte
        ANDI    #$0001, R2      ; E209 See if LSB was 1
        BEQ     L_E211          ; E20B No: Clear R5
        MVII    #$0080, R5      ; E20D Yes: Set bit 7 of R5
        B       L_E212          ; E20F Iterate
L_E211: CLRR    R5              ; E211
L_E212: INCR    R1              ; E212 Move to next byte
        CMPI    #L_47DB, R1     ; E214 At end of FACC?
        BNEQ    L_E203          ; E217 Keep going...
        PULR    R7              ; E219 Return
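For anyone who doesn't read CP-1610 assembly, here is a Python model of what that loop does. It assumes the accumulator is the 6 bytes from $47D5 through $47DA (inferred from the loop bounds in the listing), stored most significant byte first, with 8 significant bits per location. The function names are mine.

```python
FACC_LEN = 6  # $47D5..$47DA inclusive, inferred from the loop bounds


def shift_right_bytewise(facc):
    """Mirror of the listing: one byte per iteration, lowest address first."""
    carry_in = 0            # plays the role of R5: 0 or 0x80 from the prev byte
    out = bytearray(facc)
    for i in range(len(out)):
        original = out[i]
        out[i] = (original >> 1) ^ carry_in     # SLR, then XOR in the carry
        carry_in = 0x80 if original & 1 else 0  # LSB becomes next byte's bit 7
    return bytes(out)


def shift_right_wide(facc):
    """Same result, treating the accumulator as one big-endian integer."""
    value = int.from_bytes(facc, "big") >> 1
    return value.to_bytes(len(facc), "big")


facc = bytes([0x81, 0x00, 0x03, 0x00, 0x00, 0x01])
assert len(facc) == FACC_LEN
print(shift_right_bytewise(facc).hex())  # 408001800000
print(shift_right_wide(facc).hex())      # 408001800000
```

So the whole routine is a 1-bit right shift of a 48-bit value, done 8 bits at a time with each byte read from memory twice. The bit shifted out of the last byte is dropped in this model; in the listing it is still sitting in R5 on return, and whether the caller uses it isn't visible from this snippet.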
I went through ECS BASIC and cut out the hooks and replaced that craptastic loop above with something slightly more sane. I also replaced that repeated-zeroing loop with something lighter weight. I got maybe a 10% - 15% speedup. Nothing earth-shattering, but not bad for just some quick minor tweaks.