How fast CAN a SIDE 2 go? LET'S FIND OUT!

tschak909 · September 4, 2014

What happens when you drop in a 21MHz 65816 and use a SIDE 2 with CompactFlash? The results are astounding.

Amazing, indeed!

But wait, there's more!

If you drop in a VBXE, load S_VBXE and CON, and switch to an 80-column VBXE console, YOU GET:

MUUUUUUAAAHAHAHAHAHAHAHAHAHAHAHA

-Thom

flashjazzcat · September 4, 2014

Did you see Avery's 50fps video player? He made Antic pull data from SIDE2 at around 400KB/s.

tschak909 · September 4, 2014

Oh yes, I did. Jaw hit floor.

-Thom

phaeron · September 4, 2014

Was that with 128 byte sectors? It goes a lot higher with 512 byte sectors, ~400KB/sec. IDE uses 512 byte sectors, so using 128 or 256 byte blocks is inefficient -- it forces the driver to transfer a lot more data than is actually used.
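To put a rough number on that inefficiency, here is a minimal sketch. It assumes the driver has to move one full 512-byte IDE sector through the data register for every logical block it reads, whatever the logical block size; the function name is made up for illustration and is not part of any driver.

```python
IDE_SECTOR_BYTES = 512  # physical sector size used by IDE, per the post


def bytes_moved_per_useful_byte(logical_block_bytes):
    """Bytes transferred per byte actually used.

    Assumption: one whole 512-byte sector is transferred for each
    logical block, and only the logical block's worth of it is kept.
    """
    return IDE_SECTOR_BYTES / logical_block_bytes


if __name__ == "__main__":
    for size in (128, 256, 512):
        print(size, bytes_moved_per_useful_byte(size))
```

Under that assumption, 128-byte blocks move four bytes for every byte kept, 256-byte blocks move two, and 512-byte blocks waste nothing.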

It's worth noting that these timings depend on Altirra's '816 emulation timings, and YMMV on actual accelerators. A rather big factor is whether the transfer buffers and IDE driver are executing from fast memory -- if they are in base system memory, the 65816 will be severely handicapped. Even with fast memory, you can see the effect of hardware register accesses on the inner loop (the timestamp is in the form of frame:vpos:hpos.subcycle):

    23690:259: 67.0 | A=FF:20 X=02 Y=02 (      ) | 00:13F3: AD F0 D5          LDA $D5F0
    23690:259: 69.0 | A=FF:A4 X=02 Y=02 (N     ) | 00:13F6: 91 32             STA (BUFRLO),Y
    23690:259: 69.6 | A=FF:A4 X=02 Y=02 (N     ) | 00:13F8: C8                INY
    23690:259: 69.8 | A=FF:A4 X=02 Y=03 (      ) | 00:13F9: AD F0 D5          LDA $D5F0
    23690:259: 71.0 | A=FF:50 X=02 Y=03 (      ) | 00:13FC: 91 32             STA (BUFRLO),Y
    23690:259: 71.6 | A=FF:50 X=02 Y=03 (      ) | 00:13FE: C8                INY
    23690:259: 71.8 | A=FF:50 X=02 Y=04 (      ) | 00:13FF: D0 F2             BNE $13F3

$D5F0 is the IDE data register, so accessing it requires going over the chip bus... which then requires slowing down the 65816 from 21MHz to 1.79MHz. This makes the instruction take 15-26 CPU cycles instead of 4: 3 to read the instruction, 0-11 to synchronize, and 12 to read. The loop above would be 27 ideal cycles per word without this effect, but instead it takes 60. (The IDE driver managed to get relocated just right for its inner loop to cross a page boundary... DOH!)
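The 27 versus 60 figure can be reproduced with a small model. This is only a sketch of the arithmetic described above, not Altirra's actual timing code. It assumes 12 fast cycles per chip bus cycle (taken from the "12 to read" figure), that the loop starts on a bus cycle boundary as in the trace, and that the taken branch costs one extra cycle for crossing the page. All function names are made up for illustration.

```python
FAST_PER_BUS = 12  # assumed fast cycles per 1.79MHz chip bus cycle

# (instruction, ideal fast cycles, reads the chip bus?)
INNER_LOOP = [
    ("LDA $D5F0",      4, True),
    ("STA (BUFRLO),Y", 6, False),
    ("INY",            2, False),
    ("LDA $D5F0",      4, True),
    ("STA (BUFRLO),Y", 6, False),
    ("INY",            2, False),
    ("BNE $13F3",      3, False),  # taken branch, before any page-cross penalty
]


def bus_read_cycles(phase):
    """Fast cycles for LDA $D5F0 started `phase` fast cycles into a bus cycle:
    3 to read the instruction, 0-11 to synchronize, 12 to read."""
    fetch = 3
    sync = -(phase + fetch) % FAST_PER_BUS
    return fetch + sync + FAST_PER_BUS


def ideal_cycles():
    """Loop cost if every access ran at full speed."""
    return sum(cycles for _, cycles, _ in INNER_LOOP)


def actual_cycles(phase=0, page_cross_penalty=1):
    """Loop cost with chip bus synchronization on the two data register reads."""
    total = 0
    for _, cycles, on_bus in INNER_LOOP:
        if on_bus:
            total += bus_read_cycles((phase + total) % FAST_PER_BUS)
        else:
            total += cycles
    return total + page_cross_penalty


if __name__ == "__main__":
    print(ideal_cycles(), actual_cycles())
```

In this model the first read costs 24 cycles and the second 16, which lines up with the 67.0 to 69.0 and 69.8 to 71.0 timestamps in the trace.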

More serious is whenever the SDX library is called, which executes in place from banked memory in the cartridge. The default '816 settings in Altirra are for all external memory to reside on the chip bus, because it would be impractical for a real accelerator to run those faster unless they were part of the accelerator itself. This completely kneecaps the 65816 because the library code runs almost at 1.79MHz speed. Another bad case like this would be trying to use a Black Box with an internal '816 -- the SCSI disk transfers would still be slow because the disk driver would be running out of uncached firmware ROM on the slow PBI bus.

The easiest way to avoid this problem is to get as much code into fast RAM as possible, since fast RAM is the simplest thing for the accelerator to handle. That is complicated by needing to accommodate the original Atari 8-bit architecture and also by the 65816's stubborn insistence on having interrupt vectors, stacks, and direct page only in bank 0. This means that having a way to accelerate at least part of bank 0 is critical for an '816 accelerator to run existing software faster than a stock 6502 -- it's pointless to try to run the '816 faster than 1.79MHz if it has to slow down to that speed virtually all the time to fetch instructions. The Apple IIGS solves this problem by having bank 0 be fast RAM by default and optionally shadowing it with Mega II regions from bank $E0 for Apple II compatibility.
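A back-of-the-envelope way to see why fast bank 0 matters so much: the sketch below treats every cycle as running either at full speed or at chip bus speed and ignores synchronization penalties, so it is optimistic. The clock figures are the ones quoted in this thread; the function name and the model itself are assumptions for illustration only.

```python
FAST_MHZ = 21.0  # accelerator clock quoted in the thread
SLOW_MHZ = 1.79  # chip bus clock quoted in the thread


def effective_mhz(slow_fraction):
    """Average clock rate when `slow_fraction` of all cycles run on the chip bus.

    Simplified model: cycles are either fast or slow, with no extra
    synchronization cost when switching between the two.
    """
    if not 0.0 <= slow_fraction <= 1.0:
        raise ValueError("slow_fraction must be between 0 and 1")
    time_per_cycle = slow_fraction / SLOW_MHZ + (1.0 - slow_fraction) / FAST_MHZ
    return 1.0 / time_per_cycle


if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.5, 0.9, 1.0):
        print(f"{fraction:.0%} slow -> {effective_mhz(fraction):.2f} MHz")
```

Even in this optimistic model, spending half of all cycles on the chip bus leaves the effective rate well under 4MHz, which is why code that executes from slow memory drags the whole system down.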

Rybags · September 4, 2014

The IO runs with VBlank fully enabled, doesn't it? I.e., CRITIC not set.

Some more speed could probably be wrung out of it - do a user VBI that's streamlined and only does the minimum required, which should free up a few scanlines per frame for a little extra boost.

tschak909 · September 5, 2014

Yes, I am using 512 byte sectors...

-Thom
