I really haven't looked at this in great detail. I thought it might be worth looking into for ScumSoft, since he's doing a DPC+ bitmap kernel. I would have to sit down with pen and paper to work through the possibilities. I presume you're hitting respx once for each player? I just thought something might be possible with the cycle savings from DPC+. To be honest, I was hoping you were using a loop, though, so that a few more cycles could be saved.
The players are set to triple close mode. I hit RESP0 and RESP1 four times each per line, but each time you hit them within a scan line it seems to restart the scanning at the start of the first space, so you get two copies of the player that look like [space][player][space][player]. The arrangement of player 0 and player 1 is critical, since you have to load graphics and strobe the resets at very particular times. In most configurations these either overlap, meaning that you would need to write a RESPx and a GRPx register in the same cycle, or they are so close together that you can't fetch data fast enough. It takes 18 pixels to load and store one byte from zero page:
lda zp ; 3 cycles, 9 pixels
sta zp ; 3 cycles, 9 pixels
I presume you're using zero page loads for the data, so that would be one cycle saved per load, for 8 cycles. He could also lose the txs and tsx, since you can just load immediate, for 4 more cycles.
Immediate can only work for constant data, not characters that change every line.
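For reference, here is a rough sketch of the cycle counts behind those estimates. These are standard 6502 timings, and each CPU cycle is 3 color clocks (pixels) on the TIA. The address $80 and the value $3c are placeholders for illustration, not anything taken from the actual kernel:

```
        lda $80         ; zero page load: 3 cycles,  9 pixels
        lda #$3c        ; immediate load: 2 cycles,  6 pixels (value fixed at assembly time)
        txs             ; 2 cycles, 6 pixels
        tsx             ; 2 cycles, 6 pixels
```

Eight loads at one cycle each give the 8 cycles mentioned, and dropping one txs/tsx pair gives the other 4.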
Doing the ball is a no-brainer: lda #, sta enabl could be done at any point and uses 5 cycles. Hopefully we could keep that data in the accumulator and do ror, sta enamx at the appropriate point for 5 more cycles (2 for the ror, 3 for the store). We'd have to use only the 1st or 4th copy of the missile, and disable the missile, or have it already disabled, before the 2nd or 1st copy appears.
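A minimal sketch of that idea, assuming a hypothetical packed byte where bit 1 is the ball enable and bit 2 is the missile enable (the packing is my assumption, not something from the kernel). ENABL and ENAMx only look at bit D1, so whatever the carry rotates into bit 7 should not matter:

```
ENAM0   = $1d           ; TIA missile 0 enable (D1)
ENABL   = $1f           ; TIA ball enable (D1)

        lda #%00000110  ; 2 cycles: bit 1 = ball on, bit 2 = missile on
        sta ENABL       ; 3 cycles: D1 set, ball enabled
        ror             ; 2 cycles: accumulator mode, bit 2 moves down to bit 1
        sta ENAM0       ; 3 cycles: D1 set, missile 0 enabled
```

The two stores do not have to be adjacent; the ror and the second store can be placed wherever the kernel has a free slot.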
The vertical delay and all the registers, including the stack pointer, must be used to attain the timing needed for venetian blinds. Data is loaded into the delay registers of P0 and P1 and the display register of P0, and also into SP, A, X, and Y. Then comes the first store to RESP0, and after that it's a race to load and store all the registers to position and write the pixels of the 8 bytes of sprites on that single-height scan line.
I hadn't been thinking about using all the registers, since I didn't realise this was needed when the copies were 8 pixels apart. I guess that's an artifact of having to hit respx at the right point? So I was hoping Y, say, could be kept with the D1 bit set to 0, so you could just sty enamx at any point. That would just squeeze into the cycles saved, but I thought there might have been a few more available.
Remember that the players are set to triple single-width mode, and the missiles will be too. You can't have different copy settings for the two. Other than that, there are plenty of cycles for setting ball/missile and even playfield graphics.
I guess it's possible to shave off quite a few cycles by not using venetian blinds?
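To illustrate why the missile follows the player: the copy count and spacing live in the low three bits of NUSIZx, and those bits apply to both the player and its missile. Only the missile width (bits D5-D4) is separate. A small sketch, with the value chosen just for illustration:

```
NUSIZ0  = $04           ; TIA number-size register, player 0 / missile 0
NUSIZ1  = $05           ; TIA number-size register, player 1 / missile 1

        lda #%00000011  ; D2-D0 = 011: three copies, close spacing
                        ; D5-D4 = 00:  missile 1 pixel wide
        sta NUSIZ0      ; player 0 and missile 0 both get three close copies
        sta NUSIZ1      ; same for player 1 and missile 1
```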
Yes, venetian blinds take more cycles and are harder because you need to reset the graphics registers more frequently and have to use a single scan line kernel.
I may try and knock something up. Any hints on the respx timing? I've always been a bit confused by where the extra copies appear. I presume you have to hit it at some point between the first and second copies. Instinct would be to hit it at exactly the start of the second copy, but I'm not sure that's right?
Yes, like I said, when you are in triple player mode and write RESPx more than once per scan line, it appears that the player graphics will appear 8 pixels after the pixel location at the end of the sta instruction, and only two copies will be shown. Set up a test case and play with it in Stella; use the debugger to tweak values, or set up many scan-line experiments at once to compare results.
Edit: I didn't remember the respx trick right. I have to hit it three times between changing grpx, right?
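A bare-bones experiment along those lines might look like the fragment below. It is only a sketch to drop into an existing frame loop, not a complete ROM, and the delay values are arbitrary starting points to tweak in the Stella debugger. The loop is only there to get past horizontal blank (68 color clocks, a bit under 23 cycles) before the first strobe:

```
WSYNC   = $02
NUSIZ0  = $04
RESP0   = $10
GRP0    = $1b

        lda #%00000011  ; three copies, close spacing
        sta NUSIZ0
        lda #%11111111  ; solid 8-pixel pattern so each copy is easy to see
        sta GRP0
        sta WSYNC       ; wait for the start of the next scan line
        ldx #5          ; 2 cycles
Delay:  dex             ; 2 cycles
        bne Delay       ; 3 cycles when taken, 2 on exit (assumes no page crossing)
        sta RESP0       ; first strobe
        nop             ; 2 cycles = 6 pixels each; add or remove nops to move
        nop             ; the second strobe relative to the first
        nop
        sta RESP0       ; second strobe on the same scan line
```

Changing the number of nops between the two strobes on successive groups of scan lines is one way to see several timings side by side in a single frame.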