I ended up moving everything over to the GPU because I ran into hardware bugs in the DSP (something about external writes failing under certain conditions) that made the program not work on real hardware.
If you're hoping to avoid bugs by coding on the GPU, you might have more luck playing the lottery.
Are you saying you're doing all that in one GPU program?
- Yes, the 3D Engine runs off GPU in a single chunk.
- While I got the code swapping to work a few months ago, I'm not a huge fan of how much performance gets destroyed (just bloody wiped out) even if you have just one other chunk to run.
- Swapping code is OK if you don't mind dropping to 15 fps not far down the road, but I'd rather not take the easy way out.
- Also, there's no way I could run just the track at 60 fps (internal average framerate dropped from 80.00 to 73.17 after I fixed the front-plane gap last weekend) if I were lazily swapping the code (even just one swap).
- Once you start swapping, there's no incentive to optimize. It's much easier to keep the first (horribly inefficient) version of the code and just swap it out once it's done.
So, I prefer to get creative with refactoring.
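To put those framerates into frame times, here's a quick back-of-the-envelope sketch in Python. The helper and variable names are just for illustration; only the 80.00, 73.17 and 60 fps figures come from the numbers above.

```python
def frame_ms(fps):
    """Milliseconds spent on one frame at a given average framerate."""
    return 1000.0 / fps

budget_60 = frame_ms(60.0)      # time available per frame at 60 fps
before = frame_ms(80.00)        # track only, before the front-plane gap fix
after = frame_ms(73.17)         # track only, after the fix

fix_cost = after - before       # what the fix costs per frame
headroom = budget_60 - after    # what is left before dropping under 60 fps

print(f"before: {before:.2f} ms, after: {after:.2f} ms")
print(f"fix cost: {fix_cost:.2f} ms, headroom to 60 fps: {headroom:.2f} ms")
```

So the fix cost a bit over 1 ms per frame, and there's roughly 3 ms left before the track alone drops under 60 fps. Anything that eats more than that per frame, code swaps included, is over budget.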
The following runs on the 68000 in parallel with the GPU:
- the OP List refresh
- the AI finite state machine
- LevelOfDetail computations
- various distance calculations
- interpolating the camera position to follow the track
- there are quite a few slow divisions in there
Contrary to the popular misconception that has been constantly spread over the last few decades, none of the code above is slowing the GPU down. There's no measurable difference whether I run a 2-line endless loop on the 68000 or all the code above. I have about a dozen compile flags that turn whole sections on and off, and I run about 20-30 benchmarks each week (it takes 10 seconds to adjust the flags and redeploy). Of course, it's been carefully designed from the get-go to allow for completely parallel execution (no chokepoints between the GPU and the 68k).
Of course, if you turned the 68000 off, you would get a little boost in GPU performance, as the 68k would not be hitting the bus. But people conveniently never mention that you then have to start swapping code, which causes a much greater performance hit than the 68000 hitting the bus.
If I had to shut off the 68000 and code all of the above on the GPU, I would need at least 2 more chunks (but most probably 3, as the GPU, being a RISC, often needs 5x-10x more instructions to achieve the same thing), which means 3-4 swaps per frame. I could forget about 60 fps for Time Trial. Forget about 30 fps for regular gameplay.
But hey, I could then claim (best if read with Eddie Izzard's voice & performance) "Hey, Player! No 68000 has been used on this game. Now go and enjoy your 15 fps!"
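For anyone wondering where "3-4 swaps per frame" comes from, here's a rough sketch in Python. It rests on two assumptions of mine: a single resident chunk never needs reloading, and once there is more than one chunk, every chunk (the first one included) has to be loaded once per frame. The function names are made up for illustration, and the model ignores the extra GPU work the moved 68000 code would add, so the real picture would be worse.

```python
def swaps_per_frame(total_chunks):
    """Code loads per frame when the chunks run in a fixed cycle.

    Assumption: a single resident chunk needs no reloading; with more
    than one chunk, each chunk is loaded once per frame.
    """
    return 0 if total_chunks <= 1 else total_chunks

def max_swap_cost_ms(current_fps, target_fps, swaps):
    """Largest per-swap cost (ms) that still hits target_fps.

    Ignores any extra work; only the existing headroom is divided up.
    """
    headroom = 1000.0 / target_fps - 1000.0 / current_fps
    return headroom / swaps

for extra_chunks in (2, 3):
    swaps = swaps_per_frame(1 + extra_chunks)
    limit = max_swap_cost_ms(73.17, 60.0, swaps)
    print(f"{extra_chunks} extra chunks -> {swaps} swaps/frame, "
          f"each swap must cost under {limit:.2f} ms to hold 60 fps")
```

With the track-only figure of 73.17 fps, that leaves about 1 ms per swap for 3 swaps and about 0.75 ms per swap for 4, before counting any of the actual work those extra chunks would do.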