This took way longer than expected, but we finally solved the stability problems and are now ready to move on.
These past weeks Eric (speccery) was kind enough to use his logic analyzer to analyze the faulty signals of the FinalGROM. And sure enough, he found two major issues.
First, and this didn't involve the logic analyzer, I hadn't connected AVcc on the microcontroller.
I hadn't read the datasheet carefully and left AVcc unconnected, since I wasn't using AREF. Well, easy fix.
Second, Eric found out that the control signals from the microcontroller to the CPLD weren't stable. During normal operation they should remain constant, but he saw that occasionally at least one of them flipped to the opposite level for a few nanoseconds. This would upset the processing logic of the CPLD and thus yield arbitrary results. I think the most likely explanation is some kind of electrical problem, although we haven't found anything conclusive so far.
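To illustrate what "flipping for a few nanoseconds" looks like in a capture, here is a minimal sketch of a glitch search over one sampled signal. It is only an illustration and not taken from the actual analysis: the function name `find_glitches`, its parameters, and the sample data are all hypothetical, and it assumes the logic analyzer export is a plain list of 0/1 samples taken at a fixed sample period.

```python
from typing import List, Tuple


def find_glitches(
    samples: List[int],
    sample_period_ns: float,
    max_glitch_ns: float,
) -> List[Tuple[int, float]]:
    """Return (start_index, duration_ns) for every short pulse in `samples`.

    A pulse counts as a glitch if it is a run of identical samples that
    has a different level on both sides and lasts at most `max_glitch_ns`.
    Runs touching the start or end of the capture are ignored, because
    their true length is unknown.
    """
    glitches = []
    run_start = 0
    for i in range(1, len(samples) + 1):
        # A run ends at the end of the capture or when the level changes.
        if i == len(samples) or samples[i] != samples[run_start]:
            duration_ns = (i - run_start) * sample_period_ns
            is_interior = run_start > 0 and i < len(samples)
            if is_interior and duration_ns <= max_glitch_ns:
                glitches.append((run_start, duration_ns))
            run_start = i
    return glitches


# Hypothetical capture: a signal that should stay high, sampled every 2 ns,
# with one 4 ns dip to low in the middle.
capture = [1] * 10 + [0] * 2 + [1] * 10
print(find_glitches(capture, sample_period_ns=2.0, max_glitch_ns=10.0))
```

Running a check like this on each control line would flag the moments where a signal that should be constant briefly changes level. The threshold `max_glitch_ns` is a free choice here and would have to be well below any intentional pulse width.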
But then the prototype #2 boards arrived, also two weeks late. Besides a new SRAM chip, the new boards also use a different voltage regulator. So I built a new cart, fixed AVcc, and voilà: it has been working flawlessly through more than 8 hours of testing!
So strictly speaking we didn't solve the problem, but it solved itself.
Next, I'm going to design and send off the final board, including the most recent modifications needed to support the latest software updates. Once those boards are done, I'll send some out for general testing. But to speed things up, I'm also going to contact one or two testers for the current prototype board.
I cannot thank Eric enough for his generous help in analyzing the FinalGROM. Without him, the FinalGROM wouldn't work today. And our collaboration was a lot of fun as well!