Reindeer Rescue Bug

Posted by , 15 December 2005 · 675 views

2600 Game Development Reindeer Rescue
JUST WHEN I THOUGHT I WAS DONE it turns out I'm not. :|

Anyway, I'll just reproduce what Al wrote to me and ask: anybody have any explanations or a fix?

Hi Bob, When playing the game (NTSC) on a real 2600, I ran into an interesting bug. When starting the game, if you don't do anything, Santa ends up running right through the playfield, and the game resets back to the title screen shortly after that. I burned two games to boards to test, and only discovered this after I played into the third level. It was a fluke that I noticed this at all, but it's a pretty serious problem so I'm going to start erasing the 100 (well, uhhr, 98) EPROMs I burned. This does not happen in Stella, which is probably why nobody saw it before. Do any of the other testers have Krokodile Carts or Cuttle Carts they can use to test this on a real 2600?

Message Two:

I did some more digging-- it seems that the game only fails on one type of EPROM, the Texas Instruments 27C128. This is strange, as I use these all the time for Thrust+ Platinum. The game does seem to work fine with all the other types of EPROMs I programmed (five other types). Unfortunately, the TI 27C128s are the ones I have the most of, and comprise probably 60 of the EPROMs I programmed. Really strange that this problem only occurs with this one part. I assume it's some kind of weird timing issue. I'm hoping that you might be able to come up with some theories as to why this might be happening and maybe a fix? I'm a bit nervous about soldering more boards until I hear back from you.

Does anybody have a Kroc cart or a CC and want to test Reindeer Runner for me? Manuel, I'm assuming that you have already, since you ran this on a real TV to test the PAL colors.

This has me baffled; it is beyond my "expertise."

The only weird things the game does are hitting HMOVE at an odd time and, as mentioned in Eric B's weblog, outputting 270 scanlines (NTSC). I also store the value of CXP0FB in RAM halfway down the screen (right before drawing the string of Christmas lights) and then use that stored value to process collisions between Santa and the playfield, so I suppose that value might be getting corrupted or lost somehow...? I dunno.

Help?
(Note: I'm going to post this same thing on the [stella] list...)

EDIT: This is the pic referred to in comment #58: (attached image)




Are you using any undocumented opcodes? Are you initializing everything (TIA & RIOT registers, RAM)? Could it be starting in the wrong bank? (i.e. what are $FFFA - $FFFF pointing to in each bank) What are the dependencies of the Santa/playfield collision routine?
  • Report
Since it is EPROM related IMO it must have something to do with the timings... :)

Could the bankswitching be part of the problem? ;)
  • Report
I have a Krokocart and I'd test the ROM for you.

I'm not very good at a lot of games, but it sounds like this bug shouldn't be hard to test for(?)
  • Report

Are you using any undocumented opcodes?  Are you initializing everything (TIA & RIOT registers, RAM)?  Could it be starting in the wrong bank?  (i.e. what are $FFFA - $FFFF pointing to in each bank)  What are the dependencies of the Santa/playfield collision routine?

I am using undocumented opcodes; I think only DCP (as part of SkipDraw).

Starting in the wrong bank...hmmmm. I'll double check right now, but I'm pretty sure every bank except the first has its RESET/BRK vectors pointing at $F000, and at $F000 in each of those banks there is a 'sta $1FF6'. The first bank has its vectors pointing to $F003, where the CLEAN_START macro sits.
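
The startup arrangement described above could look roughly like the following DASM sketch. It is a reconstruction from the description, not the game's actual source: the two-bank layout, the label Start, the three placeholder bytes at $F000 in the first bank, and the use of macro.h for CLEAN_START are all assumptions. The point is that 'sta $1FF6' is three bytes long, so after the switch the CPU fetches its next opcode from $F003 in the first bank.

```asm
        processor 6502
        include "macro.h"       ; assumed source of CLEAN_START

; ---- first bank (assumed to be the one selected by $1FF6) ----
        org  $1000
        rorg $F000
        ds   3                  ; placeholder for whatever occupies $F000-$F002
Start   CLEAN_START             ; sits at $F003
        ; ... rest of the first bank ...
        org  $1FFC
        rorg $FFFC
        .word Start             ; RESET vector -> $F003
        .word Start             ; IRQ/BRK vector -> $F003

; ---- any other bank (only one shown) ----
        org  $2000
        rorg $F000
        sta  $1FF6              ; 3 bytes: switch to the first bank,
                                ; execution resumes there at $F003
        ; ... rest of this bank ...
        org  $2FFC
        rorg $FFFC
        .word $F000             ; RESET vector -> the sta above
        .word $F000             ; IRQ/BRK vector -> the sta above
```

With this layout, a reset that lands in any bank ends up running CLEAN_START in the first bank.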

The main tricky thing about Santa-to-playfield collisions is that, since the playfield and the players overlap in the status portion of the screen (at the bottom), I have to store CXP0FB in RAM right above the status section. Then all collisions are based on the value in RAM, rather than directly on the value in the collision register. But unless the EPROM is corrupting RIOT RAM (!?! Probably not!) I don't know how that would affect anything.

Could the bankswitching be part of the problem?

Hmmm...I wonder if the bank switch interferes with writes to RAM? Seems odd, and perhaps unlikely, but I'll look into that.

I have a Krokocart and I'd test the ROM for you.

PM (and ROM) sent!

Thanks, everyone!

EDIT: Ok, I checked. After drawing most of the screen, the ROM bankswitches before drawing the floor, and immediately after drawing the floor I store the value of CXP0FB in RAM. That store happens more than 6 scanlines after the bankswitch, and the ROM doesn't bankswitch again for another ~30 scanlines. So...I'm still baffled. I also double checked the startup routines; as far as I can tell the ROM should have no problem starting up in any bank. It's all as I described above.
  • Report
Hi there!

Does anybody have a Kroc cart or a CC and want to test Reindeer Runner for me? Manuel, I'm assuming that you have already, since you ran this on a real TV to test the PAL colors.


I did all PAL testing on the real thing with a CC.
I just ran it again, and these two scenarios work flawlessly for me:

- Play up to level 3. Die there. Wait until the GO sequence is done. Start again.
- Play up to level 3. Hit RESET there. Start again.

If there's more to it (like starting the game with the Fire Button vs. RESET, certain difficulty switch settings (I had B/B here), or stuff like "die with x lives remaining"), let me know and I'll run it again immediately!

Ideas:

- Bankswitching!
- Does it compile with "branch out of reach"?
- Check that every constant that needs one has its # prefix

Greetings,
Manuel
  • Report

Hi Manuel, I don't think I ever sent you the final binary :) I'll PM it to you right now (or check your email, I distributed it on the [stella] list).

What should I look for when looking for problems with bankswitching?
Does not compile with any "branch out of reach" errors...or any other errors/messages of any kind, for that matter.
Will check constants for #.

Thanks, everyone.

-bob
  • Report
I tried to let Santa run through the playfield on both areas 1 and 2... but they seem to be working correctly. I'll keep at the game for a while to try area 3.

Nice game, btw! :)
  • Report

Thanks, and thanks!
  • Report

Hi Manuel, I don't think I ever sent you the final binary :)

;) Manuel did the PAL tests for us.
  • Report

I know - the binary went through about 5 "final" versions in the last two days, mostly fixing bugs related to the score and the high score (I kinda, uh, forgot to set the decimal flag in one spot ;) and the routine that compared the current score to the high score was buggy), and I didn't copy everyone on every revision. I didn't want you all to hate me. :)

So anyway, the versions posted on [stella] are the "final" binaries that I sent to Al and that he began burning to EPROMs yesterday.
  • Report
Well... shoot. Sorry about that - I guess I jinxed you with the "Congratulations" comment yesterday. :)

I would have tested it on real hardware except I still don't have my Kroko Cart. ;)
  • Report
Hi folks,

First, thanks for the quick feedback and good suggestions everyone. It was supercat (in a PM) and, a little after the fact on [stella], David Galloway, who tipped me off to what (I *think*) was the real problem. Namely, that this code just plain doesn't work on certain EPROMs (why?):
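   lda CXP0FB
   sta SantaTemp
Changing that code to the following fixed the problem:
   bit CXP0FB      ; N flag = bit 7 of CXP0FB (P0/playfield collision)
   bpl NoCollision ; bit 7 clear: no collision
   lda #$FF        ; collision: flag value $FF
   .byte $2C       ; opcode for BIT absolute; swallows the next two bytes (the lda #0)
NoCollision
   lda #0          ; no collision: flag value 0
   sta SantaTemp
Thanks again for all the suggestions and ideas. Does anybody have an explanation???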

I assume it is the timing difference between LDA and BIT, but according to this webpage (which, admittedly, is for the 6510), the timing for those two operations is the same.
 Zero page addressing

     Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
                        LAX, NOP)

        #  address R/W description
       --- ------- --- ------------------------------------------
        1    PC     R  fetch opcode, increment PC
        2    PC     R  fetch address, increment PC
        3  address  R  read from effective address

-bob
  • Report

The 6510 is functionally the same as the 6507, so the above is fine.

However, I understand that LDA, LDX, LDY and LAX operate a little differently than the others. I don't know if this has any effect, but I think (and I hope someone will correct me if I am wrong) that the other instructions actually perform the operation during the 4th cycle while simultaneously fetching the opcode for the next instruction, so they appear to take 3 cycles.

Regardless, my gut tells me that it's a problem with how the EPROM affects the data bus. If the data bus has open-collector outputs (I am not sure if it does or not) then it can easily be pulled low. In LDA CXP0FB, the data bus will contain $A5...$02...then the contents of $02. Maybe the $02 is staying on the bus too long. I wonder if you changed it to LDA $E000+CXP0FB, then the bus would contain $AD...$02...$E0 and I wonder if this would work, since the high byte would be 1 instead of 0.
  • Report

I don't know if you've been following this on the [stella] list or not, but Al has been kind enough to test a couple of different permutations...

So:
This doesn't work:
   lda CXP0FB
   sta SantaTemp
This does work:
   bit CXP0FB
   bpl NoCollision
   lda #$FF
   .byte $2C
NoCollision
   lda #0
   sta SantaTemp
However, this also works:
   lda CXP0FB
   bpl NoCollision
   lda #$FF
   .byte $2C
NoCollision
   lda #0
   sta SantaTemp
But this does not work:
   lda CXP0FB
   nop
   nop
   nop
   nop
   nop
   nop
   nop
   sta CXP0FB
And, elsewhere in the code, this does work:
   lda CXPPMM
   bmi CollisionHappened
Does that fit with your theory?
  • Report

And, elsewhere in the code, this does work:

   lda CXPPMM
   bmi CollisionHappened
Does that fit with your theory?

This one doesn't really, but then again CXPPMM is a different register than CXP0FB, so I'm not totally sure.

I guess I'm a couple of hours late. But I still wonder if my suggestion will work. If you try it and it works, please post the result here.
  • Report
You're definitely not too late!

EDIT: Ok, Al tested that version and it did not work, Fred.

But supercat had a theory of his own and his own fix to try, which ended up working.

His looked something like this:

Replace
   lda CXP0FB
   sta SantaTemp
With this:
   lda CXP0FB
   and #$C0
   ora #2
   jmp ImHere
ImHere
   sta SantaTemp
I didn't quite understand his explanation of why that might work, so I'm waiting to hear back from him.
  • Report

However, I understand that LDA, LDX, LDY and LAX operate a little differently than the others. I don't know if this has any effect, but I think (and I hope someone will correct me if I am wrong) that the other instructions actually perform the operation during the 4th cycle while simultaneously fetching the opcode for the next instruction, so they appear to take 3 cycles.


On the 6502, the processor has to know what it's going to do with the memory bus on any given cycle before the previous cycle completes, with two exceptions:

-1- The low-order part of the address may be taken from the previous cycle's data, or
-2- The high-order part of the address may be taken from the previous cycle's data.

If you perform any instruction that ends with a read operation, the processor will start processing the opcode fetch for the next instruction before it finishes processing the read data. This is fine, though, since the opcode fetch will be unaffected by the previous operation.

Regardless, my gut tells me that it's a problem with how the EPROM affects the data bus.  If the data bus has open-collector outputs (I am not sure if it does or not) then it can easily be pulled low.  In LDA CXP0FB, the data bus will contain $A5...$02...then the contents of $02.  Maybe the $02 is staying on the bus too long.  I wonder if you changed it to LDA $E000+CXP0FB, then the bus would contain $AD...$02...$E0 and I wonder if this would work, since the high byte would be 1 instead of 0.


When performing a "LDA $02" instruction, there are about 420ns between the time A12 goes low (indicating the EPROM should shut up) and the time the TIA starts outputting its data, and another 420ns before the CPU actually reads the data. Any EPROM which can't shut up within 420ns is broken.
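
The 420ns figure is not derived in the post, but it is consistent with half a CPU cycle on an NTSC console, assuming the usual clocking where the 6507 runs at one third of the 3.579545 MHz color clock:

```latex
f_{\text{CPU}} = \frac{3.579545\ \text{MHz}}{3} \approx 1.193\ \text{MHz},
\qquad
T = \frac{1}{f_{\text{CPU}}} \approx 838\ \text{ns},
\qquad
\frac{T}{2} \approx 419\ \text{ns}
```

On that reading, the two 420ns intervals together would make up the single cycle in which the TIA register is addressed.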

One source of incompatibility, though, is that the TIA doesn't bother to say anything on bits 0-5. They may float high, they may float low, or they may just sit at the last driven value. This would most likely be 2, but it could at least theoretically be something else (if the EPROM is faster than the GAL, the changing address could cause it to output a goofy data byte before the chip becomes deselected). Thus, bits 0-5 of any TIA read should be regarded as indeterminate.

Note that this only affects bits 0-5. Bits 6-7 will be actively driven by the TIA, and it should have no trouble whatsoever doing so.
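
As a general precaution that follows from this (not a confirmed fix for the bug in this thread), code can discard bits 0-5 before keeping a collision value, and later test only the bits the TIA drives. A minimal DASM sketch, assuming vcs.h for the register name; the RAM address and the labels LatchCollision, CheckCollision and SantaHitPlayfield are made up for illustration:

```asm
        processor 6502
        include "vcs.h"         ; defines CXP0FB

        seg.u Variables
        org  $80
SantaTemp ds 1                  ; latched collision bits (address is an assumption)

        seg  Code
        org  $F000
LatchCollision
        lda  CXP0FB             ; D7 = P0/playfield, D6 = P0/ball, D0-D5 undefined
        and  #%11000000         ; keep only the two driven bits
        sta  SantaTemp
        rts

CheckCollision
        bit  SantaTemp          ; N <- bit 7, V <- bit 6
        bmi  SantaHitPlayfield
        rts                     ; no playfield collision
SantaHitPlayfield
        rts                     ; collision handling would go here
```

Branching on N and V alone means the result never depends on what the undriven data lines happened to hold.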
  • Report

Well, my theory was that it might be an address-sensitivity problem elsewhere in the code, solved by expanding this bit of code by seven bytes. Using seven NOPs might not be an adequate substitute because they take too long to execute. Try changing the above code to this:
   lda CXP0FB
   jmp ImHere
   and #$C0
   ora #2
ImHere
Same size. Four cycles faster, but that shouldn't matter--the key thing is that it's not fourteen cycles slower. I must confess to still being puzzled.
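
For reference, the byte and cycle counts behind that comparison, using the standard 6502 figures (NOP: 1 byte, 2 cycles; AND/ORA immediate: 2 bytes, 2 cycles each; JMP absolute: 3 bytes, 3 cycles). The labels are hypothetical and only there so the listing assembles on its own in DASM:

```asm
        processor 6502
        org  $F000

SevenNops                       ; 7 bytes, 7 x 2 = 14 cycles
        repeat 7
        nop
        repend

MaskThenJump                    ; 7 bytes, 2 + 2 + 3 = 7 cycles
        and  #$C0
        ora  #2
        jmp  HereA
HereA

JumpOverMask                    ; 7 bytes, 3 cycles (and/ora are never executed)
        jmp  HereB
        and  #$C0
        ora  #2
HereB
```

All three pad the code by the same seven bytes; only the execution time differs, with the last variant taking four cycles fewer than the masked version and eleven fewer than the NOPs.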

Incidentally, the earlier code would force the accumulator to have the value it would normally have if D0-D5 float cleanly.
  • Report

   lda CXP0FB
   jmp ImHere
   and #$C0
   ora #2
ImHere


If this happens to re-break things, then try:
   lda CXP0FB
   and #$FF
   ora #0
   jmp ImHere
ImHere

In this version, the "AND" and "ORA" don't affect the accumulator contents, but they do take time to execute. The possible outcomes:

- If this version is also re-broken, then something in the code depends upon D0-D5 holding certain values.
- If the code works with the second version but not the first, then there is something that requires the right amount of delay.
- If both versions work, then there's something elsewhere in the code that's address-sensitive. In that case, you should try removing this extra code and inserting seven extra bytes at various other places (probably between routines). You may find a routine which will work if seven extra bytes are placed before it but not if they're placed after.
  • Report

Thanks, I'll send 'em off to Al for testing :)
  • Report