DavidEth Posted September 23, 2010

Okay, so to teach myself the nitty-gritty details of 2600 programming, I'm writing an emulator. I'm far enough along that Pitfall, among others, is playable, albeit with bugs. The only cart I've run into serious problems with so far is Berzerk, because it polls INTIM without ever having written to the timer first, and the loop doesn't exit until it reads back a zero. The problem is that the initial decrement period is 1 cycle, so there's no guarantee a read will ever land exactly on zero, causing the ROM to hang on startup.

I was able to get the ROM to load and run the attract mode by hacking my emulator to start off in 8-cycle mode, and also to turn off the automatic switch to a period of one cycle after hitting zero.

Is there a known issue with Berzerk? I searched the forums here but couldn't find anything. Otherwise, it's entirely possible some other unrelated bug is causing the problem, because the cart eventually does hit TIM64T properly.

-Dave

Nukey Shay Posted September 23, 2010

Berzerk isn't the only game that uses the timer before anything has been written to it. A LOT of games do... it's often a shortcut (when the restart routine is coded near the VBLANK routine, etc.). If you are "correcting" the timer, be aware that INTIM is also used in this game (among some others) to seed the random number generator.

DavidEth Posted September 23, 2010

> Berzerk isn't the only game that utilizes the timer before anything has been written. A LOT of games do...it's often shortcut (when the restart routine is coded near the VBLANK routine, etc). If you are "correcting" the timer, be aware that INTIM is also used in the game (among some others) to seed the random number generator.

(Cool, somebody referenced by name in Racing the Beam replied to me)

How can a loop that polls INTIM for exactly zero reliably exit, then, if the period is 1? I'm decrementing the timer by the number of elapsed CPU cycles, so in a loop like that it's always going to change by a multiple of 7, I think, unless my cycle counts are off.

Hmm.. maybe that's the problem: I'm not emulating the correct number of cycles, so it's never hitting zero.

-Dave

RevEng Posted September 23, 2010

It looks like both Stella and the z26 emulator initialize the timer in the 1024 cycle mode. I looked through the datasheet for the 6532 to no avail, but I'm thinking that the interval on powerup isn't the 1 cycle mode.

DavidEth Posted September 23, 2010

> It looks like both stella and the z26 emulator initialize the timer in the 1024 cycle mode. I looked through the datasheet for the 6532 to no avail, but I'm thinking that the interval on powerup isn't the 1 cycle mode.

Here's the loop it's sticking on:

    $F4FD  LDA $0284
    $F500  BNE $F4FD

It's getting stuck because each instruction takes four cycles, so INTIM might not ever hit zero exactly. All the docs I've seen online say a branch takes 2 cycles if it's not taken, 3 if it's taken, and 4 if it's taken and crosses a page boundary (which the above does). Is this not correct?

Thanks,
-Dave

Nukey Shay Posted September 23, 2010

That is correct. 65xx uses 4 cycles for the read, and 4 cycles for the branch. And what was mentioned above is correct, Z26 was using 1024 intervals on cold start every time.

DavidEth Posted September 23, 2010

> That is correct. 65xx uses 4 cycles for the read, and 4 cycles for the branch. And what was mentioned above is correct, Z26 was using 1024 intervals on cold start every time.

Is the initial value of INTIM guaranteed to be something far from zero? Even with 1024 intervals, if it randomly gets an initial value of 1, might it hit zero and reset to "interval 1" mode before that instruction sequence can be reached?

I tried setting the initial mode to 1024, but that didn't help. Initializing INTIM to 1 or 2 wasn't sufficient either, but setting INTIM to 255 worked. Of course, that renders it useless for a random number seed.

I have a way around it for now, but I would like to understand the problem better.

-Dave

DavidEth Posted September 23, 2010 (edited)

For the record, I figured out the problem. Going back a few steps, I noticed in Surround that when you moved down you'd skip two positions instead of only one.

        lda playerVertPos,x      ; get the player's vertical position
        clc
        adc #1                   ; increment vertical position by 1
        sbc #YMAX-1
        bmi .wrapPlayerToTop
        :
        :
        :
    .wrapPlayerToTop
        adc #YMAX
        sta playerVertPos,x

Looking carefully at the disassembly and runtime trace, I could see that the accumulator was somehow being incremented twice here. I eventually figured out that my implementation of the SBC instruction had the sense of the output carry wrong (ironically, when NOT in decimal mode), which meant that the adc #YMAX was adding one more than expected.

Fixing that, combined with properly setting the initial INTIM mode to 1024 cycles, allowed Berzerk to start up with any value in INTIM. So that busy loop must be within 1024 cycles of the program start.

Thanks everybody!

-Dave

ps. it always amazes me how well games still work when you don't even have the cpu quite right yet.

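The carry convention at the heart of that bug can be sketched in C. This is a minimal illustration, not the emulator code from the post: the `Cpu` struct, the function names, and the `ymax` value used to exercise it are all hypothetical, and only binary (non-decimal) mode is modeled. On the 6502, SBC behaves like ADC of the inverted operand, so carry comes out set when there was no borrow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical minimal CPU state: accumulator and carry flag only. */
typedef struct {
    uint8_t a;
    uint8_t carry; /* 0 or 1 */
} Cpu;

/* Binary-mode ADC: A = A + operand + C; carry out is bit 8 of the sum. */
static void adc(Cpu *cpu, uint8_t operand)
{
    assert(cpu->carry <= 1);
    unsigned sum = (unsigned)cpu->a + operand + cpu->carry;
    cpu->a = (uint8_t)sum;
    cpu->carry = (uint8_t)(sum > 0xFF);
}

/* Binary-mode SBC: same adder with the operand inverted, so
 * carry = 1 afterwards means "no borrow" and carry = 0 means "borrow". */
static void sbc(Cpu *cpu, uint8_t operand)
{
    adc(cpu, (uint8_t)~operand);
}

/* The Surround sequence from the post, for the branch-taken path only.
 * The code between the bmi and its target was elided in the post and is
 * not modeled here. */
static uint8_t move_down(uint8_t pos, uint8_t ymax)
{
    Cpu cpu = { pos, 0 };              /* lda playerVertPos,x / clc */
    adc(&cpu, 1);                      /* adc #1                    */
    sbc(&cpu, (uint8_t)(ymax - 1));    /* sbc #YMAX-1 (C is clear)  */
    if (cpu.a & 0x80)                  /* bmi .wrapPlayerToTop      */
        adc(&cpu, ymax);               /* adc #YMAX                 */
    return cpu.a;
}
```

With the carry left clear by the borrowing SBC, the final ADC adds exactly YMAX and the position advances by one. If SBC reported the carry with the opposite sense, that ADC would add YMAX+1, which matches the "skips two positions" symptom.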
DavidEth Posted September 23, 2010 (edited)

Okay, next question (figured I probably shouldn't flood the forum with a bunch of new threads) -- anything weird about the collision registers? Collisions seem to work in Pitfall and Barnstorming, but they don't work in Surround or Combat.

My collision register update is as follows:

    #define update_col(reg,a,b,c) col_ram[reg] |= (a&b&0x80) | ((a&c&0x80)>>1)

    // update all of the collision latches
    update_col(CXM0P,m0_serial,p1_serial,p0_serial);
    update_col(CXM1P,m1_serial,p0_serial,p1_serial);   // EDIT: p0/p1 were reversed. Didn't fix Combat or Surround though.
    update_col(CXP0FB,p0_serial,pf_serial,bl_serial);
    update_col(CXP1FB,p1_serial,pf_serial,bl_serial);
    update_col(CXM0FB,m0_serial,pf_serial,bl_serial);
    update_col(CXM1FB,m1_serial,pf_serial,bl_serial);
    col_ram[CXBLPF] |= (bl_serial & pf_serial & 0x80);
    col_ram[CXPPMM] |= (p0_serial&p1_serial&0x80) | ((m0_serial&m1_serial&0x80)>>1);

All of the "serial" variables are eight bits wide and have 0x80 in the MSB if we're scanning out that signal. Collision RAM is initialized to zero on startup and also by a write to CXCLR. Do the LSBs (below bits 7 and 6) read back as something other than zero on real hardware?

I guess my other last question is: when are GRP0 and GRP1 actually "latched" during scanout? The scores and Activision logos don't quite work correctly (I believe they're implemented as two "three close copies" sprites, hitting GRP0/1 at exactly the right times during the current scanline). I've tried adjusting the emulation so the write happens on either the last cycle or next-to-last cycle (either before or after emulating the TIA cycle).

-Dave

Thomas Jentzsch Posted September 23, 2010

If a loop lasts 5 cycles, it will eventually hit 0 (within 5 full 256 cycle loops). There would only be a problem if the loop has 2^n cycles.

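The arithmetic behind that remark can be checked with a small C sketch. It assumes a deliberately simplified model, not the real 6532: a free-running 8-bit counter that drops by one every CPU cycle and is sampled once per loop iteration. The function name `polls_until_zero` is made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Returns how many polls pass before a loop that samples a free-running
 * 8-bit down-counter every `loop_cycles` CPU cycles reads exactly zero,
 * or -1 if it never does. Simplified model: one decrement per cycle. */
static int polls_until_zero(uint8_t start, unsigned loop_cycles)
{
    assert(loop_cycles > 0);
    uint8_t timer = start;
    /* The sampled values repeat after at most 256 polls, so checking
     * 256 samples is enough to decide "never". */
    for (int polls = 0; polls < 256; polls++) {
        if (timer == 0)
            return polls;
        timer = (uint8_t)(timer - loop_cycles);
    }
    return -1;
}
```

Under this model a 5-cycle loop reaches zero from any starting value, because 5 and 256 share no common factor. An 8-cycle loop only ever sees values that differ from the start by a multiple of 8, so it exits only if the starting value is itself a multiple of 8.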
DavidEth Posted September 23, 2010

> If a loop lasts 5 cycles, it will eventually hit 0 (within 5 full 256 cycle loops). There would only be a problem, if the loop has 2^n cycles.

Right, but the loop we were talking about was eight cycles long, which is 2^3. Unless you're talking about one of my other questions.

(I suspect collision between player and playfield isn't working for me, since collisions in Pitfall are working fine with the snake and scorpion and fire, but they aren't working in Surround or Combat.)

-Dave

+batari Posted September 23, 2010

> Collisions seem to work in Pitfall and Barnstorming, but they don't work in Surround or Combat. [...] Do the LSB's (below bits 7 and 6) read back as something other than zero on real hardware?

Bits 0-5 are not driven on TIA read registers and generally shouldn't matter. Two things you should check: whether you are mirroring the TIA read registers to all possible addresses where they are mapped, and whether the BIT instruction is setting the V and N flags based on bits 6 and 7, respectively.

DavidEth Posted September 23, 2010

> Bits 0-5 are not driven on TIA read registers and generally shouldn't matter. Two things you should check are if you are not mirroring the TIA read registers to all possible addresses where they are mapped, and make sure the BIT instruction is setting the V and N flags based on bits 6 and 7, respectively.

Here's my address read decode logic:

    // http://www.bjars.com/resources/2600_mem_map.txt
    u8 read(u16 addr) {
        if (addr & 0x1000)
            return read_rom(addr);
        else if (addr & 0x80) {
            if (addr & 0x200)
                return riot_read(addr & 0x1F);
            else
                return ram[addr & 0x7F];
        }
        else
            return col_ram[addr & 0x3F];
    }

There was a bug in BIT last night, but it was because I wasn't correctly setting the zero flag based on the result of the accumulator AND the test memory location.

    #define BIT(am) EA_##am(); t8 = READ_##am(); p = (p & ~(SF|VF|ZF)) | (t8 & (SF|VF)) | ((t8&a)? 0 : ZF)

Read the operand into a temporary, set p by turning off SVZ, then OR in SF and VF from the source operand, and set the zero flag if the logical AND of the source and accumulator is zero. I think it's correct now.

(Also, I edited my original post -- there was a mistake in the m1/p1 m1/p0 logic (p0 and p1 were reversed), but it didn't fix any bugs.)

-Dave

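The BIT flag rule described in that post can also be written as a plain function, which is easier to test in isolation than the macro. This is only a sketch: `bit_flags` and the `FLAG_*` names are hypothetical and are not the `SF`/`VF`/`ZF` macros from the emulator, although the bit positions are the standard 6502 status register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Standard 6502 status register bit positions. */
enum { FLAG_Z = 0x02, FLAG_V = 0x40, FLAG_N = 0x80 };

/* Returns the new status register after BIT, given the old status `p`,
 * the accumulator `a`, and the memory operand `m`.
 * N and V are copied from bits 7 and 6 of the operand itself;
 * Z reflects whether (a AND m) is zero. Other flags are untouched. */
static uint8_t bit_flags(uint8_t p, uint8_t a, uint8_t m)
{
    p &= (uint8_t)~(FLAG_N | FLAG_V | FLAG_Z);
    p |= (uint8_t)(m & (FLAG_N | FLAG_V));
    if ((a & m) == 0)
        p |= FLAG_Z;
    return p;
}
```

This is why BIT is a convenient way to test collision registers: the two meaningful bits (7 and 6) land directly in N and V regardless of what is in the accumulator.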
Thomas Jentzsch Posted September 23, 2010

> Right, but the loop we were talking about was eight cycles long, which is 2^3. Unless you're talking about one of my other questions.

My bad, I should have read your message a bit more precisely.

DavidEth Posted September 23, 2010 (edited)

Figured out the collision issue -- Surround was reading collision data from 0x30. The doc I was using said TIA reads repeated every 64 bytes, but I took a close look at the decode logic in the Stella PDF and it only uses the four LSBs. So once I put the appropriate mask in, it started seeing the correct address.

Batari, thanks for suggesting the TIA mirroring, that turned out to be the problem! Collisions in Combat and Surround work now. Still can't fire a shot in Combat yet for some reason (and I even added support for writing to the three unused bits in SWCHB, since Combat supposedly uses that for something).

Guess the other main question I have left is the one about GRP0/1 scanout. The Activision logo works in Laser Blast, for example, but not Pitfall?

-Dave

+batari Posted September 23, 2010 (edited)

Yep, only the writes are mirrored every 64 bytes. Reads are mirrored every 16 bytes. The same mirroring applies to INPT0-INPT5, which I assume are being handled separately.

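Applied to the read decoder posted earlier in the thread, the fix amounts to changing the TIA mask from 0x3F to 0x0F. The sketch below restates that decode as a pure function so the mirroring is easy to check; the `Chip` and `Decoded` types and the name `decode_read` are hypothetical, and the ROM offset mask assumes a plain 4K cartridge with no bankswitching.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { CHIP_TIA, CHIP_RAM, CHIP_RIOT, CHIP_ROM } Chip;

typedef struct {
    Chip     chip;
    uint16_t offset; /* register or byte index within the selected chip */
} Decoded;

/* Decode a CPU read address using the same bit tests as the thread's
 * read() function, but with TIA reads mirrored every 16 bytes. */
static Decoded decode_read(uint16_t addr)
{
    Decoded d;
    if (addr & 0x1000) {
        d.chip = CHIP_ROM;
        d.offset = addr & 0x0FFF;      /* assumption: unbanked 4K cart */
    } else if (addr & 0x80) {
        if (addr & 0x200) {
            d.chip = CHIP_RIOT;
            d.offset = addr & 0x1F;
        } else {
            d.chip = CHIP_RAM;
            d.offset = addr & 0x7F;
        }
    } else {
        d.chip = CHIP_TIA;
        d.offset = addr & 0x0F;        /* reads: only the four LSBs */
    }
    return d;
}
```

With this mask, Surround's read from 0x30 lands on the same register as a read from 0x00. A separate write decoder would keep the 0x3F mask, since writes are mirrored every 64 bytes.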
DavidEth Posted September 23, 2010 (edited)

Last night I'd noticed that INC, DEC, ROR, ROL, ASL, and LSR all had "unusual" cycle counts compared to the other ALU operations, since they are read-modify-write, but I only added one extra cycle instead of two. Fixing that definitely stabilized a few visual things (the copyright in Pitfall is *almost* exactly correct now, for example).

-Dave

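For reference, the documented NMOS 6502 cycle counts for the memory forms of those instructions can be captured in a small lookup. The enum and function names below are hypothetical, and the "read" column is the base timing of an ordinary read instruction such as LDA in the same addressing mode, without any page-crossing penalty.

```c
#include <assert.h>

typedef enum {
    AM_ZERO_PAGE,
    AM_ZERO_PAGE_X,
    AM_ABSOLUTE,
    AM_ABSOLUTE_X
} AddrMode;

/* Base cycles for a plain read instruction (e.g. LDA) in this mode.
 * Absolute,X takes one more cycle when a page boundary is crossed. */
static int read_cycles(AddrMode mode)
{
    switch (mode) {
    case AM_ZERO_PAGE:   return 3;
    case AM_ZERO_PAGE_X: return 4;
    case AM_ABSOLUTE:    return 4;
    case AM_ABSOLUTE_X:  return 4;
    }
    assert(!"unknown addressing mode");
    return 0;
}

/* Cycles for INC, DEC, ASL, LSR, ROL, ROR operating on memory.
 * Absolute,X is always 7, with or without a page crossing. */
static int rmw_cycles(AddrMode mode)
{
    switch (mode) {
    case AM_ZERO_PAGE:   return 5;
    case AM_ZERO_PAGE_X: return 6;
    case AM_ABSOLUTE:    return 6;
    case AM_ABSOLUTE_X:  return 7;
    }
    assert(!"unknown addressing mode");
    return 0;
}
```

The two-cycle difference in the first three modes is the extra "modify" cycle plus the write-back, which is consistent with the `dec temp2 ; 5` annotation in the Pitfall listing later in the thread.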
DavidEth Posted September 23, 2010 (edited)

I rewrote a bunch of my instruction timing code, so that I trigger a CPU cycle (and therefore three TIA cycles) every time I read or write memory, etc. This stabilized the text display in Pitfall -- almost. All three lines are perfectly correct now, with the exception of the very first character cell of the six.

This loop is trickier than it looks:

    ShowDigits SUBROUTINE
        sta WSYNC                   ; 3
    ;---------------------------------------
        sta HMOVE                   ; 3; 3
        lda colorLst                ; 3; 6
        sta COLUP0                  ; 3; 9
        sta COLUP1                  ; 3; 12
        ldy #0                      ; 2; 14
        sty REFP0                   ; 3; 17
        sty REFP1                   ; 3; 20
        ldx #$10|THREE_COPIES       ; 2; 22
        stx NUSIZ0                  ; 3; 25
        sta RESP0                   ; 3; 28; <== horizontal position
        sta RESP1                   ; 3; 31
        stx HMP1                    ; 3; 34
        sta WSYNC                   ; 3; 37
    ;---------------------------------------
        sta HMOVE                   ; 3; 3 (this is directly after a WSYNC)
        stx NUSIZ1                  ; 3; 6
        iny                         ; 2; 8
        sty CTRLPF                  ; 3; 11 enable playfield reflection
        lda #DIGIT_H-1              ; 2; 13
        sta VDELP0                  ; 3; 16
        sta VDELP1                  ; 3; 19
        sta temp2                   ; 3; 22
        sta HMCLR                   ; 3; 25
        jsr SkipIny                 ; 47 just waste 22 cycles
        lda temp3                   ; 50 just waste three cycles
    .loopDigits:
        ldy temp2                   ; 3; 3
        lda (digitPtr+10),y         ; 5; 8
        sta temp1                   ; 3; 11
        lda (digitPtr+8),y          ; 5; 16
        tax                         ; 2; 18
        lda (digitPtr),y            ; 5; 23
        ora temp3                   ; 3; 26 show lives when drawing time; 26 cycles to here
        sta HMOVE                   ; 3; 29 / 0 produce HMOVE blanks (50+26 cycles=76, new scanline)
        sta GRP0                    ; 3; 32 / 3
        lda (digitPtr+2),y          ; 5; 37 / 8
        sta GRP1                    ; 3; 40 / 11
        lda (digitPtr+4),y          ; 5; 45 / 16
        sta GRP0                    ; 3; 48 / 19 - this somehow does NOT get scanned out
        lda (digitPtr+6),y          ; 5; 53
        ldy temp1                   ; 3; 56
        sta GRP1                    ; 3; 59
        stx GRP0                    ; 3; 62
        sty GRP1                    ; 3; 65
        sta GRP0                    ; 3; 68
        dec temp2                   ; 5; 73
        bpl .loopDigits             ; 2³; 76 cycles per iteration just like we'd hope

Looks like RESP0 is reset at CPU cycle 28, but we rewrite GRP0 on cycle 19 of a later scanline and the original GRP0 doesn't get scanned out?

What I'm seeing in my code is the third character appearing in both the first and third cells. The behavior makes sense because GRP0 is rewritten with a new value before my code latches it. What accounts for this delay in processing? It almost seems like GRP0 gets latched for the initial copy a specific number of cycles before we start scanning out the copy, but later copies are latched fewer cycles before they're scanned out?

-Dave

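The "one CPU cycle, and therefore three TIA cycles, per memory access" scheme mentioned at the top of that post could look roughly like the sketch below. It is an assumption-laden illustration rather than the poster's code: the `Machine` struct, the flat `mem` array standing in for the real address decode, and all function names are made up. Whether the TIA should advance before or after the access takes effect is exactly the kind of detail the post is probing, and the order chosen here is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

enum { COLOR_CLOCKS_PER_LINE = 228 }; /* 76 CPU cycles * 3 */

typedef struct {
    unsigned cpu_cycles;   /* total CPU cycles executed            */
    unsigned color_clock;  /* position within the current scanline */
    unsigned scanline;     /* scanlines completed so far           */
    uint8_t  mem[0x2000];  /* placeholder for the 13-bit bus       */
} Machine;

/* Advance the TIA by one color clock. */
static void tia_tick(Machine *m)
{
    if (++m->color_clock == COLOR_CLOCKS_PER_LINE) {
        m->color_clock = 0;
        m->scanline++;
    }
}

/* One CPU cycle always drives three TIA color clocks. */
static void cpu_cycle(Machine *m)
{
    m->cpu_cycles++;
    for (int i = 0; i < 3; i++)
        tia_tick(m);
}

/* Every bus access costs exactly one CPU cycle, so instruction timing
 * falls out of the access pattern instead of a per-opcode table. */
static uint8_t bus_read(Machine *m, uint16_t addr)
{
    cpu_cycle(m);
    return m->mem[addr & 0x1FFF];
}

static void bus_write(Machine *m, uint16_t addr, uint8_t value)
{
    cpu_cycle(m);
    m->mem[addr & 0x1FFF] = value;
}
```

Under this model a 76-cycle loop like `.loopDigits` consumes exactly one scanline per iteration, which is what the cycle annotations in the listing rely on.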
+batari Posted September 23, 2010

Are you implementing VDEL correctly? If VDELP0 and VDELP1 are both set, a write to GRP0 will be delayed until a write to GRP1 takes place, and vice-versa.

DavidEth Posted September 23, 2010

> Are you implementing VDEL correctly? If VDELP0 and VDELP1 are both set, a write to GRP0 will be delayed until a write to GRP1 takes place, and vice-versa.

No, I'm definitely not implementing VDELP0/1 correctly, because I haven't been able to make sense of it in the Stella manual, and it wasn't clear whether they were necessary for games that update every scanline.

So if neither VDELP0 nor VDELP1 is set, writes to GRP0 and GRP1 go straight through?

What happens when:

VDELP0 is 1 and you write to GRP0
VDELP0 is 1 and you write to GRP1
VDELP1 is 1 and you write to GRP0
VDELP1 is 1 and you write to GRP1

It seemed like there was some weird cross-dependency between them.

-Dave

+batari Posted September 24, 2010

> So if neither VDELP0 or VDELP1 are set, writes to GRP0 and GRP1 go straight through?

I believe so. Think of the TIA as having two copies of GRP0 and GRP1, called GRP0.A, GRP0.B, GRP1.A, and GRP1.B. When neither is set, only the "A" copy is used and displayed.

Others: correct me if I'm wrong on any of these...

> VDELP0 is 1 and you write to GRP0

The value is copied to GRP0.B, while whatever was in GRP0.A will continue to be displayed until a write to GRP1 occurs.

> VDELP0 is 1 and you write to GRP1

GRP0.B is copied to GRP0.A.

> VDELP1 is 1 and you write to GRP0

GRP1.B is copied to GRP1.A.

> VDELP1 is 1 and you write to GRP1

The value is copied to GRP1.B, while whatever was in GRP1.A will continue to be displayed until a write to GRP0 occurs.

Note that there is also a VDELBL for the ball, which operates similarly to the above.

DavidEth Posted September 24, 2010 (edited)

Okay, looking at page 30 of stella.pdf, it seems like:

1. Writing to GRP0 actually writes to undelayed GRP0 and delayed GRP1.
2. Writing to GRP1 actually writes to undelayed GRP1 and delayed GRP0.
3. VDELP0 selects whether GRP0 undelayed or GRP0 delayed actually gets latched.
4. VDELP1 selects whether GRP1 undelayed or GRP1 delayed actually gets latched.

Can you change VDELP0/1 at any point in a scanline and it will take effect immediately?

EDIT - looks like our posts crossed each other. I don't think I understand it quite well enough yet to turn it into C code.

-Dave

DavidEth Posted September 24, 2010 (edited)

Okay, in pseudocode:

    if (addr == GRP0) {
        if (VDELP0 == 0)
            grp0_current = value;
        else
            grp0_delayed = value;
        if (VDELP1 == 1)
            grp1_current = grp1_delayed;
    }
    else if (addr == GRP1) {
        if (VDELP1 == 0)
            grp1_current = value;
        else
            grp1_delayed = value;
        if (VDELP0 == 1)
            grp0_current = grp0_delayed;
    }

TIA always latches grp0_current and grp1_current. Is this correct?

EDIT - I implemented this, and it does seem to fix the score / copyright display in Pitfall! I'm a little unclear how this logic extends to the ball directly, since there's no "GRBL" register.

Thanks again,
-Dave

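An alternative way to express the same idea follows the "two copies of each register" description more literally: writes always go to the undelayed copy, the other player's delayed copy is refreshed as a side effect, and VDELPx only chooses which copy is displayed. The C below is a sketch under that reading, not a verified TIA model; the struct, field, and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t grp0_new, grp0_old;  /* undelayed / delayed copies of GRP0 */
    uint8_t grp1_new, grp1_old;  /* undelayed / delayed copies of GRP1 */
    uint8_t vdelp0, vdelp1;      /* 0 or 1, from bit 0 of VDELPx       */
} PlayerGfx;

/* A write to GRP0 loads the undelayed GRP0 copy and, as a side effect,
 * moves GRP1's undelayed value into GRP1's delayed copy. */
static void write_grp0(PlayerGfx *g, uint8_t value)
{
    g->grp0_new = value;
    g->grp1_old = g->grp1_new;
}

/* A write to GRP1 does the mirror image for GRP0. */
static void write_grp1(PlayerGfx *g, uint8_t value)
{
    g->grp1_new = value;
    g->grp0_old = g->grp0_new;
}

/* VDELPx only selects which copy is displayed, at the time of display. */
static uint8_t shown_grp0(const PlayerGfx *g)
{
    return g->vdelp0 ? g->grp0_old : g->grp0_new;
}

static uint8_t shown_grp1(const PlayerGfx *g)
{
    return g->vdelp1 ? g->grp1_old : g->grp1_new;
}
```

For the Pitfall score loop, where both VDEL bits are set before the loop starts and never change, this gives the same visible result as the pseudocode above: each GRP0 write only becomes visible on the following GRP1 write, and vice versa. The two formulations can differ if a game toggles VDELPx between writes, because here the selection happens at display time rather than at write time.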
+batari Posted September 24, 2010

For the ball, there is the ENABL register, so just think of this as having two values, with the delayed one triggered by a write to GRP1.

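In the same spirit, the ball's delayed enable can be sketched in C. The names are hypothetical, and this is a reading of the description above rather than a verified model: ENABL keeps an undelayed and a delayed value, a write to GRP1 copies one into the other, and VDELBL picks which one is used. ENABL's enable bit is D1.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t enabl_new;  /* undelayed enable, 0 or 1    */
    uint8_t enabl_old;  /* delayed enable, 0 or 1      */
    uint8_t vdelbl;     /* 0 or 1, from bit 0 of VDELBL */
} BallEnable;

/* A write to ENABL loads the undelayed copy from bit 1 of the value. */
static void write_enabl(BallEnable *b, uint8_t value)
{
    b->enabl_new = (uint8_t)((value >> 1) & 1);
}

/* Call this from the GRP1 write handler: it refreshes the delayed copy. */
static void on_grp1_write(BallEnable *b)
{
    b->enabl_old = b->enabl_new;
}

/* VDELBL selects which copy controls whether the ball is drawn. */
static int ball_enabled(const BallEnable *b)
{
    return b->vdelbl ? b->enabl_old : b->enabl_new;
}
```

With VDELBL set, a write of 0x02 to ENABL has no visible effect until the next GRP1 write.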
DavidEth Posted September 24, 2010 (edited)

> For the ball, there is the ENABL register, so just think if this as having two values, and is triggered by a write to GRP1.

Happen to know any games offhand that use VDELBL?

Combat is still pretty flaky -- can't get the bullets to fire reliably, and while the tanks show up, the planes only flicker briefly. Laser Blast is kinda funny too -- the enemy shots work fine, but your laser is invisible even though it still destroys things.

EDIT - Turns out the ladder in Pitfall is implemented with the ball, so I was able to use it as a test case.

-Dave