Cybernoid Posted July 6, 2005

I am trying to write a cycle-accurate 6502 hardware implementation in Verilog, and have a few things that make me go, "Hmmmm"....

From what addresses will the 65C02 grab the address in the following cases:

If the instruction is "LDA ($ff,X)" and X=0, will the address for the load be taken from $00ff and $0100 or from $00ff and $0000?

Same question for "LDA ($ff),Y" and Y=??. Will the address be pulled from $00ff and $0100 or from $00ff and $0000?

Just trying to make this complete... Thanks!
Kroko Posted July 6, 2005

Zero page indexed addressing

Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT, LAX, NOP)

 #   address  R/W description
--- --------- --- ------------------------------------------
 1     PC      R  fetch opcode, increment PC
 2     PC      R  fetch address, increment PC
 3   address   R  read from address, add index register to it
 4  address+I* R  read from effective address

Notes: I denotes either index register (X or Y).

       * The high byte of the effective address is always zero,
         i.e. page boundary crossings are not handled.

Read-Modify-Write instructions

 #   address  R/W description
--- --------- --- ---------------------------------------------
 1     PC      R  fetch opcode, increment PC
 2     PC      R  fetch address, increment PC
 3   address   R  read from address, add index register X to it
 4  address+X* R  read from effective address
 5  address+X* W  write the value back to effective address,
                  and do the operation on it
 6  address+X* W  write the new value to effective address

Note: * The high byte of the effective address is always zero,
        i.e. page boundary crossings are not handled.

So I guess you always need to ignore the high byte in these address modes.
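As a rough illustration of the note under both tables (the high byte of the effective address is always zero), the zero page indexed address can be modelled as an 8-bit add with the carry thrown away. This is only a sketch; the function name `zp_indexed_address` is made up for the example and is not taken from the quoted document.

```python
def zp_indexed_address(base: int, index: int) -> int:
    """Effective address for zero page,X or zero page,Y.

    Hypothetical helper: the sum is truncated to 8 bits, so the
    access never leaves page zero.
    """
    return (base + index) & 0xFF


# $FF + 1 wraps to $00 instead of reaching $0100.
print(hex(zp_indexed_address(0xFF, 0x01)))
# $80 + $90 = $110, which is truncated to $10.
print(hex(zp_indexed_address(0x80, 0x90)))
```

In a hardware model the same effect comes from simply driving the upper eight address lines with zero during these cycles.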
LocalH Posted July 6, 2005

As far as I know, the only bug is in the indirect jump. If you do an indirect jump through a pointer whose low byte is $FF, the CPU does not increment the high byte and reads the second byte from the wrong page. For example, if you do JMP ($0FFF), the CPU will get its final target address from $0FFF and $0F00. As far as I know, all other indirect opcodes work properly. I'm pretty sure this bug is fixed in the 65C02 and 65816, but most if not all NMOS 6502 implementations have the bug.

Edit: Gah, I misread, and thought you were asking about the bug. ZP always ignores the high byte for a saving of 1 cycle. Basically, indirect jumps fail to carry into the high byte, much as ZP opcodes do by design. The jump still reads both address bytes, and thus takes the non-ZP cycle count. It just forgets to increment the high byte.

Edited July 6, 2005 by LocalH
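The JMP ($0FFF) case can be sketched in a few lines. This is a simplified model, not a description of the silicon: `jmp_indirect_target` and its `nmos` flag are names invented for the example, and memory is just a flat 64K byte array.

```python
def jmp_indirect_target(memory: bytearray, pointer: int, nmos: bool = True) -> int:
    """Target address of JMP (pointer).

    Hypothetical model: with nmos=True the pointer's high byte is not
    incremented when its low byte is $FF, so the second byte comes from
    the start of the same page.
    """
    lo = memory[pointer]
    if nmos:
        hi_addr = (pointer & 0xFF00) | ((pointer + 1) & 0x00FF)
    else:
        hi_addr = (pointer + 1) & 0xFFFF
    hi = memory[hi_addr]
    return (hi << 8) | lo


memory = bytearray(0x10000)
memory[0x0FFF] = 0x34  # low byte of the vector
memory[0x0F00] = 0x12  # byte the NMOS part actually reads
memory[0x1000] = 0x56  # byte a "correct" fetch would read

print(hex(jmp_indirect_target(memory, 0x0FFF, nmos=True)))
print(hex(jmp_indirect_target(memory, 0x0FFF, nmos=False)))
```

With the values above the NMOS branch yields $1234 and the fixed branch yields $5634, which is the difference a cycle-accurate core has to reproduce.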
djmips Posted July 6, 2005

In batari's post about undocumented opcodes you can find a link to a document that seems to be quite authoritative on the 6502. It's more geared toward the 6510, but it is probably applicable to the 65C02.

I've seen that the 6502 family isn't as straightforward as I personally thought, with a few quirks that are important to duplicate, including the undocumented opcodes.

Edited July 6, 2005 by djmips
LocalH Posted July 6, 2005

Also, make sure you remember that the RMW instructions read the target address, write the unmodified value back while the operation is carried out, then write the new value. This comes in handy in some instances; for example, on the C64, to ack a raster IRQ you can just do LSR $D019 and save a few cycles.
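To make the read, write-back, write-new ordering concrete, here is a small bus-trace sketch for an absolute-mode RMW instruction. It treats memory as plain RAM and does not model any I/O register behavior (so it says nothing about how $D019 itself reacts); `rmw_absolute_trace` and `lsr` are hypothetical helper names.

```python
def lsr(value: int) -> int:
    """Logical shift right of an 8-bit value (flags ignored in this sketch)."""
    return (value >> 1) & 0xFF


def rmw_absolute_trace(memory: bytearray, pc: int, operation) -> list:
    """Return (cycle, address, 'R' or 'W', value) tuples for one
    absolute-mode read-modify-write instruction on an NMOS-style core."""
    trace = []
    trace.append((1, pc, "R", memory[pc]))               # fetch opcode
    lo_addr = (pc + 1) & 0xFFFF
    hi_addr = (pc + 2) & 0xFFFF
    trace.append((2, lo_addr, "R", memory[lo_addr]))     # fetch address low
    trace.append((3, hi_addr, "R", memory[hi_addr]))     # fetch address high
    address = (memory[hi_addr] << 8) | memory[lo_addr]
    old = memory[address]
    trace.append((4, address, "R", old))                 # read operand
    trace.append((5, address, "W", old))                 # write unmodified value back
    new = operation(old) & 0xFF
    memory[address] = new
    trace.append((6, address, "W", new))                 # write modified value
    return trace


memory = bytearray(0x10000)
memory[0xC000:0xC003] = bytes([0x4E, 0x19, 0xD0])  # LSR $D019
memory[0xD019] = 0x81
trace = rmw_absolute_trace(memory, 0xC000, lsr)
for entry in trace:
    print(entry)
```

The point for a hardware model is cycle 5: the bus sees a write of the old value before the write of the new one, so both writes need to appear on the pins.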
Bryan Posted July 6, 2005

All zero-page instructions use an 8-bit address (they actually shut off the high byte), so they wrap at $FF.

Oh, and the indirect jump from $XXFF issue is not really a bug. It's simply part of keeping the chip simple (align your vectors with even addresses, and it won't happen!).

-Bry

P.S. The 6502 and 65C02 behave differently on the wasted cycles. The 65C02 makes sure that they are harmless re-reads of the previously read address, but on the 6502, they can be reads or writes involving incompletely calculated addresses or data.

Edited July 6, 2005 by Bryan
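Applied to the original question about LDA ($FF,X) with X=0, the wrap at $FF means both pointer bytes come from page zero. A minimal sketch, assuming a flat byte-array memory and a made-up helper name `indexed_indirect_address`:

```python
def indexed_indirect_address(memory: bytearray, operand: int, x: int) -> int:
    """Effective address for the (zp,X) addressing mode.

    Hypothetical helper: both the index addition and the pointer
    increment are done in 8 bits, so the pointer never leaves page zero.
    """
    pointer = (operand + x) & 0xFF
    lo = memory[pointer]
    hi = memory[(pointer + 1) & 0xFF]
    return (hi << 8) | lo


memory = bytearray(0x10000)
memory[0x00FF] = 0x34  # pointer low byte
memory[0x0000] = 0x12  # pointer high byte, fetched after the wrap
memory[0x0100] = 0x56  # never read by this addressing mode

print(hex(indexed_indirect_address(memory, 0xFF, 0x00)))
```

With these values the address comes out as $1234, built from $00FF and $0000; the byte at $0100 plays no part.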
LocalH Posted July 6, 2005

Well, it's a bug in the sense that I'm pretty sure JMP is the only instruction that exhibits that behavior. So, if you do LDA ($0FFF,X) with X=0, then I'm sure it will read from $0FFF and $1000. Like I said, I haven't done them myself, as I do tend to align my vectors on even bytes. But nonetheless, if you're aiming to replicate 100% accurate behavior, then you must take it into account.
Bryan Posted July 6, 2005

LocalH wrote:
    So, if you do LDA ($0FFF,X) with X=0, then I'm sure it will read from $0FFF and $1000.

Actually, it will read from $0FFF and $0F00.

-Bry
Cybernoid Posted July 6, 2005

I didn't think that "LDA ($0FFF,X)" was possible, since the instruction is only 2 bytes long. I think you can only do a zero page memory location: "LDA ($FF,X)".

OK, so it looks like the (Indirect,X) addressing is understood to wrap around. I know that the "Zero Page,X" addressing wraps around. What is strange is that the (Indirect),Y addressing, I believe, does take the page boundary into account for the final address, but not for the initial zero page address... I think...

That is, the following will happen (please correct me if I am wrong):

Y=$23
LDA ($FF),Y
Memory @ address $00FF = $FE
Memory @ address $0000 = $40

Cycle: what happens
--------------------------
1: read opcode $B1
2: read $FF
3: read byte from $00FF (=$FE)
4: read byte from $0000 (=$40), add Y to $FE = $121
5: add 1 to high byte ($40), now $41
6: load A from $4121

I think this is what happens, but if someone wants to double check cycle 4 above and see if the 6502 does indeed read from $0000 or from $0100...

Thanks again,
Chris

Edited July 6, 2005 by Cybernoid
Bryan Posted July 6, 2005

Cybernoid wrote:
    I didn't think that the "LDA ($0FFF,X)" was possible, since the instruction is only 2 bytes long. I think you can only do a zero page memory location: "LDA ($FF,X)".

You're right. I got mixed up with the discussion of JMP().

-Bry
LocalH Posted July 6, 2005

Well, I haven't actively programmed 6502 in about a year and a half, so I guess I need to bone up on it again. =P
djmips Posted July 6, 2005

Cybernoid wrote:
    I think this is what happens, but if someone wants to double check cycle 4 above and see if the 6502 does indeed read from $0000 or from $0100...

I think that's covered in the doc I mentioned (but maybe my indirect addressing confused you). It's in http://www.viceteam.org/plain/64doc.txt under "Indirect indexed addressing". Three different outcomes are listed there, depending on whether it is a Read, Read-Modify-Write or Write instruction. For your example of a Read instruction, I'll cut and paste. You will see there is a possibility of an extra read, in which case the first read happens at an invalid address. You should review the other addressing modes to confirm that you are getting all of the dummy reads correct. (Note: I cannot guarantee that this document is authoritative.)

Indirect indexed addressing

Read instructions (LDA, EOR, AND, ORA, ADC, SBC, CMP)

 #    address   R/W description
--- ----------- --- ------------------------------------------
 1      PC       R  fetch opcode, increment PC
 2      PC       R  fetch pointer address, increment PC
 3    pointer    R  fetch effective address low
 4   pointer+1   R  fetch effective address high,
                    add Y to low byte of effective address
 5   address+Y*  R  read from effective address,
                    fix high byte of effective address
 6+  address+Y   R  read from effective address

Notes: The effective address is always fetched from zero page,
       i.e. the zero page boundary crossing is not handled.

       * The high byte of the effective address may be invalid
         at this time, i.e. it may be smaller by $100.

       + This cycle will be executed only if the effective address
         was invalid during cycle #5, i.e. page boundary was crossed.

Edited July 6, 2005 by djmips
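The table can be turned into a short sketch that reports the final address, the possibly invalid address used in cycle 5, and the cycle count for a read instruction. The helper name `indirect_indexed_read` and the flat byte-array memory are assumptions made for the example; the test values are the ones from Cybernoid's post.

```python
def indirect_indexed_read(memory: bytearray, operand: int, y: int):
    """Model the (zp),Y addressing mode for a read instruction.

    Returns (effective_address, cycle5_address, cycles). Hypothetical
    helper that follows the cycle table quoted above.
    """
    lo = memory[operand]                    # cycle 3: pointer
    hi = memory[(operand + 1) & 0xFF]       # cycle 4: pointer+1, wraps in page zero
    low_sum = lo + y
    page_crossed = low_sum > 0xFF
    # Cycle 5 uses the unfixed high byte, so it may be $100 too low.
    cycle5_address = (hi << 8) | (low_sum & 0xFF)
    effective = (cycle5_address + (0x100 if page_crossed else 0)) & 0xFFFF
    cycles = 6 if page_crossed else 5
    return effective, cycle5_address, cycles


memory = bytearray(0x10000)
memory[0x00FF] = 0xFE
memory[0x0000] = 0x40

effective, cycle5_address, cycles = indirect_indexed_read(memory, 0xFF, 0x23)
print(hex(effective), hex(cycle5_address), cycles)
```

For Y=$23 this gives a final address of $4121 in 6 cycles, with the dummy read in cycle 5 landing on $4021. The high pointer byte is taken from $0000, not $0100.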
Cybernoid Posted July 6, 2005

Thanks!!! Yup, I am just now getting a chance to read through this doc. Okay, this doc definitely covers the zero page boundary issues. I did not realize that the write operations will always take 6 cycles regardless... grrr... that will have to be handled properly.

I am starting to look at the "undocumented" instructions. I did not code for instructions taking more than 7 cycles... I will have to fix this for the read-modify-write instructions. Interesting.

Thanks!
Chris
supercat Posted July 6, 2005

Cybernoid wrote:
    I am starting to look at the "undocumented" instructions.

What do SLO, SRE, RLA, RRA, ISB, and DCP do?
djmips Posted July 6, 2005

supercat wrote:
    What do SLO, SRE, RLA, RRA, ISB, and DCP do?

SLO  shift left memory and OR with A
SRE  shift right memory and EOR with A
RLA  rotate left memory and AND with A
RRA  rotate right memory and ADC with A
ISB  increment memory and SBC from A
DCP  decrement memory and CMP with A

I've also seen some alternative mnemonics.
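Two of these can be sketched directly from the one-line descriptions. The flag handling below assumes each compound instruction sets flags the way its two component instructions would (ASL then ORA, DEC then CMP); that is an assumption of this sketch rather than something stated in the thread, and the function names are hypothetical.

```python
def slo(a: int, mem: int):
    """SLO: shift memory left, then OR the result into A.

    Returns (new_a, new_mem, carry). Carry is assumed to come from
    the shift, as with ASL.
    """
    carry = (mem >> 7) & 1
    mem = (mem << 1) & 0xFF
    a = (a | mem) & 0xFF
    return a, mem, carry


def dcp(a: int, mem: int):
    """DCP: decrement memory, then compare A with the result.

    Returns (new_mem, carry, zero, negative). Flags are assumed to
    follow CMP.
    """
    mem = (mem - 1) & 0xFF
    diff = (a - mem) & 0xFF
    carry = 1 if a >= mem else 0
    zero = 1 if a == mem else 0
    negative = diff >> 7
    return mem, carry, zero, negative


print(slo(0x01, 0x81))
print(dcp(0x10, 0x11))
```

The remaining four (SRE, RLA, RRA, ISB) follow the same pattern, with RRA and ISB also needing the full ADC/SBC flag logic.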
djmips Posted July 6, 2005

Did you know. It's time for 'Did you know'

... that modern 6502s exist and that the instruction set has been filled out with BRA (branch always), BBR0 through BBR7 (branch on bit reset), BBS0 through BBS7 (branch on bit set), RMB0 through RMB7 (reset memory bit) and SMB0 through SMB7 (set memory bit).....
LocalH Posted July 6, 2005

Which is why you can only use illegals if you're pretty sure they're consistent on a platform, since they will break on the expanded instruction set. Even so, there are some illegals that are unstable between, say, the C64 and C128, or even between different vintages of C64. I'm sure the same is true to some extent with other 6502-based systems. But the ones that are stable come in handy for the tightest of the tight cycle-timed code, such as is common on the C64.
supercat Posted July 7, 2005

djmips wrote:
    ... that modern 6502s exist and that the instruction set has been filled out with BRA (branch always), BBR0 through BBR7 (branch on bit reset) BBS0 through BBS7 (branch on bit set), RMB0 through RMB7 (reset memory bit) and SMB0 through SMB7 (set memory bit).....

Some nice features added, but ADC #immed sometimes takes 3 cycles now instead of two.

BTW, I never did understand why there wasn't a 'BIT #immed' instruction. Would have seemed logical.
djmips Posted July 7, 2005

I always wanted shift and rotate by n instructions that took 2 cycles. (barrel shifter)

Edited July 7, 2005 by djmips
Bryan Posted July 7, 2005

Cybernoid wrote:
    I am starting to look at the "undocumented" instructions. I did not code for instructions taking more than 7 cycles... I will have to fix this for the read-modify-write instructions. Interesting.

Here's an explanation: Many 6502 instructions sorta sit on top of each other. That is, you get a particular behavior by enabling one output function or another (specified by the ending 01 or 10 bits of the opcode). When you specify both functions (11), you get a (sometimes longer) compound instruction. The 6502 performs instructions in phases, and will wait for one to finish before moving on to the next.

Here's an example: ASO (opcode $0F) is an ASL followed by an ORA.

ASL/ORA phase 1 - get opcode
  cycle 1: read opcode (select instruction)

ASL/ORA phase 2 - handle addressing mode
  cycle 2: read address
  cycle 3: read address
  cycle 4: read data -> temp register

ASL phase 3 - ALU to temp register
  cycle 5: write unmodified data (& perform shift on temp register)
  cycle 6: write modified data

Now, we've also selected an ORA operation, which signals that A should be relatched concurrent with the next opcode fetch cycle.

ORA phase 3 - ALU to A
  cycle 5/1: read next opcode and re-latch A through the OR (A|temp register) logic.

This is my best understanding of what is going on from everything I've read.

-Bry
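The "ending 01 or 10" idea can be checked against three adjacent absolute-mode opcodes: $0D (ORA), $0E (ASL) and $0F (the undocumented ASO/SLO). The sketch below is a deliberate simplification of the decode logic; the function name and the two labels `alu_to_a` and `rmw_to_memory` are invented for the example, and not every opcode ending in 01 or 10 fits these labels (stores and the X-register loads, for instance).

```python
def output_functions(opcode: int) -> dict:
    """Split an opcode's two low bits into the two 'output functions'
    described above. Hypothetical, simplified decode."""
    low_bits = opcode & 0b11
    return {
        "alu_to_a": bool(low_bits & 0b01),       # e.g. ORA: result latched into A
        "rmw_to_memory": bool(low_bits & 0b10),  # e.g. ASL: result written back
    }


ORA_ABS = 0x0D
ASL_ABS = 0x0E
ASO_ABS = 0x0F  # also known as SLO

for name, opcode in (("ORA", ORA_ABS), ("ASL", ASL_ABS), ("ASO", ASO_ABS)):
    print(name, format(opcode, "08b"), output_functions(opcode))

# The compound opcode is the bitwise OR of the two documented ones.
print(hex(ORA_ABS | ASL_ABS))
```

Seen this way, a hardware implementation that decodes the two low bits independently gets compound instructions of this kind almost for free, which matches the explanation above.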