Cycle counting-- page boundary on indirect access?


Propane13

Hello!

 

So, I have a bug that I've been trying to squish, but I don't really understand where the bug lies.

 

I have a loop, and that loop on occasion takes too many cycles.

By "on occasion," I mean that in one special situation, for one single frame, I end up at 77 cycles.

 

My loop has two items in it that, according to the cycle counting guide, could take a variable number of cycles.

 

Item 1 is the end of the loop of course:

DEY

DEX

BPL PFScanline4Loop

 

The instance seems to happen the last time through, when the branch is NOT taken.

From what I read, I believe that the only time that this page-boundary issue occurs is when the branch is taken, and the branch crosses a page boundary.

So, if my interpretation is right, that means that PFScanline4Loop starts at say, $FEF0 and the end of the loop is at $FF10, so when the branch is taken, the branch crosses the $FF00 page, adding an extra cycle. Is that the correct interpretation? If so, not taking the branch would have no effect, correct?

 

That leads me to the only other variable parameter in the loop.

In two places, I have the following code:

LDA (P1Color),Y

STA COLUP1

 

...

 

LDA (P1GfxPtr),Y

STA GRP1

 

I am thinking that for the one instance where there's a problem, one (or both) of these fetches is crossing a page boundary.

The problem is... I have absolutely no idea what this means.

 

P1GfxPtr and P1Color are in zero-page. So, there's no need to worry about a page boundary there.

Of course they point to a physical location.

 

Let's say that location is $FEF0.

Does that mean that if we add Y to it, and Y is 17, I'll have an issue because then the added result crosses a page boundary and results at $FF01?

 

Just making sure my interpretation is correct here.

 

Thanks!

-John

 

 


Branch not taken always = 2 cycles. Page crossing only comes into effect for branches that are taken.

 

Indexed indirect addressing, (nn,X), isn't affected by page boundaries. In fact, the zero-page address nn+X will just wrap around from $FF to $00, as no 16-bit processing of the operand takes place.

 

Edit: fixed. (nn),Y is affected by page boundaries.
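To make the zero-page wraparound concrete, here is a small Python model of how the (nn,X) pointer is located. It is only a sketch, not 6502 code: `mem` is a hypothetical 256-byte list standing in for zero page, and the function name is made up for illustration.

```python
def indexed_indirect_addr(mem, zp, x):
    """Effective address for (zp,X), where mem models the 256 bytes of zero page."""
    ptr = (zp + x) & 0xFF          # 8-bit add: wraps from $FF to $00, stays in page zero
    lo = mem[ptr]                  # pointer low byte
    hi = mem[(ptr + 1) & 0xFF]     # pointer high byte, also fetched with wraparound
    return lo | (hi << 8)
```

With an operand of $FF and X = 2, the pointer is read from $01/$02 rather than from $0101, and the instruction time does not change.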

Edited by Rybags

The instance seems to happen the last time through, when the branch is NOT taken.

From what I read, I believe that the only time that this page-boundary issue occurs is when the branch is taken, and the branch crosses a page boundary.

Correct!

So, if my interpretation is right, that means that PFScanline4Loop starts at say, $FEF0 and the end of the loop is at $FF10, so when the branch is taken, the branch crosses the $FF00 page, adding an extra cycle. Is that the correct interpretation? If so, not taking the branch would have no effect, correct?

Not quite. If the branch is not taken, the instruction takes only 2 cycles. For a branch:

 

2 cycles - Not taken

3 cycles - Taken

4 cycles - Taken over a page boundary.
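The three cases can be written down as a small Python model (a sketch, not 6502 code). The function name and parameters are hypothetical, and it assumes the page comparison is made between the target and the address of the instruction following the branch, since that is the program counter value the offset is added to.

```python
def branch_cycles(branch_addr, target_addr, taken):
    """Cycles used by a relative branch (BPL, BNE, ...) located at branch_addr."""
    if not taken:
        return 2
    # Branch instructions are 2 bytes long, so the PC already points past them.
    next_instr = (branch_addr + 2) & 0xFFFF
    same_page = (next_instr >> 8) == (target_addr >> 8)
    return 3 if same_page else 4
```

For a BPL at $FF10 that targets $FEF0, `branch_cycles(0xFF10, 0xFEF0, True)` gives 4 and `branch_cycles(0xFF10, 0xFEF0, False)` gives 2.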

P1GfxPtr and P1Color are in zero-page. So, there's no need to worry about a page boundary there.

Of course they point to a physical location.

Correct!

Let's say that location is $FEF0.

Does that mean that if we add Y to it, and Y is 17, I'll have an issue because then the added result crosses a page boundary and results at $FF01?

When using indirect indexed addressing, it's easier to think of it as causing an extra carry inside the CPU. In your example, when you add 17 to the LSB of the source address ($F0), the result is $101. This generates an "address carry" (for want of better words), which can only be resolved by adding another cycle to the instruction time.

 

EDIT: If there's no carry generated, the extra cycle isn't needed.
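As a worked version of that carry rule, here is a minimal Python sketch (again a model, not 6502 code). The function name is hypothetical, and `pointer` stands for the 16-bit address stored in the zero-page pair, such as the one held in P1GfxPtr.

```python
def lda_indirect_y_cycles(pointer, y):
    """Cycles for LDA (zp),Y, given the 16-bit pointer held in zero page."""
    carry = ((pointer & 0xFF) + y) > 0xFF   # carry out of the low address byte
    return 6 if carry else 5                # 5 cycles normally, +1 when the carry occurs
```

With the pointer at $FEF0, Y = 17 lands on $FF01 and costs 6 cycles, while Y = 15 lands on $FEFF and costs 5.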

Edited by GroovyBee

When I add cycle counts next to my code, I like to put one asterisk to indicate the possible addition of another cycle, and two asterisks for the possible addition of two cycles. I usually show a minimum of three columns-- the number of cycles for the current instruction, the total number of cycles for the current line, and the color clock for the current line (total cycles times 3). I also like to indicate the "entry" cycle for the beginning of a loop, and an exit cycle for a subroutine (so if I say "JSR something," and the timing is important, I'll know that I need to enter the subroutine at a certain cycle, and that I'll be at a certain cycle when it comes back).

 

   LDA #0     ; +02  ; ??  ; ???
   STA WSYNC  ; +03  ; 00  ; 000
   STA VBLANK ; +03  ; 03  ; 009
   LDX #192   ; +02  ; 05  ; 015

loop          ; enter : 05

   INC COLOR  ; +05  ; 10  ; 030
   LDA COLOR  ; +03  ; 13  ; 039
   STA COLUBK ; +03  ; 16  ; 048
   STA WSYNC  ; +03  ; 00  ; 000
   DEX        ; +02  ; 02  ; 006
   BNE loop   ; +02**; 04**; 012**

   LDA #2     ; +02  ; 06  ; 018
   STA VBLANK ; +03  ; 09  ; 027

I find that the asterisks help if I run into timing glitches and need to quickly zero in on the portion of the code that might be taking a little longer than I was expecting. And, yes, obviously the entry cycle for a loop might not be exactly the same when the program first falls into the loop versus when it goes back to the loop, so I might say something like "enter : 67 or before," or maybe even something like "enter : 45 to 67" if the loop requires that the program not enter before a certain time. And if the entry cycle is variable, I usually count the cycles *inside* the loop based on what they are the second time through, or maybe give multiple sets of cycle counts if necessary. Of course, cycle-counting is usually only needed when the timing is critical and needs to be pretty damn specific, so most of the time it turns out that only one set of cycle counts is needed for a loop.
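The running totals in a listing like the one above are easy to get wrong by hand, so here is a minimal Python sketch that recomputes the last two columns. The `tally` function and the `listing` data are made up for illustration, and it assumes that a write to WSYNC leaves the CPU at cycle 0 of the next scanline.

```python
def tally(instructions):
    """Return (text, cycles, running_total, color_clock) for each instruction."""
    rows = []
    total = 0
    for text, cycles in instructions:
        if "WSYNC" in text:
            total = 0              # CPU halts until the next scanline starts
        else:
            total += cycles
        rows.append((text, cycles, total, total * 3))
    return rows

# Hypothetical listing: (instruction text, minimum cycle count)
listing = [
    ("STA WSYNC", 3),
    ("STA VBLANK", 3),
    ("LDX #192", 2),
    ("INC COLOR", 5),
    ("LDA COLOR", 3),
    ("STA COLUBK", 3),
]

for text, cycles, total, clock in tally(listing):
    print(f"{text:<11}; +{cycles:02d}  ; {total:02d}  ; {clock:03d}")
```

The sketch uses minimum cycle counts only, so any instruction that would carry an asterisk still needs to be checked by hand.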

 

Michael


It's rarely a problem. In some cases, a 4-cycle branch can be useful (since the 6502 lacks a 4-cycle JMP). If you are only skipping a byte ahead (jumping over a single-byte instruction), you can use a CMP # opcode to do it in 2 cycles (or NOP # if you don't want to mess up flag status).

 

 

Something that may be useful to you is to set up a macro to test your code in any cycle-critical loops or routines.

   MAC CHECKPAGE ; thx TZ
      IF >. != >{1}
         ECHO ""
         ECHO "ERROR: different pages! (", {1}, ",", ., ")"
         ECHO ""
         ERR
      ENDIF
   ENDM

 

To call it, just place it right after the branch (your starting point) and give it the name of the destination, e.g.:

   ldx #$04
Waste19cycles:
   dex
   bne Waste19cycles
   CHECKPAGE Waste19cycles

 

An error will be reported during assembly if the pages differ for the loop.

Edited by Nukey Shay

  • 3 weeks later...
  • 9 years later...

...

   MAC CHECKPAGE ; thx TZ
      IF >. != >{1}
         ECHO ""
         ECHO "ERROR: different pages! (", {1}, ",", ., ")"
         ECHO ""
         ERR
      ENDIF
   ENDM
...

 

 

Note that the CHECKPAGE macro has an off-by-one error as it compares with the page of the current program position instead of "current program position - 1".

This should fix it:

    MAC CHECK_PAGE
.PREV_POS   SET . -1
        IF >.PREV_POS != >{1}
            ECHO ""
            ECHO "ERROR: different pages! (", {1}, ",", .PREV_POS, ")"
            ECHO ""
            ERR
        ENDIF
    ENDM
Edited by Dionoid