Video timing outside the kernel

+batari · February 24, 2005

Hello all,

I'm a greenhorn 2600 programmer. Not exactly a newbie, so this is posted here.

I'm currently trying to work some timing glitches out of my first game. Outside of the display kernel, there are a lot of loops and branches with hard-to-predict timing and I'm finding it hard to get it just right with WSYNC loops. I've looked into this PIA chip, and I don't quite understand it, but it has a few interval timers that might be very useful if they do what I hope they do.

Now, if you set the timer at the beginning of vertical blank, does the PIA halt the 6507 until it reaches zero (making it less useful) or does the 6507 keep on running, so that you could simply start it at the beginning of VBLANK, then run through a variable number of cycles, then check it in a loop for zero when you're done with your work, then just do a single WSYNC and you'd be in perfect sync every time? Please tell me this is so, since this would mean that my timing glitches will be easy to fix, and I can take out all of those wasteful WSYNC loops and free up some precious ROM space.

If the PIA does halt the 6507, what is its real utility? Is the only option to tediously count perhaps thousands of cycles or to just keep hacking until you get it right?

Nukey Shay · February 24, 2005

AFAIK, the countdown timer just counts down to zero apart from the program running. You can stick a value in one of the registers whenever you want. By using a delay loop later on with INTIM...you can keep your program executing at uniform periods.

vdub_bobby · February 24, 2005

AFAIK, the countdown timer just counts down to zero apart from the program running. You can stick a value in one of the registers whenever you want. By using a delay loop later on with INTIM...you can keep your program executing at uniform periods.

In addition, I believe that after INTIM hits zero it will continue decrementing once per cycle, regardless of which timer you initially set up. I think.

EricBall · February 24, 2005

Now, if you set the timer at the beginning of vertical blank, does the PIA halt the 6507 until it reaches zero (making it less useful) or does the 6507 keep on running, so that you could simply start it at the beginning of VBLANK, then run through a variable number of cycles, then check it in a loop for zero when you're done with your work, then just do a single WSYNC and you'd be in perfect sync every time?

First, it's a RIOT (a 6532) and it only has one timer. You can make the timer count down once every 1, 8, 64 or 1024 CPU cycles by writing to different memory locations (TIM1T, TIM8T, TIM64T and T1024T respectively). There are also some interrupt options, but they aren't useful in the 2600 because the interrupt line isn't connected.

The usual way to use the RIOT timer is to write to TIM64T with 76/64 * the number of lines to wait, then put a STA WSYNC, LDA INTIM, BNE loop at the bottom. You may need to adjust the TIM64T slightly so watch the Z26 line counter or log file.
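
As a rough sketch of that arithmetic only: the helper name `tim64t_for_lines` is made up for illustration, and rounding down is an assumption. The result is a starting point that will probably still need tuning by a count or so.

```c
#include <assert.h>

/* CPU cycles per scanline on the 2600, and CPU cycles per TIM64T tick. */
enum { CYCLES_PER_LINE = 76, CYCLES_PER_TICK = 64 };

/* Hypothetical helper: value to write to TIM64T so that the timer runs
   for roughly `lines` scanlines. Rounds down (an assumption). */
int tim64t_for_lines(int lines)
{
    assert(lines > 0);
    int ticks = lines * CYCLES_PER_LINE / CYCLES_PER_TICK;
    assert(ticks <= 255);   /* the timer register is only 8 bits wide */
    return ticks;
}
```

For 37 lines this gives 43, and for 30 lines it gives 35.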

+batari · February 24, 2005

Thanks for the help. This will make programming a little easier. I tried using TIM64T for 30 lines in overscan then 37 in vblank, and it seems to have helped, but it didn't fix all of the timing problems. There must be some issues in the display kernel that I missed.

EricBall · February 24, 2005

z26.log is your friend. Yes, it's gi-normous, but HD space shouldn't be a major problem for anyone these days. There are also ways you can make it easier on yourself:

Use WSYNC when waiting for the timer
Put a distinctive STA $xxxx before or after code you want to watch. You can then FIND it easily (or even use the command line FIND command to display just those lines).

There is a gotcha with the #lines*76/64 STA WSYNC/LDA INTIM loop. Do you see it? What happens if the INTIM transition to zero occurs during the WSYNC? The code then misses it and waits another 256 cycles (3-4 lines) for the timer to finish its overflow count. Make sure your STA TIM64T occurs a fixed number of cycles after WSYNC, then use z26.log to tune the TIM64T value.
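
A simplified model of that gotcha, for illustration only. It assumes INTIM reads zero for exactly one 64-cycle tick starting `value * 64` cycles after the write (real RIOT timing differs by a cycle or so), and that the loop reads INTIM once per 76-cycle line at a fixed offset `phase`. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { CYCLES_PER_LINE = 76, CYCLES_PER_TICK = 64 };

/* Simplified model (an assumption, not exact RIOT behaviour): after
   writing `value` to TIM64T at cycle 0, INTIM reads zero during cycles
   [value*64, value*64 + 64). The polling loop reads INTIM at cycles
   phase, phase+76, phase+152, ...
   Returns true if one of those reads lands inside the zero window. */
bool poll_sees_zero(int value, int phase)
{
    assert(value >= 0 && value <= 255);
    assert(phase >= 0 && phase < CYCLES_PER_LINE);
    int zero_start = value * CYCLES_PER_TICK;
    int zero_end = zero_start + CYCLES_PER_TICK;    /* exclusive */
    for (int t = phase; t < zero_end; t += CYCLES_PER_LINE) {
        if (t >= zero_start)
            return true;    /* this read sees INTIM == 0 */
    }
    return false;           /* consecutive reads stepped over the window */
}
```

With a value of 43, a read phase of 20 catches the zero but a phase of 10 does not. In this model 12 of the 76 possible phases miss, which is why pinning the phase with a WSYNC before the STA TIM64T matters.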

+batari · February 24, 2005

I checked TIMINT instead of INTIM for zero. What's the difference?

And I also didn't have a WSYNC before the STA TIM64T. I see the gotcha now. I also put in a temporary bug trap to see if I ever pass zero, as:


timernotzero    LDA TIMINT
                BNE timernotzero
infiniteloop    BMI infiniteloop ;bug trap

So far no bugs have been caught, but if the screen suddenly goes blank, I know that the zero was missed, or at least that's what this is intended to do.

I'm going to need to start using Z26 instead of Stella, at least for the timing issues. I usually work on a Mac, but realistic Mac users like me also have a PC... Sometimes the right software just isn't available otherwise.

vdub_bobby · February 24, 2005


timernotzero    LDA TIMINT
                BNE timernotzero
infiniteloop    BMI infiniteloop ;bug trap

That won't catch anything.

The only way the CPU will get to the 'bmi infiniteloop' instruction is if A == 0, so that branch will never be taken.

+batari · February 24, 2005

You're absolutely right. I need to switch the order of the branches. :dunce:


timernotzero    LDA TIMINT
infiniteloop    BMI infiniteloop ;bug trap
                BNE timernotzero

No wonder it didn't catch any bugs.

kisrael · February 24, 2005

From a practical standpoint, I'd say this: you don't have to worry about counting time during the blank period.

A few times during the development of JoustPong, I was sure "whoops, this is it, I've run out of time during the game logic and am still processing when I should be drawing on the screen" and it was ALWAYS something else. For most beginner's games, you have more-or-less plenty of time. (For stuff like chess, or super calculation intensive stuff, this might not hold true)

In short, the TIM64T-VBLANK, 30-WSYNC'd Overscan pattern that I describe in 2600 101 really does a swell job. I'd wager that you'll have plenty of time in the VBLANK...and can just do 30 scanlines of overscan with a counter, not a timer. If you really do run out of VBLANK time, then you can just set up a timer for the overscan, which gives you almost double the amount of available cycles.

vdub_bobby · February 24, 2005

From a practical standpoint, I'd say this: you don't have to worry about counting time during the blank period.

A few times during the development of JoustPong, I was sure "whoops, this is it, I've run out of time during the game logic and am still processing when I should be drawing on the screen" and it was ALWAYS something else. For most beginner's games, you have more-or-less plenty of time. (For stuff like chess, or super calculation intensive stuff, this might not hold true)

In short, the TIM64T-VBLANK, 30-WSYNC'd Overscan pattern that I describe in 2600 101 really does a swell job. I'd wager that you'll have plenty of time in the VBLANK...and can just do 30 scanlines of overscan with a counter, not a timer. If you really do run out of VBLANK time, then you can just set up a timer for the overscan, which gives you almost double the amount of available cycles.

I'd agree with this...to a point: My first stab at a 2600 game involved a horizontally scrolling background (PF), which meant a lot of rotations. A LOT of rotations. Like 2800 rotations per frame! So right from the get-go I had to split stuff between VBLANK and overscan. Actually, I'm still not sure how to solve that one.

And then I ran into the same issue with my fishies game! Again, it's those darn rotations! I am using a random-number generator to pick which fish to bring in from the side (up to eight times per frame), and my random-number generator uses a bunch of rotations. About 5 rotations per random bit, and I needed 7 bits per fish (i.e., the maximum (which was hit every time you started a game) was 7 * 5 * 8 = 280 'rol ZP' per frame - at 5 cycles each, I was already at 1400 cycles of VBLANK time without doing anything else!)

I ended up having to limit the number of times I would call the random-number function per frame (which wasn't a big deal, but it took me a while to figure it out).

So my advice would be to not worry about it unless your screen starts blanking momentarily or rolling. And watch out for those rotations!
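
To put the rotation arithmetic above next to the available time, here is a small sketch. The function names are made up for illustration, and the 37-line vblank used in the note below is an assumption taken from the figure mentioned earlier in the thread.

```c
#include <assert.h>

enum { CYCLES_PER_LINE = 76 };  /* CPU cycles per scanline */

/* Cycles spent on rotations in one frame, following the arithmetic in
   the post: bits * rotations per bit * calls * cycles per rotation. */
int rotation_cycles(int bits, int rot_per_bit, int calls, int cycles_per_rot)
{
    assert(bits >= 0 && rot_per_bit >= 0 && calls >= 0 && cycles_per_rot >= 0);
    return bits * rot_per_bit * calls * cycles_per_rot;
}

/* CPU cycles available in a blanking period of `lines` scanlines. */
int blank_cycles(int lines)
{
    assert(lines >= 0);
    return lines * CYCLES_PER_LINE;
}
```

Here `rotation_cycles(7, 5, 8, 5)` is 1400 and `blank_cycles(37)` is 2812, so in that worst case the random-number calls alone eat about half of a 37-line vblank.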

Thomas Jentzsch · February 25, 2005

I am using a random-number generator to pick which fish to bring in from the side (up to eight times per frame), and my random-number generator uses a bunch of rotations. About 5 rotations per random bit, and I needed 7 bits per fish (i.e., the maximum (which was hit every time you started a game) was 7 * 5 * 8 = 280 'rol ZP' per frame - at 5 cycles each, I was already at 1400 cycles of VBLANK time without doing anything else!)

Then use something simpler, like this:

 lda random
 lsr
 bcc .skipEOR
 eor #$b2       ; several other values possible here
.skipEOR:
 sta random

One whole new byte in just a few cycles.

vdub_bobby · February 25, 2005

Then use something simpler, like this:

 lda random
 lsr
 bcc .skipEOR
 eor #$b2       ; several other values possible here
.skipEOR:
 sta random

One whole new byte in just a few cycles.

Yeah, I saw that ...somewhere. Maybe in the Stella archives. With that, plus a few other optimization tricks I think I will revisit my 1K fishies game at some point...maybe I can add some music back in to that version.

Thanks,

-bob

+batari · February 25, 2005

 lda random
 lsr
 bcc .skipEOR
 eor #$b2       ; several other values possible here
.skipEOR:
 sta random

This looks great, and even smaller than the original random number routine I "borrowed". But I have to ask, how well does it really work?

Thomas Jentzsch · February 25, 2005

This looks great, and even smaller than the original random number routine I "borrowed". But I have to ask, how well does it really work?

It's an implementation of an LFSR (Linear Feedback Shift Register) and creates a repeating sequence of 255 different bytes when using the right EOR values.

While this is not perfectly random by far, it is random enough for most of our quite simple purposes.

+batari · February 25, 2005

I tried this code (translated to C) and when seeded with $FF, it indeed produced 255 different bytes in pseudo-random order, except zero. This will save me some precious bytes of ROM. Thanks for sharing.
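
The C used for that test isn't shown in the thread. A translation along these lines would do the job; the function names are invented for illustration.

```c
#include <stdint.h>

/* One step of the 8-bit generator: shift right, and if a 1 bit fell off
   the end, EOR with the feedback value (lsr / bcc / eor #$b2). */
uint8_t lfsr8_next(uint8_t x)
{
    uint8_t carry = x & 1u;
    x >>= 1;
    if (carry)
        x ^= 0xB2u;
    return x;
}

/* Number of steps until the sequence returns to `seed`.
   Gives up after 256 steps and returns 0. */
int lfsr8_period(uint8_t seed)
{
    uint8_t x = seed;
    for (int n = 1; n <= 256; n++) {
        x = lfsr8_next(x);
        if (x == seed)
            return n;
    }
    return 0;
}
```

`lfsr8_period(0xFF)` comes out as 255, matching the result reported above. `lfsr8_next(0)` is 0, so zero is never produced from a nonzero seed and must not be used as the seed.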

supercat · June 3, 2005

One whole new byte in just a few cycles.

Yeah, but there's a 1:1 correspondence between old and new byte values.

I'd tend to think something like:

 lda random1
 lsr
 ror random2
 ror random3
 ror random4
 bcc nocarry
 eor #$A3
 sta random1
nocarry:
 ...

would yield more randomish behavior with a period of 4,294,967,295. If you only want to use three bytes for your generator, eliminate the "ror random4" instruction and use an "eor" value of #$DB. If you want to use two bytes, use an "eor" value of #$BD. Note that it is possible to run one of these generators 'in reverse' if necessary (e.g. if using it to generate random playfield data in an environment where a player can fly up or down).

 lda random1
 clc
 adc #128       ; carry = old bit 7, i.e. whether the forward step did the eor
 bcc nocarry
 eor #$A3
nocarry:
 rol random4
 rol random3
 rol random2
 rol
 sta random1

Cute, eh?

+Andrew Davie · June 25, 2005

  lda random 
 lsr 
 bcc .skipEOR 
 eor #$b2      ; several other values possible here 
.skipEOR: 
 sta random
This looks great, and even smaller than the original random number routine I "borrowed". But I have to ask, how well does it really work?

Not very. If it ever gets 0, it gets stuck on 0 -- it presumably never generates 0, and must not be initialised 0. I use a modified version that puts a branch to the eor, before the lsr, if it is 0 -- thus forcing it out of that 0 state if it ever occurs.

Cheers

A
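
The modified routine itself isn't posted, so the following is only a guess at its shape, written in C for easy checking: a zero state skips the shift and goes straight to the EOR. The function name is hypothetical.

```c
#include <stdint.h>

/* Sketch of a zero-proof variant (an assumption about the unposted code):
   if the state is 0, skip the shift and go straight to the EOR, which
   yields $B2. Any other state steps exactly like the plain routine. */
uint8_t lfsr8_next_nozero(uint8_t x)
{
    if (x != 0) {
        uint8_t carry = x & 1u;
        x >>= 1;
        if (!carry)
            return x;       /* bcc .skipEOR */
    }
    return (uint8_t)(x ^ 0xB2u);
}
```

A state of 0 now steps to $B2 and rejoins the normal sequence, and no nonzero state can step to 0.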

djmips · November 24, 2005

supercat,

I was having some issues with my 8 bit 'random' number generator (period was too low) and was looking back at some of the 16 and 32 bit LFSR posted here and elsewhere and I noticed that your routine (as posted) doesn't store random1 (for the forward LFSR).

It should be:

 lda random1
 lsr
 ror random2
 ror random3
 ror random4
 bcc nocarry
 eor #$A3
nocarry:
 sta random1
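
One way to check the forward routine against the reverse one posted earlier is to model both in C and confirm that each undoes the other. This is a sketch: the four bytes are packed into one 32-bit word with random1 as the top byte, and the function names are made up.

```c
#include <stdint.h>

/* State is packed as random1:random2:random3:random4, random1 on top. */
#define LFSR32_FEEDBACK 0xA3000000u     /* eor #$A3 applied to random1 */

/* Forward step: lsr random1 / ror random2..4, then EOR if a 1 fell out. */
uint32_t lfsr32_next(uint32_t s)
{
    uint32_t carry = s & 1u;
    s >>= 1;
    if (carry)
        s ^= LFSR32_FEEDBACK;
    return s;
}

/* Reverse step: bit 7 of random1 says whether the forward step did the
   EOR, and therefore which bit has to be shifted back in at the bottom. */
uint32_t lfsr32_prev(uint32_t s)
{
    uint32_t carry = s >> 31;
    if (carry)
        s ^= LFSR32_FEEDBACK;
    return (s << 1) | carry;
}
```

For any state `s`, `lfsr32_prev(lfsr32_next(s))` gives back `s`. The sketch does not try to confirm the period.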

supercat · November 24, 2005

I was having some issues with my 8 bit 'random' number generator (period was too low) and was looking back at some of the 16 and 32 bit LFSR posted here and elsewhere and I noticed that your routine (as posted) doesn't store random1 (for the forward LFSR)

I think you're probably right. The dangers of typing code from memory.

BTW, returning to the subject of the thread, there are a couple of tricks I've found useful when dealing with RIOT timing.

-1- If you can afford to make the kernel a subroutine, and some tasks might take too long to get done in one frame but you can't really predict the timing, it's helpful to load TIM64T with 125+the number of scan lines to wait, and then during your main loop, check the MSB of INTIM. If it's gone positive, call the kernel. The kernel should then wait for INTIM to finish counting down to 125. If you make sure that you check INTIM every 140 cycles or so, you don't have to worry about 'exactly' how long your vblank code takes. Sure, slowdowns can be annoying if they happen very much, but an occasional extra frame is nothing; much less annoying than a slipped sync.

When using this technique, you should #define a named label for the value 125 so it can be adjusted. Using a smaller value will leave less CPU time available for non-kernel stuff but make the kernel tolerant of longer polling intervals.

-2- If you limit yourself to using vblank rather than overscan, you can further minimize the risk of bad sync by adding a 'fallback' to the INTIM check. For example, if the top of the screen shows the score and it's supposed to be displayed starting when INTIM reaches 125 but when you get there INTIM is already 125 or below, you could wait until INTIM reaches, say, 119 and then jump to the part of the kernel below the score display. If you choose the proper 'backup' INTIM value, the effect of missing your first deadline will be that the score flickers but everything else on the screen remains solid.

Unfortunately, this approach is a bit harder when the overscan runs long. Even there, however, you're not defenseless. If you have some audio code that's supposed to be run once per frame, you can arrange things so that it happens before VSYNC if there's time, or after VSYNC if there isn't. The net effect would be that being a little late for the scheduled screen-bottom routine would cause the kernel to take longer before returning than it otherwise would (since the audio stuff will have to be handled after VSYNC) but screen flipping would be avoided.
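
For trick -1-, the trade-off behind the choice of 125 can be sketched with a rough model. It assumes bit 7 of INTIM clears when the count steps from 128 to 127 and that each further step takes 64 cycles. The names `KERNEL_BASE`, `time_to_call_kernel` and `polling_slack_cycles` are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Named so it can be tuned, as suggested above. */
#define KERNEL_BASE 125

/* Main-loop check: once bit 7 of INTIM has cleared, call the kernel. */
bool time_to_call_kernel(uint8_t intim)
{
    return (intim & 0x80u) == 0;
}

/* Rough model (an assumption): bit 7 clears when INTIM steps from 128 to
   127, and INTIM first reads `base` (127 - base) ticks of 64 cycles
   later. A main loop that polls at least this often reaches the kernel
   before INTIM has counted down to `base`. */
int polling_slack_cycles(int base)
{
    assert(base >= 0 && base <= 127);
    return (127 - base) * 64;
}
```

With a base of 125 this gives 128 cycles, in the same ballpark as the "every 140 cycles or so" figure. Lowering the base to 119 stretches it to 512 cycles, at the cost of six timer ticks (384 cycles) of main-loop time.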
