Squeezing Out Cycles - Please Help


MausBoy

With the new kernel options in bB, 32k Superchip games are possible, but those and some other options don't leave many cycles for the main game loop. Does anyone have any tips, advice, or suggestions on how to cut down on cycle use to allow for the best games possible?

 

The only tip I have kind of sucks. You can fit twice as much into your game loop if you bring the game down to 30fps and run half of the code on odd frames and half on even frames.
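
A minimal sketch of that odd/even split, assuming the standard kernel. The names frame_count, odd_frame and showframe are made up for illustration, and the two COLUBK lines only stand in for the two halves of the real game logic:

```basic
   rem * sketch only: split the per-frame work across alternating frames
   rem * frame_count, odd_frame and showframe are hypothetical names
   dim frame_count = a
   frame_count = 0

mainloop
   frame_count = frame_count + 1
   rem * bit 0 of the counter flips every frame
   if frame_count{0} then goto odd_frame

   rem * even frames: first half of the game logic (placeholder)
   COLUBK = $80
   goto showframe

odd_frame
   rem * odd frames: second half of the game logic (placeholder)
   COLUBK = $40

showframe
   drawscreen
   goto mainloop
```

Each half still runs only 30 times per second, so anything that has to look smooth at 60fps probably belongs outside the split.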

More tips:

- Use vblank.

- Use data statements in place of calculation or multiple if-thens wherever possible.

- Minimize or unroll loops.

- Minimize bankswitching.

- Instead of the 32-high playfield offered, you can use (IIRC) 25-31 and have the same pixel size but more kernel time.

 

And, of course, inline asm.


  • 3 weeks later...
batari,

 

Would you mind giving an example of how to properly use vblank, and how to substitute a data statement for multiple if-thens? I've been struggling trying to figure out how to do both.

 

For example, if I want to change the following code to use data statements:

 

if room=1||room=5 then goto sub
if room=2||room=6  then goto sub
if room=3||room=7  then goto sub
if room=4||room=8  then goto sub

 

would I change it to something like this?

 

  x=data1
 if room=x then goto sub
 data data1
 1,2,3,4,5,6,7,8
end

 

Thanks!

 

Steve
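
For the data-statement question above, one possible approach is to let the table do the comparing: index it by room and branch on the value read back. This is only a sketch, not batari's own answer. It assumes each pair of rooms wants its own handler, and the names room_group, group_0 to group_3, mainloop and showframe are all made up:

```basic
   rem * sketch only: one table read in place of a chain of if-thens
   rem * room_group and the group_N labels are hypothetical names
   dim room = a
   room = 1

mainloop
   rem * rooms 1 and 5 read 0, rooms 2 and 6 read 1, and so on
   temp1 = room_group[room]
   on temp1 goto group_0 group_1 group_2 group_3

group_0
   COLUBK = $40
   goto showframe
group_1
   COLUBK = $80
   goto showframe
group_2
   COLUBK = $C0
   goto showframe
group_3
   COLUBK = $0E

showframe
   drawscreen
   goto mainloop

   rem * entry 0 is padding so that rooms 1-8 can index the table directly
   data room_group
   0, 0, 1, 2, 3, 0, 1, 2, 3
end
```

The table sits after the final goto so that program flow never runs into it, and room is assumed to stay in the range 1-8.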

Here's a vblank example:

 

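   rem * test vblank routine

   a = $00
   b = 0

loop
   drawscreen
   goto loop

   rem * bB calls this once per frame, during the "vblank" period
   vblank
   b = b + 1
   rem * every 30 frames, step the background color
   if b = 30 then a = a + 1 : b = 0
   COLUBK = a
   return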

Keep in mind that the "vblank" routine is just a subroutine-- the only thing that makes it special is that it's performed during the so-called "vblank" period instead of during the so-called "overscan" period. (Both take place during the TV's vertical blank or VBLANK, but in Atari programming terminology, "overscan" is the portion of the VBLANK that happens at the bottom of the screen-- before the VSYNC-- and "vblank" is the portion that happens at the top of the screen-- after the VSYNC.) So since the bB "vblank" routine is a subroutine, you need to end it with a "return," and position it somewhere in your code where the program won't "fall into" it. Also, you don't need to precede it with a line label, since you won't be calling the subroutine yourself. Instead, you put the keyword "vblank" at the beginning of the subroutine (where you would normally put the line label that identifies the subroutine), as shown above. When bB is compiling the code, and sees that keyword, it will replace the "vblank" keyword with its own special line label, and when the "vblank" code is being performed, bB will do a "JSR" to your "vblank" subroutine.

 

Michael

Thank you for the description and the example, Michael. It sounds like vblank would not give very much time to execute code, I probably wouldn't be able to put much in there, would I? I'll have to play around with it later tonight.

 

Thanks,

 

Steve

Actually, the standard Atari game screen (as described in the "Stella Programmer's Manual") is built as follows:

 

-- 3 lines of VSYNC (which are blanked).

-- 37 lines of VBLANK (commonly called the "vblank" by Atari programmers).

-- 192 lines of picture ("active," or un-blanked).

-- 30 lines of VBLANK (commonly called the "overscan" by Atari programmers).

 

When you call "drawscreen," bB does a few things of its own first, then twiddles its thumbs while watching the timer to see when it's time to do the VSYNC. This is happening at the bottom of the screen, during the "overscan." Then it does the VSYNC, and does some things of its own-- all at the top of the screen, during the "vblank." Then it twiddles its thumbs again while watching the timer, and waits until it's time to start drawing the "active" part of the screen. Then it draws the screen, and returns to your program code. So what happens is as follows:

 

-- bB finishes drawing the screen, and returns to your code.

*** ACTIVE VIDEO ENDS ***

*** OVERSCAN BEGINS ***

-- Your code executes, during "overscan." <<<===

-- Your code eventually calls "drawscreen" (hopefully before you run out of cycles).

*** BB'S OWN OVERSCAN CODE BEGINS ***

-- bB does its own "overscan" stuff.

*** OVERSCAN ENDS ***

*** VSYNC BEGINS ***

-- bB does the VSYNC.

*** VSYNC ENDS ***

*** VBLANK BEGINS ***

-- bB does its own "vblank" stuff.

-- If you have a "vblank" subroutine, bB performs it here. <<<===

-- bB waits until it's time to draw the "active" portion of the screen.

*** VBLANK ENDS ***

*** ACTIVE VIDEO BEGINS ***

-- bB starts drawing the screen.

-- Loop back to the beginning of this list.

 

So your code normally has to execute during those 30 lines of "overscan" at the bottom of the screen, minus any time that bB needs to set the timer, return to your code, and then-- when you call "drawscreen"-- twiddle its thumbs waiting until it's time to do VSYNC.

 

On the other hand, the "vblank" period is *37* lines long, which is 7 lines *longer* than the "overscan" period. But of course, bB has to do its own stuff during that time, so you don't actually have 37 lines' worth of cycles for a "vblank" routine of your own. I don't know how many lines' worth you do have-- and it might vary depending on how much of its own stuff bB needs to get done during the "vblank"-- but you should have quite a few lines' worth of time, enough to make it well worth your while to move some of your code from "overscan" into "vblank."
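
To put "lines' worth of cycles" into numbers: the 6507 gets 76 CPU cycles per scanline, so the nominal layout listed above works out as follows. These are upper bounds for the textbook layout only, and bB's kernel may set its timers differently, so the cycles actually available to your code can differ.

```latex
\begin{align*}
\text{overscan: }    & 30 \times 76 = 2280 \text{ cycles} \\
\text{vblank: }      & 37 \times 76 = 2812 \text{ cycles} \\
\text{whole frame: } & (3 + 37 + 192 + 30) \times 76 = 262 \times 76 = 19912 \text{ cycles}
\end{align*}
```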

 

Michael

Thanks again for the detailed explanation, that really helps. I was just playing around with it a while ago, and moved some code that I had in bank 1 (just before the first drawscreen is called) to vblank, and it made a big difference - I'm getting much less screen flashing in debug mode when the enemy is moving around. I haven't tried it on real hardware yet, though. I made one more discovery after reading the online documentation - vblank has to reside in the last bank.

 

Steve

Yes, that sounds right. Also, you have to be careful what you try to move into vblank. Keep in mind that your vblank will be performed *after* your other code is performed, and *after* bB performs its own vblank code-- so you normally wouldn't put anything into vblank if it needs to be performed *after* drawing the screen, but *before* bB processes your screen changes.

For example, one of the things that bB does during vblank is position the sprites based on any changes you made during the overscan, so you probably wouldn't want to move any sprite movement code into vblank. On the other hand, you *could* do that if you wanted to, but you'd have to remember that the screen won't reflect the changes until the *next* time around-- i.e., first the screen will be drawn with the old sprite positions (because bB hasn't done the sprite positioning-- RESPx and HMOVE-- using the updated coordinates yet), and then the *next* screen will reflect the new changes, so it will be as though the screen is always lagging one frame behind your sprite position changes, if you see what I mean.

Likewise, you probably won't want to put any collision-detection stuff into vblank, because bB will have just finished moving the sprites immediately before your vblank code, but it won't have drawn the screen with the new positions yet, so your collision-detection routine won't be able to do stuff like bump the player back properly if he runs into a wall.

Also, you get only one vblank subroutine, so you want to use it for code that should be executed each frame. For example, vblank would be the perfect place to stick a music driver routine, for playing background music during your game. Or, if you want to be clever, you could set one or more flags, and then have your vblank routine call different subroutines based on which flags were set:

 

   vblank
  if flag_a = whatever then gosub subroutine_a
  if flag_b = whatever then gosub subroutine_b
  rem * etc.
  return

 

or

 

   vblank
   rem * on...goto counts from 0: 0 jumps to vb0, 1 to vb1, and so on
   on vblank_flag goto vb0 vb1 vb2 vb3
vb0
   rem * do something
   return
vb1
   rem * do something
   return
vb2
   rem * do something
   return
vb3
   rem * do something
   return
   rem * etc.

 

Michael

Edited by SeaGtGruff
I did some checking, and I found that the standard kernel has as many as 1600 cycles free in vblank. In the multisprite kernel, you have considerably less, but still some.

 

And, although the actual vblank command has to reside in the last bank, your vblank routine can consist solely of a "goto" to another bank. The return command will still work in the other bank (no need to go back to the last bank.)
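
A rough sketch of that arrangement for an 8k game, where bank 2 is the last bank. The label vblank_work is a made-up name, and the layout is an assumption based on the description above rather than tested code:

```basic
   rem * sketch only: vblank declared in the last bank, work done in bank 1
   rem * vblank_work is a hypothetical label
   set romsize 8k

mainloop
   drawscreen
   goto mainloop

   rem * the real vblank work lives here in bank 1
vblank_work
   a = a + 1
   COLUBK = a
   return

   bank 2

   rem * the vblank keyword itself has to sit in the last bank
   vblank
   goto vblank_work bank1
```

The bankswitched goto costs some cycles of its own, so this is probably only worth doing when the last bank is short on space.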

Ah, excellent point! :) Or you could gosub to another bank and return to vblank, gosub yet another bank, etc., such as if you're using flags to control which of various possible subroutines should actually be performed during vblank.

 

Michael

I did a simple program (almost identical to the "test vblank" example posted above) using the standard bB v1.0 kernel, and then stepped through it in Stella's debugger.

 

In the "overscan" period, there were 2598 cycles from the time bB returned to my code after finishing the "drawscreen" routines, until the time when bB finished twiddling its thumbs and watching the timer when I called "drawscreen" again. So that's roughly the number of cycles available for the user's bB code in the "overscan" period-- less about 12 cycles for calling "drawscreen" and then going through the timer-check loop at least once, or 2586 cycles for doing something between one "drawscreen" and the next "drawscreen." Of course, that's just a rough figure, since the timer-check loop might end on a different cycle.

 

In the "vblank" period, there were 1648 cycles from the time bB jumped to my code after finishing its own "vblank" routines, until the time when bB finished twiddling its thumbs and watching the timer after my "vblank" routine was done. Again, if we reduce that by 12 cycles for the "return" at the end of my routine, plus at least once through the timer-check loop, that's about 1636 cycles.

 

So in rounded ballpark figures, it looks like you have about 2600 cycles for your "overscan" bB code, and about 1600 cycles for your "vblank" bB code, giving a combined total of about 4200 cycles. :)

 

Michael


  • 8 years later...

Do you think any of those numbers have changed since 2007? How would a beginner count the cycles using Stella? Would it take 4200 key presses?
