potatohead

Timing....


Anyone have a rough guess at commands / cycles?

 

Just ran into a timing constraint on the ooze game. To avoid some of the little timing bugs in the last version, I wanted to clear the playfield and draw a fresh set of ooze each frame.

 

This takes too long, and storing ooze lengths two per variable does not help.

 

Ahhh... the challenge of the 2600 is beginning to show.

 

FYI, more than a few pfgraphics commands will consume all the time you have. So will a lot of variable shuffling to squeeze more value out of each bit.

 

So I'm going back to the old approach to see whether it can be done with the additional overhead of packing the data into each nibble of the variables.
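
For anyone following along, here is a rough Python model of the nibble packing described above: two ooze lengths (0 to 15 each) stored in one byte. The function names are made up for illustration and are not bB commands. The shifts and masks are exactly the "variable shuffling" that costs cycles on the real machine.

```python
def pack_nibbles(high, low):
    """Pack two values in the range 0..15 into a single byte (hypothetical helper)."""
    if not (0 <= high <= 15 and 0 <= low <= 15):
        raise ValueError("each value must fit in 4 bits")
    return (high << 4) | low


def unpack_nibbles(byte):
    """Split a byte back into its (high, low) nibbles."""
    return (byte >> 4) & 0x0F, byte & 0x0F


packed = pack_nibbles(9, 3)       # two ooze lengths in one variable
high, low = unpack_nibbles(packed)
```

Every read or update of one length needs a shift or mask first, so the RAM saved is paid for in cycles.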


The pf commands will use up your cycle budget very quickly. They are currently the most "expensive" commands by far. The other commands are all relatively inexpensive.

 

When considering the cycle counts below, recall that you have about 2700 usable cycles per frame in bB.

 

Here's a tally that may be useful for estimating your time (values are approximate):

 

pfpixel: 80 cycles

pfscroll left/right: 500 cycles

pfscroll up/down: 650 cycles every 8th time it's called, 30 cycles otherwise

pfvline: 230-600 cycles depending on length (Approx 200+34*length)

pfhline: 250-1500 cycles depending on length (Approx 210+42*length)
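
To make the tally easier to use, here is a small Python sketch that adds up the approximate costs quoted above and compares them with the roughly 2700-cycle frame budget. The function names and the `frame_cost` helper are made up for illustration; only the constants come from the list, and they are approximations.

```python
FRAME_BUDGET = 2700  # approximate usable cycles per frame, as quoted in this post


def pfpixel_cost():
    return 80


def pfvline_cost(length):
    # Approximation from the list above: 200 + 34 * length
    return 200 + 34 * length


def pfhline_cost(length):
    # Approximation from the list above: 210 + 42 * length
    return 210 + 42 * length


def frame_cost(pixels=0, vlines=(), hlines=()):
    """Estimate the cycles spent on playfield commands in one frame."""
    total = pixels * pfpixel_cost()
    total += sum(pfvline_cost(n) for n in vlines)
    total += sum(pfhline_cost(n) for n in hlines)
    return total


# Ten pfpixel calls plus one pfhline of length 20.
used = frame_cost(pixels=10, hlines=[20])
remaining = FRAME_BUDGET - used
```

With these numbers, ten pixels and one 20-pixel horizontal line already use about two thirds of the frame.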

 

If you're not using pf commands, chances are you won't exceed the time allotted, at least in Alpha 0.2. In the next version, we're hoping for full multiply/divide and fixed point math, which may also be a little taxing on cycles, but it will probably be comparable to pfpixel and nowhere near the playfield scrolling or lines.

Anyone have a rough guess at commands / cycles?

A simple, unoptimized loop that only clears the playfield data will require at best ~9 cycles per iteration (sta zp,x; dex; bpl loop), which is 6*9=54 cycles for one line of data. (If the loop counts up instead, it needs an extra compare and will require 11 cycles per iteration.)
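
The per-iteration figures can be checked by adding up standard 6502 instruction timings. This Python snippet is only bookkeeping. It assumes the branch is taken and does not cross a page boundary, and that the counting-up version uses `inx` plus `cpx #imm` before the branch.

```python
# Standard 6502 cycle counts for the instructions involved.
CYCLES = {
    "sta zp,x": 4,      # store to zero page, X-indexed
    "dex": 2,
    "inx": 2,
    "cpx #imm": 2,      # only needed when counting up
    "branch taken": 3,  # bpl/bne taken, no page crossing (assumption)
}

# Counting down: sta zp,x; dex; bpl loop
down_loop = CYCLES["sta zp,x"] + CYCLES["dex"] + CYCLES["branch taken"]

# Counting up: sta zp,x; inx; cpx #n; bne loop
up_loop = (CYCLES["sta zp,x"] + CYCLES["inx"]
           + CYCLES["cpx #imm"] + CYCLES["branch taken"])

# Six stores per line of data, as in the estimate above.
cycles_per_line = 6 * down_loop
```

The last pass through the loop is one cycle cheaper because the branch is not taken, which is why these are "at best" round numbers.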

 

During overscan, you have ~2100 cycles for calculations.

pfpixel: 80 cycles

Ouch, maybe you should implement some instructions which allow filling a larger area (like memset).

A simple, unoptimized loop which only clears the playfield data will require at best ~9 cycles/iteration (sta zp,x; dex; bpl loop), which is 6*9=54 cycles for one line of data. (if the loop is counting up, it will require 11 cycles)

 

During overscan, you have ~2100 cycles for calculations.


Yes, I'd highly recommend using an inline asm routine for pf clearing.

 

Normally, you do have 2100 cycles in overscan. However, the stock bB kernel doesn't use all 192 scanlines right now. It uses about 184, which frees up roughly 600 extra cycles.
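
The "extra 600 cycles" figure follows from the 2600's 76 CPU cycles per scanline. A quick Python check, treating 184 as the approximate kernel height mentioned above:

```python
CYCLES_PER_SCANLINE = 76   # 6507 CPU cycles in one scanline on the 2600
FULL_DISPLAY_LINES = 192
KERNEL_LINES = 184         # approximate figure from this post
OVERSCAN_CYCLES = 2100     # approximate figure quoted in this thread

extra_cycles = (FULL_DISPLAY_LINES - KERNEL_LINES) * CYCLES_PER_SCANLINE
total_budget = OVERSCAN_CYCLES + extra_cycles
```

That gives 608 extra cycles and a total of about 2700, which matches the per-frame budget quoted earlier in the thread.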

pfpixel: 80 cycles

Ouch, maybe you should implement some instructions which allow filling a larger area (like memset).


It's heavy, yes, for three main reasons. It does some calculations to convert the x and y arguments to an actual memory location. It sets only one bit at a time, which requires a read-mask-write. And it must index a lookup table because some of the bit orders are reversed.
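
As a rough picture of those steps, here is a Python model of a pfpixel-style routine. It is a sketch, not the actual bB code. The layout (4 bytes per row, 11 rows, with the second and fourth byte of each row bit-reversed) and all names are assumptions made for illustration.

```python
BYTES_PER_ROW = 4   # assumption: 32 playfield pixels per row
ROWS = 11           # assumption: number of playfield rows

MSB_FIRST = [0x80 >> i for i in range(8)]   # leftmost pixel is bit 7
LSB_FIRST = [0x01 << i for i in range(8)]   # leftmost pixel is bit 0 (reversed)

# One mask per x position; which bytes are reversed is an assumption.
SETBYTE = MSB_FIRST + LSB_FIRST + MSB_FIRST + LSB_FIRST


def pfpixel_on(playfield, x, y):
    """Turn on one playfield pixel: locate the byte, look up the mask, read-mask-write."""
    index = y * BYTES_PER_ROW + (x >> 3)   # convert (x, y) to a memory location
    playfield[index] |= SETBYTE[x]         # table lookup plus read-modify-write


pf = bytearray(BYTES_PER_ROW * ROWS)
pfpixel_on(pf, 0, 0)   # first byte of row 0, MSB-first
pfpixel_on(pf, 8, 1)   # second byte of row 1, bit-reversed
```

Each of those three steps is cheap in Python but takes several instructions on the 6502, which is where the 80 or so cycles go.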

 

memset (as it is used in C) would be faster but wouldn't be quite as useful since it doesn't deal with bits.


Ok that tells me what I want to know.

 

I'm doing another redesign because I didn't have much luck storing ooze positions as nibbles in the vars.

 

Either I'm just not grokking the bit operations (likely), or they have some problems (somewhat likely).

 

I had some luck using bits as flags, and that will save me enough for now.

 

An inline asm for clearing the playfield works great and that's the route I chose. I no longer need the pfline command and pfpixel can be managed to avoid running out of time.

 

Thanks for the info, guys. That tells me enough about the scale of things that I can make good decisions going forward.


I think a "CLS" like command would be useful.

 

Maybe with a "FLS" fill screen command that does the opposite...or even better, a "negscreen" function that does a logical NOT on every byte of the playfield, so a CLS + NEG would do the same thing as a FLS.
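
To show what I mean, here is a Python model of the three operations on a block of playfield bytes. The names `cls`, `negscreen` and `fls` are just the proposed command names, and the 44-byte playfield size is an assumption.

```python
def cls(playfield):
    """Clear every playfield byte."""
    for i in range(len(playfield)):
        playfield[i] = 0x00


def negscreen(playfield):
    """Apply a logical NOT to every playfield byte."""
    for i in range(len(playfield)):
        playfield[i] ^= 0xFF


def fls(playfield):
    """Fill the screen, built from the other two: CLS followed by NEG."""
    cls(playfield)
    negscreen(playfield)


pf = bytearray(44)   # assumed playfield size
pf[3] = 0x5A
fls(pf)              # every byte is now 0xFF
```

Each one is a single pass over the playfield bytes, so it should cost about the same as the clearing loop discussed earlier.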

