
Having trouble with 2 free-floating Player graphics



Hi!

 

I'm having a bit of trouble with some code I need to write.

For Steam Tunnel Bob, "game 2" will involve having 2 sprites on-screen at the same time.

It happens over a period of 93 scanlines or so.

 

I have this dummy code, currently. It works, but the P0 sprite can't move up or down, which is something I want to change.

I'll explain my dilemmas after the codebox.

 

;========================
; scanline 20 to 113
;========================
        LDX     #93                     ; X = P0 line index, counts down
        LDY     Player1YPosition        ; Y = lines left before P1 starts
NoP1TopLoop                             ; above P1: draw P0 only
        STA     WSYNC
        DEY
        BEQ     WeHitTheTopOfP1Gfx
        LDA     Player0GraphicsColors,X
        STA     COLUP0
        LDA     Player0Graphics,X
        STA     GRP0
        DEX
        JMP     NoP1TopLoop
WeHitTheTopOfP1Gfx                      ; finish this line's P0, then set up P1
        LDA     Player0GraphicsColors,X
        STA     COLUP0
        LDA     Player0Graphics,X
        STA     GRP0
        DEX
        LDY     Player1HeightGame2      ; Y = P1 row index, counts down
DrawingP1Loop                           ; inside P1: draw both players
        STA     WSYNC
        LDA     Player0GraphicsColors,X
        STA     COLUP0
        LDA     Player0Graphics,X
        STA     GRP0
        LDA     (Temp3),Y               ; P1 colors
        STA     COLUP1
        LDA     (Temp2),Y               ; P1 gfx
        STA     GRP1
        DEX
        DEY
        BPL     DrawingP1Loop
NoP1BottomLoop                          ; below P1: draw P0 only
        STA     WSYNC
        LDA     Player0GraphicsColors,X
        STA     COLUP0
        LDA     Player0Graphics,X
        STA     GRP0
        DEX
        BPL     NoP1BottomLoop
;========================

 

So, here are the technical problems I can't seem to grasp.

 

1) There's a space vs cycles issue right away, with regard to "zero-padding".

I'm sure anyone who has written a 2600 game before knows about this.

Basically, for the P0 sprite, I can put zero-padding around the main graphic, as I don't anticipate it changing much (maybe between 2 graphics, at the end). But for the P1 graphic, zero-padding may take up too much space, as I'm planning on having a different P1 graphic per screen. In light of this, I tried to use one index counter that determines the start position of the P1 graphic and another that determines its height. With these values, you can set the X or Y register appropriately and store $00 into GRP0 under the appropriate conditions, thereby saving ROM space.

However, I'm torn. Maybe it makes sense just to zero-pad everything (P0 and P1 graphics alike) and store the graphics and this small routine in an isolated 4K bank. I'm just not sure what the best approach is. If I zero-pad, I get the advantage of more free cycles per scanline, which means the sprites can be placed at more horizontal positions. Another option is to get extra hardware (256 bytes) that I could dump graphics into and zero out manually, but for now I'm trying to avoid "extra hardware" and stay a purist for as long as I can.

 

2) I'm having issues with indirect indexing. As you can see, in the example above, the P1 position is dynamic, by indirect fetch. However, the P0 position is not. I'd like to fix this, but it sounds like I'd have to use an indirect mode to do this. This is ok, but indirect modes based on the X-register have never worked for me, as they end up being pre-indexed indirect instead of post-indexed indirect, which is what I typically use and need for graphics fetches. So, even if I figure out a solution for problem 1, I'm not quite sure how to proceed with this issue.

 

Can anyone give me some pointers? I think I need a swift programming kick to the head; this is something that has eluded me for a long time, yet I think wouldn't be too hard to implement.

 

Thanks in advance!

-John

Edited by Propane13

Hey John,

 

Can't give an in-depth answer right now since I'm at work :) , but in a nutshell: yes, you cannot use the X register to do indirect addressing the way you'd like to. The 6502 doesn't support it: the X form, LDA (Address,X), adds X to the pointer's own address before the lookup (pre-indexed). If you want indirect addressing with an offset added to the fetched address, i.e. LDA (Address),Y, you have to use the Y register for the offset. That means that if you're doing two player graphics, you have to save the Y register and reuse it.
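
Saving and restoring Y can sometimes be avoided: if both sprites are indexed by the same scanline counter, each one can get its own zero-page pointer and they can share Y. A minimal DASM sketch of that idea, assuming zero-padded graphics tables stored bottom row first. P0Ptr, P1Ptr, P0Y, P1Y, P0Gfx, P1Gfx and KERNEL_LINES are made-up names, not anything from John's code:

```asm
; Sketch only: all names below are hypothetical.
; Assumes each table has at least KERNEL_LINES zero bytes on either side
; of the image, and that P0Gfx/P1Gfx label the image's bottom row.
KERNEL_LINES = 93

; --- once per frame, outside the kernel ---
        sec
        lda #<P0Gfx
        sbc P0Y             ; P0Y = line (0 = bottom) of P0's bottom row
        sta P0Ptr
        lda #>P0Gfx
        sbc #0              ; propagate the borrow into the high byte
        sta P0Ptr+1

        sec
        lda #<P1Gfx
        sbc P1Y
        sta P1Ptr
        lda #>P1Gfx
        sbc #0
        sta P1Ptr+1

; --- kernel: one shared line counter in Y ---
        ldy #KERNEL_LINES-1
DrawLoop
        sta WSYNC
        lda (P0Ptr),y       ; post-indexed: address = pointer + Y
        sta GRP0
        lda (P1Ptr),y
        sta GRP1
        dey
        bpl DrawLoop
```

Moving a sprite vertically then only means changing P0Y or P1Y; the kernel itself never branches. Keep in mind that lda (zp),y costs one extra cycle when the access crosses a page, so timing can shift depending on where the tables land.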

 

As to the zero-padding method: yes, that is bar none the fastest way to draw two players on the screen, but as you mentioned it is extremely wasteful ROM-wise. Other methods like SkipDraw are much more ROM-efficient but obviously use more CPU cycles...

 

Later,

Ben


You need to use a variant of SkipDraw!

 

Here are some methods for drawing sprites. Assume Y is a decrementing line-counter in all cases:

Simplest case:

  lda (GfxPtr),Y
  sta GRP0  ;+8 cycles

Pros: Very fast.

Cons: Must pad with zeroes.

Next:

  lda (GfxPtr),Y
  and (MaskPtr),Y
  sta GRP0  ;+13 cycles

Pros: Still pretty fast. Only need to pad your Mask with zeroes, so doesn't use as much space.

Cons: Even padding one table with zeroes is a lot of wasted ROM
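
To make the mask idea concrete, here is roughly how the table and pointers could be laid out. This is a sketch under my own assumptions: MaskOnes, SpriteGfx, SpriteY, KERNEL_LINES and the height of 16 are hypothetical, and `ds count,fill` is DASM syntax.

```asm
; Sketch only: hypothetical names, DASM syntax.
KERNEL_LINES = 93
SPRITEHEIGHT = 16

; --- ROM: one mask table can serve every sprite of this height ---
        ds KERNEL_LINES,$00     ; lines below the sprite: mask off
MaskOnes
        ds SPRITEHEIGHT,$FF     ; lines inside the sprite: mask on
        ds KERNEL_LINES,$00     ; lines above the sprite: mask off

; --- once per frame: slide both pointers by the sprite's position ---
        sec
        lda #<MaskOnes
        sbc SpriteY             ; SpriteY = line (0 = bottom) of the bottom row
        sta MaskPtr
        lda #>MaskOnes
        sbc #0                  ; propagate the borrow into the high byte
        sta MaskPtr+1

        sec
        lda #<SpriteGfx         ; SpriteGfx = SPRITEHEIGHT unpadded rows, bottom first
        sbc SpriteY
        sta GfxPtr
        lda #>SpriteGfx
        sbc #0
        sta GfxPtr+1
```

Outside the sprite, lda (GfxPtr),Y fetches whatever bytes happen to sit next to the graphic, and the AND with $00 throws them away. That is why only the mask needs padding. It is safest to keep the whole range GfxPtr can reach inside ROM.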

 

Now, the fancy ones:

SkipDraw:

  lda #SPRITEHEIGHT
  dcp SpriteTemp
  bcc SkipDraw
  lda (GfxPtr),Y
  sta GRP0
ReturnFromSkipDraw ;+17 cycles

Pros: Still pretty fast, and runs in constant time if you set up the SkipDraw branch correctly. Requires (almost) no zero padding.

Cons: Doesn't write to GRP0 every line (which can be necessary if you are using VDEL); for this reason you do need at least one zero of padding on all sprite graphics. Not as fast as the other methods. A minor pain to set up the variables. Can be a huge hassle to set up the SkipDraw branch if your kernel is complicated.

Note: If you didn't notice, it uses an illegal (undocumented) opcode: dcp decrements the memory location and then compares it against A.
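
For the variable setup, here is one way it could look. This is a sketch, assuming Y runs from KERNEL_LINES-1 down to 0 and the SkipDraw code runs exactly once per line. SpriteY, SpriteGfx and KERNEL_LINES are hypothetical names, and the out-of-line SkipDraw stub assumes no page crossings.

```asm
; Sketch only. SpriteGfx = one $00 byte followed by SPRITEHEIGHT image
; rows, bottom row first. SpriteY = line (0 = bottom) of that $00 row.

; --- once per frame, before the kernel ---
        lda #KERNEL_LINES
        sec
        sbc SpriteY
        sta SpriteTemp      ; after each dcp, SpriteTemp = Y - SpriteY

        sec
        lda #<SpriteGfx
        sbc SpriteY
        sta GfxPtr          ; so (GfxPtr),Y fetches SpriteGfx[Y - SpriteY]
        lda #>SpriteGfx
        sbc #0
        sta GfxPtr+1

; --- inside the kernel, once per line ---
        lda #SPRITEHEIGHT
        dcp SpriteTemp      ; SpriteTemp = SpriteTemp - 1, then compare A with it
        bcc SkipDraw        ; carry clear: SpriteTemp > SPRITEHEIGHT, outside sprite
        lda (GfxPtr),y      ; carry set: 0 <= SpriteTemp <= SPRITEHEIGHT
        sta GRP0
ReturnFromSkipDraw

; --- elsewhere: burn the same 10 cycles as the drawing path ---
SkipDraw                    ; bcc taken (3) + 2 nops (4) + jmp (3) = 10
        nop
        nop
        jmp ReturnFromSkipDraw
```

Once SpriteTemp drops below zero it wraps to $FF, which compares as larger than SPRITEHEIGHT, so the sprite stays off for the rest of the kernel. The last byte drawn is the $00 row, which is what turns GRP0 off.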

 

A variant of SkipDraw that I call

DoDraw

  lda #SPRITEHEIGHT
  dcp SpriteTemp
  bcs DoDraw
  lda #0
  .byte $2C
DoDraw
  lda (GfxPtr),Y
  sta GRP0  ;+18 cycles

Pros: Runs in constant time, relatively fast, and doesn't require any branches in/out of your kernel. (Just make sure that short branch doesn't cross a page boundary!) No padding required in your graphics. Writes to GRP0 every time - sometimes required when you want to use VDEL.

Cons: The slowest method so far. The '.byte $2C' opcode-skip won't work on the SuperCharger and maybe other exotic bankswitch schemes. Still a minor pain to set up your variables.

Note: This guy uses an illegal opcode also.
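
In case the '.byte $2C' line looks like magic, here is the same code annotated with the bytes it assembles to. This is only an illustration and assumes GfxPtr and SpriteTemp are zero-page addresses (shown as pp and zz).

```asm
        lda #SPRITEHEIGHT   ; A9 hh
        dcp SpriteTemp      ; C7 zz   undocumented: DEC, then CMP
        bcs DoDraw          ; B0 03   drawing: hop over the next 3 bytes
        lda #0              ; A9 00   not drawing: A = 0
        .byte $2C           ; 2C      opcode of BIT absolute...
DoDraw
        lda (GfxPtr),y      ; B1 pp   ...which eats these 2 bytes as its operand
        sta GRP0            ; 85 1B
```

When the branch is not taken, the CPU sees BIT $ppB1 instead of the lda (GfxPtr),y. BIT only does a dummy read and changes flags, so A is still 0 when it reaches sta GRP0. Both paths come out at 18 cycles: 2+5+3+5+3 when drawing, 2+5+2+2+4+3 when not.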

 

Finally, the fanciest of the fancy:

SwitchDraw

  cpy SpriteTop
  beq SwitchDraw
  bmi WaitDraw
  lda (GfxPtr),Y
  sta GRP0
ReturnFromSwitchDraw ;+15 cycles

Elsewhere, you need...

SwitchDraw
  lda SpriteBottom
  sta SpriteTop
  jmp ReturnFromSwitchDraw
WaitDraw
  SLEEP 4
  bmi ReturnFromSwitchDraw  ;branch always

Note that SpriteBottom = the bottom of the sprite ORed with $80.

Pros: The fastest non-padding method. Again, runs in constant time.

Cons: *Only works for Y < 128!* So if you want to use it over the whole screen you have to do some tricky, and generally very painful, kernel setup and graphics interlacing.

Edited by vdub_bobby

THIS IS AWESOME!

 

I really think the masking method is innovative-- that never would have crossed my mind.

I'm going to study these a little more in detail. I can see that using these, I can set up P0 and P1 graphics with no worries.

 

I was thinking I needed a bunch of branches that handled all case statements and ran dynamically, i.e.

1) if nothing this line, branch

2) If drawing p0 and not p1, jump to that routine

3) If drawing p0 and p1, jump to that routine

etc...

 

This takes a problem I was making way too complicated and makes it really nice and easy.

 

Thanks a bunch!

-John


Another drawing method, not listed, is to do something like this. Assume sprites have at least one line of padding available on top and bottom, so that the two sprites will never begin or end on the same scan line. Assume further that sprites don't quite go to the top and bottom of the screen.

toploop_early:
 SLEEP 7
toploop:
 .. 52 cycles of other stuff
 lda #0
 sta GRP0
 sta GRP1
 iny
 zzz; nop 255 or something similar
 cpy sprite1top
 beq sprite1start_early
 cpy sprite0top
 bne toploop
sprite0loop:; ** STARTS ONE CYCLE EARLIER THAN THE OTHERS!
 .. 53 cycles of other stuff
 lda (sprite0),y
 sta GRP0
 lda #0
 sta GRP1
 iny
 zzz
 cpy sprite0bot
 beq toploop_early
 cpy sprite1top
 bne sprite0loop
 nop
bothloop:
 .. 52 cycles of other stuff
 lda (sprite0),y
 sta GRP0
 lda (sprite1),y
 sta GRP1
 iny
 cpy sprite0bot; Sprites must be chosen so zero ends first!
 bne bothloop
etc...

Cycle counts may not be right above, but a key feature of the code is that sprite decision-making time is minimized in the loop when both sprites are displayed (both sprites are handled in 24 cycles, including the INY, the CPY and the looping branch, assuming zero-page pointers and no page crossings). Hammering the code into shape may be a pain, but it's faster even than maskdraw.

 

BTW, while interleaving sprite data (for a two-line unrolled loop) may seem an ugly approach, it has a lot of advantages and I'd recommend it if one is trying to optimize for speed. Among other things, having a loop counter go from 0 to 95 instead of 0 to 191 greatly increases the portion of each page that can be accessed without page crossings.

Edited by supercat

Hey I was thinking of something.

 

As SkipDraw takes some cycles, do people usually do something like this:

(excuse the pseudo-code)

 

if (Player0's horiz position = right of screen) and (Player1's horiz position = right of screen)
{
    Kernel 1:
    do SkipDraw immediately after WSYNC for both P0 and P1
}
else if (Player0's horiz position = left of screen) and (Player1's horiz position = right of screen)
{
    Kernel 2:
    do SkipDraw with P0's update immediately after WSYNC, followed by P1's update
}
else if (Player0's horiz position = right of screen) and (Player1's horiz position = left of screen)
{
    Kernel 3:
    do SkipDraw with P1's update immediately after WSYNC, followed by P0's update
}
else // (Player0's horiz position = left of screen) and (Player1's horiz position = left of screen)
{
    Kernel 4:
    do SkipDraw with wasted cycles to pass the gfx, then update P0 and P1 with next line's data, instead of current
}

 

I think to give full-range motion, this would be the only effective way to do it.

Is this how others handle it? Or, do they limit the horizontal range/position of their sprites?

 

Thanks!

-John

Edited by Propane13

Just use VDEL, then store P1 in the first 21 cycles of the scanline and store P0 wherever you want. No reason to make it more complicated than necessary, unless you've got some ideological aversion to VDEL or something.
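
A rough sketch of that idea, assuming the simple padded-pointer style of drawing. P0Ptr, P1Ptr and KERNEL_LINES are hypothetical names, and the pointers are assumed to be set up before the kernel:

```asm
; Sketch only: hypothetical names.
        lda #1
        sta VDELP0          ; delay P0: GRP0 writes are held until GRP1 is written

        ldy #KERNEL_LINES-1
VdelLoop
        lda (P0Ptr),y
        sta GRP0            ; any time in the line: buffered, not displayed yet
        lda (P1Ptr),y       ; fetch P1 ahead of time
        sta WSYNC           ; wait for the start of the next scanline
        sta GRP1            ; done by cycle 3: P1 changes now and P0's
                            ; buffered byte is latched at the same moment
        dey
        bpl VdelLoop

        lda #0
        sta VDELP0          ; turn the delay back off after the kernel
```

Both sprites change during horizontal blank no matter where on the line the GRP0 store happens, so neither sprite's horizontal range has to be restricted. The slower SkipDraw-style logic can sit wherever there is spare time on the line.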


Thanks batari!

 

That's so simple, yet solves my problems. I had completely forgotten about VDEL, and initially had thought it something to make sprites with less granularity.

In thinking about the 6-digit score routine, what you say makes perfect sense. :)

 

Ah... this community is so cool. I'm starting to understand things that have escaped my grasp for years.

Thanks again for everyone's help. I hope I can use this knowledge to "build a better homebrew".

 

-John
