
Resetting a Graphics Pointer Within the Kernel- Don't Draw Routine


Just Jeff

Good Morning,

 

I'm trying to save kernel time and code space by eliminating DoDraw from my kernel. I've come up with this routine, which uses only 8 cycles on a typical line and periodically resets the graphics pointer. It costs extra ROM to pad the player graphic with zeros, but since my kernels have many bands, I believe I'll more than make that up by not putting DoDraw in every band. Does anyone know of alternate methods? Also, how does this one look? Can it be improved?


;===============================================================================
;Don'tDraw: In-kernel pointer reset
;===============================================================================

;These first two lines are the only code necessary on most kernel lines

	lda (Player0Ptr),y	;Load this line's shape for player 0 (5 cycles)
	sta GRP0		;and store it (3 cycles)

;In order to avoid putting hundreds of zeros above and below the player
;graphic, reset the pointer on occasion, when time allows.  I'm using a large
;buffer of 50 zeros here, but it could often be smaller.  Buffers and graphics
;must be on the same page since this routine only manipulates the LSB.

ResetP0Ptr:
	dey			;Decrement scanline counter
	tya			;Put the scanline counter in the accumulator
	sec			;Set carry so SBC subtracts exactly ObjectY
	sbc ObjectY		;Compare with player 0 position
	bmi DontDraw		;If it's passed already, then draw zeros
	sbc #50			;Carry is still set; is the graphic within 50 lines?
	bpl DontDraw		;If out of range, jump to the zeros
	lda Player0PtrB		;If in range, load the true pointer that was calculated
				;outside the kernel
	sta Player0Ptr		;and store it as the pointer that is actually used
	jmp End
DontDraw:			;Point (Player0Ptr),y at the zeros
	tya			;Get the scanline again
	eor #$FF		;A = -Y - 1
	sec			;Carry supplies the +1, giving -Y
	adc #<BufferZeros+49	;LSB = last zero - Y, so the current line reads the
				;top of the zeros and later lines walk down through them
	sta Player0Ptr		;Only valid while <BufferZeros+49 >= Y (no wrap below
				;the page start), since the MSB is never changed
End:

;So if we're at scanline 100 and the player is above at 110, then the first SBC
;goes negative, causing the pointer to skip to the top of the zeros

;And if the player is below at 90, then the first result is 10, and 50 is
;subtracted for a result of -40. Branch is not taken, and actual pointer is loaded

;If the player is high above at 160, then the first SBC is negative, causing a
;skip to the zeros

;If the player is far below at 40, then the first result is 60, so the first
;branch is not taken.  50 is subtracted for a result of 10, therefore the second
;branch is taken, drawing zeros.

BufferZeros:
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000
	
Player0Gfx:
	.byte %10010010,%10100100,%10010010,%11001001,%10100100,%11001001,%10010010,%10100100

MoreZeros:
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000,%00000000
	.byte %00000000,%00000000
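
The routine above says only that Player0PtrB is calculated outside the kernel. A minimal sketch of one way that could be done, assuming Y counts down the screen and ObjectY is the scanline on which the last (bottom) row of the sprite appears. The RAM addresses, the ORG values and the SetupP0Ptr label are made up for illustration, and the tables are placed high in their page so that the subtraction cannot borrow for any ObjectY up to 190:

```asm
	processor 6502

; Hypothetical RAM locations (assumptions, not from the post)
ObjectY		= $80		; scanline of the sprite's bottom row
Player0Ptr	= $81		; 2-byte pointer read by lda (Player0Ptr),y
Player0PtrB	= $83		; precomputed LSB of the "true" pointer

	org $F000

SetupP0Ptr:			; run once per frame, outside the kernel
	lda #<Player0Gfx	; LSB of the first graphics byte
	sec
	sbc ObjectY		; (Player0Ptr),y now reads Player0Gfx when y = ObjectY
	sta Player0PtrB		; the kernel copies this into Player0Ptr when in range
	lda #>Player0Gfx	; MSB never changes: all tables share one page
	sta Player0Ptr+1
	rts

	org $F18C		; page offset 140, which puts Player0Gfx at offset 190
BufferZeros:
	ds 50,0			; 50 zeros at lower addresses than the graphic
Player0Gfx:
	.byte %10010010,%10100100,%10010010,%11001001
	.byte %10100100,%11001001,%10010010,%10100100
MoreZeros:
	ds 50,0			; 50 zeros at higher addresses than the graphic
```

With ObjectY = 90, for example, Player0PtrB comes out as 190 - 90 = 100, so at y = 90 the load reads page offset 190, the first byte of Player0Gfx.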
Edited by BNE Jeff

I don't remember exactly how I implemented this in my Nyancat kernel (I am trying to get back into working on that game), but I did a similar thing. Instead of padding an entire screen's worth of zeroes, which wouldn't fit in a page anyway, I also split the screen up into rows (or sections).

If I remember correctly, the kernel rows were 14 scanlines high, with 5 lines in between. Since I wanted it to be possible for the cat sprite to visibly shift between rows, I treated 2 rows and the 5 lines between them as one 33-scanline-high unit.

In order to have enough time inside the kernel, the sprite graphics are stored in RAM, so I can use zero-page indexed addressing with the X register to avoid the extra time it takes to load through a pointer. With enough wiggle room, though, I would be using a ROM pointer like in your example. Obviously, each situation is unique, requiring its own approach.


Somewhat. The nice thing is that you don't have to pad as much.

 

When you have a graphics table in ROM, you can't change the data, only the pointer offset, so you need to pad both sides of the graphics with the maximum amount necessary for both extremes in sprite positioning (extreme top vs extreme bottom). In your example, you are using 50 bytes of padding on both sides of the graphics, for 100 bytes total. And as you stated, this can probably be trimmed down a bit.

 

When the graphics are stored in RAM, however, you can change the data, so you can take advantage of that. You simply have a graphics "buffer" that is exactly the size needed to hold one kernel's worth of graphics. So rather than positioning a pointer to offset into the data table, you pre-load the buffer with the data to draw, and then draw it inside the kernel (probably repeating this process multiple times within a frame). No pointer is needed at all, since the buffer is always in the same place in RAM. In your example, this would probably use about 50 bytes of RAM for a single buffer. And of course, after the kernel is drawn, you can re-use this RAM for any other temporary variables you need. One thing you can do is improve the graphics in other kernels, since you already have a chunk of RAM reserved for temporary data.

 

For my example, I used multiple buffers for different graphics aspects. I would set the stack pointer to the end of the buffer, and do a loop, pushing enough zeroes to the stack to clear the buffer. (Since I had multiple buffers, I lined them up one-after-the-other in RAM, and cleared them all in one step.) Then for each buffer, I positioned the stack pointer based on how I wanted to position the sprite, and pushed the graphics into RAM.

 

There is no need to pad the graphics tables in ROM, so this approach would free up a lot of ROM to be used for more sprites, animations, etc. at the expense of some RAM. The sad thing is that it doesn't save much time, only a single cycle, which in my case was absolutely necessary. Loading using lda RAM,x takes 4 cycles, while using lda (ROM),y takes 5 (assuming no page boundaries are crossed). But as I said, it does clear up a lot of ROM.
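
A rough sketch of the stack-based clear and fill described above, not the actual Nyancat code. Every name and address here (GfxBuffer, SpriteOff, SavedSP, FillBuffer, the 33-byte size) is hypothetical, and it relies on the 2600 mirroring its zero-page RAM into the stack page, so that PHA writes land in the buffer:

```asm
	processor 6502

; All names and addresses below are hypothetical.
GfxBuffer	= $A0		; 33-byte graphics buffer in zero-page RAM
BUF_SIZE	= 33
SPR_H		= 8		; sprite height in bytes
SpriteOff	= $90		; where the sprite starts in the buffer (0..BUF_SIZE-SPR_H)
SavedSP		= $91

	org $F000

FillBuffer:
	tsx
	stx SavedSP		; remember the real stack pointer
	ldx #GfxBuffer+BUF_SIZE-1
	txs			; SP -> last byte of the buffer
	lda #0
	ldy #BUF_SIZE
.clear	pha			; write a zero, SP moves down one byte
	dey
	bne .clear

	lda SpriteOff
	clc
	adc #GfxBuffer+SPR_H-1
	tax
	txs			; SP -> where the sprite's last byte goes
	ldy #SPR_H-1
.copy	lda SpriteGfx,y
	pha			; fills GfxBuffer+SpriteOff+y
	dey
	bpl .copy

	ldx SavedSP
	txs			; restore the stack before returning
	rts

SpriteGfx:
	.byte %10010010,%10100100,%10010010,%11001001
	.byte %10100100,%11001001,%10010010,%10100100
```

Inside the kernel the draw is then a plain `lda GfxBuffer,x` (4 cycles) followed by `sta GRP0`, with X as the line counter for the band.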

Edited by JeremiahK

There are lots of different ways to go, and what you choose really depends on your situation. If your sprites are all the same height, then you might consider using AND (),Y.

You will want to move the loading of your graphics to the end of the scanline to deal with any extra cycles from crossing page boundaries, e.g.:

	lda (graphics),y	; 5/6
	and (drawMask),y	; 5/6
	sta WSYNC
	sta GRP0

This method does eat up ROM for the AND mask table, but it is generally easy to implement and not too slow.


Thank you. This looks very useful for re-using the zeros. Cool.

But it would still require 400 zeros, more or less, right? In other words, this does not reset the pointer, so I would still have to reset the drawMask pointer somehow if I wanted to avoid that. Correct?

My situation is that I have many short kernel bands and many opportunities to display little more than player 0 between those bands.

You mentioned graphics being the same height. Actually, I have just 2 graphics for Player 0, and one of them is basically the same as the other with the bottom two lines cut off. In that case, the mask could be shifted two lines up or down (with no kernel code required), correct? Like this:

SquareGraphic:
	.byte %11111111
	.byte %10000001
	.byte %10000001
	.byte %11111111

;Any 1 to 4 lines can be cut off by shifting this mask up or down 1 to 4 lines
drawMask:
; A screen full of zeros before this
	.byte %00000000
	.byte %00000000
	.byte %00000000
	.byte %00000000
	.byte %11111111
	.byte %11111111
	.byte %11111111
	.byte %11111111
	.byte %00000000
	.byte %00000000
	.byte %00000000
	.byte %00000000
; A screen full of zeros after this

I think it is something more like this (correct me if I am misunderstanding you, Omegamatrix):

Graphics_1:
	.byte %11111111
	.byte %10000001
	.byte %10000001    ; GFX_PTR would point here to draw Graphics_2
	.byte %11111111
Graphics_2:
	.byte %11111111
	.byte %00011000
	.byte %00011000
	.byte %11111111
Graphics_3:
	.byte %10000001
	.byte %11111111
	.byte %11111111
	.byte %10000001

Mask:
	.byte %00000000
	.byte %00000000
	.byte %00000000    ; MSK_PTR would point here
	.byte %00000000
	.byte %11111111
	.byte %11111111
	.byte %11111111
	.byte %11111111
	.byte %00000000
	.byte %00000000
	.byte %00000000
	.byte %00000000

In order to draw a sprite, you prepare two pointers, GFX_PTR for the graphics and MSK_PTR for the mask.

 

As an example, let's say you want to draw one of the 4-scanline sprites positioned in the middle of an 8-scanline kernel band. (2 blank rows, 4 sprite rows, 2 blank rows.) MSK_PTR would point to the position in the mask table 2 bytes before the sprite part of the mask starts (the 1's). GFX_PTR's value would depend on which 4-scanline sprite you wanted to draw, but it would always also point to 2 bytes before the desired sprite starts. Now the mask data lines up with the graphics data.

 

In the kernel, you simply loop through 8 lines, using the Y register as your counter. You load, and-mask, and write the graphics like this: (assuming the graphics are upside-down, so we count down from 7 until we pass 0)

	ldy #7

.Loop	lda (GFX_PTR),y    ; load the graphics
	and (MSK_PTR),y    ; use the mask to remove the graphics that are not part of the sprite
	sta WSYNC
	sta GRP0           ; write the masked data to the graphics register

	dey
	bpl .Loop          ; loop from 7-0

If you wanted another sprite that was shorter than the mask, you would have to pad the sprite graphics with zeroes to make it the same height as the mask. In other words, you would just add some blank lines until the sprite was the correct height. You would not need to change the mask in any way.

 

Edit: I would definitely recommend doing it this way rather than the way I described, unless you actually need to save the cycles.
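
The pointer setup for the masked kernel (both pointers backed up by the same number of bytes so that mask and graphics stay aligned) might look something like the sketch below. It is only an illustration: SpriteOff, SetupBand, MaskOnes, the RAM addresses and the table placement are all assumptions, with SpriteOff being the number of blank lines below the sprite in the 8-line band (0 to 4).

```asm
	processor 6502

; Hypothetical RAM locations
SpriteOff	= $80		; blank lines below the sprite in the band (0..4)
GFX_PTR		= $81		; 2 bytes
MSK_PTR		= $83		; 2 bytes

	org $F000

SetupBand:			; run before the band's kernel loop
	lda #<Graphics_2	; sprite to draw in this band
	sec
	sbc SpriteOff		; back up so the sprite starts at y = SpriteOff
	sta GFX_PTR
	lda #>Graphics_2
	sta GFX_PTR+1

	lda #<MaskOnes		; first $FF byte of the mask
	sec
	sbc SpriteOff		; same shift keeps mask and graphics aligned
	sta MSK_PTR
	lda #>MaskOnes
	sta MSK_PTR+1
	rts

	org $F100
	.byte 0,0,0,0		; pad so Graphics_1 minus SpriteOff cannot borrow
Graphics_1:
	.byte %11111111,%10000001,%10000001,%11111111
Graphics_2:
	.byte %11111111,%00011000,%00011000,%11111111
Graphics_3:
	.byte %10000001,%11111111,%11111111,%10000001
Mask:
	.byte %00000000,%00000000,%00000000,%00000000
MaskOnes:
	.byte %11111111,%11111111,%11111111,%11111111
	.byte %00000000,%00000000,%00000000,%00000000
```

With SpriteOff = 2 this gives GFX_PTR = Graphics_1+2 and MSK_PTR = Mask+2, the two positions marked in the tables above.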

Edited by JeremiahK