
Why are my shape tables flickering?


Tyrop


I am trying to program a Centipede game in assembly language. I am doing it in Antic F, using the technique from Chapter 8 (Raster Graphics) of the book Atari Graphics and Arcade Game Design, which draws, and then erases, a shape table to screen memory. Basically, I am using the same shape table for each of 7 centipede segments (the shape table consists of 8 separate versions of the centipede segment, each one shifted over one bit from the previous one). The actual shape itself is small: only 8 pixels x 8 pixels, the size of a keyboard character (because of the 8 shifted patterns, it is 2 bytes wide by 8 bytes high). I copy the shape, byte by byte, to screen memory (OR'ing it with screen memory). Then I erase it, byte by byte, by copying the shape to screen memory again and EOR'ing it with the bytes that are already there.

What I do is copy the shape (centipede segment) 7 times so that you see 7 segments next to each other in a straight line. Then I need a delay so that the centipede is visible longer than it is invisible. For the delay, I have the computer keep looping until a flag is set in the vertical blank. Then it immediately un-draws all 7 segments and draws them all again one pixel to the left, and so on. So basically, it draws the centipede as fast as it can, waits for the next vertical blank interrupt, immediately erases, and then draws at the next position (so the centipede advances to the left one pixel per 1/60th of a second, or 60 pixels per second). I figure that the time it waits for the VBI is very long compared to the time it takes to erase and re-draw.

When I have it drawing one centipede segment, it is nice and smooth. But with 7, it flickers. There must be a better technique. How does a game like Drol, with so much larger shapes, draw, erase, and re-draw without flicker?
I was looking at the code for soft-sprites and cannot follow it at all and I am not sure what the author meant by "masking." How do people learn this stuff? Is there a book that teaches how to smoothly move shape tables?

Edited by Tyrop

If you EOR with what's already there, then you're going to generate your own flicker in a way, since, as you say, you're putting 7 segments next to each other. If they happen to overlap, then of course in Mode F, 1 EOR 1 = 0.
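A small illustration of that overlap problem, as a sketch with made-up bit patterns (the two values are not taken from the actual shape table):

```asm
; Screen byte after the first segment was drawn:   %00111100 = $3C
; Second segment overlaps it by two pixels:        %00001111 = $0F
        LDA #$3C        ; what is already on screen
        EOR #$0F        ; EOR the second segment on top
                        ; A = $33 = %00110011
                        ; the two shared pixels were on in both, so they are now off
```

Where the lit pixels of two shapes never share a bit, EOR behaves like OR and nothing is lost.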

 

If it's the case that your segments don't overlap, then I'd guess that either/both the routine is taking too long or it's sometimes erasing when that part of the screen is being drawn.

 

You might be better off having a DLI near the bottom of visible screen instead that triggers the flag you wait on before performing the erase - or just a wait loop which uses VCOUNT to reach a certain value.
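A minimal sketch of the VCOUNT wait loop idea. VCOUNT is ANTIC's line counter at $D40B and holds the current scan line divided by 2; the compare value below is an assumption chosen to land just past the bottom of a standard 192-line display and would need tuning for your display list.

```asm
VCOUNT  = $D40B         ; ANTIC vertical line counter (scan line / 2)

WAITBOT LDA VCOUNT
        CMP #112        ; assumed value, about scan line 224: tune for your screen
        BNE WAITBOT     ; spin until the beam gets there
                        ; ...now erase and redraw while the beam is off the playfield
```

Each VCOUNT value lasts two full scan lines, which is far longer than one pass of this loop, so the exact-match test should not skip past the target.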

 

That aside, if you insist on using EOR for softsprites, the usual erase technique is to just write the sprites to screen again, in reverse order that you initially did them.

 

 

ed: a good technique to work out what's going on is to change the background colour in your program. e.g. leave it at 00 in the shadow register. Change it to 02 when erasing, 04 when drawing, back to 00 when complete.
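That colour trick might look something like this. ERASEALL and DRAWALL are hypothetical names standing in for your own routines. Writing the hardware register COLBK directly leaves the shadow register alone, so the OS restores the normal colour on the next vertical blank; in Antic F this register shows up as the border colour.

```asm
COLBK   = $D01A         ; GTIA background/border colour (hardware register)

        LDA #$02
        STA COLBK       ; colour band 1 starts: erase pass running
        JSR ERASEALL    ; hypothetical: your erase routine
        LDA #$04
        STA COLBK       ; colour band 2 starts: draw pass running
        JSR DRAWALL     ; hypothetical: your draw routine
        LDA #$00
        STA COLBK       ; back to normal: work finished
```

The height and position of the coloured bands on screen then show how long each pass takes and where the beam is while it runs.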

Edited by Rybags

If you EOR with what's already there, then you're going to generate your own flicker in a way, since, as you say, you're putting 7 segments next to each other. If they happen to overlap, then of course in Mode F, 1 EOR 1 = 0.

 

The shapes do overlap but not the actual graphics parts. Hopefully, I can put this into words :) The first segment is drawn, and depending on its horizontal location, there will be blank bits on the left and/or the right of the graphic. Then, the second segment is drawn behind it (to the right of the first segment), and it also will have a blank area on its left and/or right side. So there is overlap, but only where one of the shapes has blank bits, but I OR it so the graphic parts do not get overwritten. The result is a centipede that looks like the arcade: there is one blank pixel separating each of the segments from each other. But when I erase, I do it by EOR'ing the first shape with itself, then the second with itself, etc. I will try your other suggestions, and maybe a longer pause between draw and un-draw - maybe 3 VBI's just to see if that does the trick (even though 3 VBI's = a pretty slow centipede!)

 

Is there a better way to move shapes around the screen than 1) draw shapes; 2) small pause; 3) completely erase shapes; 4) immediately re-draw shapes at new location; repeat... ?


a first quick look at the code... and the binary... as your centipede gets further down towards the bottom, where more segments are visible, i would assume that your draw routine is taking too much time... for soft sprites that's always an indicator that the draw routine is slower than the raster beam... so you get hit by the raster beam while you are erasing the stuff...

 

and while you are debugging timing issues, look first at the code that gets called many times... f.e.

; SUBROUTINE XDRAW
;-------------------------
;
; COPY TO SCREEN
;
	 PHXY
	 LDY Y
	 STY VERT
	 JSR GETADR
	 LDA HEIGHT
	 STA TEMPHEIGHT
	 LDA WIDTH
	 STA TEMPWIDTH
	 LDX #0
	 LDY #0
?AGAIN
	 LDA SHPL
	 STA TEMPSHPL
	 LDA SHPH
	 STA TEMPSHPH
?AGAIN2
	 LDA (SHPL,X)
	 EOR (POSITIONL),Y
	 STA (POSITIONL),Y
	 INY 
	 DEC TEMPWIDTH
	 BEQ ?DOWN1
; NEXT BYTE IS 8 BYTES DOWN
	 CLC 
	 LDA SHPL
	 ADC #8
	 BCC ?CONT2
	 INC SHPH
?CONT2
	 STA SHPL
	 JMP ?AGAIN2
?DOWN1

 

needs to be optimised and can be made much faster... ;)


Heaven, thanks for looking at my program. I looked at that thread and I viewed some of the source codes in some of those attachments. I wish I could follow what the programmers are doing. It is so hard for me to trace through assembly language listings. Can you describe the concept of what game programmers generally do to draw and move sprites, or point me to a book? How did you learn? I don't know what "buffering" or "double buffering" or "masking" is. I figure I should just draw and erase and re-draw. I guess I just need to know what to do conceptually, and then I will try to figure out how to translate the concept into code. Also, the draw routine I used is almost identical to the one in the book, Atari Graphics and Arcade Game Design, so I figured it was fairly optimized, but you say it can be optimized even more?

 

Thank you to you and Rybags and anyone else that can give me some pointers.

 

I would love to learn this stuff and it is something I've wanted to do since the early 80's. I've read De Re Atari and I've looked through the books that are on atariarchives.org. I am interested in learning to move sprites (as opposed to using PM graphics) because I like the high resolution of Antic F (even without color) or the fairly high resolution and multicolor ability of Antic E. My goal is to see how close I can get the Atari to look like the arcade version of centipede (resolution-wise, not color wise). Centipede 5200 impressed me by its smoothness but it lacks the resolution (Antic E?), and I realize that it is probably as close as you can get with color (and, I have no idea how that programmer drew and moved the centipede). I think the actual arcade version is in a mode that is essentially 320x200 or close to it.

Edited by Tyrop

Double-buffering is just "page flipping" - using 2 screen regions, so you can draw on one without worrying about tearing or flicker, while the other one is displayed.
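A rough sketch of the flip itself, under several assumptions that are not from this thread: DLLMS is the address of the LMS instruction in your display list, SCREENA and SCREENB are two buffers whose low address bytes are equal, and DRAWHI is a zero-page byte holding the high byte of the buffer you have just finished drawing into. All four names and addresses are hypothetical.

```asm
SCREENA = $4000         ; hypothetical buffer addresses
SCREENB = $6000
DLLMS   = $3F03         ; hypothetical: address of the LMS instruction
DRAWHI  = $CB           ; zero page: high byte of the draw buffer

; Call once per frame, ideally during vertical blank.
FLIP    LDA DRAWHI      ; buffer we just finished drawing
        STA DLLMS+2     ; point the LMS high byte at it: it is now displayed
        CMP #>SCREENA
        BEQ USEB
        LDA #>SCREENA   ; we were showing B, so draw into A next
        BNE STORE       ; always taken (a screen page is never $00)
USEB    LDA #>SCREENB   ; we were showing A, so draw into B next
STORE   STA DRAWHI
        RTS
```

A full Antic F screen crosses a 4K boundary, so its display list normally contains a second LMS that would need the same treatment. Also note that each buffer still holds the sprites from two frames ago, so the erase pass has to remember the old positions per buffer.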

 

There are various levels of optimisation with soft-sprites.

 

1. None - the program has to rotate sprite and mask data before overlaying it to the screen. Very slow, but also very conservative on memory use.

2. Program uses pre-shifted sprite and mask data. Much quicker but uses 4-8 times the memory to store sprite definitions.

3. Unrolled loops. e.g. the strip sprites routine mentioned earlier, or a variation of conventional techniques but with loops unrolled. Faster, but more memory hungry for the program code.

4. 100% Hard-coded, unrolled loops. Massive increase in memory usage but also massive increase in speed. Each sprite has dedicated section of program which directly performs load/mask/or/store operations with Immediate addressing instructions used.

 

Add to that - variations of each if you're using the EOR mode.
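For anyone wondering what the load/mask/or/store sequence looks like, here is a minimal sketch for one byte. The names are hypothetical: SCR is a zero-page pointer to the screen row, SHAPE holds the pre-shifted image bytes, and MASK holds matching bytes with 0 bits where the sprite covers the background and 1 bits everywhere else.

```asm
        LDA (SCR),Y     ; load what is on screen now
        AND MASK,X      ; mask: clear the pixels the sprite will cover
        ORA SHAPE,X     ; or: drop the sprite's pixels into the hole
        STA (SCR),Y     ; store the combined byte back
```

Unlike EOR, drawing this way a second time does not undo it, so erasing means restoring saved background bytes or redrawing the background.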


ok. what about small "standard" ones always useful for beginners... ;)

 

first...try to fit often used variables into zero page... i am not familiar with your assembler, but it seems these are not in zero page:

 

VERT .DS 1
VERTL .DS 1
VERTH .DS 1
HORIZ .DS 1
TEMPHEIGHT .DS 1
TEMPWIDTH .DS 1
TEMPSHPL .DS 1
TEMPSHPH .DS 1
TIMER .DS 1
VBFLAG .DS 1
WIDTH .DS 1
HEIGHT .DS 1

 

and some of these you are using in the draw routine... remember that the 6502 is faster when dealing with zero page.

 

then... maybe avoid pha,pla for saving registers on the stack. instead use (again) zero page locations, f.e. temp_a, temp_x, temp_y

 

so instead of

 

txa
pha
tya
pha
...

you will just do a simple

 

stx temp_x
sty temp_y
...

 

and later you do a simple

 

ldx temp_x
ldy temp_y
rts

 

much faster...
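To put numbers on "much faster", here is the same save/restore both ways with standard 6502 cycle counts in the comments. TEMP_X and TEMP_Y are assumed to be zero-page bytes.

```asm
; Stack version
        TXA             ; 2 cycles
        PHA             ; 3
        TYA             ; 2
        PHA             ; 3
        ; ... routine body ...
        PLA             ; 4
        TAY             ; 2
        PLA             ; 4
        TAX             ; 2   -> 22 cycles of overhead per call

; Zero-page version
        STX TEMP_X      ; 3
        STY TEMP_Y      ; 3
        ; ... routine body ...
        LDX TEMP_X      ; 3
        LDY TEMP_Y      ; 3   -> 12 cycles of overhead per call
```

The trade-off is that the zero-page version is not re-entrant: if an interrupt routine calls the same subroutine, it needs its own temporaries.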

 

then...

 

;-----------------------------
; SUBROUTINE GETADR
;
; STORES LEFT MOST SCRN ADDRESS
; OF ROW+HORIZ OFFSET AT
; POSITIONL, POSITIONH
; PARAMETERS ARE YREG
; =ROW AND HORIZ=COLUMN
;------------------------------
GETADR
LDA ROWL,Y
CLC
ADC HORIZ
STA POSITIONL
LDA ROWH,Y
ADC # >SCREENMEM
STA POSITIONH
RTS

 

as you already know where your screen ram is.... why not build the lookup table with the screen base "addition" already included? f.e.

 

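tab_init:
lda #<screenmem ;start with the address of row 0
sta temp
lda #>screenmem
sta temp+1
ldx #0
tabinit0: lda temp ;store full address of row x
sta rowl,x
lda temp+1
sta rowh,x
clc
lda temp
adc #$28 ;40 bytes per line
sta temp
lda temp+1
adc #0 ;carry into the high byte
sta temp+1
inx
cpx #160 ;number of rows in the table
bcc tabinit0
rts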

 

so your getadr will look like this:

 

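GETADR
LDA ROWL,Y ; low byte, screen base already included
CLC
ADC HORIZ ; the column offset still has to be added
STA POSITIONL
LDA ROWH,Y ; high byte, screen base already included
ADC #0 ; pick up the carry from the low byte
STA POSITIONH
RTS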

...

 

and these are only general optimisations, not to mention the "based on your game" ones that Rybags mentioned... ;)

Edited by Heaven/TQA

now let's have a look how centipede 5200 works (http://atari.fandal.cz/detail.php?files_id=1591)

 

screen is located at $1150 onwards (load the game in Atari800win, hit f8 to get into the monitor and type dlist to have a look)

 

now let's search for a code like LDA (),y EOR (),Y or similar... and voila look at $7477

 

lda $3090,x ;here must be something like sprite data for antic e
ldy #0 ;first byte
eor ($e0),y
sta ($e0),y
lda $3091,x ;second byte
iny
eor ($e0),y
sta ($e0),y
lda $3092,x
ldy #$28 ;!!! 40 = we are in next scanline
eor ($e0),y
sta ($e0),y
...

$7501 is the exit of this routine...

 

so i assume that for each pixel position there is a separate draw routine and the routine is unrolled for the amount of data it has to copy...

 

 

 

as you can see...not much overhead


I have been working on copying soft sprites on a graphics 15 screen myself. The VBI routine is also working along with multiplexing player/missile graphics. I am not sure what is meant by double buffering, but I know that self-modifying code using absolute indexed addressing, like lda $ffff,y / eor $ffff,y / sta $ffff,y (with the $ffff operands patched at run time), works a little faster than using zero-page indirect. That is along with counting the Y register down to 0 instead of counting up and comparing against a value to see if it's finished.

 

Another trick I have been experimenting with is calculating the start address of each line and store the low and hi byte in separate tables. This saves you from multiplying by 40 (or screen width) each time you call your drawing routine. Then all you have to do is index the row and add the byte column. Found out those multiply routines can eat up 150-200 clock cycles and limits you to about 20 sprites on the screen.


Thanks for the description. I think I am using unrolled loops. I am using a loop to copy the sprite to the screen and then another loop (very similar code) to copy sprite to the screen again to erase it using EOR.

 

Typically an "unrolled" loop is where you take the code that would be a loop (with a jmp or beq, bne back to the beginning) and remove the jump back.

 

For example:

 

For x=1 to 3 do some stuff:next x

 

to unroll it you would:

 

Do some stuff for x=1

Do some stuff for x=2

Do some stuff for x=3

 

This is going to cost you about 3 times (in my example) the memory for code, but it will be faster, since the compares, branches and jumps required to setup a loop are eliminated. Unrolled loops are used quite a bit in heavily optimized code.
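In 6502 terms, a sketch of the same idea, copying 3 bytes. SRC and DST are hypothetical labels assumed to be outside zero page, and the cycle counts assume no page crossings.

```asm
; Looped version
        LDX #2          ; 2 cycles
LOOP    LDA SRC,X       ; 4
        STA DST,X       ; 5
        DEX             ; 2
        BPL LOOP        ; 3 when taken, 2 on exit
                        ; total: 2 + 3*11 + (3+3+2) = 43 cycles

; Unrolled version
        LDA SRC         ; 4
        STA DST         ; 4
        LDA SRC+1       ; 4
        STA DST+1       ; 4
        LDA SRC+2       ; 4
        STA DST+2       ; 4
                        ; total: 24 cycles
```

The unrolled form also frees the X register, which matters in a draw routine that needs both index registers for other things.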

 

But I wouldn't start there if I were you. I'd take Heaven's suggestion and make sure all your most used variables are in zero page, and also use lookup tables for your screen position calculations. After that I'd start optimizing more by finding better ways (faster; fewer cycles) to do things in assembly, and lastly I'd unroll the loops. Concentrate on the code that is called over and over, especially screen drawing code.

Edited by Shawn Jefferson

Thank you Heaven for taking the time to look at my program and explain how to make it faster. I will digest the information and do some re-writing. Quick question, though. So far, the only thing I did was move my variables to page zero (starting at location $80), but now I am getting some stray pieces of my graphics scattered around the screen. The main program still works as it did (I have not yet optimized the draw routines). I am thinking that the problem has something to do with the fact that I now have two different program counter origins: one at *=$80 (for the zero page variables) and one at *=$2000 (for the main code). Now, when I binary load with DOS option L, it immediately runs the program, whereas before, I had to use option M to run at address $2000. Is there a way of telling the assembler what address the program is supposed to start at?

 

I am attaching the source code and executable, and again, I really appreciate the time you take to look at this, and if you don't have the time, I totally understand. The main program source code is CENT16.ASC and the executable is CENT16.65O.

cent16.zip


Some Assemblers allow specifying a run address at compile time (esp modern cross-platform ones).

 

With older ones, you can achieve the same by assembling the addresses into the run/init vectors at $2E0 (run) and $2E2 (init):

*=$2E0
.WORD RUNADR
.WORD INITADR

 

Init address is optional of course - you can code multiple Init routines if you really wanted, which get executed as the program loads.


here, a real centipede, inspired by your thread..

 

- 50 sprites of 6x6 pixels (2 feet per sprite ==> 100 feet :) )

- 50 fps in PAL (use PAL mode in the emulator), but not 60 in NTSC :(

- graphic mode 7, 32 bytes wide (more cycles available)

- correctly masked (for a one color background)

- double buffering

- pre rotated sprites

- just one frame of animation (I'm lazy)

- unrolled code and some other optimizations (lot of macros)

- compiled with MADS 1.7.5 (system_demo.m65 is the main file)

- boring sine table movement

 

greetings

 

(system_demo.obx is the executable!)

 

NRV

 

100pede.zip


That is truly impressive. What are you doing there for 3D, calculus? I wish I could trace through your source code, it is so hard for me to trace other people's assembly language code. Where can I learn the concept of what you are doing? Are there any books or other materials you can recommend?


I think I have done a fair job of optimizing my draw and erase routine. By the way, I have an NTSC system. I cannot understand why the centipede is not visible at the top of the screen and then the segments become visible, one by one, as it gets lower. If I run it in Atari800WinPlus in PAL 50hz, you see more of the centipede at the top of the screen. I know that the program is always drawing the entire centipede (it has 7 segments) because if I run it with a machine language monitor and slow down its execution, I can watch it draw and erase the entire thing at all times, so I think the invisible segments have something to do with the rate that the screen refreshes. Does that make sense? I use a wait loop after the draw routine and before the erase routine by waiting for the next VBI before it erases. If I delete the wait loop altogether, then I see the whole centipede but it flickers. Maybe the solution is to do page flipping.

 

The executable is CENT22.65O.

Cent22.zip

Edited by Tyrop