
WIP Battle Pong- Questions, Comments, Snide Remarks Welcome



Hello!

 

post-44582-0-81932200-1471115620_thumb.jpg

 

Here's a game I've been working on for a couple of months, based on SpiceWare's Collect Mini.

 

My plan from the beginning has been to get the kernel working as fully as possible before focusing on game logic. I'm almost at that point but think I could probably use a little help.

 

Issue #1- It's getting kinda wobbly. I'm not so concerned with what happens with the players when they go out of bounds since I'll prevent that from happening at some point. I'm mostly concerned with all of the jittering and the strange things that happen to the coins at the bottom of the screen.

 

I want to get it cleaned up a bit before attempting to work in Omegamatrix's score routine.

 

If anyone could help I would definitely appreciate it! To be clear though, all input will be appreciated, not just kernel stuff. Thanks!

 

 

 

 

 

BP1.asm

BP1.bin


I worked on this a little more today but haven't figured it out yet. I tidied up the part of the kernel that has the biggest issue to make it easier to read and attached the asm & bin here. I also changed the code so the right player is now controllable. If you move him vertically you can get a much better sense of the issue. The problem occurs when player 1 (on the right) passes the ball and the coins. After I cleaned up my cycle counts I didn't see where timing would be an issue. Though after doing this, I'm questioning whether my DoDrawBall and DoDrawMissile routines are correct. The timing is different than it is for the players and I'm not sure .byte is skipping over the correct amount of code on those.

 

In addition to the attached file, here is a paste (Also, how do you all make such nice pastes? ):

 

ArenaLoop3: ; ? - worse case time to get here
sta WSYNC ; 3 ?
;---------------------------- start of line 1 of the 2LK
lda #PLAYER0_HEIGHT-1 ; 2 2 - height of the player graphics,
dcp Player0Draw ; 5 7 - Decrement Player0Draw and compare with height
bcs CDoDrawGrp0 ; 2 9 - (3 10) if Carry is Set then player0 is on current scanline
lda #0 ; 2 11 - otherwise use 0 to turn off player0
.byte $2C ; 4 15 - $2C = BIT with absolute addressing, trick that
; causes the lda (Player0Ptr),y to be skipped
CDoDrawGrp0: ; 10 - from bcs DoDrawGRP0
lda (Player0Ptr),y ; 5 15 - load the shape for player0
sta GRP0 ; 3 18
lda (PlayerColorPtr),y ; 5 23
sta COLUP0 ; 3 26

Ball:
lda #BALL_HEIGHT-1 ; 2 28
dcp BallDraw ; 5 33
bcs DoDrawBall ; 2 35
lda #0 ; 2 37
.byte $2C ; 4 41

DoDrawBall: ; 36 - from bcs DoDrawBall
lda #2 ; 2 38
sta ENABL ; 3 41
sta WSYNC ; 3 44

;---------------------------- start of line 2 of the 2LK

lda #PLAYER1_HEIGHT-1 ; 2 2 - height of the player 1 graphics, subtract 1 due to starting with 0
dcp Player1Draw ; 5 7 - Decrement Player1Draw and compare with height
bcs CDoDrawGrp1 ; 2 9 - (3 10) if Carry is Set, then player1 is on current scanline
lda #0 ; 2 11 - otherwise use 0 to turn off player1
.byte $2C ; 4 15 - $2C = BIT with absolute addressing, trick that
; causes the lda (Player1Ptr),y to be skipped
CDoDrawGrp1: ; 10 - from bcs DoDrawGrp1
lda (Player1Ptr),y ; 5 15 - load the shape for player1
sta GRP1 ; 3 18
lda (Player1ColorPtr),y ; 5 23
sta COLUP1 ; 3 26

Missile0:
lda #MISSILE_HEIGHT-1 ; 2 28
dcp Missile0Draw ; 5 33
bcs DoDrawMis0 ; 2 35
lda #0 ; 2 37
.byte $2C ; 4 41

DoDrawMis0: ; 36 - from bcs DoDrawMis0
lda #2 ; 2 38
sta ENAM0 ; 3 41

Missile1:
lda #MISSILE_HEIGHT-1 ; 2 43
dcp Missile1Draw ; 5 48
bcs DoDrawMis1 ; 2 50
lda #0 ; 2 52
.byte $2C ; 4 56

DoDrawMis1: ; 51 - from bcs DoDrawMis1
lda #2 ; 2 53
sta ENAM1 ; 3 56

dey ; 2 58 - update loop counter
cpy #20 ; 2 60
bne ArenaLoop3 ; 2 62 - 3 if taken
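
A rough way to sanity-check the .byte $2C question is to add up both paths through a DoDraw block. The sketch below is only a model, assuming standard NMOS 6502 timings with no page crossings; the names in it are made up for illustration. Both `lda (ptr),y` and `lda #2` are two bytes long, so the BIT opcode swallows exactly one instruction in either case, but the two paths only take the same time when the hidden instruction costs 5 cycles.

```python
# Cycle costs for the instructions used in the DoDraw blocks (NMOS 6502,
# assuming no page crossings).
CYCLES = {
    "bcs_taken": 3,
    "bcs_not_taken": 2,
    "lda_imm": 2,      # lda #n
    "bit_abs": 4,      # .byte $2C treats the next two bytes as its operand
    "lda_ind_y": 5,    # lda (zp),y
}

def dodraw_paths(hidden_load):
    """Return (cycles when the object is drawn, cycles when it is skipped).

    hidden_load is the instruction that sits behind the $2C byte.
    """
    drawn = CYCLES["bcs_taken"] + CYCLES[hidden_load]
    skipped = CYCLES["bcs_not_taken"] + CYCLES["lda_imm"] + CYCLES["bit_abs"]
    return drawn, skipped

player = dodraw_paths("lda_ind_y")  # lda (Player0Ptr),y
ball = dodraw_paths("lda_imm")      # lda #2
print(player, ball)
```

Under those assumptions the player block is balanced, while the ball and missile blocks take 3 cycles longer when the object is not drawn, so the cycle comments after them describe only the shorter path.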

BP1e.asm

BP1.bin


Use Code to format your code:

 

post-7074-0-09824800-1471207709.png

 

You have a timing problem where you are updating the color to the playfield too late.

 

post-7074-0-98762800-1471207947_thumb.png

 

To really see the problem, step through one of the lines where the yellow is bleeding onto the coins in Stella's debugger. Here you can see we are at the start of the instruction that writes to COLUPF. At this time the playfield is still yellow ($1C), which is the color of the border. In Stella's debugger you can see a white dot moving across the left window while you are pressing the "step" button. This dot shows you where the beam currently is. In the picture we can see the beam is close to where the coins will be drawn.

 

post-7074-0-19528200-1471209004_thumb.png

 

Now this is after the instruction has executed. We see it has updated the playfield to a red ($46) color, but the color write came too late and it was updated in the middle of drawing a coin.

 

post-7074-0-91748400-1471209016_thumb.png

 

Also, if you look at the timing in Stella's debugger of "lda CoinColor,x" you will see that it takes four cycles, but in your code you have it listed as three:

		ldx #3

ArenaLoop4:
        sta WSYNC           	; 3 0
;---------------------------- start of line 1 of the 2LK 

DoDrawPills:	 
		lda (PillPtr),y			; 5 5
		sta PF2					; 3 8
        
        lda #PLAYER0_HEIGHT-1 	; 2 10 - height of the player graphics, 
        dcp Player0Draw     	; 5 15 - Decrement Player0Draw and compare with height
        bcs DDoDrawGrp0      	; 2 17 - (3 10) if Carry is Set then player0 is on current scanline
        lda #0              	; 2 19 - otherwise use 0 to turn off player0
        .byte $2C           	; 4 23 - $2C = BIT with absolute addressing, trick that
                            	; 	causes the lda (Player0Ptr),y to be skipped
DDoDrawGrp0:                 	;   18 - from bcs DoDrawGRP0
        lda (Player0Ptr),y  	; 5 23 - load the shape for player0
        sta GRP0            	; 3 26
        lda (PlayerColorPtr),y	; 5  31
        sta COLUP0				; 3  34
        
        lda CoinColor,x			; 3 37             <------- should be 4 cycles
        sta COLUPF				; 3 40       

Still, knowing this doesn't fix the problem. The write to COLUPF simply must happen sooner. This is where you need to start rearranging the kernel.

 

 

To fix the problem you need to write the color at least 1-2 cycles earlier. I recommend moving an instruction before the WSYNC for ArenaLoop4. As a word of caution, watch out for your scanlines suddenly increasing after this change. If they do then you have probably now taken too many cycles on the previous line and you will have to move code around to fix that.
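
As a quick check of how much that buys, here is a small tally of the cycles from the WSYNC to the end of the COLUPF write. It is only a sketch: it assumes no page crossings and follows the path where player 0 is not drawn (the other path costs the same here).

```python
# (instruction, cycles) in program order after the WSYNC in ArenaLoop4.
LINE1 = [
    ("lda (PillPtr),y", 5),
    ("sta PF2", 3),
    ("lda #PLAYER0_HEIGHT-1", 2),
    ("dcp Player0Draw", 5),
    ("bcs (not taken)", 2),
    ("lda #0", 2),
    (".byte $2C", 4),
    ("sta GRP0", 3),
    ("lda (PlayerColorPtr),y", 5),
    ("sta COLUP0", 3),
    ("lda CoinColor,x", 4),
    ("sta COLUPF", 3),
]

def end_cycle(program, upto):
    """Cycle at which the named instruction finishes."""
    total = 0
    for name, cycles in program:
        total += cycles
        if name == upto:
            return total
    raise ValueError(upto)

before = end_cycle(LINE1, "sta COLUPF")
# Moving the 2-cycle lda # in front of the WSYNC takes it out of this line.
moved = [step for step in LINE1 if step[0] != "lda #PLAYER0_HEIGHT-1"]
after = end_cycle(moved, "sta COLUPF")
print(before, after)
```

With these numbers the write finishes at cycle 41 before the change and at cycle 39 after it.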

 

Try something like this. I moved the graphics write for PF2 because LDA (zp),Y takes 5 cycles, and LDA #imm takes 2 cycles. We just need two cycles.

		ldx #3
ArenaLoop4:

DoDrawPills:	 
        lda #PLAYER0_HEIGHT-1 	; 2    - height of the player graphics,       <------ MOVED BEFORE WSYNC to gain 2 cycles
        sta WSYNC           	; 3 0
;---------------------------- start of line 1 of the 2LK 

        dcp Player0Draw     	; 5 5 - Decrement Player0Draw and compare with height
        bcs DDoDrawGrp0      	; 2 7 - (3 8) if Carry is Set then player0 is on current scanline
        lda #0              	; 2 9 - otherwise use 0 to turn off player0
        .byte $2C           	; 4 13 - $2C = BIT with absolute addressing, trick that
                            	; 	causes the lda (Player0Ptr),y to be skipped
DDoDrawGrp0:                 	;   8 - from bcs DDoDrawGrp0
        lda (Player0Ptr),y  	; 5 13 - load the shape for player0
        sta GRP0            	; 3 16
        lda (PlayerColorPtr),y	; 5 21
        sta COLUP0				; 3 24
        
		lda (PillPtr),y			; 5 29      <------ MOVED DOWN
		sta PF2					; 3 32
Edited by Omegamatrix

Thanks!

 

I've corrected the cycle count.. I did not realize that the debugger displayed them for you. I looked the counts up myself which was probably good practice for me anyway. A zero page,x takes an extra cycle over zero page..

 

I did see a timing issue with player 0 as well, but didn't mention it because the player 1 issue was more severe. Now that you mentioned it though, I see it's nearly identical. When player 0 crosses the ball or the coins, the time increases, just like with player 1. So the issue is even clearer now- why do my player DoDraws take longer when the player is being drawn with the ball or coins, and not the missiles? Or, why is the DoDraw taking longer at all? It's supposed to be consistent whether or not it's being drawn.

 

Here's the updated code snippet. JEdit makes me tab twice sometimes to make everything line up, but when I paste here you see those double tabs.


ArenaLoop3:                 	;   ? - worse case time to get here
        sta WSYNC           	; 3 ?
;---------------------------- start of line 1 of the 2LK 
        lda #PLAYER0_HEIGHT-1 	; 2  2 - height of the player graphics, 
        dcp Player0Draw     	; 5  7 - Decrement Player0Draw and compare with height
        bcs CDoDrawGrp0     	; 2  9 - (3 10) if Carry is Set then player0 is on current scanline
        lda #0              	; 2 11 - otherwise use 0 to turn off player0
        .byte $2C           	; 4 15 - $2C = BIT with absolute addressing, trick that
                            	;        causes the lda (Player0Ptr),y to be skipped
CDoDrawGrp0:                	;   10 - from bcs DoDrawGRP0
        lda (Player0Ptr),y  	; 5 15 - load the shape for player0
        sta GRP0            	; 3 18
        lda (PlayerColorPtr),y	; 5 23
        sta COLUP0				; 3 26
        
Ball:
		lda #BALL_HEIGHT-1		; 2 28
		dcp BallDraw			; 5 33
        bcs DoDrawBall			; 2 35
		lda #0					; 2 37
        .byte $2C				; 4 41
        
DoDrawBall:        				;   36 -  from bcs DoDrawBall
        lda #2					; 2 38
        sta ENABL				; 3 41
        sta WSYNC           	; 3 44
        
;---------------------------- start of line 2 of the 2LK  

        lda #PLAYER1_HEIGHT-1   ; 2 2 - height of the player 1 graphics, subtract 1 due to starting with 0
        dcp Player1Draw     	; 5 7 - Decrement Player1Draw and compare with height
        bcs CDoDrawGrp1     	; 2 9 - (3 10) if Carry is Set, then player1 is on current scanline
        lda #0              	; 2 11 - otherwise use 0 to turn off player1
        .byte $2C           	; 4 15 - $2C = BIT with absolute addressing, trick that
                            	;        causes the lda (Player1Ptr),y to be skipped
CDoDrawGrp1:                	;   10 - from bcs DoDrawGrp1
        lda (Player1Ptr),y  	; 5 15 - load the shape for player1
        sta GRP1            	; 3 18       
        lda (Player1ColorPtr),y	; 5 23
        sta COLUP1				; 3 26
    
Missile0:
		lda #MISSILE_HEIGHT-1	; 2 28
		dcp Missile0Draw		; 5 33
		bcs DoDrawMis0			; 2 35
		lda #0					; 2 37
		.byte $2C				; 4 41
        
DoDrawMis0:        				;   36 -  from bcs DoDrawMis0
        lda #2					; 2 38
        sta ENAM0				; 3 41
          
Missile1:
		lda #MISSILE_HEIGHT-1	; 2 43
		dcp Missile1Draw		; 5 48
        bcs DoDrawMis1			; 2 50
		lda #0					; 2 52
        .byte $2C				; 4 56
        
DoDrawMis1:        				;   51 -  from bcs DoDrawMis1
        lda #2					; 2 53
        sta ENAM1				; 3 56

        dey                 	; 2 58 - update loop counter
        cpy #20					; 2 60
        bne ArenaLoop3       	; 2 62 - 3 63 if taken    

       

To fix the problem you need to write the color at least 1-2 cycles earlier. I recommend moving an instruction before the WSYNC for ArenaLoop4. As a word of caution, watch out for your scanlines suddenly increasing after this change. If they do then you have probably now taken too many cycles on the previous line and you will have to move code around to fix that.

 

I tinkered around with the arrangement a little bit but feel I need to figure this out first:

 

Any idea why the player's position is affecting the timing of the color write? When Player 0 is above the coins, the glitch only appears when he is at the same elevation as the ball. All other times, the colors change in time.

 

I can't figure out why, first, the player position would affect that at all, and second, why the many WSYNCs wouldn't iron the issue out before or during the coin drawing.

 

When at other vertical positions, it writes in time. Why?

 

post-44582-0-91872100-1471395341_thumb.jpg

Edited by BNE Jeff

Do you know that the LDA (indirect),Y is 5 cycles, but sometimes 6? This is a problem in your code that is creating the color glitch when the P0 is the same line as the ball, and also is causing the screen to increase in length (more scanlines) when P1 goes into the coin area.
Okay so if you look here at the instruction set for the 6502, you can see the listing for LDA (indirect),Y:
Indirect,Y    LDA ($44),Y   $B1  2   5+

+ add 1 cycle if page boundary crossed
The importance of page boundaries here is that if you "cross them" it will take 6 cycles for that LDA (indirect),Y instead of 5.
A page is 256 bytes. To tell the page number take a look at an absolute address in hexadecimal:
$125A
In the above address, "2" is the page number. Here is another example:
$1C33
In the second example "C" is the page number. From both examples you can see the third digit from the right is the page number.
Now lets turn our attention to (indirect),Y addressing. At some point in the code you will have stored the base address for the pointer... something like this code:
    ; Set Player1ColorPtr to proper value for drawing player1
        lda #<(PlayerColor + COLOR_HEIGHT - 1)
        sec
        sbc ObjectY+1
        sta Player1ColorPtr
        lda #>(PlayerColor + COLOR_HEIGHT - 1)
        sbc #0
        sta Player1ColorPtr+1

The pointer takes 2 bytes in RAM. If the absolute address of the pointer was say $1CF8, then $1C is the high byte, and $F8 is the low byte of the address. If you look in Stella's debugger you would see the indirect address is actually stored in RAM as $F8 $1C (not as $1C $F8). Remember this so you don't get confused. This is called Little Endian Order. Low byte first, and high byte second.

 

 

Now for the page boundary crossing stuff I will give a table and explain. Remember (indirect),Y addressing is dependent on Y for the final address. You start with a base address (which is in RAM), then Y gets added to that address to make the target address. The target address is where you actually get your data from.

indirect address example for LDA (indirect),Y

 Base        Y       Final     Pages for Base Addr and
Address    value    Address      Final Addr the same?     Cycles
----------------------------------------------------------------
 $1CF8       0       $1CF8               Yes                5
 $1CF8       1       $1CF9               Yes                5
 $1CF8       2       $1CFA               Yes                5
 $1CF8       3       $1CFB               Yes                5
 $1CF8       4       $1CFC               Yes                5
 $1CF8       5       $1CFD               Yes                5
 $1CF8       6       $1CFE               Yes                5
 $1CF8       7       $1CFF               Yes                5
 $1CF8       8       $1D00               No                 6   <--- Page boundary crossed
 $1CF8       9       $1D01               No                 6   <--- Page boundary crossed
 $1CF8      10       $1D02               No                 6   <--- Page boundary crossed
 $1CF8      11       $1D03               No                 6   <--- Page boundary crossed
 $1CF8      12       $1D04               No                 6   <--- Page boundary crossed
 $1CF8      13       $1D05               No                 6   <--- Page boundary crossed
 $1CF8      14       $1D06               No                 6   <--- Page boundary crossed

These page boundary crossings can be a nuisance if you don't account for the extra cycle. In the case of your code it is causing all sorts of problems related to delaying other parts of the code so color updates come late, and sometimes scanlines get doubled due to writing to WSYNC at cycle 74 instead of cycle 73.
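
The rule in the table can be written down as a tiny model. This is just an illustrative sketch (the function name is made up), not anything from the game source:

```python
def lda_indirect_y_cycles(base, y):
    """Cycles for LDA (zp),Y, where base is the 16-bit address held in the pointer."""
    target = (base + y) & 0xFFFF
    crossed = (base & 0xFF00) != (target & 0xFF00)  # high byte changed?
    return 6 if crossed else 5

for y in (0, 7, 8, 14):
    print(y, hex(0x1CF8 + y), lda_indirect_y_cycles(0x1CF8, y))
```

Feeding it the pointer values you see in Stella's debugger for each Y in the kernel is one way to find which loads are picking up the extra cycle.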

 

 

To fix these page boundaries you either have to allow time for 5 or 6 cycles, or change the pointer and index so that you don't cross the page.

 

 

As a side note off topic, there are times you take advantage of page crossings to save a byte or two in the kernel instead of using nop's.

 


I didn't mention it, but there is also a real simple solution to fix these page boundary crossings.

 

Move your data so that it doesn't cross the page boundary. For example if the player graphics data starts at the beginning of the page and you are getting page boundary crossings because your pointer is 14 bytes before the beginning of the page, then inserting 14 bytes of something before your graphics will correct that.

 

Good choices to fill the space are other data tables that don't get used by the kernel or at least get used in a place where timing is not tight.
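
To work out how much filler is needed, something like the following can help. It is a sketch under the assumption that the lowest value the pointer ever takes and the largest Y used with it are known; the addresses are made-up examples.

```python
def padding_needed(lowest_base, max_y):
    """Bytes of filler to insert before the data so lowest_base + Y never leaves its page.

    lowest_base: smallest 16-bit value the pointer takes.
    max_y: largest Y index used with that pointer (0..255).
    """
    low = lowest_base & 0xFF
    if low + max_y <= 0xFF:
        return 0            # already stays within one page
    return 0x100 - low      # push the base up to the start of the next page

# Pointer sits 14 bytes before the start of page $F5, kernel uses Y up to 20.
print(padding_needed(0xF4F2, 20))
```

Shifting the data moves every pointer value by the same amount, so the highest pointer value should be checked the same way afterwards.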


Technically the first 2 digits are the page numbers, the pages would be $12 and $1C.

 

Zero Page RAM is known as such because the first 2 digits for the address are $00.

 

The stack on 6502 based systems is on page 1, it starts at $01FF and works its way down to $0100.

 

The memory in the 2600 is wired without complete decoding of the address lines, which causes the same physical RAM from $0080-$00FF to also be accessible at $0180-$01FF so it can be used as both zero page and the stack. The RAM is mirrored numerous times, such as $0480-$04FF, $0580-$05FF, $2080-$20FF, $2180-$21FF, etc.
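
The mirroring can be modelled in a few lines. This sketch assumes the usual 2600 decoding, where the 6507 only brings out A0-A12 and RAM is selected when A12 = 0, A9 = 0 and A7 = 1; the function name is made up.

```python
def ram_offset(addr):
    """Return the 0..127 offset into the 2600's RAM, or None if addr is not RAM."""
    addr &= 0x1FFF                   # A13-A15 are not connected on the 6507
    if addr & 0x1000:                # A12 set: cartridge space
        return None
    if addr & 0x0200:                # A9 set: RIOT I/O and timer
        return None
    if not (addr & 0x0080):          # A7 clear: TIA
        return None
    return addr & 0x7F

for a in (0x0080, 0x0180, 0x0480, 0x2080, 0x01FF, 0x0280):
    print(hex(a), ram_offset(a))
```

All of the mirrors listed above land on the same 128 bytes, which is why the stack at $01FF ends up in the same RAM as zero page $FF.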


Hmm...

 

I did have a slight awareness of page boundaries, and considered it here, but discounted it because I removed a couple of NOPs from the kernel and still had the problem. I also thought that crossing the page boundary meant that the opcode and the operand were on different pages. But you are saying that during the LDA itself, the index causes the crossing. Right?

 

So, as far as I can tell, my align 256 is keeping my graphics on a single page ($F5). And I moved some stuff around with no effect.

 

However, PillRAMGfx is stored in RAM. But I moved the defined space a bit with no effect.

 

Is using LSB, MSB with zero page / RAM doing this?

 

Am I on the right track?

Edited by BNE Jeff

Couldn't figure it out so I just fixed it by rearranging the kernel.

 

post-44582-0-38126100-1471684478_thumb.jpg


ArenaLoop4:
        sta WSYNC           	; 3 0
;---------------------------- start of line 1 of the 2LK 

DoDrawPills:	 
	;lda (PillPtr),y	; 5 Moved to the end of line 2
	;sta PF2		; 3 Moved to the end of line 2
        
        lda #PLAYER0_HEIGHT-1  	; 2 2
        dcp Player0Draw     	; 5 7 
        bcs DDoDrawGrp0       	; 2 9 
        lda #0                	; 2 11 
        .byte $2C           	; 4 15
                             	; 	 
DDoDrawGrp0:                 	;   10 
        lda (Player0Ptr),y   	; 5 15 
        sta GRP0            	; 3 18
        lda (PlayerColorPtr),y 	; 5 23
        sta COLUP0		; 3 26
        
        lda CoinColor,x		; 4 30
        sta COLUPF		; 3 33      
        
        nop 			; 2 35
        nop			; 2 37
        nop			; 2 39
        nop			; 2 41
        nop			; 2 43
        nop			; 2 45
        nop			; 2 47
        nop 			; 2 49
        nop			; 2 51
        nop			; 2 53
        nop			; 2 55
        nop			; 2 57	

        lda #PF_COLOR 		; 2 59
        sta COLUPF 		; 3 62
        dex			; 2 64
        sta WSYNC            	; 3 67
        
;---------------------------- start of line 2 of the 2LK 

        lda #PLAYER1_HEIGHT-1   ; 2  2
        dcp Player1Draw      	; 5  7  
        bcs DDoDrawGrp1      	; 2  9
        lda #0              	; 2 11  
        .byte $2C            	; 4 15 
                            	;  
DDoDrawGrp1:                 	;   10 
        lda (Player1Ptr),y  	; 5 15 
        sta GRP1            	; 3 18   
        lda (Player1ColorPtr),y	; 5 23
        sta COLUP1		; 3 26
        lda CoinColor,x		; 4 30
        sta COLUPF		; 3 33  
        
        nop			; 2 35
        nop			; 2 37
        nop			; 2 39
        nop			; 2 41
        nop			; 2 43
        
        dey			; 2 45 Moved here for timing
        dex			; 2 47
        txa			; 2 49
        and #%00000011		; 2 51
        tax			; 2 53
        lda #PF_COLOR		; 2 55
        sta COLUPF		; 3 58   

	lda (PillPtr),y		; 5 63 Moved here for timing
	sta PF2			; 3 66 Moved here for timing
		
        ;dey                 	;      Moved up for timing
        cpy #10			; 2 68
        bne ArenaLoop4       	; 2 70 - 3 71 if taken
        

BP1l.asm

BP1.bin

Edited by BNE Jeff

Hello!

 

I've been working hard on the game...

 

post-44582-0-84755300-1472247417_thumb.jpg

 

Successes:

 

The score board is now functional thanks to Omegamatrix.

 

I finally buried the player and ball positioning routines inside the top wall of the arena. TIA simply repeats the playfield pattern while the subroutines run.

 

Consequently, the screen is now back to 192 scan lines.

 

I added some collision detection so now the ball can be batted back and forth (for 1 point!) Also, you can shoot the ball if you don't feel like hitting it.

 

The missiles now come directly from the players. I cheated by not using a DoDraw for them. The missiles come out at the elbows because ENAM0 and ENAM1 look at the D1 bit of the player graphics. This saves lots of time in the KERNEL (Happy now, Gauauu?) This has the effect of giving the players guided missile capability too. Not sure if I like that. But as the players pick up more guns, it will be easy to make more bullets come out as well.

 

 

Issues:

 

I added a simple score routine- one point for hitting the ball. The score displays strange characters. It seems to be counting in hexadecimal. I tried to turn Binary Coded Decimal on, then off during the score math but it didn't work.

 

Starting to become concerned that I just won't be able to keep the left and right arena walls much longer due to timing issues.

 

Next:

 

I really want to get the Jumbotron fully functional but strange things happen each time I try. Right now it only displays the score.

BP2.bin

BP2b.asm

Edited by BNE Jeff

Looking good!

 

I took a quick look at your code and the problem you're having with Decimal Mode is this:

 

 

 

Believe it or not, only two instructions are affected by the D flag: ADC and SBC.

 

In other words, not the INC instruction.
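
To see the difference, here is a small model of the two instructions. It is only a sketch for valid packed-BCD inputs (each nibble 0-9), and the function names are invented for the example.

```python
def inc(value):
    """INC: plain binary increment, the D flag makes no difference."""
    return (value + 1) & 0xFF

def adc_decimal(a, operand, carry=0):
    """ADC with the D flag set. Returns (result, carry_out)."""
    lo = (a & 0x0F) + (operand & 0x0F) + carry
    hi = (a >> 4) + (operand >> 4)
    if lo > 9:              # low digit overflowed, carry into the high digit
        lo -= 10
        hi += 1
    carry_out = 0
    if hi > 9:              # high digit overflowed, carry out of the byte
        hi -= 10
        carry_out = 1
    return (hi << 4) | lo, carry_out

print(hex(inc(0x09)))                    # 0xa
print(hex(adc_decimal(0x09, 0x01)[0]))   # 0x10
```

The $0A that INC produces is what shows up as a strange character when the score routine uses each nibble as a digit index.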

 

I've updated it to add 1 point for player collisions and 10 points for missile collisions:

BP2b.asm

BP2b.bin


Looking good!

 

I took a quick look at your code and the problem you're having with Decimal Mode is this:

 

 

 

 

In other words, not the INC instruction.

 

I've updated it to add 1 point for player collisions and 10 points for missile collisions:

BP2b.asm

BP2b.bin

 

Thanks! Can't INC. Now I know...

 

I was wondering how I was going to carry the score to all 6 digits too:

    sed
    clc
    adc leftScore,y
    sta leftScore,y
    dey
    lda #0
    adc leftScore,y
    sta leftScore,y
    dey
    lda #0
    adc leftScore,y
    sta leftScore,y
    cld
    rts
    

Pretty straight forward. Just like carrying an LSB to an MSB. I guess this is pretty much the 24 bit version of that.
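
Here is the same carry chain as a quick model, mostly to convince myself of the byte order. It assumes Y points at the least significant score byte with the more significant bytes at lower addresses (which is how I read the two dey's), and the function names are made up.

```python
def bcd_adc(a, b, carry):
    """ADC in decimal mode for valid packed-BCD bytes. Returns (result, carry_out)."""
    total = (a >> 4) * 10 + (a & 0x0F) + (b >> 4) * 10 + (b & 0x0F) + carry
    carry_out, total = divmod(total, 100)
    return ((total // 10) << 4) | (total % 10), carry_out

def add_points(score, points):
    """score: three BCD bytes, most significant first. points: one BCD byte."""
    result = list(score)
    carry = 0                      # clc
    addend = points                # value in A on entry
    for i in (2, 1, 0):            # y, then dey, then dey again
        result[i], carry = bcd_adc(addend, result[i], carry)
        addend = 0                 # lda #0 so only the carry propagates
    return result

print([hex(b) for b in add_points([0x00, 0x99, 0x95], 0x10)])
```

Adding 10 points to 009995 carries through all three bytes and gives 010005.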

 

Thanks again!


OK, the Jumbo-tron is hooked up now. (BIN attached)

 

I had to branch to jumps to subroutines. Seems a little inefficient. Is there a better way?

	lda JumboState
	cmp #0
	beq StaticDisplayJump
	cmp #1
	beq TestPatternJump
						; All other values will default to Scoreboard
	jmp TwoScore
	
TestPatternJump:
		jsr TestPattern
		jmp Band2Prep
		
StaticDisplayJump:
		jsr StaticDisplay
		jmp Band2Prep


TwoScore:
	sta WSYNC
	sta WSYNC
	sta WSYNC
	
	TWO_SCORE_KERNEL	;***
	
	sta WSYNC
	sta WSYNC
	lda #PF_COLOR
	sta COLUPF
	; Set temporary ball size and reflect
	
		lda #%1111001
		sta CTRLPF
		sta VDELBL		;D0 to delay
		;sta REFP0		;D3 to reflect
		;sta REFP1		;D3 to reflect 	
	sta WSYNC
	
Band2Prep:
		ldy #90		; compensate for score loop

BP2.bin

Edited by BNE Jeff

 

Thanks! Can't INC. Now I know...

 

I was wondering how I was going to carry the score to all 6 digits too:

    sed
    clc
    adc leftScore,y
    sta leftScore,y
    dey
    lda #0
    adc leftScore,y
    sta leftScore,y
    dey
    lda #0
    adc leftScore,y
    sta leftScore,y
    cld
    rts
    

Pretty straight forward. Just like carrying an LSB to an MSB. I guess this is pretty much the 24 bit version of that.

 

Thanks again!

Glad to see you making progress. I would like to give some general tips here.

 

As you code more and more you'll learn when it better to use the Y register or X.

    adc leftScore,y
    sta leftScore,y

Each of these instructions takes 3 bytes even though the ram locations are in zero page. If you look at the instruction set you will see that zeropage,Y isn't available in this case, so DASM chooses absolute,Y instead.

 

However the X register is available for zeropage in this case, and that only takes 2 bytes. This will save you 6 bytes in your routine. It will also save you some cycles for the STA instruction.
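
The savings can be tallied from the instruction set tables. The sketch below assumes the standard 6502 byte and cycle counts for these four encodings and ignores page-crossing penalties.

```python
# (bytes, cycles) for the encodings involved.
ENCODINGS = {
    ("adc", "zp,x"): (2, 4),
    ("adc", "abs,y"): (3, 4),
    ("sta", "zp,x"): (2, 4),
    ("sta", "abs,y"): (3, 5),
}

def routine_cost(mode, pairs=3):
    """Total (bytes, cycles) for `pairs` adc/sta pairs in the given mode."""
    size = cycles = 0
    for op in ("adc", "sta"):
        op_bytes, op_cycles = ENCODINGS[(op, mode)]
        size += op_bytes * pairs
        cycles += op_cycles * pairs
    return size, cycles

y_size, y_cycles = routine_cost("abs,y")
x_size, x_cycles = routine_cost("zp,x")
print(y_size - x_size, y_cycles - x_cycles)
```

For the three adc/sta pairs in the routine that comes to 6 bytes and 3 cycles saved.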

 

Next you can skip the decrement instructions in your routine by hardcoding the subtraction before the comma, and DASM will take care of the rest. So your code looks like this now:

    sed
    clc
    adc leftScore,X
    sta leftScore,X
    lda #0
    adc leftScore-1,X   ; -1 instead of 1st DEX
    sta leftScore-1,X
    lda #0
    adc leftScore-2,X   ; -2 accounts for 2nd DEX
    sta leftScore-2,X
    cld
    rts

Take a look at the list file or open the code in Stella to see what the subtraction assembles to. :)


 

OK, the Jumbo-tron is hooked up now. (BIN attached)

 

I had to branch to jumps to subroutines. Seems a little inefficient. Is there a better way?

	lda JumboState
	cmp #0
	beq StaticDisplayJump
	cmp #1
	beq TestPatternJump
						; All other values will default to Scoreboard
	jmp TwoScore

You can skip the cmp #0 in this case, as lda JumboState will set/clear the zero flag depending on the value of JumboState.

 

So if you lda JumboState, and JumboState = 0, then BEQ will be taken as the zero flag is set.

 

If you lda JumboState, and JumboState = any value besides 0, then BEQ will not be taken as the zero flag is clear.
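
As a small model of that behaviour (a sketch only; the routine names are returned as strings purely for illustration):

```python
def lda(value):
    """Model LDA: returns (A, Z flag, N flag) for an 8-bit value."""
    a = value & 0xFF
    return a, a == 0, bool(a & 0x80)

def dispatch(jumbo_state):
    a, zero, _negative = lda(jumbo_state)   # lda JumboState
    if zero:                                # beq StaticDisplayJump, no cmp #0 needed
        return "StaticDisplay"
    if a == 1:                              # cmp #1 / beq TestPatternJump
        return "TestPattern"
    return "TwoScore"                       # all other values

print([dispatch(s) for s in (0, 1, 2)])
```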

 

 

 

If you end up doing many compares then it becomes inefficient. What can be done in this case is making your code a state machine, and doing indirect jumps into the state you are in. Basically you have tables of pointers indexed by the state, and that makes it easy to jump to the code. From my own game you can see I do that in some places. In the code below I am in the splash screen. I have 12 states at the moment, and I certainly do not want to do 12 compares to figure out what code I need to jump to.

;==============================================================================
;                         STATE MACHINE VBLANK CODE POINTERS
;==============================================================================

SplshVblankLoTab:
    .byte <SplshStVb_00_Balloons
    .byte <SplshStVb_01_Explosions
    .byte <SplshStVb_02_GntMorph
    .byte <SplshStVb_03_GntPause
    .byte <SplshStVb_04_GntFlail    ; Splash State 04 (Vblank region) Giant Arms Flailing
    .byte <SplshStVb_05_GntFall
    .byte <SplshStVb_06_GntImpact
    .byte <SplshStVb_07_GntBounce
    .byte <SplshStVb_08_GntCrossing
    .byte <SplshStVb_09_GntGone
    .byte <SplshStVb_10_WriteAtarivox
    .byte <SplshStVb_11_ReadAtarivox
SplshVblankHiTab:
    .byte >SplshStVb_00_Balloons
    .byte >SplshStVb_01_Explosions
    .byte >SplshStVb_02_GntMorph
    .byte >SplshStVb_03_GntPause
    .byte >SplshStVb_04_GntFlail
    .byte >SplshStVb_05_GntFall
    .byte >SplshStVb_06_GntImpact
    .byte >SplshStVb_07_GntBounce
    .byte >SplshStVb_08_GntCrossing
    .byte >SplshStVb_09_GntGone
    .byte >SplshStVb_10_WriteAtarivox
    .byte >SplshStVb_11_ReadAtarivox


;==============================================================================
;                          SPLASH SCREEN VBLANK
;==============================================================================

SplashMainLoop:
    lda    #$0E
.loopVsync:
    sta    WSYNC
;---------------------------------------
    sta    VSYNC
    lsr
    bne    .loopVsync
    lda    #SPLSH_TIME_VBLANK
    sta    TIM64T

    ldy    splsh_State
    lda    SplshVblankLoTab,Y
    sta    splsh_JumpInd
    lda    SplshVblankHiTab,Y
    sta    splsh_JumpInd+1
    ldy    giant_GiantSequence
    jmp.ind (splsh_JumpInd)   ; indirect jump into code segment
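
The table lookup itself boils down to the following. This is a sketch of the idea only; the handler addresses are made up and do not come from the game.

```python
def build_tables(addresses):
    """Split 16-bit code addresses into the low-byte and high-byte tables."""
    lo = [addr & 0xFF for addr in addresses]   # .byte <Label
    hi = [addr >> 8 for addr in addresses]     # .byte >Label
    return lo, hi

def jump_target(lo, hi, state):
    """ldy state / lda LoTab,Y / sta ptr / lda HiTab,Y / sta ptr+1 / jmp (ptr)"""
    return lo[state] | (hi[state] << 8)

handlers = [0xF123, 0xF2FF, 0xF300]            # hypothetical ROM addresses
lo, hi = build_tables(handlers)
print(hex(jump_target(lo, hi, 1)))
```

However many states there are, the lookup costs the same handful of instructions, which is the advantage over a chain of compares.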

That's what I get for doing something quick and dirty right before I was heading out :)

 

I know.. I didn't know what to think of it.. Disbelief, at first.. :grin:

 

So as far as I'm concerned, I didn't know any of that. I thought x and y were identical. Now I know to look out for that.

 

I'm pretty weak on DASM. I've seen people use things like:

adc leftScore-1
sta leftScore-1

but never really realized it replaces a counter.

 

I've read the state machine stuff a couple of times so far. Not getting it yet...

 

Thanks!

Edited by BNE Jeff

 

I know.. I didn't know what to think of it.. Disbelief, at first.. :grin:

 

So as far as I'm concerned, I didn't know any of that. I thought x and y were identical. Now I know to look out for that.

 

I'm pretty weak on DASM. I've seen people use things like:

adc leftScore-1
sta leftScore-1

but never really realized it replaces a counter.

 

I've read the state machine stuff a couple of times so far. Not getting it yet...

 

Thanks!

 

Basically, he's getting what the desired state is, which is presumably set from one of the other states, plopping that into Y, using it to grab the address of the desired code from the tables, plopping that into a 16-bit variable, and using jump indirect to go to the address in that variable. So the jump would happen, code would execute, which invariably would cause it to set the state to something else, repeat.

 

This is the pattern of a finite state machine:

 

* Get Desired State (this is the beginning of the FSM)

* Jump to code implementing desired state

* Execute code of desired state

* set next desired state

* Jump back to the beginning of the FSM, which repeats this whole thing, repeating indefinitely, or until there is an explicit end state defined which breaks this loop.
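
The steps above can be sketched in a few lines. This is a generic illustration with made-up state names, not the game's code:

```python
def run_fsm(handlers, state, max_steps=100):
    """Look up the handler for the current state, run it, take the state it returns, repeat.

    A handler returns the next state, or None for an explicit end state.
    """
    trace = []
    for _ in range(max_steps):
        trace.append(state)
        state = handlers[state]()   # "jump" to the code for this state
        if state is None:           # end state breaks the loop
            break
    return trace

# Hypothetical three-state splash sequence.
handlers = {
    "balloons": lambda: "explosions",
    "explosions": lambda: "done",
    "done": lambda: None,
}
print(run_fsm(handlers, "balloons"))
```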

 

-Thom


MODE           SYNTAX       HEX LEN TIM
Absolute      JMP $5597     $4C  3   3
Indirect      JMP ($5597)   $6C  3   5

Ahh..

 

OK, so there is something called jump indirect. I was certain that JMP could only move a half a page forward or backwards. Schooled twice in twelve hours on those instructions.. :) I really did read them when I started doing this, but some of it doesn't really sink in until you see it in context. State table makes sense now.

 

Though still, the state table needs to be kept close to any possible JMP. Correct?

 

 

 

You can skip the cmp #0 in this case, as lda JumboState will set/clear the zero flag depending on the value of JumboState.

 

So if you lda JumboState, and JumboState = 0, then BEQ will be taken as the zero flag is set.

 

If you lda JumboState, and JumboState = any value besides 0, then BEQ will not be taken as the zero flag is clear.

 

 

Speaking of context, when I first started, I thought BEQ, and BNE meant specifically "Branch if equal to zero" or "Branch if not equal to zero" because I usually saw it used the way you say here. But totally forgot it.

 

Thanks again!

Edited by BNE Jeff

I know.. I didn't know what to think of it.. Disbelief, at first.. :grin:

:lol:

 

So as far as I'm concerned, I didn't know any of that. I thought x and y were identical. Now I know to look out for that.

If you look at the opcodes, such as for ADC:

Affects Flags: S V Z C

MODE           SYNTAX       HEX LEN TIM
Immediate     ADC #$44      $69  2   2
Zero Page     ADC $44       $65  2   3
Zero Page,X   ADC $44,X     $75  2   4
Absolute      ADC $4400     $6D  3   4
Absolute,X    ADC $4400,X   $7D  3   4+
Absolute,Y    ADC $4400,Y   $79  3   4+
Indirect,X    ADC ($44,X)   $61  2   6
Indirect,Y    ADC ($44),Y   $71  2   5+


 

You'll notice there's a Zero Page,X but not a Zero Page,Y. dasm knows this, so it will use Absolute,Y instead. Likewise, Indirect,X and Indirect,Y look different in the SYNTAX column to help you remember that they work differently. I don't think I've ever used Indirect,X.

 

 

Compare the Disassembly in these two screenshots.

 

Using Y

post-3056-0-41356600-1472396634_thumb.png

 

Using X

post-3056-0-73086500-1472396644_thumb.png

 

You'll see that the one using Y is using 3 bytes per instruction vs 2 for the X (the hex values with the green background). You'll also see the instructions use different hex values (79 for Y, 75 for X) matching what you see in the opcode mode list for ADC above.
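
For anyone who can't see the screenshots, the difference boils down to something like this. It's a small sketch using the ADC opcodes from the table above and the leftScore address of $B5; the org address is an assumption:

```asm
    processor 6502
leftScore = $B5          ; zero-page address of leftScore

    org $F000
    adc leftScore,x      ; assembles to 75 B5     Zero Page,X  2 bytes, 4 cycles
    adc leftScore,y      ; assembles to 79 B5 00  Absolute,Y   3 bytes, 4+ cycles
```

Same source text apart from the index register, but the Y version costs an extra byte because ADC has no Zero Page,Y mode to fall back on.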

 

Also note that Using Y shows the instructions with .wy (for Word Address Indexed Y) after them. Additionally, the STA instruction for Absolute,Y takes longer than the one for Zero Page,X. That's shown by the ;5 and ;4 just to the left of the hex values. Looking at those ;# values in Stella is an easy way to verify the cycle times that you've been putting ( :thumbsup:) in your code.

 

I'm pretty weak on DASM. I've seen people use things like:

adc leftScore-1
sta leftScore-1
but never really realized that it replaces a counter.

 

 

That calculates the address at compile time rather than run time. Anytime you can do that is a good thing as it saves cycle time for your program, and there aren't many cycles to spare on the Atari.

 

If you look at the Using Y you'll see that leftScore is replaced with the address $00B5 (listed as B5 00 in the hex values because the 6507 is a little-endian CPU). For the Using X you'll see the leftScore-1 is replaced with B4, and leftScore-2 with B3. Each of those replacements saved 2 cycles of run time as they eliminated the decrement instructions.


I'm pretty weak on DASM. I've seen people use things like:

adc leftScore-1
sta leftScore-1

but never really realized that it replaces a counter.

It doesn't replace a counter per se.

 

Darrell explains it here:

That calculates the address at compile time rather than run time. Anytime you can do that is a good thing as it saves cycle time for your program, and there aren't many cycles to spare on the Atari.

 

If you look at the Using Y you'll see that leftScore is replaced with the address $00B5 (listed as B5 00 in the hex values because the 6507 is a little-endian CPU). For the Using X you'll see the leftScore-1 is replaced with B4, and leftScore-2 with B3. Each of those replacements saved 2 cycles of run time as they eliminated the decrement instructions.

I want this to be very clear. I will substitute it in the code:

;name          ram location
leftScore       $B5
leftScore-1     $B4
leftScore-2     $B3

;This:
    sed
    clc
    adc leftScore,X
    sta leftScore,X
    lda #0
    adc leftScore-1,X
    sta leftScore-1,X
    lda #0
    adc leftScore-2,X
    sta leftScore-2,X
    cld
    rts
    
;Essentially becomes this:
    sed
    clc
    adc $B5,X
    sta $B5,X
    lda #0
    adc $B4,X
    sta $B4,X
    lda #0
    adc $B3,X
    sta $B3,X
    cld
    rts

Handy what the compiler can do, no?

 

 

Now for jumps:

MODE           SYNTAX       HEX LEN TIM
Absolute      JMP $5597     $4C  3   3
Indirect      JMP ($5597)   $6C  3   5

Ahh..

 

OK, so there is something called jump indirect. I was certain that JMP could only move a half a page forward or backwards. Schooled twice in twelve hours on those instructions.. :) I really did read them when I started doing this, but some of it doesn't really sink in until you see it in context. State table makes sense now.

 

Though the state table still needs to be kept close to any possible JMP. Correct?

It's branches that can only move forward or backward half a page. Jumps can go anywhere in the current bank. The state table doesn't have to be close, as you are using absolute indexed addressing to look up the values.
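
When a branch target does end up too far away, the usual workaround is to invert the test and hop over a JMP. A minimal sketch, where the labels, the $80 location, and the 300 bytes of filler are assumptions made only to force the distance:

```asm
    processor 6502
    org $F000
Start:
    lda $80              ; hypothetical value being tested
    bne Skip             ; inverted test: skip the JMP when the value is non-zero
    jmp FarAway          ; JMP carries a full 16-bit address, so range is no issue
Skip:
    nop
    ds 300               ; filler that puts FarAway beyond branch range
FarAway:
    rts
```

This behaves like a "beq FarAway" that would otherwise fail to assemble because the target is out of range.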

 

 

To criticize my own code for indirect jumps... If I'm not currently in a subroutine then I should have stack ram to do something like this:

;==============================================================================
;                         STATE MACHINE VBLANK CODE POINTERS
;==============================================================================

SplshVblankLoTab:
    .byte <(SplshStVb_00_Balloons-1)
    .byte <(SplshStVb_01_Explosions-1)
    .byte <(SplshStVb_02_GntMorph-1)
    .byte <(SplshStVb_03_GntPause-1)
    .byte <(SplshStVb_04_GntFlail-1)   ; Splash State 04 (Vblank region) Giant Arms Flailing
    .byte <(SplshStVb_05_GntFall-1)
    .byte <(SplshStVb_06_GntImpact-1)
    .byte <(SplshStVb_07_GntBounce-1)
    .byte <(SplshStVb_08_GntCrossing-1)
    .byte <(SplshStVb_09_GntGone-1)
    .byte <(SplshStVb_10_WriteAtarivox-1)
    .byte <(SplshStVb_11_ReadAtarivox-1)
SplshVblankHiTab:
    .byte >(SplshStVb_00_Balloons-1)
    .byte >(SplshStVb_01_Explosions-1)
    .byte >(SplshStVb_02_GntMorph-1)
    .byte >(SplshStVb_03_GntPause-1)
    .byte >(SplshStVb_04_GntFlail-1)
    .byte >(SplshStVb_05_GntFall-1)
    .byte >(SplshStVb_06_GntImpact-1)
    .byte >(SplshStVb_07_GntBounce-1)
    .byte >(SplshStVb_08_GntCrossing-1)
    .byte >(SplshStVb_09_GntGone-1)
    .byte >(SplshStVb_10_WriteAtarivox-1)
    .byte >(SplshStVb_11_ReadAtarivox-1)


;==============================================================================
;                          SPLASH SCREEN VBLANK
;==============================================================================

SplashMainLoop:
    lda    #$0E
.loopVsync:
    sta    WSYNC
;---------------------------------------
    sta    VSYNC
    lsr
    bne    .loopVsync
    lda    #SPLSH_TIME_VBLANK
    sta    TIM64T

    ldy    splsh_State
    lda    SplshVblankHiTab,Y
    pha
    lda    SplshVblankLoTab,Y
    pha
    ldy    giant_GiantSequence
    rts              ; indirect jump into code segment

That saves 4 bytes. Note how the pointer tables are now set up using the "()" and -1. Follow this exactly for both tables or you will have debugging headaches.

 

If you are in Stella's debugger, look at the address it stores on the stack when you do a JSR. It is always the return address minus 1, which is why the tables use -1.
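
Stripped down to a single fixed destination, the trick looks like this. It's a sketch with a made-up Target label and an assumed org address, not code from the game:

```asm
    processor 6502
    org $F000
Start:
    lda #>(Target-1)     ; high byte goes on the stack first, as JSR would push it
    pha
    lda #<(Target-1)     ; then the low byte
    pha
    rts                  ; pulls Target-1, adds 1, and continues at Target

Target:
    nop                  ; execution resumes here
```

The parentheses make sure the -1 is applied to the full 16-bit address before the low or high byte is taken from it.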

 

 

Now for me there is still more optimizing to be done. I'm close to the point where all the code segments are on the same page, and the code is pretty much finished (which means I'm not shifting it around.... It's stable). Once I have it all on the same page I can eliminate the look up table for the high address pointers and do this instead:

    ldy    splsh_State
    lda    #>(SplshStVb_00_Balloons-1)
    pha
    lda    SplshVblankLoTab,Y
    pha
    ldy    giant_GiantSequence
    rts              ; indirect jump into code segment

For my code that will save me an additional 13 bytes.
