Am I going the right way with this?

johnnystarr · December 9, 2013

Here is the kernel outline I've written to better understand how this stuff works:

;-------------------------------------------------------------------------
;  Kernel
;-------------------------------------------------------------------------
MAIN        LDA     #0
            STA     VBLANK
            LDA     #%00000010
            STA     VSYNC
            STA     WSYNC
            STA     WSYNC
            STA     WSYNC
            LDA     #0
            STA     VSYNC

            LDA     #37
            STA     TIM64T
WAITVB      LDA     INTIM
            BNE     WAITVB
            
            
            
            LDX     #192
PICTURE     DEX
            STX     COLUBK
            STA     WSYNC
            BNE     PICTURE

            LDA     #%01000010
            STA     VBLANK
            
            LDA     #30
            STA     TIM64T
WAITOS      LDA     INTIM
            BNE     WAITOS


            JMP     MAIN

So, instead of using REP or a bunch of NOPs, I wanted timers for VBLANK and OVERSCAN. I went with 37 / 192 / 30 segments as most tutorials recommend. As 192 counts down to zero in the PICTURE loop, it prints the background and whatnot. I've attached a screenshot of the output. It looks...right? I mean I'm not sure if it is or not...

By using the TIM64T, I am giving myself a buffer to write my logic, right? Basically, as my game grows, the timer will adjust automatically? Or do I have to count cycles for the VBLANK and OVERSCAN the way I do for HBLANK?

Sorry if this is derivative of any recent posts. I've been out of the 2600 dev for several months and I want this to make sense before I move forward.

+SpiceWare · December 9, 2013

Looks like you have the right idea, though your timer counts are too low. There are 76 cycles per scanline, but the timer counts down every 64 cycles, so for the Vertical Blank section use 44 instead of 37 (37*76/64 ≈ 43.9, so 44). Likewise use 35 instead of 30 for OverScan (30*76/64 ≈ 35.6). You want to generate a 262 scanline display, so you may need to adjust those values. You can hit the ` key to enter Stella's debugger and look for this:

You can also use Command-L (Mac) or ALT-L (Linux, Windows) to show the count over the screen:

I did a presentation at the Houston Arcade Expo that includes a small sample program with lots of comments (way more than I'd normally put in). My code uses 47 for the timer because I'm setting it during the 3 scan lines for the sync signal. I also draw a larger display of 200 scan lines, so my overscan timer is set for a smaller amount.
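The scanline-to-timer conversion can also be left to the assembler. The following is only a sketch, assuming DASM and the register names from the standard vcs.h; the constant names are made up for illustration. DASM's integer division truncates, so these expressions come out as 43 and 35, which is why the values usually need nudging by one while watching Stella's scanline count.

```asm
; Sketch only: constant names are hypothetical, registers come from vcs.h.
SCANLINE_CYCLES = 76                                ; CPU cycles per scanline
VB_LINES        = 37                                ; vertical blank scanlines
OS_LINES        = 30                                ; overscan scanlines
VB_TIMER        = (VB_LINES * SCANLINE_CYCLES) / 64 ; 2812/64, truncates to 43
OS_TIMER        = (OS_LINES * SCANLINE_CYCLES) / 64 ; 2280/64, truncates to 35

        lda #VB_TIMER       ; TIM64T ticks once every 64 CPU cycles
        sta TIM64T
        ; vertical blank game logic goes here
WaitVB: lda INTIM           ; read the remaining timer value
        bne WaitVB          ; loop until it reaches zero
```

Changing VB_LINES or OS_LINES then updates the timer values automatically, though the total still has to be checked against 262.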

Here's my presentation (36.1 MB Keynote) from the expo. Also available as a PDF document (6.3 MB) or PowerPoint presentation (6.4 MB).

Also here's the complete source (23.2 KB) with ROM and output from DASM for the sample program.

Edited December 9, 2013 by SpiceWare

johnnystarr · December 9, 2013

Thanks, I've adjusted my timers and it seems to give me a better picture.

I'm a bit perplexed at the logic when accessing the TIA here though. For instance, can you help me understand the following code?

MAIN        LDA     #0              ; zero
            STA     VBLANK          ; turn off VBLANK?
            LDA     #%00000010      ; D1 = 1
            STA     VSYNC           ; enable VSYNC
            STA     WSYNC           ; send WSYNC signal for VSYNC
            STA     WSYNC           ; 2
            STA     WSYNC           ; 3
            LDA     #0              ; zero
            STA     VSYNC           ; turn off VSYNC

            LDA     #43             ; 37 scanlines * 76 / 64 = 43
            STA     TIM64T
                        
            ; begin vblank logic here
            
WAITVB      LDA     INTIM
            BNE     WAITVB

So... I get why we would turn VBLANK off at the start of this code. This is because we are at the top of the loop and the overscan hadn't disabled VBLANK. We then write VSYNC for 3 lines, then we turn off VSYNC because we've done our thing. But, then we don't start VBLANK again???

We explicitly go to blanking below just before we get to Overscan. So, why wouldn't we turn on blanking for the 37 lines of VBlank?

+SpiceWare · December 9, 2013

Ah, yeah your setting of VBLANK isn't quite right either.

Think of the TIA register VBLANK as VIDEO ON/VIDEO OFF. When bit 1 of VBLANK is turned on, the video output is turned off. This causes TIA to ignore the colors and bit patterns set for the players (sprites), missiles, ball, playfield, etc. and just send black scanlines to the TV. This is important during Vertical Blank as outputting anything other than black can be interpreted by your TV as something like Closed Captioning (which is sent on line 21 of the Vertical Blank).

The last thing your 192-line kernel loop does is set the background color to black. Since nothing else was ever turned on, you can't easily tell that VBLANK wasn't correct.

If you look at my sample code you'll find:

VerticalSync:
    lda #2      ; LoaD Accumulator with 2
    sta WSYNC   ; STore Accumulator to WSYNC, any value halts CPU until start of next scanline
    sta VSYNC   ; Accumulator D1=1, turns on Vertical Sync signal
    sta VBLANK  ; Accumulator D1=1, turns on Vertical Blank signal (image output off)
 
...

;========================================
; Kernel
;========================================        
    
    ; turn on video output    
    sta WSYNC
    lda #0 
    sta VBLANK
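Putting those two fragments together with the timer loops from earlier in the thread, a whole frame might be sketched roughly as follows. This is an illustrative arrangement rather than the sample program itself: the label names are invented, it assumes DASM with vcs.h, and the timer values (43 and 35) are starting points to be tuned until Stella reports 262 scanlines.

```asm
; Sketch only: labels and timer values are assumptions, not the sample program.
Frame:
        lda #2
        sta WSYNC       ; finish the current scanline
        sta VSYNC       ; D1=1: vertical sync on
        sta VBLANK      ; D1=1: video output off
        sta WSYNC       ; sync line 1
        sta WSYNC       ; sync line 2
        sta WSYNC       ; sync line 3
        lda #0
        sta VSYNC       ; vertical sync off, VBLANK stays on

        lda #43         ; vertical blank timer (tune as needed)
        sta TIM64T
        ; vertical blank game logic goes here
WaitVB: lda INTIM
        bne WaitVB

        sta WSYNC       ; line up with the start of a scanline
        lda #0
        sta VBLANK      ; D1=0: video output on
        ldx #192
Picture:
        stx COLUBK      ; background color follows the line counter
        sta WSYNC
        dex
        bne Picture

        lda #2
        sta VBLANK      ; video output off again for overscan
        lda #35         ; overscan timer (tune as needed)
        sta TIM64T
        ; overscan game logic goes here
WaitOS: lda INTIM
        bne WaitOS
        jmp Frame
```

The key difference from the original outline is that VBLANK is on for the sync lines, the vertical blank and the overscan, and is only cleared just before the 192 visible lines.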

Edited December 9, 2013 by SpiceWare

johnnystarr · December 10, 2013

Awesome advice. I really appreciate your slides as well. Using the debug mode in the Stella emulator has been a goal of mine, but I didn't exactly know what to look for. I really need to read up on it to see what else I can debug.

I find it interesting that you separate the VBlank and Overscan parts from the Kernel itself. I guess I've viewed the "kernel" as being the entire game loop. I guess it could go either way.

After playing with the timers, I came out right at 262! Kind of a good feeling actually; zen-like if you will.

+SpiceWare · December 10, 2013

Thanks. Here's a good page to start with for using Stella's Debugger.

I've always considered the Kernel to be the part that generates the display, so technically that'd be triggering VerticalSync, setting the timer and waiting for the end of VB and OS, and the main loop that draws the display. I don't consider the game logic (in VB and OS) as part of the kernel, as it's not actively generating the display.

I separate out VB and OS because I reuse sections of code in order to save precious ROM space. If you were to look in the source for Space Rocks you'd only see a single section of code that sets Vertical Sync (it's labeled VerticalBlank:), even though I have 3 different "game logic" routines to process the logic for the menu, asteroid field or easter egg. At the end of VerticalBlank: you'll find this bit of code that decides which of those routines needs to be run:

        ldy Mode
        bmi EE       ; if Mode = 128 (bit 7, the minus flag, is on) then run the Easter Egg logic
        bne Game     ; if Mode = 1 (not equal to zero) then run the Game logic
        jmp MainMenu ; otherwise Mode = 0, so run the Main Menu logic
EE:     jmp EasterEgg        
Game:                ; start of game logic

Edited December 10, 2013 by SpiceWare

johnnystarr · December 10, 2013

Dude, I love the way you think! I see how your code is really a 3-clause condition with only 1 load. If I understand correctly, the pseudocode might look like this:

if Mode = 10000000
    EasterEgg()
else if Mode = 00000001
    Game()
else    // Mode = 00000000
    MainMenu()
end

Edited December 10, 2013 by johnnystarr

johnnystarr · December 11, 2013

@SpiceWare or anyone else who knows.

I've been learning how to set breakpoints in the Stella Emulator. The downside is that I want to be able to set a breakpoint on a label and for it to break the first time it gets to that label. But, what happens, is I have to load Stella, go into debug mode, set my breakpoint and type "run".

It is important for me to know why something isn't working on the initial pass. But by the time I hit `, it has cycled dozens of times. I've tried 'restart' and the like, but it isn't a command.

johnnystarr · December 11, 2013

Never mind, I found the -debug command-line option.

+SpiceWare · December 11, 2013

Right-Click the ROM and select Power-on options

Change Startup Mode from Console to Debugger

Click Load ROM

Note that the Frame Count, Frame Cycle, etc are all 0. PC will be whatever you set the RESET vector to, in mine it's F000 which is also labeled InitSystem.

Omegamatrix · December 12, 2013

I never run Stella from the command line. I just leave it open while I'm compiling. When I run a rom I find I can get into the debugger immediately by hitting the ~ key as the rom is loading.

You can also exit the debugger by pressing the ~ key again.

johnnystarr · March 11, 2014

Spice, in your presentation, you highlight the different sections of games and the graphics used (IE P0, P1, M0, M1, BL, PF, BK) Is there a way to do this in the Stella emulator, or are these values inferred based on your studies?

+SpiceWare · March 11, 2014

That's built into Stella. It's known as Fixed Debug Colors and is covered in the Developer Keys section of Stella's manual. I'm on a Mac, so I use Command-Comma to toggle that feature. If you're on Linux or Windows then you'll use ALT-Comma instead.

You can also toggle the Fixed Debug Colors in the debugger by right-clicking in the TIA Display in the upper-left:

The colors are used when the display is rendered, so changing the setting does not change what's already shown - easiest thing to do is hit the <Frame + 1> button to render the next frame.

Edited March 11, 2014 by SpiceWare

+SpiceWare · March 11, 2014

Just for fun, here's an example of toggling the setting then advancing by scanlines instead of frames:

johnnystarr · March 11, 2014

Spice, thanks again for the cool tips. I was looking again at Dragon Fire just now, and I wanted to clarify something about HBLANK. On this screen there are 3 HBLANK lines. I'm guessing the top two are repositioning P0 for the man and the gargoyles on the right? And the bottom one is for the Score?

+SpiceWare · March 11, 2014

I always thought of those as shields, but gargoyles probably makes more sense.

If you start the game, you'll see that player1 is used for both fireballs.

The 2nd HMOVE is between the fireballs, so it's most likely used to reposition player1. To make sure, start up the debugger and type in trapwrite RESP1. If you start the emulation, you'll find that it's written to 4 times:

Before first fireball

Before second fireball

Before gargoyles

Before score

Even though RESP1 was used 4 times, HMOVE was only used on 3 of them.

When positioning a player, normally you use 3 registers. For player 1 those are RESP1, HMP1 and HMOVE. The registers do the following:

RESP1 - sets the coarse position of player 1. Due to the difference in CPU (1.19 MHz) and TIA (3.58 MHz) speed you can't set every X position, only every third position.
HMP1 - sets the horizontal motion for player 1. This is used to adjust the X position by -7 to +8 pixels. Updating HMP1 doesn't actually make the adjustment.
HMOVE - uses the value in HMP1 to adjust player 1's X position.

If RESP1 puts the sprite where you want it, then there's no need to do HMP1 and HMOVE. For decorative features like the gargoyles, position 27 (where player 1 ends up) works just as well as if it ended up at position 26 or 28. Due to the limited resources (it's a 4K game), if the position's good enough then there's no need to waste ROM making it "perfect".
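As a rough illustration of how the three registers work together, here is a minimal sketch. It assumes DASM with vcs.h and the SLEEP macro from macro.h; the delay of 40 cycles is an arbitrary example value, not taken from any game.

```asm
; Sketch only: the delay is arbitrary, pick it for the X position you want.
        sta WSYNC       ; wait for the start of a scanline
        SLEEP 40        ; burn cycles; the later the strobe, the further right
        sta RESP1       ; coarse position: player 1 lands where the beam is now
        sta HMCLR       ; zero all motion registers so only HMP1 has an effect
        lda #$F0        ; upper nibble holds the adjustment; $F0 means +1
        sta HMP1        ; stored only, nothing moves yet
        sta WSYNC       ; start of the next scanline
        sta HMOVE       ; apply HMP1 to player 1's X position
```

If the coarse position is already good enough, everything after the RESP1 strobe can be dropped, which is the ROM saving described above.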

johnnystarr · March 12, 2014

Spice, I can't thank you enough for all the pointers. This stuff is going to help with quite a few gaps in my understanding of the TIA. While we're on the subject of RESPx, perhaps someone could clarify this image for me. After learning how to use the 'debug colors' I've been able to review how designers utilized GRP0 and GRP1 in tandem. (I know this has been mentioned in some of my other posts, so bear with me)

Take, for instance, the score, which uses GRP0 and GRP1 for alternating digits. According to my recent understanding, this must be one of two things:

A) - This is an example of using RESP0 and RESP1 at just the right moment to achieve multiple sprites per line

B) - Or, this is using NUSIZ0 and NUSIZ1 to duplicate the sprites

If this is an example of "A": Is this an example of using RESPx, HMPx, and HMOVE?

If this is an example of "B": How does one go about changing the GRPx register between copies?

I'm not even sure if "B" is possible. Everyone has provided great resources and details, so I really appreciate it. Perhaps a small code sample would bridge the gap for me. I would really like to review the portion of a kernel that draws GRP0, strobes RESP0 and then draws different sprite serial data on the same line.

Thanks again!

Edited March 12, 2014 by johnnystarr

+SpiceWare · March 12, 2014

It's B. Do you happen to have an iOS device? David Crane did up a couple apps that explain 2600 tricks. The first introduces how the 6 digit score works, the second shows the actual code.

2600 Magic

Dragster Magic

Have I pointed you to MiniDig before? It contains a lot of useful info from the old Stella email list. The tricks page has stuff about VDEL and 48 pixel sprites.

Another thing you could do is look in Medieval Mayhem's source code. I use the 48 pixel routine to display the main menu. Search for Show48graphic.

johnnystarr · March 12, 2014

Ok, that's pretty cool. I've been going over your source in the Show48graphic sub. I think this would be very handy down the line. I'm curious, is this trick used every time you need to line up P0 and P1? What if I only needed 32p vs. 48p (4 consecutive sprites)? Would I modify the code, or is this to get the extra 16 pixels?

Very cool app from Mr. Crane BTW!

+SpiceWare · March 13, 2014

Show48graphic sub doesn't position the sprites, it only updates GRP0 and GRP1 in order to show the 48 pixels image.

The lineup happens at the following lines located just below VerticalBlankMenuCode:

26614 - set HMP1 to 0 (so no adjustment for player 1)
26627 - set HMP0 to $F0 (means +1)
26632 - strobe RESP0 to set player 0 to X position 54
26633 - strobe RESP1 to set player 1 to X position 63
26643 - strobe HMOVE, which uses the value in HMP0 to move player 0 to X position 55.

For 32 pixels without using the VDEL trick you'd basically do this:

loop:
  lda (imageA),y
  sta GRP0
  lda (imageB),y
  sta GRP1
  lda (imageC),y
  tax
  lda (imageD),y
  SLEEP ??
  stx GRP0
  sta GRP1
  dey
  bpl loop

The value for SLEEP will depend on the X positions of the players.

johnnystarr · March 13, 2014

Interesting. Am I to understand that this code would be in place of the HMPx and HMOVE calls? From what I gather about the loop, Y would set the height of our 4 graphics tables, right?

If Y = 16 and our sprite tables are each 16 bytes in length, then this kernel would draw 16 scanlines of parallel players. Is it that these instructions add up to 76 cycles? Or would I strobe WSYNC just after DEY to make up the difference?

For the time being, I can see myself using the 32p trick for various "bosses" and 1K score kernels. I would also love to memorize and fully comprehend 48p sprites, but I think 32 is a good place to start. Would you or anyone else be able to show a full kernel using the 32p trick?

Thanks again!

Omegamatrix · March 13, 2014

Everything has been positioned by the time that loop is entered. You are correct that 'Y' register is controlling the height.

When we are talking about 76 cycles it is always one scanline. Specifically that is how many cycles we have to do stuff before the next scanline is reached.

There should also be a WSYNC in there... sometimes though you will write a loop that is too busy for a WSYNC. In that case you have to make sure the loop is exactly 76 cycles, and that you first enter the loop on the correct cycle (which cycle depends on your loop!). There are also times where you don't want the loop locked by a WSYNC, so that you can update the graphics registers on different cycles.

An example of an unlocked loop is below. It is the first thing I ever wrote for the 2600. It is based on Eckhard's big, movable 48 byte display. In mine you can also hold the fire button and move the joystick to scroll the graphics.

http://atariage.com/forums/topic/145465-48-bit-sprite-with-color-scroll/

Edited March 13, 2014 by Omegamatrix

+SpiceWare · March 13, 2014

Doh! Can't believe I forgot the WSYNC - it'd be right after the loop label.
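With that correction, the 32 pixel loop from the earlier post would look something like the sketch below. DELAY is a placeholder name for the SLEEP value that was left open as ?? above, since it depends on where the players are positioned; imageA through imageD are the zero-page pointers from the original snippet.

```asm
; Sketch only: DELAY is a placeholder, its value depends on the X positions.
loop:
        sta WSYNC       ; lock each pass to the start of a scanline
        lda (imageA),y
        sta GRP0        ; first 8 pixels
        lda (imageB),y
        sta GRP1        ; second 8 pixels
        lda (imageC),y
        tax             ; hold the third byte in X
        lda (imageD),y  ; and the fourth in A
        SLEEP DELAY     ; wait until the first copies have been drawn
        stx GRP0        ; third 8 pixels (second copy of player 0)
        sta GRP1        ; fourth 8 pixels (second copy of player 1)
        dey
        bpl loop        ; runs while Y >= 0
```

Because the loop exits when Y goes negative, Y should start at the image height minus one, so 15 for 16-line tables.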

My recent game kernels (for Stay Frosty 2, Space Rocks and Frantic) have been so packed with instructions that I've hardly been using WSYNC. This is the main kernel loop from Frantic:

KernelLoop:
        ; at this point the registers hold the following:
        ; A - graphics for player 1
        ; Y - enable for missile 1 & PF0 for right side of screen
        ; PF0 and PF1 have already been updated for left side of room
        ; GRP0 (on VDEL) has been preloaded with player 1 graphics
        ; BL (on VDEL) has been preenabled with missile data
        ;                  at cycle 73 
        sta GRP1                ; 3 76/0 - before 22 - also updates GRP0 & BL via VDEL
        lda #<DS_COLUP0         ; 2  2
        sta COLUP0              ; 3  5 - before 22
        lda #<DS_COLUP1         ; 2  7
        sta COLUP1              ; 3 10 - before 22
        sty ENAM1               ; 3 13 - before 22
        lda #<DS_EVENT_M0       ; 2 15 - bit 7 triggers kernel event
        sta ENAM0               ; 3 18 - before 22
        sbmi KernelEvent        ; 2 20 - 3 21 if taken
        SLEEP 3                 ; 3 23
        UPDATE_SPEECH           ; 5 28
        sty PF0                 ; 3 31 - PF0R, 28-49
        lda #<DS_PF2L           ; 2 33
        sta PF2                 ; 3 36 - PF2L, before 38
        ldy DS_PF0R_M1          ; 4 40 - any after sty PF0, loads for next line 
        lda #<DS_PF1R           ; 2 42
        sta PF1                 ; 3 45 - PF1R, 39-54
        lda #<DS_GRP0           ; 2 47 
        sta GRP0                ; 3 50 - any, on VDEL
        lda #<DS_PF2R           ; 2 52
        sta PF2                 ; 3 55 - PF2R, 50-65    
        lda #<DS_PF0L_BL        ; 2 57
        sta PF0                 ; 3 60 - PF0L, after 55
        sta ENABL               ; 3 63 - any, on VDEL
        lda #<DS_PF1L           ; 2 65
        sta PF1                 ; 3 68 - PF1L, 66 - 28
        lda #<DS_GRP1           ; 2 70
        jmp KernelLoop          ; 3 73

johnnystarr · March 13, 2014

An example of an unlocked loop is below. It is the first thing I ever wrote for the 2600. It is based on Eckhard's big, movable 48 byte display. In mine you can also hold the fire button and move the joystick to scroll the graphics.

http://atariage.com/forums/topic/145465-48-bit-sprite-with-color-scroll/

Thanks, the source is well commented as well.

One last clarification on the 32-pixel vs. 48-pixel sprites. Are they the same technique in general? The screenshot here is of the 32p dragon. Is the trick to load NUSIZ0 and NUSIZ1 with #1 so that it enables "two sprites (close)"? Then we use RESPx, HMPx and HMOVE to get them lined up just right, and finally we use Spice's example loop for 32 pixels without the VDEL trick, where the value for SLEEP depends on the X positions of the players?

Sorry for any redundancy, but this stuff is so arcane, it's like learning how to ride a unicycle blindfolded.

+SpiceWare · March 13, 2014

Truthfully, the 32 pixel, 48 pixel and sprite-reuse routines are probably more advanced than what a "Newbie" should be worried about.

What I would do as a newbie would be to get a program working that uses the two players. If you want a large image just set it to 4X in size. While you'll be stuck with a chunky 32 pixel sized image, it'll most importantly help you get a grasp on using RESPx, HMPx, HMOVE, etc.

Once you get a handle on the basics, then look into doing more advanced routines like 32 and 48 pixel images and sprite reuse.

Edited March 13, 2014 by SpiceWare
