
Step 1 - Generate a Stable Display

Posted by SpiceWare, in Collect 24 June 2014 · 2,755 views

First things first - head over to MiniDig - best of stella and download the Stella Programmer's Guide from the docs page.  I've also attached it to this blog entry, but you should still check out what's available over at MiniDig.
Also, for this tutorial you'll need DASM to assemble the code, as well as Stella and/or a real 2600 with a Harmony cart to run it.
The heart of the Atari is TIA, the Television Interface Adaptor.  It's the video chip, sound generator, and also handles some of the controller input.  As a video chip, TIA is very unusual.  Most video game systems have memory that holds the current state of the display.  Their video chip reads that memory and uses that information to generate the display.  But not TIA - memory was very expensive at the time, so TIA was designed with a handful of registers that contain just the information needed to draw a single scanline.  It's up to our program to change those registers in real time so that each scanline shows what it's supposed to.  It's also up to our program to generate a "sync signal" that tells the TV when it's time to start generating a new image.
Turn to page 2 of the programmer's guide.  You'll find the following diagram, which I've slightly modified:
Attached Image
The Horizontal Blank is part of each scanline, so we don't need to worry about generating it.  Everything else though is up to us!  We need to generate a sync signal over 3 scanlines, after which we need to wait 37 scanlines before we tell TIA to "turn on" the image output.  After that we need to update TIA so that each of the 192 scanlines that comprise the visible portion of the display shows what it's supposed to.  Once that's done, we "turn off" the image output and wait 30 scanlines before we start all over again.
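As a quick sanity check on the diagram, the four regions can be added up.  This is only a Python sketch of the arithmetic - the dictionary and its key names are made up for illustration, while the line counts are the NTSC values quoted above.

```python
# Scanlines per region of one NTSC frame, as given in the diagram.
# The dictionary name and keys are illustrative only.
NTSC_FRAME = {
    "vertical_sync": 3,
    "vertical_blank": 37,
    "visible_picture": 192,
    "overscan": 30,
}

total_scanlines = sum(NTSC_FRAME.values())

for region, lines in NTSC_FRAME.items():
    print(f"{region:16} {lines:3}")
print(f"{'total':16} {total_scanlines:3}")
```

The total comes to 262, which is the scanline count the rest of this step aims for.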
In the source code, available below, you can see the Main loop which follows the diagram:

Main:
        jsr VerticalSync    ; Jump to SubRoutine VerticalSync
        jsr VerticalBlank   ; Jump to SubRoutine VerticalBlank
        jsr Kernel          ; Jump to SubRoutine Kernel
        jsr OverScan        ; Jump to SubRoutine OverScan
        jmp Main            ; JuMP to Main
Each of the subroutines handles what's needed, such as this section which generates the sync signal:
VerticalSync:
        lda #2      ; LoaD Accumulator with 2 so D1=1
        sta WSYNC   ; Wait for SYNC (halts CPU until end of scanline)
        sta VSYNC   ; Accumulator D1=1, turns on Vertical Sync signal
        sta WSYNC   ; Wait for SYNC - halts CPU until end of 1st scanline of VSYNC
        sta WSYNC   ; wait until end of 2nd scanline of VSYNC
        lda #0      ; LoaD Accumulator with 0 so D1=0
        sta WSYNC   ; wait until end of 3rd scanline of VSYNC
        sta VSYNC   ; Accumulator D1=0, turns off Vertical Sync signal
        rts         ; ReTurn from Subroutine
Currently there's no game logic, so the VerticalBlank just waits for the 37 scanlines to pass:
VerticalBlank:
        ldx #37         ; LoaD X with 37
vbLoop:
        sta WSYNC       ; Wait for SYNC (halts CPU until end of scanline)
        dex             ; DEcrement X by 1
        bne vbLoop      ; Branch if Not Equal to 0
        rts             ; ReTurn from Subroutine
The Kernel is the section of code that draws the screen.  
Kernel:
    ; turn on the display
        sta WSYNC       ; Wait for SYNC (halts CPU until end of scanline)
        lda #0          ; LoaD Accumulator with 0 so D1=0
        sta VBLANK      ; Accumulator D1=0, turns off Vertical Blank signal (image output on)
    ; draw the screen
        ldx #192        ; LoaD X with 192
KernelLoop:
        sta WSYNC       ; Wait for SYNC (halts CPU until end of scanline)
        stx COLUBK      ; STore X into TIA's background color register
        dex             ; DEcrement X by 1
        bne KernelLoop  ; Branch if Not Equal to 0
        rts             ; ReTurn from Subroutine
For this initial build it just changes the background color so we can see that we're generating a stable picture.
Attached Image
Like Vertical Blank, OverScan doesn't have anything to do besides turning off the image output, so it just waits for enough scanlines to pass so that the total scanline count is 262.
OverScan:
        sta WSYNC   ; Wait for SYNC (halts CPU until end of scanline)
        lda #2      ; LoaD Accumulator with 2 so D1=1
        sta VBLANK  ; STore Accumulator to VBLANK, D1=1 turns image output off
        ldx #27     ; LoaD X with 27
osLoop:
        sta WSYNC   ; Wait for SYNC (halts CPU until end of scanline)
        dex         ; DEcrement X by 1
        bne osLoop  ; Branch if Not Equal to 0
        rts         ; ReTurn from Subroutine
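One way to convince yourself the routines add up to 262 is to count the STA WSYNC strobes each one executes per frame, since every strobe ends exactly one scanline.  Here's a rough Python tally, assuming the loop counts in the listings above (the dictionary name is just for illustration):

```python
# Number of "sta WSYNC" strobes executed per frame by each subroutine.
# Each strobe halts the CPU until the current scanline ends, so the
# number of strobes per frame equals the number of scanlines per frame.
WSYNC_PER_ROUTINE = {
    "VerticalSync": 1 + 3,   # 1 to align, then 3 scanlines with VSYNC on
    "VerticalBlank": 37,     # the ldx #37 loop
    "Kernel": 1 + 192,       # 1 to align before VBLANK goes off, then the ldx #192 loop
    "OverScan": 1 + 27,      # 1 to align before VBLANK goes on, then the ldx #27 loop
}

scanlines_per_frame = sum(WSYNC_PER_ROUTINE.values())
print(scanlines_per_frame)
```

The three aligning strobes (one each in VerticalSync, Kernel and OverScan) are why the OverScan loop counts 27 rather than 30.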
Anyway, download the source and take a look - there's comments galore.
Attached File  collect_20140624.bin (2KB)
Attached File  Collect_20140624.zip (20.01KB)
Stella Programmer's Guide
Attached File  Stella Programmers Guide.pdf (754.4KB)
Addendum on Keynote - what the audience sees:
Attached Image
What I see on the iPad:
Attached Image

You may want to mention that 192 is a guideline and was based upon the display capability of TVs at the time.  Games can (and do) draw more lines.  PAL & SECAM games will draw more lines as those standards display more lines per screen (at a lower refresh rate).

Yep - I planned to mention that and use Medieval Mayhem's 200 visible scanlines as an example.   Things like this I put in the presenter notes.  The iCloud version of Keynote doesn't let you see them yet, it's still a beta release.  I've added a couple photos at the end of the blog entry which show what the audience sees vs what I see on the iPad.  And yes, I practice giving my presentations using my HDTV.  I connect to the Mac mini which drives my TV using AirServer, which emulates the AirPlay feature of the Apple TV.  I covered AirServer in one of my DVR-Project blog entries.  (as an aside - I see they now have a version for Windows).
Another thing I plan to mention is that the liberal use of JSR/RTS I'm doing in this example is only for legibility.  Eliminating the JSR/RTS will save RAM (by reducing stack requirements), ROM (save 4 bytes for each JSR/RTS removed) and processing time (12 cycles per JSR/RTS).  I'll most likely create a final version of Collect in order to show the savings.
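For a rough idea of what that adds up to in this step's code, here's a quick Python sketch of the arithmetic.  The byte and cycle figures are the standard 6502 values for JSR and RTS; the count of four calls per frame is taken from the Main loop in this entry, and the variable names are only illustrative.

```python
# Cost of one JSR/RTS pair on the 6502 (standard documented sizes and
# cycle counts for these two instructions).
JSR_BYTES, JSR_CYCLES = 3, 6
RTS_BYTES, RTS_CYCLES = 1, 6
RETURN_ADDRESS_BYTES = 2   # pushed onto the stack by JSR

# The Main loop in this step calls four subroutines, none of them nested.
calls_per_frame = 4

rom_saved = calls_per_frame * (JSR_BYTES + RTS_BYTES)
cycles_saved = calls_per_frame * (JSR_CYCLES + RTS_CYCLES)
stack_saved = RETURN_ADDRESS_BYTES  # calls are not nested, so peak depth is one call

print(f"ROM saved:    {rom_saved} bytes")
print(f"Cycles saved: {cycles_saved} per frame")
print(f"Stack saved:  {stack_saved} bytes")
```

That works out to 16 bytes of ROM, 48 cycles per frame and 2 bytes of stack for this particular program.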

Newbie here. I'm having trouble understanding why you are waiting for scanlines to finish at the beginning of VerticalSync, Kernel, and Overscan. From what I understand vsync takes 3 scanlines, vblank 37, the image 192 and then overscan 30. In your code the vsync subroutine waits for 1 + 3 scanlines, vblank 37, kernel 1 + 192 and overscan 1 + 27. Why is this? Isn't the image/kernel starting 2 scanlines later than it should?




Great questions!
For VerticalSync, STA VSYNC needs to be done at the start of the scanline.  Since we don't know the exact cycle* we're at when we get to VerticalSync, the STA WSYNC is done to ensure that VSYNC is set at the proper time.  An advanced trick is to set VSYNC at different times on the scanline in order to create an interlaced display.  More info on that here.
For the Kernel, we have the same problem - we don't know the exact cycle* that we'll get to Kernel, so we use a STA WSYNC to make sure we turn on video output (STA VBLANK) before the scanline starts to be drawn.  In Stella it's not noticeable if VBLANK is set later in the scanline, but on a real Atari it can be.  You can see that in this blog entry: scroll down to the two photos of Frantic - the setting of VBLANK in the middle of the scanline is readily apparent in the second photo.

For the Overscan, same thing - we don't know the exact cycle* we'll get there and we want to turn off video output (using STA VBLANK again) before the scanline starts.
As far as starting a couple scanlines later - that's not a problem as the television is very forgiving.  There are two things we need to be concerned with when creating the display:

1) The total number of scanlines for every frame is the same.  If it's not, the television will jitter and/or roll.  While we typically use 262 for NTSC, it's possible to use more or less.  For PAL we typically use 312.  It's possible to use more or less, though PAL does require that the number is even - if the television receives an odd number of scanlines the color information will be lost and the image will be displayed in black & white.

2) The scanline the Kernel begins at should be the same for every frame.  If it's not, the picture will jitter up/down.
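Those two rules, plus the PAL even-count requirement, can be written down as a small checker.  This is only a Python sketch: the function name, its arguments and the Kernel start line of 41 in the examples are hypothetical values chosen for illustration, not something measured from the program.

```python
def check_frames(scanline_counts, kernel_start_lines, pal=False):
    """Return a list of problems found in a run of frames.

    scanline_counts    -- total scanlines generated in each frame
    kernel_start_lines -- scanline on which the Kernel began in each frame
    pal                -- True to also apply the PAL even-count rule
    """
    problems = []
    if len(set(scanline_counts)) > 1:
        problems.append("scanline count varies: picture may jitter or roll")
    if len(set(kernel_start_lines)) > 1:
        problems.append("Kernel start line varies: picture may jitter up/down")
    if pal and any(count % 2 for count in scanline_counts):
        problems.append("odd PAL scanline count: color may be lost")
    return problems


print(check_frames([262, 262, 262], [41, 41, 41]))            # steady NTSC frames
print(check_frames([262, 263, 262], [41, 41, 41]))            # total varies
print(check_frames([313, 313, 313], [41, 41, 41], pal=True))  # odd PAL count
```

An empty list means none of the rules were broken for the frames supplied.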

*The reality is for this simple program we can calculate the cycles - but as we make changes to the program we'd have to remember to go back and recalculate, then readjust code so the timing of STA VSYNC and STA VBLANK is correct. Using STA WSYNC means we don't have to worry about doing that.


It took me a little while to figure out what "D1" in these comments meant.


lda #0          ; LoaD Accumulator with 0 so D1=0

lda #2          ; LoaD Accumulator with 2 so D1=1


From the Stella Programmer's Guide and Kirk Israel's tutorial here:




...well I'll just quote from the bottom of here:




;    If you read your Stella Programmer's Guide,
;    you'll learn that bit "D1" of VSYNC needs to be
;    set to 1 to turn on the VSYNC, and then later
;    you set the same bit to zero to turn it off.
;    bits are numbered from right to left, starting
;    with zero...that means VSYNC needs to be set with something
;    like 0010 , or any other pattern where "D1" (i.e. second
;    bit from the right) is set to 1. 0010 in binary
;    is two in decimal, so let's just do that:
    LDA #2
    STA VSYNC    ; Sync it up you damn dirty television!



Humble Suggestion - change the comment on the 2 lines at top of this comment to:


lda #0          ; LoaD Accumulator with 0 so we can set D1=0 with the following line's sta

lda #2          ; LoaD Accumulator with 2 so we can set D1=1 with the following line's sta


Otherwise the original comments read to novices (such as me) as if loading the accumulator is what sets VBLANK and VSYNC, rather than the sta instruction on the following line.  This is probably obvious if you already are familiar with assembly language and know what D1 refers to (unlike moi).


Also, I got confused because in the collect.asm file on lines 135 & 136 I found this:


        lda #0                ; LoaD Accumulator with 0 so D1=0
        sta VBLANK      ; Accumulator D1=1, turns off Vertical Blank signal (image output on)


Possibly the 2nd line's comment should show "; Accumulator D1=0, turns off..."



I've been coding in assembly since the early '80s so it's hard to remember what wasn't familiar when I started. I normally use "bit x" when referencing the bits within a byte, where the value of x is 0-7 and arranged as 76543210 (so if just bit 7 is on the value of the byte would be $80 or 128). Bit x notation seems a lot clearer to me, but Stella Programmer's Guide used the D notation so that's what I used for this code so it would match the documentation.
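If the bit numbering is new to you, a couple of lines of Python show how a bit number turns into the value that gets loaded.  The helper names here are made up for illustration; D1 in the guide is the same thing as bit 1.

```python
def bit_value(n):
    """Value of a byte in which only bit n (Dn in the Stella guide) is set."""
    if not 0 <= n <= 7:
        raise ValueError("a byte only has bits 0 through 7")
    return 1 << n


def is_bit_set(value, n):
    """True if bit n of value is 1."""
    return (value >> n) & 1 == 1


print(bit_value(1))                         # D1 alone: the 2 used in "lda #2"
print(hex(bit_value(7)))                    # bit 7 alone
print(is_bit_set(2, 1), is_bit_set(0, 1))   # D1 after lda #2 vs. after lda #0
```

So lda #2 puts a 1 in D1 and lda #0 puts a 0 there, and it's the sta that follows which actually delivers the value to VSYNC or VBLANK.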

Yep, that would be an error, they do happen. I probably copied/pasted the lines of code and forgot to fix the comment in the second line. It's probably wrong in steps 2-14 as well.


The resulting display here is _very_ clean. 


That is to say, it is nicely centered on both CRT displays, as well as upscalers and CRT displays that try to make the best of a signal that is only "somewhat" compliant.


See picture here: CJiEuxA.jpg


And the image output is turned off (by setting VBLANK), NOT ONLY during the vertical blank, but during the overscan as well. Why is this important? Because when a video signal is being generated, you see artifacts of various things, including the signal coming out of the horizontal blank (HBLANK) and the last little chirp of the colorburst signal. In my case, since I am using a display converter/upscaler to HDMI, I see the entire overscan area, including a chirp of the colorburst and a bit of noise in the back porch signal (which show up as very faint vertical bars along the left and right borders of the display)...


Many games take liberties with the vertical timings, even though they generate a stable 262/263 (sometimes more, sometimes less) display, which causes the display to shift on an upscaler. And I've seen more than a few games that forget to turn off the image output during overscan, leaving the artifacts I've mentioned above.


