Why can't the Atari 2600 display better graphics?

Andromeda Stardust · September 4, 2015

I did some experiments using Stella (verification on real hardware is necessary!).

Sampling takes place over 180 scanlines, and each scanline is sampled. It seems that no matter how fast I move the mouse, no more than ~20 changes happen during one frame. Also, there are no 2-bit changes between two samples, so we can omit that extra check.

Detection of the direction outside the kernel is a bit tricky and not 100% reliable so far:

I remember the initial sample of SWCHA before the kernel starts.

After the kernel I take the number of changes (posDiff) modulo 4.

For even values, I cannot determine the direction (this is the problem), so I assume the previous one.

For the two odd values, I have two dedicated routines where I compare the last sample of SWCHA with the initial sample and determine the direction. This is reliable because there are no 2-bit changes happening.

The problem occurs on rapid direction changes with a certain new direction speed (2, 4, 6 etc. changes/frame). Then the cursor continues in the wrong direction. This does not happen very frequently, because movement speed is never 100% constant. But it is still noticeable.
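A minimal Python model of the scheme described above may make the even-count ambiguity easier to see. It is only a sketch: the order of the Gray sequence in `GRAY_SEQ`, the helper names, and the assumption that all changes within one frame go in the same direction are illustrative and not taken from the test program.

```python
# Assumed 2-bit Gray sequence for one rotation direction (hypothetical order).
GRAY_SEQ = [0b00, 0b01, 0b11, 0b10]

def spin(first, steps):
    """Return the 2-bit state after `steps` transitions (negative = reverse)."""
    return GRAY_SEQ[(GRAY_SEQ.index(first) + steps) % 4]

def direction(first, last, changes, previous):
    """Guess the direction (+1 or -1) from the first/last sample and the change count."""
    if changes % 2 == 0:
        return previous          # net shift is 0 or 2 either way: ambiguous
    net = (GRAY_SEQ.index(last) - GRAY_SEQ.index(first)) % 4
    return 1 if net == changes % 4 else -1

# Odd counts resolve correctly in both directions.
for n in (1, 3, 5):
    assert direction(0b00, spin(0b00, n), n, previous=-1) == 1
    assert direction(0b00, spin(0b00, -n), n, previous=1) == -1

# Even count after a reversal: the stale previous direction is reported.
print(direction(0b00, spin(0b00, -4), 4, previous=1))  # prints 1
```

With an odd count the net shift is 1 or 3 positions, which differs between the two directions. With an even count it is 0 or 2 for both, so only the previous guess is left.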

Attached is a test program for the CX-80 if you want to check it out. If you move up and down erratically enough, sometimes you will notice a wrong direction response from the cursor.

Ideas for improvement are welcome.

BTW: The 17 cycle code above requires two extra cycles. Add "and #%11000000" after lda SWCHA. Now we need 19 cycles.

Maybe trackball controllers need a different encoding method than reading the raw 2-bit Gray code and hoping it has not changed state more than once since the last read. If you only poll the controller once per 1/60 sec frame, and the encoder wheel moves more than one position during that time, you will get errors. Worst case, the encoder moves three positions per frame and the cursor or player appears to move backwards. Expanding the encoder to 3 or 4 bits would certainly allow greater error protection in the event the encoding wheel moves multiple notches within one frame.
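The aliasing described here can be sketched in a few lines of Python. This is an illustrative model only, assuming a 2-bit quadrature code read exactly once per frame; the function name is made up for the example.

```python
def apparent_steps(true_steps):
    """Movement inferred from two 2-bit quadrature samples taken one frame apart."""
    net = true_steps % 4          # only the position modulo 4 survives
    if net == 2:
        return None               # two steps either way look identical
    return -1 if net == 3 else net

for true in (1, 2, 3, 4, 5):
    print(true, apparent_steps(true))
```

Three real steps forward come out as one step backward, four steps look like no movement at all, and two steps cannot be assigned a direction.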

Another option is to bundle an encoder with the trackball. For instance the SNES mouse uses an optical encoder connected to a logic chip that counts the number of increments as a signed 8-bit integer. The H position and V position are sent serially through the bus during sync operation. Thus no matter how fast you move the mouse, its position in X/Y space will be tracked reliably and relayed to the console.

I'm not saying that the Atari needs to use a fancy serial interface like the SNES mouse, but reading the raw state of the encoding wheel will not cut the mustard unless every state change can be counted. Digital logic could for instance be used to add to or subtract from a 4-bit binary counter via the up-down-left-right joystick inputs. Reading this data would not require a "sync" signal like the SNES mouse, since the 4-bit value could be subtracted from the previous value in memory to obtain the absolute change in the X or Y coordinate. This could be handled by ARM logic inside Melody game carts instead of relying on limited VCS RAM and CPU resources.

This way the software would have an accurate state of the position of the trackball and, unless it has rotated more than 8 positions during one frame of video (far less likely), would not introduce errors into the trackball movement. The required digital logic could be provided using a cheap microcontroller housed in a 9-pin housing similar to the AtariVox or Stelladaptor, so no permanent modding of original 2600 or 5200 trackballs is necessary. DIP switches could be used to select the appropriate trackball type, since there are multiple incompatible models of trackballs.
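As a rough sketch of how software might consume such a counter, the Python below takes the wrapped difference of two reads. The function name and the free-running 4-bit counter are assumptions for illustration; no such adapter is described in detail in this thread.

```python
def counter_delta(prev, cur, bits=4):
    """Signed movement between two reads of a free-running up/down counter."""
    span = 1 << bits
    diff = (cur - prev) % span
    return diff - span if diff >= span // 2 else diff

print(counter_delta(14, 2))   # counter wrapped while moving forward by 4: prints 4
print(counter_delta(2, 14))   # counter wrapped while moving backward by 4: prints -4
print(counter_delta(0, 8))    # 8 steps forward aliases: prints -8
```

With 4 bits the usable range per frame is -8 to +7 steps in this model, so errors only appear once the ball moves about 8 positions between two reads.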

Edited September 4, 2015 by stardust4ever

Andromeda Stardust · September 4, 2015

By the way, this thread has been a great read. I recommend anyone curious about the VCS capabilities read Racing the Beam.

+DrVenkman · September 4, 2015

I hope it works.

BTW: This emulates the problems with the CX-80. Due to the different encoding, determining the direction for the CX-22 is dead easy.

Vertical movement looks great with my CX-22 on a 4-Switch Woody - nice and smooth, although a tad slow. Horizontal movement ... not so much. The X-axis isn't recognized at all. Is that expected?

Thomas Jentzsch · September 5, 2015

Vertical movement looks great with my CX-22 on a 4-Switch Woody - nice and smooth, although a tad slow. Horizontal movement ... not so much. The X-axis isn't recognized at all. Is that expected?

Yes, only one axis is checked.

Omegamatrix · September 5, 2015

Good idea. How about this?

  lda     SWCHA       ; 4
  cmp     lastTrack   ; 3
  beq     .noChange   ; 2/3
  sta     lastTrack   ; 3
  .byte   $a9         ;-2   skips pla
.noChange
  pla                 ; 4 = 14
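For readers unfamiliar with the `.byte $a9` trick, here is a small Python sketch of how the same bytes decode differently depending on where execution enters. The tiny opcode table uses the standard 6502 encodings for these three instructions; the zero-page address `$80` for `lastTrack` is a hypothetical placeholder.

```python
# opcode: (mnemonic, size in bytes, cycles) for the three instructions involved
OPCODES = {
    0x85: ("STA zp", 2, 3),
    0xA9: ("LDA #imm", 2, 2),
    0x68: ("PLA", 1, 4),
}

def decode(code, pc):
    """Decode instructions from offset pc to the end of the byte string."""
    out = []
    while pc < len(code):
        name, size, cycles = OPCODES[code[pc]]
        out.append((name, cycles))
        pc += size
    return out

# sta lastTrack ; .byte $a9 ; pla   (lastTrack assumed at $80)
code = bytes([0x85, 0x80, 0xA9, 0x68])

changed = decode(code, 0)     # fall through after the BEQ is not taken
unchanged = decode(code, 3)   # BEQ lands directly on the PLA
print(changed)                # [('STA zp', 3), ('LDA #imm', 2)]
print(unchanged)              # [('PLA', 4)]
```

On the changed path the `$68` byte is consumed as the immediate operand of `LDA`, so the `PLA` never executes and only 2 cycles are spent instead of a 3-cycle jump.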

I was just thinking about this. If you can break this into a two line (or more) kernel you can reduce the maximum cycles per line by 1, but you use more cycles overall.

;line 1
  lda     SWCHA       ; 4
  sta     temp        ; 3 = 7

  
;line 2
  lda     temp        ; 3
  cmp     lastTrack   ; 3
  beq     .noChange   ; 2/3
  sta     lastTrack   ; 3
  .byte   $a9         ;-2   skips pla
.noChange
  pla                 ; 4 = 13

It's useful if you really need 1 more cycle in that particular line. Also it will probably be enough resolution reading every 2-4 lines. Not too sure as I don't own a track ball.

Thomas Jentzsch · September 5, 2015

Maybe trackball controllers need a different encoding method than reading the raw 2-bit Gray code and hoping it has not changed state more than once since the last read. If you only poll the controller once per 1/60 sec frame, and the encoder wheel moves more than one position during that time, you will get errors. Worst case, the encoder moves three positions per frame and the cursor or player appears to move backwards. Expanding the encoder to 3 or 4 bits would certainly allow greater error protection in the event the encoding wheel moves multiple notches within one frame.

That's the reason why the trackball has to be polled during the kernel, ~40..50 times/frame minimum. And for being able to do that inside an already very busy kernel, we are looking for the most efficient code in this thread.

Thomas Jentzsch · September 5, 2015

Also it will probably be enough resolution reading every 2-4 lines. Not too sure as I don't own a track ball.

Every 4 lines is sufficient; Missile Command does that, and so does the attached code (for a CX-22). Please stress the ROM with all kinds of movement patterns and report back how it feels.
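A back-of-the-envelope check in Python, using the figures quoted earlier in the thread (180 sampled scanlines and at most about 20 state changes per frame). It assumes the changes are evenly spaced at top speed, which is a simplification, and the function name is made up for the example.

```python
def polls_per_change(scanlines, changes_per_frame, poll_every):
    """Average number of polls that land between two consecutive state changes."""
    lines_between_changes = scanlines / changes_per_frame
    return lines_between_changes / poll_every

for poll_every in (1, 2, 4, 8, 16):
    print(poll_every, polls_per_change(180, 20, poll_every))
```

At one poll every 4 lines there are still a little over two polls per state change, while values below 1 (for example polling every 16 lines) would mean changes get skipped.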

Today I have extended my code to read both directions. Originally I had hoped it would be possible to alternate between directions each frame, but so far the result does not look very nice. Maybe if I delay the output of one direction until the other direction has been read too... That might work; I will try tomorrow. Making it work would have some significant advantages.

As of now, it seems necessary to read both directions each frame. So the SWACNT trick doesn't work anymore. This requires extra code for branching and the code needs more RAM for variables.

Also the PLA trick will only work for one direction, so I skipped that for now too.

Here is the current kernel code:

    tya                     ; 2     assuming Y contains the scanline counter
    lsr                     ; 2
    bcs     .readY          ; 2/3   =  6/7
;.readX
    lda     SWCHA           ; 4
    and     #%00110000      ; 2
    cmp     lastTrackX      ; 3
    bne     .doChangeX      ; 2/3
    inc     diffX           ; 5
    bne     .endTrackXY     ; 3     = 19

.doChangeX                  ;12
    sta     lastTrackX      ; 3
    bne     .endTrackXY     ; 3     = 18

.readY
    lda     SWCHA           ; 4
    and     #%11000000      ; 2
    cmp     lastTrackY      ; 3
    beq     .noChangeY      ; 2/3
    sta     lastTrackY      ; 3
    bne     .endTrackXY     ; 3     = 17

.noChangeY                  ;12
    inc     diffY           ; 5     = 17
.endTrackXY
; total: X: 24/25; Y:24

So that's quite a bit more than I was hoping for. PLA may help to reduce this to 24 max. But that's it, unless there is a brilliant idea waiting somewhere.

Note: As long as the carry is not touched, the initial direction branch can be easily separated from the rest of the code. So that provides some flexibility.
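To make the logic of the kernel code above easier to follow, here is a hedged Python model of it. It mirrors the alternation on the scanline counter's low bit and the counting of unchanged samples in `diffX`/`diffY`; initialising the `last` values from the first sample and returning "polls minus unchanged" as the change count are assumptions of this sketch.

```python
def run_kernel(samples):
    """samples[y] is the SWCHA byte seen on scanline y; returns (x_changes, y_changes)."""
    last_x = samples[0] & 0x30
    last_y = samples[0] & 0xC0
    diff_x = diff_y = 0            # count samples WITHOUT a change, like INC diffX/diffY
    polls_x = polls_y = 0
    for y, swcha in enumerate(samples):
        if y & 1:                  # TYA / LSR sets carry on odd lines -> .readY
            polls_y += 1
            masked = swcha & 0xC0
            if masked == last_y:
                diff_y += 1
            else:
                last_y = masked
        else:                      # even lines fall through to the X read
            polls_x += 1
            masked = swcha & 0x30
            if masked == last_x:
                diff_x += 1
            else:
                last_x = masked
    return polls_x - diff_x, polls_y - diff_y

# X bits change on line 2, Y bits change on line 3.
print(run_kernel([0x00, 0x00, 0x10, 0x50, 0x50, 0x50]))  # prints (1, 1)
```

Because each axis is only looked at on every second line, a change is noticed on the next line that reads that axis, which is why the per-axis polling rate matters.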

trackball test v0.06 (CX-22).bin

Edited September 5, 2015 by Thomas Jentzsch

Omegamatrix · September 5, 2015

I suggest expanding the rom on the games you want to hack to write the 4 line kernels. Then you should be able to spread the load over multiple lines reducing the cycles needed per scanline.

Then you can do something like this:

;line 1
    lda     SWCHA                ;4  @4
    sta     temp                 ;3  @7
    and     #$C0                 ;2  @9
    sta     Ytemp                ;3  @12
    eor     temp                 ;3  @15
    sta     Xtemp                ;3  @18



;line 2
    lda     Xtemp                ;3  @3
    cmp     lastTrackX           ;3  @6
    beq     .updateX             ;2³ @8/9
    sta     lastTrackX           ;3  @11
    bne     .endX                ;3  @14   always branch
.updateX:
    inc     diffX                ;5  @14
.endX:



;line 3
    lda     Ytemp                ;3  @3
    cmp     lastTrackY           ;3  @6
    beq     .updateY             ;2³ @8/9
    sta     lastTrackY           ;3  @11
    bne     .endY                ;3  @14   always branch
.updateY:
    inc     diffY                ;5  @14
.endY:

Alternatively you could do this for line 1 to save 2 cycles:

;line 1
    lda     #$C0                 ;2  @2
    and     SWCHA                ;4  @6
    sta     Ytemp                ;3  @9
    eor     SWCHA                ;4  @13   small risk of SWCHA changing between reads...
    sta     Xtemp                ;3  @16

Finally you could reduce line 1 more by splitting the read over 2 lines (one for X, and one for Y). And you can of course save more cycles by using PLA instead of INC for one of the axes.
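The AND/EOR pair in line 1 above splits the port byte without a second mask: XORing the masked value back into the original clears exactly the bits that were kept. A quick Python check of that identity over all byte values (the function name is illustrative):

```python
def split_axes(swcha):
    """Mirror of: and #$C0 / sta Ytemp / eor <original> / sta Xtemp."""
    ytemp = swcha & 0xC0
    xtemp = ytemp ^ swcha      # clears bits 7-6, leaves bits 5-0 untouched
    return ytemp, xtemp

for swcha in range(256):
    ytemp, xtemp = split_axes(swcha)
    assert xtemp == swcha & 0x3F
    assert ytemp | xtemp == swcha
```

Note that `Xtemp` therefore also carries the low four bits of the port, so the comparison against `lastTrackX` only isolates the X axis if those bits are constant.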

Mr SQL · September 5, 2015

Show me a post where I said that, or anything even remotely close to that.

TIA supports 128 colors, not 112.

Sounds like Stardust was referring to the PAL TIA which is more limited, but the NTSC version supports far more than 128 colors with artifacting enabled

Artifacting is an excellent method for enhancing Atari 2600 graphics - the gorgeous graphics on the right require a tube telly and a real Atari, but no extra memory to multiply the bit-plane:

You were posting wikipedia articles to prove this technique only works on the Apple II, did you find any more?

iesposta · September 6, 2015

Okay, if you want to cheat, 480 colors!

480_Colors.zip

iesposta · September 6, 2015

Also, regarding Trackball...

If I recall, someone had found space and was working on Trackball in Centipede... (Nukey Shay?)

It was recently, and I don't think anything was completed or it would have been talked about more...

Anyway, I think it would be a fantastic improvement to add real analog Trackball to Centipede and/or Millipede

+DrVenkman · September 6, 2015

Anyway, I think it would be a fantastic improvement to add real analog Trackball to Centipede and/or Millipede

Ahem ... Go back and re-read the last couple days' posts. Due to sheer evil genius on my part, I have managed to maneuver both Thomas and Darrell into discussing Trak-Ball code in the context of a putative "Centipede-TB" hack/home-brew: two of the smartest, most-accomplished 2600 home-brew coders working today in one thread, thinking about one thing.

Omegamatrix · September 6, 2015

Okay, building off the 4 line spread from post 108. This is now 13 cycles worst case (per line).

;line 1
    lda     SWCHA                ;4  @4
    eor     lastTrackY           ;3  @7
    and     #$C0                 ;2  @9
    sta     Ytemp                ;3  @12



;line 2
    lda     Ytemp                ;3  @3
    bne     .changeY             ;2³ @5/6
    inc     diffY                ;5  @10
    bne     .finishY             ;3  @13   always branch (assume diffY starts at 0)

.changeY:
    eor     lastTrackY           ;3  @9
    sta.w   lastTrackY           ;4  @13
.finishY:



;line 3
    lda     SWCHA                ;4  @4
    eor     lastTrackX           ;3  @7
    and     #$30                 ;2  @9
    sta     Xtemp                ;3  @12



;line 4
    lda     Xtemp                ;3  @3
    bne     .changeX             ;2³ @5/6
    inc     diffX                ;5  @10
    bne     .finishX             ;3  @13   always branch (assume diffX starts at 0)

.changeX:
    eor     lastTrackX           ;3  @9
    sta.w   lastTrackX           ;4  @13
.finishX:

You can still use PLA JMP for one of the axes, so that only one line of the four takes 13 cycles.
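The EOR-based update in the listing above can be modelled in Python as follows. This is a sketch of the bit logic only, not of the timing, and the function name is made up for the example.

```python
def poll_axis(swcha, last, mask):
    """Return (new_last, changed) the way the EOR/AND/EOR sequence does."""
    diff = (swcha ^ last) & mask       # nonzero only if the masked bits differ
    if diff == 0:
        return last, False             # the INC diffY path
    return last ^ diff, True           # EOR lastTrackY / STA lastTrackY

# Exhaustive check: only the masked bits are replaced, the rest of `last` survives.
for swcha in range(256):
    for last in range(256):
        new_last, changed = poll_axis(swcha, last, 0xC0)
        assert new_last & 0xC0 == swcha & 0xC0
        assert new_last & 0x3F == last & 0x3F
        assert changed == ((swcha ^ last) & 0xC0 != 0)
```

XORing the difference back into the old value flips exactly the bits that changed, so the stored value picks up the new axis bits without a second read of SWCHA.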

ZylonBane · September 6, 2015

Sounds like Stardust was referring to the PAL TIA which is more limited, but the NTSC version supports far more than 128 colors with artifacting enabled

Wrong. TIA in PAL mode has a 104-color palette.

Artifacting is an excellent method for enhancing Atari 2600 graphics - the gorgeous graphics on the right require a tube telly and a real Atari, but no extra memory to multiply the bit-plane:

Wrong. Artifacting is outputting half-color-clock pixels to trick NTSC decode circuitry into generating color information where none is actually present. Since TIA can't generate pixels narrower than a single color clock, artifacting is impossible on a 2600. The "method" you describe is called color fringing ("Spurious chromaticity at the boundaries of objects in a color TV picture.") and it most certainly does not "multiply the bit plane". Words mean things. You don't get to redefine them as you please just to puff yourself up.

Omegamatrix · September 6, 2015

Thought about this some more. You can also decrease the 13 cycles max for either or both axes without using PLA, which frees up the stack pointer if the game is using it.

Here is the code for one axis, replacing the code for line 2 in the post above.

;use $E6 (INC opcode) or $C6 (DEC opcode) for lastTrackY

;use any of the following zp ram locations for diffY,
;they form the high address of the absolute ram mirror...
;$80,$81,$84,$85,$88,$89,$8C,$8D
;$A0,$A1,$A4,$A5,$A8,$A9,$AC,$AD
;$C0,$C1,$C4,$C5,$C8,$C9,$CC,$CD
;$E0,$E1,$E4,$E5,$E8,$E9,$EC,$ED
    
    lda     Ytemp                ;3  @3
    beq     .noChangeY           ;2³ @5/6   Ytemp is zero when nothing changed
    eor     lastTrackY           ;3  @8   
    .byte $8D ; STA ABSOLUTE, takes 4 cycles, uses the INC below as its address
.noChangeY:
    inc     diffY                ;5  @11/12

The problem is the cycles are no longer even, and you have to use specific ram locations. However it could be useful if cycles are needed. I suppose the cycles could become even if you placed the branch to cross a page boundary. All possible albeit hard to implement...

Edit: Might be more practical to use PLA for 1 axis, and the second axis aligned to a page boundary crossing. Then everything is even and 12 cycles per scanline. Or maybe a WSYNC will re-align it all.
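The list of allowed `diffY` locations can be reproduced with a short Python sketch. It assumes the usual 2600 address decoding, where the RIOT RAM is selected when A12 = 0, A9 = 0 and A7 = 1, and that only A0 to A12 reach the bus; the names are illustrative.

```python
def selects_riot_ram(addr):
    """True if a 6507 address lands in the 128 bytes of RIOT RAM (or a mirror)."""
    return (addr & 0x1000) == 0 and (addr & 0x0200) == 0 and (addr & 0x0080) != 0

LOW_BYTE = 0xE6  # INC zp opcode, doubling as the low byte of the absolute address

valid_diff_addresses = [
    high for high in range(0x80, 0x100)          # diffY itself must be zero-page RAM
    if selects_riot_ram((high << 8) | LOW_BYTE)  # and $hhE6 must mirror RAM
]
print(" ".join(f"${h:02X}" for h in valid_diff_addresses))
```

The result is the same 32 addresses listed in the post: the high nibble must be even and bit 1 of the low nibble must be clear.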

Thomas Jentzsch · September 6, 2015

Some clever ideas there, Omegamatrix. Depending on the game we want to write or hack, spreading cycles at the cost of a few extra total cycles might become useful.

And we must not forget the RAM usage. My code above uses 4 bytes of RAM (temporary) inside the kernel and it needs 8 more bits (4 being permanent, for remembering the directions, e.g. left/right/unknown) outside the kernel.

Andromeda Stardust · September 6, 2015

Sounds like Stardust was referring to the PAL TIA which is more limited, but the NTSC version supports far more than 128 colors with artifacting enabled

Artifacting is an excellent method for enhancing Atari 2600 graphics - the gorgeous graphics on the right require a tube telly and a real Atari, but no extra memory to multiply the bit-plane:

Emulation_vs_CRT_with_Artifacting.jpg

You were posting wikipedia articles to prove this technique only works on the Apple II, did you find any more?

Artifacting only works in 320 mode. What you are showing in the right image is just RF bleed. My Atari has the cleanest RF possible and still does this.

Keatah · September 6, 2015

what game is that in post 109?

Edited September 6, 2015 by Keatah

iesposta · September 6, 2015

what game is that in post 109?

KC Munchkin Monster Maze

(demo near last post in thread)

http://atariage.com/forums/topic/222606-kc-munchkin-monster-maze-atari-2600/?p=2938224

Keatah · September 6, 2015

Cool. I bet I can tweak Stella and my LCD to come real close to what a CRT does. Thing is real CRT blooms vertically too. And the Blargg effects in Stella don't do that very well. I can always pull out the big guns and play with the MagicBrite settings on one of my older monitors.

Mr SQL · September 6, 2015

Artifacting only works in 320 mode. What you are showing in the right image is just RF bleed. My Atari has the cleanest RF possible and still does this.

Do you mean 320 pixels horizontally are required? That's a misnomer, it's about chroma bleed:

That's one of my 80's games for the CoCo, which only had 256x192 resolution. There are actually even more artifact colors in the BW image, but the emulator doesn't show them all.

Here I combine chroma and RF bleed, creating moire swirls for even prettier Atari graphics:

www.youtube.com/watch?v=aghqgf6qqRw

Mr SQL · September 6, 2015

Cool. I bet I can tweak Stella and my LCD to come real close to what a CRT does. Thing is real CRT blooms vertically too. And the Blargg effects in Stella don't do that very well. I can always pull out the big guns and play with the MagicBrite settings on one of my older monitors.

Yes it's the vertical. I used the fine vertical resolution to trigger artifacting on the vertices.

Omegamatrix · September 6, 2015

Some clever ideas there, Omegamatrix. Depending on the game we want to write or hack, spreading cycles at the cost of a few extra total cycles might become useful.

And we must not forget the RAM usage. My code above uses 4 bytes of RAM (temporary) inside the kernel and it needs 8 more bits (4 being permanent, for remembering the directions, e.g. left/right/unknown) outside the kernel.

In the end it might be a combination of methods, or even stuffing values in a few places when timing gets tight. Maybe converting to Superchip would clear up the RAM issues much more easily than doubling up RAM.

I found the thread where Nukey started working on a Centipede trackball hack. I think he could do it with the ideas we have here. Just expand the rom and unwind.

It would also be nice to start with a screen that says "Spin Trackball Left" at power-on. Then capture which states the trackball goes through and decide which trackball profile to use, and start the game. It would be easy to avoid any wrong detections by waiting for multiple confirmations before locking in the profile.
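One way the "multiple confirmations" idea could look, sketched in Python. The classification rule is an assumption made for this example: a quadrature-style device is expected to visit all four 2-bit states while spinning, and a direction-plus-clock style device only two. The profile names, window size and confirmation count are hypothetical.

```python
def detect_profile(samples, confirmations=3, window=16):
    """Classify by how many distinct 2-bit states each window of samples visits."""
    votes = []
    for start in range(0, len(samples) - window + 1, window):
        states = set(samples[start:start + window])
        if len(states) == 4:
            votes.append("quadrature")         # assumed encoder style, all states seen
        elif len(states) == 2:
            votes.append("direction+clock")    # assumed style, one bit stays fixed
        else:
            votes.append(None)                 # inconclusive window
        recent = votes[-confirmations:]
        if len(recent) == confirmations and recent[0] and len(set(recent)) == 1:
            return recent[0]                   # lock in after N agreeing windows
    return None

print(detect_profile([0, 1, 3, 2] * 12))   # quadrature
print(detect_profile([0, 1] * 24))         # direction+clock
print(detect_profile([0] * 48))            # None: the ball was not moved
```

Requiring several agreeing windows in a row keeps a single noisy or idle window from locking in the wrong profile.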

Omegamatrix · September 6, 2015

Here is another version which reads both axes in 30 cycles using two routines. This is for a situation where you have lots of cycles once every 4 lines, and can expand the rom.

;lines 1-4
    lda     SWCHA                ;4  @4
    sta     lastTrack2           ;3  @7
    eor     lastTrack            ;3  @10
    and     #$C0                 ;2  @12
    bne     .changeY             ;2³ @14/15

    .byte $EE  ; INC absolute, 6 cycles
.changeY:
    inc     diffY                ;5  @20   "diffY" must also be high address of rom space
    eor     lastTrack            ;3  @23
    bne     .changeX             ;2³ @25/26
    nop                          ;2  @27
    .byte $A5  ; LDA zp, 3 cycles
.changeX:
    pla                          ;4  @30



;lines 5-8, "lastTrack" and "lastTrack2" are reversed...
    lda     SWCHA                ;4  @4
    sta     lastTrack            ;3  @7
    eor     lastTrack2           ;3  @10
    and     #$C0                 ;2  @12
    bne     .changeY_B           ;2³ @14/15

    .byte $EE  ; INC absolute, 6 cycles
.changeY_B:
    inc     diffY                ;5  @20   "diffY" must also be high address of rom space
    eor     lastTrack2           ;3  @23
    bne     .changeX_B           ;2³ @25/26
    nop                          ;2  @27
    .byte $A5  ; LDA zp, 3 cycles
.changeX_B:
    pla                          ;4  @30

It trashes the stack pointer, but only uses 3 variables in the kernel. It is also faster overall.

Omegamatrix · September 6, 2015

Just had another idea. I haven't tested this, but I think it'll work. Same idea as the above post. Use two alternating routines every 4 lines. It requires Thomas's idea of setting the low bits of SWCHA to zero by setting the port as output. At 29 cycles now...

;lines 1-4
    lda     SWCHA                ;4  @4
    sta     lastTrack2           ;3  @7
    eor     lastTrack            ;3  @10
    cmp     #$3F                 ;2  @12   carry set if Y axis changed
    and     #$30                 ;2  @14
    bne     .changeX             ;2³ @16/17
    nop                          ;2  @18
    .byte $A5  ; LDA zp, 3 cycles
.changeX:
    pla                          ;4  @21
    bcs     .changeY             ;2³ @23/24

    .byte $EE  ; INC absolute, 6 cycles
.changeY:
    inc     diffY                ;5  @29   "diffY" must also be high address of rom space





;lines 5-8, "lastTrack" and "lastTrack2" are reversed...
    lda     SWCHA                ;4  @4
    sta     lastTrack            ;3  @7
    eor     lastTrack2           ;3  @10
    cmp     #$3F                 ;2  @12   carry set if Y axis changed
    and     #$30                 ;2  @14
    bne     .changeX_B           ;2³ @16/17
    nop                          ;2  @18
    .byte $A5  ; LDA zp, 3 cycles
.changeX_B:
    pla                          ;4  @21
    bcs     .changeY_B           ;2³ @23/24

    .byte $EE  ; INC absolute, 6 cycles
.changeY_B:
    inc     diffY                ;5  @29   "diffY" must also be high address of rom space
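The `cmp #$3F` step relies on the low four bits of the port reading as zero. A small Python check of that condition, modelling only the carry result of a 6502 `CMP` (set when A is greater than or equal to the operand, unsigned); the function name is illustrative.

```python
def carry_after_cmp_3f(a):
    """Carry flag after CMP #$3F: set when A >= $3F (unsigned compare)."""
    return a >= 0x3F

# With the low nibble forced to zero, carry is set exactly when a Y bit (7 or 6) differs.
for diff in range(0x00, 0x100, 0x10):
    assert carry_after_cmp_3f(diff) == bool(diff & 0xC0)

# Without that guarantee the test breaks: $3F sets carry although no Y bit is set.
print(carry_after_cmp_3f(0x3F), bool(0x3F & 0xC0))   # True False
```

Since neither `AND`, `LDA` nor `PLA` touches the carry flag, the result of the compare is still available for the `bcs` after the X axis has been handled.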
