A Two-Player Competitive and Co-Op "Tetris" for the Atari 2600 VCS

AkashicRecord · November 6, 2018

I myself am used to the behavior of Game Boy Tetris, which rotates S and Z pieces such that they appear as they do in the third and fourth columns of the chart you posted. I pieces are rotated about the second block from the left (in horizontal orientation).

I'm not particularly fussed about the nuances of I pieces, but I have basically no experience with any Tetris where pieces translate as they rotate.

Yeah, I'm kind of drawing a blank myself on which versions had this particular rotation "standard" in place. I'm guessing that some of the newest versions do...but I'm not even sure if TetrisDS had it. (That was my most recently owned version that was worth a shit.)

Either way, I'm going to have more than just one way to skin this cat, so let's collectively make the best Tetris title to ever exist.

AkashicRecord · November 7, 2018

I'm *almost* finished with the rotation system, but right now it kind of flips the game piece about....60 times per second.

It should be relatively easy and quick to add the rest of the game pieces once the full behavior of just one piece is completely finalized. (The entire thing is mostly just a data driven loop with indexed loads and such based on the current piece rotation.) The gradient shading will be a lot more flexible now, as I was just previously decrementing the color value every single time.

AkashicRecord · November 9, 2018

I ran into something interesting last night...by accidentally "frying" one of my prototype programs. Luckily, the volume was off, because oh boy, if it is turned up loud...

Anyway, this quickly showed me how many uninitialized and unused registers I had been ignoring, and some that I'd almost completely overlooked.

So, I completely gutted my program down to its bare bones and essentially rewrote the whole thing from scratch. Doing this, I fixed quite a few things in the process (...and I'm back to a proper 262 lines of NTSC again).

After everything was done, I wired up the joystick button to a sort of "soft reset" function and tested frying the program quite a bit (the Backspace key in Stella), checking to see if the program would essentially restore itself to a working state. (It did.)

There's probably not any *real* need to do this, but I don't like loose ends like what I had discovered...it was also time to start refactoring and cleaning up the program anyway before moving too far forward...I've also taken the David Crane approach to graphics and have started placing black lines in areas to reduce blooming and color artifacts.

One strange thing I noticed while frying my test program was that the playfield priority in CTRLPF seemed to be ignored, or at least handled differently, between Missile and Player objects when VDELPx is enabled?

Edited November 9, 2018 by AkashicRecord

CapitanClassic · November 9, 2018

If you don't want frying features (like double missiles in Space Invaders), then a common technique is to clear RAM and the registers at startup. The technique takes between 8 and 11 bytes of ROM, depending upon how clever you want to be.

http://www.randomterrain.com/atari-2600-memories-tutorial-andrew-davie-12.html

Andromeda Stardust · November 10, 2018

Well, it seems that this kind of information is a bit easier to come by these days.

This is the "SRS" or Super Rotation System for the game. It gives each piece 4 unique rotations...except the square. This causes some minor horizontal and vertical translations for the Z, S and Line pieces which are absent from many Tetris variations...

SRS-pieces.png

If there are any that object to this rotation behavior, I could have this as an "SRS On / Off" game option toggle before the start of a game.

I have played official Tetris versions, and the I-beam and S/Z pieces generally have only two orientation matrices each. The square obviously needs only one. This reduces the required matrices from 28 (7 × 4) to 19 (3 × 4 + 3 × 2 + 1 × 1). The preview piece display matrix is usually 4 × 2 (character sprites may be used for the preview display, whereas playfield tiles are used for placed blocks); however, the minimum matrix size that fits every piece is 4 × 4, because the I-beam must fit within it both lying flat and standing upright. Hitting the CW or CCW rotation button swaps the current piece's matrix for the matrix of the next orientation of the same piece, provided the new orientation does not collide with placed bricks.
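The "swap the matrix unless it collides" rule can be sketched roughly as follows. This is a minimal Python model, not code from the game: the names (`T_PIECE`, `collides`, `try_rotate`), the cell-list representation, and the 10 × 20 board size are all assumptions made for illustration.

```python
# Hypothetical sketch: each orientation is a list of (col, row) cells
# inside a 4x4 box. Only the T piece is shown.
T_PIECE = [
    [(1, 0), (0, 1), (1, 1), (2, 1)],  # stem up
    [(1, 0), (1, 1), (2, 1), (1, 2)],  # stem right
    [(0, 1), (1, 1), (2, 1), (1, 2)],  # stem down
    [(1, 0), (0, 1), (1, 1), (1, 2)],  # stem left
]

WIDTH, HEIGHT = 10, 20  # assumed board size


def collides(board, cells, x, y):
    """True if any cell lands outside the walls/floor or on a placed brick."""
    for cx, cy in cells:
        col, row = x + cx, y + cy
        if col < 0 or col >= WIDTH or row >= HEIGHT:
            return True
        if row >= 0 and board[row][col]:
            return True
    return False


def try_rotate(board, orientations, rot, x, y, step):
    """step = +1 for CW, -1 for CCW. Returns the orientation index to use."""
    new_rot = (rot + step) % len(orientations)
    if collides(board, orientations[new_rot], x, y):
        return rot          # rotation rejected, keep the current matrix
    return new_rot


board = [[0] * WIDTH for _ in range(HEIGHT)]
print(try_rotate(board, T_PIECE, 0, 3, 0, +1))  # 1 on an empty board
```

A piece with only two orientations would simply pass a two-entry list, so the same routine covers the I, S and Z pieces.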

Given that the Atari has extremely limited RAM, the contents of the matrices need to be encoded in ROM, because the existing 128 bytes of RAM cannot possibly handle all the manipulations. By reusing matrices between orientations which exhibit rotational symmetry, you can eliminate 9 matrices from the code, or 18 bytes of ROM, assuming each 4 × 4 matrix uses 1-bit encoding (16 bits equals 2 bytes).
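To make the byte counts concrete, here is a small Python sketch of the 1-bit encoding and the matrix tally. The bit order (top row first, leftmost cell in the highest bit) and the helper names are assumptions; any consistent order would give the same sizes.

```python
def encode(rows):
    """Pack four 4-character rows ('#' = block) into two bytes."""
    bits = 0
    for row in rows:
        for ch in row:
            bits = (bits << 1) | (1 if ch == "#" else 0)
    return bytes([bits >> 8, bits & 0xFF])


def decode(data):
    """Inverse of encode(): two bytes back to four text rows."""
    bits = (data[0] << 8) | data[1]
    rows = []
    for r in range(4):
        nibble = (bits >> (12 - 4 * r)) & 0x0F
        rows.append("".join("#" if nibble & (8 >> c) else "." for c in range(4)))
    return rows


# Distinct orientations per piece when symmetric rotations share a matrix.
ORIENTATIONS = {"I": 2, "S": 2, "Z": 2, "O": 1, "T": 4, "L": 4, "J": 4}

I_HORIZONTAL = ["....", "####", "....", "...."]
packed = encode(I_HORIZONTAL)

total_matrices = sum(ORIENTATIONS.values())   # 19
rom_bytes = total_matrices * 2                # 38
bytes_saved = (7 * 4 - total_matrices) * 2    # 18
print(packed.hex(), total_matrices, rom_bytes, bytes_saved)
```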

AkashicRecord · November 11, 2018

Right now, my graphics resources are very minimal, really only a handful of bytes for everything. But even with that, I'm being a little wasteful. There are a few ways to represent the pieces, whether it be bytes or nibbles, or even mathematical and bitwise operations. For now, I think simpler is better, and that's certainly easier to debug.

I'm using a few 16-bit pointers and indirectly indexing into them using the Y register to access the sprite data. After random piece selection, I build a graphics pointer to a table which is indexed by the piece's current rotation. I'm doing pretty good so far with keeping the code to a minimum...sort of.

There are a few other tables to handle some adjustments which drive the kernel...things like: what scanline to start and stop drawing on, and (later) when to change colors, and when to start loading dirty playfield data...i.e. the "placed" bricks, etc.

That said, I was spending some quality time earlier this morning actually writing part of the kernel in RAM, building opcodes amongst some very carefully placed variables and pointers. This allowed for the fastest possible data loads and stores (about 20 cycles for both players' sprites and colors), as I could use immediate addressing and dynamically rewrite the operands elsewhere.

The problem with this was that while it was cool as hell, it was looking like some major overkill. Maybe I'll come back to it later if I'm really starved on clock cycles, because it actually worked quite well...at the expense of about 20 extra bytes of RAM.

Edited November 11, 2018 by AkashicRecord

AkashicRecord · November 11, 2018

This morning I managed to make some more strides forward, especially for the 1P prototype. I fixed an indexing bug which I was completely overthinking, and managed to get a single-scanline kernel which loads the player graphics, the color, and each half of the playfield data, successfully finishing the write for the asymmetric (right-side) portion of the screen to the appropriate TIA register *exactly* on machine cycle 48, and then changes the player color value for loading on the next scanline...whew, that was a lot.

The 2P prototypes will probably benefit from a 2-line kernel, but I might be able to squeak by with a 1-liner... We're getting close! I still have some room for improvement, as I can apply the color changes on prior scanlines, but right now that isn't necessary, and I don't have to litter my code with NOPs everywhere. (I actually don't have a single NOP...)

Edited November 11, 2018 by AkashicRecord

AkashicRecord · November 12, 2018

This isn't much, but I felt obligated to at least share something recent...

This is the debugger output of one of my recent test protos. The disassembly listing to the right shows how I've labeled the CPU cycles entering into the main portion of the kernel. (Note the second write to PF2 occurring on cycle #45 and ending on cycle #48.)

(In this test, I'm only performing the "cycle 48 write" on one scanline, but it isn't much to apply this code to the other parts of the kernel...)

AkashicRecord · November 13, 2018

I made a lot more headway on the kernel timing today by breaking it into a few similar pieces. Here's another screenshot showing the asymmetric playfield by utilizing "cycle 48" PF2 writes. (There are still some bugs to be squashed.)

This example is just using junk data and colors, but it illustrates that the principle is working while advancing down the playfield, as well as displaying a game piece (somewhat) properly from sprite data. Every 8th scanline triggers an adjustment of the index into the playfield "brick" data, and this is mostly working properly as well.

Edited November 13, 2018 by AkashicRecord

AkashicRecord · November 13, 2018

Now is probably a good time for a quick list of what's done...or mostly done, and a list of things that are left.

(Working or mostly working)
- Working application shell
- TV signal sync / 192 scanlines
- Colored sprites
- Animation
- Random number generation
- Joystick polling / Input control
- World coordinate system
- Asymmetric game playfield / Data loading

(Things still left TODO)
- 1P / 2P kernel separation
- Game logic
- Line clearing / Scoring
- Timing adjustments

- Sound

- Additional game modes
- Detail / Graphics enhancements
- Public beta testing

- Music
- Manual / Box
- Game cartridge

Edited November 13, 2018 by AkashicRecord

AkashicRecord · November 14, 2018

Today, I'm doing a little more work on the kernel, this time working the Carry flag into the equation. This should help streamline the kernel a little bit more, and things will also make a little more sense. I have some off-by-one errors creeping in that need to be taken care of as well.

I'll be posting another playfield test soon, but this one will be a little different. I'm looking at having a completely black background for the game (even outside of the game playfield), and to use gradient shading for the placed bricks instead. I think it will look quite nice, but it will require changing another TIA register (obviously). I should be able to re-use the index for the game playfield data and simply apply it to a color table located in ROM. Each game level would have 20 possible color shades for the game bricks, so I could get kind of "fancy" here, if I want. That leads me to some interesting ideas.
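As a rough illustration of reusing the playfield row index for a color table, here is a Python sketch. It assumes the usual TIA color byte layout (hue in the high nibble, luminance in bits 3-1), 20 brick rows of 8 scanlines each, and an arbitrary hue; the function names are made up for the example.

```python
ROWS = 20
SCANLINES_PER_ROW = 8   # "every 8th scanline" bumps the brick-row index


def tia_color(hue, lum):
    """Build a TIA color byte: hue 0-15, luminance 0-7."""
    return ((hue & 0x0F) << 4) | ((lum & 0x07) << 1)


def gradient_table(hue, rows=ROWS):
    """One color per brick row, brightest at the top, darkest at the bottom."""
    return [tia_color(hue, 7 - (row * 8) // rows) for row in range(rows)]


def color_for_scanline(table, scanline):
    """Same index the kernel would use for the playfield data."""
    return table[scanline // SCANLINES_PER_ROW]


table = gradient_table(hue=8)   # hue chosen arbitrarily
print([hex(c) for c in table])
```

Swapping in a different 20-byte table per level would give the "fancy" per-level shading, including the pure-black rows idea.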

One very difficult level could use pure black for some of the lower blocks, essentially requiring you to remember where you placed some of the bricks. Maybe the "ghost" piece outline will come in handy here... I'll come back to these ideas later, because I think this could be interesting and challenging. Maybe one of the 2P modes could allow a player to "black out" the opponent's game field for a few seconds...or for one move, etc...?

Also, if anyone is interested, I can share detailed code snippets and explain what is happening for anyone writing (or wanting to write) their own VCS programs. Some of the routines are generic enough to be reusable in other games and programs.

Andromeda Stardust · November 14, 2018

Unrelated, but I played a game of "Forklift Tetris" at my job site today. Made sure every last crate had a place to fit...

"All work and no play makes Jack a dull boy."

LOL...

Looking forward to future developments. :thumbsup:

AkashicRecord · November 14, 2018

Reminds me of the performance pieces where office lights in a skyscraper are wired to a Tetris game logic and a game is played on the side of the building.

https://youtu.be/eMBguPuKPi4

The only way this could be better is if the building was one rigged for demolition, and the entire field played and set for a giant line piece the entire length of the building which triggers the demolition charges at the end as all lines are cleared...along with the building...

As for the future developments, I'm currently investigating a timer-based kernel idea. Could be interesting.

Edited November 14, 2018 by AkashicRecord

Andromeda Stardust · November 14, 2018

Well fortunately no "lines" were demolished today. I don't think management would have been happy, though it wouldn't hurt to get rid of some stuff... |:)

There was a skit in Futurama where a construction crew vaporized a project by dropping the I beam into the slot. Also Tetris was used as a destructive weapon in the Adam Sandler movie Pixels. But if the video was launched into space in 1983, Tetris was not yet invented?

I wouldn't dwell on the plot. Also you don't kill Donkey Kong with a hammer; he falls off the girders. #hollywood

AkashicRecord · November 14, 2018

Here's a very in-depth breakdown of the Nintendo Tetris game. There are quite a few factors here that I was not aware of. One interesting point is that the input has an initial 16 frame delay for lateral input repetition, at which point the game pieces move horizontally every 6 frames (~0.10 seconds). This is something comfortable that I could use as a starting point for this game...although I heavily object to the automatic drop speed doubling from level 28 to 29... It's almost impossible for a human to play at a 1 frame drop increment...

http://meatfighter.com/nintendotetrisai/
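The 16-frame / 6-frame repeat behavior described above could be modeled like this. It is a Python sketch with hypothetical names, and it assumes the first shift happens on the frame the direction is pressed; the exact frame alignment in the NES game may differ by one.

```python
INITIAL_DELAY = 16    # frames before auto-repeat starts
REPEAT_INTERVAL = 6   # frames between repeats (6 / 60 = 0.1 s on NTSC)


def shift_frames(held_frames):
    """Frames (counted from the press, starting at 0) on which the piece shifts."""
    frames = [0] if held_frames > 0 else []
    frame = INITIAL_DELAY
    while frame < held_frames:
        frames.append(frame)
        frame += REPEAT_INTERVAL
    return frames


print(shift_frames(30))   # [0, 16, 22, 28]
```

On the VCS this would boil down to one counter per player that is reloaded with 16 on a fresh press and with 6 after each repeat.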

This page has TONS of stuff, almost too much! Did you know that there was a hidden unfinished 2P versus in the NES Tetris???

There is also information about programming an AI, although that might be too much for the VCS... (I'll still have a look at it though...it may be acceptable to have a crude AI that operates over multiple frames...better than nothing, right?)

Notice how it's pretty impossible to play using only the "squiggly" pieces...

Edited November 14, 2018 by AkashicRecord

Lillapojkenpåön · November 14, 2018

Also, if anyone is interested, I can share detailed code snippets and explain what is happening for anyone writing (or wanting to write) their own VCS programs. Some of the routines are generic enough to be reusable in other games and programs.

I would like some kernel examples. I could write a kernel, but it would be incredibly bad since I'm new to this and haven't followed the progress in VCS development, which is probably light-years from my primitive way of thinking of it by now.. I'm currently looking at this since I want to reposition stuff. My current and first kernel does the exact same thing (everything) every scanline, and repositioning takes up two scanlines, doesn't that mean I can't draw anything on those two scanlines? Or am I thinking completely wrong there?

AkashicRecord · November 15, 2018

I'll have a better response later, but here is the positioning that I'm using for all objects. It works pretty well, but there is a 1-pixel discrepancy when you start resizing the player sprite objects...so you'll have to compensate for that one manually, but this routine will handle the more or less "standard" 1px positioning discrepancy inherent in the Missile and Ball objects.

To use this subroutine you'll have to call it with:

  JSR PositionSprite

...but that isn't enough. It takes two arguments, one in the A register, and the other in the X register.

The A register should hold the target horizontal pixel, and the X register contains the object to be positioned:

X=0 Player 1 (RESP0)
X=1 Player 2 (RESP1)
X=2 Player 1 Missile (RESM0)
X=3 Player 2 Missile (RESM1)
X=4 Ball (RESBL)
A=(0-159) Target Pixel

For the A register, I believe that 0-159 are the valid values and should be able to "hit" every pixel across the entire screen width...but the sprites will wrap around from the right edge of the screen to the left if you start overstepping the bounds...this can be abused to great effect for scrolling objects onto the screen from the right, similar to how it was done in Grand Prix....I think.

Here's the routine. Call it from VBLANK or even in the Overscan period if you want to be different:

;; The answer to life, the universe, and everything
;; ...including horizontal positioning on the Atari 2600
MAGIC = 46         ; Douglas Adams was off by 4

PositionSprite     ;; Sprite placement in 27 bytes -
                   ;; A=Target Pixel
                   ;; X=0 P0, X=1 P1, X=2 M0, X=3 M1, X=4 BL
    cpx  #2        ;  carry is set when X >= 2 (Missile or Ball)
    adc  #0        ;  add that carry for the 1px missile / ball error

    clc            ;  clear carry for ADC
    adc  #MAGIC    ;; HERE BE MAGIC WIZARD SHIT (bias for the loop below)

    sec            ;  set carry for SBC
    sta  WSYNC     ;  finish current scanline

_SBC15
    sbc  #15       ;  repeatedly subtract 15 (each pass is 5 CPU cycles = 15 pixels)
    bcs  _SBC15    ;  until crossover; A is now -15..-1

    sta  RESP0,X   ;  set coarse position (where we are "NOW" in TIA color clock cycles)

    eor  #$FF      ;  one's complement of the remainder: A is now 0..14
    adc  #$F9      ;  carry is clear here, so this subtracts 7: A is now -7..+7

    asl            ;  shift into place
    asl            ;
    asl            ;
    asl            ;  low nibble ends up in the high nibble for the upcoming HMOVE

    sta  HMP0,X    ;  set the fine position
                   ;
    rts            ;  and return from subroutine

A JSR to call this routine isn't enough however...once you call the routine for any number of objects (0-4), you have to strobe the HMOVE register (STA HMOVE) immediately after a STA WSYNC for it to "take effect"... If the HMOVE isn't done exactly after a scanline sync (or at least on cycle 0), then the values that were just calculated in the function and written to RESPx will be invalid because HMOVE operates differently depending on when it is strobed.

How the positioning function above works is it uses the X register to identify the object to be placed, fixes the pixel position for 3 of the 5 choices, and then uses the X register to index the write to RESP0, since the next 4 memory locations are the Player 2 sprite, Player 1 missile, Player 2 missile, and Ball objects respectively.
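For anyone who wants to poke at the arithmetic without firing up an emulator, here is a Python model of the register math in the routine. It is only a sketch: it tracks the A register and the loop count, not the actual TIA timing, and the function and variable names are made up for the example.

```python
MAGIC = 46


def position_sprite(target, obj):
    """Model of the A-register math in PositionSprite.

    obj: 0/1 = players, 2/3 = missiles, 4 = ball.
    Returns (loop_passes, hm_byte, fine_shift); fine_shift > 0 means
    HMOVE will move the object left, < 0 means right.
    """
    a = target + (1 if obj >= 2 else 0)   # cpx #2 / adc #0
    a = (a + MAGIC) & 0xFF                # clc / adc #MAGIC
    passes = 0
    while True:                           # sbc #15 / bcs _SBC15
        passes += 1
        a -= 15
        if a < 0:                         # borrow: carry clear, loop exits
            a &= 0xFF
            break
    a ^= 0xFF                             # eor #$FF
    a = (a + 0xF9) & 0xFF                 # adc #$F9 (carry is clear here)
    fine_shift = a - 256 if a >= 0x80 else a
    hm_byte = (a << 4) & 0xFF             # four ASLs
    return passes, hm_byte, fine_shift


print(position_sprite(0, 0))    # (4, 96, 6)   -> HMP0 = $60
print(position_sprite(13, 0))   # (4, 144, -7) -> HMP0 = $90
```

In this model, `15 * passes - fine_shift - target` comes out to the same constant for every target (54 for players, 55 for missiles and the ball), which is consistent with each loop pass standing for 15 pixels and the fine value covering the remainder in a -7..+7 range. Where that constant offset is absorbed on real hardware depends on when the RESPx write lands, which this sketch does not model.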

That said, an example to place the Player 1 sprite object (GRP0 via RESP0) at pixel position 80 might look like:

  ;  sta HMCLR
  ldx #0             ; X=0 selects Player 1 (RESP0)
  lda #80            ; A=target pixel
  jsr PositionSprite
  sta WSYNC
  sta HMOVE

The commented-out HMCLR is not necessary if you are only positioning once. I'm showing it just to illustrate that you'll need to be mindful of it if you start repositioning things more than once... Remember that you'll have to write to GRP0 to actually start drawing, and of course have a non-background color written to COLUP0 for the above example to actually do anything.

Edited November 15, 2018 by AkashicRecord

AkashicRecord · November 15, 2018

My current and first kernel does the exact same thing (everything) every scanline, and repositioning takes up two scanlines, doesn't that mean I can't draw anything on those two scanlines? Or am I thinking completely wrong there?

If you use each player, missile, and ball, you only have to really position those once every frame (for most purposes.) If you are reusing objects down the screen, then you have to start doing a bit of work to prep data for the scanlines on which to start (re)drawing the objects. That can definitely get tight, because you have a very limited amount of time to do things as it is.

Any area on your screen where nothing is changing is 76 cycles that are mostly unused. (You could even NOP your way across an entire scanline without a WSYNC as long as the used CPU cycles are exactly 76.) This is one reason why you will see games where things can be visually grouped into specific "lanes" on the screen. The programmer may be buying time to prep data for objects in those "lanes."

The positioning routine shown previously is very small in size and is very general-purpose, but it uses a lot of cycles for some pixel position values. For fast repositioning you'll probably want an algorithmic approach, or a table-based fast lookup, and even maybe a fixed-cycle kernel that executes from RAM...maybe even a combination of all of the above.

Maybe I can cover some of those things at a later time.

Edited November 15, 2018 by AkashicRecord

AkashicRecord · November 15, 2018

I'm currently looking at this since I want to reposition stuff.

I can't really comment on the above because I have no plans on using external hardware assistance, and that doesn't even handle the playfield at all. Everything that I'm doing is strictly vanilla VCS, and with only a 2 or 4K ROM image target, to boot. My primary "locus of focus" is on strict timing, especially the required "cycle 48" writes to PF2 for an asymmetric playfield. (I can't really even imagine writing a game that doesn't use the playfield at least somewhat extensively.)

The repositioning routine that I highlighted earlier could be rewritten in a variety of ways. Originally, I was using a few different versions of the same routine to position each object individually, rather than using a reusable routine which positions anything and everything.

One other version of the routine could store the calculated positioning values into variables (or the stack) at which point they are peeled off as needed, and applied to the appropriate RESPx and HMxx registers a bit more quickly / efficiently. (With an approach like this, you would probably want to just strobe HMOVE on every single scanline as well...which has an additional side effect of shaving 8 pixels off of the left screen edge.)

Another option would be to reposition objects directly into RAM variables which are interspersed between code which is written to (and executed from) RAM. That is pushing the envelope about as far as it can go, as you are reducing loads and stores down to the theoretical minimum of ~5 CPU cycles by dynamically writing data into an immediate addressing form.
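To put numbers on the "~5 CPU cycles" figure, here is a tiny Python tally using the standard 6502 cycle counts for the addressing modes involved. The table and the `cost` helper are just for illustration.

```python
# Base 6502 cycle counts (no page-crossing penalty included).
CYCLES = {
    "lda #imm": 2,     # operand lives in the instruction itself
    "lda zp": 3,
    "lda abs,y": 4,    # +1 if a page boundary is crossed
    "lda (zp),y": 5,   # +1 if a page boundary is crossed
    "sta zp": 3,       # TIA registers are written with zero-page stores
}


def cost(*ops):
    return sum(CYCLES[op] for op in ops)


pointer_kernel = cost("lda (zp),y", "sta zp")   # usual ROM kernel load/store
ram_kernel = cost("lda #imm", "sta zp")         # operand patched in RAM

print(pointer_kernel, ram_kernel, 4 * ram_kernel)   # 8 5 20
```

Four such load/store pairs come to 20 cycles, which lines up with the "about 20 cycles for both players' sprites and colors" mentioned earlier. The catch is that the immediate operands have to be written into RAM beforehand, so the cost moves out of the visible scanline rather than disappearing.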

Edited November 15, 2018 by AkashicRecord

Lillapojkenpåön · November 15, 2018

Thanks a lot man!!! I've been working with this all day and I'm too tired to write all my follow-up questions now, but I will tomorrow!

I'm currently repositioning M1 four times down the screen, the first one in vblank and the others in the kernel obviously..

My linecounter increments since I need to think right side up for now, and I reposition when the linecounter hits #30 #60 and #90, at the same time I also store the linenumber I want the ENAM1 to happen in temp5 so it gets enabled later like this, they only need to be one scanline tall so that's enough.

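    ldy #1          ; Y=1: ENAM1 bit D1 clear (missile off)
    cmp temp5       ; A is assumed to hold the current linecounter value
    beq DoEnam1     ; on the stored line, go do the INY
    .byte $24       ; BIT zp opcode: swallows the one-byte INY as its operand
DoEnam1:
    iny             ; Y=2: ENAM1 bit D1 set (missile on)
    sty ENAM1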

What I haven't really grasped yet is the fastest way to check if it's time to reposition? I was thinking that maybe I can check a bit in the linecounter instead (if there's a bit that turns on four times)

Or if I'm supposed to try and skip the regular wsync on the reposition frames where there are already two wsyncs?

And how I get rid of the shearing when the missile is at the far left?

And about a thousand other things.. my head is burning

AkashicRecord · November 16, 2018

What I haven't really grasped yet is the fastest way to check if it's time to reposition?

What I've been doing right now is keeping the target scanline in the A register. I perform a DCP (an undocumented 6502 opcode that decrements a memory location and then compares it with A) on the line counter and BEQ or BCS for a match. (The BEQ condition is also met if the value is 0, so be careful with that. I'm not drawing below line 16 in this game, so that condition isn't an issue for me here.)
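Since DCP isn't in the official opcode list, here is what it does to the flags, modeled in Python. The `dcp` helper, the dictionary standing in for zero page, and the `$80` counter address are placeholders for the example.

```python
def dcp(memory, addr, a):
    """Model of DCP: decrement memory[addr], then compare A with the result.

    Returns the Z and C flags as CMP would set them:
    Z when A == memory, C when A >= memory (unsigned).
    """
    memory[addr] = (memory[addr] - 1) & 0xFF
    m = memory[addr]
    return {"Z": a == m, "C": a >= m}


ram = {0x80: 31}          # hypothetical line counter at $80
target = 30               # target scanline held in A

print(dcp(ram, 0x80, target))   # counter is now 30: Z and C both set
print(dcp(ram, 0x80, target))   # counter is now 29: Z clear, C still set
```

So BEQ fires only on the exact line, while BCS stays true on every line from the target downward, which is handy for "start drawing here and keep going".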

You'll want to move to a decrementing counter system for sure, since you'll be getting "free" comparisons with zero and can eliminate a lot of CMP instructions.

And how I get rid of the shearing when the missile is at the far left?

You might not be updating the object in time if it's on the current scanline. Remember you only have ~22 cycles before the leftmost pixel will be displayed. (This is why you should typically reposition before the scanline, then strobe HMOVE and enable the object on the next line.) Objects toward the right side of the screen can be updated later, almost 70 cycles later...
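The ~22 cycle figure falls out of the scanline geometry: 76 CPU cycles per line, 3 color clocks per CPU cycle, and 68 color clocks of horizontal blank before the 160 visible pixels. A quick Python sketch (hypothetical helper name; it ignores the small delay between a register write and its visible effect):

```python
COLOR_CLOCKS_PER_CYCLE = 3
HBLANK_CLOCKS = 68
CYCLES_PER_LINE = 76


def pixel_at_cycle(cycle):
    """Visible pixel (0-159) the beam has reached at the start of a CPU cycle,
    or None while still in horizontal blank."""
    clock = cycle * COLOR_CLOCKS_PER_CYCLE
    if clock < HBLANK_CLOCKS:
        return None
    return clock - HBLANK_CLOCKS


print(pixel_at_cycle(22))   # None (still in HBLANK)
print(pixel_at_cycle(23))   # 1
print(pixel_at_cycle(48))   # 76
print(pixel_at_cycle(75))   # 157
```

Anything that must be in place for the leftmost pixels has to be written before cycle 23 or so, while an object near the right edge leaves most of the line free.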

Once you set the coarse position (say, during VBLANK), if you don't have to move the objects more than 7 or 8 pixels right or left, then you should be able to just set the HMPx or HMMx registers as necessary and strobe HMOVE immediately after every scanline sync, clearing the fine-position registers with HMCLR when things don't need to move...

Someone can correct me if I'm wrong.

Edited November 16, 2018 by AkashicRecord

AkashicRecord · November 17, 2018

I would like some kernel examples. I could write a kernel, but it would be incredibly bad since I'm new to this and haven't followed the progress in VCS development, which is probably light-years from my primitive way of thinking of it by now..

I'm tantalizingly close to finishing this 1P asymmetric playfield, single-scanline kernel for the first real prototype of the game.

Once I hammer out these remaining off-by-one and scanline counter bugs, I can go over some of the kernel in detail. Finishing this part is actually one of the biggest challenges to programming this game.

AkashicRecord · November 17, 2018

Here's a test output of the new kernel. This example is loading values from RAM and writing to PF2 asymmetrically (the game playfield won't be this large, or look exactly like this though.)

There is a black line glitch (among a few others) that I'm tracking down, but now that I look at it, it might be beneficial, not to mention look better, to have a black line separating the rows...

As you can probably see, I'm ignoring the player sprite right now by just leaving it black. There's a bit too much going on along the scanline as well, so I should preemptively load and change as much as possible (especially for this game's requirements), but these are some decent steps forward.

As it stands, right now I'm working on squeezing as much state change as I can for the updates on each scanline. For many cases, a lot of the updates are redundant or unchanging data...but this won't be the case for every game.

Edited November 17, 2018 by AkashicRecord

JeremiahK · November 18, 2018

One thing you can do is to write to VBLANK during the kernel. If you set bit D1, it will disable the screen, and then clearing D1 will turn it back on. This way you can simply turn off the screen while you update all the graphics registers for the next line, so that nothing actually gets drawn to the screen. It's a lot easier than setting all the colors to black, for example.

AkashicRecord · November 18, 2018

I really like this idea. It should have been obvious.

I'll experiment with writing the frame counter to VBLANK for an alternating effect every other frame.
