
AkashicRecord

Members
  • Content Count

    208

Everything posted by AkashicRecord

  1. Thanks so much, Dennis. Can't wait to pore over this during the weekend. Your disassemblies are amazing!
  2. Excellent, can't appreciate it enough. It might not be a necessary tactic, but it's definitely nice to know that it works.
  3. I took a few minutes to throw together a quick test program that just outputs 7 blank lines (over a blue background) and then outputs a VBLANK line and repeats this pattern 24 times down the screen. This seems perfectly fine in Stella and RetroArch for Android, but I have no idea what this would do on a real CRT. Until further notice, I think I'll press on with this idea and continue to integrate it into the new kernel for testing. VBLANK-test.bin
  4. ...I could be wrong however, so maybe someone with more experience can chime in.
  5. Writing to VSYNC would definitely screw up the signal, but VBLANK should be fine. I'm close to putting this stuff to the test later today...but not on real hardware.
  6. I feel guilty not posting any updated code yet, but I've been kind of busy the past few days. I did manage to make some adjustments to my kernel strategy though, so it looks like I should be able to pull off great-looking 1-Player game modes (as far as detail is concerned.) The 2P modes (at first) will probably have simple, solid color schemes though. (Mostly because I'm striving for a single-scanline kernel as much as possible.) I'm trying to make this kernel as "generic" as possible so that I can repurpose it in upcoming games, so it is taking a little more time than expected. (I was originally aiming for a two-month development timeline, but that may have been extended a teensy bit.) ...anyone interested in an "Excitebike" type of game for the 2600?
  7. That positioning routine is basically a perfectly calibrated subtraction loop. Because of HBLANK, we need to push the minimum coordinate out a bit so that the subtraction doesn't cross zero too early. "46" ends up being a suitable value for the loop. The other instructions make up for the lack of a bitwise Negation instruction in the 6502 instruction set. What we can do however, is flip all of the bits of the value, giving the one's-complement of the number. Using some bit-trickery and exploiting wrap-around, we can massage this value a bit and get exactly what is needed. The addition following the Exclusive-OR adjusts the value via wraparound so that it falls within a signed range of -7 to 7. The left shifts place the value in the appropriate area of the HMxx register...that is to say that only the upper 4 bits are relevant, so don't think of those 4 shifts as mathematical; they are just moving a bit pattern into place. (HMOVE semantics are the real complicated thing here.) One other factor here is that we have to treat the target pixel as an unsigned value, otherwise you wouldn't be able to target pixels 128-159. That's why I say, "it's complicated." As for the first question, DCP works on memory and gives a free compare with the A register, saving 1 byte and 2 cycles. There wouldn't be any reason to compare against a register that you are decrementing! And there's no way to INC or DEC the A register directly; on the 6502 you would use ADC and SBC for that.
  8. I really like this idea. It should have been obvious. I'll experiment with writing the frame counter to VBLANK for an alternating effect every other frame.
  9. Here's a test output of the new kernel. This example is loading values from RAM and writing to PF2 asymmetrically (the game playfield won't be this large, or look exactly like this though.) There is a black line glitch (among a few others) that I'm tracking down, but now that I look at it, it might be beneficial, not to mention look better, to have a black line separating the rows... As you can probably see, I'm ignoring the player sprite right now by just leaving it black. There's a bit too much going on along the scanline as well, so I should preemptively load and change as much as possible (especially for this game's requirements), but these are some decent steps forward. As it stands, right now I'm working on squeezing as much state change as I can for the updates on each scanline. For many cases, a lot of the updates are redundant or unchanging data...but this won't be the case for every game.
  10. I'm tantalizingly close to finishing this 1P asymmetric playfield, single-scanline kernel for the first real prototype of the game. Once I hammer out these remaining off-by-one and scanline counter bugs, I can go over some of the kernel in detail. Finishing this part is actually one of the biggest challenges to programming this game.
  11. What I've been doing right now is keeping the target scanline in the A register. I perform a DCP (decrement and compare) on the line counter and BEQ or BCS for a match. (The BEQ condition is also met if the value is 0, so be careful with that. I'm not drawing below line 16 in this game, so that condition isn't an issue for me here.) You'll want to move to a decrementing counter system for sure, since you'll be getting "free" comparisons with zero and can eliminate a lot of CMP instructions. You might not be updating the object in time if it's on the current scanline. (Remember you only have ~22 cycles before the leftmost pixel will be displayed. This is why you should typically reposition before the scanline, then strobe HMOVE and enable the object on the next line.) Objects toward the right side of the screen can be updated later, almost 70 cycles later... Once you set the coarse position (say, during VBLANK), if you don't have to move the objects more than 7 or 8 pixels right or left, then you should be able to just set the HMPx or HMMx registers as necessary and strobe HMOVE immediately after every scanline sync, clearing the fine-position registers with HMCLR when things don't need to move... Someone can correct me if I'm wrong.
  12. I can't really comment on the above because I have no plans on using external hardware assistance, and that doesn't even handle the playfield at all. Everything that I'm doing is strictly vanilla VCS, and with only a 2 or 4K ROM image target, to boot. My primary "locus of focus" is on strict timing, especially the required "cycle 48" writes to PF2 for an asymmetric playfield. (I can't really even imagine writing a game that doesn't use the playfield at least somewhat extensively.) The repositioning routine that I highlighted earlier could be rewritten in a variety of ways. Originally, I was using a few different versions of the same routine to position each object individually, rather than using a reusable routine which positions anything and everything. One other version of the routine could store the calculated positioning values into variables (or the stack) at which point they are peeled off as needed, and applied to the appropriate RESPx and HMxx registers a bit more quickly / efficiently. (With an approach like this, you would probably want to just strobe HMOVE on every single scanline as well...which has an additional side effect of shaving 8 pixels off of the left screen edge.) Another option would be to reposition objects directly into RAM variables which are interspersed between code which is written to (and executed from) RAM. That is pushing the envelope about as far as it can go, as you are reducing loads and stores down to the theoretical minimum of ~5 CPU cycles by dynamically writing data into an immediate addressing form.
  13. If you use each player, missile, and ball, you only have to really position those once every frame (for most purposes.) If you are reusing objects down the screen, then you have to start doing a bit of work to prep data for the scanlines on which to start (re)drawing the objects. That can definitely get tight, because you have a very limited amount of time to do things as it is. Any area on your screen where nothing is changing is 76 cycles that are mostly unused. (You could even NOP your way across an entire scanline without a WSYNC as long as the used CPU cycles are exactly 76.) This is one reason why you will see games where things can be visually grouped into specific "lanes" on the screen. The programmer may be buying time to prep data for objects in those "lanes." The positioning routine shown previously is very small in size and is very general-purpose, but it uses a lot of cycles for some pixel position values. For fast repositioning you'll probably want an algorithmic approach, or a table-based fast lookup, and even maybe a fixed-cycle kernel that executes from RAM...maybe even a combination of all of the above. Maybe I can cover some of those things at a later time.
  14. I'll have a better response later, but here is the positioning routine that I'm using for all objects. It works pretty well, but there is a 1-pixel discrepancy when you start resizing the player sprite objects...so you'll have to compensate for that one manually, but this routine will handle the more or less "standard" 1px positioning discrepancy inherent in the Missile and Ball objects.

      To use this subroutine you'll have to call it with:

          JSR PositionSprite

      ...but that isn't enough. It takes two arguments, one in the A register, and the other in the X register. The A register should hold the target horizontal pixel, and the X register contains the object to be positioned:

          X=0        Player 1 (RESP0)
          X=1        Player 2 (RESP1)
          X=2        Player 1 Missile (RESM0)
          X=3        Player 2 Missile (RESM1)
          X=4        Ball (RESBL)
          A=(0-159)  Target Pixel

      For the A register, I believe that 0-159 are the valid values and should be able to "hit" every pixel across the entire screen width...but the sprites will wrap around from the right edge of the screen to the left if you start overstepping the bounds...this can be abused to great effect for scrolling objects onto the screen from the right, similar to how it was done in Grand Prix....I think.

      Here's the routine. Call it from VBLANK or even in the Overscan period if you want to be different:

          ;; The answer to life, the universe, and everything
          ;; ...including horizontal positioning on the Atari 2600
          MAGIC = 46              ; Douglas Adams was off by 4

          PositionSprite          ;; Sprite placement in 27 bytes -
                                  ;; A=Target Pixel
              cpx #2              ;; X=0 P0, X=1 P1, X=2 M0, X=3 M1, X=4 BL
                                  ; ..check to see if Missile or Ball
              adc #0              ; add carry from above for 1px missile / ball error
              clc                 ; clear carry for ADC
              adc #MAGIC          ;; HERE BE MAGIC WIZARD SHIT
              sec                 ; set carry for SBC
              sta WSYNC           ; finish current scanline
          _SBC15
              sbc #15             ; repeatedly subtract 15
              bcs _SBC15          ; until crossover
              sta RESP0,X         ; set coarse position (where we are "NOW" in TIA color clock cycles)
              eor #$FF            ; ...
              adc #$F9            ; it's complicated...
              asl                 ; shift into place
              asl                 ;
              asl                 ;
              asl                 ; and get an appropriate value for the upcoming HMOVE
              sta HMP0,X          ; set the fine position
                                  ;
              rts                 ; and return from subroutine

      A JSR to call this routine isn't enough however...once you call the routine for any number of objects (0-4), you have to strobe the HMOVE register (STA HMOVE) immediately after a STA WSYNC for it to "take effect"... If the HMOVE isn't done exactly after a scanline sync (or at least on cycle 0), then the values that were just calculated in the function and written to RESPx will be invalid because HMOVE operates differently depending on when it is strobed.

      How the positioning function above works is it uses the X register to identify the object to be placed, fixes the pixel position for 3 of the 5 choices, and then uses the X register to index the write to RESP0, since the next 4 memory locations are the Player 2 sprite, Player 1 missile, Player 2 missile, and Ball objects respectively.

      That said, an example to place the Player 1 sprite object (GRP0 via RESP0) at pixel position 80 might look like:

          ; sta HMCLR
          ldx #0              ; X=0 selects the Player 1 sprite (RESP0)
          lda #80             ; A=target pixel
          jsr PositionSprite
          sta WSYNC
          sta HMOVE

      The commented-out HMCLR is not necessary if you are only positioning once. I'm showing it just to illustrate that you'll need to be mindful of it if you start repositioning things more than once... Remember that you'll have to write to GRP0 to actually start drawing, and of course have a non-background color written to COLUP0 for the above example to actually do anything.
  15. Here's a very in-depth breakdown of the Nintendo Tetris game. There are quite a few factors here that I was not aware of. One interesting point is that the input has an initial 16 frame delay for lateral input repetition, at which point the game pieces move horizontally every 6 frames (~0.10 seconds). This is something comfortable that I could use as a starting point for this game...although I heavily object to the automatic drop speed doubling from level 28 to 29... It's almost impossible for a human to play at a 1 frame drop increment... http://meatfighter.com/nintendotetrisai/ This page has TONS of stuff, almost too much! Did you know that there was a hidden unfinished 2P versus mode in the NES Tetris??? There is also information about programming an AI, although that might be too much for the VCS... (I'll still have a look at it though...it may be acceptable to have a crude AI that operates over multiple frames...better than nothing, right?) Notice how it's pretty impossible to play using only the "squiggly" pieces...
  16. Reminds me of the performance pieces where office lights in a skyscraper are wired to a Tetris game logic and a game is played on the side of the building. https://youtu.be/eMBguPuKPi4 The only way this could be better is if the building was one rigged for demolition, and the entire field played and set for a giant line piece the entire length of the building which triggers the demolition charges at the end as all lines are cleared...along with the building... As for the future developments, I'm currently investigating a timer-based kernel idea. Could be interesting.
  17. Today, I'm doing a little more work on the kernel, this time working the Carry flag into the equation. This should help streamline the kernel a little bit more, and things will also make a little more sense. I have some off-by-one errors creeping in that need to be taken care of as well. I'll be posting another playfield test soon, but this one will be a little different. I'm looking at having a completely black background for the game (even outside of the game playfield), and to use gradient shading for the placed bricks instead. I think it will look quite nice, but it will require changing another TIA register (obviously.) I should be able to re-use the index for the game playfield data and simply apply it to a color table located in ROM. Each game level would have 20 possible color shades for the game bricks, so I could get kind of "fancy" here, if I want. That leads me to some interesting ideas. One very difficult level could use pure black for some of the lower blocks, essentially requiring you to remember where you placed some of the bricks. Maybe the "ghost" piece outline will come in handy here... I'll come back to these ideas later, because I think this could be interesting and challenging. Maybe one of the 2P modes could allow a player to "black out" his opponent's game field for a few seconds...or for one move, etc...? Also, if anyone is interested, I can share detailed code snippets and explain what is happening for anyone writing (or wanting to write) their own VCS programs. Some of the routines are generic enough to be reusable in other games and programs.
  18. I feel a little less embarrassed about never upgrading past Snow Leopard now...
  19. Now is probably a good time for a quick list of what's done...or mostly done, and a list of things that are left. (Working or mostly working) - Working application shell - TV signal sync / 192 scanlines - Colored sprites - Animation - Random number generation - Joystick polling / Input control - World coordinate system - Asymmetric game playfield / Data loading (Things still left TODO) - 1P / 2P kernel separation - Game logic - Line clearing / Scoring - Timing adjustments - Sound - Additional game modes - Detail / Graphics enhancements - Public beta testing - Music - Manual / Box - Game cartridge
  20. If it's any consolation, I was using my son's Magna Doodle the other day for some coding...
  21. Maybe if you drastically cut down the visual area...like only use PF2...? I can't imagine squeezing any sort of [good] game logic in between the raycasting and rendering. This demo seems RAM-heavy without even looking at any code yet. The fabled Swordquest: Airworld would merit this sort of treatment...that, or Wizardry.
  22. I made a lot more headway on the kernel timing today by breaking it into a few similar pieces. Here's another screenshot showing the asymmetric playfield by utilizing "cycle 48" PF2 writes. (There are still some bugs to be squashed.) This example is just using junk data and colors, but it illustrates that the principle is working while advancing down the playfield, as well as displaying a game piece (somewhat) properly from sprite data. Every 8th scanline triggers an adjustment of the index into the playfield "brick" data, and this is mostly working properly as well.
  23. This isn't much, but I felt obligated to at least share something recent... This is the debugger output of one of my recent test protos. The disassembly listing to the right shows how I've labeled the CPU cycles entering into the main portion of the kernel. (Note the second write to PF2 occurring on cycle #45 and ending on cycle #48.) (In this test, I'm only performing the "cycle 48 write" on one scanline, but it isn't much to apply this code to the other parts of the kernel...)
  24. This morning I managed to make some more strides forward, especially for the 1P prototype. I fixed an indexing bug which I was completely overthinking, and managed to get a single-scanline kernel which loads the player graphics, the color, and each half of the playfield data, successfully finishing the write for the asymmetric (right-side) portion of the screen to the appropriate TIA register *exactly* on machine cycle 48, and then changes the player color value for loading on the next scanline...whew, that was a lot. The 2P prototypes will probably benefit from a 2-line kernel, but I might be able to squeak by with a 1-liner... We're getting close! I still have some room for improvement, as I can apply the color changes on prior scanlines, but right now that isn't necessary, and I don't have to litter my code with NOPs everywhere. (I actually don't have a single NOP...)
  25. Right now, my graphics resources are very minimal, really only a handful of bytes for everything. But even with that, I'm being a little wasteful. There are a few ways to represent the pieces, whether it be bytes or nibbles, or even mathematical and bitwise operations. For now, I think simpler is better, and that's certainly easier to debug. I'm using a few 16-bit pointers and indirectly indexing into them using the Y register to access the sprite data. After random piece selection, I build a graphics pointer to a table which is indexed by the piece's current rotation. I'm doing pretty good so far with keeping the code to a minimum...sort of. There are a few other tables to handle some adjustments which drive the kernel...things like: what scanline to start and stop drawing on, and (later) when to change colors, and when to start loading dirty playfield data...i.e. the "placed" bricks, etc. That said, I was spending some quality time earlier this morning actually writing part of the kernel in RAM, building opcodes amongst some very carefully placed variables and pointers. This allowed for the fastest possible data loads and stores (about 20 cycles for both players' sprites and colors), as I could use immediate addressing and dynamically rewrite the operands elsewhere. The problem with this was that while it was cool as hell, it was looking like some major overkill. Maybe I'll come back to it later if I'm really starved on clock cycles, because it actually worked quite well...at the expense of about 20 extra bytes of RAM.
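
For anyone following the arithmetic in the PositionSprite posts above (posts 7 and 14), here is a rough Python model of what the routine does to the A register. It is only a sketch of the number-crunching: it does not model TIA timing, WSYNC, or where the object actually lands on screen. The function name `position_sprite` and its return tuple are made up for illustration and are not part of the original routine.

```python
MAGIC = 46  # same constant as in the assembly routine


def position_sprite(a: int, x: int):
    """Model the A-register arithmetic of PositionSprite.

    a: target pixel (0-159), x: object index (0-4).
    Returns (loop_passes, hm_value, signed_offset), where hm_value is the
    byte stored to HMP0,X and signed_offset is its upper nibble read as a
    signed number. Hypothetical helper, for illustration only.
    """
    # cpx #2 / adc #0: carry is set when X >= 2 (missiles and ball),
    # so those objects get +1 added to the target pixel.
    a += 1 if x >= 2 else 0

    # clc / adc #MAGIC: the result is at most 206, so it fits in 8 bits.
    a += MAGIC

    # sec / sbc #15 / bcs: subtract 15 until the result borrows.
    # On hardware each pass takes a fixed number of cycles, which is what
    # turns the pass count into a coarse horizontal position.
    loop_passes = 0
    while True:
        a -= 15
        loop_passes += 1
        if a < 0:
            a &= 0xFF  # 8-bit wraparound, carry is now clear
            break

    # eor #$FF: one's complement of the (negative) remainder.
    a ^= 0xFF

    # adc #$F9 with carry clear: wraps the value into -7..7.
    a = (a + 0xF9) & 0xFF
    signed_offset = a - 256 if a >= 128 else a

    # asl x4: move the low nibble into the upper nibble for HMxx.
    hm_value = (a << 4) & 0xFF
    return loop_passes, hm_value, signed_offset


if __name__ == "__main__":
    for pixel in (0, 80, 159):
        passes, hm, offset = position_sprite(pixel, 0)
        print(f"pixel {pixel:3d}: passes={passes:2d} HM=${hm:02X} offset={offset:+d}")
```

Running it over every pixel from 0 to 159 and every object index shows the fine offset always staying between -7 and 7 and the low nibble of the HM value always being zero, which lines up with the "only the upper 4 bits are relevant" remark in post 7.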