Open Club  ·  45 members

Atari VCS/2600 development using the Harmony/Melody board. See the GENERAL discussion area for setting up your environment. See the CDFJ discussion area for the tutorial.

1. What's new in this club
2. Tips and Tricks

I thought I'd start a collection of useful tidbits. If you know of something, please post it and I'll add it to this list.

Hardware Division via Multiplication - the ARM doesn't support division in hardware, but we can use this trick to quickly divide by a fixed value.
3. Hardware Division via Multiplication

In Frantic I draw the playfield using datastreams with a 0.20 increment, which repeats the playfield data over 5 scanlines. All collision processing in Frantic is done in software, so to determine collisions with the playfield I needed to divide the Y position by 5.

The ARM used in the Harmony/Melody does not have support for division, but it does support multiplication. Most of you are probably familiar with fractional or subpixel positioning of the movable objects on the 2600, and we can use the same idea to implement hardware division by using what's known as Reciprocal Multiplication. Basically Y / 5 is the same as Y * 0.20, and we can implement that by allocating the lower X bits of the 32 bit value as the fractional value. The article I linked to has this handy chart:

     n   scaled reciprocal   hex    shift count
     3   1010101010101011    AAAB   17
     5   1100110011001101    CCCD   18
     6   1010101010101011    AAAB   18
     7   10010010010010011          19
     9   1110001110001111    E38F   19
    10   1100110011001101    CCCD   19
    11   1011101000101111    BA2F   19
    12   1010101010101011    AAAB   19
    13   1001110110001010    9D8A   19
    14   10010010010010011          20
    15   1000100010001001    8889   19

Which shows us that to divide by 5 we need to use the lower 18 bits as the fractional value, and that 0.20 works out to be 0xCCCD.

    #define DIV_BY_5        0xCCCD
    #define DIV_BY_5_SHIFT  18

    y5 = (y * DIV_BY_5) >> DIV_BY_5_SHIFT;

You'll notice the list skips a number of values - for those we can use bit shifting to divide by powers of 2:

    result = value >> 1; // divide by 2
    result = value >> 2; // divide by 4
    result = value >> 3; // divide by 8
    result = value >> 4; // divide by 16

The table is based on needing to divide a 16 bit value, which does not work for /7 or /14: their scaled reciprocals are 17 bits wide, so multiplying them by a 16 bit value can overflow 32 bits. I haven't looked into it, but suspect they'd work just fine to divide a Y position, which is an 8 bit value. The article does go into the extra steps needed to divide a 16 bit value by 7 or 14.
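A quick way to gain confidence in a constant from the chart is to compare it against real division on a PC before using it on the ARM. The sketch below does that; the function names and the count_mismatches() helper are made up for this illustration, and only the constants and shift counts come from the chart.

```c
#include <stdint.h>

/* Scaled reciprocals and shift counts copied from the chart. */
#define DIV_BY_5        0xCCCDu
#define DIV_BY_5_SHIFT  18
#define DIV_BY_7        0x12493u   /* 17 bit constant: 10010010010010011 */
#define DIV_BY_7_SHIFT  19

/* Divide by 5 with one multiply and one shift. */
static uint32_t div_by_5(uint32_t y)
{
    return (y * DIV_BY_5) >> DIV_BY_5_SHIFT;
}

/* Divide by 7 the same way; only safe while the product fits in 32 bits. */
static uint32_t div_by_7(uint32_t y)
{
    return (y * DIV_BY_7) >> DIV_BY_7_SHIFT;
}

/* Hypothetical PC-side helper: counts the inputs in 0..limit where the
   multiply-and-shift result differs from true division. The '/' operator
   is used here only as the reference answer. */
static uint32_t count_mismatches(uint32_t (*fn)(uint32_t),
                                 uint32_t divisor, uint32_t limit)
{
    uint32_t y;
    uint32_t bad = 0;

    for (y = 0; y <= limit; y++)
        if (fn(y) != y / divisor)
            bad++;
    return bad;
}
```

Calling count_mismatches(div_by_5, 5, 65535) covers every 16 bit input for /5, while count_mismatches(div_by_7, 7, 255) covers the 8 bit range a Y position would need for /7.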
4. Part 3 - Beginnings of Collect 3

Ahh... excellent, I get it. Thanks for the response. I can never get all the way through the instructions before beginning to play with code or a new device! 😃
5. Part 3 - Beginnings of Collect 3

You're getting ahead of the tutorial. Display Data RAM is not mapped into 6507 address space, so you cannot directly load, store, increment, decrement, etc. the value. You would:

    put MyVar: ds 1 as part of the Zero Page RAM
    put _MYVAR: ds 1 as part of Display Data RAM
    update CallArmCode to copy MyVar into _MYVAR

Then your 6507 code would use MyVar. In Part 8 the 6502 variables TimeLeftOS and TimeLeftVB are added and get passed to the ARM code.
6. Part 3 - Beginnings of Collect 3

Darrell, I'm a bit confused about how the 6502 passes data to the ARM processor. CDFJ seems to recognize registers such as SWCHA and INPUT4, etc. when identified like so:

    _SWCHA:     ds 1    ; joystick directions to ARM code

...but what if I want to define a variable in 6507 ASM code, increment the variable value throughout the Kernel, and then make the final value available to the ARM processor in overscan? With the understanding that I would still need to include it in the "CallArmCode:" routine in order to pass it to the ARM, would I still declare the variable in the same location as _SWCHA in the same manner, like so:

    _MYVAR:     ds 1    ; a variable that is not a register

...and could I then use that variable as I normally would in a pure ASM program, like so:

        ldx #5
        stx _MYVAR
        inc _MYVAR
7. Part 9 - Arena

Display Data RAM is defined as this:

    void* DDR = (void*)0x40000800;
    #define RAM         ((unsigned char*)DDR)
    #define RAM_INT     ((unsigned int*)DDR)
    #define RAM_SINT    ((unsigned short int*)DDR)

So the memory accessed when using these defines is:

    RAM[0] is 1 byte at 0x40000800
    RAM[1] is 1 byte at 0x40000801
    RAM[2] is 1 byte at 0x40000802
    RAM[3] is 1 byte at 0x40000803
    ...
    RAM_INT[0] is 4 bytes starting at 0x40000800
    RAM_INT[1] is 4 bytes starting at 0x40000804
    RAM_INT[2] is 4 bytes starting at 0x40000808
    RAM_INT[3] is 4 bytes starting at 0x4000080C
    ...
    RAM_SINT[0] is 2 bytes starting at 0x40000800
    RAM_SINT[1] is 2 bytes starting at 0x40000802
    RAM_SINT[2] is 2 bytes starting at 0x40000804
    RAM_SINT[3] is 2 bytes starting at 0x40000806
    ...

From dasm we get the location of _BUF_JUMP1 and _BUF_JUMP1_EXIT as an offset in bytes from RAM, so we need to divide the offset by 2 for RAM_SINT. If we're using RAM_INT we need to divide the offset by 4. An example of that is also in InitGameBuffers():

    // Zero out the buffers used to hold the player, missile, and ball data
    // It's fastest to use myMemsetInt, but requires
    // proper alignment of the data streams (the ALIGN 4 pseudo-ops found in the
    // 6507 code). Additionally the offset(_EVERY_FRAME_ZERO_START) and
    // byte count(_EVERY_FRAME_ZERO_COUNT) must both be divided by 4.
    myMemsetInt(RAM_INT + _EVERY_FRAME_ZERO_START/4, 0, _EVERY_FRAME_ZERO_COUNT/4);

This is one of those things that's easy to miss - Dionoid pointed out one I missed in Part 5.
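The /2 and /4 fall out of C pointer arithmetic, which counts in elements rather than bytes. Here is a small PC-side sketch of that: ddr_standin, the two helper functions, and the two offsets are invented for the example (real offsets come from dasm), and it assumes a 2 byte short and a 4 byte int as on the Harmony/Melody's ARM.

```c
/* Stand-in for Display Data RAM so the sketch can run on a PC.
   On the Harmony/Melody, DDR points at 0x40000800 instead.
   An unsigned int array keeps the storage 4 byte aligned. */
static unsigned int ddr_standin[16];
static void *DDR = ddr_standin;

#define RAM         ((unsigned char*)DDR)
#define RAM_INT     ((unsigned int*)DDR)
#define RAM_SINT    ((unsigned short int*)DDR)

/* Hypothetical byte offsets, playing the role of values exported by dasm. */
#define JUMP_BYTE_OFFSET  8     /* must be a multiple of 2 */
#define ZERO_BYTE_OFFSET  16    /* must be a multiple of 4 */

/* Byte offset from the start of DDR of the element RAM_SINT[index]. */
static unsigned int sint_byte_offset(unsigned int index)
{
    return (unsigned int)((unsigned char *)(RAM_SINT + index) - RAM);
}

/* Byte offset from the start of DDR of the element RAM_INT[index]. */
static unsigned int int_byte_offset(unsigned int index)
{
    return (unsigned int)((unsigned char *)(RAM_INT + index) - RAM);
}
```

With these, sint_byte_offset(JUMP_BYTE_OFFSET / 2) lands back on JUMP_BYTE_OFFSET, whereas passing the byte offset without dividing would land twice as far into the buffer.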
8. Part 9 - Arena

Is that correct - signed int values are 4 bytes (32-bit) in length, while the jump addresses are 2 bytes (16 bits)? EDIT: I see #define RAM_SINT ((unsigned short int*)DDR), so they are indeed 16 bit, but I'm still unsure why the /2 is needed? Chris

10. Progress update

Program logic for the next update is finished. I'm now doing a review of the code to revise the comments, such as this:

    Mode:       ds 1    ; $00 = splash, $01 = menu, $80 = game

became this:

    Mode:       ds 1    ; $00 = splash, $01 = menu, $80 = game
                        ; these values allow for easy testing of Mode:
                        ;   LDA Mode
                        ;   BMI GAME_ROUTINE
                        ;   BNE MENU_ROUTINE
                        ;   BEQ SPLASH_ROUTINE

and revise variable names such as:

    _ARENA_HEIGHT = 182

became this:

    _ARENA_SCANLINES = 180  ; number of scanlines for the arena

because the original name could easily be confused with the new heights table for the arenas:

    _ARENA_HEIGHTS:
            .byte Arena1_Height
            .byte Arena2_Height
            .byte Arena3_Height
            .byte Arena4_Height

The 6507 review is finished, and I'm about halfway done with the C review. I hope to get the next entry posted this weekend.
11. Progress update

After a long break, I'm finally getting back into 2600 development. Been working on Part 9 - Arena. It shows off how to use the incremental data streams as well as a jump data stream for program flow control.

You may have noticed the playfield data is using 5 bytes per row instead of the usual 6. I'm showing how to convert that to the 6 bytes needed for the playfield registers. I used a slightly more advanced version of this in Stay Frosty 2, which allowed for scrolling levels of arbitrary widths.

    void PrepArenaBuffers()
    {
        // The 40 bits for each row of arena data are stored in 5 bytes arranged like this:
        //   byte 0   byte 1   byte 2   byte 3   byte 4
        //   33333333 33222222 22221111 11111100 00000000
        //   98765432 10987654 32109876 54321098 76543210
        //
        // They need to be converted to this arrangement for the playfield datastreams:
        //             LEFT                       RIGHT
        //   PF0      PF1      PF2      PF0      PF1      PF2
        //   3333---- 33333322 22222222 1111---- 11111100 00000000
        //   6789---- 54321098 01234567 6789---- 54321098 01234567

        int r;
        unsigned char *arena = ROM + arena_graphics[mm_arena];
        unsigned char *arena_pf0_left = RAM + _BUF_PF0_LEFT;
        unsigned char *arena_pf1_left = RAM + _BUF_PF1_LEFT;
        unsigned char *arena_pf2_left = RAM + _BUF_PF2_LEFT;
        unsigned char *arena_pf0_right = RAM + _BUF_PF0_RIGHT;
        unsigned char *arena_pf1_right = RAM + _BUF_PF1_RIGHT;
        unsigned char *arena_pf2_right = RAM + _BUF_PF2_RIGHT;

        for(r=0; r<arena_heights[mm_arena]; r++)
        {
            arena_pf0_left[r] = BitReversal(arena[r*5 + 0]) << 4;
            arena_pf1_left[r] = (arena[r*5 + 0] << 4) + (arena[r*5 + 1] >> 4);
            arena_pf2_left[r] = BitReversal((arena[r*5 + 1] << 4) + (arena[r*5 + 2] >> 4));
            arena_pf0_right[r] = BitReversal(arena[r*5 + 2]);
            arena_pf1_right[r] = arena[r*5 + 3];
            arena_pf2_right[r] = BitReversal(arena[r*5 + 4]);
        }
    }
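BitReversal() is not shown in this post, so here is a sketch of what an 8 bit reversal could look like, plus the same row conversion applied to a single row so it can be tried on a PC. Both reverse_bits8() and convert_row() are names made up for this example, and the assumption is that BitReversal() simply mirrors the bit order of one byte; the real helper may be implemented differently.

```c
/* Reverse the bit order of one byte (bit 7 <-> bit 0, bit 6 <-> bit 1, ...).
   Assumed to match what BitReversal() does. */
static unsigned char reverse_bits8(unsigned char value)
{
    /* swap nibbles, then bit pairs, then neighbouring bits */
    value = (unsigned char)(((value & 0xF0) >> 4) | ((value & 0x0F) << 4));
    value = (unsigned char)(((value & 0xCC) >> 2) | ((value & 0x33) << 2));
    value = (unsigned char)(((value & 0xAA) >> 1) | ((value & 0x55) << 1));
    return value;
}

/* Convert one 5 byte arena row into the 6 playfield bytes, in the order
   PF0 left, PF1 left, PF2 left, PF0 right, PF1 right, PF2 right.
   Uses the same expressions as the loop in PrepArenaBuffers(). */
static void convert_row(const unsigned char row[5], unsigned char pf[6])
{
    pf[0] = (unsigned char)(reverse_bits8(row[0]) << 4);
    pf[1] = (unsigned char)((row[0] << 4) + (row[1] >> 4));
    pf[2] = reverse_bits8((unsigned char)((row[1] << 4) + (row[2] >> 4)));
    pf[3] = reverse_bits8(row[2]);
    pf[4] = row[3];
    pf[5] = reverse_bits8(row[4]);
}
```

A row with only the leftmost and rightmost pixels set (0x80, 0, 0, 0, 0x01) should come out with bit 4 of the left PF0 and bit 7 of the right PF2 set, which is a handy sanity check for the bit ordering.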
12. Part 4 - CDFJ Debugging with Stella

The missing elements are still there. Just increase the debugger's screen size in the developer options.
13. Part 4 - CDFJ Debugging with Stella

It would be a useful feature if Stella allowed you to highlight 1, 2, or 4 bytes in the Cartridge RAM and displayed the result in MSB order for both hex and decimal. The debugger has also changed format with the latest release so it is shorter in height, and the "label", "dec", and "bin" fields are gone from the Cartridge RAM. However, I think something like what I mocked up below could work. What do you guys think? I suppose the length of the box for decimal would need to increase, but that shouldn't be a show stopper.
14. Part 8 - Score & Timer

I thought the break would be a week max, turned out to be a month. The last feature request is quite nice, I should have thought of it myself - you can now start a new game of Kaboom! Deluxe! via the paddle button. Anyway, this weekend I plan to review where I left off on the CDFJ tutorial, then start work on Part 9 - Arena.
15. Part 8 - Score & Timer

Saw something today that reminded me of my unfinished Kaboom! Deluxe! hack. There's not much left to do in it, so I think I'll take a brief break from CDFJ and see if I can't finish the hack this weekend.

19. Part 5 - Source Improvements

The datastream for audio is AMPLITUDE:

        lda #AMPLITUDE
        sta AUDV0

However, it's a little different from the other datastreams. Instead of reading from a buffer you populated in Display Data RAM, it will either:

    retrieve the value from a ROM buffer* that's packed with 2 samples per byte (speech in Draconian)
    create the value on the fly for 3 voice music (music in Mappy)

I have a music example somewhere, will track it down and get it to you.

In Collect 3 the ARM code is triggered via:

        ldx #$FF
        stx CALLFN

Change it to this to have AUDV0 updated once per scanline while the ARM code is running:

        ldx #$FE
        stx CALLFN

* I think it might even work if the buffer is in RAM, which would allow you to manipulate the samples.
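To picture what "2 samples per byte" means for the data you'd prepare, here is a sketch of packing and reading 4 bit samples. The nibble order (first sample in the high nibble) is purely an assumption for illustration, as the post does not say which order CDFJ expects, and pack_samples() and sample_at() are hypothetical names, not part of defines_cdfj.h.

```c
/* Pack two 4 bit amplitudes (0-15) into one byte.
   ASSUMPTION: the first sample goes in the high nibble. */
static unsigned char pack_samples(unsigned char first, unsigned char second)
{
    return (unsigned char)(((first & 0x0F) << 4) | (second & 0x0F));
}

/* Read sample number 'index' back out of a packed buffer,
   using the same assumed nibble order. */
static unsigned char sample_at(const unsigned char *packed, unsigned int index)
{
    unsigned char byte = packed[index / 2];

    return (index & 1) ? (unsigned char)(byte & 0x0F)
                       : (unsigned char)(byte >> 4);
}
```

Packing halves the ROM needed for a sample, since AUDV0 only uses 4 bits of volume anyway.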
20. Part 5 - Source Improvements

Hi @SpiceWare,

I'm trying to add a 4-bit PCM audio sound-effect to my game (writing AUDV0 on every scanline). I got something working in plain 6507 assembly, and now I'm trying to use the ARM to pre-calculate the amplitudes so I can simply read them from a data-stream and write to AUDV0 (instead of calculating them on the 6507 each scanline using a 16-bit cycle register and a 16-bit pitch/delta value).

However, it looks like (partial?) support for that is already in the example project that you shared (referring to methods like 'setNote' and 'setWaveform' in defines_cdsf.h). But I have no idea how to actually set a note from ARM and which predefined(?) datastream to read and store in AUDV0 from assembly code.

Also, I was wondering how to keep writing to AUDV0 during the time the ARM is called on VerticalBlank and OverScan. I found this forum post by you, where you say that the 6507 is being fed NOPs during the duration of an ARM subroutine being called. That makes sense, as the 6507 must be doing something while one of the ARM functions is called. And later in that same forum discussion you mention a ZP routine that can run while the ARM is still running, using the 'ldx PosObject' instruction to check if the ARM has finished yet.

Do you maybe have a simple example of how to play a single note using 4-bit PCM audio, using CDFJ? Maybe you already explained this in one of your earlier posts on DPC+, but I couldn't find it using the forum's search.

Cheers,
Dion

BTW: programming games using CDFJ is an amazing experience! I like how it brings me new possibilities, while I still have to fight the limitations of the '2600. Just like my 6502 assembly code, my C code also has to be highly optimized. You can't get sloppy/lazy 🙂
21. Part 6 - Console Detection

Making headway on Part 7. The menu is fully functional, what's left is:

    add the routines to colorize X'd options red
    clean up the code
    add comments

It'll be sometime next week though, as this weekend I'm taking my nephew to Fully Charged LIVE in Austin.
22. Part 6 - Console Detection

The 2600 must output 262 scanlines for those times to be valid. If you look at the 6507 code before the ARM runs:

    VerticalSync:
            ldy #2
            ldx #VB_TIM64T
            sty WSYNC
            sty VSYNC
            stx TIM64T
            sty WSYNC
            sty WSYNC
            ldy #0          ; 2  2 - zero out some TIA registers while
            sty GRP0        ; 3  5   we have some free time
            sty GRP1        ; 3  8
            sty WSYNC
            sty VSYNC

        ; figure out which ARM Vertical Blank routine to run
            lda Mode        ; $00 = splash, $01 = menu, $80 = game
            bmi vbgame
            beq vbsplash
            ldy #_FN_MENU_VB    ; going to run function MenuVerticalBlank()
            .byte $0c           ; NOP ABSOLUTE, skips over ldy #_FN_SPLASH_VB
    vbsplash:
            ldy #_FN_SPLASH_VB  ; going to run function SplashVerticalBlank()
            .byte $0c           ; NOP ABSOLUTE, skips over ldy #_FN_GAME_VB
    vbgame:
            ldy #_FN_GAME_VB    ; going to run function GameVerticalBlank()
            jsr CallArmCode

you'll see that both frame 1 and frame 2 will follow the exact same path, and thus take the same number of 6507 cycles, resulting in exactly 262 scanlines worth of time transpiring between ARM calls. You can add more 6507 code in there as long as the path taken for frame 1 and frame 2 is identical.

Likewise in the C code both frames 1 and 2 will take the same path until they get to the IF/ELSE IF block of code in SplashVerticalBlank().

    void SplashVerticalBlank()
    {
        int i;
        int j;
        int color;
        int console;

        color = 0;  // default to black

        // used to show if the console is a 2600 or 7800
        console = ((is_7800 ? _SPLASH_78 : _SPLASH_26 ) & 0xfff) + 0x6000;

        if (frame == 1) // frame is incremented in SplashOverScan()
        {
            T1TC = 0;   // make sure timer starts at 0
            T1TCR = 1;  // turn on timer
        }
        else if (frame == 2)
        {
            T1TCR = 0;  // turn off timer after 1 frame

            // the time it takes to output 262 scanlines is different for
            // NTSC, PAL, and SECAM consoles, so we can use that to detect
            // which one and adjust the color values.
            if (T1TC < (0x11e8ff + 0x11d329)/2)
                tv_type = NTSC;
            else if (T1TC > (0x11fd2b + 0x11e8ff)/2)
                tv_type = PAL;
            else
                tv_type = SECAM;
        }
        ...
    }

The value in T1TC stops changing as soon as T1TCR is set to 0, so the extra lines of code added for joystick override will have no impact on the values being compared against.
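The comparison logic can be pulled into a pure function to try out on a PC. In the sketch below, the three tick counts are taken from the code above and the label attached to each is inferred from how they are compared (0x11d329 as NTSC, 0x11e8ff as SECAM, 0x11fd2b as PAL). The enum and the classify_ticks() name are hypothetical and stand in for the tutorial's own tv_type constants.

```c
/* Hypothetical stand-ins for the tutorial's NTSC / PAL / SECAM constants. */
enum tv_kind { TV_NTSC, TV_PAL, TV_SECAM };

/* Timer counts for 262 scanlines, taken from the comparisons in
   SplashVerticalBlank(); which console each belongs to is inferred. */
#define TICKS_NTSC   0x11d329u
#define TICKS_SECAM  0x11e8ffu
#define TICKS_PAL    0x11fd2bu

/* Pick the console type whose expected count is nearest, by comparing
   against the midpoints between neighbouring expected counts. */
static enum tv_kind classify_ticks(unsigned int ticks)
{
    if (ticks < (TICKS_SECAM + TICKS_NTSC) / 2)
        return TV_NTSC;
    else if (ticks > (TICKS_PAL + TICKS_SECAM) / 2)
        return TV_PAL;
    else
        return TV_SECAM;
}
```

Using midpoints rather than exact matches leaves room for small differences between consoles, as long as the measured interval really is 262 scanlines.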
23. Part 6 - Console Detection

How does that work? Are those times specific to that exact code, I mean, is it adjusted for any C code or 6507 cycles? Or what exactly do I need to be careful not to change to make sure it keeps working?
24. Part 6 - Console Detection

The option was because Draconian was the first time we used it in a released game, and I didn't know if the detection was 100% or not. Would probably be worth doing a poll at some point to see how well it's worked for people.

Instead of a menu option, an override could be implemented by checking the joystick state when the 2600's powered up:

    FIRE for NTSC
    UP for PAL60
    DOWN for SECAM60

To do that I'd change SplashVerticalBlank() to this:

    void SplashVerticalBlank()
    {
        ...
        else if (frame == 2)
        {
            T1TCR = 0;  // turn off timer after 1 frame

            // first check if the user overrode the detection
            // if they did not then use the value in T1TC to see
            // how long it took the 2600 to draw 262 scanlines
            if (JOY0_FIRE)
                tv_type = NTSC;
            else if (JOY0_UP)
                tv_type = PAL;
            else if (JOY0_DOWN)
                tv_type = SECAM;
            else if (T1TC < (0x11e8ff + 0x11d329)/2)
                tv_type = NTSC;
            else if (T1TC > (0x11fd2b + 0x11e8ff)/2)
                tv_type = PAL;
            else
                tv_type = SECAM;
        }
        ...
    }
25. Part 6 - Console Detection

Thanks Darrell, this NTSC/PAL/SECAM detection is really useful! In your games that implement this detection technique (e.g. Draconian) I see that you still offer the option to change the TV type on the menu screen (with the detected TV type pre-selected). Is this because the detection isn't 100%, or just to give people the option to change it if they want to?
26. Part 6 - Console Detection

Correct, the virtual sprites are not tied to a specific player. The blog entry "It's full of stars!" goes into some detail on how it's done.

If you play Draconian in Stella with Fixed Debug Colors mode turned on you can see it in action. In this sequence of frames I drew a white box in an area that contains a mine, the player's ship, and an asteroid. Over 3 frames those objects are drawn like this:

    frame 1: player0 - asteroid       player1 - player's ship
    frame 2: player0 - mine           player1 - asteroid
    frame 3: player0 - player's ship  player1 - mine
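One simple way to get a rotation like the three-frame sequence listed above is a round-robin over the objects, handing the next two in line to player0 and player1 each frame. This is only a sketch that reproduces that sequence, not necessarily how Draconian does it; object_for_player() and the object numbering (0 = asteroid, 1 = player's ship, 2 = mine) are made up for the example.

```c
#define OBJECT_COUNT 3  /* asteroid, player's ship, mine */
#define PLAYER_COUNT 2  /* hardware sprites player0 and player1 */

/* Index of the object drawn by hardware player 'player' (0 or 1)
   on frame number 'frame' (counting from 0). Each frame consumes the
   next PLAYER_COUNT objects, wrapping around the object list. */
static unsigned int object_for_player(unsigned int frame, unsigned int player)
{
    return (frame * PLAYER_COUNT + player) % OBJECT_COUNT;
}
```

Because 2 hardware players are shared among 3 objects, each object is drawn on 2 out of every 3 frames and switches between player0 and player1 as it goes.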