
ZackAttack

Members
  • Content Count

    785
  • Joined

  • Last visited

Everything posted by ZackAttack

  1. Pixel clock that is 2 CPU cycles wide means the clock for the 6507 is the pixel clock divided by 2 instead of being divided by 3 like it is now, right? In other words, the 6507 is clocked 50% faster. You're also missing a superscalar 6502 with L1 cache. I'm sure some out-of-order execution would go nicely with timing-critical memory-mapped I/O.
  2. An easy way to set this up would be to start with a simple program which displays a sprite at any coordinate you specify. So the program is drawing one sprite at (30,40), for example. Then modify it so that it changes the position of the sprite every frame. On the even frames it remains located at (30,40) and on the odd frames it moves to (120,70). Now there appear to be two sprites located at (30,40) and (120,70), but they flicker because they are only drawn half the time. Then you need to track the location of each object separately and choose which location, graphics, colors, etc. to display for each frame.
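The frame-alternation idea above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not Atari code; the `OBJECTS` list and the `sprite_position` helper are hypothetical names, and the coordinates are the ones used in the post.

```python
# Sketch of two-object flicker on one hardware sprite.
# OBJECTS and sprite_position are illustrative names, not from any real project.
OBJECTS = [(30, 40), (120, 70)]  # positions of the two logical objects


def sprite_position(frame: int) -> tuple:
    """Return where the single hardware sprite is drawn on this frame."""
    # Even frames show object 0, odd frames show object 1.
    return OBJECTS[frame % len(OBJECTS)]


if __name__ == "__main__":
    for frame in range(4):
        print(frame, sprite_position(frame))
```

Each object is only drawn every other frame, which is where the flicker comes from. Per-object graphics and colors would be selected with the same index.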
  3. Halt 6507, take over write signal, update TIA COLUBK register every pixel, game over.
  4. That video looks so much better than I was expecting. For some reason my phone seems to do a better job at holding the 60fps. I'm pretty sure you could just attach a 10mb file to your post here. Isn't the limit 50mb? Did you see my comment about changing the background color per row? Perhaps it would be useful for scenes with 2 bright dominant colors? Now I'm wondering if we can build a youtube app for the Atari or maybe include some cut scenes in future games.
  5. Probably. SpiceWare was very careful to make the Collect code serve as an example of the right way to do things. Make sure you keep in mind that the graphics are stored backwards vertically and it's 0 indexed. This is why P0_HEIGHT is added and 1 is subtracted. The -1 accounts for the fact that the Label is already pointing to the first byte of data and if you add P0_HEIGHT to that you end up pointing at the byte just after the last byte of data. Finally the Y position is subtracted because the y register is used to index into the array, but that y register is also used to track the full 192 lines of screen height. So subtracting Y position allows this to work when the player is moved vertically without having to waste another register or variable to keep track of the player graphics/color index.
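The pointer arithmetic described above can be checked with a small Python sketch. It assumes the pointer is set to label + height - 1 - Y position and that the indexed load reads from pointer + Y; the label address, height and position below are made-up values for illustration only.

```python
# Sketch of the graphics pointer arithmetic; all concrete numbers are hypothetical.

def player_pointer(label: int, height: int, y_pos: int) -> int:
    """Pointer value: last byte of the graphic, minus the player's Y position."""
    return (label + height - 1 - y_pos) & 0xFFFF


def indexed_load_address(pointer: int, y: int) -> int:
    """Effective address of an `lda (ptr),y` style load."""
    return (pointer + y) & 0xFFFF


LABEL = 0xF200   # hypothetical address of the first graphics byte
HEIGHT = 10      # hypothetical sprite height in lines
Y_POS = 50       # hypothetical vertical position

ptr = player_pointer(LABEL, HEIGHT, Y_POS)
# When the scanline counter equals Y_POS, the load hits the last byte of data.
last_byte = indexed_load_address(ptr, Y_POS)
# HEIGHT - 1 lines later the counter has dropped and the load hits the first byte.
first_byte = indexed_load_address(ptr, Y_POS - (HEIGHT - 1))
print(hex(last_byte), hex(first_byte))
```

Without the -1 the first load would land one byte past the end of the data, which is the off-by-one the post describes.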
  6. I was able to debug this using the following steps. Hope this helps.

     Search for GRP0 and then trace it back to the lda (Player0Ptr),y instruction.

     Look at where Player0Ptr is being set:

         lda ShapePtrLow
         sta Player0Ptr
         lda ShapePtrHi
         sta Player0Ptr+1

     Look at what sets ShapePtr:

         ShapePtrLow: .byte <(Guerrilla0 - PF_HEIGHT + 80)
         ShapePtrHi:  .byte >(Guerrilla0 - PF_HEIGHT + 80)

     Build the project and take note of the following label addresses:

         Guerrilla0 f0d3
         Player0Ptr 86
         Player0Clr 8f

     Run the game and set a breakpoint on the indexed load:

         break DoDrawGrp0

     Look at the values of each pointer and the Y register to determine where the load is pointing and compare it with where it should be pointing:

         Player0Ptr 86 = f105
         Player0Clr 8f = 0000 <- not initialized

     Realize the color pointer was never initialized. Review the code to see why. Found a copy-paste error:

         lda ShapePtrLow
         sta Player0Ptr
         lda ShapePtrHi
         sta Player0Ptr+1
         lda ColourPtrLow
         sta Player0Ptr     <- storing to the wrong variable, now both pointers are wrong
         lda ColourPtrHi
         sta Player0Ptr+1   <- storing to the wrong variable, now both pointers are wrong

     Fix the copy-paste error, build, set the breakpoint again, and run until the breakpoint is hit. Pointers look better:

         Player0Ptr 86 = f0ec
         Player0Clr 8f = f105
         Y = 8A

     Compute the indexed load target address:

         (Player0Ptr),y = f0ec + 8a = f176

     This is much higher than it should be. Guerrilla0=f0d3 and we should be loading P0_HEIGHT=25 past that, or f0ec. Hey! f0ec is the ptr value before we added the Y offset.

     Change the pointer to account for the Y offset. Y starts at PF_HEIGHT and decreases as the screen is drawn. So we need to subtract PF_HEIGHT from the Guerrilla0 label and then add back the player's vertical offset. In this case it's simply hardcoded to $80, but once you start moving the player you'll need to set the pointer correctly by recalculating with the new offset.

         ShapePtrLow:  .byte <(Guerrilla0 - PF_HEIGHT + 80)
         ShapePtrHi:   .byte >(Guerrilla0 - PF_HEIGHT + 80)
         ColourPtrLow: .byte <(GuerrillaClr0 - PF_HEIGHT + 80)
         ColourPtrHi:  .byte >(GuerrillaClr0 - PF_HEIGHT + 80)
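The hex arithmetic from the debugging session can be reproduced with a short Python sketch. The addresses and the Y value are the ones quoted in the post; the `corrected_ptr` line is only an illustration of "lower the pointer by the Y value at the draw line", not the exact expression used in the final tables.

```python
# Re-check the debugging arithmetic with the values quoted in the post.
GUERRILLA0 = 0xF0D3   # label address from the build
P0_HEIGHT = 25        # decimal, as stated in the post
Y = 0x8A              # Y register at the breakpoint

intended = GUERRILLA0 + P0_HEIGHT      # where the load should land
buggy_ptr = 0xF0EC                     # pointer value seen in the debugger
actual = (buggy_ptr + Y) & 0xFFFF      # where (Player0Ptr),y really lands

# Illustration only: subtracting Y from the pointer makes the load land correctly.
corrected_ptr = (intended - Y) & 0xFFFF
corrected_target = (corrected_ptr + Y) & 0xFFFF

print(hex(intended), hex(actual), hex(corrected_ptr), hex(corrected_target))
```

The difference between `actual` and `intended` is exactly the Y register value, which is the clue that the pointer was missing the Y offset.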
  7. Actually that's a much better idea. I was originally thinking it could be problematic since multiple cycles would hit the same address, but that shouldn't matter because it will always toggle A0 when transitioning to the next instruction. The reset switch on the Atari isn't related to the physical reset pin of the 6507. You have to poll the value of the reset switch just as you would with the joystick button. The only way to put the CPU into a reset state is to power cycle it. Well, I suppose you could modify the motherboard too... I have no idea if either of those will actually work. I've worked with much more powerful microcontrollers only to find that the I/O latency was too high to directly interface with the 6507 buses. PICs are cool too, so don't let me talk you out of them too easily. Either way the first thing I would do is to put a scope or logic analyzer on it and experimentally verify you can capture the A12 and A0 changes and update an 8 bit bus with a new value quick enough. The good news here is that you'll already know what value to output on the data bus before the transition occurs. So you can write it out as soon as you detect it. Always happy to help where I can. If you want perfect color at the expense of graphics you could just vary #3 slightly like this. Due to the 3 pixel cpu cycles the staggered row will lose one more pixel. I marked the potentially incorrect pixels with *. I'm assuming you can't correct with Ball at all because there won't be enough cycles available. color colup0 colup1 A A X Y graphic grp0del grp1del grp0 PCH PCL Y 00000000 11111111 00000000 111*1111 00000000 ***11111 sta---cp0lda---JSR------gp1gp0---sta---cp1stx---cp0st?---gp0sty---cp1ldy---sty---gp1 00000000 11111111 00000000 111*1111 00000000 ****1111 sta---cp0lda---JSR------gp1gp0---sta---cp1stx---cp0st?---gp0sty---cp1ldy---sty---gp1 Personally I would recommend using this 5 sprite kernel instead. Then you can achieve perfect colors and graphics for all 5 sprites. 
This also leaves more cycles for setting audio and background colors. Since you need to use Ball to mask the PCH bit in the second copy of GRP1 anyway, you can use PF to mask the background color on the sides. I bet having the background color set each scanline would improve picture quality a lot more than going from 10 sprites wide to 12 wide. Of course this can be done with a static mirrored PF and a statically positioned BL. color colup0 colup1 A A Y graphic grp0del grp1del grp0 PCH PCL 00000000 11111111 00000000 111B1111 00000000 sta---cp0lda---JSR------gp1gp0---sta---cp1sta---gp1sty---cp0 00000000 11111111 00000000 111B1111 00000000 sta---cp0lda---JSR------gp1gp0---sta---cp1sta---gp1sty---cp0 Btw, it would be easier to come up with kernel ideas if we knew what it's going to look like. Would you provide a few mockups? Are you planning on "reshooting" the movies with tiled graphics or writing some automatic image conversion algorithm to downscale to Atari? Would you mind if I port this to the Harmony eventually? I think it would allow a lot more people to see the results on real Atari hardware first hand.
  8. A12 is a reliable chip select, but selecting the chip and address decoding aren't the same thing. Since the 2nd kernel uses JSR for colors and bit 0 of the color is ignored anyway, it might be possible to ensure A0 always toggles when a new data value is requested. The biggest challenge I see with this is handling the power-on reset. The state of the bus will be undefined for some time before the reset vector is read. I think this could be handled by putting $15 on the data bus on startup and leaving it there. That would give a reset vector of $1515 and then a stream of ORA zpg,X instructions. This would produce a repeating pattern on the A12 and A0 lines for many iterations before the PC exceeds the ROM address space. Simply wait for n iterations of the pattern before injecting the first real instruction. I made a mistake in the 3rd diagram. The last sprite wouldn't be missing any pixels. Due to the JSR hack the fourth pixel of the fourth sprite would always be on. That could be masked away with the ball by changing PF priority. I still don't like that option though, because the last color is the bitwise AND of the previous two sprites. For the second kernel I think the last two writes are backwards. GRP0 should be written before COLUP1. So using M1 has the same problem as using GRP1. COLUP1 isn't correct until after the first 3 pixels. Ball doesn't have this problem since it has its own color. Maybe put the ball on the leftmost pixel and set it to disabled, 1 wide, 2 wide or 4 wide depending on what's closest to the desired graphic. For a movie that would probably be acceptable. I wonder if this could be built with an Arduino. They're cheap and SD card shields are readily available for them. If so I could duplicate your setup if you needed help with debugging at some point. I already have all the signals brought out to a breadboard for my Harmony project. Would be trivial to fork off A12, A0, and D0-D7 to the Arduino.
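To make the $15 power-on idea concrete, here is a rough Python model of the address bus while every read returns $15. It assumes a 4K cart mapped at $1000-$1FFF, the usual four bus cycles of ORA zeropage,X (opcode fetch, operand fetch, dummy zero-page read, indexed zero-page read), and X = 0. X is actually undefined at power-on, so the A0 value on the last cycle is not guaranteed. The function names are hypothetical.

```python
# Rough model of the bus with $15 held on the data lines (assumptions in the lead-in).
ORA_ZPX = 0x15          # opcode of ORA zeropage,X: 2 bytes, 4 cycles
RESET_PC = 0x1515       # both reset vector bytes read as $15
ROM_END = 0x1FFF        # assumed top of a 4K cart


def bus_addresses(pc: int, x: int = 0) -> list:
    """Addresses driven during one ORA zp,X whose operand byte is also $15."""
    operand = ORA_ZPX
    return [pc, pc + 1, operand, (operand + x) & 0xFF]


def a12_a0(addresses: list) -> list:
    """(A12, A0) pair for each bus cycle."""
    return [((a >> 12) & 1, a & 1) for a in addresses]


def instructions_inside_rom(start: int = RESET_PC) -> int:
    """Count instructions whose opcode and operand both lie inside the ROM window."""
    count, pc = 0, start
    while pc + 1 <= ROM_END:
        count += 1
        pc += 2
    return count


pattern = a12_a0(bus_addresses(RESET_PC))
print(pattern, instructions_inside_rom())
```

Under these assumptions A12 repeats as 1, 1, 0, 0 for every instruction, which is the pattern an MCU could count before injecting real code.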
  9. Think about a JMP instruction. It is going to be at least 4 consecutive reads to ROM. A12 may or may not change depending on the Atari. So if you're not watching the full address bus, how will the MCU know when to replace the 1st byte with the second and so forth? So now you're probably thinking just watch A12 and A0, right? Sounds good until you consider what happens in the JMP example after the 3rd byte is read. The 4th byte will be located at wherever the jump target is, and that address could have any number of bits in common with the location of the 3rd byte. Thus you have to watch the entire address bus. My other post is a bit misleading because the system I used is a pos jr which toggles A12 every cycle. The other two systems that I have don't. Good idea. I made some changes to test this and it appears that it does. I made the kernel, so I think you should make the demo. Or send me a sample CDF project which I can use as a template and I'll adapt it myself if I have time. Also, when unrolling the loop to produce 3 lines it occurred to me that it might be possible to up the palette to 16 colors by exploiting the fact that bit 0 is ignored. fivefun.asm fivefun.bin
  10. Interesting. The brute force approach would certainly make this feasible. You could probably get 1-2hrs per 8gb sd card. However, I think you'll find you need the full address bus because there are no control bus signals brought to the cart. If your goal is to play with hardware this sounds like a good project. If your goal is to play movies on the Atari then I'd stick with the existing harmony hardware. Harmony gives you a large pool of potential beta testers and it saves all the time required to design and build a new board. I'm actually working on a new open source driver for it which should allow the SD card to be used during gameplay. There's also a good chance it will be able to support single bus stuffing reliably.
  11. I made some improvements to the 5 sprite kernel and was able to save enough cycles to make it usable with DPC+. This comes with two limitations. The two rightmost sprites must share an 8 color palette, and that palette must consist of colors in the range CCC1 PPPX, where CCC are the 3 bits that vary in a given palette and PPP chooses one of the 8 available palettes. The second limitation is that there are no cycles left to try to move everything over. You'd have to alternate between frames instead of between lines. Have to do this: x x x x x x x x x x x x x x x x x x x x x x x x x Instead of this: x x x x x x x x x x x x x x x x x x x x x x x x x I've put comments where the fast fetchers would need to be used in the DPC+ implementation. All the loads use the accumulator and there are only 11 different streams. Pretty sure that's within the DPC+ specs. fivefun.asm
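As a rough illustration of the CCC1 PPPX constraint, the Python sketch below enumerates the 8 colors of one palette. It assumes the pattern is written most significant bit first (so the fixed 1 is bit 4) and sets the ignored X bit to 0; `palette_colors` is a hypothetical helper name.

```python
# Enumerate colors of the form CCC1 PPPX (MSB first); an interpretation, not a spec.

def palette_colors(ppp: int) -> list:
    """Return the 8 color bytes available in palette `ppp` (0-7)."""
    if not 0 <= ppp <= 7:
        raise ValueError("ppp must be in 0..7")
    # CCC occupies bits 7-5, bit 4 is fixed to 1, PPP occupies bits 3-1, bit 0 is ignored.
    return [(ccc << 5) | 0x10 | (ppp << 1) for ccc in range(8)]


if __name__ == "__main__":
    for ppp in range(8):
        print(ppp, [f"${c:02X}" for c in palette_colors(ppp)])
```

Every color in the list has bit 4 set, so only half of the 128 distinct TIA color values are reachable this way.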
  12. 5 sprites looks pretty good. The color of D and E depends on the 6507 PC. Harmony would need to dynamically generate the JSR target according to the colors of the next line and then virtually move the routine to the correct address space. It would also need to inject the graphics and color data just as DPC+ currently does with fast fetch. Of course the address would need to be in ROM, which is why the palette is cut in half. The 13th address bit, A12, must always be set. fivefun.bin fivefun.asm
  13. It's possible with some serious limitations. Here's a demo of how it can be done with a standard 4k cart. I thought of a second way to do it which is better, except the 6th sprite is only 6 pixels wide and only the top 64 colors of the palette can be used ($80-$FE). If we limit to 5 sprites the results are much better. The only limitation is the restricted 64 color palette. There are a few more cycles free too, which should make it possible to move 8 pixels each scanline. In all three cases the effect can be accomplished with a standard 4k cart, but you'd really need to couple it with the Harmony hardware to make it usable in anything more than a demo. What's cool is that it doesn't require bus stuffing, so we could still make a decent RPG engine using the 5 sprites with interleave. I implemented the 6 sprite algorithm enough to render 6 unique graphics and colors. 5 Sprites: color colup0 colup1 lda # PCH PCL graphic grp0del grp1del grp0 A X Y 00000000 11111111 00000000 11111111 00000000 ________ sta---gp1lda---sta---cp0stx---gp0JSR------cp1cp0---sty---gp1lda---sta---gp0 6 Sprites with missing pixels (Ball is used to fill in one of the 3 missing) color colup0 colup1 lda # PCH PCL lda # graphic grp0del grp1del grp0 A X Y 00000000 11111111 00000000 11111111 00000000 __B11111 sta---gp1lda---sta---cp0stx---gp0JSR------cp1cp0---sty---gp1lda---sta---gp0sta---cp1 6 Sprites with challenging graphics on 4th and 5th color colup0 colup1 A A X A & X graphic grp0del grp1del grp0 PCH PCL Y 00000000 11111111 00000000 11111111 00000000 11111111 sta---cp0lda---JSR------gp1gp0---sta---cp1sty---gp1stx---cp0st?---gp0sax---cp1 Edit: Fixed minor errors in timing diagrams. sixfun.asm sixfun.bin
  14. They all consume the same amount of cycles. http://www.pagetable.com/?p=410 Also, I don't think it's possible to trigger NMI or IRQ from the cartridge since only the address and data bus pins are brought to the cartridge connector.
  15. Yeah, if it were possible to do it with BRK I can think of one cool use for it though. Setting the background color to produce two 3 pixel wide colors every 18 pixels. Stagger this across three scan lines just like I do with the PF for my 40x64 background bitmap kernel. Since the pixels are now 3 wide instead of 4 it becomes a 53x64 bitmap.

      XX____XX____XX____
      __XX____XX____XX__
      ____XX____XX____XX

      Could make for a cool FPS demo assuming it doesn't fry the VCS.
  16. Regular bus stuffing involves a 3 cycle store to zeropage instruction. On the third cycle the value placed on the data bus is overridden with the desired value. This allows a TIA register to be written to every 3 cycles as many times as you want without having to spend cycles on loading the registers with new values. Quad bus stuffing involves a 5 cycle instruction that does a read-modify-write. Due to the implementation of the 6507 there is a dummy write on cycle 4 prior to the actual write on cycle 5. In this case both the address and data buses will be overridden on each of the write cycles, giving a total of 4 stuff operations. This allows 2 TIA registers to be written to every 5 cycles. Assuming a new value must be provided for each write and a 76 cycle scan line, this allows greater TIA bandwidth and better graphics and sound.

      No stuffing: 15 TIA writes
      Bus stuffing: 25 TIA writes
      Quad stuffing: 30 TIA writes

      In theory there could also be hex bus stuffing, which would use the BRK instruction to update 3 TIA registers in 7 cycles, but that would be even more difficult to implement than quad and doesn't provide any benefit. As of right now the regular single bus stuffing appears to be the only one that is feasible to implement with Harmony/Melody hardware.
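The write counts above follow from integer division of the 76 cycle scanline. The Python sketch below reproduces them. The 5 cycle figure for the unstuffed case is an assumption (a 2 cycle immediate load plus a 3 cycle zero-page store), and `writes_per_scanline` is a hypothetical helper name.

```python
# Reproduce the TIA writes-per-scanline figures from the post.
SCANLINE_CYCLES = 76


def writes_per_scanline(cycles_per_group: int, writes_per_group: int) -> int:
    """Whole instruction groups that fit in a scanline, times writes per group."""
    return (SCANLINE_CYCLES // cycles_per_group) * writes_per_group


no_stuffing = writes_per_scanline(5, 1)    # assumed: lda #imm (2) + sta zp (3)
bus_stuffing = writes_per_scanline(3, 1)   # sta zp with the data bus overridden
quad_stuffing = writes_per_scanline(5, 2)  # read-modify-write, 2 registers per 5 cycles
hex_stuffing = writes_per_scanline(7, 3)   # BRK, 3 registers per 7 cycles

print(no_stuffing, bus_stuffing, quad_stuffing, hex_stuffing)
```

The hex variant comes out equal to quad stuffing, which matches the remark that it provides no benefit.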
  17. It's also a good way to find malware though. If possible I'd use a dirty system for such treasure hunts.
  18. I like it. It's feasible but is still going to push the system to its limits. Instead of writing PF0 twice each scanline, just set it once in prekernel and write to COLUPF one more time. As a bonus you're going to have the COLUBK value in a register anyway, so just write it again to COLUPF and you save a load there too. Since PF and BK are the same color once you're right of the cat, the mirrored PF0 will be invisible. Now this ruins your multiplexing of the enable bits for ball and M0, but you can use the LSB of some color values instead since those are ignored. Finally, you could easily increase the rainbow size now since the PF is static and mirrored. Simply set PF1 in prekernel too and move the cat right. This might be useful for giving some extra time to set up for the first copy of P1.
  19. Are you thinking multiple food items at different heights or do you mean having multiple food items in the same row by using duplicate or triplicate P0?
  20. I think you should reconsider. It would be much more playable without a giant flickering main character. Maybe something along the lines of this would work? Rainbow, body, background, eyes, and mouth are all drawn with PF/BK. Colors are set multiple times each scanline, but the PF graphics can be set ahead of time and remain static. Timing not to scale: [-Face-] [----HSYNC--][--------------Rainbow][---Body--] COLUPF COLUBK COLUP1 GRP1 COLUBK COLUBK GRP0
  21. SMB uses Mario's vertical velocity to decide who wins in a collision with a goomba. If Mario is traveling in a downward direction when he makes contact it will kill the goomba regardless of where the collision occurs. I point this out because it has the potential to greatly simplify things and it's fun to watch. https://youtu.be/xOCurBYI_gY?t=650
  22. One other idea to consider. Only use an asymmetric playfield for the text. This should work on all systems and fix the orientation issue. The drawback would be that you would only have enough room for 10x10 characters on a page. The good thing is that it would fill the whole screen and the letters would be large and hopefully easier to read.
  23. You would need to create a 16bit variable out of two 8bit variables. I'm not sure if bB natively supports this or not. Perhaps someone has already created some sort of math library that allows this to be done. If not you could always do it by embedding a few lines of assembly in your bB source. I think you should look for topics related to 16bit variables and post a new one if you can't find anything.
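As a language-neutral illustration of building a 16-bit value from two 8-bit variables, here is a small Python sketch. It is not bB code; `add16` and `combine` are hypothetical helper names, and the low/high byte split mirrors what an add-with-carry sequence would do in assembly.

```python
# Sketch: a 16-bit counter kept in two 8-bit variables (low byte, high byte).

def add16(lo: int, hi: int, amount: int) -> tuple:
    """Add an 8-bit amount to a 16-bit value stored as (lo, hi)."""
    total = lo + amount
    carry = total >> 8                       # 1 if the low byte overflowed
    return total & 0xFF, (hi + carry) & 0xFF


def combine(lo: int, hi: int) -> int:
    """Recombine the two bytes into a single 16-bit number."""
    return (hi << 8) | lo


lo, hi = add16(100, 0, 160)
print(lo, hi, combine(lo, hi))
```

The carry out of the low byte is the piece that gets lost when only a single 8-bit variable is used.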
  24. Sounds like you have overflowed the 8-bit variable, i.e. 100 + 160 = 260. This is bigger than fits in 8 bits, so you end up with 256 less than that. It's easier to visualize in binary:

       0110 0100 = 100 decimal
     + 1010 0000 = 160 decimal
     -----------
     1 0000 0100 = 260 decimal
     ^ This 9th bit is discarded because you've done nothing with the carry flag.

     Effectively giving you the 8-bit number 0000 0100 in binary, or 4 decimal, or 260-256 decimal. I think an easy fix would be to divide rand by a power of 2 in order to get a smaller range.
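The wrap-around can be checked with a short Python sketch. The `add8` helper is a hypothetical name, and dividing by 2 is just one example of "a power of 2"; the right divisor depends on the range the game actually needs.

```python
# Sketch of 8-bit wrap-around and of shrinking a random range to avoid it.

def add8(a: int, b: int) -> tuple:
    """Return (8-bit result, carry) for a + b, like an 8-bit add would produce."""
    total = a + b
    return total & 0xFF, total >> 8


result, carry = add8(100, 160)       # 260 does not fit in 8 bits
max_scaled_rand = 255 // 2           # largest value after dividing rand by 2
safe_result, safe_carry = add8(100, max_scaled_rand)

print(result, carry, max_scaled_rand, safe_result, safe_carry)
```

With the random value halved, the worst case here is 100 + 127 = 227, which still fits in 8 bits.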
  25. That's a cool project idea. It does look like it should fit the 2600 well. I reviewed your source and have a few comments that may help you.

      1. If you aren't already doing so, use source control. I recommend git. You can use a service like GitHub to back up and share in the cloud. This also makes it easier for others to contribute to your project in the future.
      2. You could trade some CPU cycles for RAM by using a single pointer for looking up the digit graphics. This would make the code more complex, so I'd only go this route once necessary. Still, it could potentially get you 10 more bytes of RAM.
      3. There is a STA.W instruction which allows a 4 cycle write to zeropage addresses. You can do that instead of "hex 8D 04 00".