
ZackAttack

Members
  • Content Count

    785

Everything posted by ZackAttack

  1. What you have working is already impressive. One other idea would be to use non-numeric glyphs for the level indicator. Then it doesn't have to look the same as the score, and the glyphs could be customized to work well with what's available. Perhaps it would add a little more charm to the game? Otherwise, unroll the loop, use a more powerful bankswitching scheme such as 3E, use Harmony hardware with ARM coprocessing, or just leave well enough alone.
  2. What I'm really trying to point out is that you have to use Y as the index to load the new value for Y into a non-indexed location before loading it into Y. In other words, you can't do ldy DigitY,y. The tay instruction is no good either, because then you've lost your value for A and you can't load it after you stomp on the index value in Y. Using the stack is even worse, because now you've got to deal with the values for X and SP. Instead you must do this, which costs 6 more cycles:

         ; TODO Restore Y with index
         lda DigitY,y
         sta DigitYTemp
         ; TODO Pre load A, X, SP, GRP0, GRP1 etc.
         ldy DigitYTemp

     Maybe someone has a clever solution for this, but I haven't thought of a good one yet.

       • 8 GRP updates using ld? zp,y and sta zp: (4+3)*8 = 56 cycles
       • 2 NUSIZ1 updates using lda # and sta zp: (2+3)*2 = 10 cycles
       • restoring the Y index: at least ldy zp = 3 cycles
       • branch taken for the loop = 3 cycles

     Provided the above calculations are correct, 72 cycles are already used, leaving just 4 more to deal with the ldy zp,y problem.
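The cycle budget in that post can be re-added mechanically. The following is only a sketch of the arithmetic: the per-instruction costs are the ones quoted in the post, while the function and constant names are made up for illustration.

```python
# Re-adding the per-scanline budget quoted in the post.
# All names here are illustrative; only the cycle figures come from the post.
SCANLINE_CYCLES = 76  # CPU cycles available on one scanline


def loop_kernel_cycles():
    grp_updates = (4 + 3) * 8    # 8x (indexed load, 4 cycles + sta zp, 3 cycles)
    nusiz_updates = (2 + 3) * 2  # 2x (lda #, 2 cycles + sta zp, 3 cycles)
    restore_y = 3                # ldy zp
    loop_branch = 3              # branch taken
    return grp_updates + nusiz_updates + restore_y + loop_branch


used = loop_kernel_cycles()
spare = SCANLINE_CYCLES - used
print(f"used {used} cycles, {spare} left")
```

The remaining 4 cycles are less than the 6 extra cycles the lda/sta/ldy workaround is said to cost, which is why the post calls it a problem.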
  3. Unrolling the loop would also free up some cycles during the loads, because you'd no longer need an indexed load. Just hardcode the ZP address for each load. Honestly, I don't see how this would work without unrolling. Even if you don't use Y for loop maintenance, don't you still need it for the indexed loads? That means extra overhead to deal with Y being used for two different things on each scan line. Of course, there's always bus stuffing. PM me if you ever want to switch over to the dark side.
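As a rough illustration of what unrolling could buy, here is the same kind of budget with non-indexed zero-page loads (3 cycles each) and no loop branch or Y restore. It assumes the kernel still performs eight GRP updates and two NUSIZ1 updates per line, which is taken from the cycle count in the neighbouring post rather than stated here; the names are hypothetical.

```python
# Hypothetical budget for an unrolled kernel line.
# Assumption: same 8 GRP + 2 NUSIZ1 updates per scanline as the looped version.
SCANLINE_CYCLES = 76


def unrolled_kernel_cycles():
    grp_updates = (3 + 3) * 8    # lda zp (3) + sta zp (3), address hardcoded
    nusiz_updates = (2 + 3) * 2  # lda # (2) + sta zp (3)
    return grp_updates + nusiz_updates  # no ldy restore, no loop branch


unrolled_used = unrolled_kernel_cycles()
unrolled_spare = SCANLINE_CYCLES - unrolled_used
print(f"used {unrolled_used} cycles, {unrolled_spare} left")
```

The trade-off is ROM space, since every scanline of the display gets its own copy of the code.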
  4. Trading ROM space for performance has its limits, but with other cart tech, like using the ARM processor in the Harmony Encore, an FPS on the 2600 is completely feasible. As is SMB 3.
  5. A few questions about this. Will the playfield be drawn between bands? Is the missile going to be limited to just a few pixels per band? How tall did you plan on making the PF pixels? What's the largest/tallest PF pixel you'd accept? Will there be more than one PF image for this mode of the game? Can the PF be limited to the middle 32 PF pixels so that updating PF0 can be avoided? What are you trying to draw with the PF?
  6. Plenty of time, since you only execute one code path or the other. Speaking of time: since you're using bands, there's no need for flipdraw or any other overhead on P1. Just pad the graphics and color data with a few bytes if you want to allow a small amount of vertical movement. Then offset the pointer to set the vertical position within each band. Another nice thing about the bands is that you can use mask draw for P0 with a mask that's 3 * band height. So a 32-line band would only need 96 bytes for the mask. Then you can do P0 with only 5 cycles of overhead. Just as important, this gets rid of all the branches and makes it a lot easier to make the kernel exactly 76 cycles.

         ; Updates both sprites in 37 cycles
         lda (ptrgrp0),y
         and (ptrmask),y
         sta GRP0
         lda (ptrcol0),y
         sta COLUP0
         lda (ptrgrp1),y
         sta GRP1
         lda (ptrcol1),y
         sta COLUP1

     One more thing to consider is how P1 is being positioned between bands. If you want to draw anything between the bands, you may need to have different kernel fragments that all do the same thing, but with a RESP1 strobe at different times. Then beforehand you figure out which fragment to jump to based on where P1 is to be positioned. Then have a few lines to use HMOVE for fine positioning. If you have one line for HMOVE you'd need 160/15 = 11 fragments (rounded up), but with 3 lines of HMOVE you'd only need 160/45 = 4 different fragments. You'd be left with at least 4 lines between each P1, but you gain the potential for some playfield or other enhancements.
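The numbers in that post (37 cycles, 96 mask bytes, 11 versus 4 fragments) can be checked with a few lines. This is a sketch only: the 15-pixel HMOVE reach per line is the figure the post uses, the cycle costs assume no page-crossing penalty on the indirect loads, and every name below is invented for the example.

```python
import math

SCREEN_PIXELS = 160
HMOVE_PIXELS_PER_LINE = 15  # figure used in the post

# lda (zp),y / and (zp),y cost 5 cycles each (no page crossing assumed); sta zp costs 3.
SPRITE_UPDATE_COSTS = [5, 5, 3,  # lda (ptrgrp0),y / and (ptrmask),y / sta GRP0
                       5, 3,     # lda (ptrcol0),y / sta COLUP0
                       5, 3,     # lda (ptrgrp1),y / sta GRP1
                       5, 3]     # lda (ptrcol1),y / sta COLUP1


def mask_bytes(band_height):
    """Size of the P0 mask table described in the post: 3 * band height."""
    return 3 * band_height


def fragments_needed(hmove_lines):
    """Number of RESP1 kernel fragments needed to cover the whole screen width."""
    reach = HMOVE_PIXELS_PER_LINE * hmove_lines
    return math.ceil(SCREEN_PIXELS / reach)


print(sum(SPRITE_UPDATE_COSTS), mask_bytes(32), fragments_needed(1), fragments_needed(3))
```

Each extra HMOVE line widens the reach of a single fragment, which is why three lines cut the fragment count from 11 to 4.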
  7. I'm not saying you should get rid of the branch. I'm saying that the branch prevents the P0 graphics from being preloaded for the next line, so instead of 0 you get a garbage value. Maybe it's easier to see in pseudocode:

         if(drawP0) then
           y = graphicForNextLine
         end if
         GRP0 = y   <- y is only set if the condition evaluates to true

     Instead, what you want is this:

         if(drawP0) then
           y = graphicForNextLine
         else
           y = 0
         end if
         GRP0 = y   <- now y is 0 if it's not time to draw P0

     Diff your previously posted asm file with this one to see the small change which makes all the difference. guerrilla-fix-y.asm
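The same pseudocode, written as two small Python functions so the difference can be run. This is just an illustration; `stale_y` stands in for whatever happened to be left in the Y register, and all names are hypothetical.

```python
def grp0_value_buggy(draw_p0, graphic_for_next_line, stale_y):
    """Y is only overwritten when P0 is drawn, so a stale value can reach GRP0."""
    y = stale_y
    if draw_p0:
        y = graphic_for_next_line
    return y  # value that would be stored to GRP0


def grp0_value_fixed(draw_p0, graphic_for_next_line, stale_y):
    """Y is always assigned, so GRP0 gets 0 whenever P0 is not being drawn."""
    if draw_p0:
        y = graphic_for_next_line
    else:
        y = 0
    return y  # value that would be stored to GRP0


# Outside the sprite, the buggy version leaks the stale Y value.
print(hex(grp0_value_buggy(False, 0x3C, 0x17)), hex(grp0_value_fixed(False, 0x3C, 0x17)))
```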
  8. I live my life 76 cycles at a time

  9. Some options include asymmetric playfield, 48-pixel, 96-pixel, and 128-pixel kernels. You can do just about anything on the 2600, but it almost always comes at the cost of something else. If you provide more details about what you want to accomplish, we can probably recommend a group of techniques and/or display kernels which would give the best results.
  10. 0 can be replaced with $ff and color change? This is why Atari programming is great. Spending so much effort on 2 cycles would surely get you fired in any modern programming job.
  11. Could you just use 0 for the sentinel value and omit the cmp #END_FOOD? Seems like that would free up a couple cycles.
  12. You have a branch that is skipping the code which loads the value for GRP0 into Y. This is why Y never contains the graphics data. The use of LDA (PTR),Y appears to be fine.

          lda #P0_HEIGHT        ; 2 47
          dcp Player0Offset     ; 5 52
          bcc .p0FlipDraw       ; 2 54 (3 54) <- Branch is taken, leaving Y with offset value
          ldy Player0Offset     ; 3 57
          lda (Player0Clr),y    ; 5 62
          sta COLUP0            ; 3 65
          lda (Player0Ptr),y    ; 5 70
          tay                   ; 2 72
      .p0FlipDraw               ;(3 54)
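For readers unfamiliar with the undocumented DCP opcode used in that snippet, here is a small model of the lda / dcp / bcc test. It decrements the memory operand and then compares the accumulator against it, with carry set when A is greater than or equal to the decremented value. The function names and sample values are invented for the example.

```python
def dcp(a, mem):
    """Model of DCP: decrement memory (8-bit wrap), then compare A with the result."""
    mem = (mem - 1) & 0xFF
    carry = a >= mem  # CMP sets carry when A >= memory (unsigned)
    return mem, carry


def branch_skips_load(p0_height, player0_offset):
    """Return the new offset and whether bcc is taken, skipping the graphics load."""
    new_offset, carry = dcp(p0_height, player0_offset)
    return new_offset, not carry  # bcc branches when carry is clear


# With a height of 8: an offset of 20 is outside the sprite, an offset of 5 is inside it.
print(branch_skips_load(8, 20), branch_skips_load(8, 5))
```

Whenever the branch is taken, none of the instructions that put graphics data in Y execute, which matches the symptom described in the post.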
  13. The Stella programmer's guide would be a good place to start if you haven't already read it.
  14. Yes, AUDC0 and AUDC1 are set to 0 during initialization and then AUDV1 and AUDV0 are written to every scan line. Fortunately the addresses of the AUDVx registers are just before the GRPx registers. So the JSR trick can be used twice in a row to update 4 registers before restoring the SP back to GRP1. This demo is only using 4-bit samples though. $00 is always written to AUDV1, but that's only because I originally planned on doing 4-bit and only asked for 4-bit samples.
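A sketch of why two JSRs in a row reach four registers: each JSR pushes two bytes (high byte of the return address first) and the stack pointer moves down by one per push. The TIA write addresses below are the standard ones ($19 AUDV0, $1A AUDV1, $1B GRP0, $1C GRP1). The model only tracks which registers are hit; how the pushed bytes are made to carry useful data is not modelled, and the names are hypothetical.

```python
TIA_WRITE_REGISTERS = {0x19: "AUDV0", 0x1A: "AUDV1", 0x1B: "GRP0", 0x1C: "GRP1"}


def jsr_push_targets(sp):
    """Registers written by the two pushes of one JSR, plus the resulting SP."""
    first = TIA_WRITE_REGISTERS[sp]            # high byte of return address
    second = TIA_WRITE_REGISTERS[(sp - 1) & 0xFF]  # low byte of return address
    return [first, second], (sp - 2) & 0xFF


def two_jsrs_from_grp1():
    """Start with SP pointing at GRP1 and perform two JSRs back to back."""
    sp = 0x1C
    hit = []
    for _ in range(2):
        regs, sp = jsr_push_targets(sp)
        hit.extend(regs)
    return hit, sp


print(two_jsrs_from_grp1())
```

After the second JSR the stack pointer sits below AUDV0, which is why it has to be restored to GRP1 before the next line.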
  15. Great! That system has 7 low failures which are all switched over to high and then it works fine.
  16. Ok, version 4 has improved the detection algorithm based on your feedback. Please try it out.
  17. Thanks for posting the video. That is very helpful. During the stuff-low sweep D6 should be detected as a failure but it's not failing when $00 is stuffed. Looks like this is the same problem that alex_79 has. Had it been properly detected I'm certain the correction routine would have fixed it. Should be a simple change to include all 128 values in the detection routine. I'll try to post a new version soon.
  18. How many low and high failures were there during the first detection phase? And which bits failed? If you look at Hobo's screenshot you can see that D6 had a stuff-high failure. I assume this would have also appeared as a failure on the stuff-low step before it too. If detection is working properly, it indicates your system had 3 bits which failed both high and low. Two of them would have been corrected by varying which register is stored, and the third would still appear as a glitch. Most significant bits are corrected first in order to minimize the magnitude of the glitches. For PF it doesn't matter, but for color, move and other TIA registers it would. If it's a detection issue, I could cycle through all 128 values for each bit a few times to improve the chance of detecting the intermittent failures. It appears that this failure only occurs when the value being stuffed is $fc, and even then it's only some of the time. I was also thinking about including a mechanism to allow the detection to be rerun at any given time, just in case something changes after playing a game for a while. In my notes from the last round of testing, it was the systems that TheHoboInYourRoom and alex_79 tested which were the most problematic. One is already working completely and the other is very close. We'll need to refine the driver some and test across a broader range of systems. Looks like we will be successful soon enough.
  19. Ok, great. This is exactly what we want to see. The failures were properly detected and compensated for. Obviously the detection doesn't need to be visible or take so long but for debugging purposes it's nice to see it in action. The correction isn't applied until after the detection phase. So it's normal to see glitches at that time. The important thing is that the test pattern was always correct.
  20. Based on your feedback I think I found another problem. Should have a version 3 posted soon. Please make sure you grab the latest when you test tonight. Thanks.
  21. Was this with the updated version I posted this morning? DirtyHairy discovered a bug which caused things to get worse after the first cycle on systems with at least one stuff-low failure. Version 2 should fix that at least. Perhaps the stuff-high code has a bug as well. I'll review the code again to be sure. If the code was working properly, that screenshot would indicate that 6 of the 8 bits can't be stuffed high or low. Based on your previous test, that seems to indicate the code is not working correctly.
  22. Would you post a video of the first few iterations? Hopefully it can help me find the bug. This is exciting. We may be close to a workable solution.
  23. Here's another attempt which combines my prototype driver, stuffing high and low, detection of failures, and an idea Fred had to use multiple registers and illegal opcodes to correct up to 2 bits. Hopefully this will work on all the machines that previously ran into issues. This must be run on a Harmony cart. Emulators will not know how to load it.

      Current build: stuff-with-detection-and-correction4.bin
      Previous builds: stuff-with-detection-and-correction3.bin stuff-with-detection-and-correction2.bin stuff-with-detection-and-correction1.bin

      When you first load the ROM it will have some vertical lines on the sides and a test pattern in PF1, and PF2 should be black. A blue bar will sweep across part of the screen. This is the program using collision detection to determine if there are any bits which can't be stuffed low. The blue bar should turn red if a failure is detected. Next, PF2 should be all on. Any bits that were detected as low failures will now switch to being stuffed high. The blue bar sweeps the screen again to detect high failures this time and will turn red as soon as it finds any. Once the detection is complete, the algorithm will find the optimal combination of stuffing low, high, and using different store instructions. It will then attempt to display the test pattern with this optimal combination. If all is well, there should only be the vertical bars on the sides of the screen where PF0 is. The rest of it should look like the picture below. The program will repeat a cycle of low detection, high detection, test pattern indefinitely. However, the correction values are retained, so there shouldn't be any more failures detected after the first iteration if it finds a correct combination to use.

      Edit:
      12/21 version 4 improves failure detection so stuffing high and low are checked with all 128 values for each data bit multiple times. Should now detect intermittent failures much better.
      12/20 version 3 makes P0 positioning more robust, completely restarts detection each cycle, fixes stuff-high driver bug.
      Version 2 fixes bug that corrupted state in subsequent passes.
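The per-bit decision described across these posts (stuff low if possible, otherwise stuff high, otherwise correct up to two bits starting from the most significant, otherwise accept a glitch) can be sketched as follows. This is not the driver's actual code, which is not shown in the thread; the function name, the bit-mask inputs, and the label strings are all assumptions made for illustration.

```python
MAX_CORRECTED_BITS = 2  # the post says the register/opcode trick corrects up to 2 bits


def plan_stuffing(low_fail_mask, high_fail_mask):
    """Pick a strategy for each data bit D7..D0 from the detected failure masks."""
    plan = {}
    corrected = 0
    for bit in range(7, -1, -1):  # most significant bits get corrected first
        low_fails = bool((low_fail_mask >> bit) & 1)
        high_fails = bool((high_fail_mask >> bit) & 1)
        if not low_fails:
            plan[bit] = "stuff low"
        elif not high_fails:
            plan[bit] = "stuff high"
        elif corrected < MAX_CORRECTED_BITS:
            plan[bit] = "correct via store choice"
            corrected += 1
        else:
            plan[bit] = "glitch"
    return plan


# Three bits fail both ways: the top two are corrected, the third still glitches.
print(plan_stuffing(0b11100000, 0b11100000))
```

A system with seven low failures and no high failures would come out with seven bits stuffed high and nothing left to correct.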
  24. It appears that way if you try to create a Linux VM and hardware virtualization is disabled in the BIOS. Once you enable the virtualization setting in the BIOS, it will allow you to create 64-bit machines. Most consumer-class computers ship with this BIOS setting disabled for security and reliability reasons.
  25. I wasn't able to build 016.asm:

          guerrilla.asm (508): error: Value in 'cpy #PF_HEIGHT*2+2' must be <$100.
          --- Unresolved Symbol List
          EnvGfxOffset 0000 ???? (R )
          --- 1 Unresolved Symbol