Mauvila #1 Posted September 12, 2007

Maybe someone can confirm this for me, but it seems that the code for the TIA class in Stella (and JStella) is taking measures for performance that are no longer necessary on today's computers. How old is the current TIA class, in its basic form?

In JStella, I just bypassed the giant updateFrameScanLine method, calling a much smaller version that consists simply of that last DEFAULT case in the big method. This dramatically decreased the amount of code, and as it would appear at least, added less than a millisecond per frame, and keep in mind that this is in Java.

I am asking because I'm wanting to dismantle the "scary" stuff in JStella's TIA class, like the mask constants. One of my goals from the start has been to make JStella's code accessible to the casual programmer, and the current setup is complex. Would performance unduly suffer if these changes were made?

JLA
+stephena #2 Posted September 12, 2007

The Stella project was started in 1996, and the current TIA code is probably about 6-7 years old. So yes, it probably is optimized for older systems (it once ran on a fast 486).

As for the question of performance suffering, that would require testing. You've already seen almost a millisecond increase in removing some code; add a few more of those and it's possible to cause framerate stuttering. To get 60fps, we need to complete everything in approx. 16.7 milliseconds. That includes all emulation of video and sound, and pushing that video and sound to the hardware. And Stella itself must run on PDA type devices, so I'd be opposed to making it much slower.

Of course, the JStella port uses Java, and computers have to be pretty beefy to use Java anyway. In those cases, I don't see a problem with making things a little slower for the sake of clarity.

You should also consider that some of those optimizations you remove are there for more precise emulation. You may not notice the changes on the ROMs you test with, but there may still be changes.
Be careful not to prune too much. Again, this might not be a problem for JStella if you just want to emulate the straightforward type of ROM. It might cause problems with some newer ROMs, though.

Finally, everything I say might be crap. Brad is in the process of rewriting the TIA class, so he may end up making it simpler anyway.
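The 16.7 ms figure above is simply 1000 ms divided by 60 frames. As a minimal sketch of that arithmetic (the function names and the sample timings are made up for illustration, not measurements from Stella or JStella):

```python
def frame_budget_ms(fps: float = 60.0) -> float:
    """Time available per frame, in milliseconds, at the given frame rate."""
    return 1000.0 / fps


def headroom_ms(emulation_ms: float, painting_ms: float, fps: float = 60.0) -> float:
    """Budget left after emulation and painting; negative means a dropped frame."""
    return frame_budget_ms(fps) - (emulation_ms + painting_ms)


if __name__ == "__main__":
    # Hypothetical figures: 5 ms of emulation plus 10 ms of painting.
    print(f"budget:   {frame_budget_ms():.2f} ms")
    print(f"headroom: {headroom_ms(5.0, 10.0):.2f} ms")
```

The point of the second function is that an extra millisecond of emulation only matters once emulation and painting together approach the budget.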
Mauvila #3 Posted September 12, 2007

> You should also consider that some of those optimizations you remove are there for more precise emulation. You may not notice the changes on the ROMs you test with, but there may still be changes. Be careful not to prune too much. Again, this might not be a problem for JStella if you just want to emulate the straightforward type of ROM. It might cause problems with some newer ROMs, though.

Yeah, I did notice that Medieval Mayhem, Double Dunk, and a few others jump back and forth with the new one, so I've switched back to the old one for now. But as it currently is, I think the current non-painting stuff is running at 4 milliseconds with the long optimized one, and less than 5 milliseconds with the K.I.S.S. one. And the K.I.S.S. one has room for improvement. But it's hard to really say because every frame varies by +/- 1,500,000 nanoseconds (1.5 ms).

(The painting methods for JStella were using anywhere from >17 ms, down to an almost negligible time, based on how much of the frame changes. This time decreased dramatically when I changed a few lines to allow Java to use "compatible" image types, at least on my machine. So if I could optimize the graphics more, I may have a lot of spare time in each frame. Unfortunately, I don't have a lot of spare time in real life.)
SeaGtGruff #4 Posted September 12, 2007

The other weekend I was experimenting with some little programs in z26 and Stella, as well as on my heavy sixer Atari 2600 with a Krokodile Cartridge. In particular, I was trying to answer some questions I had about the behavior of the interval timer and the timer interrupt flag. I remembered that batari (Fred) had once stated that none of the emulators duplicate the timer's behavior quite right, and I did notice some discrepancies.

I'll try to post a comprehensive list later, along with some example code, although I seriously doubt it will be before this coming weekend. But in the meantime, the results of my preliminary tests can be summed up as follows:

INTIM in z26 -- z26 seems to emulate the interval timer correctly, but I need to check more closely.

INTIM in Stella -- Stella seems to emulate the interval timer more or less correctly, except for at least one odd difference. If you set the timer using TIM1T, then it runs down at twice the correct speed once it's counted down and wrapped around. That is, once the timer goes from 0 ($00) to 255 ($FF), it starts decrementing by 2 every cycle (255, 253, 251, 249, etc.), instead of by 1 every cycle as it should. But it seems like this happens only if you set the timer using TIM1T. I need to check this out further.

TIMINT in z26 -- z26 does not emulate the timer interrupt flag quite right. In particular, the flag seems to get set 4 cycles after the countdown has completed-- i.e., when the timer hits 251 ($FB), rather than when it hits 255 ($FF).

TIMINT in Stella -- On the other hand, Stella seems to emulate the timer interrupt flag correctly.

I also happened to notice a display bug in batari Basic's score. However, the score looked fine in z26 and Stella; the display bug only occurred on an actual Atari 2600. Of course, this means z26 and Stella aren't emulating the Atari's actual behavior correctly.
I need to check into this further, but my initial suspicions are that it might have something to do with the use of VDELxx. In particular, the sixth digit of the score seems to be displaying the first 3 pixels of the fourth digit, followed by the last 5 pixels of the sixth digit-- i.e., the updating of GRP1 seems to be occurring 1 cycle (3 color clocks) too late, in terms of where the players have been horizontally positioned for the score display.

Now, I would say that this is a positioning error in batari Basic (and it is)-- that either the score needs to be bumped 3 color clocks to the left, or else the GRPx registers need to be updated 1 cycle sooner during the score display. However, since both z26 and Stella "correctly" (or rather, incorrectly) display all 8 pixels of the sixth digit, I wonder if their timing *would* be correct if VDELxx weren't being used? That is, if we update GRP0 or GRP1 while VDELP0 and VDELP1 are both disabled, perhaps the change is "instantly" apparent, and perhaps this is how the timing works in z26 and Stella. But if we update GRP0 while VDELP1 is enabled (or vice versa), perhaps the change in GRP1 does *not* occur "instantly," but is instead delayed by 1 cycle while the TIA copies the "new" GRP1 value into the "old" GRP1 register, and perhaps z26 and Stella don't emulate this slight delay correctly?

Since I haven't done any experiments to confirm or refute these suspicions, they're really just idle speculation. Also, I remember that batari Basic's score was the subject of some discussion a while back, but I thought that particular issue was tracked down to something related to certain models of the Atari 2600 Jr.-- so this might not have any connection to VDELxx at all. But in any case, the actual behavior of my heavy sixer doesn't match the behavior of z26 and Stella.

Michael
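For readers unfamiliar with the "old" and "new" GRPx copies mentioned above, here is a minimal sketch of vertical delay as it is usually described in TIA programming guides: a write to one GRPx register also latches the other player's "new" value into its "old" copy, and VDELPx selects which copy is drawn. It deliberately ignores the sub-cycle timing that the post speculates about, and the class and method names are invented for this example.

```python
class PlayerGraphics:
    """Toy model of the TIA's GRP0/GRP1 'new' and 'old' copies (no timing)."""

    def __init__(self):
        self.new = [0, 0]            # value last written to GRP0 / GRP1
        self.old = [0, 0]            # delayed copy used when VDELPx is set
        self.vdel = [False, False]   # VDELP0 / VDELP1

    def write_grp(self, player: int, value: int) -> None:
        """Write GRPx; as a side effect the other player's 'new' is latched to 'old'."""
        self.new[player] = value & 0xFF
        other = 1 - player
        self.old[other] = self.new[other]

    def displayed(self, player: int) -> int:
        """Pattern the TIA would draw for this player."""
        return self.old[player] if self.vdel[player] else self.new[player]


if __name__ == "__main__":
    tia = PlayerGraphics()
    tia.vdel[1] = True
    tia.write_grp(1, 0x3C)          # goes to 'new' GRP1 only
    print(hex(tia.displayed(1)))    # still the old pattern
    tia.write_grp(0, 0x18)          # writing GRP0 latches GRP1
    print(hex(tia.displayed(1)))
```

Whether the latch takes effect instantly or one cycle later is exactly the open question in the post; this model assumes it is instant.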
+batari #5 Posted September 12, 2007 (edited)

The score bug (if this is the same one) turned out to be due to the ASR instruction working differently on real hardware than it does on emulation.

According to every document available, ASR should operate by ANDing the accumulator with an immediate value and shifting the result to the right. And this is exactly what happens in Stella and z26. But on real hardware, the above happens except that bit 1 of the accumulator is copied to the result.

EDIT: the above behavior with ASR is merely what I observed - it might not be the whole story.

Edited September 12, 2007 by batari
supercat #6 Posted September 12, 2007

> But on real hardware, the above happens except that bit 1 of the accumulator is copied to the result. EDIT: the above behavior with ASR is merely what I observed - it might not be the whole story.

Hmm... I never saw any trouble with it in Strat-O-Gems; the 2005 kernel (unlike the 1994 kernel) uses ASR #$AA; maybe with a $AA operand things work normally?
Nukey Shay #7 Posted September 12, 2007

No probs with Hack'Em either AFAIK, it uses the instruction 8 times, tho it's true that bit 1 is only set in a couple of them, and none of them are located in time-critical areas (still, it would lead to intermission animation problems if true).

Arguments:
$0C (setting up title display pointers)
$10 (to set the monster's reflect register based on the frame counter)
$07 (picking a random monster direction)
$3F (setting the siren pitch)
$06 (generic, to determine if sprites animate during intermissions)
$78 (slower animation of large Pac sprite in intermission 1)
$04 (vertical animation of monster cloak in intermission 3)
$10 (rolling eyes animation in intermission 2)
$78 (used in a generic intermission subroutine)
+batari #8 Posted September 12, 2007

The skepticism doesn't surprise me - I'm just relating what was observed. So I will put this in the proper context. This code was to set score pointers and return the values in Y and X registers.

    LF480:
        and #$0f
        asl
        asl
        asl
        adc #$9C
        tay
        txa
        asr #$F0
        adc #$9c
        tax
        rts

This code worked on an emulator but not real hardware. Whatever was in bit 1 at the txa was present in X after the tax on real hardware, by observation of the results. See topic here with test binaries.

The new code which DID work on both a real 2600 and an emulator was:

    LF480:
        and #$0f
        asl
        asl
        asl
        adc #$9C
        tay
        txa
        arr #$F0
        tax
        sbx #$64
        rts

If not asr, can anyone explain why?
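To make the puzzle concrete, here is a rough Python sketch of both routines under the documented semantics only: ASR as AND then LSR, ARR as AND then ROR through carry (binary mode), and SBX as X = (A AND X) - operand. The helper names are invented, flags other than carry are ignored, and nothing here models the hardware quirk being reported. Under these assumptions the two routines return the same Y and X for BCD digit inputs, which would explain why an emulator cannot tell them apart.

```python
def adc(a, operand, carry):
    """ADC in binary mode: returns (result, carry_out)."""
    total = a + operand + carry
    return total & 0xFF, total >> 8


def asr(a, imm):
    """Documented ASR: AND with immediate, then LSR. Returns (result, carry_out)."""
    t = a & imm
    return t >> 1, t & 1


def arr(a, imm, carry):
    """Documented ARR (binary mode): AND with immediate, then ROR through carry."""
    t = a & imm
    return (t >> 1) | (carry << 7)


def sbx(a, x, imm):
    """Documented SBX: X = (A AND X) - imm, with no borrow taken from carry."""
    return ((a & x) - imm) & 0xFF


def old_routine(a, x):
    a = ((a & 0x0F) << 3) & 0xFF      # and #$0f / asl / asl / asl (carry ends clear)
    a, carry = adc(a, 0x9C, 0)        # adc #$9C
    y = a                             # tay
    a, carry = asr(x, 0xF0)           # txa / asr #$F0
    a, carry = adc(a, 0x9C, carry)    # adc #$9c
    return y, a                       # tax / rts -> (Y, X)


def new_routine(a, x):
    a = ((a & 0x0F) << 3) & 0xFF
    a, carry = adc(a, 0x9C, 0)
    y = a
    a = arr(x, 0xF0, carry)           # txa / arr #$F0
    x = sbx(a, a, 0x64)               # tax / sbx #$64 (A and X are equal here)
    return y, x


if __name__ == "__main__":
    print([hex(v) for v in old_routine(0x05, 0x30)])
    print([hex(v) for v in new_routine(0x05, 0x30)])
```

Subtracting $64 is the same as adding $9C modulo 256, so the second routine reaches the same X without depending on ASR at all.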
SeaGtGruff #9 Posted September 12, 2007 (edited)

> The score bug (if this is the same one) turned out to be due to the ASR instruction working differently on real hardware than it does on emulation. According to every document available, ASR should operate by ANDing the accumulator with an immediate value and shifting the result to the right. And this is exactly what happens in Stella and z26. But on real hardware, the above happens except that bit 1 of the accumulator is copied to the result. EDIT: the above behavior with ASR is merely what I observed - it might not be the whole story.

Okay, then I guess what I observed is something else, because I don't see how the behavior you described could produce the results I'm seeing. But I just ran it again on my Atari, and I think my original description was a little off. First, here's one of the little programs I wrote to test the behavior of INTIM and TIMINT:

     TIM1T = 1
     b = TIMINT
     a = INTIM
     score = 0
     if a <> 0 then for c = 1 to a : score = score + 1 : next
     if b{6} then score = score + 64000
     if b{7} then score = score + 128000
     scorecolor = $0E
    loop
     drawscreen
     goto loop

Stella shows "128237."

z26 shows "000246."

My heavy sixer shows "128246," but the left edge of the "6" (last digit) is distorted/corrupted/?

I originally thought it looked like the first few pixels of the "8" were carrying over into the "6," but that wouldn't make sense, because the "8" is drawn with GRP0, and the "6" is drawn with GRP1. Then I thought the first few pixels of the second "2" are carrying over into the "6," which makes sense if the Atari is starting to draw the last digit with GRP1 a few color clocks before the new GRP1 register is copied to the old GRP1 register. But it isn't exact; some of the lines look like the "6" does contain part of the "2," but other lines almost look okay. It's like the timing isn't exactly the same on all of the lines.
When I get a chance, I'm going to create some different tests to get a better idea of what's going on. In any case, I'm not sure that ASR is the culprit, because I don't see how bit 1 could be affecting the pixels that correspond to bit 7 and bit 6.

EDIT: I wrote this before I saw batari's second post, which I haven't digested yet.

Aside from the quirky "bug"(?) with the last digit of the score, this particular test does show that the timer and the timer flag aren't being emulated correctly. z26 shows the correct value of INTIM, but not the correct value of TIMINT. On the other hand, Stella shows the correct value of TIMINT, but not the correct value of INTIM.

By changing the program a little bit, I can see that z26 eventually does show the correct value of TIMINT, but it's like the timer flag is getting set 4 cycles too late:

     TIM1T = 5
     a = 0 : rem * this is just to add a 5-cycle delay
     b = TIMINT
     a = INTIM
     score = 0
     if a <> 0 then for c = 1 to a : score = score + 1 : next
     if b{6} then score = score + 64000
     if b{7} then score = score + 128000
     scorecolor = $0E
    loop
     drawscreen
     goto loop

z26 says "000245."

     TIM1T = 4
     a = 0 : rem * this is just to add a 5-cycle delay
     b = TIMINT
     a = INTIM
     score = 0
     if a <> 0 then for c = 1 to a : score = score + 1 : next
     if b{6} then score = score + 64000
     if b{7} then score = score + 128000
     scorecolor = $0E
    loop
     drawscreen
     goto loop

z26 says "128244."

This is my understanding of what the timer has in it at each cycle of these programs:

        LDA #$05
        STA TIM1T   ; ?? ?? ?? 05
        LDA #0      ; 04 03
        STA a       ; 02 01 00
        LDA TIMINT  ; FF FE FD FC ; TIMINT should be %10000000
        STA b       ; FB FA F9
        LDA INTIM   ; F8 F7 F6 F5 ; INTIM is $F5 or 245
        STA a       ; F4 F3 F2    ; z26 says "000245" (should be "128245")

        LDA #$04
        STA TIM1T   ; ?? ?? ?? 04
        LDA #0      ; 03 02
        STA a       ; 01 00 FF
        LDA TIMINT  ; FE FD FC FB ; TIMINT is %10000000
        STA b       ; FA F9 F8
        LDA INTIM   ; F7 F6 F5 F4 ; INTIM is $F4 or 244
        STA a       ; F3 F2 F1    ; z26 says "128244"

As this shows, when z26 loads TIMINT, it's actually picking up the value that TIMINT had at the last cycle of the previous instruction.

On the other hand, I can see that Stella gives the correct behavior for the timer flag, because it's set at the same moment that the timer rolls over from 0 to 255-- but then the timer decrements twice as fast as it should:

     TIM1T = 4
     b = TIMINT
     a = INTIM
     score = 0
     if a <> 0 then for c = 1 to a : score = score + 1 : next
     if b{6} then score = score + 64000
     if b{7} then score = score + 128000
     scorecolor = $0E
    loop
     drawscreen
     goto loop

Stella shows "000243."

     TIM1T = 3
     b = TIMINT
     a = INTIM
     score = 0
     if a <> 0 then for c = 1 to a : score = score + 1 : next
     if b{6} then score = score + 64000
     if b{7} then score = score + 128000
     scorecolor = $0E
    loop
     drawscreen
     goto loop

Stella shows "128241."

        LDA #$04
        STA TIM1T   ; ?? ?? ?? 04
        LDA TIMINT  ; 03 02 01 00 ; TIMINT is %00000000
        STA b       ; FF FE FD
        LDA INTIM   ; FC FB FA F9 ; INTIM should be $F9 or 249
        STA a       ; F8 F7 F6    ; Stella says "000243" (should be "000249")
                    ; Stella is doing this:
                    ;   STA b       ; FF FD FB
                    ;   LDA INTIM   ; F9 F7 F5 F3

        LDA #$03
        STA TIM1T   ; ?? ?? ?? 03
        LDA TIMINT  ; 02 01 00 FF ; TIMINT is %10000000
        STA b       ; FE FD FC
        LDA INTIM   ; FB FA F9 F8 ; INTIM should be $F8 or 248
        STA a       ; F7 F6 F5    ; Stella says "128241" (should be "128248")
                    ; Stella is doing this:
                    ;   STA b       ; FD FB F9
                    ;   LDA INTIM   ; F7 F5 F3 F1

On the other hand, if I set TIM8T to 0, then Stella *almost* decrements the timer correctly after rollover occurs:

     TIM8T = 0
     b = TIMINT
     a = INTIM
     score = 0
     if a <> 0 then for c = 1 to a : score = score + 1 : next
     if b{6} then score = score + 64000
     if b{7} then score = score + 128000
     scorecolor = $0E
    loop
     drawscreen
     goto loop

Stella says "128244," z26 says "000245."

        LDA #$00
        STA TIM8T   ; ?? ?? ?? 00
        LDA TIMINT  ; FF FE FD FC ; TIMINT is %10000000
        STA b       ; FB FA F9
        LDA INTIM   ; F8 F7 F6 F5 ; INTIM should be $F5 or 245
        STA a       ; F4 F3 F2    ; Stella says "128244" (should be "128245")

But if I set TIM64T to 0, or set T1024T to 0, then Stella *does* decrement the timer correctly after rollover occurs:

     TIM64T = 0
     b = TIMINT
     a = INTIM
     score = 0
     if a <> 0 then for c = 1 to a : score = score + 1 : next
     if b{6} then score = score + 64000
     if b{7} then score = score + 128000
     scorecolor = $0E
    loop
     drawscreen
     goto loop

Stella says "128245," z26 says "000245."

Michael

Edited September 12, 2007 by SeaGtGruff
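The cycle tables above imply a simple model of the timer, sketched below in Python: the first decrement happens one cycle after the write, later decrements happen once per interval, and after the count passes zero the flag is set and the timer counts down once per cycle. This is only a reading of the tables in this post, not a verified description of the RIOT, and the function names are invented. The read offsets (4 cycles for LDA TIMINT, 11 for LDA INTIM) also come from those tables, and only bit 7 of TIMINT is modelled.

```python
def riot_timer(initial, interval, cycles):
    """Return (INTIM, underflowed) the given number of cycles after the timer write."""
    value = initial & 0xFF
    underflowed = False
    next_tick = 1                      # assumption: first decrement one cycle after the write
    for cycle in range(1, cycles + 1):
        if underflowed:
            value = (value - 1) & 0xFF         # after rollover: one decrement per cycle
        elif cycle == next_tick:
            if value == 0:
                underflowed = True             # 00 -> FF sets the timer flag
            value = (value - 1) & 0xFF
            next_tick = cycle + interval
    return value, underflowed


def expected_score(initial, interval, delay_cycles=0):
    """Score the test programs would show if TIMINT bit 7 and INTIM matched this model."""
    _, flag = riot_timer(initial, interval, 4 + delay_cycles)    # LDA TIMINT finishes here
    intim, _ = riot_timer(initial, interval, 11 + delay_cycles)  # LDA INTIM finishes here
    return intim + (128000 if flag else 0)


if __name__ == "__main__":
    for name, initial, interval, delay in [
        ("TIM1T = 1", 1, 1, 0),
        ("TIM1T = 5 (+5 cycles)", 5, 1, 5),
        ("TIM1T = 4", 4, 1, 0),
        ("TIM8T = 0", 0, 8, 0),
    ]:
        print(f"{name}: {expected_score(initial, interval, delay):06d}")
```

With these assumptions the model reproduces the "should be" values quoted in the tables and the "128246" seen on the heavy sixer for TIM1T = 1.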
+batari #10 Posted September 12, 2007 (edited)

> In any case, I'm not sure that ASR is the culprit, because I don't see how bit 1 could be affecting the pixels that correspond to bit 7 and bit 6.

The problem you're seeing isn't the ASR weirdness, as that seemed to affect the 1st, 3rd or 5th digit.

As for ASR, I am going back over the original problem, and I think I'm a little off. It seems like if bit 4 of the Y register is set, it causes the ASR result to be two more than it should be. Yes, I know this doesn't make any sense. I also can't remember if I tested it myself on real hardware, so if not, I will do so.

I've commented things a little here:

    LF480:
        and #$0f    ; if A=$02, $03, $06, or $07, there are problems.
        asl
        asl
        asl
        adc #$9C
        tay         ; Since we shifted left, it's actually bit 4 now
        txa         ; now that A is overwritten, it's just bit 4 of Y that seems to mess things up
        asr #$F0
        adc #$9c
        tax         ; if bit 4 of Y is set, then X=X+2 on real hardware, but not on an emulator.
        rts

Edited September 12, 2007 by batari
SeaGtGruff #11 Posted September 12, 2007

> The problem you're seeing isn't the ASR weirdness, as that seemed to affect the 1st, 3rd or 5th digit.

Since it's the first few pixels of the last digit, and since at least some of those first few pixels look like they could be held over from the beginning of the fourth digit, I just figured it must be a timing/positioning issue when updating the player registers, perhaps especially when using VDELxx.

When I get a chance-- which probably won't be until this coming weekend at the earliest-- I plan to do some extensive testing on my heavy sixer to investigate the timings related to changing the graphics data of the players, missiles, balls, and playfield as they're being drawn, so I'll know (if only to satisfy my own nagging curiosity) what those timings are. For example, I know there are well-known timing issues related to updating the playfield registers, and I've already done experiments on the emulators to get a listing of these timings, but I've never done them on a real Atari! And I want to compare the timings of updating the GRPx registers both with and without VDELxx enabled, as well as test with different player/missile/ball sizes, etc. And that's just the shape changes-- I'm also eventually going to test the color changes, horizontal positioning, HMOVES, etc.-- all of which I presume has been tested by many other people already, but I've never done it myself, and I want to put it all together in a nice document if possible.

And besides testing on my heavy sixer (and eventually on my early-model 7800, once I find my Cuttle Cart 2), I plan to test on the various emulators as well, so I can document any differences between the real hardware and the emulators. In fact, that's the only reason I brought this up here-- in the interest of helping the Stella developers improve the TIA emulation.
I just wish I had a variety of 2600s, in case there are any differences between different models-- but I plan to include my little test programs when I document all of this, so other people can test on their machines, too.

Michael
+stephena #12 Posted September 12, 2007

> In fact, that's the only reason I brought this up here-- in the interest of helping the Stella developers improve the TIA emulation. I just wish I had a variety of 2600s, in case there are any differences between different models-- but I plan to include my little test programs when I document all of this, so other people can test on their machines, too.

And it's very much appreciated. I've notified Brad about this thread, so hopefully any information you find will speed up development of the revised TIA code. I only wish I could help more on the TIA front, but alas, I'm still trying to decipher that stuff myself.