
Requesting help in improving TIA emulation in Stella


stephena


The easiest thing to do in this case is send me a diff. If you've grabbed the code from SVN, then it would be as easy as 'svn diff', or something similar in the UI-based version of SVN. I'm in the process of documenting this in a developers howto webpage.

 

Also, I notice your code is almost exactly the same as WSYNC, except you also add a check for >22. What is the significance of this 22??

 

I'm counting 6502 time in 1-76 cycles, and there are three unique areas for the bars in my test rom.

 

- at cycle 76 the bar moved with RSYNC is 1 pixel behind the bar moved with a normal WSYNC

- at cycles 1-22 both bars are aligned

- at cycles 23-75 the bar moved with RSYNC is 3 pixels behind the bar moved with a normal WSYNC

 

There are 160 pixels per line taking 160 TIA cycles, and 76*3 = 228 TIA cycles available. 228-160 = 68 TIA cycles for HBLANK, or 68/3 = 22.7 6502 cycles.

 

This is where the >22 comes from. At 22 cycles HBLANK is not quite complete, but at 23 cycles it is.
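The arithmetic above can be checked with a few lines of C++. This is only a sketch of the numbers quoted in this post; the constant and function names are made up for illustration and are not taken from the Stella source.

```cpp
#include <cassert>

// Timing figures quoted in the post above.
constexpr int kCpuCyclesPerLine = 76;
constexpr int kTiaClocksPerCpuCycle = 3;
constexpr int kVisiblePixels = 160;

// TIA clocks spent in HBLANK on every scanline: 228 - 160 = 68.
constexpr int hblankTiaClocks() {
  return kCpuCyclesPerLine * kTiaClocksPerCpuCycle - kVisiblePixels;
}

// Smallest whole number of CPU cycles that covers HBLANK (ceiling of 68 / 3).
constexpr int firstCpuCycleAfterHblank() {
  return (hblankTiaClocks() + kTiaClocksPerCpuCycle - 1) / kTiaClocksPerCpuCycle;
}

// Hypothetical helper: true while `cpuCycle` elapsed CPU cycles
// have not yet covered the whole HBLANK period.
inline bool insideHblank(int cpuCycle) {
  assert(cpuCycle >= 0 && cpuCycle <= kCpuCyclesPerLine);
  return cpuCycle * kTiaClocksPerCpuCycle < hblankTiaClocks();
}
```

With these definitions, insideHblank(22) is true (66 TIA clocks) and insideHblank(23) is false (69 TIA clocks), which matches the 22/23 boundary seen in the test rom.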

 

Why RSYNC and WSYNC are different in the first place is more elegantly explained by Eckhard in the thread with the RSYNC rom I built.

 

The objects always get positioned three pixels further to the right after a WSYNC than they do after a RSYNC, but this is to be expected. Triggering WSYNC will halt the CPU until the horizontal sync counter wraps around to zero. Triggering RSYNC will reset the horizontal sync counter to zero immediately. But the wrap-around will actually happen after one more cycle of this counter. Since the horizontal sync counter counts once every 4 pixels, one more CPU cycle occurs before the counter wraps around to zero. Therefore the positioning code will hit RESPx one cycle sooner after a RSYNC than after a WSYNC.

Link to comment
Share on other sites

IMHO the first thing to tackle is what the correct behavior of midline NUSIZx changes is supposed to be, for all of the most common Atari consoles. I'm wondering if Curt has some VHDL files kicking around from making the Flashback2? It would be of great help if somebody has already gone through and worked out the correct behavior at the gate level.

 

Once we understand the rules then we can try and understand what Stella does now, and whether or not rewriting the whole core is the way to go. I'm not really Gung-Ho on rewriting the whole core, although I agree if the correct behavior was well known it might be better to implement it from the ground up.

Link to comment
Share on other sites

IMHO the first thing to tackle is what the correct behavior of midline NUSIZx changes is supposed to be, for all of the most common Atari consoles. I'm wondering if Curt has some VHDL files kicking around from making the Flashback2? It would be of great help if somebody has already gone through and worked out the correct behavior at the gate level.

 

Once we understand the rules then we can try and understand what Stella does now, and whether or not rewriting the whole core is the way to go. I'm not really Gung-Ho on rewriting the whole core, although I agree if the correct behavior was well known it might be better to implement it from the ground up.

 

IMO, it's still related to keeping track of when drawing started, and when you've passed a certain point where current NUSIZx writes would influence what comes next. I see both the normal case and the midline case as aspects of the same issue. Sometimes an NUSIZx write occurs too late, and the drawing continues as it was doing (and the NUSIZx contents apply to the next draw). But sometimes the NUSIZx write is timed such that just after the write, from the current pixel on, it uses the new NUSIZx value for subsequent drawing. I think you can kind of see that with Meltdown. The rightmost columns are drawn, but not stretched, since the drawing is using the old NUSIZx value and not the new one (which probably changed the width of the player). Note that I haven't traced the code yet to see if this is actually what's happening, but it's plausible.

 

Actually, if someone could completely trace drawing a few scanlines in Meltdown, showing exactly what cycle the drawing should change, and then compare that to Stella, it would be incredibly helpful.

 

SeaGtGruff did a similar trace for BattleZone, which clearly showed the same information. I couldn't make much sense of it at the time, but perhaps with more experience I can look at it again.

 

Anyway, that's it for me for tonight. It's 10PM here, and I've been at this all day ...

 

EDIT: Before I go to bed, something else I forgot to mention. In updateFrame(), the player masks are set once, and drawing uses them from that point on. That's definitely a problem, I think. For example, if you call updateFrame(20), it will set the player masks once, then draw 20 pixels. But what if suppression was supposed to happen sometime during those 20 pixels, or an NUSIZx change was supposed to occur? Obviously it wouldn't work, because it would still be using the old mask with the old state. I think this is why MESS calls its updateFrame multiple times from NUSIZx: once to draw up to the point where the old NUSIZx value applies, then it changes NUSIZx, then it draws again for the new value.

 

Again, all this is a guess at this point, but it's an educated guess that seems to support our findings so far.
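To make the "draw up to the write, then change the register" idea concrete, here is a minimal sketch. MiniTia, modePerPixel and writeNusiz are hypothetical names invented for this example; this is neither the Stella nor the MESS code, and it only records which NUSIZx value each pixel would be drawn with.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified stand-in for a TIA renderer.
struct MiniTia {
  int nusiz = 0;      // NUSIZx value currently in effect (low three bits)
  int drawnUpTo = 0;  // first pixel of the line that is not rendered yet
  // NUSIZx value that was used to draw each pixel (-1 = not drawn yet).
  std::vector<int> modePerPixel = std::vector<int>(160, -1);

  // Render pixels [drawnUpTo, pixel) with the state that is valid right now.
  void updateFrame(int pixel) {
    assert(pixel >= drawnUpTo && pixel <= 160);
    for (int x = drawnUpTo; x < pixel; ++x) modePerPixel[x] = nusiz;
    drawnUpTo = pixel;
  }

  // A register write first catches the drawing up to the write position
  // using the OLD value, and only then installs the new value.
  void writeNusiz(int pixel, int value) {
    updateFrame(pixel);
    nusiz = value & 0x07;
  }
};
```

For example, starting with nusiz = 7, calling writeNusiz(12, 6) and then updateFrame(20) leaves pixels 0-11 drawn with the old value and pixels 12-19 with the new one. A single updateFrame(20) with one mask could not express that split.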

Link to comment
Share on other sites

I made a fatal error in my RSYNC code. I don't need to check for HBLANK end at all. The code should simply be like this (decrementing cyclesToEndOfLine by 1, and doing no checks):

 

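// CPU cycles remaining until the end of the current scanline (76 per line)
uInt32 cyclesToEndOfLine = 76 - ((mySystem->cycles() -
    (myClockWhenFrameStarted / 3)) % 76);

// RSYNC: skip to one cycle before the end of the line, with no HBLANK check
mySystem->incrementCycles(cyclesToEndOfLine - 1);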

 

 

I discovered it when I let my demo fully run. It was failing at 54 and above (RSYNC count). I then realized I had my numbers backwards, since it was cycles remaining not cycles gone by. I never noticed it before.

 

Seems good now, and I double checked Extra Terrestrials too.
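For comparison, the WSYNC and RSYNC stalls can be written side by side as two small functions. This is a sketch only: the function names are invented here, and it assumes WSYNC skips exactly the remaining cycles of the line while RSYNC skips one cycle less, as in the corrected code above.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kCpuCyclesPerLine = 76;

// CPU cycles left on the current scanline, where `cyclesIntoLine`
// is the number of CPU cycles already elapsed on it (0..75).
inline std::uint32_t cyclesToEndOfLine(std::uint32_t cyclesIntoLine) {
  assert(cyclesIntoLine < kCpuCyclesPerLine);
  return kCpuCyclesPerLine - cyclesIntoLine;
}

// Assumed WSYNC behaviour: halt the CPU until the line is finished.
inline std::uint32_t wsyncStall(std::uint32_t cyclesIntoLine) {
  return cyclesToEndOfLine(cyclesIntoLine);
}

// RSYNC as in the corrected code: one cycle less than WSYNC, no other checks.
inline std::uint32_t rsyncStall(std::uint32_t cyclesIntoLine) {
  return cyclesToEndOfLine(cyclesIntoLine) - 1;
}
```

The two stalls always differ by exactly one CPU cycle, i.e. three pixels, which is the offset between the bars in the test rom.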

Edited by Omegamatrix
Link to comment
Share on other sites

After rereading Andrew's notes again and comparing with some tests (more have to follow) I have in my mind, I think NUSIZ can be handled as follows:

  1. Size and copy changes have to be treated separately
  2. A size change has immediate effect to the currently drawn player/missile (copy)
  3. A copy change has no effect on the currently drawn player/missile (copy), but it is delayed until the end of that copy

So for the size change, we have to define a new value that depends on the new size and the old copy setting (so we can have e.g. double width, 3 copies). This value is used for updating the frame while we wait for the current copy to finish drawing. This means one would have to extend the mask tables to cover the case where the size change has already happened but the copy change has not.
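The theory above could be sketched like this. The decode table follows the NUSIZx player modes from the Stella Programmer's Guide; PlayerMode, decodeNusiz and modeWhileCopyIsDrawn are hypothetical names, and the combination rule is just the theory from this post, not confirmed hardware behaviour.

```cpp
#include <array>
#include <cassert>

struct PlayerMode {
  int stretch;  // pixels per graphics bit: 1, 2 or 4
  int copies;   // number of copies: 1, 2 or 3
  int spacing;  // pixels between copy starts (0 for a single copy)
};

// Split the low three NUSIZx bits into a size part and a copy part.
inline PlayerMode decodeNusiz(int nusiz) {
  assert(nusiz >= 0 && nusiz <= 7);
  static const std::array<PlayerMode, 8> table = {{
      {1, 1, 0},   // 000: one copy
      {1, 2, 16},  // 001: two copies, close
      {1, 2, 32},  // 010: two copies, medium
      {1, 3, 16},  // 011: three copies, close
      {1, 2, 64},  // 100: two copies, wide
      {2, 1, 0},   // 101: double width
      {1, 3, 32},  // 110: three copies, medium
      {4, 1, 0},   // 111: quad width
  }};
  return table[nusiz];
}

// Theory from the post: while the current copy is still being drawn,
// the NEW size applies immediately but the OLD copy layout stays in effect.
inline PlayerMode modeWhileCopyIsDrawn(int oldNusiz, int newNusiz) {
  const PlayerMode oldMode = decodeNusiz(oldNusiz);
  const PlayerMode newMode = decodeNusiz(newNusiz);
  return PlayerMode{newMode.stretch, oldMode.copies, oldMode.spacing};
}
```

Switching from three copies medium (110) to double width (101) in the middle of a copy would then give the "double width, 3 copies" combination mentioned above, which has no NUSIZx value of its own.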

 

For now this is only a theory, with lots of complications still to consider (e.g. when does the 2nd copy of a quad width, three copies close player start?), so I will spend the next while writing some tests to confirm it. And maybe play a bit with the tables later.

 

BTW: This would also explain why the 8 pixel delay was chosen: A copy of a multiple player is always 8 pixels wide, so when we change NUSIZx, this delay prevents affecting the currently drawn copy. I suppose the effects of a size change were not considered, because the logic and/or tricks were not known or used back then.

Link to comment
Share on other sites

Here is a test similar to my previous one, now switching size and/or copy while a player is displayed:

 

The three groups show:

  1. a switch from double (101) to quad width (111)
     
  2. a switch from double width (101) to three copies medium (110) and back
     
  3. a switch from quad width (111) to three copies medium (110) and back

Again MESS gets pretty close to what I see on real hardware (you should check the ROM there to compare). So I attached a (non-corrected) MESS screen shot.

 

For the first group, you can see how the size increase happens in intervals. There seem to be two clocks involved. I have to check with Andrew's notes. (On real hardware the right edges look a bit different.)

 

The left copies of the 2nd and 3rd group show a different angle on the right edge. So it seems that when the switch happens, not the complete player is stretched but only the remaining pixels.

This theory is supported by the right copies. Note that for the first tests of each group, the switch back happens too early for some shifts. So the third copy is not displayed at all.

 

But as you can also see, there are some noticeable differences to the theory. Especially the right copies' edges do not always look as smooth as expected (though they are cleaner on real hardware than with MESS). The left edge distortions on the right copies are not there at all on real hardware.

 

Isn't the TIA a nice, extremely flexible beast? :)

 

BTW: For missiles, a similar size/copy switch shows simpler results. Probably because there is just one pixel and thus no 2nd clock involved, a size switch creates a clear cut edge.

post-45-0-75454700-1361694754_thumb.png

testSize2Copies_A.bin

testStella.asm

Edited by Thomas Jentzsch
Link to comment
Share on other sites

Sooo, after looking at those results and the TIA notes, (at least to me) these show that a table-based approach is getting complicated. The pixels of a player are not only defined by the current values of NUSIZx, but by how the internal clocks are and were ticking.

 

The Scan Counters are never reset, so once the counter receives
the Start signal it will count fully from 0 to 7.
...
The count frequency is determined by the NUSIZ register for that
player; this is used to selectively mask off the clock signals to
the Graphics Scan Counter. Depending on the player stretch mode,
one clock signal is allowed through every 1, 2 or 4 graphics CLK.
...
The NUSIZ register can be changed at any time in order to alter
the counting frequency, since it is read every graphics CLK.

As a result, we don't have just 3 possible widths for a player; instead it can have (nearly?) all widths between 8 and 32 pixels. So when changing NUSIZ at a certain clock while the player is already displaying, we will have to calculate how many pixels have been displayed with the old stretching and how many will go with the new stretching. And from that we will have to form a new index into the mask tables (instead of using NUSIZ directly).
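To get a feeling for the possible widths, here is a rough model of the quoted notes: a counter that runs from 0 to 7 and advances once every 1, 2 or 4 graphics clocks, with the stretch re-read on every clock. The divider logic and the name playerWidth are assumptions made for this sketch; the real TIA masks its clock differently in detail, so the exact numbers may be off.

```cpp
#include <cassert>

// Width in pixels of one player copy when the stretch factor changes from
// `oldStretch` to `newStretch` after `switchClock` graphics clocks.
// Simplified model: an internal divider lets one count through every
// `stretch` clocks, and the stretch is sampled on every clock.
inline int playerWidth(int oldStretch, int newStretch, int switchClock) {
  assert(oldStretch == 1 || oldStretch == 2 || oldStretch == 4);
  assert(newStretch == 1 || newStretch == 2 || newStretch == 4);
  int count = 0;    // scan counter, runs from 0 to 7
  int divider = 0;  // clocks seen since the last count
  int clocks = 0;   // total graphics clocks = pixels displayed
  while (count < 8) {
    const int stretch = (clocks < switchClock) ? oldStretch : newStretch;
    ++clocks;
    if (++divider >= stretch) {
      divider = 0;
      ++count;
    }
  }
  return clocks;
}
```

In this model an unchanged quad width player is 32 pixels wide, but switching from quad to single width after 2 clocks gives 10 pixels and after 4 clocks gives 11 pixels, so a plain NUSIZ index into three mask widths is not enough.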

 

Does that sound feasible? Maybe.

 

But this would only cover the immediate size changes. For the copy changes we would have to wait until the next player copy start signal to become effective (maybe waiting until the end of the current player is sufficient).

 

Maybe I do not understand how updateFrame() works exactly, but with 32 pixel players, we cannot simply update the frame to that point. Because many changes can happen in between. And here I am currently lost with no idea how to add this to the existing code. Any ideas???

 

Also, what happens if the Scan Counters of a quad width player are still counting and then the player is set to three copies close? My tests show that the 1st and the 2nd copy can completely blend together then.

 

And even the right edge of the blended player is still affected. I would expect when the 2nd copy starts the Scan Counter to be reset to 0 and then counting every 1 graphics CLK. This is not the case. Probably because some counts from the quad size copy are left.

Link to comment
Share on other sites

As a result, we don't have just 3 possible widths for a player; instead it can have (nearly?) all widths between 8 and 32 pixels. So when changing NUSIZ at a certain clock while the player is already displaying, we will have to calculate how many pixels have been displayed with the old stretching and how many will go with the new stretching. And from that we will have to form a new index into the mask tables (instead of using NUSIZ directly).

 

But this would only cover the immediate size changes. For the copy changes we would have to wait until the next player copy start signal to become effective (maybe waiting until the end of the current player is sufficient).

 

Maybe I do not understand how updateFrame() works exactly, but with 32 pixel players, we cannot simply update the frame to that point. Because many changes can happen in between. And here I am currently lost with no idea how to add this to the existing code. Any ideas???

 

Also, what happens if the Scan Counters of a quad width player are still counting and then the player is set to three copies close? My tests show that the 1st and the 2nd copy can completely blend together then.

 

And even the right edge of the blended player is still affected. I would expect when the 2nd copy starts the Scan Counter to be reset to 0 and then counting every 1 graphics CLK. This is not the case. Probably because some counts from the quad size copy are left.

 

All of this gives more insight into what MESS is doing with its p0gfx and p1gfx structures, and how it's calculating results for current NUSIZx changes as well as future effects of the current NUSIZx value. In some ways, I now understand more about how MESS works than how the current Stella code works :(

Link to comment
Share on other sites

I made a fatal error in my RSYNC code. I don't need to check for HBLANK end at all. The code should simply be like this (decrementing cyclesToEndOfLine by 1, and doing no checks):

 

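// CPU cycles remaining until the end of the current scanline (76 per line)
uInt32 cyclesToEndOfLine = 76 - ((mySystem->cycles() -
    (myClockWhenFrameStarted / 3)) % 76);

// RSYNC: skip to one cycle before the end of the line, with no HBLANK check
mySystem->incrementCycles(cyclesToEndOfLine - 1);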

 

 

I discovered it when I let my demo fully run. It was failing at 54 and above (RSYNC count). I then realized I had my numbers backwards, since it was cycles remaining not cycles gone by. I never noticed it before.

 

Seems good now, and I double checked Extra Terrestrials too.

 

I have another test ROM here that you wrote, I believe. It doesn't work with the current RSYNC updates. Basically, you press the fire button and the bar moves all over the place on real hardware. This doesn't happen in emulation.

 

RSYNC.zip

Link to comment
Share on other sites

Here's another test program that tests NUSIZ changes for players (here P0).

 

It draws a couple of bands, where P0 is set to maximum size at the beginning of a line (111, not testing missiles), and then changed to other values in the middle of the screen (cycle 49) (110 - 000, a new value each band). In the first band, COLUP0 is changed, not NUSIZ, to better show where the write happens. Also, if you press the left button then the players move horizontally.

 

The images show it running on a PAL 4-switch woody (left) and in Stella (right)

 

post-27536-0-43616200-1361719214_thumb.jpg

post-27536-0-20570300-1361719230_thumb.png

 

On my crappy TV it is difficult to see exactly, but I think the following can be said:

  • The position where the change takes effect is about two pixels after where the color change becomes visible. So the "delay" in Stella is definitely too large.
  • When moving the players, it seems that the position of the change stays the same, it also does not seem to depend on the NUSIZ values, just the graphics differs.
  • When the players are moved, the new graphics appear pixel by pixel, but are sometimes set back by one pixel.
  • The distance between the first and second copy is not constant.

I had a look at the Player Graphics Scan Converter in the schematics and it can be seen that it is built out of 5 flops. 3 form an 8-pixel counter, and the two on the left are for control. The delay of 2 is probably because a change in NUSIZ must go through the first two flops before having an effect. As said in the hardware notes, the counter will count down even when NUSIZ is changed on the fly.

 

What makes it difficult to understand in all the different cases is that the Player Position Counter logic can reset the scan counter. And this is not only the case for multiple copies but can also happen when RESP is hit.

 

My personal feeling is that for a 100% accurate emulation this becomes too difficult to handle with tables and if cascades. Just imagine kernels that change RESP and NUSIZ multiple times per line.

 

Like the whole 2600 emulation is broken up into component emulation (6507, TIA, RIOT, ...), it might make sense to split the TIA emulation into micro-emulation of components. This does not necessarily mean that a complete rewrite would be needed. One could add new components step by step, for example add a Scan Converter class, feed it with the pixel clock and use its output in updateFrame(). It seems MESS is already doing something similar.
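A component of that kind might look roughly like the following sketch. The class name ScanCounter, its interface and the drawPlayer helper are all hypothetical; the two control flops, the 2-pixel NUSIZ latency and resets from the position counter are deliberately left out, so this only shows the shape of the idea.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical micro-component: an 8-step graphics scan counter that is
// fed one pixel clock at a time together with the current stretch factor.
class ScanCounter {
 public:
  // Start signal from the position counter (simplified: restarts at bit 0).
  void start() {
    active_ = true;
    count_ = 0;
    divider_ = 0;
  }

  bool active() const { return active_; }

  // Advance by one pixel clock. Returns the graphics bit index (0..7)
  // shown on this pixel, or -1 if no copy is being drawn.
  int tick(int stretch) {
    assert(stretch == 1 || stretch == 2 || stretch == 4);
    if (!active_) return -1;
    const int bit = count_;
    if (++divider_ >= stretch) {
      divider_ = 0;
      if (++count_ == 8) active_ = false;
    }
    return bit;
  }

 private:
  bool active_ = false;
  int count_ = 0;
  int divider_ = 0;
};

// Example consumer, standing in for updateFrame(): draws one unreflected
// copy of GRPx as text, 'X' for a set bit and '.' for a clear one.
inline std::string drawPlayer(std::uint8_t grp, int stretch) {
  ScanCounter counter;
  counter.start();
  std::string pixels;
  while (counter.active()) {
    const int bit = counter.tick(stretch);
    pixels.push_back(((grp >> (7 - bit)) & 1) ? 'X' : '.');
  }
  return pixels;
}
```

Because the stretch is passed in on every tick, a kernel that rewrites NUSIZ in the middle of a copy just changes the argument from one pixel to the next, with no special cases in the caller.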

NUSIZTest.bin

NUSIZTest.zip

Edited by Joe Musashi
Link to comment
Share on other sites

I have another test ROM here that you wrote, I believe. It doesn't work with the current RSYNC updates. Basically, you press the fire button and the bar moves all over the place on real hardware. This doesn't happen in emulation.

 

RSYNC.zip

 

Nope, not mine. That is Wickey's I believe. He updates RSYNC right before VSYNC so the screen has already been drawn by then. I suspect that this would be hard to integrate into Stella.

Link to comment
Share on other sites

BTW: This would also explain why the 8 pixel delay was chosen: A copy of a multiple player is always 8 pixels wide, so when we change NUSIZx, this delay prevents affecting the currently drawn copy.

 

I was thinking the same thing. I read Andrew's notes last night and looked at the schematic, and learned quite a bit about how objects get displayed with the TIA. The player horizontal position counters caught my eye, and it made sense how the objects get the spacing they do. I think designing a TIA emulator might be done well by simply using all the clocks they have, but I'm not sure what Stella does yet.

Link to comment
Share on other sites

I think designing a TIA emulator might be done well by simply using all the clocks they have, but I'm not sure what Stella does yet.

Stella emulates mainly what is described in the Stella Programmer's Guide (with some tweaks). So it has an abstract view of the TIA and emulates what it is supposed to do. That's why most original games work without problems and why demos which do undocumented stuff usually do not work.

 

And that's also the reason why IMO it is a problem to change that without a major rewrite of some parts.

Link to comment
Share on other sites

My personal feeling is that for a 100% accurate emulation this becomes too difficult to handle with tables and if cascades. Just imagine kernels that change RESP and NUSIZ multiple times per line.

 

Like the whole 2600 emulation is broken up into component emulation (6507, TIA, RIOT, ...), it might make sense to split the TIA emulation into micro-emulation of components. This does not necessarily mean that a complete rewrite would be needed. One could add new components step by step, for example add a Scan Converter class, feed it with the pixel clock and use its output in updateFrame(). It seems MESS is already doing something similar.

I agree. But my talent is limited here. I wouldn't know where to start.

Link to comment
Share on other sites

I was thinking the same thing. I read Andrew's notes last night and looked at the schematic, and learned quite a bit about how objects get displayed with the TIA. The player horizontal position counters caught my eye, and it made sense how the objects get the spacing they do. I think designing a TIA emulator might be done well by simply using all the clocks they have, but I'm not sure what Stella does yet.

 

MESS seems to use all the different clocks as described in the TIA notes, while Stella definitely doesn't. I mean it does in the case of HMOVEs (which is why that part of the code works, I guess). But it definitely doesn't for player graphics, which is the whole issue we're seeing. The more I look into it and get all your feedback, the more I'm thinking that the current approach just isn't sustainable.

 

In the "Stella 3.8 released" thread, Trebor actually extended an invitation to the MESS people to see if we could collaborate on the TIA code they have. Personally, I've already fixed some minor issues in their code, and I believe the code could be easily integrated into Stella. But the licensing isn't compatible, unless the specific people who wrote their code give permission. I guess we'll see ...

Link to comment
Share on other sites

Stella emulates mainly what is described in the Stella Programmer's Guide (with some tweaks). So it has an abstract view of the TIA and emulates what it is supposed to do. That's why most original games work without problems and why demos which do undocumented stuff usually do not work.

 

And that's also the reason why IMO it is a problem to change that without a major rewrite of some parts.

 

Yes, that explains the current situation perfectly. As an implementation of the 'machine' described in the Stella Programmer's Guide, the current code is exemplary. But as an emulation of the real hardware (warts and all), there are many corner cases not considered. As I've stated previously, my main concentration is the debugger. But that's really only part of a larger issue; I want Stella to be a developer's emulator, and that's why I want to concentrate on the debugger. Unfortunately, developers tend to push the hardware to the extreme, which is exactly where the current TIA code is weaker. So it sort of implies that the current code is insufficient for what I want from the project.

Link to comment
Share on other sites

Greetings from the MAME/MESS collective.

 

With regards to the licensing question:

 

The developer of a given piece of code in MAME retains the full rights to re-license the code as he sees fit. As such, Wilbert Pol ('judge') giving permission for Stella to use the code is all the permission that should be necessary, though we're willing to assist Judge in making sure it can be relicensed. As far as any of us on the MAME/MESS team know, the current TIA code is pretty much exclusively by him. All non-global-change-related mods to tia.c are his dating back to SVN being introduced in 2007.

Link to comment
Share on other sites

Greetings from the MAME/MESS collective.

 

With regards to the licensing question:

 

The developer of a given piece of code in MAME retains the full rights to re-license the code as he sees fit. As such, Wilbert Pol ('judge') giving permission for Stella to use the code is all the permission that should be necessary, though we're willing to assist Judge in making sure it can be relicensed. As far as any of us on the MAME/MESS team know, the current TIA code is pretty much exclusively by him. All non-global-change-related mods to tia.c are his dating back to SVN being introduced in 2007.

 

That's great to hear! I've been following your conversation in the original thread, and I know most don't have an account here (and you may not be personally keeping up with the responses here). So I've created an account there, and am waiting for approval. Once approved, I'll move the licensing part of the discussion there.

Link to comment
Share on other sites

I don't mind at all if any parts of the mame/mess TIA code that I changed were used in stella (the same applies to other a2600 code that I wrote for mess).

 

However, to be honest, I am not the only one that worked on the TIA video code. After some googling (mame svn history starts in late 2007), it appears that Stefan Jokisch did the initial TIA code for MAME 0.68 in 2003. The parts that I changed (in 2007) were all related to the NUSIZx, HMOVE, etc. timings in an attempt to increase compatibility for the atari 2600 driver in mess.

 

Fwiw, somewhere on my TODO list has been a total rewrite of mame's TIA code using a real state machine, although it has been on there for a couple of years already ;)

Currently whenever something changes a temp full scanline is drawn and just the new pixels get taken out of the temp scanline and copied into what gets put on the screen.
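As a rough illustration of that temp-scanline scheme (a sketch under my own assumptions, not the actual MAME/MESS tia.c code; LineCompositor, renderFullLine and the single integer "state" are made up for the example):

```cpp
#include <array>
#include <cassert>

constexpr int kLineWidth = 160;
using Scanline = std::array<int, kLineWidth>;

// Hypothetical renderer: draws a COMPLETE line for one register state.
// A real renderer would draw objects; here every pixel just holds the state.
inline Scanline renderFullLine(int state) {
  Scanline line;
  line.fill(state);
  return line;
}

struct LineCompositor {
  Scanline output{};  // what finally gets put on the screen
  int committed = 0;  // pixels [0, committed) are final
  int state = 0;      // register state currently in effect

  // Called whenever something changes at horizontal position `pixel`.
  void onChange(int pixel, int newState) {
    assert(pixel >= committed && pixel <= kLineWidth);
    // Draw a temporary full scanline with the state valid up to now ...
    const Scanline temp = renderFullLine(state);
    // ... and take only the not yet committed pixels out of it.
    for (int x = committed; x < pixel; ++x) output[x] = temp[x];
    committed = pixel;
    state = newState;
  }
};
```

Rendering the whole line each time is wasteful, but it keeps the renderer itself free of any "start in the middle of an object" logic, which is presumably why a state machine rewrite is on the TODO list.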

 

@Stephena: If the above information isn't enough to proceed let me know. Email will work best since I don't check these boards that much (I still use the same email address as in 2007).

Link to comment
Share on other sites

However, to be honest, I am not the only one that worked on the TIA video code. After some googling (mame svn history starts in late 2007), it appears that Stefan Jokisch did the initial TIA code for MAME 0.68 in 2003. The parts that I changed (in 2007) were all related to the NUSIZx, HMOVE, etc. timings in an attempt to increase compatibility for the atari 2600 driver in mess.

 

The stuff I'm interested in is precisely the improved NUSIZx and HMOVE timings (although the latter I've already used, with your permission from an email in 2009).

 

@Stephena: If the above information isn't enough to proceed let me know. Email will work best since I don't check these boards that much (I still use the same email address as in 2007).

 

PM being sent now. Hopefully I have the correct address.

Link to comment
Share on other sites
