
Requesting help in improving TIA emulation in Stella


stephena


I could be wrong, but I doubt the TIA notes are 100% complete. Maybe we can fill any gaps by analyzing the results of some test ROMs.

 

What I meant with my statement is that I wouldn't know how to build the framework. Would we handle the various clocks in parallel? If we go for an object-based approach, would we just ask all objects pixel by pixel if they are on and with which color? Wouldn't this become too slow?

 

BTW: Maybe this could help too? http://www.visual6502.org/images/pages/Atari_10444D_TIA.html If they ever post the source code for download...


I'm still researching the best way forward. I'm in talks with the MESS people, and the person directly responsible for the 2600 portion (known as 'Judge' on the MESS forums). So I have permission to use that code, if necessary.

 

The if necessary part comes from the fact that the MESS code, while more accurate than Stella in certain areas, is actually less accurate in others. And if I were to just take the code as-is right now (even forgetting about some cases where it isn't accurate), I'd lose quite a bit of Stella functionality. Specifically, hooks into the debugger, fixed debug colours mode, disabling object graphics/collisions, PAL colour loss effect, runtime configuration of ystart and height, and probably some speed as well. IMHO, a lot of those are things that make Stella invaluable for development.

 

So we may work to improve the MESS core to add these new features, and share the core between projects. Or maybe Judge can explain his improvements in MESS, and I can implement similar functionality in Stella. Or perhaps a complete rewrite, separate from both projects, is the way to go.

 

As always, the number of people willing to help, and the expertise they bring to the table will determine how this goes. If I'm the only one working on things (as it's been in the past), assuming I don't burn out, I'll probably just continue to work on the current Stella core. As the saying goes, go with the devil you know.

 

Or if some other group decides to do a TIA implementation from scratch, I may just leave this part of Stella alone and move to other areas, coming back to it when we have some code to work with.

 

Decisions, decisions. But at least I feel I'm not alone in the battle, as it seemed for the past few years (and eventually led to my rant in the original thread).


The best long term approach is probably a gate-level emulation, so this issue is fixed for good. But as always (and this has been the problem all along), getting people (both numbers and with expertise) is the issue. I personally don't know enough about reading schematics and implementing it in software to do this alone. I haven't done that sort of thing since my university days, and even then it was on greatly simplified circuits.


I suggest a fun and leisurely rewrite, and it should be a gate-level emulation. This has the potential to be 100% accurate. It is also important to keep it maintainable and upgradeable.

If we get some people who are willing to do the coding until it is really done, I support this approach.


Specifically, hooks into the debugger, fixed debug colours mode, disabling object graphics/collisions, PAL colour loss effect, runtime configuration of ystart and height, and probably some speed as well.

This is really the issue. It is not as simple as taking the emucore out and plunking in another one. I was thinking too about all the savestates, and the rewind function in the debugger. The whole emulator starts with emucore, and then you build features around it. Stella is in a very advanced state and to replace the core would require a major rewrite of the whole emulator. Once you switch from a behavioural to a component architecture you have to rethink how all the features built on top of the core should be integrated.

 

The best long term approach is probably a gate-level emulation, so this issue is fixed for good.

I agree, I'm just frightened because I think it will be a big project. An FSM with gate-level emulation is the way to go for the core.

 

If we get some people who are willing to do the coding until it is really done, I support this approach.

This. I love Stella and want to help, but my time commitments to programming have not been stable over the last year or so, and will probably be sporadic for at least the next year or so. But there is enough talent among all of us to accomplish this. We should start with block diagramming the whole TIA, and then discuss how the state machine should function. But saying all that, my time is not good over the next week or so.

Edited by Omegamatrix

Sometimes when I am half asleep I have the best ideas, but sometimes they are just stupid. :)

 

Anyway, here it goes: Could we just rewrite updateFrame() and leave everything else (almost) like it is now?

 

Inside updateFrame() we query the 6 objects (P0/1, M0/1, BL, PF). The objects handle everything else and update themselves whenever they get queried. The return value for each query would be a flag for the pixel being enabled and the color. By using priorities, we might even (partially) skip lower priority objects here, when a pixel is already enabled.

 

updateFrame() would just handle the pixel count and start a new scanline when it is due. Either we keep the current delay parameter, or, if that isn't sufficient, we inform the objects about the poke command when we query them. Finally, we move all TIA logic out of updateFrame() and into the objects, and call updateFrame() whenever we poke a TIA register.

 

So all the magic would be inside the 6 objects. And we could implement them one by one and maybe leave the existing logic until we replace it with an object. Also the object's implementation could start by using the existing tables and then step by step be adjusted to the new requirements.
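
The priority idea from this post could be sketched roughly as follows. This is only an illustration, not Stella or MESS code: `PixelState`, `TIAObject`, `SpanObject` and `resolvePixel` are made-up names, and `SpanObject` is a toy object that is simply "on" for a fixed pixel range.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical result of one query: is the object visible at this pixel,
// and which colour would it draw?
struct PixelState {
    bool enabled;
    std::uint8_t color;
};

// Hypothetical common interface for the six TIA objects.
class TIAObject {
public:
    virtual ~TIAObject() = default;
    virtual PixelState getState(int pixel) = 0;
};

// Toy object that is enabled on a fixed range of pixels (illustration only).
class SpanObject : public TIAObject {
public:
    SpanObject(int first, int last, std::uint8_t color)
        : myFirst(first), myLast(last), myColor(color) {}

    PixelState getState(int pixel) override {
        return {pixel >= myFirst && pixel <= myLast, myColor};
    }

private:
    int myFirst;
    int myLast;
    std::uint8_t myColor;
};

// Query the objects from highest to lowest priority and stop at the first
// one that is enabled. Returns the background colour if none is enabled.
template <std::size_t N>
std::uint8_t resolvePixel(const std::array<TIAObject*, N>& byPriority,
                          int pixel, std::uint8_t background) {
    for (TIAObject* obj : byPriority) {
        const PixelState s = obj->getState(pixel);
        if (s.enabled) return s.color;
    }
    return background;
}
```

Note that stopping early only works for the colour decision. Collision detection would still need the enabled flag of every object, which is presumably why the skipping can only be partial.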

 

For me this seems possible as far as I understand the existing code and the TIA requirements. But I am a bit afraid I am missing something important.

Edited by Thomas Jentzsch

This is really the issue. It is not as simple as taking the emucore out and plunking in another one. I was thinking too about all the savestates, and the rewind function in the debugger. The whole emulator starts with emucore, and then you build features around it. Stella is in a very advanced state and to replace the core would require a major rewrite of the whole emulator. Once you switch from a behavioural to a component architecture you have to rethink how all the features built on top of the core should be integrated.

 

Ah, I forgot about state saves and rewind. Also, the current code creates 2 'framebuffers' for the display code to use; this is how phosphor mode and blending works in the various rendering modes. If we go to a single framebuffer, then the rendering engines need to be modified too. As you say, it quickly spirals out of control.

 

All that being said, I think it would be relatively straight-forward to add all that back to another engine. But at that point, wouldn't it have been better to stick with what we have, and fix the remaining issues??


This is partly what the MESS code is doing now. There are methods to draw each object. Inside these methods, it basically just steps through pixels, one by one, and draws them based on the state at that point in time. That works great for missiles, the ball and playfield, since the changes to the registers they use are immediate. However, for the players, this drawing has to take timing into account, since the output can depend partly on the previous NUSIZx value, and partly on the (currently written) new NUSIZx value. This is where Stella fails; the drawing it does is based on the current NUSIZx only, not a combination of two values. MESS gets around that by pre-computing the effects of old and new NUSIZx and storing it in an array, which the player draw methods then iterate over.

 

I've also been thinking about ways around this. First, a little explanation of the 'masks' in Stella. These look complicated (and granted, I plan to simplify them a little if I stick with the code), but really all they are are bitmasks of the serial output from an object. So basically if a bit is set, the object is drawn, otherwise it's not. Very simple stuff. Where the complication comes in is that the player masks are pre-computed with the assumption that NUSIZx will not change midstream. So one possibility is to use the logic from MESS wrt player graphics to create a player bitmask on-the-fly for the current state. And of course, when the state changes (NUSIZx or RESPx write), the mask is updated accordingly. That would allow the current bitmask approach in Stella to keep working. And I'm convinced that there's nothing wrong with a bitmask approach; the only issue is that the bitmasks must be relevant to the current state at any point in time.
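
To make the "bitmask relevant to the current state" idea concrete, here is a rough sketch. It is not how Stella or MESS compute their masks: `buildPlayerMask` and `spliceMasks` are hypothetical helpers, the player start delay is ignored, GRPx is treated as all ones, and the mid-line change is modelled as a clean cut at one pixel, which is much cruder than the real old/new NUSIZx interaction described above.

```cpp
#include <bitset>
#include <cstdint>

constexpr int kLinePixels = 160;  // visible pixels per scanline
using LineMask = std::bitset<kLinePixels>;

// Build a one-scanline mask from the low three bits of NUSIZx.
// Simplification: no start delay, every player pixel is drawn, and
// `position` is assumed to be non-negative.
LineMask buildPlayerMask(std::uint8_t nusiz, int position) {
    // Copy offsets (-1 = unused) and pixel scale per number/size mode.
    static const int offsets[8][3] = {
        {0, -1, -1}, {0, 16, -1}, {0, 32, -1}, {0, 16, 32},
        {0, 64, -1}, {0, -1, -1}, {0, 32, 64}, {0, -1, -1}};
    static const int scale[8] = {1, 1, 1, 1, 1, 2, 1, 4};

    const int mode = nusiz & 0x07;
    LineMask mask;
    for (int off : offsets[mode]) {
        if (off < 0) continue;
        for (int i = 0; i < 8 * scale[mode]; ++i)
            mask.set((position + off + i) % kLinePixels);
    }
    return mask;
}

// Pixels before `changePixel` keep the old mask, the rest use the new one.
LineMask spliceMasks(const LineMask& oldMask, const LineMask& newMask,
                     int changePixel) {
    LineMask result;
    for (int x = 0; x < kLinePixels; ++x)
        result[x] = (x < changePixel) ? oldMask[x] : newMask[x];
    return result;
}
```

In this picture a NUSIZx or RESPx write would call something like `spliceMasks` at the current pixel, and the existing drawing code would keep reading the mask bit by bit exactly as it does now.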

 

The code is capable of being adapted for this. Remember, the original HMOVE functionality in Stella was table-based (ie, didn't take timing into account). This was replaced by the MESS way of doing things (calculating delays on the fly), but the mask approach still worked; it just used updated (correct) delays. I'm convinced that something similar can be done for player stuff.

 

So again, I'm back to the same place. Do we improve Stella, improve MESS (and add the stuff from Stella), or go with a complete gate-level rewrite?

 

P.S. I just heard back from Judge, and he's willing to work together to improve the TIA emulation (probably based on what MESS has). So depending on his schedule, maybe the decision is already made??


However, for the players, this drawing has to take timing into account, since the output can depend partly on the previous NUSIZx value, and partly on the (currently written) new NUSIZx value.

True, but if we write the new value to NUSIZ and then updateFrame, this would call all TIA objects, e.g. player0.getState(). The player0 object would compare the internally stored relevant TIA registers with the current TIA registers and identify the changes. Alternatively (faster, but less nice) each write to e.g. NUSIZ would actively inform all relevant TIA objects.
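
The two variants (the object detects the change itself vs. the TIA tells it) might look roughly like this. All names here are placeholders: `TIARegisters` stands in for wherever the TIA keeps its register copies, `syncWithRegisters()` is what a real getState() would call first, and the rebuild counter stands in for recomputing masks or counters.

```cpp
#include <cstdint>

// Hypothetical register file shared by all objects (only NUSIZ0 shown).
struct TIARegisters {
    std::uint8_t nusiz0 = 0;
};

class Player0Sketch {
public:
    explicit Player0Sketch(const TIARegisters& regs) : myRegs(regs) {}

    // Pull variant: compare the cached copy with the live register and
    // rebuild derived state only if the value differs. Returns true if a
    // change was detected.
    bool syncWithRegisters() {
        if (myCachedNusiz == myRegs.nusiz0) return false;
        myCachedNusiz = myRegs.nusiz0;
        ++myRebuilds;  // stand-in for recomputing masks or counters
        return true;
    }

    // Push variant: the TIA notifies the object on each register write.
    void handleRegisterUpdate(std::uint8_t value) {
        if (value == myCachedNusiz) return;
        myCachedNusiz = value;
        ++myRebuilds;
    }

    int rebuilds() const { return myRebuilds; }

private:
    const TIARegisters& myRegs;
    std::uint8_t myCachedNusiz = 0;
    int myRebuilds = 0;
};
```

The pull variant pays for a comparison on every query, while the push variant only does work on a write but requires every poke handler to know which objects care about which register.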

 

I've also been thinking about ways around this. First, a little explanation of the 'masks' in Stella. These look complicated (and granted, I plan to simplify them a little if I stick with the code), but really all they are are bitmasks of the serial output from an object. So basically if a bit is set, the object is drawn, otherwise it's not. Very simple stuff. Where the complication comes in is that the player masks are pre-computed with the assumption that NUSIZx will not change midstream. So one possibility is to use the logic from MESS wrt player graphics to create a player bitmask on-the-fly for the current state. And of course, when the state changes (NUSIZx or RESPx write), the mask is updated accordingly. That would allow the current bitmask approach in Stella to keep working. And I'm convinced that there's nothing wrong with a bitmask approach; the only issue is that the bitmasks must be relevant to the current state at any point in time.

Nothing wrong with that. We just have to move that into the TIA objects.

 

So again, I'm back to the same place. Do we improve Stella, improve MESS (and add the stuff from Stella), or go with a complete gate-level rewrite?

I think (and hope) my approach allows both. We move all logic which is scattered around the code into the objects and then change the logic inside the objects. We can start with the existing tables and tweaks and decide step by step how to continue.

 

Having the logic in separate object classes would also make parallel development easier.

Edited by Thomas Jentzsch

Sometimes pseudo code helps me convey my ideas:

 

Inside the TIA class, we define the following classes

   class AbstractTIAObject 
   {
   public:
       virtual ~AbstractTIAObject() = default;
       // Triggers an update of the object until the current clock, returns non-zero if the current pixel is enabled (TODO: color, priorities).
       virtual uInt8 getState() = 0;
       // Informs the object that a TIA register has been updated. The object decides if and how to handle it.
       virtual void handleRegisterUpdate(uInt8 addr, uInt8 value) = 0;
   };

   class AbstractPlayer : public AbstractTIAObject
   {
       // special player logic in here
   };

   class Player0 : public AbstractPlayer
   {
   };

   class Player1 : public AbstractPlayer
   {
   };

   class AbstractParticle : public AbstractTIAObject
   {
       // special missile and ball logic in here
   };


   class Missile0 : public AbstractParticle 
   {
   };

   class Missile1 : public AbstractParticle // maybe aggregate a common player/missile abstract class
   {
   };

   class Ball : public AbstractParticle // maybe aggregate a common player/missile abstract class
   {
   };

   class Playfield : public AbstractTIAObject
   {
   };

 

And in case we poke to NUSIZ0 we simply do:

   case NUSIZ0: 
   {
       // notify the objects that depend on NUSIZ0 before storing the new value
       myPlayer0.handleRegisterUpdate(addr, value);
       myMissile0.handleRegisterUpdate(addr, value);
       myNUSIZ0 = value;
       updateFrame();
       break;
   }

 

In updateFrame we do:

   ...
   enabled = myPlayer0.getState() | myPlayer1.getState() ... | myPlayfield.getState();
   ...

Edited by Thomas Jentzsch

Sometimes pseudo code helps me convey my ideas

 

OK, makes perfect sense. When I'm thinking of masks vs. arrays, specific pixels, etc, that's all abstracted away in the classes you mention. This is definitely the better 'object-oriented' way to do it, and can accommodate both mask and non-mask approaches. I can easily see it now.

 

And even if we decide to fix MESS instead of Stella, this concept is still needed, since one of the main things Judge wanted to fix was making the objects a state machine. Currently, all the objects are drawn into one array, and copied out into the framebuffer array. This would be the better way to do it. And if sticking with Stella, anything that makes updateFrame() easier to follow would be welcome.


So, would it make sense to create a branch where we can start refactoring the TIA code and then start working on the details?

 

Are you volunteering to start work on this, or at least act as the lead on the restructuring? :)

 

I ask because I can make a new branch in about 30 seconds if you like.


Well, my C++ is a bit rusty (10 years or so), but I think I could start.

 

But someone has to review what I am doing. :)

 

BTW: I might get interrupted when Nathan gets the final graphics done for Star Castle.

Edited by Thomas Jentzsch

OK, there's a new branch named 'new-tia' just created. You will have to check out from this branch instead of the current 'trunk' that you're looking at now. To do this, refer to the SVN instructions on the Stella Development page, but change the address as follows:

 

And have at it :D


What I meant with my statement is that I wouldn't know how to build the framework. Would we handle the various clocks in parallel? If we go for an object-based approach, would we just ask all objects pixel by pixel if they are on and with which color? Wouldn't this become too slow?

 

I haven't written a digital simulator yet, but yes something along these lines. The basic idea would be to encapsulate functional TIA units in classes like the Player Graphics Scan Counter, Player Position Counter, Missile Position Counter, etc. A class would itself work as a state machine of the respective unit.

 

For example, for the scan counter the state would be the contents of the 5 flops, also some variables would be needed to buffer its inputs. According to the input signals there would be methods to be called when an input line changes (pixelclock, start, nz0, nz1, nz2, ...) and methods to query the output lines (GS0, GS1, GS2,...).
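
Just to show the shape of such a class (input lines as setters, output lines as getters, state in member variables), here is a behavioural stand-in. It is NOT derived from the TIA schematics: the real scan counter has five flops and the NUSIZ inputs, whereas this hypothetical `ScanCounterSketch` only has a 3-bit pixel index and a running flag, and reacts to the rising clock edge only.

```cpp
#include <cstdint>

// Behavioural stand-in for a player graphics scan counter (illustration
// only, not gate-accurate and not taken from the schematics).
class ScanCounterSketch {
public:
    // Buffered input line: latched here, acted on at the next clock edge.
    void setStart(bool start) { myStart = start; }

    // Input line PCK: state changes only on the rising edge.
    void setPCK(bool level) {
        const bool rising = level && !myClock;
        myClock = level;
        if (!rising) return;

        if (myRunning) {
            if (myCount == 7) myRunning = false;  // all 8 pixels shifted out
            else ++myCount;
        } else if (myStart) {
            myRunning = true;
            myCount = 0;
        }
    }

    bool isRunning() const { return myRunning; }

    // Output lines GS0..GS2 packed into bits 0-2.
    std::uint8_t getGS() const { return myCount; }

private:
    bool myStart = false;
    bool myClock = false;
    bool myRunning = false;
    std::uint8_t myCount = 0;
};
```

A gate-level version would replace the body of setPCK() with the actual flop equations, but callers would not need to change.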

 

Having the position counter work together with the scan counter might be something like:

 

   PlayerPositionCounter poscnt;
   PlayerGraphicsScanCounter scancnt;
   ...
   for each pixel
   {
       ...
       pixelclock = 1; // rising clock

       poscnt.setPCK(pixelclock);
       start = poscnt.getStart();

       scancnt.setStart(start);
       scancnt.setPCK(pixelclock);

       pixelclock = 0; // falling clock

       ... // same as above

       gs[] = scancnt.getGS(); // current player pixel number (3 bit)
       p0 = someUnit(gs); // player serial graphics.

       output(p0);
   }

 

That's a very crude example, it's more complicated but you get the idea. And yes it would be slow. However, there is a lot of room for optimization. Like there could be shortcuts if the counter is 0 and start is not triggered.

 

I would probably try to write a ScanCounter class first, then try to plug it into the current code, then the position counter, and so on. I would agree that a top-down approach and starting a rewrite from scratch would be cleanest, but it would take a long time to have something working. Doing it bottom-up, i.e., making the current code more modular while making the emulation/simulation granularity finer, would yield results more quickly.

 

I don't know if and when I would have time to work on this. But the good thing is, it would not conflict with current refactoring efforts. If there would be classes for all TIA graphics objects, it would be even easier to plug something like this in.


I need a user and password to commit.

 

Oops, I forgot about that. To have commit access, you need a Sourceforge account (ie, username and password). Once you have that, I can get you added to the project.

 

If you already have one, please forward your username. Otherwise, you'll need to go to https://sourceforge....er/registration and set up an account. Again, once you have it, send me the username.


I like the idea of a gate-level emulation, although it wouldn't necessarily need to emulate *every* gate-- for example, there are places where there are two or three inverters in a row, with the second and third usually forming a "super inverter" if I understand correctly, so there's no overall effect on the logic, just on the signal level and/or rise-fall timing (again, if I understand correctly).

 

I've actually been piddling around off and on (no pun intended) with a spreadsheet simulation of the TIA, so I've had to teach myself how to read the schematics-- but sadly, that means my teacher only knows as much as I do! ;) It would be interesting to emulate the TIA at the core level, because then the randomness of its initial state could be emulated as well. For example, my spreadsheet has helped me to see that the phi-0 output doesn't "settle down" to a proper signal until the horizontal position counter gets reset and triggers a "reset phi-0" signal (which explains why that "reset phi-0" signal line is needed in the first place). Also, the first two states of the horizontal sync counter (and likewise the other counters) can be disconnected or out-of-sequence-- by which I mean the second state isn't determined by the first state, since each "bit" of the counter is driven by an H-phi-1 gate and an H-phi-2 gate, with each gate having a separate random initial state, hence the first pattern of bits (determined by the random settings of the H-phi-2 gates) and the second pattern of bits (determined by the random settings of the H-phi-1 gates) don't necessarily follow the normal sequence, although the sequence does proceed normally from the second bit pattern onward.
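
The "first two states are unrelated, then the sequence runs normally" effect can be reproduced with a very small model. This is a generic two-phase latch pair, not the TIA's counter: `TwoPhaseCounter` and `firstStates` are made-up names, and a plain 4-bit binary increment stands in for the real counter logic.

```cpp
#include <cstdint>
#include <vector>

// Generic two-phase register pair: a phi-1 latch feeds a phi-2 latch, and
// the next value is computed from the phi-2 output. The increment is an
// assumption standing in for the TIA's real counter logic.
class TwoPhaseCounter {
public:
    // Both latches power up with independent, arbitrary contents.
    TwoPhaseCounter(std::uint8_t phi1Init, std::uint8_t phi2Init)
        : myPhi1(phi1Init & 0x0F), myPhi2(phi2Init & 0x0F) {}

    std::uint8_t output() const { return myPhi2; }

    // One full cycle: phi-2 copies the phi-1 latch, then phi-1 captures
    // the next count computed from the new output.
    void cycle() {
        myPhi2 = myPhi1;
        myPhi1 = static_cast<std::uint8_t>((myPhi2 + 1) & 0x0F);
    }

private:
    std::uint8_t myPhi1;
    std::uint8_t myPhi2;
};

// Collect the first n visible states after power-up.
std::vector<std::uint8_t> firstStates(TwoPhaseCounter c, int n) {
    std::vector<std::uint8_t> states;
    for (int i = 0; i < n; ++i) {
        states.push_back(c.output());
        c.cycle();
    }
    return states;
}
```

With the phi-1 latch starting at 9 and the phi-2 latch at 3, the visible states are 3, 9, 10, 11: the first comes from the phi-2 latch, the second from the phi-1 latch, and only from there on does each state follow from the previous one.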

 

However, there's a possible problem with a gate-level emulation-- namely, signal delays due to cumulative gate delays, plus the delay between the outgoing phi-0 signal and the incoming phi-2 signal. This can probably be worked around by using the TIA timing diagrams for the critical spots, but the point is that some of the signals will be out of synch with each other, since they don't change states at the same times.

 

To be accurate, I think the emulation should simulate all of the TIA's inputs and outputs. I see there's been mention of the graphics objects, but don't forget the blanking signal, sync signal, lum signals, and color signal, as well as the two audio signals, plus the ready signal, etc.


Three inverters in a row are typically used either for delay and making sure a signal gets to a point at a certain time, or as a buffer to ensure a solid pull high or low, while still completing the inversion. In rarer cases it is a form of noise isolation, or a source of noise.

Edited by Keatah

Doing it bottom-up, i.e., making the current code more modular while making the emulation/simulation granularity finer, would yield results more quickly.

 

I don't know if and when I would have time to work on this. But the good thing is, it would not conflict with current refactoring efforts. If there would be classes for all TIA graphics objects, it would be even easier to plug something like this in.

It may take a few days until it is done, but I am working on it. In the meantime (maybe tomorrow or so) you can check out the branch too and have a look at it to see if the refactoring suits your needs or if I should change something.

