
ANTIC decap and reverse engineering


ijor


Woohoo! I just discovered one of my 800's also has a CTIA, so I can definitely spare one. I've bought a few machines on ebay recently and hadn't checked them out thoroughly.

Nice find!

Yeah, I was hoping to have one and here I have two. Back in the '80s they'd think me mad for being excited about CTIA. :) I suppose it's still a little pathetic today, but it's like an archeological dig. :D

 

What I want to know is if CTIA contains GTIA circuitry with the register bits disabled. That would confirm that CTIA was only shipped because GTIA was behind schedule.


I did a test, and it does look like you can DMA into the players from the playfield. I'm not sure what to do with this besides trying to emulate it -- you can only DMA into 2 players + missiles at most, and the highest DMA rate interferes with LMS instructions -- but maybe someone can figure out a creative use for it in a demo.

 

Phaeron,

 

My (very personal) opinion is that emulation should be as accurate as possible, disregarding how useful a particular emulated feature is. The ultimate goal should be that software won't be able to distinguish between emulator and real hardware.

 

Of course, there is the question of how much effort it is worth, and in this case it is certainly your call. But it seems to me that, if you are going to all the trouble of somehow emulating the "DMA clock" and the HSCROL/DMA tricks, then this particular issue of multiple concurrent DMA doesn't look like the hardest part.


My (very personal) opinion is that emulation should be as accurate as possible, disregarding how useful a particular emulated feature is. The ultimate goal should be that software won't be able to distinguish between emulator and real hardware.

 

You and I totally agree on this point. That's part of the reason I started Acid800 -- it was clear that the hardware behaviors I was starting to look at weren't intentionally used by anyone and were going to be hard to verify just through software testing. However, even if they aren't worth using, they're worth emulating just so people know if they're doing something wrong before it gets tested on real hardware.

 

Of course, there is the question of how much effort it is worth, and in this case it is certainly your call. But it seems to me that, if you are going to all the trouble of somehow emulating the "DMA clock" and the HSCROL/DMA tricks, then this particular issue of multiple concurrent DMA doesn't look like the hardest part.

 

I do want to eventually emulate all of this, but I have to get a better idea of the exact trigger conditions and what portions of the existing ANTIC code I have to rewrite. I think the first priority is to emulate the playfield scan address getting skewed by writes to HSCROL just after HBLANK, as that's one that's actually screwed people up in the wild. To do that, I need to write some tests to figure out the critical cycle for writing HSCROL on a wide fetch line. After that, it'll probably be mode 2-7->8-9 corruption, then image-exact output (since it's program visible via GTIA collisions).

 

The hard part implementation-wise is trying to keep as much as possible out of the per-cycle loop. Altirra's primary performance bottleneck is the main cycle loop between ANTIC and the 6502, both of which are running single cycle interleaved in order to support things like cycle-exact mid-screen writes to playfield memory. Anything I put in there noticeably slows down the emulation, which means I have to be pretty creative at times.


Hey, Guess what! I have a CTIA in one of my 400's (POKE 623,64 does nothing!). I'd be willing to donate it if someone wants to decap it for comparison with GTIA.

 

Very nice, and I am willing to accept a CTIA unit for decapping. But it would be a pity to "spoil" a CTIA considering they are so rare. Maybe we should make a last attempt at asking Curt to release the original CTIA schematics. Curt, are you reading this?

 

What I want to know is if CTIA contains GTIA circuitry with the register bits disabled. That would confirm that CTIA was only shipped because GTIA was behind schedule.

 

So are you saying that you suspect that CTIA already had "GTIA" functionality, but it didn't work correctly and was then disabled?

 

Maybe, but I'm not so sure. When I checked the (barely readable) GTIA schematics, I had the feeling that the "GTIA" functionality was mostly a quick patch, hinting that initially there were schematics for a chip without GTIA functionality.


Very nice, and I am willing to accept a CTIA unit for decapping. But it would be a pity to "spoil" a CTIA considering they are so rare. Maybe we should make a last attempt at asking Curt to release the original CTIA schematics. Curt, are you reading this?

 

Well, I'll keep my CTIA's around in case we need them.

 

So are you saying that you suspect that CTIA already had "GTIA" functionality, but it didn't work correctly and was then disabled?

 

Yeah, the question is did CTIA exist first, or was GTIA the target all along. If none of the GTIA stuff is present in CTIA, then it would suggest that CTIA is an earlier completed design and GTIA is a reworking of CTIA. If CTIA contains traces of GTIA features, then it would suggest that CTIA is just an incomplete GTIA.

 

Maybe, but I'm not so sure. When I checked the (barely readable) GTIA schematics, I had the feeling that the "GTIA" functionality was mostly a quick patch, hinting that initially there were schematics for a chip without GTIA functionality.

 

In that case, I wonder if the reason for releasing GTIA was because Atari wanted the new modes out there or if they were mainly concerned about fixing other CTIA issues (like pixel alignment) and the GTIA modes were just a nice bonus George McLeod had added.


Fascinating thread!

 

Incidentally, I am interested in cloning the Atari 800XL in FPGA. This needs many things to be working before one even gets a boot-up screen, so I started by cloning the simpler Acorn Atom. I deliberately designed the basic video timing to match the Atari, so that I would not need to start from scratch. See:

 

I wanted to implement the 8-bit Atari in an FPGA a few years ago:

 

http://repo.or.cz/w/AtosmChip.git

 

Quite a lot of the logic seems to be implemented, but most of it is untested. I'm sure there are lots of bugs, small and large.

 

It's in Verilog and was mostly an exercise for me in implementing something in this language. I simulated this code with Icarus Verilog; I've never tried to synthesize it for a real FPGA chip. The chips were not supposed to be replacements for the original ones. They are not compatible with the original bus; I used Wishbone instead:

 

http://opencores.org/opencores,wishbone

 

Maybe I'll continue this project some time this year, but I can't say anything for sure now.


Fascinating thread!

 

I wanted to implement the 8-bit Atari in an FPGA a few years ago:

 

http://repo.or.cz/w/AtosmChip.git

 

Quite a lot of the logic seems to be implemented, but most of it is untested. I'm sure there are lots of bugs, small and large.

 

Aye, there's the rub.

 

A lot of projects get mostly done, but not completely done.

I heard some guys got most of the BBC micro done, but that was a decade ago and still nobody has done it completely.

It is a bit like doing a crossword. We can quickly get all the easiest bits done, then the last clues take most of the time and effort.


  • 1 year later...

I've been doing some research into ANTIC's unstopped playfield DMA bug, and I have a question about the logic not covered by the schematic: are the addresses fully decoded for all of the line buffer RAM cells? I would have thought that about a third of the cells would be partially decoded, but after running a test app it looks like cell addresses 48-62 are unresponsive on the internal data bus. The test app is showing a solid band of color in the middle that indicates the 6-bit address LFSR is cycling through a full 63 address sequence and that about a quarter of it is coming back $FF no matter what I load into the line buffer. I assume this also means that any writes to those addresses would be dropped as well.

 

This also makes me wonder why they bothered to add the 48th RAM cell since as far as I can tell the timing only ever allows it to display garbage.


How does the memory scan update @ end of scanline though... does it rely on DMA incrementing it or is it <previous line start + 32/40/48> type of thing?

 

I guess regardless of method, 48 is a nice round number for programming purposes - not a power of 2 but more program friendly in some cases.


That one I do know: the memory scan counter increments with each DMA cycle or where each DMA cycle would be. There is at least one game that has a scanline where the playfield width is different at the start and end resulting in an unusual memory scan length (Aztec Challenge).


I've been doing some research into ANTIC's unstopped playfield DMA bug, and I have a question about the logic not covered by the schematic: are the addresses fully decoded for all of the line buffer RAM cells?

 

Yes, the internal RAM address is always fully decoded. Each RAM row (one byte) is selected by a single address only. As implied by the schematics (but you are right, not explicitly enough), each row is selected by a wide NOR that is activated by a single, unique combination of the address signals.

 

The test app is showing a solid band of color in the middle that indicates the 6-bit address LFSR is cycling through a full 63 address sequence...

 

Yes, the LFSR logic has a full 63-stage sequence. Under "normal" ANTIC operation, the LFSR is reset at the start of the line and never advances past the 48th stage. Each of the first 48 stages (after LFSR reset) selects one RAM row; the other 15 stages (there is no stage with all bits zero) don't select any RAM row.

 

So if ANTIC, for some reason, attempts to access (read or write) internal RAM at any of the 15 last stages of the LFSR, no actual RAM would be selected.
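
To make the addressing concrete, here is a minimal Python sketch of a 63-stage sequence in which only the first 48 stages select a row. The feedback taps and the reset value are assumptions chosen only because they give a maximal-length sequence; ANTIC's real taps and reset state are not stated in this thread. The $FF read value for unselected stages comes from the test results reported earlier, and all function names are hypothetical.

```python
# Sketch of the line buffer addressing described above.
# ASSUMPTIONS: the feedback taps and the reset value are illustrative only;
# the thread does not give ANTIC's actual LFSR taps or reset state.

RESET_STATE = 0b111111   # hypothetical reset value (any nonzero state works)
RAM_ROWS = 48            # number of LFSR stages that select a line buffer row
UNSELECTED_READ = 0xFF   # value observed on real hardware for unselected stages


def next_state(state: int) -> int:
    """Advance a 6-bit Fibonacci LFSR with feedback = bit5 XOR bit4."""
    feedback = ((state >> 5) ^ (state >> 4)) & 1
    return ((state << 1) | feedback) & 0x3F


def lfsr_sequence(seed: int = RESET_STATE) -> list[int]:
    """Return every state visited before the sequence returns to the seed."""
    states = [seed]
    state = next_state(seed)
    while state != seed:
        states.append(state)
        state = next_state(state)
    return states


SEQUENCE = lfsr_sequence()
# Fully decoded: each of the first 48 stages maps to exactly one row.
ROW_FOR_STATE = {state: row for row, state in enumerate(SEQUENCE[:RAM_ROWS])}


def read_line_buffer(buffer: list[int], state: int) -> int:
    """Read the row selected by an LFSR state, or $FF if no row is selected."""
    row = ROW_FOR_STATE.get(state)
    return UNSELECTED_READ if row is None else buffer[row]


def write_line_buffer(buffer: list[int], state: int, value: int) -> None:
    """Write a row; a write at an unselected stage is silently dropped."""
    row = ROW_FOR_STATE.get(state)
    if row is not None:
        buffer[row] = value & 0xFF
```

Stepping `read_line_buffer` through all of `SEQUENCE` returns the 48 stored bytes followed by 15 reads of $FF, after which the sequence wraps around to the first row again.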

 

I admit I was a bit lazy on that part of the schematics. I would need somehow to make this more clear and explicit.


Hardly anyone could look at this schematic and think of you as lazy. :)

 

I've attached the test app I used to check this. Here's what it looks like on a real Atari (it does not work on any emulator... yet):

 

post-16457-0-53157500-1334985831_thumb.png

 

This program loads the line buffer using a wide mode E line which has a single $FF byte sweeping across it, then changes HSCROL mid-way through to activate the DMA bug for the next mode line. Playfield DMA is disabled for the first scanline to prevent ANTIC from reloading the line buffer, and then vertical scrolling is used to stretch the mode line so it is visible. Because ANTIC is reading at double rate here (every cycle), it runs through the line buffer about a third of the way across. The yellow line is the 48th cell which contains bus data, and after that are the 15 unassigned entries. After that, the addressing sequence repeats and a double appears on the right side. Thus there are two animating bars sweeping across the lower line. The program was meant to eventually decode the aliased address mappings for the tail of the address sequence, but with this behavior that turned out to be unnecessary.

 

I assume based on the stability of the bar and its independence from the 48th cell that the internal bus is not floating during that time.

 

Sadly, I haven't been able to think of a practical use for this bug. It takes up way too much memory bus time and the existing modes already max out bandwidth over the ANx bus, relegating it to the "stupid hardware tricks" category.

lineaddress.zip


  • 4 years later...

I uploaded a zip file with all the stuff except the huge high rez image:

 

http://vapi.fxatari.com/Decap/AnticReSchemLayout.zip

 

It includes:

- Updated schematics (just minor cosmetic updates)

- Floorplan

- Re digitized Layout in PNG format (single layer)

- Re digitized Layout in SVG format

 

The SVG file has separated layers and shapes. It can be rendered by most modern browsers, but it is meant to be used with illustration software, like Inkscape, Corel Draw or Adobe Illustrator.

 

The high rez full die is still available separately as posted in a previous message.

 


I assume based on the stability of the bar and its independence from the 48th cell that the internal bus is not floating during that time.

 

I just noticed that I never answered that. It is precharged, so it should be stable.

 

Btw, you made me realize I made a small mistake in the schematic. It doesn't affect functionality, just that ram write wouldn't work reliably like that ...

 

The wiring of the output enable of the ram write drivers (the tristate inverters that drive the ram columns when writing) is wrong. The output enable is the same, it is actually RamWe, for both the positive and the inverted drivers.


After some time I am trying to reconnect with my Atari activities ...

 

Antic high resolution die image mosaic.

WARNING. ~160MB download. Intentionally posted like this, you have to join the domain and the path below:

 

http://vapi.fxatari.com/

Decap/AnticMosaicFull.png

 

Will upload some more stuff shortly.

Beautiful image. Real shame about the small hole in the top. Was the chip damaged while de-capping?

 

P.S.

I have a 4k 28" Samsung monitor. Text is WAY too small at that res, but for these images I cranked the monitor to the max (I usually have to run it at 2560*1440). 3840*2160 is killer for these kinds of shots!


Beautiful image. Real shame about the small hole in the top. Was the chip damaged while de-capping?

 

No. The image is a composite mosaic (a stitch, or panorama if you want). It is composed of about 50 individual shots. A couple of shots were taken a little bit too far from their neighbors, creating this hole in the mosaic.

 

Fortunately the missing area is simple and not very dense. It was easy to fetch the missing pieces from the low rez images (see the one at the top of this thread).

 

Yeah, I realize it ruins the artistic aspect of the image. Sorry about that :)


  • 2 years later...

Hi all. I have just found this thread with the ANTIC schematic. I am so excited that this was created!

I am going through the schematic and trying to understand some of ANTIC's behavior, but so far I have only partially succeeded. At a high level I can see support for a lot of the ANTIC functionality I know about. When I go into deeper detail, however, I fail to verify ANTIC's behavior.

A good example of what I cannot figure out is sprite data fetching. It should happen at the beginning of the scanline, together with display list fetching. I can see (pg. 4, PLAYER MISSILES AND DISPLAY LIST CONTROL) how bits Hrz0-2 of the horizontal counter are connected to address bus lines A8-10. I can also see that the rest of the address comes from PMBASE and VCOUNTER. This is the "high level" confirmation of functionality I mentioned above.

However, if I try to verify how A8-10 are derived from Hrz0-2, I fail: the address I calculate for the first 8 cycles of the scanline is not what it is supposed to be.
I assume my problem is that I do not understand how nHrzX is transformed into pHrzX/pnHrzX, and what it means when a signal goes through a "circle connected to S01" (I understand S01 to be a clock signal).

I have been struggling with this quite a lot without success :(

Could somebody please help me understand how this particular fragment of the ANTIC schematic works? I hope it will help me read the other parts too ;)


I've not looked at any generated schematic of the modern day but, for address generation...

 

High bits would be latched via the PMBASE register. Then there'd be a bit or two dependent on 1 or 2-line DMA mode, followed by 2 bits determined by which object is being fetched with the lowest bits determined by scanline.

Edited by Rybags

My naive non-hardware-engineer-probably-horribly-wrong understanding:

 

The points where the clocks intersect signals with circles are where the logic is split into clocked domains on the two opposite phases of the clock, so that transitions across the gate will only occur on a specific clock phase. On the other phase the input will float and hold the last value. All the logic in between is expected to re-evaluate from inputs to new outputs in under a half-cycle.

 

The horizontal timing is controlled by the PLA right below "player missile and display list control." It is wired to assert the signals on the following timing:

  • Missile DMA request (msReq2/msReq3): hpos[0:2]=6. There are two lines that effectively OR DMACTL bits 2 and 3 together through partially redundant logic. This implements the behavior that missile DMA is active whenever player DMA is enabled even if missile DMA isn't.
  • Display list IR DMA request (enLdIR): hpos[0:2]=7.
  • Player DMA request (plyReq): hpos[0:2]=0-3.
  • Display list LMS address DMA request (mscanReload): hpos[0:2]=4-5, valid mode line (IR[0:3] >= 2), and IR bit 6 = 1.
  • Display list jump address DMA request (dlJump): hpos[0:2]=4-5, invalid mode line (IR[0:3] < 2), and IR bit 0 = 1. Thus, this handles both jump and JVB instructions.

All are also gated on not being in vblank (think of all inputs as inverted inputs to an AND). Working out the logic with all of the inversions is annoying; you have to apply De Morgan's laws a lot to flip between AND/OR duals of inverted and non-inverted logic.

 

The horizontal position values are a bit weird because the horizontal counter within ANTIC doesn't have the values you would expect from the public documentation or GTIA position values. It counts from 0-113, but the eight special cycles at the beginning of the scanline are numbered 6-13, which you can see from the PLA next to the horizontal counter: two of the output lines turn on and off an R/S flip flop during this time to generate the special DMA (nSpcDma) signal. This also explains why VCOUNT seems to increment a bit early, because that's around the time that the horizontal counter rolls over.
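
The timing list and the 6-13 numbering of the special cycles can be summarized in a small Python sketch. This is a behavioral model of the description above and not a transcription of the PLA: the function name and its return strings (borrowed from the signal names in the list) are illustrative, the nSpcDma window is reduced to a simple range check, and other enables that the list does not mention are left out.

```python
from typing import Optional

SPECIAL_DMA_FIRST = 6    # internal horizontal counter value of the first special cycle
SPECIAL_DMA_LAST = 13    # internal horizontal counter value of the eighth special cycle


def special_dma_request(hcount: int, ir: int, dmactl: int,
                        vblank: bool = False) -> Optional[str]:
    """Return which DMA request would be raised on a given counter value.

    hcount is the internal horizontal counter (0-113), ir the display list
    instruction register, dmactl the DMACTL register. Only the gating that
    is described in the post is modelled here.
    """
    # Everything is gated on not being in vblank and on the special DMA window.
    if vblank or not (SPECIAL_DMA_FIRST <= hcount <= SPECIAL_DMA_LAST):
        return None

    slot = hcount & 0b111    # hpos[0:2]
    mode = ir & 0x0F         # IR[0:3]

    if slot == 6:
        # Missile DMA runs if either DMACTL bit 2 or bit 3 is set.
        return "msReq" if dmactl & 0b1100 else None
    if slot == 7:
        return "enLdIR"
    if slot <= 3:
        return "plyReq"

    # Slots 4-5: either the LMS address or the jump address.
    if mode >= 2 and ir & 0x40:
        return "mscanReload"
    if mode < 2 and ir & 0x01:
        return "dlJump"
    return None


if __name__ == "__main__":
    # A mode 2 line with LMS ($42), with player and missile DMA enabled.
    for hcount in range(SPECIAL_DMA_FIRST, SPECIAL_DMA_LAST + 1):
        print(hcount, special_dma_request(hcount, ir=0x42, dmactl=0x0C))
```

Walking the counter from 6 to 13 yields the order missile, display list instruction, players 0-3, then the two address bytes.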

 

For players, the high address bits are pushed to the address bus as follows (hpos[2] = 0):

  • A10/A9: not hpos[2] = 1
  • A9/A8: hpos[1]
  • A8/A7: not plySel0 = not (not hpos[2] xor hpos[0]) = hpos[0]

Thus, for the eight equal-size slots in the 1K or 2K P/M graphics block, players 0-3 DMA from the last four blocks in sequence. For missiles (hpos[0:2] = 6):

  • A10/A9: not hpos[2] = 0
  • A9/A8: hpos[1] = 1
  • A8/A7: not plySel0 = not (not hpos[2] xor hpos[0]) = 1. The plySel0 logic is to force this bit high for missiles.

Missiles therefore DMA from slot 3, right below player 0.
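
The bit equations above can be checked with a short Python sketch. `pm_slot` follows the three equations directly. `pm_address` additionally assumes the publicly documented layout for the remaining bits (PMBASE supplies the top bits, the scanline supplies the low bits), which is not taken from the schematic; both function names are hypothetical.

```python
def pm_slot(hpos_low3: int) -> int:
    """Return the P/M slot number (0-7) for a given hpos[0:2] value."""
    h0 = hpos_low3 & 1
    h1 = (hpos_low3 >> 1) & 1
    h2 = (hpos_low3 >> 2) & 1

    not_h2 = h2 ^ 1
    ply_sel0 = not_h2 ^ h0          # plySel0 = (not hpos[2]) xor hpos[0]

    top = not_h2                    # A10 (single-line) / A9 (double-line)
    mid = h1                        # A9 / A8
    low = ply_sel0 ^ 1              # A8 / A7 = not plySel0
    return (top << 2) | (mid << 1) | low


def pm_address(pmbase: int, hpos_low3: int, scanline: int,
               single_line: bool) -> int:
    """Build a full P/M DMA address.

    ASSUMPTION: low address bits follow the public documentation, i.e.
    256-byte slots indexed by scanline in single-line mode and 128-byte
    slots indexed by scanline // 2 in double-line mode.
    """
    slot = pm_slot(hpos_low3)
    if single_line:
        return ((pmbase & 0xF8) << 8) | (slot << 8) | (scanline & 0xFF)
    return ((pmbase & 0xFC) << 8) | (slot << 7) | ((scanline >> 1) & 0x7F)


if __name__ == "__main__":
    print([pm_slot(h) for h in range(4)])   # players 0-3
    print(pm_slot(6))                       # missiles
    print(hex(pm_address(0x20, 0, 0x10, single_line=True)))
```

For hpos[0:2] = 0-3 this gives slots 4-7, and for hpos[0:2] = 6 it gives slot 3, matching the conclusion above.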


Hi phaeron, thank you for your response.


If I got you right, the following diagram 'in---NOT---(S01)---out' can be understood as a simple negation 'out = NOT in', provided this is during the "specific clock phase".

Then this diagram 'in---(S02)---NOT---(S01)---NOT---out' where 'S02 = not S01' should work as signal delay 'out(t) = in(t-1)'. Did I get it right?


I completely agree with your explanation of DMA signals. I was confused by unexpected horizontal counter values for DMAs and completely missed the 'nSpcDma' gate signal.

Now I have much better picture of this part. This is great, thank you.


Still, if I follow your explanation about the P/M address it works nicely in your email, but I cannot see support for this in the schema.

  • e.g. you say "A10/A9: not hpos[2]" and in the schema I see "nHrz2---(S01)---NOT---pHrz2---A10/A9"
I read it as 'pHrz2 = not nHrz2' => pHrz2 is the same as Hrz2 (not sure why the 'p' prefix is used). In your notation that means "A10/A9: hpos[2]", and this doesn't work. What am I missing here?

Hi phaeron, thank you for your response.

 

If I got you right, the following diagram 'in---NOT---(S01)---out' can be understood as a simple negation 'out = NOT in', provided this is during the "specific clock phase".

Then this diagram 'in---(S02)---NOT---(S01)---NOT---out' where 'S02 = not S01' should work as signal delay 'out(t) = in(t-1)'. Did I get it right?

Yes, where you see multiple inverters separated by gates connected to the clock there are going to be half cycle delays. As I understand it, each span of logic between opposite clock phases should take half a cycle. I've had bad luck trying to correlate exact delays in the schematic against measurements, though.
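
As a rough illustration of the 'in---(S02)---NOT---(S01)---NOT---out' chain, here is a small Python model. It treats each clocked circle as a pass gate that is transparent while its clock is high and holds the last value otherwise, as described in the earlier explanation. The class and function names are hypothetical, real propagation delays are ignored, and the starting values of the held nodes are chosen arbitrarily.

```python
class PassGateInverter:
    """A clocked pass gate followed by an inverter.

    While the clock is high the input passes through; while it is low the
    internal node keeps (floats at) the last value it was given.
    """

    def __init__(self, initial_node: int) -> None:
        self.node = initial_node

    def step(self, clock_high: bool, value_in: int) -> int:
        if clock_high:
            self.node = value_in
        return self.node ^ 1


def simulate(half_cycle_inputs: list[int]) -> list[int]:
    """Feed one input value per half-cycle; return the output per half-cycle.

    Even half-cycles have S02 high, odd half-cycles have S01 high
    (S02 = not S01).
    """
    first = PassGateInverter(initial_node=0)    # clocked by S02
    second = PassGateInverter(initial_node=1)   # clocked by S01
    outputs = []
    for index, value in enumerate(half_cycle_inputs):
        s02_high = index % 2 == 0
        middle = first.step(s02_high, value)
        outputs.append(second.step(not s02_high, middle))
    return outputs


if __name__ == "__main__":
    print(simulate([1, 0, 0, 1, 1, 0]))
```

In this model the two inversions cancel, and the output only changes during an S01 phase, where it takes the value the input had during the preceding S02 phase.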

 

Still, if I follow your explanation about the P/M address it works nicely in your email, but I cannot see support for this in the schema.

  • e.g. you say "A10/A9: not hpos[2]" and in the schema I see "nHrz2---(S01)---NOT---pHrz2---A10/A9"
I read it as 'pHrz2 = not nHrz2' => pHrz2 is the same as Hrz2 (not sure why the 'p' prefix is used). In your notation that means "A10/A9: hpos[2]", and this doesn't work. What am I missing here?

 

'p' is probably for positive, to contrast against 'n' for negative.

 

There is a NAND gate between pHrz2 and A10out/A9out -- the circle at the end indicates that the output is inverted:

post-16457-0-23317900-1553924839.png

 

The address driver block then contains NOT and NOR gates which cancel out, so A10out/A9out go out A10/A9 on the address bus uninverted.

 

For A9out, the output will be the AND of the two NAND gate outputs, such that the output is low if and only if one of the gates is pulling it low. Only one of the plySngOe/plyDblOe (player single/double output enable) signals will be high, while the other one will be low, forcing the corresponding NAND output high and disabling it. Note that there is a pull-up at the beginning of the address driver.
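
The A9out behavior can be sketched in Python as a wired-AND of two NAND gates. The input names `in_sng` and `in_dbl` are placeholders for whatever signal feeds each gate, since only the output enables are named above; the pull-up is modelled by the result being high when neither gate pulls the node low.

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def a9_out(in_sng: int, in_dbl: int, ply_sng_oe: int, ply_dbl_oe: int) -> int:
    """Wired-AND of the two NAND outputs driving A9out.

    A gate whose output enable is low is forced high and so has no effect;
    the node goes low only when an enabled gate pulls it low.
    """
    return nand(in_sng, ply_sng_oe) & nand(in_dbl, ply_dbl_oe)


if __name__ == "__main__":
    # Single-line enable active: the result is the inverse of in_sng,
    # whatever in_dbl happens to be.
    for in_sng in (0, 1):
        for in_dbl in (0, 1):
            print(in_sng, in_dbl, a9_out(in_sng, in_dbl, 1, 0))
```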

 
