Bennet

Antic DMA Timings

Does anyone have a chart that shows exactly when Antic does DMA for every type of graphics mode and every HSCROL value? Basically a 114 cycle graph showing the instruction, address, player, missile, graphics, characters, and memory refresh DMA locations.

 

I started to make one based on information in the hardware documents I could find and by running many DLIs that change colors on different cycles to see where they are delayed. It was initially a lot harder than I thought because things don't work the way you'd assume by reading the documentation. But... they are working the way I'd expect if I actually had to build the hardware (pipeline delays, etc).

 

So if anyone is curious about my findings I'd be happy to share. There's a lot more to the Antic chip than meets the eye.


That would be handy. There is the chart in the Atari Hardware manual.

 

There is some doc somewhere that shows where the PMG DMA occurs but I can't remember the site address.

 

Memory refresh is one I'd like to know about - the official docs are unclear, plus it is supposedly different in hires graphics modes.

 

I wrote a short program which loaded Player data from actual instruction data (2 bytes from a LDA instruction address). It's a pointless technique, though, since DMA is quicker and you can only control the graphics of 2 players with the technique I used.


Yep, I started with the chart in the Atari Hardware Manual but it only deterministically describes the following:

M=Missile, P=Player, A=Address, I=Instruction. Starting with the last cycle of the previous scanline: MIPPPPAA

 

And yes, DMA is much quicker. STA <address> takes 4 cycles assuming you have the graphics data in a register already. (Are you seriously saying you merely turned GTIA Player DMA on, and had the LDA instruction data go through the bus at just the right time to load two players worth of data?)

 

I've noticed that at least one emulator lets you specify which cycle to release WSYNC on, as though it needs to be fudged for certain programs. The real reason is that it's not a constant value, it changes based on a couple different factors.

 

Oh, and memory refresh *usually* occurs every 4 cycles like the hardware manual says, but I have proof that it doesn't always. At least what it does instead makes sense (in a way). I'm still experimenting.

Edited by Bennet


Here is some information from the source code of the Atari800 emulator that might be helpful.

 

 

/* ANTIC Timing --------------------------------------------------------------

NOTE: this information was written before NEW_CYCLE_EXACT was introduced!

I've introduced global variable xpos, which contains current number of cycle
in a line. This simplifies ANTIC/CPU timing much. The GO() function which
emulates CPU is now void and is called with xpos limit, below which CPU can go.

All strange variables holding 'unused cycles', 'DMA cycles', 'allocated cycles'
etc. are removed. Simply whenever ANTIC fetches a byte, it takes single cycle,
which can be done now with xpos++. There's only one exception: in text modes
2-5 ANTIC takes more bytes than cycles, because it does less than DMAR refresh
cycles.

Now emulation is really screenline-oriented. We do ypos++ after a line,
not inside it.

This simplified diagram shows when what is done in a line:

MDPPPPDD..............(------R/S/F------)..........
^  ^	 ^	  ^	 ^					 ^	^ ^		---> time/xpos
0  |  NMIST_C NMI_C SCR_C				 WSYNC_C|LINE_C
VSCON_C										VSCOF_C

M - fetch Missiles
D - fetch DL
P - fetch Players
S - fetch Screen
F - fetch Font (in text modes)
R - refresh Memory (DMAR cycles)

Only Memory Refresh happens in every line, other tasks are optional.

Below are exact diagrams for some non-scrolled modes:
																								11111111111111
	  11111111112222222222333333333344444444445555555555666666666677777777778888888888999999999900000000001111
012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
						/--------------------------narrow------------------------------\
				/----------------------------------normal--------------------------------------\
		/-------------------------------------------wide--------------------------------------------\

blank line:
MDPPPPDD.................R...R...R...R...R...R...R...R...R........................................................

mode 8,9:
MDPPPPDD....S.......S....R..SR...R..SR...R..SR...R..SR...R..S.......S.......S.......S.......S.......S.............

mode a,b,c:
MDPPPPDD....S...S...S...SR..SR..SR..SR..SR..SR..SR..SR..SR..S...S...S...S...S...S...S...S...S...S...S...S.........

mode d,e,f:
MDPPPPDD....S.S.S.S.S.S.SRS.SRS.SRS.SRS.SRS.SRS.SRS.SRS.SRS.S.S.S.S.S.S.S.S.S.S.S.S.S.S.S.S.S.S.S.S.S.S.S.........

Notes:
* At the beginning of a line fetched are:
 - a byte of Missiles
 - a byte of DL (instruction)
 - four bytes of Players
 - two bytes of DL argument (jump or screen address)
 The emulator, however, fetches them all continuously.

* Refresh cycles and Screen/Font fetches have been tested for some modes (see above).
 This is for making the emulator more accurate, able to change colour registers,
 sprite positions or GTIA modes during scanline. These modes are the most commonly used
 with those effects.
 Currently this isn't implemented, and all R/S/F cycles are fetched continuously in *all* modes
 (however, right number of cycles is taken in every mode, basing on screen width and HSCROL).

There are a few constants representing following events:

* VSCON_C - in first VSC line dctr is loaded with VSCROL

* NMIST_C - NMIST is updated (set to 0x9f on DLI, set to 0x5f on VBLKI)

* NMI_C - If NMIEN permits, NMI interrupt is generated

* SCR_C - We draw whole line of screen. On a real computer you can change
 ANTIC/GTIA registers while displaying screen, however this emulator
 isn't that accurate.

* WSYNC_C - ANTIC holds CPU until this moment, when WSYNC is written

* VSCOF_C - in last VSC line dctr is compared with VSCROL

* LINE_C - simply end of line (this used to be called CPUL)

All constants are determined by tests on real Atari computer. It is assumed,
that ANTIC registers are read with LDA, LDX, LDY and written with STA, STX,
STY, all in absolute addressing mode. All these instructions last 4 cycles
and perform read/write operation in last cycle. The CPU emulation should
correctly emulate WSYNC and add cycles for current instruction BEFORE
executing it. That's why VSCOF_C > LINE_C is correct.

How WSYNC is now implemented:

* On writing WSYNC:
 - if xpos <= WSYNC_C && xpos_limit >= WSYNC_C,
we only change xpos to WSYNC_C - that's all
 - otherwise we set wsync_halt and change xpos to xpos_limit causing GO()
to return

* At the beginning of GO() (CPU emulation), when wsync_halt is set:
 - if xpos_limit < WSYNC_C we return
 - else we set xpos to WSYNC_C, reset wsync_halt and emulate some cycles

We don't emulate NMIST_C, NMI_C and SCR_C if it is unnecessary.
These are all cases:

* Common overscreen line
 Nothing happens except that ANTIC gets DMAR cycles:
 xpos += DMAR; GOEOL;

* First overscreen line - start of vertical blank
 - CPU goes until NMIST_C
 - ANTIC sets NMIST to 0x5f
 if (NMIEN & 0x40) {
  - CPU goes until NMI_C
  - ANTIC forces NMI
 }
 - ANTIC gets DMAR cycles
 - CPU goes until LINE_C

* Screen line without DLI
 - ANTIC fetches DL and P/MG
 - CPU goes until SCR_C
 - ANTIC draws whole line fetching Screen/Font and refreshing memory
 - CPU goes until LINE_C

* Screen line with DLI
 - ANTIC fetches DL and P/MG
 - CPU goes until NMIST_C
 - ANTIC sets NMIST to 0x9f
 if (NMIEN & 0x80) {
  - CPU goes until NMI_C
  - ANTIC forces NMI
 }
 - CPU goes until SCR_C
 - ANTIC draws line with DMAR
 - CPU goes until LINE_C

 -------------------------------------------------------------------------- */
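As a rough illustration of the diagrams in the comment above, here is a small Python sketch that rebuilds a per-cycle map of one scanline and counts the cycles left over for the CPU. The names `scanline_map` and `free_cpu_cycles` are made up for this example and are not part of Atari800. The slot positions (fixed fetches in cycles 0-7, nine refresh cycles starting at 25 and spaced 4 apart, screen fetches starting at cycle 12) are read off the diagrams, and the count of 24 screen fetches for the mode a,b,c line is my own reading of that diagram, so treat it as an assumption.

```python
LINE_CYCLES = 114          # cycles per scanline, matching the 0..113 ruler
FIXED_SLOTS = "MDPPPPDD"   # missile, DL instruction, 4 players, 2 DL argument bytes
REFRESH_FIRST = 25         # first 'R' in the blank-line diagram
REFRESH_STEP = 4
REFRESH_COUNT = 9          # number of 'R' marks per line in the diagrams


def scanline_map(screen_step=0, screen_count=0, screen_first=12):
    """Return a 114-character string, one letter per cycle ('.' = CPU)."""
    line = ["."] * LINE_CYCLES
    for cycle, kind in enumerate(FIXED_SLOTS):
        line[cycle] = kind
    for k in range(REFRESH_COUNT):
        line[REFRESH_FIRST + k * REFRESH_STEP] = "R"
    for n in range(screen_count):
        cycle = screen_first + n * screen_step
        if line[cycle] != ".":
            # This sketch does not model deferred refresh, so just refuse.
            raise ValueError(f"cycle {cycle} is already taken by {line[cycle]}")
        line[cycle] = "S"
    return "".join(line)


def free_cpu_cycles(line):
    """Count the cycles in which ANTIC leaves the bus to the CPU."""
    return line.count(".")


blank = scanline_map()
mode_abc = scanline_map(screen_step=4, screen_count=24)
print(blank)
print(mode_abc)
print(free_cpu_cycles(blank), free_cpu_cycles(mode_abc))
```

Under these assumptions (all P/M and display list fetches active), a blank line leaves 97 cycles to the CPU and the mode a,b,c line leaves 73. The sketch deliberately ignores HSCROL and the first line of character modes, which are exactly the cases the emulator comment does not cover either.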


Good info, it matches the information I have except for the missile DMA. I was under the impression the missile DMA was done as the last cycle of the previous line. The Atari Hardware Manual is actually pretty clear about it.

 

It's missing the hard stuff though, such as the DMA timings on the first line of a character graphics line, and how with HSCROL the wide playfield overlaps the WSYNC area. My experiments have shown that Antic actually cuts out DMA early on wide playfield lines, and in fact if you view the overscan you can see that it's definitely not pulling the data for the last visible character when HSCROL offsets DMA by 1 cycle (you see garbage and this phenomenon is described in the CGIA document).

 

The confusing behaviour is this: when a DLI writes WSYNC and then immediately sets the background color, the change on a wide playfield is delayed by two cycles compared with the normal playfield. But if you WSYNC on the previous line (a mode 0 line, so there's no WSYNC delay) and simply count the cycles to the next line, you wind up with the really confusing result that Antic seemingly releases WSYNC a cycle *before* the line DMA starts, so that it's not delayed by too much. I'm not convinced this is actually the case yet, but there just aren't enough free DMA cycles after cycle 208 to have the color change happen when it does with a wide playfield.

 

Oh, and the first annoyance I noticed is that my first DLI started with a little bit of setup then a 'STA WSYNC' then changed the color. And it jittered.. *after* the WSYNC. The only explanation I have for this is that when an STA WSYNC completes on a cycle before Antic DMA, WSYNC is delayed by a cycle. To remove the jitter, I simply NOP a few times to get past the memory refresh DMA and there's never any jitter. The same thing occurs with playfield DMA. If anyone can explain this, that would be great!

 

Another interesting thing (also shown on the chart provided) is that Antic will DMA character data 4 cycles before the display pixel starts, and graphics data 3 cycles before. This makes sense, given that with this setup it has 2 cycles to take the graphics, load the shift register, and shift the first pixels out.


I did a program which "forced" program data onto the bus which PMGs then assumed.

 

WSYNC does not in any way wait for the start of the next scanline. You can learn a lot by using a capture card when trying graphics effects.

 

Like the hardware manual diagram states, WSYNC actually releases the processor at the first cycle following the end of a standard (40) display line.

 

As such, a store operation to a colour register immediately after WSYNC will take effect about 2 character positions to the right of the playfield. On most (all?) TVs, this is in the overscan area and invisible.
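The "about 2 character positions" figure is just unit conversion, sketched here in Python. The helper names are invented for this example. The constants are the usual ones: 2 colour clocks per CPU cycle (114 cycles make 228 colour clocks per scanline), and a 40-column character cell is 4 colour clocks wide. It assumes the store is a 4-cycle absolute STA that performs its write in the last cycle, and that nothing steals cycles in between.

```python
COLOR_CLOCKS_PER_CYCLE = 2   # 114 cycles * 2 = 228 colour clocks per scanline
COLOR_CLOCKS_PER_CHAR = 4    # one 40-column (GR.0) character cell


def cycles_to_color_clocks(cycles):
    """Convert CPU cycles to colour clocks."""
    return cycles * COLOR_CLOCKS_PER_CYCLE


def cycles_to_chars(cycles):
    """Convert CPU cycles to widths of a 40-column character cell."""
    return cycles_to_color_clocks(cycles) / COLOR_CLOCKS_PER_CHAR


# STA abs takes 4 cycles and writes in its last cycle.
sta_abs_cycles = 4
print(cycles_to_chars(sta_abs_cycles))
```

So a store that begins right when the CPU is released lands roughly two character cells further right, which is why the change ends up in the overscan area.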

 

My cap-card shows the entire overscan area - you get some strange artifacts when H-Scrolling in widescreen mode. Some time soon, I'll do a couple of grabs and post them here.

 

Also to clarify, DLIST DMA is done before the line starts - naturally, since a DLI has to occur shortly after the start of the last scanline of a row of graphics (or blank lines).

 

I'm unsure of the order of PMGs - but I still have the source code and will dig it up. I found that A800Win produced different results, although that might have been on an older version. I also found (not 100% sure) that if ANTIC doesn't do PMG DMA, it uses at least one of those cycles to fetch the DLIST data.

The timing of PMG DMA I assume would be fixed regardless of graphics mode since GTIA has to fetch data from the bus at the exact point in time during the scanline - and there is no communication from ANTIC to indicate whether it is even doing DMA.

 

 

The program I did was pretty simple - just use NOPs and LDA $1234 to generate the appropriate amount of delay. Then do a LDA abs, then a LDA imm instruction. The only problem is that you're unable to provide all the data, since the operands aren't long enough.

 

One thing we need to work out and document is the HSCROL artifacts that occur when you make mid-scanline changes. Some of the effects you can get by changing VSCROL on the fly are already known.

Edited by Rybags


There are schematics out there for all A8 custom chips except Antic (& Sally), right?

 

-Bry


Heaven/TQA:

 

Actually yes, I think with an Antic Mode E display the timings would allow for a few palette changes per scanline. It'd be like Spectrum 512 for the Atari ST. But if anyone wanted to do this they could have done it already with some simple experiments since it's easy to make it work on a DLI per DLI basis (just try it and see what happens, then adjust until it looks right). I think it'd only be useful for static images (or dynamic images that don't change HSCROL since changing HSCROL shifts the timings around in a not so nice way).

 

With high resolution character graphics the entire first line is gone due to DMA, so no changes mid-scanline there.

 

The thing I'm more interested in is accurate emulation timings, since just missing 1 DMA cycle (or adding an extra one) makes things faster/slower than they should be.

 

Rybags:

 

Yep, on 40 column modes, STA WSYNC then STA COLBAK takes effect two character positions after the end of the 40 column line, which corresponds to the write cycle of 'STA COLBAK' being the 4th cycle of that instruction. But on wide playfield modes, try this: STA WSYNC on the previous line, then NOP and 'STA COLBAK' to see exactly where DMA stops (delaying by 1 cycle each time until you see the colors change on the right of the playfield), then delay that by another three cycles (as though the STA COLBAK happened after the playfield DMA). You'll see something interesting: 'STA WSYNC, STA COLBAK' on that line beats you by two cycles, implying that WSYNC was released before playfield DMA even started!

 

Yes, the PMG DMA timing is fixed. I never actually checked the order the bytes were read, just that they were always read at the same time. If you wanted to figure out the ordering that would be great!

 

Also, I never tried changing HSCROL on the fly. I have to try this! That might reveal more secrets.

 

Bryan:

 

I've only been looking for Antic and GTIA, and in that search I've only found very fuzzy GTIA schematics. I'd love to have Antic schematics.

Edited by Bennet


Do your docs describe the HIP effect, or explain the 40x40 charmode?

 

http://www.s-direktnet.de/homepages/k_nadj/hip.html

http://www.s-direktnet.de/homepages/k_nadj/mode9++.html

 

 

 

and the old effect...in an antic f line switching on scanline into mode9 (with prior register)...waiting and then switching back to antic f high res and you get not a gr.8 anymore but a antic e like mode instead?

 

check out joyride f.e. http://www.atarimania.com/detail_soft.php?...D&SOFT_ID=11309

40x40.zip


oh...it was not joyride...can not remember which polish demo it was...

 

joyride has just 2 gfx modes per scanline.

post-528-1148666786_thumb.png


"joyride has just 2 gfx modes per scanline."

 

The docs just describe what types of Antic DMA happen when, it doesn't cover GTIA timings. The GTIA mode timing should be pretty easy to figure out once I get to it. Joyride looks awesome, I never saw anything like that back in my Atari childhood.

 

"and the old effect...in an antic f line switching on scanline into mode9 (with prior register)...waiting and then switching back to antic f high res and you get not a gr.8 anymore but a antic e like mode instead?"

 

I can only guess that the GTIA loses the 'high res mode' setting and interprets the Antic signals just the same as in mode E. Antic E and F are basically the same mode, just that at the start of the scanline, Antic tells GTIA to use 'high res' through their private interface. I think it's called 'undefined behaviour'!

Edited by Bennet


Bennet, haven't seen Joyride?

 

Well... then look at Numen, Overmind and tons of other great demos... ;) Joyride is dated 1995 and uses well-known routines from the C64...

 

But Joyride was the reason for the GTIA "bug" which led to HIP/TIP mode... it's in the so-called DIL-Plasma screen... (I guess mode 10/11 combined).

 

http://numen.scene.pl/

 

Or check out g2f.atari8.info as well, a nice handy tool. Maybe useful for some of your tests?


FWIW...

 

I discovered another interesting feature of Antic while writing Castle Crisis. You cannot use blank lines in the middle of a Vscrolled area. Normally, the 1st entry with Vscrol turned on has its top chopped off, and the 1st entry after that with Vscrol turned off has its bottom chopped off. A blank line causes Antic to forget the scroll state and chop off the top of the next entry with Vscrol turned on. The upside of this is that you can create short text by mixing regular text lines with single blank ones (as long as a blank line between rows is acceptable).

 

-Bry


My Atari 8-bit experience stopped in 1992... but it eventually drew me back :)

 

Bryan:

 

Ah, so it seems that any instruction without the VSCROLL flag (which would include blank line and jump because they just don't have that bit) are treated as an end to VSCROLL. I wonder if the number of blank lines affects the display (like an 8 blank line instruction). I knew that Jump reset VSCROL through watching other programs, but forgot about blank!
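A minimal Python sketch of the rule as described in these two posts, not a verified model of the chip: it walks a list of display list instruction bytes and labels how each entry is treated. The function names and labels are invented for this example, and operand bytes (LMS or jump addresses) are assumed to have been stripped from the list already. The vertical scroll bit is $20 in a mode-line instruction; in a blank-line instruction the same bit is part of the line count, which is why the low nibble has to be checked first.

```python
VSCROLL_BIT = 0x20   # vertical scroll enable bit in a mode-line instruction


def is_mode_line(instr):
    """Low nibble 0 is a blank-line instruction, 1 is a jump; 2-15 are mode lines."""
    return (instr & 0x0F) >= 2


def classify_entries(display_list):
    """Label each instruction byte by how the scroll region treats it."""
    labels = []
    scrolling = False
    for instr in display_list:
        wants_scroll = is_mode_line(instr) and bool(instr & VSCROLL_BIT)
        if wants_scroll and not scrolling:
            labels.append("top chopped")           # scroll region starts here
        elif wants_scroll:
            labels.append("full height")           # inside the region
        elif scrolling and is_mode_line(instr):
            labels.append("bottom chopped")        # first mode line without the flag
        elif scrolling:
            labels.append("resets scroll state")   # blank line or jump
        else:
            labels.append("unscrolled")
        scrolling = wants_scroll
    return labels


# Mode 2 lines with VSCROL ($22), then one without ($02)
print(classify_entries([0x22, 0x22, 0x02]))
# Same region, interrupted by an 8-line blank instruction ($70)
print(classify_entries([0x22, 0x70, 0x22, 0x02]))
```

In the second list the blank instruction ends the region, so the following $22 entry is labelled "top chopped" again, which is the short-text effect Bryan describes. Whether the height of the blank instruction itself changes is left open here, as it is in the posts.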


As per previous post, "forcing" data into PMGs.

 

Here's an example program:

 

5 GRAPHICS 0:TRAP 10:FOR A=1536 TO 1570:READ D:POKE A,D:NEXT A
10 POKE 512,0:POKE 513,6
15 POKE 704,54:POKE 705,76:POKE 706,214:POKE 707,246:POKE 710,2
20 POKE 54286,192
30 DL=PEEK(560)+PEEK(561)*256
40 POKE DL+12,132
50 POKE 53248,80:POKE 53249,96
60 POKE 53250,112:POKE 53251,128
70 POKE 53255,144:POKE 53254,146
80 POKE 53253,148:POKE 53252,150
90 POKE 623,1
100 POKE 53277,3
110 POKE 16384+255+8,255
120 POKE 16384+255+16,127
130 POKE 16384+255+24,63
140 POKE 16384+255+32,31
150 POKE 16384+31,128
160 POKE 16384+23,64
170 POKE 16384+15,32
180 POKE 16384+7,16
1000 DATA 72,152,72,160,40,141,10,212,234,234,234,234,234,185,255,64,136,208,242,104,168,104,64,0,-1

 

Disassembly of DLI routine:

0600  PHA            ; 3 cyc ; 48
0601  TYA            ; 2 cyc ; 98
0602  PHA            ; 3 cyc ; 48
0603  LDY #$28       ; 2 cyc ; A0 28
0605  STA $D40A      ; 4 cyc ; 8D 0A D4  (WSYNC)
0608  NOP            ; 2 cyc ; EA
0609  NOP            ; 2 cyc ; EA
060A  NOP            ; 2 cyc ; EA
060B  NOP            ; 2 cyc ; EA
060C  NOP            ; 2 cyc ; EA
060D  LDA $40FF,Y    ; 4 cyc, +1 on page crossing (always here) ; B9 FF 40
0610  DEY            ; 2 cyc ; 88
0611  BNE $0605      ; 2 cyc, 3 when taken ; D0 F2
0613  PLA            ; 4 cyc ; 68
0614  TAY            ; 2 cyc ; A8
0615  PLA            ; 4 cyc ; 68
0616  RTI            ; 6 cyc ; 40
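To cross-check the DATA statement against the disassembly, a short Python sketch can decode the relative branch and confirm the page crossing that the `LDA $40FF,Y` trick relies on. The byte list is copied from line 1000 of the BASIC program (without the trailing 0 and -1); the helper name `branch_target` is invented for this example.

```python
ROUTINE = [72, 152, 72, 160, 40, 141, 10, 212, 234, 234, 234, 234, 234,
           185, 255, 64, 136, 208, 242, 104, 168, 104, 64]
LOAD_ADDR = 0x0600  # 1536 decimal, where the BASIC loop POKEs the routine


def branch_target(opcode_addr, offset_byte):
    """Resolve a 6502 relative branch: signed offset, relative to the next instruction."""
    offset = offset_byte - 256 if offset_byte >= 128 else offset_byte
    return opcode_addr + 2 + offset


bne_index = ROUTINE.index(208)            # BNE opcode is $D0
bne_addr = LOAD_ADDR + bne_index
target = branch_target(bne_addr, ROUTINE[bne_index + 1])
print(hex(bne_addr), hex(target))

# LDY #$28 followed by DEY/BNE: Y runs from $28 down to 1 inside the loop.
crossings = [((0x40FF + y) >> 8) != 0x40 for y in range(0x28, 0, -1)]
print(all(crossings))
```

The branch resolves to $0605, the `STA WSYNC`, and every value of Y used in the loop pushes `$40FF,Y` into page $41. That is consistent with the extra bus cycle in which the 6502 reads from the not-yet-corrected address, the "incorrect" fetch that Player 1 picks up in the notes below.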

 

 

Zipped version of SAVEd BASIC program: pmtest1.zip

 

 

post-7804-1148736577_thumb.jpg

 

In the screenshot, the players are spaced 0, 1, 2, 3. The missiles are grouped together 3,2,1,0 (so that they properly reflect the data that is being read).

 

Note the following:

PLR 0 (red) shows $40. This is the operand of the high address from the LDA $40FF,Y instruction

PLR 1 (pink) shows stepped lines. That is the data which is being fetched by the instruction (incorrectly) since it crosses a page boundary

PLR 2 (green) shows stepped solid. That is the data which is being fetched by the instruction (correctly) into the A register

PLR 3 (orange) shows 2 vertical lines (data $88) - that is just the DEY instruction at the end of the loop.

 

Missiles - they are showing the Display List Instruction. Note that the shapes on screen only change in time with DList instruction fetches. Somehow GTIA must see this activity as a prompt to read the data bus, since the DLI routine is presenting different data on the bus every scanline but the PMGs only change on DL instruction reads.

 

The black horizontal line is just ANTIC mode 4 with a DLI, I made it that mode to make it easier to see where the action starts.

 

Strangely, the first line of the altered graphics seems not to adhere to the timing of the subsequent lines.

 

Also note Player 3 (orange) at the top of the GR.0 display. It is displaying the low byte from the LMS instruction.

Edited by Rybags


I should have mentioned... the above program only works on a real machine. Atari800Win seems to produce different results.


post-7804-1148825923_thumb.jpg

 

This screenshot is from my capture card, displaying the overscan area.

 

There is a DLI running which changes the background colour immediately after WSYNC.

It also switches from standard - narrow - wide playfield modes. I have HSCROLL enabled on the wide playfield lines.

 

Notes about wide playfields:

- An extra character will become visible at the left of screen. This character will display invalid data if H-scrolling is enabled and the HSCROLL register value is 14 or 15 (like part of an inverse Space chr.)

- An extra 4 characters will become visible at the right of screen. The last character displays invalid data (it seems to be bus data in similar fashion to PMGs without DMA being done by ANTIC).


This is a link worth looking at, it has some info too:

 

http://www.atarimax.com/jindroush.atari.org/atanttim.html

 

post-7804-1148828994_thumb.jpg

 

DLI changed to alter PF2 register in widescreen area ... note that there must be an early return from WSYNC when in widescreen mode - the colour change is only 1 cycle later than the others. Either that, or ANTIC is doing DMA somewhat earlier than the display of the graphics. Also, if HSCROLLing is enabled there, the DMA times are delayed as stated in the Hardware Manual.

Edited by Rybags


Very nice, as for the widescreen areas, the DMA timings I've discovered show that DMA continues through cycle 104, so that WSYNC return is delayed by a cycle. Because of this I've been treating WSYNC as a separate problem to solve and not using it except on blank lines (since then there's only refresh DMA and it's not affected).

 

Oh, and Antic does DMA for characters 4 cycles before the playfield starts (which for wide is confusing because there's a physical start and a visible start).

 

My timing diagrams now cover all of the graphics modes for every playfield width and every HSCROL value. The character modes I'm still looking into because they behave interestingly... Wide playfield is certainly the weirdest due to DMA cutoff (Antic stops doing DMA, but continues to increment the memory scan counter).

 

Did you try using Antic mode 4? It shows the background color so you can see where during the playfield color changes occur.


Ok! I think I made a breakthrough!

 

WSYNC is *NOT* released on cycle 104, it's released on cycle 105. Believe it? Read my notes below. It's the only way to explain my timing results:

 

WSYNC: When an STA WSYNC is executed, Antic takes 1 cycle to respond before halting the CPU. It releases WSYNC on cycle 105 on a scanline. This has the appearance of the CPU restarting on cycle 104, but it's really that you get 1 cycle after STA WSYNC and restarting on 105. Note that if Antic is doing DMA on the cycle immediately after STA WSYNC completes, the CPU misses that extra cycle (due to the DMA).

 

Forget about my crazy talk earlier about WSYNC being released before character DMA starts on high res modes. This explanation makes a lot more sense, and is a lot more reasonable to expect from the hardware.
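To make the hypothesis concrete, here is a small Python model of it. It is only a sketch of the explanation above, not a verified description of the hardware. The function names are invented, the refresh slots are the nine 'R' positions from the blank-line diagram in the Atari800 comment quoted earlier, and it assumes a blank line (no DMA after the release point) and a 4-cycle store right after the `STA WSYNC`.

```python
WSYNC_RELEASE = 105                                # release cycle proposed above
REFRESH_CYCLES = {25 + 4 * k for k in range(9)}    # refresh slots on a blank line


def gets_extra_cycle(write_cycle, dma_cycles=REFRESH_CYCLES):
    """True if the CPU runs one more cycle after the WSYNC write before halting."""
    return (write_cycle + 1) not in dma_cycles


def store_lands_on(write_cycle, store_cycles=4, dma_cycles=REFRESH_CYCLES):
    """Cycle in which the store following STA WSYNC performs its write."""
    if gets_extra_cycle(write_cycle, dma_cycles):
        remaining = store_cycles - 1   # first cycle already ran before the halt
    else:
        remaining = store_cycles       # DMA took that cycle, start from scratch
    return WSYNC_RELEASE + remaining - 1


# WSYNC write on cycle 26: cycle 27 is free, so the CPU gets its extra cycle.
print(store_lands_on(26))
# WSYNC write on cycle 24: cycle 25 is a refresh slot, so the extra cycle is lost.
print(store_lands_on(24))
```

In this model the two cases differ by exactly one cycle (107 versus 108), which would account for the one-cycle jitter after `STA WSYNC` and for why padding with NOPs past the refresh slots makes it go away.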

 

Anyone else care to verify my results? :D


Have you compiled any charts?

 

If you can put something up, then it will be fairly simple to do some programs to verify your findings.

 

How are you gathering your info? Just by programming, or are you using test/analysis equipment?

 

I also did a program which just alternates doing STA/STX to COLBK... the results I got weren't really what I expected.

 

I'll post about it later.

Edited by Rybags


Yes, I have a huge chart. It took me ages but it's complete. Every mode, every HSCROL, every playfield width. Once it was complete the patterns became obvious.

 

It really won't post well because it's incredibly wide, but it's just a text file. I can email it if you're interested so you can check it and see if I'm not insane.

