
Display lists for fun - unlimited horizontal scroll?



I've been spending way too much time just thinking
about Atari graphics, as a cool diversion
from my full time programming job (C, C++, Java, perl, SQL).

I've been reading about ANTIC and GTIA, and all
the stuff they can do (or you can do with them).
I've read and understood how to do a coarse horizontal
scroll (and a fine scroll is just a variation on top of that).

But all the examples I found assume that your in-RAM screen
is large-but-finite, and that your screen is just a window
onto the RAM screen. For any given
screen row, the offset from the base of your row
is the offset from your physical screen to your virtual
screen. Simple.

But what if your data is really large,
larger than your physical memory (or each
row is larger than 4K :-)

This might be the case in a procedurally generated
(or otherwise compressed) horizontal scrolling game,
for example.

I kept thinking about this, and designed an answer.
Clearly, because you can't have all your large screen drawn
ready, you have to draw it on an as needed basis.

Cue diagram (screen width 10 cells)

Memory               visible     offset
abcdefghij...........abcdefghij..0
kbcdefghijk..........bcdefghijk..1
klcdefghijkl.........cdefghijkl..2
klmdefghijklm........defghijklm..3
klmnefghijklmn.......efghijklmn..4
klmnofghijklmno......fghijklmno..5
klmnopghijklmnop.....ghijklmnop..6
klmnopqhijklmnopq....hijklmnopq..7
klmnopqrijklmnopqr...ijklmnopqr..8
klmnopqrsjklmnopqrs..jklmnopqrs..9
klmnopqrst...........klmnopqrst..0
ulmnopqrstu..........lmnopqrstu..1
uvmnopqrstuv.........mnopqrstuv..2

So, when you start, you just have your start row.
As you scroll to the right, you only need to add the
newly exposed bit of virtual screen ('k') on the right,
and offset the memory (LMS) by 1. To keep scrolling,
we just keep adding more stuff on the right,
and keep incrementing the row base via LMS.
But it gets interesting when we've drawn a whole screen
of new stuff. We would now like to reuse the
whole (now wasted) screen's width of memory to the left,
but we'd have to redraw the entire screen ('k' to 't').

However, since we KNOW this is going to happen ahead of time,
we can do what the diagram shows. When we draw a new piece
of virtual screen, we draw it TWICE (or draw it once and copy),
offset by a full screen width.

So in the second line of the diagram we not only append
a 'k' to the row, we replace the unused 'a' with another 'k'.
This 'k' is very much a long term investment; we won't use
it until the LMS memory offset is once again set
to 0, 10 rows down the diagram.
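
A minimal sketch of the draw-twice scheme in the diagram, assuming a single row, a 10-cell screen and a 20-cell buffer. The names make_row, scroll_right and visible are made up for illustration, and the offset variable stands in for the LMS address.

```python
WIDTH = 10  # visible cells per row, as in the diagram


def make_row(first_cells):
    """Buffer of 2*WIDTH cells; the first WIDTH hold the starting view."""
    return list(first_cells) + ["."] * WIDTH


def scroll_right(buf, offset, new_cell):
    """Expose one new cell on the right and return the new offset.

    The cell is written twice, WIDTH apart, so the view is still
    contiguous in memory after the offset is reset to 0.
    """
    buf[offset] = new_cell          # long-term copy, used after the reset
    buf[offset + WIDTH] = new_cell  # copy that becomes visible right away
    offset += 1
    if offset == WIDTH:             # a whole screen has been drawn twice
        offset = 0
    return offset


def visible(buf, offset):
    """The WIDTH cells the display would show for this offset."""
    return "".join(buf[offset:offset + WIDTH])


buf = make_row("abcdefghij")
offset = 0
for cell in "klmnopqrstuv":
    offset = scroll_right(buf, offset, cell)
    print("".join(buf), visible(buf, offset), offset)
```

Each printed line corresponds to one row of the diagram (memory, visible, offset). The only difference is that the sketch leaves the stale right-hand copies in the buffer after the reset, where the diagram shows them as dots.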

So - is this how everyone does it? I searched this forum
(and the net) but couldn't find anything. If this technique
has a name, I don't know it (so I couldn't search for it...)

BugBear (dipping a toe in the cool Atari waters)

Link to comment
Share on other sites

Zybex uses a similar approach if memory serves, i can't remember the details but there are two bitmapped buffers in play.

 

For coarse scrolling, my character-based engine has one LMS at the start of the display which gets nudged up by one when the fine scroll finishes a cycle; the new column of data is written in at whatever is now the furthest right column of the display after each move. There are usually only about sixteen to twenty screens per level (again, i'm struggling a little to remember because it's a while since i looked at the code!) but as long as the column writer takes the 4K boundary into consideration it can go on until the data runs out.

Link to comment
Share on other sites


That sounds (almost) as if you keep on writing to the rightmost (virtual) column, and keep on incrementing the LMS. Of course, if the level has unlimited (e.g. 100s) screens, you eventually run out of memory (the technique I posted resets the LMS every whole screen).

 

However, I strongly suspect I'm not fully understanding your "takes the 4K boundary into consideration" . Could you expand on that, and reduce my ignorance please?

 

BugBear (intrigued)

Link to comment
Share on other sites

4K boundary - when the graphics chip increments its screen memory address, if this hits a 4K boundary (i.e. every $1000) then the address/offset will wrap back to zero and hence data is fetched from a different/lower memory location, usually corrupting the display. Zybex does indeed use a clever technique and would suffer from this problem, but it gets around it by copying a small block of screen data to the lower area to cover things. Take a look at this thread.

 

From what I read, are you after something like watching the Background map in a VRAM viewer of a GameBoy emulator?

 

In the R-Type example the incoming parts are added to the right of the viewport and (for the GB) the h/w wraps back to the same line. This differs from the A8 in that the next line would be accessed instead. So you can see from the screenshot that to the left of the viewport is more of the background that the player will scroll into.

 

From what I think your proposal is, related to the viewport in the example, you intend to have a virtual area effectively two screen widths wide. As the incoming graphics are drawn to the right of the viewport, and there is space in the column to the left of the viewport, you can copy the same column drawn into screen area 2 to that column in screen area 1. Therefore, when the viewport reaches screen area 2, you can reset it to point to screen area 1 instead. That would seem sensible to me, though of course any soft-sprite stuff going on is going to introduce some fun.

 

post-1822-0-88119400-1493374671.png

Link to comment
Share on other sites

Probably he meant that you take the 4K limit into your calculations, e.g. if you use the $2000-$2FFF area, then you AND virtually every memory calculation with $2FFF. So when the LMS points to $2FF0 and you need to draw 40 chars further on, that's $3018; AND it with $2FFF and you end up with $2018, and you can scroll forever (as ANTIC will display bytes $2FFE,$2FFF,$2000,$2001...).
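
The arithmetic above can be checked with a short sketch. It assumes the screen block is the 4K-aligned area $2000-$2FFF from the example; wrap is a made-up helper name. It masks the low 12 bits and ORs the block base back in, which gives the same result as AND $2FFF for this example and also holds if an address overshoots by more than one block.

```python
BLOCK_BASE = 0x2000   # start of the 4K block (assumed 4K-aligned)
BLOCK_MASK = 0x0FFF   # low 12 bits: the offset inside the block


def wrap(addr):
    """Fold an address back into the 4K block starting at BLOCK_BASE."""
    return BLOCK_BASE | (addr & BLOCK_MASK)


lms = 0x2FF0
right_edge = wrap(lms + 40)   # 40 chars further on, past the end of the block
print(hex(lms + 40), hex(right_edge))
```

A column writer that passes every computed address through a helper like this stays inside the same 4K block that ANTIC wraps within.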

Edited by MaPa
Link to comment
Share on other sites


LMS commands stay within a 4K boundary so, if you set one to $3fff the first byte of the screen will come from there, the second from $3000, third from $3001 and so on. As long as the column writer is written so it can't move out of that 4K space as well, you can just keep cycling around for however long is needed.
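
A small sketch of the fetch order described above. It models only the behaviour stated in this post (the top 4 address bits stay fixed while the low 12 bits count up and wrap); antic_fetch_addresses is a made-up name, not an emulator API.

```python
def antic_fetch_addresses(lms, count):
    """Addresses read for `count` consecutive bytes starting at `lms`.

    The upper 4 bits of the address are held and only the lower
    12 bits increment, so the fetch wraps inside one 4K block.
    """
    high = lms & 0xF000
    return [high | ((lms + i) & 0x0FFF) for i in range(count)]


print([hex(a) for a in antic_fetch_addresses(0x3FFF, 3)])
```

For an LMS of $3FFF this prints $3FFF, $3000, $3001, matching the sequence in the post.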

Link to comment
Share on other sites

Thanks for being patient with me guys. I think I've been both too clever, and simultaneously not clever enough.

 

This whole design process started with the observation that
you can't just "keep adding to the right". You'll just eventually crawl
through the whole of what memory you have, and fall off the end.

In any case (so I thought) the 4K video limit will cut in long before that.

So I designed the scheme I first posted (in deference to Wrathchild, I'll call it
"two page scrolling"). This is designed to use the least memory
possible for a single row. I was thinking very much in single row
terms, and assuming that, for a complete image, you'd just have
more rows (very object oriented thinking).

I then realised (in this thread) that you could exploit the 4K "wrap round",
so that you could indeed just "keep adding to the right",
at the cost of using 4K of RAM for a single row. It would work
(in some pedantic sense of "work"), but use too much memory.

4K per row is madness.

I believe I now have a better idea (that is probably how you've all been doing it
for years...)

If we (by which I mean me...) consider multiple rows, as each row moves to
the right, each row no longer needs its left-most-cell, and needs to acquire a right-most-cell.

If the rows just happen to be arranged nose-to-tail, this works out nicely,
with each row's new right-most-cell being exactly the just released left-most-cell
of the next row.

The only issue (in a flat memory space) is when the batch of rows hits
the 4K "limit". But if we stop thinking of it as a limit, and think of
it (Asteroids style) as a wrap-around, we're fine. We just keep expanding
to the right, modulo 4K, and the issue becomes a non-issue.

But you guys knew all this already. Thanks for being kind as I worked it out
for myself.
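
A toy model of the nose-to-tail idea, assuming a 4-cell by 3-row screen and a 16-cell ring standing in for the 4K block. All names and sizes here are made up for illustration. It checks that after a one-cell scroll each row's new right-most cell is the address the next row just released.

```python
WIDTH, ROWS = 4, 3
RING = 16  # stands in for the 4K block; needs at least WIDTH * ROWS + 1 cells


def row_cells(lms, row):
    """Ring addresses shown by `row` when the single LMS points at `lms`."""
    start = lms + row * WIDTH
    return [(start + i) % RING for i in range(WIDTH)]


def scroll(lms):
    """Coarse scroll by one cell, modulo the ring size."""
    return (lms + 1) % RING


lms = 14  # chosen so the first row straddles the wrap point
before = [row_cells(lms, r) for r in range(ROWS)]
lms = scroll(lms)
after = [row_cells(lms, r) for r in range(ROWS)]

for r in range(ROWS - 1):
    # new right-most cell of row r == old left-most cell of row r + 1
    assert after[r][-1] == before[r + 1][0]

print(before)
print(after)
```

Only the last row gains a genuinely new address per scroll step, so in this model the whole screen advances through the ring one cell at a time.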

 

BugBear

Link to comment
Share on other sites

You will not use 4K per row... you will use 4K for the whole screen (assuming a character screen, not a bitmap) using only one LMS. You don't need an LMS for each row on screen, just one at the start; the following rows just display whatever is next in RAM.

 

The "standard" two screen scrolling needs an LMS for every line on screen and requires about 2 KB of RAM; the single LMS method requires only one LMS and 4 KB of RAM (assuming endless scroll and a standard 40x24 character mode screen). There are other pros and cons to each method.
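
The memory figures can be worked out in a couple of lines, assuming one byte per character cell and ignoring any extra bytes fetched per row when fine scrolling is enabled:

```python
COLS, ROWS = 40, 24   # standard character screen
BLOCK = 4096          # the 4K range a single LMS can wrap within

screen_bytes = COLS * ROWS           # one visible screen
two_screen_bytes = 2 * screen_bytes  # two-screen method, one LMS per row

print(screen_bytes, two_screen_bytes, BLOCK)
```

So the two-screen method needs 1920 bytes (about 2 KB) against the full 4096 bytes for the single LMS method.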

Edited by MaPa
Link to comment
Share on other sites

Or even with the one LMS method you don't need to "waste" all 4 KB: after scrolling 1000 chars to the right, the whole screen at the start is unused, so you can start drawing the incoming screen at the start of the block and, after scrolling one screen (about 40 chars), switch the LMS to the beginning, so it will use only about 2 KB of RAM too.

Edited by MaPa
Link to comment
Share on other sites

What you wrote first sounds like "Nes (PPU) scrolling" (used in Super Mario etc).

Nice explanation here: https://wiki.nesdev.com/w/index.php/PPU_scrolling

 

post-14652-0-58502000-1493624643.gif

 

Your second idea sounds more like Analmux's MWP (minimum wrapping principle) method.

 

As you've said you can use bytes that "left the screen" to show incoming stuff.

 

I've tried making more sense of it earlier on this topic:

http://atariage.com/forums/topic/222253-mwp-scrolling-the-better-way-imho/

 

And implemented it (it's not complete, I still haven't made the update column-row part :) ):

This one runs at 50fps with very little CPU time required for scrolling a full bitmap. You only need to change a couple of LMS addresses and positions inside the display list and draw the new stuff that enters the screen.

 

 

Link to comment
Share on other sites


 

No - the whole point of my first technique is that it is based on AVOIDING hardware wrapping, on the assumption that the wrap is both NOT useful
and indeed troublesome.

Each displayed row is complete and contiguous in memory, and the required wrapping
is performed by my software drawing most things twice.

 

To be clear, this is not a good idea

BugBear

 

(I've only read the first half of your reply, I'll reply to the second half when I've looked into it)

Link to comment
Share on other sites


Cool thread! :)

 

popmilo,

I really liked the video of the Minicraft game with its high resolution 4-way scrolling at 50 FPS - is there a second display buffer behind that like with the NES PPU scrolling?

 

 

 


 

 

bugbear,

your technique sounds like my 2600 Display lists approach I am working on porting to the A8/5200 - a soft blitter chip that draws changes twice in both buffers whenever the screen CAM buffer overlaps the changed area in the larger display buffer:

 

post-30777-0-38354300-1493733250.jpg

 

Instead of being twice the area of the screen like the NES PPU, the large buffer is 10x the size of the screen (5 screens by two screens). This is more like a tile/text mode because the full screen CAM is 20x10 large tile pixels (comprised of two sub-pixels each).

 

Like with the NES PPU you can also make changes just to the CAM buffer and wipe off your changes like a sheet of glass.

 

I'm still experimenting but think I will need to step away from ANTIC to get GTIA to match the raw continuous speed WARPDRIVE can obtain; it moves the playfield CAM at 60 FPS when you toggle the BW switch during play (first press fire to start). You can launch the game in Javatari here or download it to try in Stella (the ROM url is embedded in the link):

 

http://javatari.org/?ROM=http://relationalframework.com/WARPDRIVE_AFP.bin

 

The game and the large scrolling text display require the large bitmap buffer to be many times larger than the screen - the game only utilizes the bottom half of the buffer (5 screens) but the text rendering utilizes all of it.

 

2600 Display lists are simplified to let the programmer easily create multiple CAMs onto the virtual world buffer. There are only 10 display lists since they are large tile pixel rows and you can only control them two or three at a time (you can combine to control six, five or four at a time since 2600 DLIs can be called from either the top or bottom vertical blank).

 

Here is a demo with three CAM's (actually four, but two DLI's are combined to create the top CAM):

http://javatari.org/?ROM=http://relationalframework.com/Flashback_BASIC_DLI_Demo.bin

Link to comment
Share on other sites

If the rows just happen to be arranged nose-to-tail, this works out nicely,

with each row's new right-most-cell being exactly the just released left-most-cell

of the next row.

This is exactly what MWP does. You just need to set new column very quickly so the change doesn't become visible at the wrong moment. Best way to achieve this is to expand screen to 48 which happens anyway when you turn on fine scrolling. Then you have 4+4 char "buffers" on the side so you can prepare new graphics as you wish.

 

No - the whole point of my first technique is that it is based on AVOIDING hardware wrapping, on the assumption that the wrap is both NOT useful

and indeed troublesome.

Sorry if I wasn't clear when comparing it with Nes :)

 

On Nes it wraps at double screen width. On Atari you would of course have to do it yourself using LMS in every row. Hardware wrap works only on regular widths (32,40,48).

Link to comment
Share on other sites

popmilo,

I really liked the video of the Minicraft game with it's high resolution 4-way scrolling at 50 FPS - is there a second display buffer behind that like with the NES PPU scrolling?

8-way scrolling ;)

And no second buffer. The screen is 48 bytes wide and wraps using hardware. Those extra 4 bytes at each side give you enough time to draw a whole column of tiles in the couple of frames it takes to scroll the screen using fine hardware scrolling. Vertically it can be solved in a similar manner. So for the whole screen you would need space like (48_wide x (24+vertical size of tiles)).

 

I've split the screen into 3 sections, each 64 scanlines high. Because of the 4K limit I needed something that would be easy to work with. 48x96 (half of 24 char rows as vertical height) would be too much, as one segment would be more than 4K and cause problems.

 

With 64x48=3 KB it's not a problem. I do have to make a copy of the last scanline in each segment to make it wrap nicely when I move segments horizontally by changing LMS addresses. It's not a simple matter but it can be done with the help of pen and paper ;)
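
The segment sizes quoted above can be checked quickly. This sketch assumes 48 bytes per scanline as stated in the post and only compares sizes against 4096 bytes; whether a segment actually crosses a 4K boundary also depends on where it is placed in memory. segment_bytes is a made-up helper name.

```python
BYTES_PER_LINE = 48   # wide screen, as described in the post
BLOCK = 4096          # 4K limit


def segment_bytes(scanlines, extra_lines=0):
    """Size of one bitmap segment, optionally with duplicated scanlines."""
    return (scanlines + extra_lines) * BYTES_PER_LINE


for lines in (96, 64):
    size = segment_bytes(lines)
    print(lines, size, "fits in 4K" if size <= BLOCK else "larger than 4K")

# 64 scanlines plus the one duplicated scanline mentioned above
print(segment_bytes(64, extra_lines=1))
```

A 96-line segment comes to 4608 bytes, while a 64-line segment is 3072 bytes and still has room for the duplicated scanline.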

 

My PAL-blending experiment complicated this routine ten times more, as I had to put color changes of three registers in each scanline. I've used free CPU time in each scanline to copy those repeated scanlines every frame so I don't have to think about them separately.

Link to comment
Share on other sites

The make track option shows the size of the game world, it looks to be a 3x2 group of characters representing a single screen (wraparound occurs after about 12 screens in each direction).

 

Representing it as a static world where you simply use LMS redirects to scroll around would require over 70K. The game originates on cart for 16K RAM systems so has to be tight with its memory usage.

 

The screen uses the area roughly from $2000 - $2A00 and is dynamically generated. If you use Altirra, open the debugger and type the following, which will display onscreen the LMS value of the first and last gameplay area lines:

WW 1E00

WW 1E27

 

Unsure what strategy it uses for rendering and scrolling and deciding on the screen origin - it seems to jump around.

Given the way it jumps around, there would be duplication of some or all of the rendering. Fairly sure the game world itself isn't modified which makes that easier, ie explosions and stuff are done with PMGs.

Link to comment
Share on other sites

I got the game running, and broke in

 

Following $230 led to a display list @ $1e00

db 1e00
1E00: 30 40 75 04 25 75 44 25-75 84 25 75 C4 25 75 04 |0@u.%uD%u.%u.%u.|
1E10: 26 75 44 26 75 84 26 75-C4 26 75 04 27 75 44 27 |&uD&u.&u.&u.'uD'|
1E20: 75 84 27 75 C4 27 55 04-28 80 00 C2 25 1F 00 42 |u.'u.'U.(...%..B|
1E30: 15 81 41 04 20 44 20 84-20 C4 20 04 21 44 21 84 |..A. D . . .!D!.|
1E40: 21 C4 21 04 22 44 22 84-22 C4 22 04 23 00 70 E0 |!.!."D".".".#.p.|
1E50: 50 C0 30 A0 10 80 F0 60-D0 40 B0 20 90 00 70 E0 |P.0....`.@. ..p.|
1E60: 50 C0 30 A0 10 80 F0 60-D0 40 B0 10 10 10 11 11 |P.0....`.@......|
1E70: 12 12 13 13 13 14 14 15-15 16 16 17 17 17 18 18 |................|

I have attempted to decode this, mainly using info from here

30 ; 4 blank
40 ; 5 blank
75 04 25 ; mode 5 (16 lines), LMS, V+H scroll
75 44 25 ; 2nd "
75 84 25 ; 3 "
75 C4 25 ; 4 "
75 04 26 ; 5 "
75 44 26 ; 6 "
75 84 26 ; 7 "
75 C4 26 ; 8 "
75 04 27 ; 9 "
75 44 27 ; 10 "
75 84 27 ; 11 "
75 C4 27 ; 12 "
55 04 28 ; mode 5, LMS, H scroll
80 ; blank line, DLI

I'm not sure what happens after that 80. Anyway, the rows in ANTIC mode 5 should take 40 bytes, but the LMS's are 64 bytes apart ($40), so there is some scope for "normal" horizontal scrolling. It's certainly not MWP!
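
The hand decode can be cross-checked with a small script. This is a sketch that assumes the usual ANTIC instruction layout (low nibble = mode, $80 = DLI, $40 = LMS or JVB, $20 = VSCROL, $10 = HSCROL); the function name decode and the output format are made up here. The bytes are the first 53 from the dump above.

```python
DUMP = """
30 40 75 04 25 75 44 25 75 84 25 75 C4 25 75 04
26 75 44 26 75 84 26 75 C4 26 75 04 27 75 44 27
75 84 27 75 C4 27 55 04 28 80 00 C2 25 1F 00 42
15 81 41 04 20
"""


def decode(data, base):
    """Return (address, description, operand) for each instruction."""
    out, i = [], 0
    while i < len(data):
        op, addr = data[i], base + i
        mode = op & 0x0F
        dli = " DLI" if op & 0x80 else ""
        operand = None
        if mode == 0:                    # blank scanlines, count in bits 4-6
            text = f"blank {((op >> 4) & 7) + 1}{dli}"
            size = 1
        elif mode == 1:                  # jump, or jump and wait for VBL
            operand = data[i + 1] | (data[i + 2] << 8)
            text = ("JVB" if op & 0x40 else "JMP") + f" ${operand:04X}"
            size = 3
        else:                            # display mode line
            text = f"mode {mode:X}{dli}"
            if op & 0x20:
                text += " VSCROL"
            if op & 0x10:
                text += " HSCROL"
            size = 1
            if op & 0x40:                # LMS: two address bytes follow
                operand = data[i + 1] | (data[i + 2] << 8)
                text += f" LMS ${operand:04X}"
                size = 3
        out.append((addr, text, operand))
        i += size
        if mode == 1:                    # stop at the jump
            break
    return out


data = bytes(int(b, 16) for b in DUMP.split())
listing = decode(data, 0x1E00)
for addr, text, _ in listing:
    print(f"{addr:04X}: {text}")

rows = [operand for _, text, operand in listing if text.startswith("mode 5")]
strides = {b - a for a, b in zip(rows, rows[1:])}
print(len(rows), strides)
```

On this dump the script finds 13 mode 5 rows whose LMS addresses are all 64 bytes apart, and what follows the $80 decodes as a blank line, two mode 2 lines (the first with a DLI) separated by another blank line, and a JVB.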

 

BugBear

 

Edited by bugbear
Link to comment
Share on other sites

Altirra has the .dumpdlist command which does the hard work of decoding what each instruction is doing. Following the scrolling playfield is blank lines and the mode 2 text/status area:

 

  1E26:      mode.h 5 @ 290C
  1E29:      blank.i 1
  1E2A:      blank 1
  1E2B:      mode.i 2 @ 1F25
  1E2E:      blank 1
  1E2F:      mode 2 @ 8115
  1E32:      waitvbl 210C

 

The 64 byte gap from 1 line to the next - since there's scrolling going on each line reads 48 bytes anyway. By using 64 it allows some back/forth movement without needing to do redraws. Also, a 64 byte margin is a direct power of 2 which can greatly speed up maths calculations.
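
To illustrate the power-of-2 point: with a 64-byte row stride, the row offset is a shift by 6 rather than a multiply. A minimal sketch, assuming the screen base is the first LMS address seen in the dump ($2504); cell_address is a made-up name.

```python
ROW_SHIFT = 6          # 64-byte row stride = 1 << 6
SCREEN_BASE = 0x2504   # first LMS address in the dump (assumed base)


def cell_address(row, col):
    """Address of a character cell using a shift instead of a multiply."""
    return SCREEN_BASE + (row << ROW_SHIFT) + col


print(hex(cell_address(1, 0)), hex(cell_address(12, 0)))
```

The results for rows 1 and 12 match the $2544 and $2804 LMS values in the dumped display list. On a 6502, which has no multiply instruction, a shift like this is much cheaper than multiplying by 40 or 48.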

Link to comment
Share on other sites

This is exactly what MWP does. You just need to set new column very quickly so the change doesn't become visible at the wrong moment. Best way to achieve this is to expand screen to 48 which happens anyway when you turn on fine scrolling. Then you have 4+4 char "buffers" on the side so you can prepare new graphics as you wish.

 

 

 

Exactly the technique we used on Shadow of the Beast!

 

sTeVE

Link to comment
Share on other sites
