I was thinking, with my title screen for Sokoboo (see pic)... wouldn't it be amazing if those little guys could animate and walk around? So sad they can't. But then I got to thinking... hang on... maybe. So, this is a bit of a stream of consciousness, pretty much a what-if/how. This is the sort of thing I write to myself when I'm designing a new system - just a walkthrough of what data and processes are required, a preliminary examination of the sizes of things, the locations and where the data lives, etc. I thought I'd do this for everyone to see for this one. Generally I turn this into a series of defines for the buffers required, banks to use, etc, and then I start to write the code to transfer data between the buffers. So, here goes... how would it work....?
So, the colour graphics are achieved through Interleaved Chronocolour (TM) - ICC - which is basically three scanlines of differing colours, and on/off pixels on each of those scanlines forming 8 possible combinations (over the 3 lines). Those 8 combinations appear to the eye as 8 colours, as the scanlines/colours blend together. Mostly. You can't actually choose what colour each of those 8 are... you can choose the 3 scanline colours, and of course no colour - black. The other 4 colours are a consequence of blending of the three you choose. So, colours A, B, C, and black(*), you get ***, A**, *B*, **C, AB*, A*C, ABC, *BC - and in this case, A*C probably doesn't 'blend' terribly well because of the blank line between them.
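Just to make the eight combinations concrete, here's a tiny sketch in Python (used purely as a modelling language, nothing here runs on the 2600). The names `SCANLINE_COLOURS` and `icc_combinations` are made up for illustration; `*` stands for a pixel that is off on that scanline.

```python
from itertools import product

# Hypothetical labels for the three scanline colours; '*' means "pixel off".
SCANLINE_COLOURS = ("A", "B", "C")

def icc_combinations():
    """Return every on/off pattern over the 3 scanlines of one ICC pixel."""
    combos = []
    for bits in product((0, 1), repeat=3):
        combos.append(
            "".join(colour if on else "*"
                    for colour, on in zip(SCANLINE_COLOURS, bits))
        )
    return combos

if __name__ == "__main__":
    print(icc_combinations())
```

Three on/off bits give 2^3 = 8 patterns: black, the three pure colours, and the four blends.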
Now, there are 40 pixels across the screen. 33 or so are used in the actual title-screen picture shown, and 7 for the "SOKOBOO". But let's say, 32. Or maybe 40. What is the RAM requirement if we were storing all this as a modifiable bitmap? Well, 6 bytes/line (one byte per playfield register write: PF0, PF1, PF2 for each half of the screen). In the sample, it's 210 scanlines (/3 = 70 'pixels'). So, 210 scanlines * 6 bytes is 1260 bytes. That's too much. Need it to be <= 1024 bytes so it fits in a single RAM bank (3E format). So, let's go with 32 pixels (i.e., drop the right-side PF2 - 8 pixels), OR, drop the left 4 and right 4. Doesn't really matter right now. We now need 5 bytes/line. That's 5*210 = 1050. Still too much, but not by far. So, how many scanlines can we support? 1024/5 (bytes/line) --> 204 lines. That's not much less, should still look "full screen".
So, we have 204*5 = 1020 bytes of RAM - a single bank - for the colour screen bitmap.
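The sums above are easy to check with a couple of throwaway helpers. This is only a sketch of the arithmetic; `bitmap_bytes` and `max_scanlines` are hypothetical names, and Python is just the calculator here.

```python
BANK_BYTES = 1024  # size of one 3E RAM bank, as stated in the text

def bitmap_bytes(bytes_per_line, scanlines):
    """RAM needed for a bitmap of the given width (in bytes) and height."""
    return bytes_per_line * scanlines

def max_scanlines(bytes_per_line, budget=BANK_BYTES):
    """Largest whole number of scanlines that fits in the budget."""
    return budget // bytes_per_line

if __name__ == "__main__":
    print(bitmap_bytes(6, 210))   # 40-pixel-wide version
    print(bitmap_bytes(5, 210))   # 32-pixel-wide version
    print(max_scanlines(5))       # lines that fit in one bank
```

Handily, 204 is also a multiple of 3, so that would be 68 whole ICC pixels.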
3E format requires the RAM to be a 1K segment, with the upper 1K (+$400) being the write address.
However, here's the sneaky bit: if we first switch to the RAM bank for writing the bitmap, and THEN switch it in as a ROM bank, then we can use the UPPER 1K for code that accesses the (read-only) bitmap for display purposes. So, the idea - bitmap in low 1K, display code in high 1K. And manipulation code in the fixed bank (2K). So the bank can 'display itself' when switched in as ROM. That should save space in the fixed bank, which is critical space.
The sprite needs to be shifted (appropriately, based on X position and screwy PF ordering), so we add another byte for shifting purposes. 2 bytes x 32 -> 64 bytes. Probably way big but let's go with that for now... So, I'm thinking put that buffer in zeropage. You first copy a frame from ROM into the zero page buffer. Code in fixed bank. Doesn't matter, could be any bank. Then you shift the buffer as required. Then you switch in the RAM bank for the bitmap and you first mask OUT the pixels that are being drawn, and then OR-in the zeropage buffer into the correct place in the RAM bitmap.
So, onto the 'drawing'. Think of one of those men as a "sprite". So, sprites might be a maximum of (say) 8 pixels wide (one byte), 32 lines high. But, 32 lines isn't divisible by 3, so let's take that down to 30 lines high. That's 10 ICC pixels. Mmmh. Not enough. The guys in that title screen are roughly 24 ICC pixels high, so this needs a rethink. 24 ICC pixels --> 72 lines high. x2 -> 144 bytes, that's not going to fit into ZP. Mmh.
SO, let's put the sprite draw buffer in with the bitmap itself. That's going to reduce the bitmap size.
1024 bytes - 144 (sprite buffer) = 880 bytes left for bitmap. /5 bytes/line --> 176 scanlines. So, it's getting smaller, but let's forge ahead.
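Here's the same budget as a quick sketch, with one extra step: rounding the line count down to a whole number of 3-line ICC pixels. The constant and function names are made up for illustration.

```python
BANK_BYTES = 1024
SPRITE_BUFFER_BYTES = 2 * 72   # 2 bytes/line x (24 ICC pixels x 3 scanlines)
BYTES_PER_LINE = 5
LINES_PER_ICC_PIXEL = 3

def bitmap_scanlines():
    """Scanlines left for the bitmap once the sprite buffer is carved out."""
    free = BANK_BYTES - SPRITE_BUFFER_BYTES        # 880 bytes
    lines = free // BYTES_PER_LINE                 # 176 scanlines
    # Keep only whole ICC pixels (3 scanlines each).
    return lines - (lines % LINES_PER_ICC_PIXEL)

if __name__ == "__main__":
    lines = bitmap_scanlines()
    print(lines, lines * BYTES_PER_LINE + SPRITE_BUFFER_BYTES)
```

With whole ICC pixels that comes to 174 lines (870 bytes), so bitmap plus sprite buffer use 1014 of the 1024 bytes.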
Use the tile engine concept of allocation of time into chunks so you only do this stuff when time is available.
We have, in a single RAM bank (1K RAM):
- 870 byte screen, consisting of 5 bytes/line x 174 lines
  - Why 174? Because it's divisible by 3, and ICC pixels are 3 lines high.
- 144 byte buffer for sprite transfer
  - consisting of 24 ICC pixels deep x 1 byte (8 pixels) wide,
  - with an extra 'padding' byte so we can do a full 8-pixel shift as required. Shifting is going to be a bottleneck/slow, but let's brush that under the carpet for now.
  - we could have in ROM 'shift' tables (7 of 'em) which give 2-bytes shifted for a single byte input. And then copy those two bytes to the sprite transfer buffer. Later.
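To get a feel for what those shift tables would hold, here's a sketch that builds them in Python. It assumes plain left-to-right bit order (bit 7 = leftmost pixel) and ignores the PF bit-ordering problem entirely; `build_shift_tables` is a hypothetical name.

```python
def build_shift_tables():
    """Build 7 tables; tables[s - 1][b] is the (left, right) byte pair
    for sprite byte b shifted right by s pixels (s = 1..7)."""
    tables = []
    for shift in range(1, 8):
        table = []
        for value in range(256):
            # Put the sprite byte in the high half of a 16-bit window,
            # then slide it right so pixels spill into the padding byte.
            wide = (value << 8) >> shift
            table.append(((wide >> 8) & 0xFF, wide & 0xFF))
        tables.append(table)
    return tables

if __name__ == "__main__":
    tables = build_shift_tables()
    left, right = tables[0][0xFF]          # shift right by 1 pixel
    print(f"{left:08b} {right:08b}")
    print(sum(len(t) * 2 for t in tables)) # total table bytes
```

Built naively like this, the tables come to 7 x 256 x 2 = 3584 bytes of ROM, so they'd want their own bank (or a smarter layout).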
OK, so given frame X, from the fixed bank we switch in that frame, and we copy its data into an accessible RAM location. That's going to be 1 byte wide x (24x3) bytes deep (i.e., 72 bytes). That pretty much has to be in ZP ram. But we can, of course, put it in an overlay - shared with other variables when we're not doing actual transfers.
So, now we have the frame in the ZP buffer, we shift and copy to the RAM bank holding the sprite transfer buffer. Two bytes/line (shifted already) and 72 lines deep. This puts the raw, shifted bitmap in the correct RAM bank. At this stage we have pretty much 0 bytes left in the RAM bank so all code accessing the RAM MUST be in fixed bank.
The next major hurdle is the odd PF bit ordering. In fact, this should be taken care of by the fixed-bank code at the point it's doing the shifting. The bit ordering will be based on the x position of the sprite - and that will also affect how the shifts are done. Can pre-shift then 'mangle' to the PF coordinates, or if we have table lookups then perhaps that can be incorporated there. Tricky.
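For reference, the 'screwy' ordering looks like this: PF0 uses only bits 4-7 with bit 4 leftmost, PF1 runs bit 7 to bit 0, and PF2 runs bit 0 to bit 7. The sketch below maps a pixel in one 20-pixel half of the screen to its register and bit mask. How the halves fold into the 5-byte line is not decided above, so that part is left out; `pf_location` and `reverse_bits` are hypothetical helper names.

```python
def pf_location(x):
    """Map playfield pixel x (0..19, left to right within one screen half)
    to a (register name, bit mask) pair."""
    if not 0 <= x < 20:
        raise ValueError("x must be in 0..19")
    if x < 4:
        return ("PF0", 0x10 << x)          # bits 4..7, bit 4 leftmost
    if x < 12:
        return ("PF1", 0x80 >> (x - 4))    # bit 7 leftmost
    return ("PF2", 0x01 << (x - 12))       # bit 0 leftmost

def reverse_bits(value):
    """Mirror an 8-bit value, e.g. to 'mangle' normal-order data for PF2."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result

if __name__ == "__main__":
    for x in (0, 3, 4, 11, 12, 19):
        reg, mask = pf_location(x)
        print(x, reg, f"{mask:08b}")
```

So a sprite byte landing in PF1 can be used as-is, one landing in PF2 needs mirroring, and one straddling two registers needs both treatments - which is why folding it into the lookup tables is tempting.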
Let's assume that the sprite buffer in the RAM bank aligns properly with the write to the bitmap buffer. Then, it's just a matter of copying two bytes/line DIRECTLY into the bitmap, and doing that for 72 lines. Perhaps skipping zero-bytes if that's quicker. That will be a simple 'overwrite' draw. The next major milestone would be to do a masking-draw. That is, to keep the existing pixels in the bitmap, and only draw in the new pixels. Well that can be done simply with an "OR" draw - load/or/store. That would work OK - but colours would morph as men walk over each other. That might be a necessary evil.
Then we get into way more complex stuff of masking OUT pixels which are set (in any of the 3 ICC scanlines per pixel), and then OR-ing in the data. This would suggest a mask be provided by the fixed-bank code. Note the mask would be 1/3 the depth, as 3 lines -->1 ICC pixel. So, mask would be 2 bytes/line * 24 lines = 48 bytes. We don't have that free in RAM bank, but at this stage the 72 lines in ZP are now unused/free, so the mask could be placed there.
The draw would be load/and mask/or data/store. For 2 bytes/line * 72 lines. That's going to be about 40 cycles/line --> 40 x 72 = 2880, so say roughly 3000 cycles per sprite to copy the data to the bitmap + looping overhead. That's workable.
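As a sanity check on the load/and/or/store idea, here's a sketch of the masked draw in Python. It assumes the mask is stored as a 'keep' mask (1 = leave the background pixel alone), one pair of bytes per ICC pixel, and that the sprite data is already shifted and PF-ordered. All names are hypothetical.

```python
def masked_draw(bitmap, offset, stride, sprite, keep_mask):
    """Draw a 2-byte-wide sprite into the bitmap.

    bitmap    -- bytearray holding the screen (stride bytes per scanline)
    offset    -- index of the sprite's top-left byte in the bitmap
    sprite    -- list of (left, right) bytes, one pair per scanline
    keep_mask -- list of (left, right) bytes, one pair per ICC pixel
                 (3 scanlines); set bits are background pixels to keep
    """
    for line, (left, right) in enumerate(sprite):
        keep_left, keep_right = keep_mask[line // 3]
        row = offset + line * stride
        # load / AND mask / OR data / store, for each of the two bytes
        bitmap[row] = (bitmap[row] & keep_left) | left
        bitmap[row + 1] = (bitmap[row + 1] & keep_right) | right

if __name__ == "__main__":
    screen = bytearray([0xFF] * (5 * 174))
    sprite = [(0x18, 0x00)] * 72      # 72 scanlines = 24 ICC pixels
    keep = [(0xC3, 0xFF)] * 24        # punch a 4-pixel hole in the left byte
    masked_draw(screen, 0, 5, sprite, keep)
    print(f"{screen[0]:08b} {screen[1]:08b}")
```

Because the mask is shared by all three scanlines of an ICC pixel, a sprite pixel that is 'off' on one scanline still blanks the background there, which is what stops the colours morphing.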
That would be fine if we had a clean slate and could draw afresh everything every frame. So that really suggests that two buffers are available - one that is shown on screen, and one being drawn-to. So, introduce another RAM bank which is effectively a duplicate of the above. One being drawn to, the other being displayed. And they toggle. Then, we don't need a 'sophisticated' erase. The bitmap is just block-cleared (perhaps a bounded-clear for efficiency). In any case, writing 0 to 870 bytes would take, what, I dunno... let's say 4000+ cycles. At best. That is going to have to be broken down into a multi-part (fragmented) bit of timesliced code. That's OK. But it will take a while. We could even have a THIRD bitmap buffer which holds the immutable background. That is, the background in front of which, and over the top of, all sprites are drawn. Think of backgrounds to fighting games, for example.
So, clear the buffer either through copy of the BG buffer (difficult/very slow with the banking) or 0-clear. Draw the sprites. Swap buffers. Repeat.
Could use the 72 byte ZP sprite buffer as an intermediary for the BG transfer. 72/5 = 14 scanlines, so 174/14 that would be 13 or so iterations to do the lot. Again, timeslice as required.
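Here's a sketch of how that chunked background copy could be sliced up, again in Python as a model only. Each chunk is one bounded piece of work that could be handed to a timeslice; the names are hypothetical.

```python
ZP_BUFFER_BYTES = 72   # zero-page sprite buffer, reused as a bounce buffer
BYTES_PER_LINE = 5
SCREEN_LINES = 174

def background_chunks():
    """Split the screen into (first_line, line_count) chunks that each
    fit in the zero-page buffer."""
    lines_per_chunk = ZP_BUFFER_BYTES // BYTES_PER_LINE   # 14 whole lines
    chunks = []
    start = 0
    while start < SCREEN_LINES:
        count = min(lines_per_chunk, SCREEN_LINES - start)
        chunks.append((start, count))
        start += count
    return chunks

def copy_background(background, work):
    """Copy background -> work via the small buffer, one chunk at a time."""
    zp = bytearray(ZP_BUFFER_BYTES)
    for start, count in background_chunks():
        lo = start * BYTES_PER_LINE
        n = count * BYTES_PER_LINE
        zp[:n] = background[lo:lo + n]   # BG bank -> zero page
        work[lo:lo + n] = zp[:n]         # zero page -> draw bank

if __name__ == "__main__":
    chunks = background_chunks()
    print(len(chunks), chunks[-1])
```

That gives 12 full chunks of 14 lines plus a final short one of 6 lines, 13 iterations in all.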
Well, I can't see anything inherently unworkable about this. It's much much much simpler than Boulder Dash, it just will take a lot of byte shuffling. I think it's workable, in other words. The frame rate will probably be an issue.
In short, a bitmap screen 32 pixels wide (say, 4 pixels blank at the left/right edges) which is an ICC image 32 PF pixels x 174 scanlines (58 ICC pixels) high. Each ICC pixel (3 scanlines high) can be one of 8 'colours' - the colours as described above. This is analogous to the screen image shown - the limitations are similar, except the bitmap will be a tad shorter in height. On that bitmap we have a static background in front of which we have an arbitrary number of animating sprites. Each sprite can be positioned on any X pixel, and on any ICC-y pixel. The sprites would overlap/mask correctly, with priority in reverse of draw order.
That would actually be a super-cool system. My first thoughts are: "do-able"! My second thoughts are "possibly slow". But you know, even 10fps is acceptable in some games. It would be so super-cool to see that screen with the guys walking about, the blocks actually being pushed...!