Day 2
So, let's go over the basics of a polygon renderer.
All 3D models are made of triangles. You do some funky matrix math on them (coming in a later session) to project their coordinates from world space into screen space. From there, you have to determine which pixels end up on top somehow, and then get into the messy business of actually turning coordinates into pixels.
Now, taking a triangle with corners at (x1,y1), (x2,y2) and (x3,y3), how can we actually determine which pixels are inside of it? The naive way is to check literally every pixel on screen, but this has some pretty obvious performance costs. Now, if our triangle had one edge that was perfectly horizontal, our job would be a lot simpler. We could just figure out the left edge and the right edge for each row, and fill in a line between them.
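To put a number on that cost, here's a rough Python sketch of the naive approach. The 256x192 screen size is an assumption (it matches the 32x24 grid of 8x8 tiles mentioned further down), and edge and naive_fill are just illustrative names. The inside test uses the sign of three edge functions, which is one common way to do it, not necessarily the only one.

```python
WIDTH, HEIGHT = 256, 192  # assumed screen size: 32x24 tiles of 8x8 pixels

def edge(ax, ay, bx, by, px, py):
    # Signed area test: tells which side of the edge A->B the point P is on.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def naive_fill(x1, y1, x2, y2, x3, y3):
    """Test every pixel on screen; return (covered pixels, number of tests)."""
    pixels = []
    tests = 0
    for y in range(HEIGHT):
        for x in range(WIDTH):
            tests += 1
            e1 = edge(x1, y1, x2, y2, x, y)
            e2 = edge(x2, y2, x3, y3, x, y)
            e3 = edge(x3, y3, x1, y1, x, y)
            # Inside (or on an edge) when all three tests agree in sign.
            if (e1 >= 0 and e2 >= 0 and e3 >= 0) or \
               (e1 <= 0 and e2 <= 0 and e3 <= 0):
                pixels.append((x, y))
    return pixels, tests
```

Even a tiny triangle that covers a handful of pixels costs one test per screen pixel, which is the whole problem with this approach.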
Well, in the classical style of the engineer, to solve a difficult problem we just turn it into an easier problem and solve that. If we slice our triangle into two smaller triangles using a horizontal cut through the middle corner (the one that's neither the highest nor the lowest), they're now both the easy case to solve. We just iterate through each row, filling from the left edge to the right.
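The split-and-fill idea can be sketched like this in Python. It's a simplified model, not the final routine: it uses floating point and rounding where real code would step the edges incrementally, and fill_triangle and edge_x are made-up names. The horizontal cut is implicit, it's just the row of the middle corner, where the short edge switches from A->B to B->C.

```python
def fill_triangle(x1, y1, x2, y2, x3, y3):
    """Return the set of (x, y) pixels covered, one horizontal span per row."""
    # Sort the corners top to bottom: A is highest, C is lowest.
    (ax, ay), (bx, by), (cx, cy) = sorted(
        [(x1, y1), (x2, y2), (x3, y3)], key=lambda p: p[1])

    def edge_x(x0, y0, x1, y1, y):
        # X position where the edge (x0,y0)-(x1,y1) crosses row y.
        if y1 == y0:
            return x0  # horizontal edge: just use its first endpoint
        return x0 + (x1 - x0) * (y - y0) / (y1 - y0)

    pixels = set()
    for y in range(ay, cy + 1):
        # The long edge A->C spans the full height of the triangle.
        xa = edge_x(ax, ay, cx, cy, y)
        # The other side is A->B above the cut and B->C from the cut down.
        if y < by:
            xb = edge_x(ax, ay, bx, by, y)
        else:
            xb = edge_x(bx, by, cx, cy, y)
        left, right = sorted((round(xa), round(xb)))
        for x in range(left, right + 1):
            pixels.add((x, y))
    return pixels
```

For example, fill_triangle(0, 0, 8, 4, 0, 8) is cut at row 4, and the spans grow from 1 pixel wide at the top to 9 at the cut, then shrink back down.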
But we can do better still - this is where we're getting into the shaky math that I've come up with myself. In bitmap mode, no matter how you lay out your characters, a byte on screen is going to be adjacent in RAM to the byte below it (within the same 8x8 character) rather than the one next to it. That means filling a row pixel by pixel would waste a lot of our time calculating the address for each pixel on screen - so why not take advantage of the layout while we're at it?
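To make the layout concrete, here's a small Python model of where the byte for pixel (x, y) would land in the pattern data. It assumes the plainest possible setup, which isn't spelled out above: a 256x192 screen with the characters laid out in order, left to right and top to bottom, at 8 bytes per character. pattern_offset is a made-up helper name.

```python
def pattern_offset(x, y):
    """Byte offset of the pattern row holding pixel (x, y).

    Assumes characters are stored in screen order, 32 per tile row,
    8 bytes each (one byte per pixel row of the character).
    """
    tile_row, row_in_tile = divmod(y, 8)
    tile_col = x // 8
    return tile_row * 256 + tile_col * 8 + row_in_tile

# Stepping down one pixel inside a character moves 1 byte,
# stepping right to the next character moves 8 bytes.
step_down = pattern_offset(0, 1) - pattern_offset(0, 0)
step_right = pattern_offset(8, 0) - pattern_offset(0, 0)
```

So walking down a column of a character is a simple increment, while walking along a row means jumping 8 bytes at every character boundary.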
First, we initialize every tile on screen to pattern 0 - this pattern is literally all zeros, will never display a color and will never be changed. But all other patterns are up for grabs - they're arranged in a nice little queue. Every time we try to draw a row of a triangle, we start with the left edge. We first check which pattern is currently there - is it the solid-color pattern 0? If it is, we're going to grab a new pattern from the queue, copy 8 bytes of 0 into it since that's what was going to be displayed before, and write the number of the new pattern we've set aside into its place on the screen.
Now, if it wasn't a solid-color tile, we already know which pattern we're going to be rendering into. For each row of the pattern, it's simple to fill in using some bitshifts, since the whole row fits in a register. If we were setting each bit to 1 as the color of this triangle, we'd calculate a mask with $FF>>(X&7). Then we'd just set NewByte = OldByte | mask for each row. And by changing up the bitwise operations we can easily switch to setting each bit to 0, or to some sort of dithered pattern for an intermediate shade.
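Here's a rough Python model of that left-edge step, just to pin down the bookkeeping. All the names (patterns, free_patterns, name_table, draw_left_edge) are invented for the sketch, the 32x24 grid is the one mentioned below, and the queue starts at pattern 2 only because pattern 1 gets reserved for solid tiles later in this post. The size of the queue is illustrative, not a hardware limit.

```python
from collections import deque

BLANK = 0                              # pattern 0: all zeros, never modified
patterns = {BLANK: [0x00] * 8}         # pattern number -> 8 row bytes
free_patterns = deque(range(2, 256))   # hypothetical queue of unused patterns
name_table = [[BLANK] * 32 for _ in range(24)]  # which pattern each tile shows

def left_edge_mask(x):
    # $FF>>(X&7): sets the pixel at x and everything to its right in the byte.
    return 0xFF >> (x & 7)

def draw_left_edge(tile_col, tile_row, row_x):
    """OR the left-edge mask into a tile.

    row_x maps a row inside the tile (0..7) to the edge's x on that row.
    Returns the pattern number that was drawn into.
    """
    pat = name_table[tile_row][tile_col]
    if pat == BLANK:
        # Never edit pattern 0: take a fresh pattern, start it as 8 zero bytes.
        pat = free_patterns.popleft()
        patterns[pat] = list(patterns[BLANK])
        name_table[tile_row][tile_col] = pat
    for row, x in row_x.items():
        patterns[pat][row] |= left_edge_mask(x)   # NewByte = OldByte | mask
    return pat
```

Swapping the OR for an AND with the inverted mask would clear the bits instead, which is the "setting each bit to 0" variant.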
Now, the right edge is pretty much the same thing, but what about the center? This is where I feel pretty pleased with myself. If we imagine for a second that the colors for each 8x8 tile are themselves a 32x24 bitmap, why couldn't we just do the exact same process on a triangle that's an eighth the scale of the first one? Except, instead of writing it pixel by pixel, we fill each space with pattern 1, which is just 8 rows of $FF. These still display solid blocks of the right color, but we don't have to transfer every single byte of them into the VDP. And if we ever tried to render part of a triangle onto one of them, we'd just turn it into a new pattern initialized with $FF on every row.
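A small Python model of the tile-level idea, under some assumptions: the edge tiles of a row have already been handled pixel by pixel, only the tiles strictly between them count as fully covered, and nothing here recycles patterns that get overwritten. fill_interior and editable_pattern are names made up for the sketch.

```python
BLANK, SOLID = 0, 1
patterns = {BLANK: [0x00] * 8, SOLID: [0xFF] * 8}  # both are never modified
name_table = [[BLANK] * 32 for _ in range(24)]     # the 32x24 "coarse bitmap"
next_free = [2]                                    # next unused pattern number

def fill_interior(tile_row, left_tile, right_tile):
    """Point every tile strictly between the two edge tiles at pattern 1.

    Returns how many tiles were filled (one name-table byte each).
    """
    count = 0
    for col in range(left_tile + 1, right_tile):
        name_table[tile_row][col] = SOLID
        count += 1
    return count

def editable_pattern(tile_row, tile_col):
    """Return a pattern number that is safe to draw into for this tile."""
    pat = name_table[tile_row][tile_col]
    if pat in (BLANK, SOLID):
        # Shared patterns stay untouched: copy their bytes into a fresh one.
        new = next_free[0]
        next_free[0] += 1
        patterns[new] = list(patterns[pat])
        name_table[tile_row][tile_col] = new
        return new
    return pat
```

Filling 4 interior tiles this way touches 4 bytes of the name table instead of 32 bytes of pattern data, which is where the saving comes from.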
Now, obviously there's a million edge cases and minor optimizations to do, but that's the basic idea. For instance, it would probably be easier to use pairs of tiles as our patterns, since one row of a 16x8 tile would fit perfectly into our 16-bit registers, and then we could just rearrange it into two patterns when writing them into VDP RAM (as I understand it, the VDP can't go quickly enough to do two writes one after the other, so a few SWPBs should kill two birds with one stone). Hopefully all that made a bit of sense, and I'll be going over each part of it in more detail as we actually implement it.
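The 16x8 variant can be modelled in Python like this. It's only a sketch of the bit arithmetic: span_mask16, swap_bytes and split_pair are invented names, and swap_bytes just mimics what a byte-swap instruction like SWPB does to a 16-bit word.

```python
def span_mask16(x):
    # 16-bit version of $FF>>(X&7): one mask covers a pair of tiles.
    return 0xFFFF >> (x & 15)

def swap_bytes(word):
    # Exchange the high and low bytes of a 16-bit word.
    return ((word << 8) | (word >> 8)) & 0xFFFF

def split_pair(rows16):
    """Split 8 rows of a 16x8 tile into two ordinary 8x8 patterns."""
    left = [(w >> 8) & 0xFF for w in rows16]   # high bytes: left tile
    right = [w & 0xFF for w in rows16]         # low bytes: right tile
    return left, right
```

For instance, an edge at x = 3 gives the word $1FFF, which splits into $1F for the left tile and $FF for the right one.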
Also, really dumb question, but how do I actually reserve a block of memory for variables? Right now I'm using the XDT cross development stuff since it seemed like the standard, but this really simple question has me a bit stumped.