
F18A 3D graphics


Asmusr


Here's my first attempt at some 3D vector graphics using the F18A: a spinning polyhedron.

 

post-35226-0-21768000-1420752945_thumb.png

 

Use the joystick to spin the object: left/right spins it around the y-axis, up/down around the x-axis, and the fire button around the z-axis (one way only).

 

The demo uses the F18A 4-color bitmap layer to draw the graphics, and the F18A GPU for the 3D calculations and the polygon rendering routine. For each frame the GPU performs the following steps:

  • Clear the bitmap layer
  • Draw the (up to 5) polygons that are currently visible
  • Rotate the vertices to prepare for the next frame
  • Translate the vertices to screen coordinates

Calculations are done using fixed point math, and sine and cosine values are looked up in a table.
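
As a rough illustration of the fixed-point and table lookup idea, here is a minimal Python sketch (not the demo's GPU code). The 8.8 number format, the 256-entry table and the names `fx_sin`, `fx_cos`, `fx_mul` and `rotate_z` are assumptions made for the example; the post does not state which format or table size the demo uses.

```python
import math

FRAC_BITS = 8                  # assumed 8.8 fixed point (not stated in the post)
ONE = 1 << FRAC_BITS           # fixed-point representation of 1.0
TABLE_SIZE = 256               # assumed: one full turn split into 256 steps

# Sine table computed once up front; on real hardware this would be stored data.
SINE = [round(math.sin(2 * math.pi * i / TABLE_SIZE) * ONE)
        for i in range(TABLE_SIZE)]

def fx_sin(angle):
    """Sine of a table-step angle, as a fixed-point value."""
    return SINE[angle % TABLE_SIZE]

def fx_cos(angle):
    """Cosine via the same table: cos(a) = sin(a + quarter turn)."""
    return SINE[(angle + TABLE_SIZE // 4) % TABLE_SIZE]

def fx_mul(a, b):
    """Multiply two fixed-point numbers and rescale the result."""
    return (a * b) >> FRAC_BITS

def rotate_z(x, y, angle):
    """Rotate a fixed-point point (x, y) around the z-axis."""
    s, c = fx_sin(angle), fx_cos(angle)
    return fx_mul(x, c) - fx_mul(y, s), fx_mul(x, s) + fx_mul(y, c)

# A quarter turn (64 of 256 steps) moves a point on the x-axis to the y-axis.
print(rotate_z(10 * ONE, 0, 64))
```

Rotation around the x- and y-axes follows the same pattern with the coordinate pairs swapped.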

 

The main CPU is only responsible for waiting for vertical retrace and activating the GPU at the right moment to draw a frame. It then reads the joystick (which the GPU cannot do) and stores the result in VDP RAM for the GPU to read. This goes on in an endless loop.

 

This all looks fine in the demo, but drawing directly on the visible screen would not be feasible in a game. With 6 polygons I can just manage to clear the screen and draw the polygons before the beam catches up. (Before it looked OK, I even had to change my scanline routine from one that used the F18A GPU PIX instruction to draw a single pixel to a faster one that uses direct memory access to draw 8 pixels at a time.)
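
To show why writing whole memory units beats plotting single pixels, here is a hedged Python sketch of a horizontal span fill for a 2-bits-per-pixel bitmap. It works one byte (4 pixels) at a time for simplicity, whereas the routine described above writes 8 pixels at once. The pixel order within a byte (leftmost pixel in the top bits), the row stride and the name `fill_span` are assumptions, not details taken from the demo.

```python
WIDTH = 192                     # bitmap width in pixels (from the post)
PIXELS_PER_BYTE = 4             # 4 colors -> 2 bits per pixel
STRIDE = WIDTH // PIXELS_PER_BYTE

def fill_span(buf, y, x0, x1, color):
    """Fill pixels x0..x1 (inclusive) on row y with a 2-bit color."""
    full = color * 0x55         # replicate the color into all four pixel slots
    row = y * STRIDE
    first, last = x0 // 4, x1 // 4
    # Masks for partly covered bytes (assumes leftmost pixel in the top bits).
    left_mask = 0xFF >> ((x0 % 4) * 2)
    right_mask = (0xFF << ((3 - x1 % 4) * 2)) & 0xFF
    if first == last:
        mask = left_mask & right_mask
        buf[row + first] = (buf[row + first] & ~mask & 0xFF) | (full & mask)
        return
    buf[row + first] = (buf[row + first] & ~left_mask & 0xFF) | (full & left_mask)
    for i in range(first + 1, last):
        buf[row + i] = full     # four pixels per write, no masking needed
    buf[row + last] = (buf[row + last] & ~right_mask & 0xFF) | (full & right_mask)

screen = bytearray(STRIDE * 192)
fill_span(screen, 0, 1, 9, 3)   # pixels 1..9 on the top row, color 3
print(screen[:4].hex())
```

Only the two end bytes need a read-modify-write; everything in between is a plain store, which is where the speed-up over a per-pixel routine comes from.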

 

So we need double buffering, which is fortunately very easy to do but is limited by the amount of VDP RAM. The demo is using a bitmap size of 192x192 pixels which takes up 9216 bytes of VDP RAM, so two of those would not fit. We also need room for the GPU code (which might fit into the additional 2K VDP RAM only accessible to the GPU), and we need room for the standard VDP tables. So this is already pretty crowded, but...
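
The memory arithmetic can be checked with a few lines of Python. The 16K VDP RAM size and the 192x192, 4-color (2 bits per pixel) bitmap come from the thread; the 160x160 window is a hypothetical size chosen only to show the trade-off, not something proposed in the post.

```python
VRAM_BYTES = 16 * 1024          # standard VDP RAM

def bitmap_bytes(width, height, bits_per_pixel=2):
    """Size of a packed bitmap; 4 colors need 2 bits per pixel."""
    return width * height * bits_per_pixel // 8

one_buffer = bitmap_bytes(192, 192)        # the demo's bitmap
two_buffers = 2 * one_buffer               # what double buffering would need
print(one_buffer, two_buffers, two_buffers > VRAM_BYTES)

# Hypothetical smaller window, only to illustrate the trade-off:
small = bitmap_bytes(160, 160)
print(2 * small, VRAM_BYTES - 2 * small)   # bytes used, bytes left for tables
```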

 

Perhaps we also need room for a third screen buffer: a depth buffer? In the demo I'm using a simple algorithm to remove polygons that are facing away from the 'camera' to remove the backside of the polyhedron. In another algorithm (Painter's) you sort polygons by their depth and draw them back to front, but this too only works for relatively simple scenes. So a standard solution is to calculate a depth value for every pixel you write, compare it to the value from the buffer of any pixel already drawn at the same place, and only draw the new pixel if it's closer to the camera than any pixel already written. Perhaps this is overkill for our level of graphics, I'm not sure what algorithm 8-bit games were using?
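
One common way to remove faces that point away from the camera is to look at the winding order of the projected vertices. The Python sketch below is an illustration of that general technique and is not necessarily the algorithm used in the demo. It assumes each polygon lists its vertices counter-clockwise when seen from the front, in a coordinate system where y points up; the name `is_front_facing` is made up for the example.

```python
def is_front_facing(p0, p1, p2):
    """True if the projected triangle p0, p1, p2 winds counter-clockwise.

    Each point is an (x, y) pair after projection. The sign of the 2D
    cross product of two edges gives the winding direction.
    """
    ax, ay = p1[0] - p0[0], p1[1] - p0[1]
    bx, by = p2[0] - p0[0], p2[1] - p0[1]
    return ax * by - ay * bx > 0

print(is_front_facing((0, 0), (10, 0), (0, 10)))   # counter-clockwise
print(is_front_facing((0, 0), (0, 10), (10, 0)))   # clockwise, facing away
```

For a single convex object this test alone is enough, because the faces that pass it never overlap each other on screen.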

 

One question that perhaps Matthew can answer: Why is there a pixel at the bottom right corner of the bitmap on my hardware? :) It doesn't show up in emulation (js99er.net). Does anyone else see it?

 

 

 

 

Poly3D.zip


Very nice indeed! Perhaps a more reasonable approach for a game, however, would be to just use wireframe graphics in order to avoid the delays in filling and hiding polygons. Coupled with a smaller view window (say only two thirds of the bitmap screen), it might just be possible to get it to work. I have attempted such work within the confines of TI FORTH, using the excellent book Flights of Fantasy as a reference for the necessary calculations, but it was very slow despite using trig lookup tables as you did.


Very nice!

 

 


 

Well, it really depends on what type of 3D graphics you want to do... But a software z-buffer will almost never be the right answer.

 

If you're trying to make a 1st or 3rd person view indoor shooter, there are basically two common approaches you can follow:

  • A "portal" rendering engine (à la Unreal): you divide the world into convex rooms (called "sectors") that are connected via "portals" (typically doorways or windows in levels). First you determine the sector you are in, then you determine the visible polygons in that sector via frustum culling (simply testing whether polys are at least partially within your field of view). For each of the polygons that are flagged as being a portal, you adjust your frustum and recursively render the sector it links to. Everything else within a sector is rendered using the painter's algorithm, which we know will always work since we have a convex space.
  • A BSP rendering engine (à la Quake): instead of storing your polygons in a flat list, you store them in a binary tree. You build the binary tree (offline) by taking a random first polygon and using that to divide your space into two subspaces (hence the binary nature of the tree): a subspace consisting of the polygons that are in front of the plane defined by your first polygon, and a subspace consisting of the polygons that are behind that plane. You then repeat the split within each subspace. When rendering your BSP tree, you start at the root node and simply test whether your viewpoint is in front of or behind the polygon defined in that node, then recursively follow the same procedure for the side of the tree you've just determined. Once you hit a leaf node, you render the polygon there and move back up through your recursion, rendering the polygon at each node, and so on...

For resource-constrained devices, the BSP tree is by far the fastest approach (linear, and no sorting required, just a simple back/front test). However, it was never used in 8-bit games, since Doom was the first game to actually use the technique (although in pseudo-3D, 'cause Doom used lines instead of polys).
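
The BSP traversal described above can be sketched in a few lines of Python. This is a generic illustration of a back-to-front walk, not code from any of the games mentioned. The `BspNode` layout, the plane representation (normal plus offset `d`) and the function names are assumptions for the example, and polygons are represented by plain strings.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BspNode:
    normal: tuple                      # plane normal (a, b, c)
    d: float                           # plane: a*x + b*y + c*z + d = 0
    polygon: str                       # stand-in for the polygon data
    front: Optional["BspNode"] = None  # subtree in front of the plane
    back: Optional["BspNode"] = None   # subtree behind the plane

def side(node, p):
    """Signed distance-like value: >= 0 means p is in front of the plane."""
    a, b, c = node.normal
    return a * p[0] + b * p[1] + c * p[2] + node.d

def draw_back_to_front(node, eye, out):
    """Append polygons to out so that the farthest ones come first."""
    if node is None:
        return
    if side(node, eye) >= 0:           # viewer in front: far side is 'back'
        draw_back_to_front(node.back, eye, out)
        out.append(node.polygon)
        draw_back_to_front(node.front, eye, out)
    else:                              # viewer behind: far side is 'front'
        draw_back_to_front(node.front, eye, out)
        out.append(node.polygon)
        draw_back_to_front(node.back, eye, out)

# Three parallel walls at x = -5, 0 and 5.
tree = BspNode((1, 0, 0), 0, "wall x=0",
               front=BspNode((1, 0, 0), -5, "wall x=5"),
               back=BspNode((1, 0, 0), 5, "wall x=-5"))
order = []
draw_back_to_front(tree, (10, 0, 0), order)
print(order)
```

Each node costs one plane test per frame, so the drawing order falls out of the tree walk without any sorting.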

 

For outdoor rendering, or other non-first-person setups, there's less room for aggressively optimized rendering algorithms. Most efficient would probably be to simply do higher-level culling and occlusion tests (e.g. bounding box tests, octrees, ...) combined with the BSP approach for what are typically called "brushes".

 

Having said that, most 8-bit 3D that I've seen is not really 3D, but uses 2D raycasting to create a 3D effect like in my Wolfie3D demo.

Edited by TheMole


 

Yeah, I think you're right, BSP would be the best approach. A Z-buffer would take up too many resources unless you made it very low-res. But the next thing I will try is probably double-buffering with more polygons.
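
To put numbers on the Z-buffer cost, here is a small Python sketch of the per-pixel depth test together with the memory it would need. One byte of depth per pixel and the 48x48 low-res size are assumptions picked for illustration; the names `zbuffer_bytes` and `plot_with_depth` are made up for the example.

```python
def zbuffer_bytes(width, height, bytes_per_depth=1):
    """Memory needed for a depth buffer of the given resolution."""
    return width * height * bytes_per_depth

def plot_with_depth(color_buf, depth_buf, width, x, y, z, color):
    """Write the pixel only if it is closer than what is already there."""
    i = y * width + x
    if z < depth_buf[i]:               # smaller z = closer to the camera
        depth_buf[i] = z
        color_buf[i] = color
        return True
    return False

print(zbuffer_bytes(192, 192))         # full resolution, 1 byte per pixel
print(zbuffer_bytes(48, 48))           # hypothetical low-res depth buffer

W = 4
colors = bytearray(W * W)
depths = bytearray([255] * (W * W))    # 255 = farthest possible value
plot_with_depth(colors, depths, W, 1, 1, 100, 2)   # drawn
plot_with_depth(colors, depths, W, 1, 1, 200, 3)   # rejected, farther away
print(colors[1 * W + 1])
```

At full resolution the depth buffer alone would exceed the 16K of VDP RAM, and the test adds a read and a compare to every pixel written.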

 

BTW, I have now added the demo as one of the resident items in js99er.net, but be aware that the GPU timing is far from realistic.


I have not had a chance to run the code yet, but as always it looks like it will be great. Keep in mind that the GPU has 16-bit access to the VRAM and you can clear two bytes at a time as long as you stick to even addresses. The blitter in the upcoming V1.6 firmware should also help with clearing the screen between frames, as well as horizontal and vertical fills.

 

I will check on that pixel you mentioned.


As a curiosity I have made a version of the 3D demo that's not dependent on the F18A and uses standard bitmap mode. I'm drawing to a buffer in CPU RAM and copying the full 6K to VDP RAM each frame.

 

post-35226-0-05231400-1420925533_thumb.png

 

Run: E/A#3 P3D2 or E/A#5 POLY2

 

The frame rate is actually a little better than I expected (2-3 FPS), but adding colors would slow it down significantly since this would add another 6K to be copied. And colors would also create problems on the boundaries between polygons.

POLY3D.dsk


Impressive work on this! I'd be really surprised if there isn't more speed to be found. I've seen a spinning cube on just about every other 8-bit-era system so far. It's kind of a demoscene in-joke as to who will be the first to implement it on any given platform (Rasmus, I believe you've just claimed yourself a title here...) Is the TI really this sluggish?

 

Here's one on Atari VCS for example:

http://youtu.be/wcCJM7b9EMU?t=2m58s

 

I know I've seen VIC-20 & ZX Spectrum examples before too. And of course approx. a million on C64.



 

I'm sure it could be optimized if your only objective was to make a spinning cube. Then you could take advantage of the symmetric properties of the cube, for instance, and try to calculate as much as possible in advance. But my polyhedron is not actually a cube, and I have tried to make my routines general rather than optimizing them for a specific purpose. The biggest speed gain could probably be obtained by reducing the part of the screen that is updated. I'm clearing 6K of CPU RAM and copying it to VDP RAM every frame, and that will always take some time, but it could be optimized a bit by running the code from scratch pad.


Now that I'm looking at this again, it seems like you are doing parallel instead of perspective projection. Is that an artistic choice, or are you doing it to save on a division? Any idea if the additional division per vertex would pull down the framerate by much (on the F18A version, that is...)?



 

I don't think a parallel projection would pull down the frame rate. The F18A should have capacity for displaying several more polygons provided you switch to double buffering. But I don't see much point in a parallel projection for a single, simple object.


I might be wrong, but aren't you doing parallel projection now (it is the cheaper of the two operations)? It's a bit difficult to see 'cause of the odd shape of the polyhedron you're using, but if in the demo I rotate over one axis I can clearly see the projected line segments remaining the same on-screen size as they rotate into the screen (towards the back). If you were doing a perspective projection the line segment would become smaller the further away from the viewer it goes. Typically, perspective projection is used in 3D games since it gives a better impression of depth, whereas parallel projection is mostly only used in certain CAD applications and views where the length of a line segment needs to be visually comparable to others regardless of z-position.

 

Note, parallel projection is often called orthogonal or orthographic projection, like in the picture below, so my terminology might be causing some confusion. Just to be clear, from the looks of it you are not doing the second type of projection, right?

 

IC15161.jpg
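
The difference between the two projections, and where the extra division comes in, can be shown with a short Python sketch. The focal length of 256 and the function names are assumptions for illustration; `z` is taken to be the distance from the viewer along the view axis.

```python
def project_parallel(x, y, z):
    """Parallel (orthographic) projection: depth is simply dropped."""
    return x, y

def project_perspective(x, y, z, focal=256):
    """Perspective projection: scale by focal / z, so far points shrink.

    z must be greater than 0. This is the per-vertex division the
    parallel projection avoids.
    """
    return x * focal // z, y * focal // z

# The same point at two depths: parallel keeps its size, perspective halves it.
for depth in (256, 512):
    print(depth, project_parallel(100, 50, depth),
          project_perspective(100, 50, depth))
```

With a fixed set of depths the division could be replaced by a multiplication with a reciprocal looked up in a table, in the same spirit as the sine table.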



 

Sorry, I was writing in a hurry. What I meant to say was that I don't see much point in doing perspective projection for a single object. You're right, I am doing an orthographic/parallel projection.


I can't help with more VRAM at this point, but maybe I can give you back some cycles. I'm finally getting back to testing the changes I made for the V1.6 F18A firmware update, and the blitter will be pretty fast. Here are some capabilities:

 

constant -> destination: 1 byte every 10 ns, or 163.84 us to clear all 16K of VRAM, i.e. 100 million bytes/sec.

source -> destination: 1 byte every 20 ns, i.e. 50 million bytes/sec.

Clearing 9K would take about 92.16 us, or approx. 1.4 scan lines (TI scan lines @ ~64 us per scan line). In one TI scan line you could copy (src -> dst) ~3200 bytes, or fill (constant -> dst) ~6400 bytes.
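
These figures are easy to reproduce. The Python sketch below uses only the numbers quoted above (10 ns per byte for a fill, 20 ns per byte for a copy, about 64 us per scan line); the function names are made up for the example.

```python
FILL_NS_PER_BYTE = 10       # constant -> destination
COPY_NS_PER_BYTE = 20       # source -> destination
SCANLINE_NS = 64_000        # ~64 us per TI scan line

def fill_ns(n_bytes):
    """Time in nanoseconds to fill n_bytes with a constant."""
    return n_bytes * FILL_NS_PER_BYTE

def copy_ns(n_bytes):
    """Time in nanoseconds to copy n_bytes from source to destination."""
    return n_bytes * COPY_NS_PER_BYTE

print(fill_ns(16 * 1024) / 1000, "us to clear 16K")
print(fill_ns(9216) / 1000, "us to clear the 192x192 bitmap")
print(fill_ns(9216) / SCANLINE_NS, "scan lines for that clear")
print(SCANLINE_NS // FILL_NS_PER_BYTE, "bytes filled per scan line")
print(SCANLINE_NS // COPY_NS_PER_BYTE, "bytes copied per scan line")
```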

 

The blitter also has an 8.8 signed scale factor for both the source and destination addresses.


  • 2 weeks later...

I was just thinking, the GPU can easily test the current scan line so you could clear-behind during the active display. With the DMA, clearing half of the screen should take less than one scan line, and you could even then render the top half of the display. That would leave you the vertical refresh and top half of the display to clear and draw the bottom. A bit of a pain I know, but possible.


  • 2 weeks later...

Actually I should call the new feature a "DMA", not a "blitter", since it works on bytes not pixels. If there was an 8-bit per pixel mode (and enough RAM) then it could be considered a blitter, but that is not the case with the F18A.

 

Blitters always work with bytes or blocks of bytes, which is what graphical data is. DMA is a different matter. The blitter gets its name from BLIT (BLock Image Transfer) operations. If the feature is a block data transfer similar to an image transfer but more general purpose, you could call it a BLock Data Transfer (BLDT), or a BLDTer if you like. However, DMA adopted blitter technology in what is known as Block Mode DMA, so we can effectively have a block mode DMA. In fact, the Commodore 128 had this in its MMU features.

 

BL-DMA (BLock mode- DMA) might be an appropriate terminology if that is what describes the DMA operation style.

 

In addition to using this F18A VDP, I'm thinking of using GreenArrays' EVB001 board, which is equipped with two GA144A12 chips, with my TI-99/4A and possibly also the Amiga 1200. This would be an interesting and powerful addition to the TI-99/4A, especially in the Forth programming realm.

Edited by Wildstar
