Jump to content
IGNORED

First person shooter on the TI? :)


Bones-69

Recommended Posts

This makes perfect sense. I am using 9938 Graphics mode six. Comparison based on some existing code I am modifying:

 

Old method: Use VDP high speed command to copy up to 54,272 bytes per scroll. Write the next line to the screen. Very slow.

 

New method: Process and write the next displayable line to the inactive area. A single pixel row requires 512 pixels x 1 pixel high x 1 byte/2 pixels = 256 bytes. A character-height scroll requires 256 bytes/row * 8 rows = 2048 bytes. When it is time to scroll, adjust register #23. ;)

The vdp is a thingy of 1985. Doing full screen scroll by brute force @50 fps is simply impossible. The new method is easily doable instead.

the msx guys do this:

wait for vblank int

when vblk, they write 256 bytes via vdp data port and update the scroll reg. the msx cpu is a z80 @3.5Mhz and in this vblank time they are able to move about 800-900 bytes before vertical retrace starts. So writing 256 bytes is doable. they do also sprite plane rotation by writing the entire SAT (128 bytes) + color table (512 bytes) = 640 bytes, reaching about the limit.

 

because doing full sprite plan rotation is cpu intensive, because of colorsat updates, i suggest you this:

- prepare 4 different SATs on VRAM, each sat have sprite color table rotated by an amount of 8 sprite planes.

first SAT have color table for sprites 0-31

second have color table rotated by 8 entries 8-31 then 0-7

third have color table rotated by 16 entries 16-31 then 0-15

fourth have color table rotated by 24 entries from 24-31 then 0-23

then restart from first.

- at every frame point the VDP SAT base address to one of the SATs.

- write ONLY the 128 part of the SAT @ the correct address (the current SAT address)

in this way @ every interrupt, you need to update only the 128bytes part of SAT, using to different, pre-rotated color tables and switching the SAT base address to one of the 4 SATs.

Of course, if you need to apply color changes, you need also to update 4 color tables, but you do this only one colortable per - frame

 

there are other ways, like using vdp commands during frame or in VBLK time

Link to comment
Share on other sites

With speed Tursi recently showed us the F18A is capable of and the extra functions, as well as the programming abilities of some in this forum. I have no doubt a First-Person Shooter will become a reality. It's gonna be a home run! I'm betting it'll be considered the best TI game of all time. One thing is certain, it'll set a new paradigm for all to follow.

Link to comment
Share on other sites

It might not look like much, but I'm proud to finally present the first test version of my raycaster that finally looks somewhat like it's casting rays (if you squint really hard)! Huge thanks to Tursi for helping me debug this and for the assembler version of the algorithm to render the wall slices. As it stands the engine is still way too slow to be usable, and I'm not sure this algorithm is the best way to go, but it's a start and a relatively good way to evaluate the rendering format. Ignore the random gaps between walls and the unwanted dark stripes on some of the walls. These are basically rounding errors in the casting routines that still need to be ironed out.

 

For those that want to try it out I've attached the program in FIAD format, just unzip and drop WOLFIE3D into a DSKx directory or dump it on a disk image (depending on your emulator of choice), and run WOLFIE3D from EA#5. Use the arrow keys to walk around and know that there's no collision detection so you'll walk straight through the walls (and if you move to far out of the map you're likely to crash the program). I can recommend using Classic99's overdrive mode, or MESS's throttle mode to speed things up a bit (although it can become twitchy on really fast systems then).

 

I've made a quick video as well, for those that don't want to download. I turn on CPU overdrive about halfway throught the video so you have an idea of how it runs with and without.

 

http://www.youtube.com/watch?v=OIG33UBHDG8

 

I'll clean up the source code and post it for those that are interested when I get some time. Just wanted to get this out there...

WOLFIE3D.zip

Edited by TheMole
  • Like 9
Link to comment
Share on other sites

Source attached. You're going to need gcc to compile this but once you've got the environment installed it should be easy to build. Just edit the Makefile to reflect to your compiler's path and where you want to the TIFILES to be installed, issue "make install" from the command line and you should be good to go. I included Tursi's libti99 (binary and headers only), 'cause you need that to build, but if you stick to the directory structure as in the zip file you should be good. The cart version doesn't work (too large, so it wraps onto itself and the world map is messed up), so use the EA#5 version.

 

I haven't spent much time cleaning up the code, so all the important bits are still in one file (wolfie3.c). Other than that, it should be quite self explanatory. The one tricky part in the code (to me at least) is the draw_slice assembly code, which is compiled but copied into scratchpad memory for speed and that requires some extra trickery that might look strange at first.

wolfie_src.zip

Edited by TheMole
Link to comment
Share on other sites

Thank you for the code. I didn't know how the algorithm worked, e.g. that you only cast one ray for each column.

 

I made a VDP RAM dump of a typical screen and imported it into Magellan. I only counted around 25 unique patterns in the first pattern table, and if this is the general case that a small set of patterns is used over and over, I guess you could do this a good deal more efficiently by updating the name table instead of the pattern tables. Any thoughts?

Link to comment
Share on other sites

Yeah, the raycasting part is a fairly lean algorithm and is relatively cheap to do on the 9900, it's the blitting from RAM to VRAM that takes by far most of the cycles currently.

 

But contrary to what you would intuitively think, I don't upload new patterns every frame. In fact, all patterns in the first two pattern tables are initialized to "0F0F0F0F0F0F0F0F" at the start of the program and they stay that way while the program is running. The first two nametables are initialized with 0 - 255 (so every position in every nametable points to a unique entry in the pattern table). The actual drawing is done by fiddling with the color tables. Since we're in bitmap mode, every 8 pixels can have two colors. By defining every pattern as "0F" we basically split every 8-pixel column in two pseudo-pixels: the odd pseudo-pixels are set by changing the background color in the color table entry for that pattern, the even pseudo-pixels by setting the foreground color. This gives us a horizontal resolution of 64 pseudo pixels. Since we only use the first two nametables, we have a maximum vertical resolution of 128 pixels, which we divide in two again by using the fake scanline effect. This gives an effective resolution of 64x64 pixels and since every byte in the color table defines two pseudo-pixels, we "only" need to upload (64x64)/2 = 2048 bytes to the VDP every frame.

 

So that's how it works today, since that gives us something which is all things considered relatively close to a flat framebuffer. Your idea sounds intriguing, 'cause the name tables are of course only 512 bytes. But in order to really make it more efficient you'd need to move to graphics mode instead of bitmap mode 'cause in bitmap mode you'd have to upload the color tables as well which are always going to be 2k (if rendering using scanlines, otherwise it's 4k). In graphics mode the color table is only 32 bytes, but we are going to run into color clashes (where a wall meets the ceiling and another wall everytime it's not on an 8x8 character boundary) so we'd have to work around that somehow and I'm not really sure if that would look nice (might make movement look a lot more choppy since you'll be forcing the transitions per character block). Then there's the question of how many patterns you need to model every theoretical wall slope. Trying to figure this out of the top of my head I would say that you can have 15 different slopes in 64 positions in a single pattern, so 960 different patterns (this definitely includes some duplicates, so it's maybe half of that), which is of course too much, especially if you want to support more than one color of wall.

 

All that said however, this is definitely the area in which optimizations are to be found. It's by far the most taxing part of the algorithm.

 

I have also been thinking about changing the type of raycasting I'm doing, moving to some form of hybrid raycasting/2D portal engine. The current algorithm is based on how Wolfenstein does it: the world is defined simply by putting cubes on a grid (the worldmap variable). I could instead define a world made up of rooms defined by 2D line segments (like a floor plan). Instead of checking against every square border in the map for every ray and testing whether or not the square is occupied by a wall (which is very much an iterative process: every frame almost 1000 inner loops are run before the 64 rays are cast), you'd check for an intersection of the ray with each of the line segments that make up the room, take anywhere between 4 to 10 line segments (aka walls) per room on average and it's clear that you'll need a lot less iterations. Finding the intersection point between a ray and a line segment is a more difficult calculation but you'd need to do a let less of them. On top of that, it's easy to optimize finding the right line segment by simply ordering them in clockwise fashion and starting from the same line segment you found an intersection with during the last ray. That way either your first or in worst case scenario second line segment will be the right one so your average number of intersection tests per ray will be between one and two.

The major limitation would be that the rooms need to be convex to avoid having your ray intersect with more than one wall per room, but apart from that there's three major benefits to this approach, I think:

  1. You can do more than just a grid of blocks and walls can meet at other angles than just 90°
  2. Storing your rooms will take less RAM than the current approach
  3. It should be faster than the current approach.

You'd need to do something clever to render through doorways however. That's where the portal engine ideas come in (the Unreal engine was the first famous portal engine, but that was of course a full 3D polygon renderer): a door is simply a special type of wall, called a "portal" that links one room to another. If your ray intersects with a line segment that represents a portal, instead of simply taking that distance and rendering your slice using that the engine will need to start testing against the line segments in the adjoining room as well. This runs the risk of becoming a recursive problem, but being a little careful with your level design should be enough to make sure that never happens.

 

Wow, I'm rambling... sorry about that, I'll shut up now :)

Edited by TheMole
  • Like 1
Link to comment
Share on other sites

You just know this would run like a mo-fo on the F18A :D

 

It would be an interesting challenge, as there would have to be cooperation between the 9900 (supplying map data somewhere in VDP for the GPU to translate and render).

There would be plenty of time left on the 9900 side for playing music and reading the joysticks, I reckon. It's beyond my abilities/time budget for me though, sadly.

  • Like 1
Link to comment
Share on other sites

It shouldn't be that hard to leverage the F18A. The most logical first step would be to run the draw_slice function on the GPU instead of the main CPU and that piece of code already is 9900 assembly in the current code. I could imagine simply uploading the calculated sliceheight and color for every ray, setting a pointer to this table at the end of the raycasting function and letting the GPU actually render the slices. This would not only save the CPU the trouble of calculating the color table, but would reduce the number of bytes to upload to the VDP every frame from 2048 to 128. Just that alone will lead to a very sizable improvement in framerate, but the TI would still be the one doing the actual raycasting (which in my mind makes it less "cheating" :) ).

 

An added improvement would be to have the GPU render the slices texture mapped instead of using a flat color. Something close to a "real" wolfenstein lookalike (except for the resolution) should be perfectly doable.

 

But... I don't have my hardware setup, and my F18A is still safely tucked away in a drawer and it doesn't look like that'll change in the next couple of months. So I'm stuck with emulation, which means no real way to develop F18A programs for me...

Link to comment
Share on other sites

  • 2 weeks later...

Rasmus' post got me thinking. I wasn't really correct when I said you'd need over 500 different patterns to pull this of by just changing the name table in graphics mode instead of the color table in bitmap mode. In bitmap mode, we effectively have pseudo-pixels that are made up of 4x2 hardware pixels, which means that all permutations are made with only 50 patterns. Unfortunately, you need these patterns twice in order to do shaded walls, resulting in 100 patterns for walls that can only come in one color (but two shades of that color). So in theory you could do two differently colored walls with 200 patterns and still only update the name table. Problem would then be color clashing artifacts when a slice changes color mid-pattern, but it would be possible.

 

So I wondered how it would look if I changed the pseudo-pixels to a size of 8x2 (losing half of my horizontal resolution, but keeping the vertical resolution). That way you only need 8 patterns per shade (although unfortunately they need to be split out on two color table entries in order to have different colors for the floor and ceiling).

 

The result runs much faster as you can see in this video (classic99 without overdrive), but in my humble opinion at times gets too ugly to be useful (especially when you're close to a wall on your side, look at the lower part of the left-hand side wall in the youtube thumbnail below or when I try to go through the blue doorway in the video):

http://www.youtube.com/watch?v=bCY0wM2U788

 

I also added collision detection (but not wall sliding physics, so it doesn't feel very nice yet... at least you can't walk out of the map anymore :) )

TIFILES version attached, let me know if someone wants to see the source code, the main raycasting function is exactly the same, but the draw_slice function is now all in C, and the blitting to the VDP is now just a single vdpmemcpy command from Tursi's lib.

 

WOLFIE3D.zip

  • Like 7
Link to comment
Share on other sites

That's pretty impressive, I'd say it's fairly acceptable!

 

I would note that vdpmemcpy is going to be slower than the copy loop we did - one of the tricks that I put into the copy loop to speed it up was to skip every other scanline with a read instead of a write -- a read will increment the VDP address with half as many waitstates as a second write. It made a visible difference in the performance, I would wonder if it still would on this version!

 

(edit) Ah.. never mind! I just read the description again and I understand why that doesn't make sense. ;)

 

Still, I think that's potentially fast enough for a game, I'm sure there's some tweaking possible :)

Edited by Tursi
  • Like 1
Link to comment
Share on other sites

Is the plan for you to develop this into a full game?

 

You know, I never had a game in mind when I started this, just wanted to see if it could be done. If anyone has a good idea for a game, I wouldn't mind taking a stab at it though.

 

Biggest problem is that with the current horizontal resolution it's near impossible to include objects in the engine, so I'd need to fix that first.

Edited by TheMole
Link to comment
Share on other sites

Been looking at Scarabeus for the C64, it was released in 1985 and as far as I can remember was the first sort of 3d game of its type, wonder what technique it used for it's 3d?

 

 

My guess would be they are using pre-calculated animations in this one (seeing how the player is always smack-dab in the middle of the corridor, always stops exactly at intersections and always turns in 90° increments). But given how many pixels change on screen every frame I'd say that this is unfortunately going to be a little bit too difficult for our little TI to pull off. The C64 CPU has direct access to VDP memory and that makes a huge difference in terms of how many updates you can push to the screen.

Link to comment
Share on other sites

 

You know, I never had a game in mind when I started this, just wanted to see if it could be done. If anyone has a good idea for a game, I wouldn't mind taking a stab at it though.

 

Biggest problem is that with the current horizontal resolution it's near impossible to include objects in the engine, so I'd need to fix that first.

 

Obviously, a 3D shooter al la Wofenstein would be the obvious choice :) But as you stated, there might be significant limitations to doing this, so maybe for starters a simple 3 D maze which is randomly generated every time, with some creatures scurrying after you and a time limit to find the exit... Here is a great 1981 game for the ZX81 computer that would be a perfect fit for this called 3D Monster Maze:

 

 

"The game uses a 16-by-18 cell maze which is randomly generated.[3][4] Initially the T. rex lies in wait. Once the player starts moving, the beast begins hunting. Thereafter, the T. rex may either calm down (if the player goes into a part of the maze that is far enough away), or become more active as the player comes closer. If the T. rex gets a direct view of its prey, the monster will run directly at the player.[3]

The T. rex anxiety level, reported to the player as a statement in the status line, provides an indirect clue to the player's relative distance from the monster. These statements are: REX LIES IN WAIT, followed by HE IS HUNTING FOR YOU, FOOTSTEPS APPROACHING, REX HAS SEEN YOU, and RUN HE IS BESIDE YOU or RUN HE IS BEHIND YOU. The player's speed is greater than the monster's, thus it is possible to escape by running (unless the player is trapped in a dead end).[5] The player can manually map the maze on a piece of paper with each step, but this becomes increasingly difficult as the pace increases. The fast pace can also lead to hard keyboard presses, which, in turn, can shake the computer/16K memory expansion connection, and lead to a sudden reset with several minutes worth reload time.[6]

Points are awarded for each step made by the player any time the dinosaur is on an active hunt. Since the player runs faster than the monster, it is possible to accumulate points by running around in circles with the monster just a few steps behind. Points are also given upon successfully getting away through an exit and into another maze.[7]

When the game ends, the player is informed about being "sentenced to roam the maze forever", and then can either "appeal" or continue playing again in the last maze. If the appeal is attempted, it is rejected with 50% probability, in which case the player is sent back to roam the previous maze again. An appeal which is accepted effectively results in the computer self-reset via BASIC's NEW statement

.

Edited by Vorticon
  • Like 1
Link to comment
Share on other sites

My guess would be they are using pre-calculated animations in this one (seeing how the player is always smack-dab in the middle of the corridor, always stops exactly at intersections and always turns in 90° increments). But given how many pixels change on screen every frame I'd say that this is unfortunately going to be a little bit too difficult for our little TI to pull off. The C64 CPU has direct access to VDP memory and that makes a huge difference in terms of how many updates you can push to the screen.

This looks very similar to Episode IV for the MSX1 which I beleive uses the lowres bitmap mode so in theory the TI should be able to handle it.

 

  • Like 1
Link to comment
Share on other sites

Rasmus' post got me thinking. I wasn't really correct when I said you'd need over 500 different patterns to pull this of by just changing the name table in graphics mode instead of the color table in bitmap mode. In bitmap mode, we effectively have pseudo-pixels that are made up of 4x2 hardware pixels, which means that all permutations are made with only 50 patterns. Unfortunately, you need these patterns twice in order to do shaded walls, resulting in 100 patterns for walls that can only come in one color (but two shades of that color). So in theory you could do two differently colored walls with 200 patterns and still only update the name table. Problem would then be color clashing artifacts when a slice changes color mid-pattern, but it would be possible.

 

So I wondered how it would look if I changed the pseudo-pixels to a size of 8x2 (losing half of my horizontal resolution, but keeping the vertical resolution). That way you only need 8 patterns per shade (although unfortunately they need to be split out on two color table entries in order to have different colors for the floor and ceiling).

 

The result runs much faster as you can see in this video (classic99 without overdrive), but in my humble opinion at times gets too ugly to be useful (especially when you're close to a wall on your side, look at the lower part of the left-hand side wall in the youtube thumbnail below or when I try to go through the blue doorway in the video):

http://www.youtube.com/watch?v=bCY0wM2U788

 

I also added collision detection (but not wall sliding physics, so it doesn't feel very nice yet... at least you can't walk out of the map anymore :) )

TIFILES version attached, let me know if someone wants to see the source code, the main raycasting function is exactly the same, but the draw_slice function is now all in C, and the blitting to the VDP is now just a single vdpmemcpy command from Tursi's lib.

 

Very impressive!

  • Like 2
Link to comment
Share on other sites

  • 3 weeks later...
  • 1 year later...

A little gem from 1985, "I of the Mask" by Sandy White, he was one of the most innovative programmers for the ZX Spectrum, this is a very early attempt and very convincing 3D "almost first person" game.

 

Strange that something like this could be done on an unspectacular little box of steel and rubber with bad sound and virtually no hardware graphics support.

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...