
Road Rash pre-alpha on Jaguar at 30 fps


VladR


- .long alignment on labels is occasionally ignored (I still haven't figured out why - but it's clearly a bug in the tool), so I had to come up with a workaround in code

Which assembler and linker is this? Can you provide a short example that demonstrates your problem?


Seeing this thread bump reminded me of a question I forgot to ask. Given that the 'theoretical' maximum resolution most people will be able to run Jaguar stuff at on a TV or RGB monitor is 320x400/480, 640x240, or even 640x400/480-ish (sorry - my memory of what CRTs / 15 kHz monitors can do is very hazy), other than for theoretical tests of what the Jag can push, what's the 'best' or most appropriate resolution to aim for in-game? Other than memory and pixel-shifting constraints, what would be the best resolution to target?

Best for what?

If you want maximum performance, use the lowest resolution (and the smallest number of colors).

If you want maximum visual quality, do the opposite.

If you're looking for something more-or-less balanced, roughly 320x240 in 16-bit (CRY or RGB) mode is probably the sweet spot, and that's what most Jag games use (and games for other consoles of that era are similar).

 

is there a chance that a game (with all associated 'AI' and music and sounds and objects) could run hi-res or interlaced?

LinkoVitch's Reactris runs in interlaced mode.

Edited by Zerosquare
  • Like 3

Which assembler and linker is this? Can you provide a short example that demonstrates your problem?

I don't have access to the code right now, but it's usually something as simple as:

 

.long               ; request longword (4-byte) alignment

MyJumpLabel:        ; occasionally ends up only word-aligned - hence the workaround

 

 

During the build I display all symbols, so I can always check each address - and now that I have runtime hex output, I can confirm the addresses are only word-aligned.

 

This is not exclusive to the GPU section; it happens in the 68000 section too, and it's been a source of great frustration, especially in the past when I didn't have the debugging functionality I have now and didn't understand why adding two unrelated lines of code suddenly broke stored values in memory (way down the line). Back then, without the linker symbol printout, it took a while to figure out why it was printing values from different variables (especially if they were byte-sized, not word-sized).

 

As for the versions, JagChris will probably want to kill me (rightfully so, might I add), but I'm still using the original, oldest versions. I don't like to break or mess with my build environment, and now that I finally know exactly what's going on and have code workarounds in place, it's not a huge deal anyway...


Seeing this thread bump reminded me of a question I forgot to ask. [...] Other than memory and pixel-shifting constraints, what would be the best resolution to target?

From a 2D art asset production standpoint, non-square-pixel bitmaps are harder to draw. I don't think Paint Shop Pro has that capability; I think I've seen it in Photoshop, where you can define rectangular pixel dimensions. For example, 640x240 means a 2:1 horizontal ratio, but 512x240 means 1.6:1, which is counterintuitive.
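A quick sanity check on those ratios (a sketch; it assumes a 4:3 display showing 240 visible lines, where a 320-pixel-wide mode gives square pixels):

#include <stdio.h>

/* Pixel aspect ratio of a Jaguar framebuffer on a 4:3 CRT, assuming 240
   visible lines; at that height, ~320 pixels per line would be square. */
static double pixel_aspect(int width) {
    return width / 320.0;   /* >1.0 means pixels are narrower than tall */
}

int main(void) {
    printf("640x240 -> %.1f:1\n", pixel_aspect(640));  /* 2.0:1 */
    printf("512x240 -> %.1f:1\n", pixel_aspect(512));  /* 1.6:1 */
    return 0;
}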

 

From a 3D standpoint, it doesn't matter: if you write the engine generically enough, higher resolution only results in sharper texels, not aspect-ratio issues. This is where the higher resolutions make real sense, as they greatly benefit visual quality (less shimmering, more texture detail visible from a distance, and so on). Of course, with the obvious caveat: performance.

 

As for the interlaced resolutions (480/576 lines), I'm of the opinion that they actually make sense now that almost everybody has an LCD, since LCDs can't reproduce flicker - they just show all the scanlines. I haven't seen it myself, but I've read plenty of threads where people described interlaced resolutions on LCDs as perfectly crisp and flicker-free (obviously).

 

Today, for a 2D game, I would target a framebuffer of 768x480: 768, because that's the closest line width the OP understands, and just artificially reduce the gameplay area by 32 pixels on each edge. The OP can easily display a 720-pixel scanline in one pass (usually 704 physically visible, at least on my end, based on my video register setup).

 

 

 

All of these look great, but is there a chance that a game (with all the associated 'AI', music, sounds and objects) could run hi-res or interlaced? Even a maximum of 720x480 would be amazing. I recall being slightly disappointed that I never saw any Jag games run any action in hi-res / interlace mode, especially since Atari harped on about the theoretical high resolutions it could do.

First of all, lots of games process music and run purely on the 68000, without ever touching the GPU/DSP, yet still get enough bandwidth to play some audio.

 

The Jag absolutely has enough bandwidth to play audio in hi-res. Just check my older transparency test, where I clear and display (using the Blitter) a 1536x200 framebuffer and then cover it fully with transparent tree bitmaps - meaning the OP has to process the 1536x200 a second time. And during all that, an idle loop on the 68000 is constantly - 100% of the frame time - banging on the bus (doing nothing, but slowing down the system very effectively nonetheless).

 

Transferring 1-2 KB of music data each frame to the DSP's 8 KB cache is practically nothing compared to the obnoxious amount of data the Blitter/OP have to move each frame at that resolution (~300 KB vs. ~64 KB at 320x200).
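For scale, a back-of-the-envelope check of those per-frame volumes (a sketch assuming 8 bits per pixel, which matches the ~300 KB and ~64 KB figures above):

#include <stdio.h>

int main(void) {
    /* 8bpp framebuffers, bytes per frame */
    int hires  = 1536 * 200;   /* 307,200 bytes, ~300 KB */
    int lowres =  320 * 200;   /*  64,000 bytes,  ~64 KB */
    int music  = 2 * 1024;     /* worst-case 2 KB of music data to the DSP */

    printf("hires frame: %d bytes, music share: %.2f%%\n",
           hires, 100.0 * music / hires);   /* music is well under 1% of it */
    printf("lowres frame: %d bytes\n", lowres);
    return 0;
}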

 

As for Blitter-free games - e.g. 2D platformers and top-down screen-based games, where you only adjust the OP list of bitmaps (or use the Blitter only when switching screens) - I'd argue they should be able to play some simple MIDI even via the 68000 at that resolution, let alone from the DSP.

 

One of my desires for the Jag is getting something like Pitfall II running in PAL's 1382x576. That would look magnificently sharp even on a large LCD. No framebuffer here (at 256 colors it would be ~780 KB! - clearing that would take forever, let alone filling it through the Blitter); just manipulating the OP list of all the 2D tiles and letting the OP do what it was designed for. Since rendering would be done entirely by the OP, the GPU/DSP/68000 are free to handle audio and updating the OP list (which you have to do anyway, even if there's just one framebuffer bitmap in it).

 

Not sure I'll get to it this year, though - still so much to experiment with...

 

This year, for sure, I want to try the H.E.R.O. 3D engine at 1382x576 (12x more pixels than 320x200). If I could fit all the texturing under 4 vblanks (i.e. 12.5 fps PAL / 15 fps NTSC), it'd still be quite smooth and playable, and the recent Blitter phrase-texturing throughput tests I did at 1568x200 hint that it actually should be within the remote realm of possibility :)

  • Like 3

As for the versions, JagChris will probably want to kill me (rightfully so, might I add), but I'm still using the original, oldest versions. I don't like to break or mess with my build environment, and now that I finally know exactly what's going on and have code workarounds in place, it's not a huge deal anyway...

Well, yeah. I can totally see taking up GPU memory and cycles creating workarounds for problems that could probably be fixed by updating to the latest versions of rmac/rln.

 

That's not batshit crazy at all.

Edited by JagChris
  • Like 3

Well, yeah. I can totally see taking up GPU memory and cycles creating workarounds for problems that could probably be fixed by updating to the latest versions of rmac/rln.

 

That's not batshit crazy at all.

Actually, it's not - given the direction time flows in our universe. Any time spent from now on updating the tools (even if it's just 30 minutes) is time totally wasted, as:

1. it's not going to bring back the time I wasted troubleshooting this issue over the last few months,

2. it won't save any future time (as I already have the workarounds in place).

 

In other words, if the tools were not buggy, then instead of the time I wasted over the last few months debugging what the hell was going on, I'd already have implemented:

- road hills/curvature

- AI

- audio

- at least 3-4 additional renderpaths, experimenting with various rendering techniques (yes, there's still a lot to experiment with)

 

And that's a very conservative estimate, given the compound productivity of my high-focus weekends (the more features you implement initially, the more additional ones you manage to implement as a bonus - it's like a turbo; remember the LOD terrain?).


Small update from this morning and yesterday:

- since I now have GPU chunks, I can separate the 3D pipeline substages into separate chunks, leaving more cache for the code & data of each chunk - which I did today with the road texturing

- there was a lot of idle downtime while the GPU waited for the Blitter to finish copying the current scanline into the framebuffer

- I've finally implemented the scanline double-buffering codepath - i.e. while the Blitter is blitting the current scanline, the GPU prepares the next scanline in parallel, without waiting (see the sketch after these notes)

- this reduced the wait time by exactly 50%, so there's still 50% of the Blitter wait time that I can use in the future for additional effects (or features)

 

- what's interesting is that the road texturing time dropped below 33% of frame time, meaning I could have triple the amount of road texturing and still keep 60 fps

- if we go for 30 fps, that's (3+3) = 6x the amount of texels occupied by the road

- if we're willing to drop to 20 fps, we can have (3+3+3) = 9x the texels. That's quite a substantial amount of texturing for a Jaguar, though not many genres play well at 20 fps (then again, lots of games play in the 15 fps range)

- but something like Legend of Grimrock / Dungeon Master doesn't need 60 fps; 15-20 fps is more than enough

- and this is still just the Blitter's slowest mode (pixel transfer), not phrase transfer (which is obviously much faster)

 

- in practice, this means that if I refactor the building texturing to use the same approach (at the cost of low-res textures), it should be possible to run it together with the road at 60 fps

 

- since there's no way the road curvature/hills/AI/music/input/trees will eat a full vblank on the GPU, the worst-case scenario is 30 fps (i.e. 2 vblanks) for all elements of the game

- and we still have 2 frames' worth of DSP time and 68000 time (both doing nothing right now, except for the 68000 hogging the bus)

 

- I'm getting more confident that on PAL, the city section of Road Rash, assuming phrase Blitter transfers, could actually run at a full 50 fps on the Jag (including music)
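For reference, a minimal sketch of that scanline double-buffering in C (blitter_busy(), blit_scanline() and compute_scanline() are hypothetical stand-ins for the real Blitter-register polling and the GPU inner loop):

#include <stdint.h>
#include <stdbool.h>

#define SCANLINE_WORDS 384   /* e.g. 768 pixels at 8bpp, as 16-bit words */

extern bool blitter_busy(void);                         /* poll Blitter status */
extern void blit_scanline(const uint16_t *src, int y);  /* kick off a blit     */
extern void compute_scanline(uint16_t *dst, int y);     /* GPU texturing work  */

void render_road(int height) {
    static uint16_t buf[2][SCANLINE_WORDS];
    int cur = 0;

    compute_scanline(buf[cur], 0);                 /* prime the first line */
    for (int y = 0; y < height; y++) {
        blit_scanline(buf[cur], y);                /* Blitter drains 'cur' */
        if (y + 1 < height)
            compute_scanline(buf[cur ^ 1], y + 1); /* GPU fills the other  */
        while (blitter_busy())                     /* sync before reuse    */
            ;
        cur ^= 1;                                  /* swap buffer roles    */
    }
}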

 

I'll now go and refactor the building rendering into a second GPU chunk (it may take a few days) and integrate it with the road and trees, so we get closer to something playable...

  • Like 6

I had a few hours this morning, so I decided to attempt the Jag's panacea: phrase blitting, which means using the Jag's full 64-bit bandwidth (8 bytes per transfer) - in huge contrast to transferring just 1 byte at a time (pixel transfer).

 

If you recall, I'm generating only the left half of the road on the GPU and creating the right half via the Blitter's mirroring feature (the X_SIGNSUB flag) - making two Blitter calls per scanline.

Unfortunately, just as the docs said (and as I confirmed this morning), the Blitter wasn't wired to reverse a bitmap in phrase mode, so the right half must stay in the slow pixel mode. A potential workaround would be to temporarily use the second scanline and reverse the bytes there for free (in parallel, while the Blitter is blitting the left half), but that's at least a page of code, and it doesn't look like I have that much space available in cache. It's obviously a very bad idea to swap that code in and out per scanline, so the right half will have to stay in the slow pixel mode. I came up with another workaround, but it would break with curves, so it's not much use either.

 

So the final speed of the road rendering is 28% of frame time. Extrapolated: in two frames' time (30 fps), I can now texture (2 / 0.28 = 7.14) ~7x the number of texels, and in three frames' time (20 fps), (3 / 0.28 = 10.71) ~10x.

Luckily, since we render 50% of the texels via phrase blit and 50% via pixel blit, that's actually a very realistic middle ground for fullscreen texturing.
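The texel-budget arithmetic above, spelled out (a sketch; 0.28 is the measured road cost as a fraction of one 60 Hz frame):

#include <stdio.h>

int main(void) {
    const double road_cost = 0.28;   /* fraction of one frame for the road */
    for (int frames = 1; frames <= 3; frames++) {
        printf("%d frame(s) -> %d fps lock, %.2fx road texels\n",
               frames, 60 / frames, frames / road_cost);
    }
    return 0;   /* prints 3.57x @ 60 fps, 7.14x @ 30 fps, 10.71x @ 20 fps */
}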

 

What does it mean? Well, my texturing routine is finally fast enough to be used in a first-person-shooter engine. Since I have the fillrate to do 10x the road's texels at 20 fps, there's a huge buffer for overdraw (the situation where you redraw the same pixel multiple times, effectively wasting performance). Swapping different textures will probably bring that factor of 10x down to something like 8x, but that's still a substantial buffer, even for rooms with pillars in the middle (i.e. severe overdraw).

 

The following is a quick, rough outline of how an ideal FPS engine on the Jag could be designed (obviously with proper rearrangement, as there are multiple sync points):

 

First, we lock the framerate to 20 fps. This means we have 3 vblanks of GPU time, 3 vblanks of DSP time and 3 vblanks of 68000 time. How are we going to use them?

 

68000: On or Off?

- contrary to popular misconception, the 68000 can do a lot on the Jag

- this is the reason I was keeping the 68000 in a constant non-productive loop, busy 100% of the frame time (basically just wasting Blitter bandwidth): I always knew I'd want to use those 13.3 MHz later on, and wanted a realistic picture of how much bandwidth I have while the 68000 is banging on the bus nonstop

- the thing is, shutting it off doesn't free up enough GPU performance to warrant the shutoff in the first place. The Blitter will be able to blit more, sure, but that won't cover the functionality the 68000 can provide in a full frame's time. That code has to be blitted onto the GPU instead, which means two 4 KB blits (blit the new chunk in, then the old one back). During that time the GPU must be idle, so not only have you lost the 68000's performance, you're also down two blits (about 7-10% of frame time). That alone kills any potential benefit of stopping the 68000 - unless, of course, we're talking about some simple intro/demo that can run fully out of 4 KB without a single code blit.

- 3 frames' time means 3 x 13.3 MHz = ~40 MHz: it would be just stupid not to use it...

- the key to using the 68000 effectively is to minimize memory access for variables and let it work out of registers as much as possible. Plus, unlike the GPU/DSP, it can work directly with 8-bit values without wasting 3 bytes, so we can stuff a lot of important data into its registers.

 

Engine component breakdown:

 

68000:

Frame 1: Input, Audio

Frame 2: Scripting (doors, switches, ...), AI

Frame 3: HUD (ammo/score/health), crude world culling (just prepare a list of big chunks for the DSP to process)

 

GPU: Frame 1-3: Texturing polygons (+clipping), swapping textures

 

DSP: Frame 1-3: Collision Detection, Visibility, world culling, Preparing list of polygons (for GPU to render)
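A minimal sketch of that three-frame job rotation on the 68000 (the do_* functions and vblank_count are hypothetical placeholders; a real build would hang this off the vertical-blank interrupt):

extern volatile unsigned vblank_count;   /* incremented by the VBL handler */

extern void do_input(void);     extern void do_audio(void);
extern void do_scripting(void); extern void do_ai(void);
extern void do_hud(void);       extern void do_crude_culling(void);

void m68k_game_loop(void) {
    for (;;) {
        switch (vblank_count % 3) {      /* 20 fps lock = 3 vblanks per tick */
            case 0: do_input();     do_audio();         break;
            case 1: do_scripting(); do_ai();            break;
            case 2: do_hud();       do_crude_culling(); break;
        }
        unsigned last = vblank_count;
        while (vblank_count == last)     /* wait for the next vblank */
            ;
    }
}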

 

This is the distant future, of course, but maybe I'll get to it before the end of this year...

  • Like 4

  • 1 month later...

68000: On or Off?

[...]

 

I think it's pretty neat you're putting the 68000 to good use... I remember a conversation a while back with the guy who did "Battlesphere", asking him how he was able to use the 68K and still get real-time 3D action while the chip was still on the bus using main memory. If I recall, he got all of the processors to work at different moments in time instead of in parallel with each other, which at the time gave me a glimmer of hope for the 68000's role. Back then, the 68K was understood to be treated as the main processor for that "warm & fuzzy feeling" among enthusiasts, which was also blamed for why most of the 3D games like "Checkered Flag" ran so choppy. I always thought it a waste not to put the 68K to some good use, especially for 3D or pseudo-3D stuff.

  • Like 1

use the 68K and still get real-time 3D action from the game while the chip was still on the bus using main memory.

My whole H.E.R.O. 3D engine runs on the 68K only. Of course, I'm using the Blitter & OP heavily for texturing, but it still hovers above 60 fps while computing everything else on the 68K (visibility, culling, transformations, bilinear filtering, OP list, input, ...). Not only is it 68K code, it's actually high-level C. I wouldn't recommend examining the compiler output if you're prone to heart attacks: the resulting 68K code is (as expected from a compiler) very inefficient. It often evokes triple-facepalm feelings. Since I'm using C structs all over the place, about half the frame time is killed just by the compiler's way of accessing structs :)

 

Point is, if I switched from C to hand-optimized 68K assembler, a lot of CPU/bus time could be saved (for more features and effects). One of these days, this will be an interesting exercise - I suspect that with a 30 fps lock, the 68K could handle 640x240.
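To illustrate the kind of struct-access overhead being paid (a sketch; Object and the field names are hypothetical): compilers of that era reload a field through the pointer on every iteration, while hoisting it into a local lets it live in a 68K data register:

typedef struct { int x, y; } Object;

/* naive: obj->x is re-read and re-written through memory each iteration */
void move_naive(Object *obj, const int *dx, int n) {
    for (int i = 0; i < n; i++)
        obj->x += dx[i];
}

/* hoisted: keep the accumulator in a local (a data register), write back once */
void move_hoisted(Object *obj, const int *dx, int n) {
    int x = obj->x;
    for (int i = 0; i < n; i++)
        x += dx[i];
    obj->x = x;
}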

 

 

If I recall, he got all of the processors to work at different moments in time instead of working parallel with each other, which at that time kind of gave me a glimmer of hope for the 68000 role.

I wouldn't generally stand behind this statement. It may have worked out that way for their particular game and engine architecture, but unless you have benchmarks proving the 68K can't do the same work in parallel faster than the GPU doing another code blit (which costs a lot of GPU time as well), it's just a guess. And of course, I mean a smart division of labor between the two: putting 3D transformations on the 68K while the GPU is just traversing some arrays would be stupid.

And for that, you need to have written and optimized both codepaths, which is a lot of effort - not something that was remotely possible under commercial deadlines.

 

 

 

also being the reason most of the 3D games like "Checkered Flag" ran so choppy.

I have briefly looked over the CF code, and it's got a lot of GPU chunks being blitted. Basically the whole engine and a lot of the gameplay run purely off the GPU & Blitter.

So I don't think the 68K is the reason for its sub-10 fps performance drops. Its drastic performance drops are, however, directly related to the number of polygons in the scene AND their on-screen coverage (i.e. fillrate). Notice how the valleys kill the framerate completely. From my experience with the flat-shading rasterizer (the Wipeout codepath in my videos), the overhead of the scanline traversal plus the fillrate cost of those huge polygons is what causes the drops.

 

Now, I haven't used the Blitter in the Wipeout codepath - just pure software rasterizing, pixel by pixel - so we can safely assume that once I find the threshold scanline width where it makes sense to switch to the Blitter, it will run faster in those large-poly scenarios.

 

I honestly believe that if the CF developers had gotten 4-8 more weeks to optimize the engine, we'd have gotten to a 15-20 fps scenario (and never below 10 fps). But fundamentally, CF's combination of color depth and fog approaches the Jag's bandwidth limits - for a single-threaded engine.

 

A multi-threaded CF engine wouldn't ever drop below 15 fps (occasionally hitting 30 fps, depending on the polycount of a scene), but that's a major engineering effort to sync all engine stages across the DSP & GPU, so we can't expect that from the first generation of engines on the Jag...

 

 

I always thought of it a waste to not put the 68k to some good use especially for 3D stuff or pseudo 3D stuff.

Correct - especially for games that run around 15 fps (i.e. 4 vblanks), we're talking 4 x 13.33 = ~53 MHz of unused power.

 

But it's always easier (effort-wise) to keep the 68K disabled, then turn it on once the game is fully running and say:

"L@@K, the 68K is screwing my fram3rat3 1!1!" :)

 

 

 

As for my current update: while my Jag coding spree survived the first move, it didn't survive the second one, less than 3 weeks after the first (turns out I don't mesh well with neighbors from hell, so I said screw the money and moved again).

 

I also had a deadline at work, for which I was working from home and coding in the mornings for the last 2 weeks. It was finished last Friday, so I can slowly get back to Jag coding...

 

Also, I've been playing a lot of Moto Racer 4 on PS4 over the last 2 months and got a whooooole bunch of ideas for my Jag coding of RR :)

  • Like 2

  • 2 months later...

Didn't want to keep being OT in the CPS2-to-Jaguar-ports thread, so I'll post this here:

This is a pretty cool engine for the GBA, another system with no dedicated 3D hardware. It uses flat-shaded polygons for the tracks, but the cars seem to be prerendered. From the look and feel of it, it seems to be doing a similar trick to The Need for Speed and Crash 'n Burn on 3DO. What was the term?

Hot Wheels Stunt Track Challenge for Game Boy Advance:

Like those games, you can't turn your car around and race backwards; it kinda feels like being on rails. Also, not many trackside objects to interact with. But it runs pretty nice and plays well.

Edited by sd32

This is a pretty cool engine for the GBA, another system with no dedicated 3D hardware. [...] But it runs pretty nice and plays well.

That's a great find! Not sure what you mean by the 'trick', though?

 

- You used the right term, 'on rails'. Especially for racing games, it brings a substantial performance boost (usually 33-75% of one frame time, but as with everything, if it's not optimized the upper limit is unbounded :) ), because the engine doesn't have to do the frustum-culling stage, where it selects only a subset of the whole world for processing through the 3D pipeline. Example: your whole track has 20,000 polygons, but the current camera view contains just 300, and it's the frustum culling's job to hand only those 300 polygons to the rendering pipeline.

- The way it works is that the whole track is divided into short linear segments, of which only a few are active - most usually 3 (polygons close to the camera, the majority in the middle, and polygons in the distance) - and the engine just keeps incrementing the segment index as you drive, so it's basically free (see the sketch after this list).

- The game gives you an illusion of camera control, as there's a small range of camera adjustment when you turn, but that's about it. If you hacked the executable and turned the camera around, the screen would be basically blank (you'd only see the polys closest to the camera from the default view).
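A minimal sketch of that segment scheme (types and names are illustrative; the point is that "culling" degenerates to an index increment):

#include <stddef.h>

typedef struct { int poly_count; /* ...polygon data for one stretch... */ } Segment;

#define ACTIVE_SEGMENTS 3   /* near, middle, far */

extern void render_segment(const Segment *s);

void render_track(const Segment *track, size_t num_segments, size_t car_segment) {
    /* draw far-to-near, starting from the car's current segment */
    for (int i = ACTIVE_SEGMENTS - 1; i >= 0; i--) {
        size_t idx = car_segment + i;
        if (idx < num_segments)
            render_segment(&track[idx]);
    }
}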

 

There's another great advantage of this technique: you tell the 3D artists how many polys (and how big) they can use in each segment, so through multiple iterations of level design you can keep the framerate very stable, carefully adding polygons to each segment by hand while making sure the grand total for those 3 segments stays within the limits set by the 3D coders. I haven't noticed any major frame drops there, so they clearly spent a lot of time refining the track (adding small details where they still had a performance buffer).

 

The car seems to be textured in real time (not really a sprite, as its pixels move with slight steering, and the GBA doesn't have that much RAM for all those frames), but as it's just a few pixels and polygons, it's not a big performance hit.

 

 

Inevitably, this draws a comparison to Checkered Flag - but:

- CF has a detailed 3D vehicle right in front of the camera, sometimes covering a substantial portion of the screen (i.e. a huge fillrate and overdraw cost, as you're overdrawing a portion of the screen where terrain has already been drawn - with no easy way to avoid it)

- CF computes fog

- 320x240 vs. 240x160 means CF has to process 50% more scanlines (160 + 80 = 240), which is a substantial difference in itself. The difference in horizontal resolution shouldn't be a big deal, since while the Blitter is blitting a scanline, the GPU continues the scanline traversal in parallel (I just checked the CF code for that)


Just tried the game on a TV, thanks to a GBA-to-Super-NES adapter. You are right, guys, the cars are 3D.

There are a lot of interesting racing game engines on the GBA, but I think this is the only one using flat-shaded polygons?


Just tried the game on a TV, thanks to a GBA-to-Super-NES adapter. You are right, guys, the cars are 3D.

It's kinda hard to see in most other videos, but I just found one with a bigger car, so it can finally be easily seen that it's textured in real time (when there are more cars on track, the framerate does, obviously, crash):

 

I also confirmed my initial suspicion: they're using subpixel precision in the 3D transformation (there's no "jumping" of vertices back and forth, as is the case with default integer precision), so it feels smoother than it actually is. It's a relatively cheap feature (if you build the engine around it, that is).

This feature wouldn't have helped CF, as it's best experienced around 20-30 fps; you can't really see subpixel precision under 10 fps...

 

EDIT: And I just noticed another "cheat": the car moves quite slowly, so even at a lower framerate it feels smoother than it actually is. CF moves the car through the world too fast for its real framerate, making it feel even less smooth (as if the game needed that in the first place!).

 

 

This has just made me realize I should reconsider replacing straight integer coordinates with fixed-point in my Jag engine, as subpixel precision does indeed bring a lot of "fake" smoothness (visually speaking, that is). It's one thing to be merely aware of the feature, and quite another to see it in motion. Thanks, sd32!
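Here's the integer vs. fixed-point idea as a sketch (16.16 format; the projection formula and constants are illustrative). With plain integers the sub-pixel fraction is thrown away every frame, so a slowly moving vertex snaps a whole pixel at a time; carrying the fraction keeps edge slopes changing smoothly between frames:

#include <stdint.h>

typedef int32_t fx16;                   /* 16.16 fixed point */
#define FX(x)     ((fx16)((x) * 65536))
#define FX_INT(v) ((int)((v) >> 16))    /* truncate to a pixel only when plotting */

/* integer path: the fraction is lost at every step -> 1-pixel jumps */
int project_int(int world_x, int world_z) {
    return 160 + (world_x * 128) / world_z;
}

/* fixed-point path: a 64-bit intermediate keeps the 16.16 scale through the divide */
fx16 project_fx(fx16 world_x, fx16 world_z) {
    return FX(160) + (fx16)(((int64_t)world_x * 128 << 16) / world_z);
}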

Edited by VladR
  • Like 3

I haven't read the entire thread, but wanted to chime in: Road Rash on 3DO is a pseudo-3D racer. Both the Genesis and 3DO versions use limited "real" 3D to achieve the hills and distance calcs. The rest is faked, such as the curves. Something like Super Burnout is a similar engine and runs at light speed, although it doesn't have the polygonal hills off to the side or the buildings (though buildings can be little more than textured vertical lines).

 

Need for Speed, I believe, is true 3D, but given how the track format works, it may build the road each frame from the viewpoint forward (IIRC, similar to Road Rash, the road is a collection of nodes that are angles relative to each other; if you make the road criss-cross, you won't see that in-game. There's an old unofficial track editor to play with if you're curious). NFS can't change perspective past a few degrees - it will zoom out into third person instead of showing you the reverse angle.

 

Hopefully this didn't repeat too much of what was already said, and wasn't too off-topic. I hope you keep developing your Road Rash proto; it's looking cool!


  • 2 weeks later...

This is a continuation of the discussion we had in the other thread about S.T.U.N. Runner (an attempt to keep all engine updates contained in this thread).

 

Engine Update:

- Major optimization: the old version is now 26% slower (and that's the old version without 3D meshes versus the new one with them)

- Support for generic 3D meshes - something I've long been waiting for

- Rendering the player's ship, an enemy ship and laser shots (3D, not bitmaps)

 

There are also a few boring non-engine, game-related updates running on the 68000:

- Input (I'm flying it in the video; no more script)

- AI (4 states currently, easily extensible to any number of states)

- The AI checks its environment, and if it gets close to the player in 3D space, it bolts forward

- Shooting down the enemy works

- The 68000 actually drives the GPU by updating the 3D object list (including world transformations), saving a lot of cache for more important stuff in the future

- This time I'm using ASM (not C, as in the past)

 

While I didn't have proof in the past, just as I expected (and argued many times), and contrary to an incredibly stupid popular myth (one that gets passed along without its proper context and assumptions), the 68000 didn't slow down my rendering at all.

In fact, I benchmarked rendering 1,000 frames and could not detect a slowdown of even one frame (compared to a benchmark with AI/input/game state disabled, i.e. commented out of the build) - not even a 1/1000 slowdown (less than 0.1%), despite the 68000 accessing memory left and right.

Since my current game tick is VSYNC'ed to 3 frames (I like my HiRes :) ), I have 3 x 13.3 MHz = ~40 MHz of 68000 power available for game-related features. That's a lot.

 

 

Here's the video - it's still running at 768x200 (VSYNC'ed to 20 fps; 26.5 fps without the lock):

 

- This is still a pure software rasterizer (not using the Blitter here).

- I also finally benchmarked the scanline drawing & computation. Previously I'd found out how much the pixel drawing takes, but there was a lot of other computation happening once I had the start & end points of a scanline.

- The whole scanline-drawing portion (splitting the line into left/right padded edges and a 32-bit inner part) takes 77% of frame time. That's something that will be offloaded to the Blitter in one of the next versions. And just like with Road Rash, I'll keep computing in parallel while the Blitter draws the scanline (see the sketch right below).
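A sketch of that scanline split (buffer layout and names are illustrative): unaligned edge pixels go out as byte writes, and the aligned inner run is filled 4 pixels at a time with 32-bit writes (at 8bpp):

#include <stdint.h>

void fill_span(uint8_t *line, int x0, int x1, uint8_t color) {
    uint32_t wide = color * 0x01010101u;   /* replicate the byte into 32 bits */

    /* left edge: byte writes until the pointer is 4-byte aligned */
    while (x0 < x1 && ((uintptr_t)(line + x0) & 3))
        line[x0++] = color;

    /* inner part: aligned 32-bit writes, 4 pixels each */
    uint32_t *p = (uint32_t *)(line + x0);
    for (; x0 + 4 <= x1; x0 += 4)
        *p++ = wide;

    /* right edge: remaining bytes */
    while (x0 < x1)
        line[x0++] = color;
}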

 

A Blitter-supported version should then handle, I suspect, around 15 fps at 1536x200, which is the highest possible non-interlaced resolution on the Jag (well, 240 lines really, but I keep 40 lines reserved for the HUD). Still more than Checkered Flag :)

 

 

@louisg: No worries, not off-topic at all. In fact, we had a discussion about NFS several (well, maybe more) pages back, and I'm still convinced it's possible to do a 30-60 fps version of it on the Jaguar (assuming multithreaded rendering, of course). All the details and conditions (there are a few) are mentioned in the relevant thread, so I won't repeat them here, but feel free to quote it and we can discuss more.

  • Like 8

This is a continuation of the discussion we had in the other thread about S.T.U.N. Runner [...]

Super cool man!


You got the ship all wrong, Vlad. The wings need to split into an X configuration.

 

The trench needs to be in different shades of grayscale, 10 times deeper and three times wider.

 

All wings report in!

 

Red 10 standing by. Red 7 standing by. Red 3 standing by...

 

Lock X foils in attack position...


Edited by JagChris
  • Like 2

- I spent an hour this morning on the optimization I employed while working on the 6502 Atari 800 flatshader recently.

- The result was 5.6%, which in itself seems insignificant (but at least it's a step in the right direction)

- Testing the S.T.U.N. Runner scene with the two ships, however, made me realize it's ~5% per object, so the total render cost dropped from 2.41 frames (i.e. 60/2.41 = 24.9 fps) to 2.09 (60/2.09 = 28.7 fps) - a 15.3% framerate improvement.

- Since the effect is cumulative, I made a quick scene that, in a simple brute-force loop, starts with 34 ships (1,088 rendered triangles = 3,264 vertices)

- Still just a single-threaded renderer on the GPU with no Blitter; the resolution is the typical Jaguar-friendly 768x200

- I'm pretty sure that, just like in the terrain LOD experiments, implementing level of detail will at least double the ship count, probably up to triple it (~100 ships), while keeping the same framerate (see the sketch below)
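A minimal sketch of the distance-based LOD pick implied above (thresholds and mesh names are hypothetical): far ships get a cheaper mesh, so the per-frame triangle count drops without visibly changing the scene:

typedef struct { int tri_count; /* ...vertex/face data... */ } Mesh;

extern Mesh ship_full;     /* full 32-triangle mesh, used up close */
extern Mesh ship_medium;   /* reduced mesh for mid distance        */
extern Mesh ship_far;      /* minimal mesh for distant ships       */

const Mesh *pick_lod(int z) {        /* z = distance from the camera */
    if (z < 512)  return &ship_full;
    if (z < 2048) return &ship_medium;
    return &ship_far;
}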

 

This is one of those rare times when even I am surprised by the results (I did not expect this from a single thread on the GPU, without the Blitter) :)

 

 

As sd32 hinted on YT, this could make for a nice Space Invaders 2000 remake (you'd just have to stack the rows along the Y axis, not the Z axis as in the video, so the gameplay and feel would remain identical to the original Space Invaders, and just let the whole layer come closer to you). This could have been nice at the Jag's launch...

  • Like 3

This topic is now closed to further replies.