
Lynx 3D Experimenting


VladR


I decided to experiment with Lynx a bit and see how far I can get in 6 weeks before the Lynx's 30th B-Day compo deadline. Initially, I'll be doing all prototyping in high-level C, before committing to a final ASM rewrite.

 

Day 1:

- set up a Handy Emulator and a C dev env (for VS 2012) as per the fantastic LX.NET blog at https://atarilynxdeveloper.wordpress.com/

- wrote basic 3D transformation-related methods. For now, I'm using a 16-bit signed range (with clip space merged into 8 bits), as performance is irrelevant during high-level prototyping.

- wrote a 3x6 font handler so I can print out lots of debug info on screen (despite tiny resolution)

- learnt how to manipulate Lynx's palette at $FDA0-$FDBF and wrote basic RGB conversion functionality

- tested the code by drawing a depth-shaded 3D point cloud of 1,600 points (16*10*10); so far, the emulator isn't complaining
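For the palette step in the list above, here is a minimal sketch of what the RGB conversion could look like. It assumes the usual Mikey layout, where $FDA0-$FDAF hold the 4-bit green values and $FDB0-$FDBF hold blue (high nibble) and red (low nibble). The names `LynxColor`, `lynx_pack_rgb` and `depth_to_shade` are made up for illustration and are not from the original code; the hardware write is only shown as a comment.

```c
#include <stdint.h>

/* One packed palette entry (assumed layout, see the note above). */
typedef struct {
    uint8_t green;    /* 0000GGGG, would go to $FDA0 + index */
    uint8_t bluered;  /* BBBBRRRR, would go to $FDB0 + index */
} LynxColor;

/* Reduce 8-bit-per-channel RGB to 4 bits per channel by dropping the low nibble. */
LynxColor lynx_pack_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    LynxColor c;
    c.green   = (uint8_t)(g >> 4);
    c.bluered = (uint8_t)((b & 0xF0) | (r >> 4));
    return c;
}

/* Hypothetical depth shading: map depth 0..max_depth to brightness 15..0. */
uint8_t depth_to_shade(uint16_t depth, uint16_t max_depth)
{
    if (max_depth == 0 || depth >= max_depth) {
        return 0;
    }
    return (uint8_t)(15 - (uint16_t)((uint32_t)depth * 15 / max_depth));
}

/* On the real machine (cc65) the write would look roughly like:
     ((volatile unsigned char *)0xFDA0)[index] = c.green;
     ((volatile unsigned char *)0xFDB0)[index] = c.bluered;   */
```

Each point of the cloud would then pick one of the 16 pens according to `depth_to_shade`.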

 

Lynx00.thumb.GIF.829664844833f48f7b3053a87bf3828d.GIF

 

Tomorrow, I'll import the 3D mesh of a base track segment from my Jaguar codebase and hopefully get at least a reference wireframe going. I'd like to have a reference flatshader up and running within the next 2 days and then implement scanline fill via Suzy's Sprite Engine - basically use Suzy the way I use the Blitter on Jaguar (where the GPU computes the endpoints on each scanline and spins up the Blitter to fill the scanline).

 

While I have a pretty good idea about what kind of 3D scene complexity 6502 can handle via SW rasterizing (from my recent experiments), I am really curious how fast Suzy is in terms of scanline filling - mostly:

- what's the spin up overhead (e.g. SCB computation)

- what's the scanline length threshold where it's faster to just use 6502 to fill pixels instead of spinning up Suzy

 

It's one thing to set up Jaguar's Blitter registers from within GPU cache (sitting on a 64-bit bus, plus GPU clocked at 26.6 MHz, plus GPU works in parallel to other chips) and a whole other thing to do it 8 bits at a time, clocked @4 MHz...

 

Based on a quick look into SCB's details, I'd hazard [an uneducated guess], that it should be faster to do scanline fill on 65C02 for scanlines < (8-16) pixels long. Especially considering the fact that 65C02 is halted while Suzy putzes around the framebuffer. Definitely a prime candidate for a benchmark...

 


Pro Suzy:

- You can just use coordinates, no address calculation

- You can easily move the coordinate system

- Clipping

- No shifting and masking

- Easily add transparency

 

But actually, I do not know if someone did a 1:1 comparison.

 


Yeah, not many people are as obsessed with cycle counting as I am. Plus, if it works already, why complicate the code...

 

 

I am hoping that clipping will actually be usable, unlike on Jaguar, where due to HW bugs I had to run workaround code on the GPU per scanline to handle those special cases, at which point HW clipping became unusable and I am just doing SW clipping.

 

Also, scanline edges should be simpler to handle than on A800: with just two pixels per byte, there are only two edge cases, not four. That should tip the balance towards SW clipping.
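To illustrate the two edge cases, here is a rough sketch of a CPU span fill for a 4-bit framebuffer row. It assumes the left pixel of each byte sits in the high nibble; `fill_span` is a hypothetical name, and a real 6502 version would obviously look quite different.

```c
#include <stdint.h>
#include <string.h>

/* Fill pixels x0..x1 (inclusive) of one framebuffer row with a 4-bit color.
   Assumption: the left pixel of each byte lives in the high nibble. */
void fill_span(uint8_t *row, int x0, int x1, uint8_t color)
{
    uint8_t both;

    color &= 0x0F;
    both = (uint8_t)((color << 4) | color);

    if (x0 > x1) {
        return;
    }
    /* Edge case 1: span starts on an odd pixel, patch the low nibble only. */
    if (x0 & 1) {
        row[x0 >> 1] = (uint8_t)((row[x0 >> 1] & 0xF0) | color);
        x0++;
    }
    /* Edge case 2: span ends on an even pixel, patch the high nibble only. */
    if (x0 <= x1 && !(x1 & 1)) {
        row[x1 >> 1] = (uint8_t)((row[x1 >> 1] & 0x0F) | (color << 4));
        x1--;
    }
    /* Everything in between is whole bytes (two pixels each). */
    if (x0 <= x1) {
        memset(row + (x0 >> 1), both, (size_t)((x1 >> 1) - (x0 >> 1) + 1));
    }
}
```

With four pixels per byte, the two `if` blocks would each turn into a small switch over the pixel position inside the byte.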


3 minutes ago, karri said:

Interlacing CPU multiplications with preparing for next step is also possible with Suzy.

You mean the parallel divide and mul via Suzy?

 

Yeah, that's a third code path (yet another one to try).

 

I did some thinking on this one last week and my current position is that it's probably not going to be very useful for flatshading, because Suzy will be mostly busy drawing scanlines, so not much performance can be gained by this in a HW-based rasterizer.

 

However, if we ditched Suzy's scanline fill, then we could totally use the free mul and div in parallel, but that implies SW rasterizer - e.g. scanline drawing via 6502.

Unfortunately, I recently implemented a fixed point scanline traversal so I don't need mul or div per scanline...
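As an illustration of why a fixed-point traversal needs no mul or div per scanline, here is a minimal sketch in 8.8 fixed point. This is not the actual routine, just the general idea: one division per edge at setup, then a single 16-bit add per scanline. The `Edge` type and function names are made up.

```c
#include <stdint.h>

/* One polygon edge walked top to bottom in 8.8 fixed point. */
typedef struct {
    uint16_t x;     /* current x, 8.8 fixed point (covers 0..255) */
    int16_t  step;  /* dx/dy per scanline, 8.8 fixed point */
} Edge;

/* Set up the edge once; this holds the only division. Requires y1 > y0. */
void edge_init(Edge *e, int x0, int y0, int x1, int y1)
{
    e->x    = (uint16_t)(x0 * 256);
    e->step = (int16_t)(((int32_t)(x1 - x0) * 256) / (y1 - y0));
}

/* Return the pixel column for the current scanline, then advance by one add. */
int edge_next(Edge *e)
{
    int x = e->x >> 8;
    e->x = (uint16_t)(e->x + (uint16_t)e->step);  /* wraps like a 16-bit ADC pair */
    return x;
}
```

A scanline loop would call `edge_next` once for the left edge and once for the right edge, then fill between the two results.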

 

Is there anything else other than that that Suzy can do in parallel with 6502?


10 hours ago, 42bs said:

Pro Suzy:

- You can just use coordinates, no address calculation

- No shifting and masking

I'm not sure address calculation for SW rasterizer (via table lookup as I do now, so it's really just LDA/STA Lo+Hi and horizontal xpos (from left screen edge) is free via reused Y-reg anyway) is actually slower than updating HPos/VPos inside SCB (LDA/STA Lo+Hi).
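For reference, the table-lookup addressing mentioned above could be sketched in C roughly like this. It assumes a 160x102 screen at 4 bits per pixel (80 bytes per row); the array and function names are hypothetical, and on the 6502 the table would normally be split into separate Lo and Hi byte tables.

```c
#include <stdint.h>

#define SCREEN_W   160
#define SCREEN_H   102
#define ROW_BYTES  (SCREEN_W / 2)   /* 4 bits per pixel, 2 pixels per byte */

uint8_t  framebuffer[ROW_BYTES * SCREEN_H];
uint16_t row_offset[SCREEN_H];      /* y -> byte offset of the row start */

/* Build the lookup table once, so no multiply by 80 is needed per scanline. */
void init_row_table(void)
{
    uint16_t offset = 0;
    int y;

    for (y = 0; y < SCREEN_H; y++) {
        row_offset[y] = offset;
        offset = (uint16_t)(offset + ROW_BYTES);
    }
}

/* Address of the byte holding pixel (x, y): one table read plus the x index. */
uint8_t *pixel_byte(int x, int y)
{
    return &framebuffer[row_offset[y] + (x >> 1)];
}
```

The `x >> 1` part corresponds to the "free" Y-register index in the assembler version.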

 

I just read through the Sprite engine description and it would appear that Suzy doesn't have a separate Fill mode ? Meaning, for drawing a scanline I must actually use a scaling sprite - I suspect 1px wide, opaque, non-collideable, no RMW, with scanline length specified as a HSize. Is that so ?

 

It would be interesting to benchmark the internal HW implementation to see if scaling is faster than copy (it should be), and what that threshold scanline length is.


3 hours ago, VladR said:

I'm not sure address calculation for SW rasterizer (via table lookup as I do now, so it's really just LDA/STA Lo+Hi and horizontal xpos (from left screen edge) is free via reused Y-reg anyway) is actually slower than updating HPos/VPos inside SCB (LDA/STA Lo+Hi).

 

I just read through the Sprite engine description and it would appear that Suzy doesn't have a separate Fill mode ? Meaning, for drawing a scanline I must actually use a scaling sprite - I suspect 1px wide, opaque, non-collideable, no RMW, with scanline length specified as a HSize. Is that so ?

 

It would be interesting to benchmark the internal HW implementation to see if scaling is faster than copy (it should), and what is that threshold scanline length.

Sounds right. On the other hand, Suzy is also pretty good at drawing tilted and stretched lines based on one-pixel sprites.


3 hours ago, necrocia said:

This is great! Nice to see some real 3D work being done on the Lynx!

Glad you like it, just don't get your hopes up too much. Highly likely I will get stuck in the endless loop of cycle counting and constant refactoring. But, given the situation (30th B-day of Lynx), I really want to create something fully playable. Given this is an 8-bit platform, I don't need to target the complexity of 32-bit games, so something simple (say, like StunRunner) shouldn't really be regarded as "not worthy of the platform ". We'll see.

25 minutes ago, karri said:

Sounds right. On the other hand, Suzy is also pretty good at drawing tilted and stretched lines based on one-pixel sprites.

The vertical stretching/interpolation of the sprite makes Suzy a prime candidate for a textured road. Of course, I have yet no idea, how fast Suzy is, to texture a road (e.g. is it going to result in a playable framerate for a fast paced racer?), but after the flatshading, it's right next on my to-do list.

For sure, however, a slow-paced FPS game, could totally have a playable framerate with such texturing on floors/ceilings.


Day 2:

- recreated my base track segment from my Jaguar's codebase

- the track creation is generic - it uses translation offsets per each segment, just currently it's along a straight line. But hills/curves are very easy this way (just supply an (X,Y,Z) offset per each segment).

- wireframe render
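The per-segment translation offsets from the list above could be accumulated along these lines. This is only a sketch with made-up names (`Vec3`, `build_track`); the real track code is not shown in the thread.

```c
#include <stdint.h>

typedef struct {
    int16_t x, y, z;
} Vec3;

/* Place n segments: each segment's origin is the previous origin plus the
   previous segment's (X,Y,Z) offset. A straight track uses offsets of the
   form (0, 0, length); hills and curves just vary X and Y. */
void build_track(const Vec3 *offsets, Vec3 *origins, int n)
{
    Vec3 pos = {0, 0, 0};
    int i;

    for (i = 0; i < n; i++) {
        origins[i] = pos;
        pos.x = (int16_t)(pos.x + offsets[i].x);
        pos.y = (int16_t)(pos.y + offsets[i].y);
        pos.z = (int16_t)(pos.z + offsets[i].z);
    }
}
```

The base segment mesh is then drawn once per entry in `origins`, translated by that origin.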

 

Lynx01.thumb.GIF.3cd7a17e4be357964751042b02acd9bb.GIF

 

Next - flatshading via scanline filling. I think I have time to skip triangle rasterizer and actually reuse the quad rasterizer I wrote recently for Jaguar - this basically halves the amount of scanlines (which would be a major performance boost for quad-based geometry).

 

 


7 hours ago, VladR said:

I just read through the Sprite engine description and it would appear that Suzy doesn't have a separate Fill mode ? Meaning, for drawing a scanline I must actually use a scaling sprite - I suspect 1px wide, opaque, non-collideable, no RMW, with scanline length specified as a HSize. Is that so ?
 

Maybe have a look into the poly routines from BLL (or cc65).


The problem with a polygon is the right edge as seen in the last small image.

 

This code:

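// Draw a flat-colored trapezoid from (x1, y1, width w1) to (x2, y2, width w2)
// using a single tilted and stretched one-pixel sprite.
void polygon(int x1, int y1, int w1, int x2, int y2, int w2, unsigned char color)
{
    int dy = y2 - y1;
    if (dy == 0) dy = 1;                    // avoid dividing by zero on a one-line polygon
    Spixel.hpos = x1;
    Spixel.vpos = y1;
    Spixel.hsize = w1 << 8;                 // 8.8 fixed point
    Spixel.vsize = (y2 - y1 + 1) << 8;
    Spixel.tilt = (x2 - x1) * 256 / dy;     // horizontal shift per scanline
    Spixel.stretch = (w2 - w1) * 256 / dy;  // width change per scanline
    Spixel.penpal[0] = color;
    tgi_sprite(&Spixel);
}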
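To see what the tilt and stretch values mean, here is a simplified host-side model of the trapezoid: the left edge moves by `tilt` and the width grows by `stretch` on every scanline, both in 8.8 fixed point. This is only an approximation of what Suzy does internally, and all names are made up. It uses 32-bit intermediates because with cc65's 16-bit `int`, an expression like `(x2 - x1) * 256` overflows once the difference reaches 128.

```c
#include <stdint.h>

typedef struct {
    int left;   /* first pixel column of the scanline */
    int width;  /* number of pixels on the scanline */
} Span;

/* 8.8 fixed-point slope between two values across dy scanlines (dy != 0). */
int16_t slope_8_8(int from, int to, int dy)
{
    return (int16_t)(((int32_t)(to - from) * 256) / dy);
}

/* Simplified model: span covered on a given line (0 = top) of the trapezoid. */
Span trapezoid_span(int x1, int w1, int16_t tilt, int16_t stretch, int line)
{
    Span s;
    int32_t left  = (int32_t)x1 * 256 + (int32_t)tilt * line;
    int32_t width = (int32_t)w1 * 256 + (int32_t)stretch * line;

    s.left  = (int)(left / 256);
    s.width = (int)(width / 256);
    return s;
}
```

Printing `left + width` per line for a road segment is a quick way to look at how the right edge steps, which is where the glitch in the screenshot shows up.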

 

On the other hand overwriting the right edge with a left edge produces a smooth result.

 

void
drawsegment(int lanes, int x1, int y1, int w1, int x2, int y2, int w2)
{
    int l1;
    int l2;
    int r1;
    int r2;
    int lanew1;
    int lanew2;
    int lanex1;
    int lanex2;
    int lane;
    l1 = w1 >> (lanes + 2);
    l2 = w2 >> (lanes + 2);
    r1 = w1 >> (lanes + 1);
    r2 = w2 >> (lanes + 1);
    // Draw left yellow rumble line
    polygon(x1 - w1 - r1 + 1, y1, r1, x2 - w2 - r2 + 1, y2, r2, COLOR_YELLOW);
    // Draw road
    polygon(x1 - w1, y1, w1, x2 - w2, y2, w2, COLOR_GREY);
    lanew1 = w1 * 2 / lanes;
    lanew2 = w2 * 2 / lanes;
    lanex1 = x1 - w1 + lanew1;
    lanex2 = x2 - w2 + lanew2;
    for (lane = 1; lane < lanes; lanex1 += lanew1, lanex2 += lanew2, lane++) {
        polygon(lanex1 - l1 / 2, y1, l1, lanex2 - l2 / 2, y2, l2, COLOR_WHITE);
    }
    polygon(x1 - 1 + l1 / 2, y1, w1, x2 - 1 + l2 / 2, y2, w2, COLOR_GREY);
    // Draw right yellow rumble line
    polygon(x1 + w1, y1, r1, x2 + w2, y2, r2, COLOR_YELLOW);
    // Draw right cleanup
    polygon(x1 + w1 + r1 - 1, y1, r1, x2 + w2 + r2 - 1, y2, r2, COLOR_DARKGREY);
}

1908774321_Screenshotfrom2019-06-2810-49-28.thumb.png.517f6a67b50fcd7d964848125291a7c0.png

1458841414_Screenshotfrom2019-06-2810-53-33.png.eb2fa63b06732df3fe42a936b2e74ee7.png


3 hours ago, 42bs said:

Maybe have a look into the poly routines from BLL (or cc65).

I'll see how far I get using the emulator, first. Besides, a significant part of enjoyment from this work is discovery :D

My only concern is how precise the emulation of Suzy can be, especially with regard to the internal workings of the sprite engine, i.e. whether it reproduces all of its glitches and behavior.

 

3 hours ago, karri said:

The problem with a polygon is the right edge as seen in the last small image.

 


Awesome glitch ! Originating in HW, I presume ? I think I noticed similar [last-pixel] jitter in some other games while watching it on YT.

 

Two-Pass rendering on 8 bits ! So.Much.Power.To.Spare :lol:

 

Is it actually faster than rendering it just on CPU ? I mean, we have to consider that 65C02 is halted all this time. Surely, Suzy is faster on first pass, but second ? Hard to say without having exact timings of each sprite call...

 

Incidentally, few months ago, I created a very fast scanline texturing routine on 6502, though it took 9 rewrites  :lol: But, it certainly could be made faster given 65C02 new instructions and opmodes. But that, right there, is a week-long detour :lol:

 

For road rendering, I will definitely first try the scanline texturing I did on Jaguar long time ago and see what kind of glitches I encounter. Then, I'd like to do the whole segment in one sprite call and see how it looks and behaves.

I suspect at least 4 versions. 5, if I include the 6502 texturing.

 

I suspect you guys have the Lynx HW ? Hopefully, within a week, I'll have some first ASM benchmark that I'll need tested.


On 6/27/2019 at 7:16 AM, VladR said:

I decided to experiment with Lynx a bit and see how far I can get in 6 weeks before the Lynx's 30th B-Day compo deadline. Initially, I'll be doing all prototyping in high-level C, before committing to a final ASM rewrite.

 


 

 

mmmmm, looks yummy.  Do I see the beginnings of Star Raiders for the Lynx in the making?  When I saw the 3D topic, that is where my mind immediately went, daydreaming of the possibilities it would lend to such a game.


Did some cycle measurements (ideal 65C02, not Lynx) with a small (maybe not 100% optimized) 65C02 line fill.

Here are the results:

    ;; two pixels even       66
    ;; two pixels odd        88
    ;; three even          114
    ;; three odd           116

    ;; 2n odd (n>=2)       116+(n-2)*13
    ;; 2n even (n>=2)      100+(n-2)*13
    ;; full line          1114

 

even/odd = X is even or odd.

The cycles per instruction are text-book ones, so the Lynx is slightly worse if a page gets crossed.
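The 2n formulas above can be turned into a small estimator, which also makes it easy to play with break-even guesses. `cpu_fill_cycles` follows the quoted figures directly. `break_even_pixels` uses a purely hypothetical Suzy cost model (fixed setup plus a cost per pixel pair); nobody in the thread has measured those numbers yet, so treat them as placeholders.

```c
/* 65C02 line-fill estimate from the figures quoted above.
   Valid for even pixel counts of at least 4 (2n pixels, n >= 2). */
long cpu_fill_cycles(int pixels, int x_is_odd)
{
    int n = pixels / 2;
    return (x_is_odd ? 116L : 100L) + (long)(n - 2) * 13L;
}

/* Smallest even span length (4..160) at which the hypothetical Suzy model
   suzy_setup + suzy_per_pair * n is cheaper than the CPU fill, or -1 if
   that never happens within one screen line. */
int break_even_pixels(long suzy_setup, long suzy_per_pair, int x_is_odd)
{
    int pixels;

    for (pixels = 4; pixels <= 160; pixels += 2) {
        long suzy = suzy_setup + suzy_per_pair * (pixels / 2);
        if (suzy < cpu_fill_cycles(pixels, x_is_odd)) {
            return pixels;
        }
    }
    return -1;
}
```

As a sanity check, `cpu_fill_cycles(160, 0)` gives 1114, which matches the "full line" figure.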

 

I thought I had read some info on Suzy's RAM access, but only that it reads 8 pixels in a burst for the collision check.

 

I have not yet checked Suzy's drawing speed for such, but my guess is it will outperform the CPU from a certain width on. How wide has to be found out ...


3 hours ago, Tidus79001 said:

 

mmmmm, looks yummy.  Do I see the beginnings of Star Raiders for the Lynx in the making?  When I saw the 3D topic, that is where my mind immediately went, daydreaming of the possibilities it would lend to such a game.

Not sure what exactly looked "yummy" there ? Was just a simple test of a depth-shaded cube.


Star Raiders ? You mean the 1979 one or the unfinished one from Aric Wilmunder ? Something like this, perhaps ?

Lynx02_StarRaiders.thumb.GIF.082db510151b8f2d1a09fdd45a724f51.GIF

There, you are now officially responsible for my today's 2-hour detour :lol:

 

Personally, I'm bigger fan of SRII (e.g. Last Starfighter), which is very arcadey. Now, those above-planet action sequences could certainly look great with 16 colors (plus some DLI for even more colors) on Lynx...

 

The unreleased SRII had also some stages where you were flying above the mountains. If we stripped the roll rotation and allowed only strafing up/down/left/right (more than enough for the camera control), we could use the interpolated vertex heightmap I did last year for Atari which allowed a pretty dense pixel cloud (24x24 = 576 vertices )at high framerates. Plus, Lynx is more than double frequency, so Lynx should easily double the framerate.

 


16 minutes ago, VladR said:

Not sure what exactly looked "yummy" there ? Was just a simple test of a depth-shaded cube.


Star Raiders ? You mean the 1979 one or the unfinished one from Aric Wilmunder ? Something like this, perhaps ? 

Lynx02_StarRaiders.thumb.GIF.082db510151b8f2d1a09fdd45a724f51.GIF


 

OMG, that looks amazing.  What is it?  I would love to try out that game even if it is unfinished.  Is this ROM available somewhere for download.


46 minutes ago, Flojomojo said:

@VladR you seem to like programming within powerful constraints. Ever have a look at PICO-8?

Yeah, but there's no emotional attachment to PICO, as there is to Atari. I didn't grow up with PICO.

29 minutes ago, Tidus79001 said:

OMG, that looks amazing.  What is it?  I would love to try out that game even if it is unfinished.  Is this ROM available somewhere for download.

That's not a game. Just a snapshot of a test scene  from the engine that I created because you mentioned Star Raiders (like I said, two-hour detour :lol: ). Scene does react to camera position, so stars and ships move accordingly. But, since it's high-level C, the framerate is abysmal :) Then again, it only took 2 hours (beauty of high-level language prototyping), and it was fun . When ported to Assembler, it should run at least 15-20 fps with 3 on-screen ships, and around 20-30 fps with 1 ship. Generic Line drawing is expensive. If we opted for the point cloud enemies (like unreleased SR), framerate would go up. I'm afraid that'd look way too simple for Lynx, though...

 

It could be made into something simple and playable in roughly 2 days of work, but Lynx can do better than that, so I don't think it's worth the effort for now.

 

 

 


41 minutes ago, VladR said:

Yeah, but there's no emotional attachment to PICO, as there is to Atari. I didn't grow up with PICO.

That's not a game. Just a snapshot of a test scene  from the engine that I created because you mentioned Star Raiders (like I said, two-hour detour :lol: ). Scene does react to camera position, so stars and ships move accordingly. But, since it's high-level C, the framerate is abysmal :) Then again, it only took 2 hours (beauty of high-level language prototyping), and it was fun . When ported to Assembler, it should run at least 15-20 fps with 3 on-screen ships, and around 20-30 fps with 1 ship. Generic Line drawing is expensive. If we opted for the point cloud enemies (like unreleased SR), framerate would go up. I'm afraid that'd look way too simple for Lynx, though...

 

It could be made into something simple and playable in roughly 2 days of work, but Lynx can do better than that, so I don't think it's worth the effort for now.

 

 

 

For 2 hours' worth of work it is amazing and inspiring.  Thanks for whipping up this example for me so quickly.  One of these days I am going to try to learn C or assembly, but honestly it's a bit daunting knowing where or how to begin such a journey with such a steep learning curve.  For a long time I have truly wanted to contribute to the retro gaming ecosystem that is keeping these old video game systems alive and thriving, as opposed to relegated to the dustbin of history.  What is the best entry point for such a learning endeavor (any constructive advice on this would be appreciated)?  Most tutorials I have seen seem to expect that the individual already has a rudimentary understanding of assembly or C (the last time I did any programming was 30 years ago in my teens, and that was Basic and Cobol and not at a very advanced level).  I would love to one day be able to leave my legacy for these old systems and have something to show here, along with all of the rest of the talented people here on AtariAge and throughout the retro gaming development community at large (as well as alongside the official titles that were programmed and published by the talented programmers back in the day when these systems were still on the market), whose talent and hard work I have had the pleasure of enjoying over these years while exploring my love of these old systems via purchasing and playing the games that they have created.



I don't want to derail this thread (outstanding work VladR, I'm eager to see the textured version now!), but if you want to start programming for an actual retro machine, I think the "easier" choices are:

 

A) You lean on your previous knowledge of BASIC (that is also quite simple to learn compared to C or assembly) and create games for:

- The Megadrive / Genesis (loads of power, quite simple architecture, and sprites/tiles based graphics shared with many other 8/16 bits consoles). As you know Basic, you can make some very cool games with BEX (http://devster.monkeeh.com/sega/basiegaxorz/) or Second Basic (http://www.second-dimension.com/second-basic) - both are Basic Languages running on Megadrive, and some "commercial" homebrew have been made with it.

 

- The Atari 2600, with the wonderful batariBasic (https://www.randomterrain.com/atari-2600-memories.html#batari_basic) - the 2600 is obviously more limited in term of CPU power and has more constraints display wise, but it's a very interesting console to create games for :). Many "commercial" homebrew have been created with it, it's very powerful and easy to learn / use.

 

B) You want to spend some time to learn the C programming language, so you can set your sights on:

 

- The Megadrive / Genesis has the easiest to use and most powerful retro SDK around: SGDK (https://github.com/Stephane-D/SGDK) - commercial homebrew like Tanzer or Xeno Crisis are made with it. Compared to other C-based SDKs for other consoles (GB, SNES, Lynx, etc.), it's easier to use as it comes with a "sprite engine" that does all the low-level work for you (setting VRAM usage, using DMA transfer to send new graphics data, etc.), so you can focus on creating "spriteA" and tell it to "play anim Y" for example.

 

- The GameBoy with GBDK is also a good candidate as the console is quite simple to understand, especially if you use the ZGB engine that lets you create scrolling games very easily (https://github.com/Zal0/ZGB)

 

- And of course, to go back on track with the thread, although I'm only starting to discover it, I find that the Lynx is very cool to program for - the tutorial available here will hold your hand through most of the learning process : https://atarilynxdeveloper.wordpress.com/

 

Please note that many other consoles have C (NES, SNES, NeoGeo, PC-Engine, SMS/GG, etc.) or BASIC (Atari 7800, Intellivision, Jaguar, etc.) based SDKs too.

 


Day 3:

- Ported the Generic Quad Rasterizer from my Jaguar codebase, so now we don't waste enormous performance on processing same scanline twice (as is the case with triangles) on quad-based geometry

- Used it on the track I created yesterday and it looks surprisingly useable in the low vertical resolution of Lynx.
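The scanline saving from rasterizing quads directly can be illustrated by counting how many scanlines each approach has to walk. This is a back-of-the-envelope sketch with invented names, not the rasterizer itself: the scanlines visited for a shape are taken to be its vertical extent.

```c
/* Number of scanlines covered by a set of vertex y coordinates. */
static int span_height(const int *ys, int count)
{
    int lo = ys[0];
    int hi = ys[0];
    int i;

    for (i = 1; i < count; i++) {
        if (ys[i] < lo) lo = ys[i];
        if (ys[i] > hi) hi = ys[i];
    }
    return hi - lo + 1;
}

/* A quad rasterizer walks each scanline of the quad once. */
int quad_scanlines(const int y[4])
{
    return span_height(y, 4);
}

/* The same quad split along the 0-2 diagonal into triangles (0,1,2) and
   (0,2,3): every scanline shared by both triangles is processed twice. */
int two_triangle_scanlines(const int y[4])
{
    int a[3];
    int b[3];

    a[0] = y[0]; a[1] = y[1]; a[2] = y[2];
    b[0] = y[0]; b[1] = y[2]; b[2] = y[3];
    return span_height(a, 3) + span_height(b, 3);
}
```

For a typical road quad with two top and two bottom vertices (say y = 10, 10, 33, 33), the quad path walks 24 scanlines while the two triangles walk 48.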

Lynx03_QuadRasterizer.thumb.GIF.8bedaea47f875eeb70de5567a7e8cb28.GIF

Lynx04_FlatshadedTrack.thumb.GIF.48c45a0c233d678ad914ba5193189872.GIF

This is a reference rasterizer, so scanlines are now drawn via tgi library (slow, but 100% safe and working).

 

Next, I will go learn working with Suzy's SCB, so we can get the flatshading HW accelerated at the 16 MHz  powah :)

 

I even think, this level of quality would be [almost] sufficient for the 30th B-day compo, but let's see if I can get something nicer going on...


12 hours ago, Tidus79001 said:

For 2 hours woth of work it is amazing and inspiring.  Thanks for whipping up this example for me so quickly in such a short span of time.  One of these days I am going to try to learn C or assembly but honestly it's a bit daunting knowing where or how to begin in such a journey with such a steep learning curve.  I truly for a long time have wanted to contribute to the retro gaming ecosystem that is keeping these old video games system alive and thriving as opposed to relegated to the dustbins of history.  What is the best way as an entry point for such a learning endeavor (any constructive advice on this would be appreciated)?  Most tutorials I have seen do seem to expect that the individual already has rudimentary understanding of assembly or C (last time I did any programming was 30 years ago in my teens and that was Basic and Cobol and not at an very advanced level).  I would love to one day be able to leave my legacy for these old systems and have something to show here along with all of the rest of the talented people here on AtariAge and throughout the retro gaming development community at large (as well as alongside the official titles that were progammed and published by the talented programmers back in the day when these systems were still on the market) whose talent and hard work that I have had pleasure of enjoying over these years while exploring my love of these old systems via purchasing and playing the games that they have created.

I have to admit, I wasn't sure earlier whether you were pulling my leg and just being sarcastic :)  But if you were, you most likely wouldn't have gone to the trouble of writing the post above. I don't really have much to add to dr.Ludos' post above, though. Just take the plunge and go at it.

So, if this thread gives you some inspiration, then at least something good has come out of it.

 

On the topic of your favorite Star Raiders, I tried creating some kind of dotted HyperSpace, but wanted a different look than the typical square or circle tunnels present in every other game out there. So, I came up with something like this:

Lynx05_HyperSpace.thumb.GIF.388d5ee21636dc2fb8b98cfea932ddf5.GIF

Not exactly happy about it. I had to disable the inner interpolated points for the further segments and keep only the pink outer points.

The integer (well, byte) calculations, while very fast, have clearly hit the precision wall here - and as you can see some lines are quite off and I should convert it to fixed-point calculations.

 

Funnily enough, when I did the same in 320x200 on Jag, precision was alright and there was no glitch. But, Lynx's vertical res is half, so this is one platform limit I just hit. This means that the terrain sequence I wanted to try next, will have the same glitch behavior (for distant terrain patches) and I won't be able to have a super dense point cloud of a terrain unless I reimplement it using fixed-point. Oh, well...
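The kind of error described above can be reproduced with a tiny comparison between a byte-style integer step and an 8.8 fixed-point step. This is only an illustration for non-negative screen coordinates, with made-up function names; it is not the hyperspace routine itself.

```c
#include <stdint.h>

/* Integer-only interpolation: the step is truncated once, and the error
   then grows with every interpolated point. */
int lerp_int(int a, int b, int n, int i)
{
    int step = (b - a) / n;
    return a + step * i;
}

/* 8.8 fixed-point interpolation: the step keeps 8 fractional bits and the
   result is rounded to the nearest pixel. */
int lerp_fixed(int a, int b, int n, int i)
{
    int32_t step = ((int32_t)(b - a) * 256) / n;
    return (int)(((int32_t)a * 256 + step * i + 128) / 256);
}
```

Interpolating from 0 to 50 in 9 steps, the integer version ends at 45 instead of 50, while the fixed-point version lands on 50. On a 102-line screen, an error of that size is easy to spot.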

 

However, in movement, it should be alright. Plus, this is very fast.  I have a 6502 Asm routine that interpolates points across a generic quad that takes 2,334 cycles (transforms 4 base vertices, creates+renders additional 15, plus renders base 4) for 19 points. I adjusted the routine above to interpolate 81 new points, but it should still run at around 30 fps, which is enough for a hyperspace anyway...


10 hours ago, karri said:

I can hardly wait to play this one. After all my 3D struggles on Stardreamer I really admire if you get this up and running at decent speed.

Which one you mean ? The SR [StunRunner] or SR [StarRaiders] :)  ?

 

StarRaiders is mostly empty screen and those 1-3 ships don't have a lot of scanlines (I'll explain why below). It has, however, a generic rotation which is expensive even with lookup tables.

So, given that we have roughly ~56,400 cycles per frame, I really think it should run flatshaded at around 20-30 fps. Which is plenty smooth for the resolution and pace of game.

 

For StunRunner, on the other hand, here's a rough estimate (data is averaged) from my Excel sheet, which has cycle data for each component of my 6502 flatshader:

To account for a worst-case scenario (say, for an indoor environment, where distant polygons still occupy full screen width), this table assumes the polygons are filling screen at full 160px, which is not the case here due to perspective, so in reality, the cycle count should be slightly lower, as each scanline is shorter than the previous one (going up).

 

| Track Segment | Average Scanlines per polygon | Polygons | Scanlines | Edges | Edge Pixels | Quad Pixels | Transform Cost: 140.5c | StepHi,Lo Compute: 160c | Scanline Loop: 119c | Line Equation: 2 edges: 42c | Edge Fill: 15c | Inner Fill: 8c | Total Cycles |
| 1 | 24 | 5 | 120 | 240 | 480 | 840 | 843 | 1,600 | 14,280 | 10,080 | 3,600 | 6,720 | 37,123 |
| 2 | 8 | 5 | 40 | 80 | 160 | 280 | 843 | 1,600 | 4,760 | 3,360 | 1,200 | 2,240 | 14,003 |
| 3 | 3 | 5 | 15 | 30 | 60 | 105 | 843 | 1,600 | 1,785 | 1,260 | 450 | 840 | 6,778 |
| 4 | 2 | 5 | 10 | 20 | 40 | 70 | 843 | 1,600 | 1,190 | 840 | 300 | 560 | 5,333 |
| 5 | 1 | 5 | 5 | 10 | 20 | 35 | 843 | 1,600 | 595 | 420 | 150 | 280 | 3,888 |
| Subtotals | 38 | | 190 | | 760 | 1,330 | 4,215 | 8,000 | 22,610 | 15,960 | 5,700 | 10,640 | 67,125 |
| Percent | | | | | | | 6 | 12 | 34 | 24 | 8 | 16 | |
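The table rows can be reproduced with a small cost model, which is handy for trying other segment sizes. The per-item costs are taken from the column headers. The figure of 7 quad pixels per scanline is not stated anywhere; it is inferred from the table (840 quad pixels over 120 scanlines), and the function names are invented for this sketch.

```c
/* Fixed per-segment costs, as listed in the table. */
#define TRANSFORM_CYCLES   843L
#define STEP_CYCLES       1600L

/* Total cycles for one track segment under the table's averaged model. */
long segment_cycles(int scanlines_per_polygon, int polygons)
{
    long scanlines   = (long)scanlines_per_polygon * polygons;
    long edges       = scanlines * 2;    /* left + right edge per scanline */
    long quad_pixels = scanlines * 7;    /* inferred average, see note above */

    return TRANSFORM_CYCLES
         + STEP_CYCLES
         + scanlines * 119               /* scanline loop */
         + scanlines * 2 * 42            /* line equation, 2 edges */
         + edges * 15                    /* edge fill */
         + quad_pixels * 8;              /* inner fill */
}

/* Share of the total, rounded to the nearest percent. */
int percent_of(long part, long total)
{
    return (int)((part * 100 + total / 2) / total);
}
```

In this model every scanline costs 289 cycles on top of a fixed 2,443 per segment, which is why the scanline count dominates everything else.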

 

The first major difference to the Atari 800 is that we don't have quad-pixels. A byte on Lynx only fills 2 pixels (not 4, like on A800). So, that stage has to be doubled in cost (for CPU filling), but we have Suzy to help with that. Unfortunately, as you can see, it was only ~16% of the whole 3D pipeline, so it's not exactly a huge help that we have Suzy to assist with filling.

But, if we use Suzy to fill scanlines, then we can also remove the Edge Fill stage (8%), so that's (8+16) = 24% right off the bat.

 

The second major difference to the Atari 800 is that there's 8.43 KB less memory due to framebuffer size (15.93 KB - 7.5 KB, for two buffers each). 160x96x4 only takes 3.75 KB on the A800 versus 7.96 KB on Lynx. That's a problem, since my LUTs consume a lot of RAM. Precision will be affected if I have to chop the tables in half...

Atari 800's 64 KB is more than Lynx's 64 KB, unfortunately, from practical standpoint.
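The memory figures are easy to double-check. The sketch below assumes a 160x102 Lynx screen at 4 bits per pixel, an A800 mode of 160x96 with 4 colors (2 bits per pixel), and two buffers on each machine, which is what the 15.93 KB and 7.5 KB totals imply.

```c
/* Size of one framebuffer in bytes. */
long framebuffer_bytes(int width, int height, int bits_per_pixel)
{
    return (long)width * height * bits_per_pixel / 8;
}

/* Extra RAM (in bytes) that the Lynx spends on double buffering compared
   to the A800 mode described above. */
long extra_lynx_bytes(void)
{
    long lynx = 2 * framebuffer_bytes(160, 102, 4);  /* 2 x 8,160 bytes */
    long a800 = 2 * framebuffer_bytes(160, 96, 2);   /* 2 x 3,840 bytes */
    return lynx - a800;
}
```

That works out to 8,640 bytes, i.e. the roughly 8.4 KB that is no longer available for lookup tables.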

 

Hopefully, from the table above it is obvious why I keep mentioning Scanline Count in every thread. The main scanline traversal stages (StepHi,Lo Compute + Scanline Loop + Line Equation) consume (12+34+24) = 70% of the whole 3D pipeline, as they have a fixed cost per scanline. Without drawing *anything* on screen. I've had scenes on A800 where that number almost crossed 80% (without a single pixel drawn)...

 

At this point, there are 2 unanswered questions on my end:

1. How many CPU cycles on Lynx are killed by clearing the framebuffer

2. How fast Suzy's 16 MHz really is in drawing short scanlines

 

Today, I'm going to start messing with Suzy's SCB and hopefully, within next couple days, will be ready to start porting to ASM and start uploading public benchmark builds.
