
Lynx 3D Experimenting


VladR


Off-topic: How's Espoo this time of year? I don't believe you get the midnight sun so far south, right?

 

I used to work in Oulu, in winter. It was awesome - perpetual darkness throughout the whole winter, occasionally broken by a merely slightly less dark sky for 2-3 hours, around 11am. Assuming there was no cloud cover. Which is impossible to say because it's always dark :lol:

 

I soooo miss Finland, you guys have saunas everywhere. The apartment we had also had a sauna! And those were real, proper, Russian-style saunas, where you don't even dare to breathe deeply :lol:

 

 

 

It was surprisingly warm, though. I don't believe it ever got below -30 °C. I must have arrived during some warm spell or something...

Link to comment
Share on other sites

Espoo is beautiful in summer. It has a pleasant temperature (17 degrees just now at 11 AM) and the days are long. We actually have a pretty cool sauna from the 40's at Keitele. It is a smoke sauna without a chimney. Instead of a chimney it has a "räppänä", a hole in the roof that you close after the fire is out and you start your bathing. Feels like heaven. Plus there are no neighbors so you can just skinny dip in the lake.

sauna.thumb.jpg.a79932e1bcb15c0e446ebde10350902a.jpg

 

About Stardreamer... I have most 3D code operating but the project may be on hold until I can get the past tracks implemented. Perhaps next year...

Link to comment
Share on other sites

Damn, I am jealous now! That environment oozes charm!

 

For the last few days I have been prototyping the base engine for the game I will try to finish for the compo. But just in case I overshot the complexity given the fixed deadline :) , would you be interested in something smaller in scale that we could cooperate on?

 

I believe you can be part of multiple teams, just can't submit multiple games by yourself.

 

What do you think?

Link to comment
Share on other sites

Unfortunately OnDuty still lacks two endboss fights, four levels are missing, and the credits music and ship music are missing. Plus I need two or three interactive cut scenes. So no chance for anything before the deadline.

Link to comment
Share on other sites

Never mind, I figured it wouldn't hurt to ask :)

 

On second thought, the water in that lake, isn't it too hot now in the summer? I grew to enjoy the thermal shock of going from the hot sauna into the freezing water. The coldest air I was ever lucky to experience after sauna was -20 °C. It was an incredible experience - every single cell in your body exploding :lol:

 

The music - do you compose it yourself?

Link to comment
Share on other sites

The water temperature is about the same as the air temp right now. In winter the water in a lake is easily 20 degrees warmer than the air. So freezing water feels nice after the sauna. It is a nice experience. What makes it even better is that after the cold dip you don't sweat.

 

The summer place is on an island with no power. I have only 2 solar panels so tablets are good choices for replacing computers.

 

The music is composed by me. Here is a small sample from OnDuty theme.

 

The part in Morse code is my title signature that will follow throughout the game. In the intro I use a funk background.

http://79.125.115.174/pics/title.wav

Link to comment
Share on other sites

3 hours ago, agradeneu said:

How about a on rails REZ/Starblade like shmup?:-)

I believe the easiest path for me would be to have a set of exhaust dots that expand. Every vessel emits one stationary dot every 2 seconds, and the dots expand and change colour until they vanish after 10 seconds.
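A minimal sketch of how such an exhaust trail could be kept in a small ring buffer. Only the 2-second emission interval and the 10-second lifetime come from the post; the 60 fps frame rate, the radius and palette sizes, and all names (`ExhaustTrail`, `trail_update`, and so on) are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAMES_PER_SECOND 60u                            /* assumed frame rate */
#define EMIT_INTERVAL     (2u * FRAMES_PER_SECOND)       /* one dot every 2 s */
#define DOT_LIFETIME      (10u * FRAMES_PER_SECOND)      /* dot vanishes after 10 s */
#define MAX_DOTS          (DOT_LIFETIME / EMIT_INTERVAL) /* at most 5 live dots */
#define MAX_RADIUS        8u                             /* hypothetical final size */
#define NUM_COLOURS       4u                             /* hypothetical palette ramp */

typedef struct {
    int16_t  x, y, z;  /* vessel position at emission; the dot never moves */
    uint16_t age;      /* frames since emission; DOT_LIFETIME means dead */
} ExhaustDot;

typedef struct {
    ExhaustDot dots[MAX_DOTS];
    uint8_t    next;        /* ring-buffer slot reused by the next emission */
    uint16_t   emit_timer;  /* frames left until the next emission */
} ExhaustTrail;

void trail_init(ExhaustTrail *t) {
    uint8_t i;
    assert(t != NULL);
    for (i = 0; i < MAX_DOTS; ++i) t->dots[i].age = DOT_LIFETIME;
    t->next = 0;
    t->emit_timer = 0;
}

/* Advance the trail by one frame; (x, y, z) is the vessel's current position. */
void trail_update(ExhaustTrail *t, int16_t x, int16_t y, int16_t z) {
    uint8_t i;
    assert(t != NULL);
    for (i = 0; i < MAX_DOTS; ++i)
        if (t->dots[i].age < DOT_LIFETIME) ++t->dots[i].age;
    if (t->emit_timer == 0) {
        ExhaustDot *d = &t->dots[t->next];
        d->x = x; d->y = y; d->z = z; d->age = 0;
        t->next = (uint8_t)((t->next + 1u) % MAX_DOTS);
        t->emit_timer = EMIT_INTERVAL;
    }
    --t->emit_timer;
}

uint8_t trail_live_count(const ExhaustTrail *t) {
    uint8_t i, n = 0;
    for (i = 0; i < MAX_DOTS; ++i)
        if (t->dots[i].age < DOT_LIFETIME) ++n;
    return n;
}

/* Radius grows linearly from 1 towards MAX_RADIUS over the dot's life. */
uint8_t dot_radius(const ExhaustDot *d) {
    assert(d->age < DOT_LIFETIME);
    return (uint8_t)(1u + (uint32_t)d->age * (MAX_RADIUS - 1u) / DOT_LIFETIME);
}

/* Palette index steps through NUM_COLOURS shades as the dot ages. */
uint8_t dot_colour(const ExhaustDot *d) {
    assert(d->age < DOT_LIFETIME);
    return (uint8_t)((uint32_t)d->age * NUM_COLOURS / DOT_LIFETIME);
}
```

Because a dot lives exactly five emission intervals, the oldest slot is always dead by the time it is reused, so no free-list is needed.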

Link to comment
Share on other sites

  • 2 weeks later...

I spent over a week writing C code that handles perspective texturing using just 8-bit math and without any real-time divisions (just a few short tables) and, most importantly, on an angled surface (much, much harder than horizontal/vertical walls). Not sure right now if it's a good idea to pursue this for the contest or rather step back to flatshading, given there are 4 weeks remaining...

 

It was great research, and it turned out to be way more work than I anticipated: even though it worked 3 days ago, I spent another 3 days reimplementing it to avoid various minor glitches. While I previously created something similar on the Jaguar, I could use a division per scanline there just fine (as it ran on the GPU, where division can be pipelined for free with a bit of careful rearranging), so it was, to say the least, a misleading experience :)

 

It is fully clipped, and movement is the same as in the Lynx's AVP techdemo (i.e. kinda grid-based, but smooth, per-pixel). I presume the movement phase will be detached from the action phase, i.e. you will shoot the enemy, but won't be able to move till he's done. This, for sure, will require some procedural enemy behavior, so it's not just simple cursor pointing at an enemy running straight at you.

The advantage will be full 60-fps gameplay during the action phase. I believe it should move quite smoothly, as the speed of movement is low (it's not a high-speed racer after all).

 

I also just realized, that since this is not a raycaster, but a polygon rasterizer, I could place crates and barrels, behind which the enemy could hide...

 

This is running on my personal Atari:Dev.Emu, which is very productive for prototyping because:

- Visual Studio

- Debug Edit&Continue

- Debugging with a few watch windows, an Output window full of debug output, and array contents opened with the mouse is way more productive

- I can merge both ASM and C code accessing same variables, so porting C to ASM is very quick and easy

- by default it does cycle counting for all ASM code

- Asm targets: 6502, 6502C, 68000, RISC: GPU, RISC:DSP

- 4 main platforms: Atari 800, Jaguar, Eclaire XL, Lynx

- All Atari's resolutions from the above platforms

- I can very quickly emulate/set/limit the target framerate to get a feel for how smooth it will be at the expected framerate

 

Textures are just something quick that I did in a text editor. It sure could use a crafty hand of a skilled 2D artist :)

 

Lynx11_Shooter_WireFrame.thumb.GIF.d4cac8737d13aa43c2603c08863ac957.GIF

Lynx12_Shooter.thumb.GIF.723dfc228fd13ffa8c4560c0afb88ec7.GIF

Link to comment
Share on other sites

7 hours ago, Cyprian_K said:

looks great

 

 

would it be possible to share your dev environment?

At some point, I should upload it to GitHub. For A800, I mostly need to implement a DisplayList component, as right now my solution simply creates a fixed resolution. I just didn't really need a full Display List (where each scanline is a different resolution), as all my prototyping relies on a static resolution anyway.

 

Do you have time to experiment with it? I could perhaps spend some time cleaning it up and send you a current build, if you really want?

4 hours ago, bhall408 said:

 

Beautiful!

 

Reminds me of Yoomp, one of my favorite Atari 8-bit home-brews. A Lynx version of that would be awesome!

 

I don't think the textured tunnel in its current form looks remotely pretty. It needs serious texture work, as there's too much visual noise right now, at least in a screenshot. That's one of the reasons why I prefer the nice&clean look of flatshading (even on Jaguar)...

 

In movement, it looks somewhat more OK, though. While I have some idea of the Lynx's scaling performance, some benchmarking on HW is needed to determine the actual target framerate. It surely pushes Suzy to its limits, though...

 

Yeah, a Lynx build of Yoomp would be great, just not that awesome given the raw power differential.

Link to comment
Share on other sites

Despite the textured visuals being very grainy and noisy, there are a few advantages that come with it:

- you can turn the light on/off for separate tunnel segments (e.g. light switch) just by using empty textures

- you can generate new textures procedurally and merge different texture elements at run-time

- once I implement mipmaps, I can get a fogging/darkening-like distance effect for free

 

 

Now, given there are exactly 4 weeks till the compo deadline from today, I need to seriously sit down, put a precise estimate on the remaining components and decide whether I'll go with the textured look or downgrade to flatshading. It would be ridiculous to miss the deadline, but given how many unknowns I'm dealing with (no HW to test builds on, unknown behavior of scaled clipping), it'd be better to release something less technologically advanced, yet fully playable...

Link to comment
Share on other sites


I would be interested in a non-GPL Atari 800 emulator (MIT, Apache, etc OK), that you could think of as an XEGS. Ditto for 5200 and 7800.

 

Link to comment
Share on other sites


Not sure I fully follow the part about the XEGS/5200/7800?

 

This is obviously not meant as an end-user emulator, because the use case is entirely different. Unlike running ROMs (which I never even considered to be worth my time to implement - after all we have plenty of gaming emulators), here I want to have as much dev productivity as possible and do rapid prototyping that is otherwise impossible in a classic CC65 environment:

- unlimited RAM on PC means I can keep all the versions of the current technique with all their tables and data in one code (and run them again at any time during dev)

- for 3D prototyping, it's very convenient to start with a floating-point version, then switch to integer and, if needed, eventually to fixed-point

- once the C code is up&running, you can start porting smaller pieces to 6502 Asm. I usually start with the inner loop (to get a feel for what the final cycle cost will be), but keep all other C code untouched

- this allows for quick coding and debugging of gradually faster versions

- Edit&Continue is an especially critical feature of Visual Studio to have
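To illustrate the float-to-fixed-point progression from the list above, here is a minimal sketch that keeps a floating-point reference next to an 8.8 fixed-point version of the same operation. The 8.8 format and the names `scale_float`, `scale_fixed` and `fix_from_float` are assumptions for illustration, not code from DevEmu.

```c
#include <assert.h>
#include <stdint.h>

/* Stage 1: floating-point reference, only ever run on the PC. */
float scale_float(float value, float factor) {
    return value * factor;
}

/* Stage 3: 8.8 fixed point (assumed format): 256 represents 1.0. */
typedef int16_t fix8_8;
#define FIX_ONE 256

/* Conversion happens at table-build time, never in the inner loop. */
fix8_8 fix_from_float(float f) {
    return (fix8_8)(f * FIX_ONE);
}

/* Integer-only multiply; dividing by FIX_ONE drops the fraction
   (truncating toward zero, also for negative values). */
int16_t scale_fixed(int16_t value, fix8_8 factor) {
    int32_t product = (int32_t)value * factor / FIX_ONE;
    assert(product >= INT16_MIN && product <= INT16_MAX);
    return (int16_t)product;
}
```

Keeping both versions in one build makes it possible to compare them on the same input at any time, which is the kind of side-by-side check the unlimited PC RAM allows.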

 

The ability, at any moment during coding, to hit F9+F5 and get a full debugger (with watches, etc.) within a few seconds is simply unbeatable. Because the 6502 code is implemented via C macros, you can step through it the same way you step through the C code.
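A minimal sketch of what "6502 code implemented via C macros" could look like, with cycle counting built in. This is not the actual DevEmu source; the macro names, the `Cpu` struct and the `ram` array are hypothetical. The cycle counts are the standard 6502 ones for these addressing modes (binary mode, no page-crossing cases involved).

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t  a, x, y;
    uint8_t  flag_z, flag_n, flag_c;
    uint32_t cycles;               /* running cycle count for the ASM part */
} Cpu;

static Cpu     cpu;
static uint8_t ram[65536];         /* shared by the C code and the "ASM" macros */

#define SET_NZ(v)     do { cpu.flag_z = ((v) == 0); \
                           cpu.flag_n = (((v) & 0x80) != 0); } while (0)
#define LDA_IMM(v)    do { cpu.a = (uint8_t)(v); SET_NZ(cpu.a); cpu.cycles += 2; } while (0)
#define LDA_ABS(addr) do { cpu.a = ram[(addr)]; SET_NZ(cpu.a); cpu.cycles += 4; } while (0)
#define STA_ABS(addr) do { ram[(addr)] = cpu.a; cpu.cycles += 4; } while (0)
#define CLC()         do { cpu.flag_c = 0; cpu.cycles += 2; } while (0)
#define ADC_IMM(v)    do { uint16_t sum_ = (uint16_t)(cpu.a + (uint8_t)(v) + cpu.flag_c); \
                           cpu.flag_c = (sum_ > 0xFF); cpu.a = (uint8_t)sum_; \
                           SET_NZ(cpu.a); cpu.cycles += 2; } while (0)

/* An inner loop body already "ported to ASM", callable from plain C. */
uint8_t add_offset(uint16_t addr, uint8_t offset) {
    cpu.cycles = 0;
    LDA_ABS(addr);
    CLC();
    ADC_IMM(offset);
    STA_ABS(addr);
    return ram[addr];
}

/* Usage: 250 + 10 wraps to 4 with carry set, at a cost of 4+2+2+4 cycles. */
void demo_add_offset(void) {
    ram[0x2000] = 250;
    assert(add_offset(0x2000, 10) == 4);
    assert(cpu.flag_c == 1);
    assert(cpu.cycles == 12);
}
```

Since each macro is ordinary C, a debugger can single-step it and watch `cpu` and `ram` like any other variables.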

 

Link to comment
Share on other sites

While right now every day before the compo deadline is precious, I believe that after the compo I could spend a few days and:

- clean the DevEmu codebase from all my experiments

- finish the remaining unimplemented 6502C-specific instructions

- upload the Lynx-specific build to GitHub

Link to comment
Share on other sites

Your progress looks great. Remember you have until the 20th of July to register and you don't have to submit until the 11th of August. Also ROMs will not be posted until around 17th of August.

Link to comment
Share on other sites

  • 2 weeks later...

While away for a few days last week (which was before I found out there was an extension to the compo deadline) I figured I don't really like the uber-pixelated look of perspective texturing (combined with just straight movement). While great from a technological perspective, it's really ugly from a visual perspective. So, it's back to flatshading, which means I can also have strafing along the X and Y axes (not just straight movement as was the case with the textured tunnel).

 

Some progress:

- removed 99% of 16-bit computations (within the renderer) and replaced them with 8-bit ones (resulting in an obvious massive speed-up)

- this enabled inserting my old 8-bit Clip-Space, which is a great match for Lynx's low resolution

- I have full SW clipping working now (except for the top screen edge)

- implemented a simplified version of the Track Engine I have on Jaguar - using the BaseSegment system, which creates a level track by offsetting the track's base segment along an XY displacement curve

- this gives me curves and hills for free

- depth of Base Segment is a run-time variable, so this gives opportunity to adjust polygon complexity versus view distance

- render distance (number of base segments) is also run-time to allow for flexibility (should it be needed)

- currently I have 3 different base segments, but it's generic enough that it'd work if I had 50, so I'll be adding more during upcoming weeks

- coded basic level loading system, tested with 3 levels

- color shading scheme is randomized per level (but colors are obviously not completely random so that it looks OK)

- I also created a fly-through unit test that randomizes strafing and flies through all levels automatically (great thing for regression testing)

- all of above is running in Handy already
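A minimal sketch of the base-segment idea from the list above: one small cross-section is repeated into the distance and shifted by an XY displacement curve, which is what yields curves and hills. The vertex data, the delta-encoded curve and all names are hypothetical; the real Track Engine may store its curve differently.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_VERTS 4u
#define MAX_SLICES 8u

typedef struct { int8_t x, y; } Offset2;
typedef struct { int16_t x, y, z; } Vertex3;

/* Hypothetical base segment: a rectangular tunnel cross-section. */
static const Offset2 base_segment[BASE_VERTS] = {
    { -20, 10 }, { -20, -10 }, { 20, -10 }, { 20, 10 }
};

/* Hypothetical displacement curve, stored as per-slice deltas (an assumption):
   X deltas bend the track sideways, Y deltas produce hills. */
static const Offset2 displacement_curve[MAX_SLICES] = {
    { 0, 0 }, { 0, 0 }, { 2, 0 }, { 4, 1 }, { 4, 2 }, { 2, 2 }, { 0, 1 }, { 0, 0 }
};

/* slices        = render distance in base segments (run-time variable)
   segment_depth = Z spacing between consecutive slices (run-time variable)
   Returns the number of vertices written to out. */
uint16_t build_track(Vertex3 *out, uint8_t slices, uint8_t segment_depth) {
    int16_t  offset_x = 0, offset_y = 0;
    uint16_t count = 0;
    uint8_t  s, v;

    assert(slices <= MAX_SLICES);
    for (s = 0; s < slices; ++s) {
        offset_x = (int16_t)(offset_x + displacement_curve[s].x);
        offset_y = (int16_t)(offset_y + displacement_curve[s].y);
        for (v = 0; v < BASE_VERTS; ++v) {
            out[count].x = (int16_t)(base_segment[v].x + offset_x);
            out[count].y = (int16_t)(base_segment[v].y + offset_y);
            out[count].z = (int16_t)(s * segment_depth);
            ++count;
        }
    }
    return count;
}
```

Raising `segment_depth` extends the view distance at the same polygon count, while lowering `slices` reduces the polygon count directly, which matches the trade-off described in the list.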

 

Quad Rasterizer:

- The quad rasterizer is now 95% in assembler, only the polygon outer loop is in C and will stay for a while (for debugging purposes)

- I finally have all its stages benchmarked so I can adjust the scene complexity without guessing (as I don't have a Lynx)

- I rewrote the edge scanline traversal and ditched the 16-bit FixedPoint one that was based on Jaguar's HW architecture. I went for a tweaked Bresenham, which helped tremendously visually because I no longer have holes between edges: Bresenham follows the contour exactly the same way every single time (not the case with the fixed-point system - hence the holes).

- Since I have the benchmarks now, it's obvious it doesn't even make sense to go for 8-bit fixed-point scanline traversal, even if the divisions were done in parallel on Suzy for ~free, as the additional overhead of Bresenham is so minuscule that just spinning Suzy up wouldn't make up for it. And the edges are real pretty.
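A minimal sketch of a Bresenham-style edge walker that yields one X per scanline using only additions, subtractions and one shift. This is a plain textbook variant, not the "tweaked" version mentioned above, and `trace_edge` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Walks the edge (x0,y0)-(x1,y1) from top to bottom (requires y1 > y0) and
   stores the edge's X for every scanline in out_x[0 .. y1-y0].
   Returns the number of scanlines written. No division or multiplication. */
int16_t trace_edge(int16_t x0, int16_t y0, int16_t x1, int16_t y1, int16_t *out_x) {
    int16_t dx = (int16_t)(x1 - x0);
    int16_t dy = (int16_t)(y1 - y0);
    int16_t step = 1;
    int16_t x = x0;
    int16_t err, row;

    assert(dy > 0);
    if (dx < 0) { dx = (int16_t)-dx; step = -1; }
    err = (int16_t)(dy >> 1);              /* start half-way for rounding */

    for (row = 0; row <= dy; ++row) {
        out_x[row] = x;
        err = (int16_t)(err + dx);         /* accumulate horizontal distance */
        while (err >= dy) {                /* steep or shallow: loop handles both */
            x = (int16_t)(x + step);
            err = (int16_t)(err - dy);
        }
    }
    return (int16_t)(dy + 1);
}
```

As long as an edge shared by two polygons is always walked between the same two endpoints in the same direction, both polygons get identical X values on every scanline, which is why the gaps disappear.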

 

I really like CC65, as all initialization during level loading can be very efficiently prototyped in C, while only performance-critical stuff is in Asm.

Link to comment
Share on other sites

I'm in the middle of replacing multiplications and divisions with the Suzy math engine - I just implemented multiplication. Of course, I can currently only test on Handy (so that's far from presuming the same behavior on target HW), but it would appear that Handy produces the result instantly.

 

But the documentation I have says that I'm supposed to poll MULTSTAT (which is nowhere to be found, so I figured it must be SPRSYS: $FC92, as the docs say "Bit 7 : Math in process").

 

So, I poll $FC92 but I don't even get one full loop iteration, as it exits instantly. Far sooner than the 54 cycles that the MUL is supposed to take. I even tried super high numbers, so the result is 32-bit, but that makes no difference on Handy. Then again, I'm not sure if MUL execution speed is also dependent on the size of the operands (as is the case with DIV).

 

Unless the 54 ticks (that MUL is supposed to take) are not 6502 cycles (e.g. NOP takes 2), but correspond to 54/5 = 10.8 cycles of the 6502. Then it would make sense that the polling code can barely do LDA CMP BCC in that time.

 

So, which one is it? 54 or 10.8? If it's the 10.8, then the emulator is behaving right.

 

Although, I just realized, I could try to do three loads:

2c NOP

2c NOP

4c LDA $FC92

4c LDX $FC92

4c LDY $FC92

 

Thus the third one (LDY) should show a different value than the first one (LDA), because 12c will have passed before the LDY.

 

Link to comment
Share on other sites

I just did the division, and literally the very first LDA $FC92, after MATHE is written to, shows that the result is ready.

And the divisor has 5 significant zeroes, so it should take 176 + 14*5 = 246 cycles.

 

So, I guess the emulator provides MATH results instantly.

 

 

Since I don't have a Lynx, will the following wait loop be safe on target HW?

 

while (PEEK(0xFC92) >= 128) waitCounter += 1;

 

Link to comment
Share on other sites


Just tried this:
 

        lda #55
        sta $f152
        stz $f153
        sta $f154
        stz $f155
.\wait
        lda $f192
        bmi .\wait
        lda $f160

With these dummy writes to RAM: 72*64µs for 512 iterations => 9µs per iteration.

The same with $FCxx takes 12.25µs per iteration.
So the pure multiplication is 3.25µs = 52 cycles (each 62.5ns, i.e. 16 MHz ticks).

 

Link to comment
Share on other sites

Thanks. I presume you used 64 µs timers for this measurement? You also used $f192 a few times, which is surely just a typo and you meant $FC92, correct?

 

Since you confirmed it's the 16 MHz system ticks (and not the 4 MHz 65C02 ticks), here's a conversion table of the time it takes in CPU cycles (which is much more useful than system ticks):

 

 image.png.bb55987876898462e7e0bdd57e77b3f0.png
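A small sketch that recomputes this kind of table from the figures quoted in this thread: 54 ticks for a multiply, 176 + 14*N ticks for a divide, and a 16 MHz tick clock against a 4 MHz CPU clock. The flat factor of 4 ticks per CPU cycle, the 16-bit limit on N and the function names are assumptions; the values in the attached image may have been derived differently.

```c
#include <assert.h>
#include <stdio.h>

#define TICKS_PER_CPU_CYCLE 4u   /* assumption: 16 MHz ticks / 4 MHz CPU */
#define MUL_TICKS           54u  /* documented multiply time quoted in the thread */

/* Documented divide time quoted in the thread: 176 + 14 * N ticks,
   where N is the number of significant zeroes in the divisor. */
unsigned div_ticks(unsigned zero_bits) {
    assert(zero_bits <= 16u);    /* assumption: the divisor is 16 bits wide */
    return 176u + 14u * zero_bits;
}

/* Round up, since a polling loop can only observe whole CPU cycles. */
unsigned ticks_to_cpu_cycles(unsigned ticks) {
    return (ticks + TICKS_PER_CPU_CYCLE - 1u) / TICKS_PER_CPU_CYCLE;
}

/* Prints one line for the multiply and one per divide case. */
void print_table(void) {
    unsigned n;
    printf("MUL        : %3u ticks = %2u CPU cycles\n",
           MUL_TICKS, ticks_to_cpu_cycles(MUL_TICKS));
    for (n = 0; n <= 16u; ++n)
        printf("DIV (N=%2u) : %3u ticks = %2u CPU cycles\n",
               n, div_ticks(n), ticks_to_cpu_cycles(div_ticks(n)));
}
```

With these assumptions the 54-tick multiply rounds up to 14 CPU cycles, and the divide with N = 5 from the earlier post (246 ticks) to 62.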

 

This really sucks and is quite slow. I should have gone with my tables, which are much faster for 3D transforms. Well, next time. Can't spend all the time just tweaking the engine performance. Gotta get it up&running first...

 

 

 


Link to comment
Share on other sites

Yes, in particular DIVs are much faster with well-established table math. Actually, the log or square tables are faster than the Suzy MUL as well, just less parallel. And let's face it, there have not been many technically advanced games for the Lynx since the commercial era.

Link to comment
Share on other sites
