
jaglib and DSP program offsets


Luigi301


You can gain a bit of speed with the following changes, exploiting the pipeline:

    shlq    #2,SQRT_REM_HI
.sqrt_loop:
    move    SQRT_REM_LO,TEMP1
    sh      SQRT_THIRTY,TEMP1
    or      TEMP1,SQRT_REM_HI
    shlq    #2,SQRT_REM_LO
 
    shlq    #1,SQRT_ROOT
    move    SQRT_ROOT,SQRT_TEST_DIV
    shlq    #1,SQRT_TEST_DIV
    addq    #1,SQRT_TEST_DIV
 
    cmp     SQRT_TEST_DIV,SQRT_REM_HI
    jump    ge,(SQRT_LOOP_CHECK) ;if remHi >= testDiv
    subq    #1,SQRT_COUNT        ;delay slot: executes whether or not the jump is taken
 
 
    sub     SQRT_TEST_DIV,SQRT_REM_HI
    addq    #1,SQRT_ROOT
 
.sqrt_do_loop:
 
    cmpq    #-1,SQRT_COUNT
    jump    ne,(SQRT_LOOP_ADDR) ; if not -1, keep looping
    shlq    #2,SQRT_REM_HI      ; delay slot: pre-shifts remHi for the next pass
 
    move    SQRT_ROOT,r0
 
    GPU_RTS
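
For anyone following along, the routine above looks like the classic two-bits-per-pass integer square root. Below is a rough Python model of that algorithm, handy for checking results on the PC side. The function name `isqrt32` and the fixed 16 passes for a 32-bit integer input are assumptions of this sketch; the assembly may use a different pass count (for example for fixed-point inputs).

```python
def isqrt32(x, passes=16):
    """Bit-pair integer square root of a 32-bit value (hypothetical PC-side model)."""
    rem_hi = 0
    rem_lo = x & 0xFFFFFFFF
    root = 0
    for _ in range(passes):
        # Move the top two bits of rem_lo into the bottom of rem_hi.
        rem_hi = ((rem_hi << 2) | (rem_lo >> 30)) & 0xFFFFFFFF
        rem_lo = (rem_lo << 2) & 0xFFFFFFFF
        root <<= 1
        test_div = (root << 1) + 1
        if rem_hi >= test_div:
            rem_hi -= test_div
            root += 1
    return root
```

With 16 passes this returns the floor of the square root for any 32-bit unsigned input, so it can be compared against whatever the GPU routine leaves in r0.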

Don't know if this helps since you've already ported the code, but one of the classic ways to determine back faces is simply to check whether the points of your triangle are ordered clockwise or counter-clockwise after projection. You can do that with just a couple of comparisons. :)

Yes, you can work out the normals during initialisation. Then rotate them along with the rest of the vertices at runtime.

As long as the triangles are all wound in the same order you can tell front or back from the direction.
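
A minimal sketch of the winding-order test, in Python for readability. It assumes screen coordinates with y growing downward and front faces wound clockwise on screen; the function names are made up for this example. Zero-area (edge-on) triangles are treated as culled here, which is a choice of the sketch.

```python
def signed_area2(p0, p1, p2):
    """Twice the signed area of a projected triangle (2D cross product of two edges)."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    return (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)


def is_back_facing(p0, p1, p2):
    """True if the triangle should be culled.

    Assumption: y points down the screen, so a positive result from
    signed_area2 means the points run clockwise as seen on screen.
    """
    return signed_area2(p0, p1, p2) <= 0
```

On the GPU this boils down to two multiplies, a subtract and a sign check per triangle.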



Oh right, since the models don't deform I can precalculate the normals and rotate them as part of the transformation program. The triangles are all clockwise but I will need the normals for shading when I get there.
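
Roughly what that could look like, as a Python sketch rather than GPU code. `face_normal` and `rotate` are hypothetical helper names, and the 3x3 row-major matrix layout is an assumption. Which way the normal points depends on the winding order and the handedness of the coordinate system.

```python
def face_normal(v0, v1, v2):
    """Unnormalised face normal: cross product of two triangle edges.

    Meant to run once at initialisation for every face of a rigid model.
    """
    ax, ay, az = (v1[i] - v0[i] for i in range(3))
    bx, by, bz = (v2[i] - v0[i] for i in range(3))
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


def rotate(matrix, v):
    """Apply a 3x3 rotation matrix (tuple of rows) to a vertex or a normal."""
    return tuple(sum(row[i] * v[i] for i in range(3)) for row in matrix)
```

Because the transform is a pure rotation, the same matrix can be used for vertices and normals; translation must not be applied to the normals.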

 

 

 


Cool, thanks. Optimizing for pipelining isn't something I've had to do before.

 

Another night of programming and I got a sweet-looking wireframe cube with no back-facing polygons! Now, uh, I just need to make some models other than cubes...

 

DUpJPxk.gif


As long as you have 2-3 days in a row (like a weekend) where you work 4+ hours, you can get the first version done.

 

One long day is all that is needed, really, for a first working attempt. Then, it's just iteration, like you're doing already.

 

Any specific game/engine you are working towards?


I'm thinking about some kind of space sim, which is something the Jag should be good at.

Yeah, that's an easy genre to pull off on the Jag:

- 1 static 2D scrolling bitmap on the OP (Object Processor)

- 1 static 2D HUD bitmap on the OP

- 3 small 2D targeting reticle bitmaps on the OP

- a couple dozen scanlines for 3D ships, covering at most 10% of the screen (most of the time)

Even an unoptimized first version will easily run at 30 fps, though I honestly see no reason why it should ever drop under 60 fps, considering how little screen space a ship takes up even when it's close to the player. Let alone if you use multithreading on the DSP.


That's all speculation without knowing exactly what he has in mind.


Speculation? What exactly?

Are you saying that the Jaguar can't handle a couple of 2D bitmaps? Or a couple dozen scanlines of triangles?

 

 

EDIT: Oh crap, I took your bait - I now see what you're doing there :) Good on ya, mate :)

 

I do learn from my mistakes fast, though ;)

 

updating the ignore list...


Not at all, of course it can. I'm not saying it can't run what you suggested.

 

But you're guessing at what he might be planning to add in the first place.


i'm sorry, wtf? bait? it's a genuine statement.


For reference, I was thinking something along the lines of TIE Fighter - a bunch of Gouraud-shaded models flying around in a starfield with a 2D overlay UI. I think the Jaguar can easily handle a couple hundred lit and shaded polygons flying around an empty space shooting billboarded sprites at each other.

 

I've got all the Jaguar code archives but I'm not sure where to start looking for implementing flat shading on the GPU/blitter.


That's just VladR. If you don't agree with him 100% of the time, you're baiting him. In 5 years, after he's fucked around producing bits and pieces of FuckAll, he'll come to the very conclusion you told him about in the beginning and he ignored.


TIE Fighter was awesome. Would love something like that on jag :)

 

- Honestly, I would recommend switching to Gouraud only after you've got flat shading working. I'm sure you've noticed by now that one register flag typo is all it takes for the Blitter not to work :)

- So, the fewer features you enable, the less time it takes to debug, which I honestly believe is the great majority of the effort.

- I wouldn't really bother trying to understand Atari's 3D code (I'm not even sure there is a flat shading example - I never bothered to check).

- Just google "triangle scanline traversal".

- You already have code for the Blitter drawing lines at arbitrary angles and lengths, so you can reuse that for scanlines, which are even simpler (as they are horizontal).
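
To give an idea of what "triangle scanline traversal" boils down to, here is a small Python sketch that turns a triangle into horizontal spans. It is only a model of the idea, not Blitter code: the 16.16 fixed-point format, the name `triangle_spans` and the top-inclusive, bottom-exclusive scanline rule are all assumptions of the example. It walks top-down and switches the short edge at the middle vertex, so both halves of the triangle come out of the same loop.

```python
FP_SHIFT = 16  # 16.16 fixed point (assumed format)


def triangle_spans(v0, v1, v2):
    """Yield (y, x_left, x_right) for each scanline of a triangle.

    Vertices are (x, y) integer screen coordinates in any order.
    """
    (x0, y0), (x1, y1), (x2, y2) = sorted((v0, v1, v2), key=lambda v: v[1])
    if y0 == y2:
        return  # zero height, nothing to draw

    def slope(xa, ya, xb, yb):
        # x change per scanline along an edge, in fixed point
        return ((xb - xa) << FP_SHIFT) // (yb - ya)

    long_step = slope(x0, y0, x2, y2)   # edge from top to bottom vertex
    long_x = x0 << FP_SHIFT
    # Two short edges: top -> middle, then middle -> bottom.
    for xa, ya, xb, yb in ((x0, y0, x1, y1), (x1, y1, x2, y2)):
        if ya == yb:
            continue  # flat top or flat bottom: this half is empty
        short_step = slope(xa, ya, xb, yb)
        short_x = xa << FP_SHIFT
        for y in range(ya, yb):
            left, right = sorted((long_x >> FP_SHIFT, short_x >> FP_SHIFT))
            yield y, left, right
            long_x += long_step
            short_x += short_step
```

Each yielded span would then correspond to one horizontal line drawn by the Blitter; clipping and sub-pixel accuracy are left out.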

 

Depending on how much time you have over the next few days, it's up to you where you strike the balance between fully generic code that can handle any order of points and clip them accordingly, and a test triangle with hardcoded values.


Alright then, I'm going to give you the benefit of the doubt that your intentions weren't actually evil.

You did, however, sound awfully similar to the "jag can't do 3D shit" crowd, so bait was the only logical conclusion.

Of course I don't know exactly what he plans to do, but 80% of the performance characteristics of a space sim were covered in my post, as I've played those games myself.

Since he has been working on both the DSP and the GPU from the start, he can push those performance barriers significantly.


Right now it just works from the bottom up incrementing the endpoints with the slope of the edges they follow, until it hits the scanline with the middle vertex and stops. I need to clear out some space in my program to be able to work from the top down and fill in the other half of the triangle.

 

I'm doing each scanline separately, but I think the blitter should be able to process both slopes changing at the same time if I bastardize the increment registers properly? That would let me do one blitter operation per triangle instead of one per scanline.


You won't get a generic triangle in one call. In the past, I've successfully done multiple different polygons in a single call, but you would still need to fill the smaller remainder manually.

It's not a problem, other than that you run out of the 4 KB very fast and the code swapping kills any performance benefit of this approach. I haven't pursued this path further, since I managed to get my scanline rasterizer into a state where I am pretty happy with its performance (and I have plenty of other low-hanging fruit for when I want more performance), so I don't need to chase it like that.

But, of course, it's entirely possible to design a game around that. The floor in Diablo 3 is a perfect example of where you could render a textured polygon via a single blitter call.

Wouldn't work in a 6DOF space game though...


:) It's a start! Running out of space in this GPU program though...

 

 

I did a few tricks to save on space when I was writing my sound engine, granted 8KB is more than 4KB so not as tight :) But might be some avenues to explore.

 

If you have a lot of MOVEI instructions, think about ways to reduce their number: a regular JRISC instruction is 16 bits, but a MOVEI is 48 bits because of its 32-bit immediate. If you are setting up pointers to RAM, it MIGHT be possible to get there with a single MOVEI and then use ADDQ and shifts to add on the offsets. A fair amount of faff, but it can save you a couple of bytes.

The biggest saving for my application came from looking at the data tables it needed and packing them together a bit. In my case I had a lot of 16-bit values, obviously stored as 32-bit words in the CPU RAM. By combining two 16-bit values into a single 32-bit word I freed up a chunk of memory, and accessing a value only needed a little bit of extra logic to shift/mask off the data I wanted.
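
The packing idea, sketched in Python. The helper names and the choice of putting even-numbered entries in the high half of each word are assumptions for illustration, not how my sound engine necessarily lays things out.

```python
def pack_pairs(values):
    """Pack a list of 16-bit values two per 32-bit word (even index in the high half)."""
    values = list(values)
    if len(values) % 2:
        values.append(0)  # pad to an even count
    return [((hi & 0xFFFF) << 16) | (lo & 0xFFFF)
            for hi, lo in zip(values[0::2], values[1::2])]


def read16(table, index):
    """Fetch entry `index` from a packed table with one shift and one mask."""
    word = table[index >> 1]
    shift = 0 if index & 1 else 16
    return (word >> shift) & 0xFFFF
```

The table shrinks to half the number of words, at the cost of a shift and a mask on every read.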

 

I've always wanted to have a crack at 3D on the Jag :) Never got around to it yet. You're certainly rekindling my interest in it with your progress. Keep up the excellent work, always top banana when stuff works.

Another quick thought: as you are rendering filled triangles, could you reclaim some space by removing the code that draws the wireframe? If the triangles are filled, there's no point in drawing the frame as well.


I thought of, and just got around to proving, another possible time saver. Quite possibly a well-known method of saving LUT space, but I am no maths wizard :D

If you are using LUTs for sin and cos type functions, you only actually need one of them (cos is just sin shifted on by 90 degrees). In addition, you only need the first 90 degrees of sin, as the values for angles past 90 are the same ones read backwards and/or negated.
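
Here is the quarter-table trick as a Python sketch. The 256 steps per circle, the 2.14 fixed-point scale and the function names are assumptions of the example; the table holds 65 entries (0 to 90 degrees inclusive) so the mirroring needs no special cases.

```python
import math

STEPS = 256            # angle units per full circle (assumed)
QUARTER = STEPS // 4   # 90 degrees
HALF = STEPS // 2      # 180 degrees
SCALE = 1 << 14        # fixed-point scale for the table values (assumed)

# Only the first quadrant is stored, including the 90 degree endpoint.
QUARTER_SINE = [round(math.sin(2 * math.pi * i / STEPS) * SCALE)
                for i in range(QUARTER + 1)]


def fixed_sin(angle):
    """Sine from the quarter table: mirror for 90..180, negate for 180..360."""
    angle &= STEPS - 1
    negative = angle >= HALF
    if negative:
        angle -= HALF
    if angle > QUARTER:
        angle = HALF - angle   # read the table backwards
    value = QUARTER_SINE[angle]
    return -value if negative else value


def fixed_cos(angle):
    """Cosine is sine shifted on by a quarter turn."""
    return fixed_sin(angle + QUARTER)
```

That is 65 table entries instead of 512 for separate full sin and cos tables, paid for with a few compares and subtracts per lookup.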

 

You probably already knew this, but it's new to me :) thought I'd share :D
