
Raycasting


Asmusr


 

Could the SAMS be used to store code sections that could be dynamically loaded into VDP RAM kind of like overlays?

 

Perhaps, but I'm not out of 32K RAM at all. Any graphics (soft sprites) that don't fit into VDP memory will have to be uploaded from 32K each frame by the main CPU after the GPU has drawn the walls. This is much slower than letting the GPU display them directly from VDP memory. Alternatively they can be pulled in from the F18A flash. The good thing is that since they are soft sprites they don't *have* to be stored [constantly] in VDP memory.


For a true speed boost, the GPU should be able to scale a column of pixels up or down in size (for walls and sprites) and to copy a column of pixels onto another, taking transparent pixels into account (for sprites).

 

Yes I think you're right. That reminds me that I don't have a clear idea about how to add enemy sprites. Would this work?

 

1. The enemies are added to the map as special tiles.

2. When a ray hits an enemy, you add the distance and the tile to a list and then continue until a wall is hit.

3. After drawing the pixel column for the wall, you draw pixel columns for each enemy on the list back to front, taking transparent pixels into account.

 

The enemies would move in large increments because they would only be able to stand on a map tile, but this seems far easier to manage than enemies that can move freely around.
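
A minimal Python sketch of steps 1 to 3, just to make the idea concrete. It assumes a small tile map given as a list of strings that is fully enclosed by walls, and it marches the ray in fixed steps rather than using a proper DDA grid traversal. The tile codes and the names `cast_ray` and `draw_order` are made up for illustration and are not from the actual raycaster.

```python
import math

WALL, EMPTY, ENEMY = "#", ".", "E"  # hypothetical tile codes


def cast_ray(grid, x, y, angle, step=0.05, max_dist=32.0):
    """March a ray through the map until it hits a wall.

    Returns (wall_distance, enemy_hits), where enemy_hits is a list of
    (distance, (col, row)) entries in near-to-far order.
    Assumes the map is enclosed by walls so the ray never leaves the grid.
    """
    dx, dy = math.cos(angle), math.sin(angle)
    hits = []
    seen = set()  # each enemy tile is recorded only once per ray
    dist = 0.0
    while dist < max_dist:
        col, row = int(x + dx * dist), int(y + dy * dist)
        tile = grid[row][col]
        if tile == WALL:
            return dist, hits
        if tile == ENEMY and (col, row) not in seen:
            seen.add((col, row))
            hits.append((dist, (col, row)))
        dist += step
    return max_dist, hits


def draw_order(hits):
    """Enemies are drawn back to front, so reverse the near-to-far list."""
    return list(reversed(hits))
```

For each screen column you would draw the wall strip first and then loop over `draw_order(hits)`, drawing one scaled enemy column per entry.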


Well, we can always make the GPU faster... ;-)

 

Perhaps I have asked you this before, but I don't remember the answer: Are the GPU registers (R0-R15) internal registers or are they just memory words like on a TMS9900? Is there any performance difference between using a register and a memory address in an instruction except perhaps for a longer instruction decoding? Are the registers memory mapped?


The GPU registers are dedicated, and yes they are much faster than accessing memory. The GPU is actually rather inefficient since it was my first attempt at a CPU and I brute-forced a lot of it. If I rewrote it I could probably reduce the cycles per instruction. But the biggest benefit is in being able to make dedicated hardware to support graphic operations. That is what the PIX and DMA were for, but I'm not sure either of those is panning out to be that useful (like the original scheme I had for scrolling, borders, etc.) Since there is not much software (other than a lot of your demos) that uses the PIX or DMA, I have no problem replacing those with hardware that supports more useful operations. There is also a 10-nanosecond-accurate timer in the F18A that I don't think anyone has *ever* used.

 

It takes a while for concepts to sink into my head, so if you and artrag want to explain in painful low-level detail what you need hardware support for, I'll look into what it would take to make it happen.

 

Up for the chopping block if not really necessary:

 

1. second, millisecond, microsecond, nanosecond timer.

2. PIX instruction

3. DMA

4. CPU access to reading registers (takes a lot of FPGA logic to support this)

 

Let me know.


That is what the PIX and DMA were for, but I'm not sure either of those is panning out to be that useful (like the original scheme I had for scrolling, borders, etc.) Since there is not much software (other than a lot of your demos) that uses the PIX or DMA, I have no problem replacing those with hardware that supports more useful operations.

 

TIMXT uses DMA for scrolling operations. Maybe some of the potential new operations could replace its usage, if you put DMA on the chopping block.



I don't really think the GPU needs to be faster; I can just optimize my code. If we say that the standard TI-99/4A has been used up to 90% of its potential so far (on the Rasmus scale ;)), I would say that the TI-99/4A+F18A has only been used up to 30% of its potential, and only by a handful of people.

 

The DMA is fine as it is for copying or writing. I use it in the raycaster for clearing the double buffer.

 

The feature I suggested would be to be able to scale the (single) bitmap layer. All I would need for this project would be to double the height, which would give you square pixels in the fat pixel mode. More generally you could scale by any value in both x and y, and ultimately I think systems like the SNES allowed you to scale the bitmap in 3D so you could produce something like the floor in my Skyway game.
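
To illustrate the simplest case (doubling the height), here is a rough Python sketch. It treats the bitmap as a list of rows and picks the source row for each output scanline at display time, so the stored bitmap is untouched. This is only a model of the idea, not how the F18A renders lines, and `scale_rows` is a made-up name.

```python
def scale_rows(bitmap, y_scale=2):
    """Return the rows as they would be displayed with vertical scaling.

    Each output scanline y shows source row y // y_scale, so the stored
    bitmap keeps its original size and only the display is stretched.
    """
    return [bitmap[y // y_scale] for y in range(len(bitmap) * y_scale)]
```

With `y_scale=2` every stored row is shown on two consecutive scanlines, which is what would give square pixels in the fat pixel mode described above.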

 

The feature Artrag suggested to speed up the rendering of the raycaster would be to copy a one-pixel-wide strip of a bitmap stored in VDP RAM to the visible bitmap in VDP RAM, scaling it in the y direction and taking into account bit mapping and possibly transparency. So unlike the general DMA it would be specific to the bitmap layer and would work on pixels instead of bytes. To generalize, it should support any size at the source, scaling in both x and y at the destination, and would work with both normal and fat bitmap pixels.
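
In software terms the operation would look roughly like the Python sketch below. It assumes bitmaps are plain 2D lists of colour indices (not packed VDP bytes), uses nearest-neighbour sampling, and treats an optional colour index as transparent. The name `blit_column` and its parameters are invented for the example.

```python
def blit_column(src, src_x, dst, dst_x, dst_top, dst_height, transparent=None):
    """Copy one pixel column of src into dst, scaled vertically.

    src, dst    : 2D lists of colour indices, indexed [row][column]
    dst_top     : first destination row (may be off screen)
    dst_height  : height of the column after scaling
    transparent : colour index to skip, or None to copy every pixel
    """
    src_height = len(src)
    dst_rows = len(dst)
    for i in range(dst_height):
        y = dst_top + i
        if y < 0 or y >= dst_rows:
            continue  # clip to the destination bitmap
        src_y = i * src_height // dst_height  # nearest-neighbour sampling
        pixel = src[src_y][src_x]
        if transparent is not None and pixel == transparent:
            continue  # leave the destination pixel as it was
        dst[y][dst_x] = pixel
```

Walls would be drawn with `transparent=None`, sprites with the transparent colour index set, which is why one primitive could cover both cases.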

 

I'm not arguing strongly to implement any of this because I'm happy to work within the current limitations.


@Tim: If you are using the DMA then it will not go away.

 

So the scaling sounds like it would be done at the line rendering level and really has nothing to do with the VRAM. Is that correct?

 

The Artrag feature sounds like a Pixel-DMA, yes? Basically a real blitter.

 

Yes and yes.


Would it be possible to store bitmaps in the flash ROM and copy them directly into the active area? It would solve the RAM problems.

For sprites I followed the same tutorial as you, part III.

http://lodev.org/cgtutor/raycasting3.html

It was a matter of sorting the sprites according to their distance from the screen and plotting them from the farthest to the closest, for each stripe of the screen where they are not occluded.
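
As a rough Python sketch of that ordering: it assumes the wall pass has stored one wall distance per screen column, and that each sprite is described by a distance and the range of screen columns it covers. The dictionary keys and the function name are made up for the example and are not taken from the tutorial.

```python
def visible_sprite_columns(sprites, wall_dist):
    """Return (screen_x, sprite_name) draw commands, farthest sprite first.

    sprites   : list of dicts with hypothetical keys
                'name', 'dist', 'x0' (first column), 'x1' (one past last)
    wall_dist : distance to the wall for every screen column
    A sprite column is emitted only if the sprite is nearer than the wall
    in that column, i.e. it is not occluded.
    """
    commands = []
    for s in sorted(sprites, key=lambda s: s["dist"], reverse=True):
        for x in range(max(s["x0"], 0), min(s["x1"], len(wall_dist))):
            if s["dist"] < wall_dist[x]:
                commands.append((x, s["name"]))
    return commands
```

Because far sprites come first in the list, nearer sprites simply overdraw them in any column where they overlap.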

Edited by artrag

 

Perhaps, but I'm not out of 32K RAM at all. Any graphics (soft sprites) that don't fit into VDP memory will have to be uploaded from 32K each frame by the main CPU after the GPU has drawn the walls. This is much slower than letting the GPU display them directly from VDP memory. Alternatively they can be pulled in from the F18A flash. The good thing is that since they are soft sprites they don't *have* to be stored [constantly] in VDP memory.

Have you seen my RXB Demo of the BSAVE and BLOAD for the SAMS?

What about the SAMS page flipping of those same saved screens?

 


A favourite 4A game of mine is 3D-Maze written by Glen Schworak back in '88.

 

It had no enemies to battle and no weapon overlay...it was simply; here's the maze, find your way out (after finding a key, and with luck, a map).

 

Amazingly fast (within the context of the time) through the use of minimalist graphics, my head spins at the possibilities for a raytraced version someday soon.

 

 

hmmmm...if a section of the megademo resulted in Skyway, could Rasmus' current experiments result in a playable raytraced '3-D Maze' in relatively short order as well? (Kudos to Mr Schworak though, who produced one of the most entertaining and well-conceived early era 3D games for the TI)

 

Question...Are all vintage systems enjoying this wealth of astonishing development?? Because it sure seems we are living in a golden age for TI enthusiasts!


 

It's VDP memory I'm short of - not SAMS or anything else. I'm afraid RXB doesn't always hold the answer :).

Hmm you totally missed the point!

Using the SAMS to fix your issue is the point.

I only showed RXB to explain to you that a solution exists and that it does work.


 

If SAMS could page directly into VDP RAM that would indeed solve my issue.

RAM is always much faster than VDP for doing the same thing.

Make an assembly program that moves two screens onto the display: one saved in VDP and moved to the screen, and another moved from RAM to the screen.

Assembly is much, much faster doing the same thing.

Edited by RXB
