Raycaster


Asmusr


1 hour ago, PeteE said:

Hi Rasmus, thanks for sharing your code for this.  I have made a small performance improvement in the screen drawing code, saving 20 cycles per byte written to the VDP, by removing the column pointer increment and write back, instead using self-modifying code to update the column offset for each row.  The @0(r1) part of the instruction is modified by the "clr @self_modifying_offset" and "inc @self_modifying_offset".

 

Cool. I usually stay far away from self modifying code, but in this case it's worth it.


31 minutes ago, FarmerPotato said:

I think you are all talking about Graphics 1 mode?

In bitmap, I like to arrange the screen table as chars 0:8:16:24 ... 1:9:17:25.

So that I set the VDPWA and stripe down 8 chars in a column, then the next 8, and so on.

Repeat for the other 3rds.

I worked on this when I was trying a rectangle fill.

Right, but updating patterns would be a lot slower. I have thought about using multi-color mode because that can be arranged in true columns and could also support low-res textures, but I think it would be too slow.


1 hour ago, Elia Spallanzani fdt said:

What effect does it have on speed if you limit the test to a certain distance, for example 8 squares?

Where the ray finds nothing, you could insert a column of "fog".

The time it takes to cast a ray is proportional to the max number of steps, but only if the ray doesn't hit anything. Currently the max is 24 steps, so reducing it to 8 would have an effect. But drawing the screen also takes time, so maybe the overall time per frame would be reduced by 10-20%? However, I'm not sure how you would draw fog.
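
To illustrate the point about the step limit, here is a minimal Python sketch (not the actual TMS9900 routine). It assumes 8.8 fixed-point coordinates and a fixed step per iteration; `cast_ray` and the `FOG` marker are hypothetical names made up for this example. A ray that hits a wall early returns early, while a ray that hits nothing always costs the full `max_steps` iterations.

```python
FOG = -1  # hypothetical marker: nothing was hit within range


def cast_ray(grid, x, y, dx, dy, max_steps):
    """Step a ray through grid; x, y, dx, dy are 8.8 fixed-point values.

    Returns (steps, cell) for the first non-zero cell that is hit,
    or (max_steps, FOG) if the ray runs out of steps or leaves the map.
    """
    for step in range(1, max_steps + 1):
        x += dx
        y += dy
        col, row = x >> 8, y >> 8  # integer part selects the map cell
        if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
            break  # left the map: treat it like an empty ray
        if grid[row][col] != 0:
            return step, grid[row][col]
    return max_steps, FOG
```

With a limit of 8, any wall further away than 8 steps comes back as `FOG`, and the renderer could draw whatever stands in for fog in that column.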


What if the color of the columns also depended on the distance? The closest white, then light yellow, dark yellow, shades of green, and finally black.
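
As a rough sketch of that idea in Python (an illustration only, not code from the raycaster): the color numbers are the standard TMS9918A palette indices, while the band layout, the function name `shade_for_distance` and the default of 24 steps are assumptions for this example.

```python
# Standard TMS9918A palette indices.
WHITE, LIGHT_YELLOW, DARK_YELLOW = 15, 11, 10
LIGHT_GREEN, MEDIUM_GREEN, DARK_GREEN, BLACK = 3, 2, 12, 1

# Nearest band first, farthest band last.
SHADES = [WHITE, LIGHT_YELLOW, DARK_YELLOW,
          LIGHT_GREEN, MEDIUM_GREEN, DARK_GREEN, BLACK]


def shade_for_distance(steps, max_steps=24):
    """Map a ray length in steps (1..max_steps) to a palette index."""
    steps = max(1, min(steps, max_steps))  # clamp out-of-range values
    band = (steps - 1) * len(SHADES) // max_steps
    return SHADES[band]
```

On the real machine this would more likely be a small lookup table indexed by the step count, so that no multiplication or division is needed per column.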

 

P.S. 

 

Little game idea: you are in a maze, you have a lantern. You hear strange noises and notice strange smells, and you know THINGS are in the dark, ready to devour you if the light goes out.

You have to find the exit, and quickly, because over time the lantern's oil burns down and the darkness closes in.

In the labyrinth you can find some oil flasks or signs with cryptic clues.

Sometimes in the dark you can see malevolent eyes looking at you.

 

 

The corridors are all the same but in some there are noises or smells that help you distinguish them.

If the light goes out you are dead.


14 hours ago, Elia Spallanzani fdt said:

And what happens if you project only the 16 odd rays and calculate the height of the even columns as an average (e.g. col2 = (col1 + col3) / 2)?

It would make the edges of walls less well-defined. It's already pretty bad with the edges, so I'm working on a second version that will make the edges more well-defined and remove the fish-eye distortion.
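
A small Python sketch shows why averaging blurs the edges. This is only an illustration with made-up wall heights, and `interpolate_columns` is a hypothetical helper, not part of the raycaster.

```python
def interpolate_columns(cast_heights):
    """Interleave the cast column heights with averaged in-between columns."""
    out = []
    for left, right in zip(cast_heights, cast_heights[1:]):
        out.append(left)
        # Parentheses matter: this is the average, not left + right / 2.
        out.append((left + right) // 2)
    out.append(cast_heights[-1])
    return out
```

For cast heights `[40, 40, 10, 10]`, where a near wall ends and a far wall begins, the result is `[40, 40, 40, 25, 10, 10, 10]`. The column of height 25 at the edge matches neither wall, which is the loss of definition described above.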


I have given in and moved to the DDA algorithm. With the simple algorithm I just couldn't figure out an easy way to tell whether I hit an x or a y wall, so I couldn't change wall color depending on the orientation of the wall.

 

 

The approach used here https://lodev.org/cgtutor/raycasting.html is great if you have floating point numbers, but with fixed point numbers I don't like divisions, so I have adjusted it to use pre-calculated tables of sines and cosines and only a few multiplications.

 

The result looks a lot better, with clearly defined corners. I have also implemented fish-eye correction, although it still looks a bit weird when you're parallel to a wall (due to too low precision). The speed is more or less the same as before. An added benefit is that the new algorithm doesn't require the map to have a width of 256, because I can trace the ray using a map address instead of fixed point x and y coordinates.
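
For readers who don't know DDA, here is a floating-point Python sketch of the textbook grid traversal. It is not the fixed-point, table-driven version described above, and the names `dda` and `fisheye_correct` are made up for this example. It assumes a unit-length direction vector and a map fully enclosed by walls. The returned `side` is what tells an x wall from a y wall.

```python
import math


def dda(grid, pos_x, pos_y, dir_x, dir_y, max_steps=24):
    """Walk the grid cell by cell; return (side, ray_dist) or None.

    side is 0 when a grid line in x was crossed last (x wall), 1 for a y wall.
    ray_dist is the distance along the ray, given a unit direction vector.
    """
    map_x, map_y = int(pos_x), int(pos_y)
    # Ray length needed to cross one full cell in x or in y.
    delta_x = abs(1 / dir_x) if dir_x else math.inf
    delta_y = abs(1 / dir_y) if dir_y else math.inf
    if dir_x < 0:
        step_x, side_x = -1, (pos_x - map_x) * delta_x
    else:
        step_x, side_x = 1, (map_x + 1 - pos_x) * delta_x
    if dir_y < 0:
        step_y, side_y = -1, (pos_y - map_y) * delta_y
    else:
        step_y, side_y = 1, (map_y + 1 - pos_y) * delta_y
    for _ in range(max_steps):
        if side_x < side_y:  # the next grid line crossed is vertical
            side_x += delta_x
            map_x += step_x
            side = 0
        else:                # the next grid line crossed is horizontal
            side_y += delta_y
            map_y += step_y
            side = 1
        if grid[map_y][map_x]:
            dist = side_x - delta_x if side == 0 else side_y - delta_y
            return side, dist
    return None


def fisheye_correct(ray_dist, angle_from_view):
    """Project the ray distance onto the view direction."""
    return ray_dist * math.cos(angle_from_view)
```

The two divisions at the top are the part that a fixed-point version would want to replace with table lookups, since they depend only on the ray angle.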

 

I have also implemented a new kind of textures, which means that within a block (the world consists of 256 x 32 blocks) you can choose a different color (column set) for each column with a resolution of 16. It doesn't look great to be honest (resolution is way too low), but it can be used to implement special blocks like doors (there are two on the map). The old way of implementing textures using characters with different patterns still exists, but it's not visible in this demo.

 

The next step is to try to add enemies or other objects. The idea is to save the depth of each column during the raycasting, and then, in a second pass, only draw a column of an object if it is in front of the depth buffer entry. It will require changes to the rendering routine, possibly significant ones such as moving to a buffer in 32K RAM.
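
The depth-buffer idea can be sketched in a few lines of Python. This is a hypothetical illustration of the plan above, not existing code: `visible_sprite_columns`, its parameters and the depth values are all assumptions.

```python
def visible_sprite_columns(depth_buffer, sprite_depth, first_col, width):
    """Return the screen columns where the object is nearer than the wall.

    depth_buffer[col] holds the wall distance saved for that column during
    raycasting; the object spans `width` columns starting at first_col.
    """
    start = max(first_col, 0)
    end = min(first_col + width, len(depth_buffer))  # clip to the screen
    return [col for col in range(start, end)
            if sprite_depth < depth_buffer[col]]
```

For example, with wall depths `[5, 5, 2, 2, 5, 5]` and an object at depth 3 covering columns 1 to 4, only columns 1 and 4 would be drawn; the nearer wall in columns 2 and 3 hides the rest.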

 

raycaster.rpk raycaster8.bin


On 4/29/2020 at 7:33 PM, PeteE said:

Hi Rasmus, thanks for sharing your code for this.  I have made a small performance improvement in the screen drawing code, saving 20 cycles per byte written to the VDP, by removing the column pointer increment and write back, instead using self-modifying code to update the column offset for each row.  The @0(r1) part of the instruction is modified by the "clr @self_modifying_offset" and "inc @self_modifying_offset".

 

I found an even faster way to transfer the screen: if we move the workspace on top of the column pointers, we can transfer a byte and increment the pointer in one instruction:

upload_screen_loop:
       lwpi column_ptrs                ; r0-r15 now overlay column pointers 0-15
       movb *r0+,@vdpwd
       movb *r1+,@vdpwd
       movb *r2+,@vdpwd
       movb *r3+,@vdpwd
       movb *r4+,@vdpwd
       movb *r5+,@vdpwd
       movb *r6+,@vdpwd
       movb *r7+,@vdpwd
       movb *r8+,@vdpwd
       movb *r9+,@vdpwd
       movb *r10+,@vdpwd
       movb *r11+,@vdpwd
       movb *r12+,@vdpwd
       movb *r13+,@vdpwd
       movb *r14+,@vdpwd
       movb *r15+,@vdpwd
       lwpi column_ptrs+32             ; r0-r15 now overlay column pointers 16-31
       movb *r0+,@vdpwd
       movb *r1+,@vdpwd
       movb *r2+,@vdpwd
       movb *r3+,@vdpwd
       movb *r4+,@vdpwd
       movb *r5+,@vdpwd
       movb *r6+,@vdpwd
       movb *r7+,@vdpwd
       movb *r8+,@vdpwd
       movb *r9+,@vdpwd
       movb *r10+,@vdpwd
       movb *r11+,@vdpwd
       movb *r12+,@vdpwd
       movb *r13+,@vdpwd
       movb *r14+,@vdpwd
       movb *r15+,@vdpwd
       lwpi wrksp                      ; Back to the normal workspace
       dec  r3                         ; Count down the remaining rows
       jne  upload_screen_loop         ; Next row
       rt

Even from 8-bit RAM, this is about 1.7 times as fast as the old routine running from scratch pad (36226 vs. 60270 clock cycles).
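
For anyone who doesn't read TMS9900 assembly, the loop above can be modeled in Python roughly like this. It is an illustrative model only: `upload_screen` is a made-up name, and the list `ptrs` plays the role of the 32 registers that sit on top of the column pointers.

```python
def upload_screen(memory, column_ptrs, rows):
    """Emit one byte per column for each row, as the unrolled loop does.

    memory holds the column buffers, column_ptrs the start address of each
    column. Returns the byte stream sent to the VDP and the final pointers.
    """
    ptrs = list(column_ptrs)  # stands in for the registers in the workspace
    vdp = bytearray()
    for _ in range(rows):
        for i in range(len(ptrs)):
            vdp.append(memory[ptrs[i]])  # movb *rI+,@vdpwd: copy the byte...
            ptrs[i] += 1                 # ...and post-increment the pointer
    return bytes(vdp), ptrs
```

Because the registers are ordinary memory on this CPU, the post-increment updates the column pointer in place, which is why no separate increment and write-back is needed.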

 

It could be optimized even further by storing vdpwd in one of the registers, e.g. r15. Then the instructions could be replaced by movb *r0+,*r15, movb *r1+,*r15 and so on, which is faster and takes half the space. Since r15 would then overlap one of the column pointers, we would need to deal with the last column of each workspace separately to prevent its pointer from being overwritten.


23 minutes ago, FarmerPotato said:

Now what happens if you use PAD for WS and 15 unrolled


MOVB *Rx+,*R15

...

RT

 

That would use 18+16+16 words or 100 bytes of PAD. 

I tried the optimization in combination with running from PAD, but it only saved about 5000 clock cycles, so I decided to save the PAD for something else and revert to running from 8-bit RAM as in post #39.


@Asmusr, you probably already thought of this, but using one of the registers in each column-segment workspace for VDPWD would only require three of the registers in a third workspace (R0=col30, R1=col31, R2=VDPWD), with the remaining registers untouched, which could overlap running code with no ill effect.

 

...lee


1 hour ago, Lee Stewart said:

@Asmusr, you probably already thought of this, but using one of the registers in each column-segment workspace for VDPWD would only require three of the registers in a third workspace (R0=col30, R1=col31, R2=VDPWD), with the remaining registers untouched, which could overlap running code with no ill effect.

 

...lee

Yes, I did think of that, but having two holes in the list of column pointers would require some awkward handling in other parts of the code. The code looks like this now, running from 8-bit RAM (34620 cycles):

upload_screen_loop:
       lwpi column_ptrs                ; r0-r15 now overlay column pointers 0-15
       mov  r15,@tmp                   ; Save column pointer 15
       li   r15,vdpwd                  ; r15 = VDP write data address
       movb *r0+,*r15
       movb *r1+,*r15
       movb *r2+,*r15
       movb *r3+,*r15
       movb *r4+,*r15
       movb *r5+,*r15
       movb *r6+,*r15
       movb *r7+,*r15
       movb *r8+,*r15
       movb *r9+,*r15
       movb *r10+,*r15
       movb *r11+,*r15
       movb *r12+,*r15
       movb *r13+,*r15
       movb *r14+,*r15
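       mov  @tmp,r15                   ; Restore column pointer 15
       movb *r15+,@vdpwd               ; Column 15 via symbolic addressing
       lwpi column_ptrs+32             ; r0-r15 now overlay column pointers 16-31
       mov  r15,@tmp                   ; Save column pointer 31
       li   r15,vdpwd                  ; r15 = VDP write data address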
       movb *r0+,*r15
       movb *r1+,*r15
       movb *r2+,*r15
       movb *r3+,*r15
       movb *r4+,*r15
       movb *r5+,*r15
       movb *r6+,*r15
       movb *r7+,*r15
       movb *r8+,*r15
       movb *r9+,*r15
       movb *r10+,*r15
       movb *r11+,*r15
       movb *r12+,*r15
       movb *r13+,*r15
       movb *r14+,*r15
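       mov  @tmp,r15                   ; Restore column pointer 31
       movb *r15+,@vdpwd               ; Column 31 via symbolic addressing
       lwpi wrksp                      ; Back to the normal workspace
       dec  r3                         ; Count down the remaining rows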
       jne  upload_screen_loop         ; Next row

 


5 hours ago, INVISIBLE said:

I have to ask, is this for  experimental purposes to see what could be done or is the intention a completed game?

All I can say is that at the moment I'm interested in developing this project further. But there's currently not much to make a game from, and I don't have any specific idea for a game.

 

