
Project-M 2.0


NRV


I had been looking into ideas for using this engine for a 3D role-playing style game on the Atari 8-bit: something with a first-person perspective, going through a dungeon. I know this probably consumes a lot of RAM. Adding enough material to play a Dungeons & Dragons style adventure may require far more than can fit inside 64K, so it would have to run from a 128K XEGS/Max or an Atari computer with 128K of RAM or more.

Edited by peteym5

  • 3 weeks later...

Great to see this is still alive after all these years. I found it again after checking out an awesome tech demo of Wolfenstein for 7 MHz Amigas that is at a similar sort of stage.

 

Wolfenstein is quite an underwhelming game on the Jaguar; there are already more impressive Wolf-style games running in HAM mode on a £300 Amiga 1200, so it is clearly not exactly pushing the Jag. That HAM-based FPS is also more impressive than Wolf on the £1000 386DX VGA high-street DOS PCs families were buying in the mid-90s.

 

In fact Doom and Wolf are only average games anyway, championed by losers, probably the fathers of the losers who constantly champion Super Mario Bros on the NES as the best game in history (another average game with crap music). "DOSwankers" go on and on about them, but I never played them yearly for decades, and I could name 100+ 8-bit and 16-bit computer games, let alone console games, I wouldn't mind playing right now. I play Lotus II challenge on ST/Amiga all the time and Lotus III on the Amiga 1200. The music in Doom and Wolf is also rubbish (because every bit of music on the PC was crap and inferior to even the 1985 Amiga 1000, before Windows could handle unlimited DAC-based streamed sound from an API).

That was exactly my point. I mean, if it had come out around the same time as the Jag port, there would have been a lot of "wait... if the 8-bit can do this... why is the Jag not so much more awesome?" Actually the most amazing thing I've seen beside Project-M is the 60 fps video playback: http://atariage.com/forums/topic/211689-60-fps-video-using-side-2/?hl=video player


  • 1 year later...

I had some time after the Raymaze thingy and I used it to test some ideas "accumulated" in my head, after all these years.

So be warned, long post ahead, with raycasting ramblings..

 

I had 3 things that I wanted to see working:
- what I call the "line flicker" effect for APAC modes
- an idea for a new (very fast) renderer, called Project X, that uses APAC over a char mode
- and all the logic to have objects working

 

APAC Line Flicker:

 

The line flicker effect is basically having a normal screen in the "even" frames, and setting the screen one scanline down in the "odd" frames.
So for an APAC mode, the GTIA 11 and 9 lines of one frame mix/merge with the GTIA 9 and 11 lines of the next frame.
This is similar to interlacing, but the idea here is to reduce the dark lines produced by the GTIA 11 mode.
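
In C-like pseudo code (just a rough sketch to illustrate the idea, not anything from the engine), the mixing looks like this: every physical scanline alternates between GTIA 11 and GTIA 9 from one frame to the next, and at 50/60 Hz the eye blends the two.

/* Sketch only: which GTIA mode lands on each physical scanline, assuming
   the APAC pattern starts with a GTIA 11 line and the odd frames are
   shifted one scanline down. */
#include <stdio.h>

int main(void)
{
    for (int line = 0; line < 8; line++) {
        int even_mode = (line % 2 == 0) ? 11 : 9;        /* normal frame     */
        int odd_mode  = ((line + 1) % 2 == 0) ? 11 : 9;  /* shifted one down */
        printf("scanline %d: even frame GTIA %2d, odd frame GTIA %2d\n",
               line, even_mode, odd_mode);
    }
    return 0;
}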

 

Here.. a video is better.. (you need to set the quality to 720p50Hz to see it):

 

 

In the middle of the video I activate the "frame blending" option in Altirra, which is why the effect starts looking "better".
From what I understand, some modern TVs do a similar thing.

 

I'm a little undecided about this, but it could be an option. I could probably give a final verdict if I saw it running on real hardware.
I think it would also be a little better in NTSC (30 Hz instead of 25 Hz).


Implementing this with DLIs is trivial, but with IRQs it was kind of a "side quest".
The first problem was that I was using one IRQ every two scanlines to flip between GTIA 11 and 9 on every line.
Why? Because for 32-byte mode lines it is faster to do one medium IRQ every two lines than one small IRQ every line.
Given that I was also going to implement this effect over a char mode, using one IRQ every two lines was "convenient" for skipping the bad lines.

 

But if you want to move your screen one scanline down every "odd" frame, you will need to re-sync the starting point of your IRQs.
That means touching SKCTL (15 kHz clock) or STIMER (1.79 MHz clock) every frame.
And if you also want to play sounds or music, you don't want to write SKCTL or STIMER every frame, because they are going to sound wrong.
I don't know if there is a way around that.. maybe this is a question for phaeron :)

 

In the end I also needed to change the font on every mode line, so with all those requirements the only solution was having one IRQ every line.
Using the IRQ on channel 1, clocked at 1.79 MHz, I can use the remaining 3 channels without any restriction (no need to force the music to 15 kHz).
You can also sync the IRQs to a point before the start of the bad lines, so there is no conflict there (the handler still needs to be short).
And you can move your screen one scanline down every odd frame without touching STIMER, just by moving the scanline of the DLI that initializes the IRQs for that frame.
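
In C-like pseudo code the per-frame part reduces to something like this (a rough sketch only; the helper names and APAC_FIRST_LINE are placeholders for the real DLI/IRQ setup):

/* Sketch of the per-frame logic: only the DLI trigger line moves, and the
   chain of channel-1 timer IRQs is started from that DLI, so SKCTL/STIMER
   are never rewritten and the music on channels 2-4 keeps playing fine. */
#define APAC_FIRST_LINE 32     /* placeholder first APAC scanline */

static int odd_frame = 0;

static void set_dli_scanline(int line) { (void)line; /* move the DLI bit in the display list */ }
static void start_line_irq_chain(void) { /* start POKEY ch.1 timer IRQs at 1.79 MHz */ }

void vblank_handler(void)
{
    odd_frame ^= 1;                                 /* alternate every frame */
    set_dli_scanline(APAC_FIRST_LINE + odd_frame);  /* shift screen one line on odd frames */
}

void dli_handler(void)
{
    start_line_irq_chain();                         /* from here, one IRQ per scanline */
}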

 

 

Project X:

 

The idea behind Project X was having a renderer with a processing cost near zero.. is that even possible, without using all the memory?
Well.. yes. For starters, one screen window uses 64 bytes, compared to the 4K per screen of Project M.
So it is kind of obvious that writing 64 bytes is a loooot faster than writing 4K (and we are talking about the same screen resolution).
I suppose this is a good example of the flexibility of the A8 for generating special "graphic" modes x)

 

The big drawback of this mode is that you can only have very simple walls, so no complex textures.
If you are smart you can have a good number of variations (more with a cartridge), but you still have only a small number of different designs to use.
The good thing is that they can be colorful, you get a nice and free depth-cue effect, and you get back around 16K of RAM (compared to Project M).
Also, most of the processing time can now be dedicated to the "raycasting rays" part, and also.. to object experiments.

 

Project M has a type of renderer that uses a block of around 14K for the scaling code (to scale the textures up or down).
The scaling code points to a fixed 8K area in memory where all the textures reside (and that would be one hell of a use for the banks of a cartridge).
This is basically precompiled code, very fast, but it still needs to fill a 4K area for every logic frame.
On the other hand, Project X is the type of renderer that has all possible wall columns already scaled in memory, so it only needs to write 2 bytes to generate a
final scaled wall column on screen (background included).
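
To make the "2 bytes per column" point concrete, the whole wall pass of Project X is conceptually just a loop of single-byte stores (C-like sketch, placeholder names, not the real 6502 code):

/* Sketch of a Project X style wall pass: 32 columns, two bytes written per
   column, one selecting the pre-scaled char column for the top half of the
   screen and one for the bottom half. */
#define COLUMNS 32

unsigned char screen_top[COLUMNS];     /* shared by the top char lines    */
unsigned char screen_bottom[COLUMNS];  /* shared by the bottom char lines */

void draw_walls(const unsigned char *top_codes, const unsigned char *bottom_codes)
{
    for (int x = 0; x < COLUMNS; x++) {
        screen_top[x]    = top_codes[x];     /* one store draws the whole top half    */
        screen_bottom[x] = bottom_codes[x];  /* one store draws the whole bottom half */
    }
}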

 

For this particular renderer it could be useful to use GTIA 10 instead of 11 to generate the color part of APAC, but I would need to see how that looks, because
of the different offset of the GTIA 10 pixels against the GTIA 9 pixels.

 

[Screenshots: px6.png, px8.png, px11.png]

 

Another advantage of Project X is that it allows the camera/player to be closer to the walls, so it is easier to move through doors.
Also, the extra speed allowed me to increase the visual quality by using more precision in some of the raycasting data.

 


Raycast Optimizations:

 

After this, it was time to optimize the raycasting code.

 

Now that the camera pivot is at the player position, there was a way to speed up camera rotations a lot.
It can have minor visual imperfections, but they are noticeable only if you are looking for them, so I turned it on.
Then.. rotations were running at 54 fps in PAL ... yeah, that's not an error :) (NTSC was still a little slower than 60).
Because my frame rate was never that high and I use double buffering, I never needed a hard screen sync before.
So it could happen that I rendered more than one logical frame per hardware frame.
I added a "soft" kind of screen sync, so PAL doesn't go over 50 (softer than just waiting for a specific VCOUNT value: I wait
for a VCOUNT "zone", so you can start rendering the next frame sooner if the previous one was shorter than average, for example).

 

After that, it was the turn of all the raycasting that is done outside rotations.
I had the idea, long ago, of interpolating most of the raycast info between 2 rays that hit the same wall.
The full idea means doing something like a binary search over the rays, and a general interpolation between any two rays.
But there is a danger that doing all that could end up costing you too much time.
So I decided to do a simplified version that only checks if ray N and ray N+2 hit the same wall, and then sees if it can interpolate
most of the data for ray N+1 (interpolation is also easier this way, with some specifics depending on the type of data you are interpolating).
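
In C-like pseudo code the simplified interpolation looks something like this (rough sketch only; the RayHit structure and helpers are placeholders, not the real data layout):

/* Cast every second ray; the ray in between is interpolated only when both
   neighbours hit the same wall, otherwise it gets a full cast. */
typedef struct {
    int wall_id;      /* which wall the ray hit                  */
    int distance;     /* perpendicular distance used for scaling */
    int column;       /* texture/char column for that ray        */
} RayHit;

extern RayHit cast_ray(int n);             /* full raycast for ray n            */
extern int    texture_column_for(int n);   /* cheap per-ray column computation  */

void cast_all_rays(RayHit *rays, int count)
{
    rays[0] = cast_ray(0);
    for (int n = 0; n + 2 < count; n += 2) {
        rays[n + 2] = cast_ray(n + 2);
        if (rays[n].wall_id == rays[n + 2].wall_id) {
            /* Same wall on both sides: interpolate the middle ray. */
            rays[n + 1].wall_id  = rays[n].wall_id;
            rays[n + 1].distance = (rays[n].distance + rays[n + 2].distance) / 2;
            rays[n + 1].column   = texture_column_for(n + 1);
        } else {
            rays[n + 1] = cast_ray(n + 1);  /* different walls: no shortcut */
        }
    }
    if (count % 2 == 0)                     /* even count: last ray has no pair */
        rays[count - 1] = cast_ray(count - 1);
}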

 

It was another good optimization. I would say on average 8 rays get interpolated (which is a lot faster than doing the full raycast).
So the speedup is similar to the one you get when running Project M in the smaller window (starting the demo with SHIFT pressed).

 

I can easily move these optimizations to Project M (fighting a little with the RAM distribution), so that's low-hanging fruit for the future.

 


Objects:

 

The implementation for this is another old idea. It was kind of surprising that it worked so easily and without major issues x)
Basically, for every active object, I need to get the direction from the player, the distance, and the screen size of the object.
For each of these I have a table that is accessed using the positive deltas between the camera position and the object position.
It is a little more complex than that, because there is also a "scale factor" involved, related to the distance between the camera and the object.

If the object is closer to the player, the tables provide more "resolution" for the data they contain.

 

The direction is transformed into a world angle index, which is later changed to a screen angle index, to see if the object is inside the screen.
Then the distance is used to see if we need to clip some columns of the object against walls that may be between the camera and the object.
Finally, the object should be rendered using the correct sprite frame for an object of that size and orientation.
This is different in the video, because for now I only draw columns of different widths and sizes, and also change the color according to the distance.
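
In C-like pseudo code one object step is roughly this (a sketch only; the table sizes, masks and helper names are placeholders, and the sign/quadrant handling of the deltas is left out):

/* Per-object step: positive deltas index the tables, the direction becomes a
   world angle index, then a screen angle index, and the distance drives the
   clipping and the scale factor. */
#include <stdlib.h>

extern unsigned char  angle_table[32][32];    /* |dy|,|dx| -> world angle index */
extern unsigned short distance_table[256];    /* packed deltas -> distance      */
extern int player_angle;

extern int  world_to_screen_angle(int world_angle, int player_angle);
extern int  angle_on_screen(int screen_angle);
extern void clip_and_draw_object(int screen_angle, int distance);

void process_object(int obj_x, int obj_y, int cam_x, int cam_y)
{
    int dx = abs(obj_x - cam_x);
    int dy = abs(obj_y - cam_y);

    int world_angle  = angle_table[dy & 31][dx & 31];
    int screen_angle = world_to_screen_angle(world_angle, player_angle);
    if (!angle_on_screen(screen_angle))
        return;                               /* object outside the view cone */

    int distance = distance_table[((dy & 15) << 4) | (dx & 15)];
    clip_and_draw_object(screen_angle, distance);  /* clip columns vs walls,  */
                                                   /* then draw scaled frame  */
}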

 

Right now the angle table uses 1K, and I think it would look better with more resolution (which would mean 4K instead).
The distance table uses 256 words (so 512 bytes), and it has 7 bits of precision that I'm not using yet, but it works well enough.
I was using a size table of 256 bytes, but in the end I don't need it, because objects also need the perpendicular distance to the camera
(the same correction as wall columns), so I'm using the same code used for the walls to get this scale factor.

 

In the video I implemented two "objects", one of them moving in a loop. They get activated when they are at a "visible" distance from the player
(like 8 tiles away), and when they are disabled they should not cost much processing time.

Probably two enemies at the same time is a reasonable limit for this engine, but I would have to test it more.

 

Set the quality to 720p50Hz for this one too..

 

 


For the future: (whenever that is..)

 

For Project X, it could be useful to force a maximum frame rate of 25 in PAL (30 or 20 in NTSC), so it is a little more stable.
Right now it can go from 50 to a little below 20 (at very specific points of the maze, looking in specific directions and with 2 objects active..
but maybe I can optimize this worst case), so that variation may bother some people.
In general I would say the average is between 25 and 35 fps, so maybe locking the upper limit could be another option.

Also, for any movement logic it is better to have a stable frame rate, but you can also solve this by moving the logic to an interrupt.

 

The next step would be using better graphics for the objects, and that would require more complex clipping and lots of sprite frames :).
This can be done using P/Ms, or char-based software sprites in Project X (there is space for that), or just software sprites in Project M.
I also need to move the optimizations and the object code to Project M, but maybe it would be more productive to start migrating everything to a cartridge.


Regards!

 


7 hours ago, NRV said:

So be warned, long post ahead

Very much enjoyed it, and no matter how long the hiatus between each update, it's always fun to see what you have come up with. The interlaced APAC looks great. What I love about this project is that it's so impressive I would still think it was a hoax if I had not already run one of the older demos. :)

 


Nice progress in terms of fps. 

 

A 3D shooter has never been this close.

Btw, the scanlines would be less obvious if the gaming screen were bigger.

 

How about a different projection of the graphics? Just a GTIA mode 9 game... no borders, the missiles used to give a different color to a door or something else... one or two enemies, and the "gun" of the protagonist?

If such a thing turned into a game running at 14-16 fps, it would be "king's class"...

 


41 minutes ago, popmilo said:

Am I the only one who wants to see an xex file? ;)

Would you mind sharing details about that 64-byte screen?

 

Awesome tech NRV! Let's see another update in a couple of years ;)

 

 

The latest videos obviously show that mirroring is used. 64 bytes would point to a 32-byte width plus the need for an extra line of different content. Possibly a character mode is used, with char "fills" depending on the angle of the projection. Movement seems to be at byte boundaries. So it figures that the "4*" fps during a rotation fits the timing and the changes shown.

 


Thanks for the feedback, everyone.

 

11 hours ago, ilmenit said:

Did you experiment with look of the enemies at that low horizontal resolution? Would they look acceptable in the distance?

Not yet. I was hoping to have something that looked good with two players, using a GTIA-like resolution (double width and double lines for P/Ms).

What I know is that I would like to have the different sprite zoom levels in memory. I don't think that using a generic scaling routine is going to look good.

The pilot running to your ship in Fractalus came to mind.. I don't know if that is a good target x)

 

10 hours ago, Rybags said:

Re the game engine - I'm still of the opinion it's greatest use might be for an RPG type game.

Wolf3D and Doom games are beyond practical for 8-bitters and realistically serve as a novel demo.

A dungeon crawler, now that could be a classic.

Hmm I think it can do both, with one or two enemies active at the same time, but probably my first experiments will be more "action" oriented.

 

7 hours ago, R0ger said:

Wouldn't it be better to simply use kernel code? How many cycles can you save using those IRQs?

Uff.. I don't think so :)

I'm not saying it cannot be done, but using IRQs is a far easier solution.

Remember that my "logical" frame can take any number of hardware frames (and we are not talking about an "integer" number of them).

If I tried to use something similar to a kernel, I would start the kernel code with a DLI, right at the start of the APAC zone, and return at the end of it.

Doing it like that would mean the PRIOR changes use the least processing time possible, but then you need to do something useful with the rest of the time (while still in the kernel).

And I don't have a task that I can easily interleave with that job. Rendering sounds like a good candidate, but it would still be difficult.

Playing music could fit, but I wouldn't want to do that with RMT.. it would have to be my own code.

 

7 hours ago, emkay said:

The scanlines would be less obvious, if the gaming screen was bigger.

I don't think they would be less "visible", but if you are talking about the "sensation" that a bigger screen produces, sure.

I didn't mention it, but Project X is the perfect candidate to use the full screen.

Adding 8 more columns for raycasting would slow it down, but maybe not that much, and increasing the height is kind of "free" with this renderer.

Except for the cost of the extra IRQs and the display list lines, of course.

 

5 hours ago, CharlieChaplin said:

(Gr. 10 RIP-mode with up to 160x240 resolution)

RIP was the mode that used GTIA 10 and 9? I would like to test it at some point, but I don't think the displacement between the two modes qualifies as a "160"-wide resolution :)

 

3 hours ago, popmilo said:

Am I the only one who wants to see an xex file? ;)

Would you mind sharing details about that 64-byte screen?

 

Awesome tech NRV! Let's see another update in a couple of years ;)

 

A couple of years sounds good?

I didn't feel this test merited an xex file yet.. the same maze.. simple columns as objects.. so let me think about it x)

 

About the mode.. it is pretty simple, really.

Mirroring is not used; the different vertical gradients of color should give that away.

It is like the idea from Irgendwer, but not for LMS lines. Instead it is for columns of chars.

 

It could be done with only 32 bytes. So let me start there..

You have 16 lines of chars and every line has a different font (so I use 16K in fonts).

But every line in the display list points to the same 32 bytes of memory.

So when I put a number in the first byte, I automatically draw that whole column on the screen!

It is like the idea of using different fonts for char-based software sprites, but taken to the extreme x)

 

But as I said, you are very limited in the number of designs you can use for the walls. You need all zoom levels for every column in the fonts.

In this case it would have been useful to have fonts with 256 chars.. but that also would mean needing 32K of memory!

 

Finally, with 64 bytes I can have one version of a column for the top half of the screen and another for the bottom half.

So I have more "design" options for the walls. I can mix and match different top and bottom parts.

 

Also, I'm not using all of the font rows, so there is still space to use char-based software sprites with this.
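
In C-like pseudo code the display list side of the trick looks like this (a toy sketch, not the real code; the per-line charset switch is done in the DLIs/IRQs, and the ANTIC opcode values are from memory):

/* 16 char-mode lines that all LMS the same 32 screen bytes. With a different
   charset per line, a single byte store selects a whole pre-drawn column in
   every row at once. */
#include <stdint.h>

#define MODE4_LMS_DLI 0xC4   /* ANTIC char mode 4 + LMS + DLI (from memory) */
#define JVB           0x41   /* jump and wait for vertical blank            */
#define LINES         16

uint8_t screen_bytes[32];                /* the single row shared by all lines */
uint8_t display_list[3 * LINES + 3];

void build_display_list(void)
{
    int i = 0;
    for (int line = 0; line < LINES; line++) {
        display_list[i++] = MODE4_LMS_DLI;                              /* every line... */
        display_list[i++] = (uint8_t)((uintptr_t)screen_bytes & 0xFF);  /* ...reloads    */
        display_list[i++] = (uint8_t)(((uintptr_t)screen_bytes >> 8) & 0xFF); /* same 32 */
    }
    display_list[i++] = JVB;   /* on real hardware this would point back to the list */
    display_list[i++] = 0;
    display_list[i++] = 0;
}

/* Now a single store changes that char position in all 16 lines at once;
   because each line uses a different font, the net effect is one complete
   pre-scaled wall column:  screen_bytes[x] = wall_column_code;           */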

 


11 hours ago, Irgendwer said:

I would guess that you just LMS the fitting representation to the DL.

Just the principle I described here - which you liked:  :-)

 

 

Yeah, that was the horizontal scanlines solution, perfect for racing games :)


This column-based thing, with repeating char lines and changing charsets going down the screen, seems perfect for a raycaster.
Nice one indeed.
 


14 hours ago, NRV said:

 

I don't think they would be less "visible", but if you are talking about the "sensation" that a bigger screen produces, sure.

I didn't mention it, but Project X is the perfect candidate to use the full screen.

 

Simple as that: if the screen gets bigger, you can sit further away from the display, so the scanlines become less visible ;)

 

14 hours ago, NRV said:

Adding 8 more columns for raycasting would slow it down, but maybe not that much, and increasing the height is kind of "free" with this renderer.

Except for the cost of the extra IRQs and the display list lines, of course.

I'd rather see a working game with 2 or more enemies moving independently on a simple "gr. 9" or even "gr. 10" screen than have that many colors without a game ;)

Also, the missiles could still be used for colored objects.

 

 

14 hours ago, NRV said:

 

About the mode.. it is pretty simple, really.

Mirroring is not used; the different vertical gradients of color should give that away.

It is like the idea from Irgendwer, but not for LMS lines. Instead it is for columns of chars.

 

I hadn't checked your engine before, but it surely points to re-used characters, particularly in the latest video.

Re-using characters is also mirroring ;)

 

 

 


Some clues about the "enemies".

Using the PMg for them might result in weird visuals. 

But there are solutions for shadow "shapes" and "shining through" objects.

Doing "enemies" char based is no problem, as you stated.

So it could be possible to make them look like Wolf 3D, if enough chars were available and the PMg were set in front of those chars to allow more detail. If the enemies were restricted to char-mode movement, it wouldn't be very disturbing (to me at least).


20 hours ago, NRV said:

Finally, with 64 bytes I can have one version of a column for the top half of the screen and another for the bottom half.

So I have more "design" options for the walls. I can mix and match different top and bottom parts.

Also, I'm not using all of the font rows, so there is still space to use char-based software sprites with this.

Did you think about making the edges of walls slanted inside one char column?
Something like 16 kinds of height (4 bits), 1 bit for slope direction and 2 bits for how high it ends on the neighboring character?

16 heights per half screen is like half-char precision... with the slope maybe it would be enough.
Ufff... Don't we hate the 128-char limit of the A8 :)

 


14 hours ago, MrFish said:

Actually, no, it's a limit like any other system has. I imagine it's discussed more often by people who work with the C64, rather than by people who work more often with Ataris.

Sure, it is just one of the limits.

Imho that 1-bit difference can be crucial in some methods. Stuff like "encoding" gfx based on the character code is very powerful.
For example, with 8-bit char codes you can split them nicely into 4+4 bits. How do you split the info evenly in 7 bits?
The first symmetrical division is 3+3 and you're left with 1 extra bit that you don't know what to do with, and 3 bits vs 4 bits is like 8 steps instead of 16 for color or angle or dither or whatever else.

Sometimes more really is better, like the A8's MHz for example :)

 


2 hours ago, popmilo said:

Sure, it is just one of limits.

Imho that 1-bit difference can be crucial in some methods. Stuff like "encoding" gfx based on the character code is very powerful.
For example, with 8-bit char codes you can split them nicely into 4+4 bits. How do you split the info evenly in 7 bits?

 

 

Where is the real benefit, if you do that "addition"?

4 bits = 16 different chars

6 bits = 64 different chars

7 bits = 128 chars

 

4 bits + 4 bits = 16 + 16 = 32

 

Where is the need to split the info "evenly" there, when you have to address the character and the memory position in full bytes?

 

 

2 hours ago, popmilo said:

 


The first symmetrical division is 3+3 and you're left with 1 extra bit that you don't know what to do with, and 3 bits vs 4 bits is like 8 steps instead of 16 for color or angle or dither or whatever else.

Sometimes more really is better, like the A8's MHz for example :)

 

 

The only thing that is missing is some "Sprite Overlay" offering more color and detail.

Reminder: if the Atari had the same sprite options as the C64, even a smoothly running DooM would be possible, without a CPU upgrade.

 

 

 


50 minutes ago, emkay said:

 

Where is the real benefit, if you do that "addition"?

4 bits = 16 different chars

6 bits = 64 different chars

7 bits = 128 chars

 

4 bits + 4 bits = 16 + 16 = 32

 

It's a two-dimensional array: [16][16] = 256.

 

With 7 bits you could do either [8][16] or [16][8]. Slopes and angles, that is.
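
In C terms, just to illustrate the packing arithmetic (nothing from the engine here):

/* Packing a (height, slope) pair into a char code: with 8 bits you get a
   clean 4+4 split (16 x 16 = 256 combinations); with the A8's 128-char
   limit you only have 7 bits, so one field drops to 3 bits (8 x 16 = 128). */
#include <stdio.h>

static unsigned char pack8(unsigned height, unsigned slope)   /* 4 + 4 bits */
{
    return (unsigned char)(((height & 0x0F) << 4) | (slope & 0x0F));
}

static unsigned char pack7(unsigned height, unsigned slope)   /* 4 + 3 bits */
{
    return (unsigned char)(((height & 0x0F) << 3) | (slope & 0x07));
}

int main(void)
{
    printf("8-bit codes: %d combinations\n", 16 * 16);   /* 256 */
    printf("7-bit codes: %d combinations\n", 16 * 8);    /* 128 */
    printf("example pack8(5, 9) = %u\n", pack8(5, 9));
    printf("example pack7(5, 6) = %u\n", pack7(5, 6));
    return 0;
}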

 

