
BallBlazer framerate


VladR

0 is very relative. I am on my third prototype in C, where 0 has a different meaning than in the previous version. It all depends on what makes the most sense for a given approach.

 

I also realized that endless repeating board is better because:

- you get the border color to use

- you can have a much larger level which helps with action games

 

I have just verified on a couple of scanlines that my latest CPU-based method works for both edges. Now I am working on a solution that wraps around and works for all scanlines.

 

I am already pretty sure that the rendering code will be as fast as I posted earlier.

 

The only remaining question is how many cycles the lookup table update code will take. I don't know that just yet, but I'm getting there. I am hoping for 5,000 - 7,000 cycles...

 

I think a combination of checkerboard and 3D flat shading would be a somewhat new look on the Atari 800XL, hopefully?

 

Or are you aware of some C64 games that do that?


What table are you exactly referring to?

To be able to do scrolling without Antic&DLI just on CPU in about 2,000 cycles, I am redrawing only the framebuffer delta : the edges. I posted the render code earlier - it's about 15 cycles per edge, when it's unrolled, so for my 130 edges it should be ~2k.

 

For this, it needs a table that will for each edge store the following 2 values:

  • ByteOffsetFromLeftScreenEdge <0..39> - so that I can do STA sl48_vidPtr,X (X = ByteOffsetFromLeftScreenEdge)
  • EdgeScrollValue <0..3> - so that I can do LDA EdgeSTAValue,Y (Y = EdgeScrollValue)

sl48_vidPtr : address of scanline 48 in framebuffer

EdgeSTAValue: there's only 4+4 combinations of 2 edge colors in a byte - no point in wasting cycles for computing it (8 Bytes is nothing). I have 2 sets : one for color1->color2 transition and another for color2->color1 transition
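
A minimal C sketch of the table-driven edge write described above. The names `EdgeEntry`, `edge_byte`, `build_edge_sta_values` and `draw_edge` are hypothetical, and the pixel layout (4 pixels per byte, 2 bits each, leftmost pixel in the top bits) is an assumption, not something taken from the actual prototype:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_EDGES       130  /* edge count quoted in the post */
#define CYCLES_PER_EDGE 15   /* unrolled cost quoted in the post */
#define BYTES_PER_LINE  40   /* ByteOffsetFromLeftScreenEdge is 0..39 */

/* One entry per edge: the two values the table has to store. */
typedef struct {
    uint8_t byte_offset;   /* 0..39, used like the X index in STA sl48_vidPtr,X */
    uint8_t scroll_value;  /* 0..3, used like the Y index in LDA EdgeSTAValue,Y */
} EdgeEntry;

/* Build one framebuffer byte: pixels left of 'boundary' get left_color,
 * the rest get right_color (assumed 2 bits per pixel, leftmost pixel first). */
static uint8_t edge_byte(uint8_t left_color, uint8_t right_color, int boundary)
{
    assert(boundary >= 0 && boundary <= 3);
    uint8_t b = 0;
    for (int px = 0; px < 4; px++) {
        uint8_t c = (px < boundary) ? left_color : right_color;
        b = (uint8_t)((b << 2) | (c & 3));
    }
    return b;
}

/* Precompute the 4 bytes for one transition direction (e.g. color1->color2). */
static void build_edge_sta_values(uint8_t out[4], uint8_t left, uint8_t right)
{
    for (int s = 0; s < 4; s++)
        out[s] = edge_byte(left, right, s);
}

/* Redraw a single edge: one table lookup and one byte store. */
static void draw_edge(uint8_t *scanline, EdgeEntry e, const uint8_t sta_values[4])
{
    assert(e.byte_offset < BYTES_PER_LINE && e.scroll_value < 4);
    scanline[e.byte_offset] = sta_values[e.scroll_value];
}
```

With the numbers from the post, 130 edges at about 15 cycles each comes to 1,950 cycles, which is where the ~2k estimate comes from. A second 4-byte table with the colors swapped would cover the color2->color1 transition.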

 

Basically, I need to track each scanline's color boundary position in a most efficient way.

 

Of course, if it would -hypothetically- take 20,000 cycles to update then it's no good that I can scroll it on CPU in 2,000 cycles :lol: But, I won't know if I don't try...


VladR, what are the formulas? Does the speed per line depend on the checkerboard angles?

I'm using this:

Speed per line = (XposEnd - XposStart) / TimeInSeconds

 

XposStart: Xpos of the edge before we scroll

XposEnd: Xpos of the edge after we make full scroll of one quad width

 

Since the perspective is fixed, it's just one table of 48 bytes, and the color of the quad doesn't matter.
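
As a rough sketch of how such a table could be filled offline, here is the formula above in C. The function names and the use of `double` are assumptions for illustration; the real table is 48 bytes, so the values would still need to be scaled and quantized in whatever fixed-point format the 6502 code expects:

```c
#include <assert.h>
#include <stddef.h>

#define NUM_LINES 48  /* table size mentioned in the post */

/* Speed per line = (XposEnd - XposStart) / TimeInSeconds */
static double speed_per_line(double xpos_start, double xpos_end,
                             double time_in_seconds)
{
    assert(time_in_seconds > 0.0);
    return (xpos_end - xpos_start) / time_in_seconds;
}

/* Fill one speed value per scanline. xpos_start[i] is the edge position on
 * line i before the scroll, xpos_end[i] the position after scrolling by one
 * full quad width. Both arrays are inputs the caller has to supply. */
static void build_speed_table(double table[NUM_LINES],
                              const double xpos_start[NUM_LINES],
                              const double xpos_end[NUM_LINES],
                              double time_in_seconds)
{
    for (size_t line = 0; line < NUM_LINES; line++)
        table[line] = speed_per_line(xpos_start[line], xpos_end[line],
                                     time_in_seconds);
}
```

Because the perspective is fixed, this only has to run once; the result does not depend on which quad color is being scrolled.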

 

Table here, table there, and I already have 14 tables :lol: Gotta love 6502 !


It would be fun to create real anti-alias pixels for the diagonals instead of the current fake ones.

Actually, it's not an entirely insane idea on 1.79 MHz :)

- We could vastly improve the antialiasing visuals by dropping the field edge, and make it endlessly repeating

- This would give us 2 colors for AA instead of the current 1

- Each edge would then be 2-pixels wide (instead of current 1)

- this would still be free for the case of using Antic's HScroll, as it would be merely baked into the field

 

This, however, would not solve the pixel mess at top 10 scanlines, where 1 pixel doesn't help much. Especially those long -almost horizontal edges- need the same length of AA below and above them.

 

Now, I'm sure an approximation could be computed offline :

- for each frame of scrolling, we would compute real antialiasing

- we would then compute the visual error for each frame against the reference frame without AA (a simple sum of all pixel brightness values)

- and simply pick the frame with the smallest visual error and use that for Antic's HScroll

- Let's not forget we're racing the beam, so we might as well choose slightly different colors for the board, that would blend better with the AA (the result would be blurry, but that's OK in the distance, as it's less visually disturbing than what we have now)
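
The offline selection step could look roughly like this in C. This is only a sketch under assumptions: frames are stored as one brightness byte per pixel, the error metric is the absolute difference of the brightness sums (one reading of "simple sum of all pixel brightness values"), and the names `brightness_sum` and `pick_best_frame` are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of all pixel brightness values in one frame. */
static long brightness_sum(const uint8_t *pixels, size_t count)
{
    long sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += pixels[i];
    return sum;
}

/* 'frames' holds num_frames candidate frames back to back, each
 * pixels_per_frame bytes long. Returns the index of the frame whose
 * brightness sum is closest to that of the reference frame. */
static size_t pick_best_frame(const uint8_t *frames, size_t num_frames,
                              size_t pixels_per_frame,
                              const uint8_t *reference)
{
    assert(num_frames > 0);
    long ref = brightness_sum(reference, pixels_per_frame);
    size_t best = 0;
    long best_err = -1;
    for (size_t f = 0; f < num_frames; f++) {
        const uint8_t *frame = frames + f * pixels_per_frame;
        long err = labs(brightness_sum(frame, pixels_per_frame) - ref);
        if (best_err < 0 || err < best_err) {
            best_err = err;
            best = f;
        }
    }
    return best;
}
```

A per-pixel error (sum of absolute differences) would be a stricter metric than comparing the totals; which one matches the intent here is an open question.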

 

So, at a slight pre-processing cost, we could still enhance the visuals at no run-time cost

 

 

This is, granted, overkill, but -hell- why not :)

 

You have to account for the errors, generated by skewing an already diagonal line, too though...

You know, for one board, a run-time AA, with 2 AA colors (e.g. endless board) might actually be possible at reasonable cost :)

- Not a 3x3 kernel, for sure - but 2x2 just might work.

- I have 130 edges in my current board

- a 2x2 kernel would work on 4 pixels, thus 130*4 = 520 pixels to compute

- now we need a separate image space, we cannot work in the bitmap space, so we need a separate version of the field, just for the AA purposes, so we can directly add and bitshift colors

- you would use X register for index into current scanline, and Y for next scanline, then LDA scanline1,X, INX, CLC, ADC scanline1,X; then CLC ADC scanline2,Y, INY, ADC scanline2,Y; then LSR A, LSR A and DEX, STA scanline1,X

- that's roughly 40 cycles per pixel, or about 20,000 per frame
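
For reference, the 2x2 kernel from the list above can be sketched in C like this. It assumes the separate "image space" stores one small brightness value per byte, small enough that four of them added together still fit in 8 bits (which is what the 6502 ADC chain needs as well). The function name `aa_2x2` and the `width` parameter are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    NUM_EDGES        = 130,                     /* edges in the current board */
    PIXELS_PER_EDGE  = 4,                       /* a 2x2 kernel touches 4 pixels */
    EDGE_PIXELS      = NUM_EDGES * PIXELS_PER_EDGE,
    CYCLES_PER_PIXEL = 40,                      /* rough estimate from the post */
    FRAME_CYCLES     = EDGE_PIXELS * CYCLES_PER_PIXEL
};

/* Average the 2x2 block at column x of two neighbouring scanlines and store
 * the result back into scanline1[x]. Mirrors the LDA/ADC/ADC/ADC, LSR, LSR,
 * STA sequence: add four values, then divide by 4 with two right shifts. */
static void aa_2x2(uint8_t *scanline1, const uint8_t *scanline2,
                   size_t x, size_t width)
{
    assert(x + 1 < width);
    unsigned sum = (unsigned)scanline1[x] + scanline1[x + 1]
                 + scanline2[x] + scanline2[x + 1];
    scanline1[x] = (uint8_t)(sum >> 2);
}
```

With these numbers, 520 pixels at 40 cycles each is 20,800 cycles, in line with the "about 20,000 per frame" estimate.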

 

it could work, actually - but best to leave it up to player as an option


Update:

- I have the differential solution prototyped and working - written in C in target resolution

- I tested it by performing full scroll

- this is the method that does not use Antic's scrolling, so it does not have to kill all CPU performance (by beam racing)

- I only change one byte per each edge

 

I've encountered and fixed quite a few issues resulting from not using the same perspective as BallBlazer has.

Currently I'm working on fixing pixel jitter, as it turns out that if you recompute the line equation at run-time, the scanline width is actually not of constant value, which results in various visual artifacts along the edges.

 

Once the jitter is gone, I'll go port it to 6502 ASM.

 

The greatest visual advantage of this approach is that there are over 100 different frames when strafing, so it's extremely smooth compared to Ball Blazer.


I'm really a fan of any 3D development on the A8.

But the features that put the A8 so far ahead will not do better using 3D calculation.

The 3D experiments were great, too. I really wish these 3D experiments would eventually result in some playable games :D

Edited by emkay

I recently did a remake of the playfield grid just to see it move full screen with 50fps. It's PAL only. Maybe it's useful for your discussions..

That's cool! Are you scrolling by writing to the framebuffer via the 6502, or with the standard HScroll via Antic (beam racing)?

 

I'm really a fan of any 3D development on the A8.

But the features that put the A8 so far ahead will not do better using 3D calculation.

The 3D experiments were great, too. I really wish these 3D experiments would eventually result in some playable games :D

Wait, you already implemented my differential method and know how the final cycle count compares to the Antic HScroll :) ?


Well, almost 40 years of the 8 bit Ataris told their story ;)

You know how many times I've heard the same about the Jaguar? Yet, somehow, I'm extracting fluid framerates in 3D in higher resolutions like 768x200, up to 1,536x200 from that old cat.

 

You never know when you're doing the research, what will be the end result, till the thing actually runs. What's even more surprising is that plenty of those findings are unintended (a mere byproduct of certain parallel codepath).

 

 

And, incidentally, to prove this point about the unexpected, look at EclaireXL. If you raced the beam on that thing for the scrolling, you'd waste the majority of those 28 MHz doing nothing, just waiting for WSYNC.

 

But if I have a CPU-based solution, a lot of performance will be left over, simply due to the nature of the beast: the frequency delta (28 MHz vs 1.79 MHz -> 16:1) is much greater than the resolution delta (4:1), so this approach, by a margin of 4:1, favors the non-HW way of 40 years :)


The main games supporting the Atari have been "3D view" games, be it Star Raiders, Ballblazer, or even Rescue on Fractalus... It's the strength of that system. Built in the 70s, the hardware offers some features far beyond the 80s. Imagine just a better sprite support over that playfield....

I don't even doubt that the Atari could do a Wolfenstein 3D if the hardware is used correctly. But, for whatever reason, Project-M wants to use 256 colors, which does not result in a Wolf 3D.... I'd really prefer a mainly 4-color screen that fluently allows playing through the levels of that game. But the limit of 4 colors in ANTIC Mode D isn't really the restriction. In GPRIOR 0, more than 20 colors per scanline are possible. And since that many colors aren't needed for the whole screen, the basis is given: fast fullscreen 3D animation, plus some needed details with separating colors... But things end where the coder sets the limit. And that's the problem of the past, if you understand ;)

Edited by emkay

That's cool! Are you scrolling by writing to the framebuffer via the 6502, or with the standard HScroll via Antic (beam racing)?

 

Wait, you already implemented my differential method and know how the final cycle count compares to the Antic HScroll :) ?

 

Thanks!

 

There's only horizontal fine scrolling of each line individually, plus color changes for each line during the kernel.

 

Absolutely no changes in the graphic data of the so called 'framebuffer'


Hi bugbiter, did you hand code the bitmap? If so, I applaud your patience.

When I did it I used BASIC.

 

harri1.atr

 

To run it in Altirra. Boot as xl machine (it needs Graphics 15), load "d:jscope2.bas"

On a side note, why do file names have no meaning years later?

Type gr.15:g.1000

That was 40 character mode. But we soon dropped that and went wide screen heheh.

Type g.6000 for that.

So we plotted half the screen, copied it to $4000, then redrew the other half and copied that.

Then dumped it to disk.


Yes I did! Embarrassing, isn't it? I was just too lazy to do a generator - I was afraid the lineup and chopping off the borders would get too complicated, so I just copy-pasted the .byte lines and modified them..

Thanks for the generator, I'll take a look at it. :-)


@Bugbiter

 

what I really like is your "AA"ing of the top lines... when you scroll slowly you see that you do that nice "halftone" AA which avoids the pixel mess far away at the horizon... nice, subtle, but it improves it.

The original Ballblazer engine does that too - If you go from bottom to top each horizontal split position's distance to the horizon is halved with a simple ror instruction.

The halftone line color is then determined by the upper 2 bits of the low byte of the vertical split line position, the 'remainder'
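
Here is a small C sketch of how that could work, as far as it can be read from the description above. It assumes the split position is a 16-bit value with the whole scanline in the high byte and the fractional 'remainder' in the low byte; the function names are hypothetical and this is not taken from the Ballblazer source:

```c
#include <assert.h>
#include <stdint.h>

/* Halve the 16-bit distance to the horizon. On the 6502 this would be a
 * shift of the high byte followed by ROR on the low byte. */
static uint16_t halve_distance(uint16_t distance)
{
    return (uint16_t)(distance >> 1);
}

/* The upper 2 bits of the low byte select one of 4 halftone steps. */
static uint8_t halftone_level(uint16_t distance)
{
    uint8_t level = (uint8_t)((distance & 0xFF) >> 6);
    assert(level < 4);
    return level;
}
```

For example, starting from a distance of $0300 (3 scanlines, no remainder), one halving gives $0180, which is halftone step 2, and a second halving gives $00C0, which is step 3.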

But I changed the colour scheme to a brighter light green and a darker dark green, so there are more brightness steps in between compared to the original. I guess it looks even smoother with that.

Edited by bugbiter

Yeah spotted it lately... very nice. Original B.B. does that?

 

Btw made some improvements in terms of code... ;) ok. We demo coders say 50 fps is 50 fps... but

 

CLC

ROL

ROL

ROL

 

Can be changed to

 

ASL

ASL

ASL

 

;)

 

At least I spotted no difference in my version... need to check with your version.
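
One thing worth checking before swapping these: CLC + 3x ROL and 3x ASL only give the same accumulator value when the top two bits of the input are clear, because ROL rotates bit 7 and bit 6 back in through the carry. A quick C simulation of both sequences (the function names are made up for this sketch):

```c
#include <assert.h>
#include <stdint.h>

/* Simulate CLC followed by three ROL A instructions. */
static uint8_t clc_rol3(uint8_t a)
{
    unsigned carry = 0;                       /* CLC */
    for (int i = 0; i < 3; i++) {
        unsigned next_carry = (a >> 7) & 1u;  /* bit 7 goes to carry */
        a = (uint8_t)((a << 1) | carry);      /* old carry enters bit 0 */
        carry = next_carry;
    }
    return a;
}

/* Simulate three ASL A instructions: bit 0 is always filled with 0. */
static uint8_t asl3(uint8_t a)
{
    return (uint8_t)(a << 3);
}

/* Returns the smallest input for which the two sequences differ, or -1. */
static int first_mismatch(void)
{
    for (int a = 0; a < 256; a++)
        if (clc_rol3((uint8_t)a) != asl3((uint8_t)a))
            return a;
    return -1;
}

/* Sanity check: the sequences agree for every input below $40. */
static void check_small_inputs(void)
{
    for (int a = 0; a < 0x40; a++)
        assert(clc_rol3((uint8_t)a) == asl3((uint8_t)a));
}
```

So if the value being shifted is always below $40, the replacement is safe and saves the CLC; for $40 and above, the low two bits of the result differ. The carry flag after the sequence is the same in both cases (the original bit 5).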

Edited by Heaven/TQA
