
Tempest Xtreem Playable Demo


Kjmann

Well,

here is a Code3 Cruncher packed version of the Tempest V29 release. I think this one can be loaded with most DOS and game-DOS versions, and BASIC will be disabled automatically. However, if it is loaded from DOS, one cannot return to DOS after the program has been unpacked and started (but the program will do a coldstart when Reset is pressed)...

 

Some redundant file segment info: $0400-$042C = BASIC off switch, as published by Bill Wilkinson in Compute! some aeons ago; $0244-$0244 = coldstart flag for the Reset key; $0600-$0668 = simple text title which appears while the game loads; all other segments = Code3 Cruncher packed data + depacking routines... greetings, Andreas Koch.

 

 

Edited by CharlieChaplin

Well, having all the variables in zero page was done almost right away. The routine I came across uses the same zero-page locations as the OS line-drawing routine, which avoids lots of conflicts. One issue I have been looking at is that it uses 2 bytes each to store deltax, deltay, tempx, and tempy, and I wonder how accurate it would be if I found a way to do it with only 8 bits.

As Rybags noted, 16 bits are only needed for GR.8 lines on a normal screen. Also, why do you care about the zero-page locations reserved by the OS? The first thing you do is switch the OS off ;)

 

I have also looked at the idea of separate cases for each direction; I actually think you end up with 8, depending on the slope. I do agree that one stumbling block is calculating the screen address for every pixel.

As I said, you don't need 8 cases; 4 is enough. You assume you only handle right-facing (or top-, left-, or down-facing) lines, and if you detect one facing left, you simply swap the start and end points. An extra step is to optimize horizontal or vertical lines, but that is only useful if you know you have a lot of them.
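
A minimal sketch of that endpoint swap, assuming 8-bit coordinates held in four hypothetical zero-page variables (X1, Y1, X2, Y2 and the routine name are not from the thread):

```asm
X1 = $80                ; hypothetical zero-page variables
Y1 = $81
X2 = $82
Y2 = $83

ORDERPTS
    LDA X2
    CMP X1
    BCS OP_DONE         ; X2 >= X1: the line already runs left to right
    LDX X1              ; swap X1 and X2 (A still holds the old X2)
    STA X1
    STX X2
    LDA Y1              ; swap Y1 and Y2 so each point stays intact
    LDX Y2
    STX Y1
    STA Y2
OP_DONE
    RTS
```

After this, the drawing code only has to tell apart the slope cases for a line that never runs to the left.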

 

Also, you don't have to FULLY compute the address for every pixel. If you follow my suggestion, where the X register holds your current X, you save a lot. It's also not smart to keep a 'currentY' variable and compute the address from it for every pixel. When you detect a line change, you can simply add or subtract 40 to or from a zero-page pointer, which is a bit faster. And if you can store every line on a different page, it's even faster.
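
A minimal sketch of stepping a row pointer by one 40-byte screen line. The pointer name ROWAC is borrowed from the PLOTPIXEL listing later in this thread; its zero-page address and the routine names are assumptions:

```asm
ROWAC = $80             ; hypothetical address: 2-byte zero-page row pointer

NEXTROW                 ; move the pointer down one screen line
    CLC
    LDA ROWAC
    ADC #40
    STA ROWAC
    BCC NR_DONE         ; no carry: high byte unchanged
    INC ROWAC+1
NR_DONE
    RTS

PREVROW                 ; move the pointer up one screen line
    SEC
    LDA ROWAC
    SBC #40
    STA ROWAC
    BCS PR_DONE         ; no borrow: high byte unchanged
    DEC ROWAC+1
PR_DONE
    RTS
```

In a real line loop these would be inlined rather than called, since the JSR/RTS pair alone costs 12 cycles.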

 

If you send me your line routine, I can probably improve it ;)

 

Adding or subtracting the screen width in bytes is something I have thought about. However, storing every line on a different page probably will not work for a game like Tempest, because we already use vast amounts of RAM. There is also an option of using self-modifying code: instead of loading the A register, adding 1 or 255, and storing it, simply use INC or DEC, with the opcode changed depending on the direction. This could be further enhanced with INX or DEX if the X register can be freed up.
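
A minimal sketch of that opcode patch, assuming PIXELCOLUMN lives in zero page so the instruction assembles to the 2-byte INC/DEC form. SETDIR, STEPX, the direction convention and the zero-page address are hypothetical:

```asm
PIXELCOLUMN = $82       ; hypothetical zero-page address
OP_INC_ZP   = $E6       ; opcode of INC zero-page
OP_DEC_ZP   = $C6       ; opcode of DEC zero-page

SETDIR                  ; A = 0: step right, A <> 0: step left
    CMP #0
    BEQ SD_RIGHT
    LDA #OP_DEC_ZP
    BNE SD_STORE        ; always taken, since $C6 is non-zero
SD_RIGHT
    LDA #OP_INC_ZP
SD_STORE
    STA STEPX           ; overwrite the opcode byte of the step instruction
    RTS

STEPX
    INC PIXELCOLUMN     ; executes as DEC PIXELCOLUMN after patching
    ; the rest of the pixel loop would follow here
```

The same idea works with INX ($E8) and DEX ($CA) if the column is kept in the X register.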

 

This is really a side project I am working on and I should place it in a new thread. Just trying to see how fast we can make something go.

 

I will see if I can get the line drawing routine ready to be sent.

Edited by peteym5

I looked at this routine and it's just oh-so-slow :(

I quickly hacked a different version. As usual, we trade space for speed - it requires 2KB of extra arrays. It can use less, but will get slower.

I don't know if Tempest really needs a fast drawto, but perhaps someone else can use it.

Sources: http://homepages.cwi.nl/~marcin/a8/drawto15.asx

Executable: http://homepages.cwi.nl/~marcin/a8/drawto15.xex


I actually did find a few compact ways of increasing speed without using huge tables. What I've been doing for plot pixel is using a different sort of table for mask and color. The tables are only about 20 bytes in size. Of course, we still have the other 192x2 row address table. Oh yes, I successfully tested adding or subtracting 40 to ROWAC instead of having to multiply or look up the row address for each row. I don't see a whole lot of gain beyond what I have done already. It runs in about 35 CPU cycles. I can always do away with the JSR-RTS in the drawto part. The code is something like this:

 

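PLOTPIXEL
    LDA PIXELCOLUMN
    AND #3              ; pixel position within the byte (0-3)
    TAX
    LDA PIXELCOLUMN
    LSR
    LSR                 ; PIXELCOLUMN / 4 = byte offset within the row
    TAY
    LDA (ROWAC),Y       ; ROWAC points to the start of the current row
    AND ANDMASK,X       ; clear the two bits of the target pixel
GETCMASK = *+1          ; address of the ORA operand, patched by GETMASK
    ORA $FFFF,X         ; set the colour bits for this pixel position
    STA (ROWAC),Y
    RTS

GETMASK                 ; point the ORA operand at COLORMASK0 + COLOR*4
    LDA COLOR
    ASL
    ASL
    CLC
    ADC #<COLORMASK0
    STA GETCMASK
    LDA #0
    ADC #>COLORMASK0
    STA GETCMASK+1
    RTS

ANDMASK
    DTA 63,207,243,252
COLORMASK0
    DTA 0,0,0,0
COLORMASK1
    DTA 64,16,4,1
COLORMASK2
    DTA 128,32,8,2
COLORMASK3
    DTA 192,48,12,3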
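
A short usage sketch for the two routines above. SCREEN is a hypothetical label for the first screen row, and COLOR, ROWAC and PIXELCOLUMN are assumed to be defined elsewhere as in the listing:

```asm
    LDA #2
    STA COLOR
    JSR GETMASK         ; needed once per colour change, not once per pixel
    LDA #<SCREEN        ; SCREEN: hypothetical address of the first row
    STA ROWAC
    LDA #>SCREEN
    STA ROWAC+1
    LDA #5              ; column 5 = byte 1 of the row, pixel 1 in that byte
    STA PIXELCOLUMN
    JSR PLOTPIXEL       ; ANDs with 207, then ORs with 32 (COLORMASK2 entry 1)
```

Because GETMASK patches the ORA operand, the per-pixel path never has to look at COLOR at all.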


Thanks, Eru, for sharing the source... it's really readable... ;) But you can't let go of the XASM/QASM idioms like DTA ;) I have been forcing myself for a few months now to use .byte etc. instead... :D

 

just one question as I never understood 100% the MADS macros

 

what are the :1 doing? esp. the .def :1 = * ?

 

 

PIXEL .MACRO
    ldy div4,x
    lda (zer),y
    and mask,x
.def :1 = *
    ora color_bits,x
    sta (zer),y
.ENDM

PREPARE .MACRO
    sta todo
    inc todo
    lsr @
    sta tmp
    lda color
    ora >color_bits
    sta :1+2
.ENDM


These are of course valid optimizations. If it's fast enough, no point in using tables. Btw, 2KB for my standards is far from huge ;)

Another thing I found in your sources is multiplication by 40 for every pixel - please avoid it.

 

I hacked it in like 45-60 minutes last night, what do you expect ;) And i like DTA :)

These are macros. I only used them to make the code shorter, as these pieces repeat roughly for all 4 cases. I truly recommend using macros - these are great. But sometimes dangerous, and MADS handling of the parameters leaves a bit to be desired.

:1 means 'insert here macro parameter number 1'

".def :1 = *" means "define a label with a name defined by the macro parameter number 1 and a value equal to the current address"

Edited by eru
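
To make the macro-parameter explanation above concrete, here is a sketch of what the two macros would expand to if they were invoked as `PIXEL patch1` and `PREPARE patch1`. The label name patch1 is hypothetical and not taken from drawto15.asx; div4, zer, mask, color_bits and color are the names used in the quoted macros and are assumed to be defined in that source:

```asm
; expansion of: PIXEL patch1
    ldy div4,x
    lda (zer),y
    and mask,x
patch1                  ; from ".def :1 = *": label at the address of the ORA
    ora color_bits,x    ; 3 bytes: opcode, address low byte, address high byte
    sta (zer),y

; last three lines of the expansion of: PREPARE patch1
    lda color
    ora >color_bits
    sta patch1+2        ; from "sta :1+2": rewrites the high byte of the ORA operand
```

Read this way, PREPARE appears to pick a per-colour page of the colour table by self-modifying the ORA inside PIXEL, which would fit the 2KB of tables mentioned earlier.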

Thank you for your help. I am looking to keep the code and tables compact, since we might need some spare RAM for wave sounds. Doing some of this research reduced the time Tempest takes to delete and redraw the screen, which is now less than 0.1 seconds. There is much potential in this, and future projects may benefit.

 

I am considering setting something up for APAC screen mode manipulation, with the ability to draw lines in 256 colors. No, it's not for Tempest, but for games that don't need high resolution. On that note, has anyone tried to just make a custom NMI handler instead of DLIs for that? One could make something that won't hog the CPU too badly in that case. Many are stating you can't do much with APAC because of the resources it consumes.


With APAC in standard width, I'm pretty sure you must do it with a single DLI/kernel.

 

However, there are 2 things that can be done on a 64K machine which might free up enough cycles:

- a user vector at $FFFA that redirects to the DLI directly. Code that checks the NMI source could be avoided if a Pokey Timer routine was also used to change the vector to service the VBlank at the appropriate time.

Saving there per DLI (since we can bypass this part of the OS code):

BIT $D40F ; 4 cycles
BPL DOVBLANK ; 2 cycles (failed branch)
JMP ($200) ; 5 cycles

11 cycles saved.

 

- Disable Display List Instruction Fetch on Antic. That would save 1 cycle per scanline, assuming you're doing GTIA modes in Antic F. When DList Instruction fetch is disabled, Antic just updates the memory scan counter and continues displaying the same mode. Of course the problem here is that you have to cater for the memory scan crossing a 4K boundary.
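
A minimal sketch of the first point, assuming an XL/XE machine where clearing bit 0 of PORTB replaces the OS ROM with RAM. INSTALL and MYDLI are hypothetical names, and the VBlank handling via a POKEY timer is left out:

```asm
NMIEN   = $D40E         ; Antic NMI enable register
PORTB   = $D301         ; PIA port B: bit 0 = OS ROM enable on XL/XE
NMIVEC  = $FFFA         ; 6502 NMI vector (RAM once the OS ROM is off)

INSTALL
    SEI                 ; IRQs stay disabled in this sketch
    LDA #0
    STA NMIEN           ; no NMIs while the vector is being changed
    LDA PORTB
    AND #$FE            ; clear bit 0: RAM instead of OS ROM
    STA PORTB
    LDA #<MYDLI
    STA NMIVEC
    LDA #>MYDLI
    STA NMIVEC+1
    LDA #$80            ; enable DLIs only; a VBI would need its own handling
    STA NMIEN
    RTS

MYDLI                   ; entered straight from the vector, no OS dispatch
    RTI
```

The calling code must itself run from RAM and must not call into the OS after this point.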

 

Then there's the usual bag of tricks in the NMI itself like self-modifying code, z-page variables etc.

 

For drawing in APAC, one possible workaround which allows using standard routines is to map the colour and luma lines into separate consecutive 4K bitmaps.

 

e.g. LMS F COLOUR

LMS F LUMA

LMS F COLOUR + 40

LMS F LUMA + 40

etc.

 

Of course doing that, we lose 2 cycles per scanline due to the extra 2 bytes used for an LMS every scanline.

Then, to draw lines, just draw the colour one first, change the screen pointer then draw the LUMA portion.
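
A shortened sketch of such a display list, written with the DTA A(...) word syntax of XASM/MADS. The bitmap addresses are hypothetical, and only the first two row pairs are written out; a full screen would continue the same pattern for 96 pairs:

```asm
COLOUR  = $4000         ; hypothetical address of the colour bitmap
LUMA    = $5000         ; hypothetical address of the luma bitmap

DLIST
    DTA $70,$70,$70         ; 3 x 8 blank scanlines
    DTA $4F,A(COLOUR)       ; LMS + mode F: colour row 0
    DTA $4F,A(LUMA)         ; LMS + mode F: luma row 0
    DTA $4F,A(COLOUR+40)    ; colour row 1
    DTA $4F,A(LUMA+40)      ; luma row 1
    DTA $41,A(DLIST)        ; jump and wait for vertical blank
```

Each bitmap then looks like an ordinary 40-bytes-per-row screen to the line routine, which is what lets the standard drawing code be reused.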

Edited by Rybags

Actually, my idea for getting around the DLIs was to not use the DLI vector (512, $200) and to do a simple toggle of the high bit of PRIOR in the NMI, which only has to save the A register and restore it before the RTI. The VBI will just set the initial value. Since we have 256 colors available, we may not have to use Player/Missile graphics. That would leave more CPU cycles open for the VBI and other interrupts. Of course, an APAC-type game does not have to be high-speed, since I am looking to use it for turn-based strategy or puzzle-type games like sliding-piece puzzles, Tetris, or Columns. You most likely will not have enough CPU cycles for side-scrollers, high-speed shoot-'em-ups, or action games.
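
A minimal sketch of such a handler. PRIOR is write-only, so a shadow copy is kept; the shadow's name and zero-page address are assumptions, and the timing of the write within the scanline is not addressed here:

```asm
PRIOR   = $D01B         ; GTIA priority/mode register (write-only)
PRSHAD  = $80           ; hypothetical zero-page shadow of PRIOR

APACNMI
    PHA                 ; only A is used, so only A is saved
    LDA PRSHAD
    EOR #$80            ; $40 (GTIA mode 9) <-> $C0 (GTIA mode 11)
    STA PRSHAD
    STA PRIOR
    PLA
    RTI
```

The VBI would store the starting value ($40 or $C0) into both PRSHAD and PRIOR at the top of each frame.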

 

I have checked out the disable-fetch thing, but for some reason it looked weird in the emulator. I'll have to try it on a real Atari.


ehm... re: APAC...

 

why do you need to squeeze every cycle out of it? have a look at my demo

 

http://atari.fandal.cz/detail.php?files_id=224

 

there is a complete music player at the beginning, or look at the "manga pic" distorter... so it is not as if you have no CPU cycles left...

 

Jesus, done 1996.... i am getting old...

 

and I remember Fox or Eru using APAC in a Forever-compo intro (the unlimited balls one), and Hiassoft (yes...the one doing the high speed sio) has done a demo called "plasma clouds" in 256 colours...and I used 256 apac in font mode...

 

or am I on a complete wrong track???


titel44.zip

Edited by Heaven/TQA