
WIP (frozen) - 8way platform scroller (MWP technique)


MaPa


Just posting my test of using the MWP technique in a game scenario. It works, more or less; it is not ideally coded and it gave me too many headaches. One file uses a hardware sprite, one a software sprite, and one a software sprite with an indication of how much CPU time it takes to draw that single sprite. The yellow parts are the character definition drawing (4x4 chars); the purple parts are overhead, which takes quite a lot of CPU time due to the MWP video RAM wrap etc. Yes, it was coded with the intention of making a finished game, but the project is now frozen (for almost two years now).

 

post-3960-128947625871_thumb.png post-3960-128947631135_thumb.png

dino.xex

dino_soft.xex

dino_soft_bars.xex


Looks good... though you should know that now you'll be inundated with requests to finish it... or turn it into something else.

I'm not worried; I'm a stoic person, so I'll ignore them successfully :) Or maybe they will push me to do something with it and finish it in some form, which would be good too.

Edited by MaPa

I think MWP is useful for 5200 or 400 games, but I still don't see any advantage in real game situations?

IMHO it's for saving memory: in ANTIC mode 4 you need only about 1 kB for an 8-way scrolling screen instead of 4 kB (AFAIK that is how 8-way scrolling is normally done). But for a saving of just 3 kB it brings other complications etc.

 

 

It looks like MWP is working... The soft sprite routine is rather slow, but I bet there must be a reason :)

 

P.S. I collected all the fruit and nothing happened :) you really should do something about it ;)

 

Hehe, the item collecting was added relatively recently. I must have been bored to add something new to it. The soft sprite routine can surely be made faster, but IMHO not by much unless you use totally unrolled loops with hardcoded data, which IMHO is unrealistic for a good game in 64 kB of RAM. Just copying and masking data into the 16 chars that the sprite occupies takes around 2400 cycles (the sprite is only 26 lines high, so the full 4 chars are not copied), which is around 40 scanlines, and that's around 7 char lines (due to badlines). You also need to adjust the pointer used to read video RAM to get the char under the sprite, get its offset and prepare the pointer to its definition. In MWP, where the screen is nonlinear, it can wrap back to the beginning of the video memory area anywhere, so you need to check for that condition etc., and it all adds up to some "nice" overhead. Pure copying takes 23 cycles per byte (indirect indexed lda, and, ora, sta = 21 cycles without page crossing, + iny = 23 cycles total) * 8 bytes per char = 184 cycles to copy one char. Then comes the overhead, where you have multiple lda, sta, adc, sbc, cmp, bne etc., and you easily have another 50+ cycles per character.
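The arithmetic in this post can be checked with a few lines of Python. This is only a sketch: the cycle counts are the standard 6502 timings for (zp),y addressing without page crossing, and the footprint of 4 bytes by 26 lines is inferred from the figures quoted above. The function names are made up for illustration.

```python
# Standard 6502 cycle counts for (zp),y addressing, no page crossing.
CYCLES = {"lda": 5, "and": 5, "ora": 5, "sta": 6, "iny": 2}

def cycles_per_byte():
    """Cost of one masked byte copy: lda/and/ora/sta plus iny."""
    return sum(CYCLES.values())

def cycles_per_char(bytes_per_char=8):
    """Cost of copying one full 8-byte character definition."""
    return cycles_per_byte() * bytes_per_char

def cycles_for_sprite(width_bytes=4, height_lines=26):
    """Pure copy cost for the assumed footprint (4 bytes wide, 26 lines)."""
    return cycles_per_byte() * width_bytes * height_lines

print(cycles_per_byte(), cycles_per_char(), cycles_for_sprite())
```

The last figure lands close to the "around 2400 cycles" mentioned in the post; the per-character overhead and DMA losses are not modelled here.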

Edited by MaPa

... You also need to adjust the pointer used to read video RAM to get the char under the sprite, get its offset and prepare the pointer to its definition. In MWP, where the screen is nonlinear, it can wrap back to the beginning of the video memory area anywhere, so you need to check for that condition etc., and it all adds up to some "nice" overhead. Pure copying takes 23 cycles per byte (indirect indexed lda, and, ora, sta = 21 cycles without page crossing, + iny = 23 cycles total) * 8 bytes per char = 184 cycles to copy one char. Then comes the overhead, where you have multiple lda, sta, adc, sbc, cmp, bne etc., and you easily have another 50+ cycles per character.

I have done soft sprites on the C64 and never thought about nonlinear screens... But it's the same core principle as yours, I guess.

lda, and, ora, sta (all absolute, x or y indexed).

In my routine I separated it this way:

"restore screen chars under previous sprite position"

"new background to buffer"

"sprite to buffer"

"buffer chars to screen"

 

I have a rough skeleton of a new routine that would combine the second and third steps, assuming random chars under the sprite. Calculations show it would be significantly faster.
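The four steps listed above can be modelled in a few lines of Python. This is a minimal sketch under loud assumptions: a sprite of a single character, a screen stored as a flat list of character codes, a mask where 1 bits keep the background, and invented names (`SPRITE_CHAR`, `draw_frame`). It is not the routine from either post.

```python
SPRITE_CHAR = 255  # assumed character code reserved for the sprite

def draw_frame(screen, charset, saved, pos, sprite, mask):
    """One frame of the four-step soft sprite update (1x1 char sprite)."""
    # 1. restore the screen char under the previous sprite position
    if saved is not None:
        old_pos, old_char = saved
        screen[old_pos] = old_char
    # 2. copy the new background char definition to a buffer
    under = screen[pos]
    buffer = list(charset[under])
    # 3. merge the sprite into the buffer (mask bit 1 = keep background)
    buffer = [(b & m) | s for b, m, s in zip(buffer, mask, sprite)]
    # 4. write the buffer to the sprite char and place that char on screen
    charset[SPRITE_CHAR] = buffer
    screen[pos] = SPRITE_CHAR
    return (pos, under)  # remembered for step 1 of the next frame

screen = [1, 2, 3]
charset = {1: [0xFF] * 8, 2: [0x00] * 8, 3: [0xAA] * 8}
saved = draw_frame(screen, charset, None, 0, [0x18] * 8, [0xC3] * 8)
saved = draw_frame(screen, charset, saved, 2, [0x18] * 8, [0xC3] * 8)
```

Combining steps 2 and 3, as suggested in the post, would correspond to reading `charset[under]` directly inside the merge instead of copying it to a buffer first.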

 

And on the A8 with its faster CPU it should be possible to make it faster...

All this uses lookup tables for shifting and a lot of self-modifying code.

On the C64 it makes sense, because 8 sprites (4x4 chars each) take 128 chars and 128 are left for the background...

On the A8 this is not the case, so I'm thinking of using some "bitmap"-like mode... maybe made of chars to get that one extra color, but who knows...

Are you using preshifted sprite data?


MaPa, you are a seriously talented coder. This demo looks a little like a game on the Amiga several years ago.

 

By the way, what is "MWP" which I keep on seeing on this board?

I don't know if you know about "AtariWiki" but this is great info:

 

http://atariwiki.strotmann.de/wiki/Wiki.jsp?page=Ironman%20Atari#section-Ironman+Atari-MWP


By the way, what is "MWP" which I keep on seeing on this board?

 

MWP stands for Minimum Wrapping Principle or something like that. AFAIK analmux "invented" it some years ago. This scrolling technique allows 8-way scrolling using about one screen's worth of memory. It uses 2 LMS commands in the display list: the first points to the first displayed line (of course), and the second is positioned so that at the end of the memory area the display wraps back to its beginning. While scrolling, the position of the second LMS command varies.
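To make the wrap easier to picture, here is a simplified Python model of where the second LMS would have to go. Everything in it is an assumption made for illustration: the base address, the 1024-byte area, 40 bytes per mode line and 24 lines are not taken from the demo, and the model ignores hardware scrolling fetch widths as well as lines that straddle the end of the area.

```python
# Hypothetical layout; none of these values come from the demo itself.
BASE = 0x4000   # assumed start of the video RAM area
SIZE = 1024     # assumed size of the wrapping area in bytes
WIDTH = 40      # assumed bytes fetched per mode line
ROWS = 24       # assumed number of mode lines on screen

def row_addresses(scroll_offset):
    """Start address of every mode line for a given scroll offset."""
    return [BASE + (scroll_offset + row * WIDTH) % SIZE for row in range(ROWS)]

def second_lms_row(scroll_offset):
    """Index of the first row whose start address wrapped, or None."""
    addrs = row_addresses(scroll_offset)
    for row in range(1, ROWS):
        if addrs[row] < addrs[row - 1]:
            return row  # this line needs an LMS pointing back into the area
    return None

print(second_lms_row(0), second_lms_row(200))
```

As the scroll offset grows, the row returned by `second_lms_row` moves up the screen, which matches the remark that the position of the second LMS varies while scrolling.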


I have done soft sprites on the C64 and never thought about nonlinear screens... But it's the same core principle as yours, I guess.

lda, and, ora, sta (all absolute, x or y indexed).

In my routine I separated it this way:

 

I don't know how it is on the C64, but on the Atari I can't imagine right now how I could use absolute,x or y indexing when 256 bytes are not enough to cover all sprite definitions, character definitions etc., so I would need to self-modify the code and prepare several addresses in several places, or have kilobytes of code for the various situations and combinations. For example, I have something like this (sped up a little by unrolling the loop):

 

        lda ($fa),y     ; load char definition data under soft sprite
        and ($f8),y     ; AND mask
        ora ($fc),y     ; ORA sprite data
        sta ($f6),y     ; save to sprite char definition
        iny
        lda ($fa),y     ; 2. byte
        and ($f8),y
        ora ($fc),y
        sta ($f6),y
        iny
        lda ($fa),y     ; 3. byte
        and ($f8),y
        ora ($fc),y
        sta ($f6),y
        iny
        lda ($fa),y     ; 4. byte
        and ($f8),y
        ora ($fc),y
        sta ($f6),y
        iny
        lda ($fa),y     ; 5. byte
        and ($f8),y
        ora ($fc),y
        sta ($f6),y
        iny
        lda ($fa),y     ; 6. byte
        and ($f8),y
        ora ($fc),y
        sta ($f6),y
        iny
        lda ($fa),y     ; 7. byte
        and ($f8),y
        ora ($fc),y
        sta ($f6),y
        iny
        lda ($fa),y     ; 8. byte
        and ($f8),y
        ora ($fc),y
        sta ($f6),y
        iny

 

If I replaced the indirect (),y by absolute,x, I would need to change each address in 8 places, or not unroll the loop, in which case I would add 3 cycles per byte for the BNE (or whatever) and save 4 cycles by changing (),y into $abs,x or $abs,y. On the other hand, preparing ZP pointers takes fewer cycles than preparing $abs addresses somewhere in RAM. Or am I missing something?
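The trade-off in the paragraph above can be put into numbers. The sketch below uses the standard 6502 timings without page crossing; the three variants and their names are illustrative, and the one-time cost of patching absolute addresses or setting up ZP pointers is deliberately left out because the posts give no figures for it.

```python
# Standard 6502 timings, no page crossing.
INDIRECT_Y = {"lda": 5, "and": 5, "ora": 5, "sta": 6}  # (zp),y
ABSOLUTE_X = {"lda": 4, "and": 4, "ora": 4, "sta": 5}  # abs,x or abs,y
INDEX_STEP = 2     # iny / inx / dey
BRANCH_TAKEN = 3   # bne taken, target in the same page

def unrolled_indirect():
    """Per-byte cost of the unrolled (zp),y code shown above."""
    return sum(INDIRECT_Y.values()) + INDEX_STEP

def looped_absolute():
    """Per-byte cost of an abs,x loop that pays a taken branch per byte."""
    return sum(ABSOLUTE_X.values()) + INDEX_STEP + BRANCH_TAKEN

def unrolled_absolute():
    """Per-byte cost of abs,x with the loop unrolled (more addresses to patch)."""
    return sum(ABSOLUTE_X.values()) + INDEX_STEP

print(unrolled_indirect(), looped_absolute(), unrolled_absolute())
```

Under these assumptions the looped absolute version gains only one cycle per byte over the unrolled indirect one, so whether it pays off depends on the setup cost that this sketch ignores.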

 

Are you using preshifted sprite data?

Yes, I am.

Edited by MaPa

MWP stands for Minimum Wrapping Principle or something like that.

 

(edited one time)

 

Exactly (not "minimal warp principal" :) )

 

But respect ;) for trying to combine software sprites and MWP scrolling. That was never my intention. The SMB3 stuff was supposed to work only with PM graphics, with no software sprites, or at least not in the scrolling zones, only in the Bowser zones. OK, all problems can be solved of course, but you need to keep track of the screen line holding the 2nd copy (2nd LMS). Then, if you're doing software sprites, the first problem will be the shadow copies, which sometimes occur and sometimes don't.

 

To combine scrolling and software sprites, I'd use a different scheme: only vertical wrapping, 24 LMSes, and lines 80 characters long. Then do double buffering only in the horizontal direction. OK, that will need twice as much screen memory.

Edited by analmux

I don't know how it is on the C64, but on the Atari I can't imagine right now how I could use absolute,x or y indexing when 256 bytes are not enough to cover all sprite definitions, character definitions etc., so I would need to self-modify the code and prepare several addresses in several places, or have kilobytes of code for the various situations and combinations. For example, I have something like this (sped up a little by unrolling the loop):

 

...

 

If I replaced the indirect (),y by absolute,x, I would need to change each address in 8 places, or not unroll the loop, in which case I would add 3 cycles per byte for the BNE (or whatever) and save 4 cycles by changing (),y into $abs,x or $abs,y. On the other hand, preparing ZP pointers takes fewer cycles than preparing $abs addresses somewhere in RAM. Or am I missing something?

 

Nothing wrong with your approach :)

 

On the C64 I used 4 vertically stacked chars to get a "column" of 32 bytes.

4 columns are 128 bytes, so any byte of the sprite can be reached with indexed addressing.
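The column layout can be written down as a small index calculation. The sketch assumes that the 16 characters use consecutive character codes arranged column by column, so that their definitions form one contiguous 128-byte block; the post does not state the exact ordering, and the function name is made up.

```python
CHAR_BYTES = 8         # bytes per character definition
CHARS_PER_COLUMN = 4   # chars stacked vertically in one column
COLUMNS = 4            # columns covered by the sprite area

def sprite_byte_offset(column, line):
    """Offset of one byte inside the 128-byte block of char definitions."""
    column_bytes = CHAR_BYTES * CHARS_PER_COLUMN  # 32 bytes per column
    if not (0 <= column < COLUMNS and 0 <= line < column_bytes):
        raise ValueError("outside the 4x4 character area")
    return column * column_bytes + line

print(sprite_byte_offset(0, 0), sprite_byte_offset(1, 0), sprite_byte_offset(3, 31))
```

Because the largest offset is 127, a single 8-bit index register is enough to reach every byte, which is the point made in the post.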

This is how my code for 21-line, 3-byte-wide sprites looks:

;main sprite routine (masked)
;every $1000 operand is a placeholder; the real addresses are set before the main loop

spriteplot_main	ldy #20

spriteplot_01	ldx $1000,y	;x=m0
spriteplot_02	lda $1000,x	;a=shl_mask(m0)
spriteplot_03	and $1000,y	;a=a and ch0
spriteplot_04	ldx $1000,y	;x=s0
spriteplot_05	ora $1000,x	;a=a or shl(s0)
spriteplot_06	sta $1000,y	;left byte

spriteplot_11	ldx $1000,y	;x=m0
spriteplot_12	lda $1000,x	;a=shr_mask(m0)
spriteplot_13	ldx $1000,y	;x=m1
spriteplot_14	ora $1000,x	;a=a or shl_mask(m1)
spriteplot_15	and $1000,y	;a=a and ch1
spriteplot_16	ldx $1000,y	;x=s0
spriteplot_17	ora $1000,x	;a=a or shr(s0)
spriteplot_18	ldx $1000,y	;x=s1
spriteplot_19	ora $1000,x	;a=a or shl(s1)
spriteplot_10	sta $1000,y	;middle byte 1

spriteplot_21	ldx $1000,y	;x=m1
spriteplot_22	lda $1000,x	;a=shr_mask(m1)
spriteplot_23	ldx $1000,y	;x=m2
spriteplot_24	ora $1000,x	;a=a or shl_mask(m2)
spriteplot_25	and $1000,y	;a=a and ch2
spriteplot_26	ldx $1000,y	;x=s1
spriteplot_27	ora $1000,x	;a=a or shr(s1)
spriteplot_28	ldx $1000,y	;x=s2
spriteplot_29	ora $1000,x	;a=a or shl(s2)
spriteplot_20	sta $1000,y	;middle byte 2

spriteplot_31	ldx $1000,y	;x=m2
spriteplot_32	lda $1000,x	;a=shr_mask(m2)
spriteplot_33	and $1000,y	;a=a and ch3
spriteplot_34	ldx $1000,y	;x=s2
spriteplot_35	ora $1000,x	;a=a or shr(s2)
spriteplot_36	sta $1000,y	;right byte

			dey
			bpl spriteplot_01
			rts

 

It does require a lot of addresses to be set before the main loop, but if I remember correctly, it was well worth it.

The saving from absolute vs. indirect addressing over the 21 main loop iterations was good enough to make it faster overall.

I'll have to redo those calculations... it does look fishy...

 

And I wouldn't be able to do the lookup-table masking and shifting without ",x"...
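Table-driven shifting can be illustrated with a short Python model. The table names (`main`, `spill`) are invented here and do not necessarily correspond to the shl/shr tables in the listing above; the sketch only shows the general idea of replacing per-bit shifts with two 256-entry lookups per source byte.

```python
def build_tables(shift):
    """Build two 256-entry tables for a right shift of 0..7 bits."""
    main = [(v >> shift) & 0xFF for v in range(256)]          # bits staying in the byte
    spill = [(v << (8 - shift)) & 0xFF for v in range(256)]   # bits moving to the next byte
    return main, spill

def shift_row(row_bytes, shift):
    """Shift one 3-byte sprite row right by `shift` bits into 4 output bytes."""
    main, spill = build_tables(shift)
    s0, s1, s2 = row_bytes
    return [
        main[s0],               # left byte
        spill[s0] | main[s1],   # middle byte 1
        spill[s1] | main[s2],   # middle byte 2
        spill[s2],              # right byte
    ]

print([hex(b) for b in shift_row([0xFF, 0x81, 0x01], 3)])
```

On the 6502 each table lookup becomes one indexed load with the source byte in X, which is why the ",x" addressing mode is essential for this approach.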

 

I will try to refine this code, combine it with a direct background read in the main loop, and see how it goes.

If it's slow -> out goes the masking.

If it's still slow -> out goes the online shifting, in goes preshifted data :)

If it's still slow -> reduce the vertical resolution.

If it's still slow -> reduce the size of the sprites :)

If it's still slow -> go to the bar and get drunk ;)


This thing is really great.

Hopefully, it will turn into a game sometime?

It already looks very good, and seeing the protagonist, heck, back in the 80s this could have become the "known A8 hero", just like Mario on the Nintendo.

Not sure whether the CPU usage is such a big problem. Reducing the height of the game screen and putting a 32-byte-wide info panel beneath it gives some additional CPU cycles... whatever.

In this state, is all PM free? Placing additional moving elements anywhere on the screen is now easier than multiplexing one player into 2.

What a nice outlook on what's possible.

 

 


This thing is really great.

Hopefully, it will turn into a game sometime?

Because of all the responses here I'm thinking about returning to it and continuing, probably after a complete rewrite.

 

 

It already looks very good, and seeing the protagonist, heck, back in the 80s this could have become the "known A8 hero", just like Mario on the Nintendo.

If I remember right, the protagonist was drawn by PG and Ooz did its animation.

 

 

In this state, is all PM free? Placing additional moving elements anywhere on the screen is now easier than multiplexing one player into 2.

In the file with the soft sprite, yes; but in the other one with hw sprites, the protagonist uses 2 out of 4.

