WIP (frozen) - 8way platform scroller (MWP technique)

MaPa · November 11, 2010

Just posting my test of the MWP technique in a game scenario. It works, more or less; it is not ideally coded, and it gave me plenty of headaches. One file uses a hardware sprite, one a software sprite, and one a software sprite with an indication of how much CPU time it takes to draw that single sprite. Yellow parts are the drawing of character definitions (4x4 chars); purple parts are overhead, which takes quite a lot of CPU time due to the MWP video RAM wrap etc. Yes, it was coded with the intention of making a finished game, but the project is now frozen (and has been for almost two years).

dino.xex

dino_soft.xex

dino_soft_bars.xex

Rybags · November 11, 2010

Looks good... though you should know that now you'll be inundated with requests to finish it... or turn it into something else.

MaPa · November 11, 2010

Looks good... though you should know that now you'll be inundated with requests to finish it... or turn it into something else.

I'm not worried. I'm a stoic person, so I'll either ignore it successfully, or maybe it will push me to do something with it and finish it in some form, which would be good too.

Edited November 11, 2010 by MaPa

mimo · November 11, 2010

Looks very nice

Heaven/TQA · November 11, 2010

I just went yesterday into Boinxx... so I am a stoic person, too...

Heaven/TQA · November 11, 2010

I think MWP is useful for 5200 or 400 games, but I still don't see any advantage in real game situations?

popmilo · November 11, 2010

It looks like mwp is working... Soft sprite routine is rather slow, but I bet there must be a reason

ps. I collected all the fruit and nothing happened; you really should do something about it

MaPa · November 12, 2010

I think MWP is useful for 5200 or 400 games, but I still don't see any advantage in real game situations?

IMHO it's for saving memory: in ANTIC mode 4 you need only about 1 kB for an 8-way scrolling screen instead of 4 kB (AFAIK that is what 8-way scrolling needs when done the normal way). But for a saving of just 3 kB it brings other complications etc.

It looks like mwp is working... Soft sprite routine is rather slow, but I bet there must be a reason

ps. I collected all the fruit and nothing happened; you really should do something about it

Hehe, the item collecting was added relatively recently. I must have been bored to add something new to it. The soft sprite routine can surely be made faster, but IMHO not by much unless you use totally unrolled loops with hardcoded data, which is IMHO unrealistic for a good game in 64 kB of RAM. Just the pure copying and masking of data into the 16 chars that the sprite occupies takes around 2400 cycles (the sprite is only 26 lines high, so not all 4 char rows are copied in full), which is around 40 scanlines, and that is around 7 char lines (due to badlines). You also need to adjust the pointer used to read video RAM to get the char under the sprite, get its offset and prepare a pointer to its definition. In MWP, where the screen is non-linear, it can wrap anywhere back to the beginning of the video memory area, so you need to check for that condition etc., and it all sums up to some "nice" overhead. The pure copying takes 23 cycles per byte (indexed lda, and, ora, sta = 21 cycles without page crossing, + iny = 23 cycles total) * 8 bytes per char = 184 cycles to copy one char. Then comes the overhead, with multiple lda, sta, adc, sbc, cmp, bne etc., and you easily have another 50+ cycles per character.
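
The cycle arithmetic in this post can be checked with a short sketch. The per-instruction counts are the standard 6502 timings for (zp),y addressing without page crossing. The figure of about 60 CPU cycles left per scanline after ANTIC DMA is an assumption, chosen only to show how roughly 2400 cycles could come to about 40 scanlines; all names are illustrative.

```python
# Standard 6502 cycle counts (no page crossing) for one byte of the copy.
CYCLES = {
    "lda (zp),y": 5,
    "and (zp),y": 5,
    "ora (zp),y": 5,
    "sta (zp),y": 6,
    "iny": 2,
}

per_byte = sum(CYCLES.values())   # cycles to copy and mask one byte
per_char = per_byte * 8           # one character definition is 8 bytes

SPRITE_LINES = 26                 # sprite height from the post
SPRITE_COLUMNS = 4                # the sprite spans 4 characters horizontally
copy_total = per_byte * SPRITE_LINES * SPRITE_COLUMNS

# Assumption: roughly 60 CPU cycles remain per scanline after display DMA.
FREE_CYCLES_PER_SCANLINE = 60
scanlines = copy_total / FREE_CYCLES_PER_SCANLINE

print(per_byte, per_char, copy_total, round(scanlines))
```

The total of 2392 cycles matches the "around 2400" quoted above; the per-character overhead of 50+ cycles is not modelled here.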

Edited November 12, 2010 by MaPa

Heaven/TQA · November 12, 2010

Mapa... yeah. that's what I am thinking, too... makes more things complicated for simple 3k.... but for 5200/400 it can be advantage where RAM is tight...

snicklin · November 12, 2010

MaPa, you are a seriously talented coder. This demo looks a little like a game on the Amiga several years ago.

By the way, what is "MWP" which I keep on seeing on this board?

sack-c0s · November 12, 2010

That's looking pretty damn good actually.

I'd like to see it finished as well, but seeing as I'm sitting in the shadow of a massive pile of work I need to finish *yesterday* I can sympathise on the finding time front

popmilo · November 12, 2010

... You also need to adjust the pointer used to read video RAM to get the char under the sprite, get its offset and prepare a pointer to its definition. In MWP, where the screen is non-linear, it can wrap anywhere back to the beginning of the video memory area, so you need to check for that condition etc., and it all sums up to some "nice" overhead. The pure copying takes 23 cycles per byte (indexed lda, and, ora, sta = 21 cycles without page crossing, + iny = 23 cycles total) * 8 bytes per char = 184 cycles to copy one char. Then comes the overhead, with multiple lda, sta, adc, sbc, cmp, bne etc., and you easily have another 50+ cycles per character.

I have done soft sprites on C64, and never thought about nonlinear screens... But, same core principle as yours I guess.

Lda, and, ora, sta (all absolute, x or y indexed).

in my routine I separated it in this way:

"restore screen chars under previous sprite position"

"new background to buffer"

"sprite to buffer"

"buffer chars to screen"

I have a rough skeleton of a new routine that would combine the second and third steps, assuming random chars under the sprite. Calculations show it would be significantly faster.

And on A8 with its faster cpu it should be possible to make it faster ...

All this with using lookup tables for shifting and a lot of self modifying code.

On c64 it makes sense because 8 sprites (4x4 size) take 128 chars and 128 is left for background...

On A8 this is not the case, so I'm thinking to use some "bitmap" like mode... maybe made of chars to get that one color more but who knows...

Are you using preshifted sprite data?

popmilo · November 12, 2010

MaPa, you are a seriously talented coder. This demo looks a little like a game on the Amiga several years ago.

By the way, what is "MWP" which I keep on seeing on this board?

I don't know if you know about "AtariWiki" but this is great info:

http://atariwiki.strotmann.de/wiki/Wiki.jsp?page=Ironman%20Atari#section-Ironman+Atari-MWP

MaPa · November 12, 2010

By the way, what is "MWP" which I keep on seeing on this board?

MWP stands for Minimum Wrapping Principle or something like that. AFAIK analmux "invented" it some years ago. This scrolling technique allows 8-way scrolling and uses about one screen's worth of memory. It uses 2 LMS commands in the display list: the first points to the first displayed line (of course), and the second is positioned so that, at the end of the memory area, the display wraps back to its beginning. The position of the second LMS command varies as you scroll.
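
The placement of the two LMS commands can be sketched as an address calculation. This is only a model of the description above, not the code from the demo: the function name, the parameters and the buffer layout are assumptions made for illustration.

```python
def mwp_lms(base, size, width, rows, start):
    """Return (first_lms_addr, wrap_line, second_lms_addr).

    base/size describe the circular screen buffer, width is the number of
    bytes per display line, rows the number of displayed lines, and start
    the scroll offset (0 <= start < size) of the top-left character.
    wrap_line is the display line that carries the second LMS, or None
    when the visible window never reaches the end of the buffer.
    """
    first = base + start
    for line in range(1, rows):
        addr = first + line * width
        if addr >= base + size:
            # This line would start past the end: wrap it to the beginning.
            return first, line, addr - size
    return first, None, None


# Hypothetical 40x24 screen in a 960-byte buffer at $4000, scrolled by 100 bytes.
print(mwp_lms(0x4000, 960, 40, 24, 100))
```

In this example the second LMS lands on display line 22 and points 20 bytes into the buffer. Line 21 starts 20 bytes before the end of the buffer and would read 20 bytes past it, which is presumably where the "shadow copies" mentioned later in the thread come in.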

Heaven/TQA · November 12, 2010

MWP somehow reminds me of Sync Scrolling on the ST... weird cycle-exact Shifter manipulation at the top of the screen and a lookup table...

MaPa · November 12, 2010

I have done soft sprites on C64, and never thought about nonlinear screens... But, same core principle as yours I guess.

Lda, and, ora, sta (all absolute, x or y indexed).

in my routine I separated it in this way:

I don't know how it is on the C64, but on the Atari I can't imagine how I could use absolute,x or y indexing when 256 bytes are not enough to cover all the sprite definitions, character definitions etc., so I would need to self-modify code and prepare several addresses in several places, or have kilobytes of code for the various situations and combinations. For example, I have something like this (sped up a little by unrolling the loop):

	lda ($fa),y	; load char definition data under soft sprite
	and ($f8),y	; AND mask
	ora ($fc),y	; ORA sprite data
	sta ($f6),y	; save to sprite char definition
	iny
	lda ($fa),y	; 2. byte
	and ($f8),y
	ora ($fc),y
	sta ($f6),y
	iny
	lda ($fa),y	; 3. byte
	and ($f8),y
	ora ($fc),y
	sta ($f6),y
	iny
	lda ($fa),y	; 4. byte
	and ($f8),y
	ora ($fc),y
	sta ($f6),y
	iny
	lda ($fa),y	; 5. byte
	and ($f8),y
	ora ($fc),y
	sta ($f6),y
	iny
	lda ($fa),y	; 6. byte
	and ($f8),y
	ora ($fc),y
	sta ($f6),y
	iny
	lda ($fa),y	; 7. byte
	and ($f8),y
	ora ($fc),y
	sta ($f6),y
	iny
	lda ($fa),y	; 8. byte
	and ($f8),y
	ora ($fc),y
	sta ($f6),y
	iny

If I replaced the indexed (),y with absolute,x, I would either need to change 1 address in 8 places, or not unroll the loop, which adds 3 cycles per byte for the BNE (or whatever) and saves 4 cycles per byte by changing (),y into $abs,x and y. On the other hand, preparing ZP pointers takes fewer cycles than preparing $abs addresses somewhere in RAM. Or am I missing something?
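
The trade-off described here can be put into numbers with a small sketch. The cycle counts are the standard 6502 timings without page crossing; the cost of preparing zero-page pointers or patching absolute addresses is deliberately left out, and the variable names are illustrative.

```python
# Standard 6502 cycle counts, no page crossing.
ZP_Y = {"lda": 5, "and": 5, "ora": 5, "sta": 6}    # (zp),y addressing
ABS_X = {"lda": 4, "and": 4, "ora": 4, "sta": 5}   # abs,x or abs,y addressing
INDEX_STEP = 2    # iny / inx / dey
BNE_TAKEN = 3     # branch back to the top of the loop

unrolled_indirect = sum(ZP_Y.values()) + INDEX_STEP               # as posted
looped_absolute = sum(ABS_X.values()) + INDEX_STEP + BNE_TAKEN    # not unrolled
unrolled_absolute = sum(ABS_X.values()) + INDEX_STEP              # 8 copies to patch

print(unrolled_indirect, looped_absolute, unrolled_absolute)
```

Per byte, the looped absolute version comes out only 1 cycle ahead of the unrolled indirect one, so whether it pays off depends on the setup cost that this sketch does not model.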

Are you using preshifted sprite data?

Yes, I am.

Edited November 12, 2010 by MaPa

snicklin · November 12, 2010

Thanks PopMilo and MaPa, I can see how this works now. Hmm, it's a powerful technique to say the least.

I guess though that it starts to get very complicated when you're using tiles, that'll be another "layer" on top of your calculations.

Edited November 12, 2010 by snicklin

analmux · November 12, 2010

MWP stands for Minimum Wrapping Principle or something like that.

(edited one time)

Exactly (not "minimal warp principal")

But respect for trying to combine s.w. sprites and MWP scrolling. That was never my intention. The SMB3 stuff was supposed to work only with PM gfx, no software sprites, or at least not in the scrolling zones, only in Bowser zones. OK, all problems can be solved of course, but you need to keep track of the 2nd copy (2nd LMS) screen line. Then, if you're doing s.w. sprites, the first problem will be the shadow copies, sometimes occurring, sometimes not.

For combining scrolling and s.w. sprites, I'd use a different scheme. Use only vertical wrapping and 24 LMSes, with lines 80 characters long. Then do double buffering only in the horizontal direction. OK, then it will need twice as much screen memory.
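
The alternative scheme can be sketched as a per-line address table. This is an interpretation of the description above rather than code from any project: the function, the base address in the usage line and the way the horizontal offset is applied are assumptions.

```python
ROWS = 24          # one LMS per displayed line
LINE_BYTES = 80    # each buffer line is 80 characters long


def lms_addresses(base, top_row, x_offset):
    """Return the 24 LMS addresses for one frame.

    top_row is the buffer row shown at the top of the screen; rows wrap
    vertically, so row 23 is followed by row 0. x_offset is the horizontal
    character offset into each 80-byte line.
    """
    return [base + ((top_row + i) % ROWS) * LINE_BYTES + x_offset
            for i in range(ROWS)]


# Hypothetical buffer at $4000 with buffer row 23 at the top of the screen.
addrs = lms_addresses(0x4000, 23, 5)
print(hex(addrs[0]), hex(addrs[1]), ROWS * LINE_BYTES)
```

With 24 lines of 80 bytes the buffer takes 1920 bytes, twice the 960 bytes of a single 40x24 screen, which is the extra memory cost mentioned above.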

Edited November 12, 2010 by analmux

Heaven/TQA · November 12, 2010

Analmux... but then where is the advantage in the updated MWP method? I am still a fan of the NES/GB method.

popmilo · November 12, 2010

I don't know how it is on the C64, but on the Atari I can't imagine how I could use absolute,x or y indexing when 256 bytes are not enough to cover all the sprite definitions, character definitions etc., so I would need to self-modify code and prepare several addresses in several places, or have kilobytes of code for the various situations and combinations. For example, I have something like this (sped up a little by unrolling the loop):

...

If I replaced the indexed (),y with absolute,x, I would either need to change 1 address in 8 places, or not unroll the loop, which adds 3 cycles per byte for the BNE (or whatever) and saves 4 cycles per byte by changing (),y into $abs,x and y. On the other hand, preparing ZP pointers takes fewer cycles than preparing $abs addresses somewhere in RAM. Or am I missing something?

Nothing wrong with your approach

On C64 I used 4 vertically placed chars to get a "column" of 32 bytes.

4 columns are 128 bytes. So any byte of sprite can be reached with indexed addressing.

This is how my code for 21 line, 3 byte wide sprites looks like:

;main sprite routine (masked)
;($1000 operands are placeholders, patched before the loop runs)

spriteplot_main	ldy #20

spriteplot_01	ldx $1000,y	;x=m0
spriteplot_02	lda $1000,x	;a=shl_mask(m0)
spriteplot_03	and $1000,y	;a=a and ch0
spriteplot_04	ldx $1000,y	;x=s0
spriteplot_05	ora $1000,x	;a=a or shl(s0)
spriteplot_06	sta $1000,y	;left byte

spriteplot_11	ldx $1000,y	;x=m0
spriteplot_12	lda $1000,x	;a=shr_mask(m0)
spriteplot_13	ldx $1000,y	;x=m1
spriteplot_14	ora $1000,x	;a=a or shl_mask(m1)
spriteplot_15	and $1000,y	;a=a and ch1
spriteplot_16	ldx $1000,y	;x=s0
spriteplot_17	ora $1000,x	;a=a or shr(s0)
spriteplot_18	ldx $1000,y	;x=s1
spriteplot_19	ora $1000,x	;a=a or shl(s1)
spriteplot_10	sta $1000,y	;middle byte 1

spriteplot_21	ldx $1000,y	;x=m1
spriteplot_22	lda $1000,x	;a=shr_mask(m1)
spriteplot_23	ldx $1000,y	;x=m2
spriteplot_24	ora $1000,x	;a=a or shl_mask(m2)
spriteplot_25	and $1000,y	;a=a and ch2
spriteplot_26	ldx $1000,y	;x=s1
spriteplot_27	ora $1000,x	;a=a or shr(s1)
spriteplot_28	ldx $1000,y	;x=s2
spriteplot_29	ora $1000,x	;a=a or shl(s2)
spriteplot_20	sta $1000,y	;middle byte 2

spriteplot_31	ldx $1000,y	;x=m2
spriteplot_32	lda $1000,x	;a=shr_mask(m2)
spriteplot_33	and $1000,y	;a=a and ch3
spriteplot_34	ldx $1000,y	;x=s2
spriteplot_35	ora $1000,x	;a=a or shr(s2)
spriteplot_36	sta $1000,y	;right byte

		dey
		bpl spriteplot_01
		rts

It does require a lot of addresses to be set before the main loop, but if I remember correctly, it was well worth it.

The saving from absolute vs indirect addressing over the 21 iterations of the main loop was good enough to make it faster overall.

Will have to do those calculations again.. it does look fishy...

And I wouldn't be able to do the lookup table masking and shifting without ",x" ...
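
The lookup-table shifting mentioned here can be sketched as follows. The table names `stay` and `spill` are made up for this sketch and do not follow the shl/shr naming in the listing above; the sketch assumes a 3-byte sprite row shifted right by 0 to 7 bits into 4 output bytes, with the masking step left out.

```python
def build_shift_tables(n):
    """Build two 256-entry tables for a right shift of n bits (0..7).

    stay[b] holds the bits of b that remain in the same screen byte,
    spill[b] holds the bits that move into the next byte to the right.
    """
    stay = [b >> n for b in range(256)]
    spill = [(b << (8 - n)) & 0xFF for b in range(256)]
    return stay, spill


def shift_row(row, n):
    """Shift a 3-byte sprite row right by n bits, giving 4 output bytes."""
    stay, spill = build_shift_tables(n)
    s0, s1, s2 = row
    return [stay[s0],
            spill[s0] | stay[s1],
            spill[s1] | stay[s2],
            spill[s2]]


print([hex(b) for b in shift_row([0xFF, 0x00, 0x81], 4)])
```

On the 6502 each table lookup corresponds to an `ldx` of the source byte followed by an `lda` or `ora` from the table indexed by x, which is why the ",x" addressing is needed.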

I will try to refine this code and combine it with direct background read in main loop and see how it goes.

If it's slow -> out goes the masking.

If it's still slow -> out goes shifting online - in goes preshifted data

If it's still slow -> reduce vertical resolution.

If it's still slow -> reduce size of sprites

If it's still slow -> go to the bar and get drunk

emkay · November 13, 2010

This thing is really great.

Hopefully, it will turn into a game sometime?

It already looks very good, and seeing the protagonist, heck, this back in the 80s, could have become the "known A8 hero" , just like Mario on the Nintendo.

Not sure whether the CPU usage is such a big problem. Reducing the height of the game screen and putting a 32-byte-wide info panel beneath it would free up some additional CPU cycles. .... whatever

In this state, all PM is free? Placing additional moving elements anywhere on the screen now gets easier than multiplexing one Player into 2.

What a nice perspective of the possible

MaPa · November 13, 2010

This thing is really great.

Hopefully, it will turn into a game sometime?

Because of all the responses here, I'm thinking about returning to it and continuing, probably after a complete rewrite.

It already looks very good, and seeing the protagonist, heck, this back in the 80s, could have become the "known A8 hero" , just like Mario on the Nintendo.

If I remember right, the protagonist was drawn by PG and Ooz did the animation.

In this state, all PM is free? Placing additional moving elements anywhere on the screen now gets easier than multiplexing one Player into 2.

In the file with soft sprite yes, but in the other with hw sprites, the protagonist uses 2 out of 4.

Heaven/TQA · November 13, 2010

PG "loves" doing animation... he is more into static gfx...
