Jump to content
IGNORED

Sorting numbers - quickest way?


johnnywc

Recommended Posts

Hello all,

 

I am currently working on a kernel that allows up to 8 independant objects on the screen at once using intelligent flicker (ie. shows any two sprites in one vertical band with no flicker, 3 sprites with 2 flickering at 30hz, one doesn't flicker, 4 sprites all flickering at 30hz, etc.). The pre-calculation needs the Y values of the objects I'm displaying to be sorted in descending order (so that last scan lines can be compared to for determining when to flicker, etc and that reposition lines can be computed).

 

Anyway, I wrote this sort routine that works but is extremely slow (it takes 19 scan lines to finish!). Is there a tried-and-true method for sorting 8 numbers in the fastest time possible? My method uses a "bucket sort" that scans the numbers looking for the largest one, marks it as "processed" and repeats the loop (8 times):

 

Here is the code:

 

SortObjects

; logic:
; step through each object and find the largest y value
; store the index in the next position
; note: skip values that have already been processed

; mark all as not sorted
ldy  #MAX_OBJECTS-1
lda  #0
.initSorted
sta  O_Sorted,Y
dey
bpl  .initSorted

ldx  #MAX_OBJECTS-1; position to compute
.sortObj
ldy  #MAX_OBJECTS-1
lda  #0
sta  temp
.findMaxObj
lda  O_Sorted,Y
bne  .skipObj

; compare the object with the max value	
lda  O_Y,Y
cmp  temp

bmi  .skipObj

; object is greater; save
sty  index
sta  temp

.skipObj
dey
bpl  .findMaxObj

; store the result
ldy  index
sty  O_Sort,X

; mark this object as processed
lda  #1
sta  O_Sorted,Y

dex  
bpl  .sortObj	

RTS

 

Note that I'm storing the indexes of the objects in a separate array called O_Sort (O_Sorted is a boolean array that stored whether the object has been processed).

 

Of course, my assumption is that I need to process objects in a sorted order to implement intelligent flicker. Any ideas are welcomed.

 

Thanks,

Link to comment
Share on other sites

Thanks for the tip - that was going to be my first approach, but since all 8 objects can move each frame I was worried that would take too long. As it turns out, sorting the array each time takes even longer!! :)

 

Of course with this approach, I only have to sort objects that move up or down, and I only have to traverse the list in one direction based on whether the object moved up or down, and I can stop comparing once I hit a value that is greater (or less than).

 

I'll try this approach and see how much time it takes.

 

Thanks again!

Link to comment
Share on other sites

Hi there!

 

You may want to research in the [stella] archives about my Flickersort method used for Star Fire.

 

The BASIC idea is doing one iteration of a special Bubble Sort per frame. Doing so will have a "constantly improving" result.

 

Amazing bonus: If done right, this'll automatically do the "intelligence" as well - overlapping sprites will just continue to swap positions and so they automatically get an even distribution of on-screen time!

 

Greetings,

Manuel

Link to comment
Share on other sites

Selection sort isn't bad, but there are a few things you should do to make it faster.

 

First and foremost, having to check each element to see if it's the maximum and also check to see if it's already been selected is very wasteful. It's probably better to swap each selected item with the top element and shrink the size of the area being searched after each pass of the outer loop. Alternatively, the search can be performed on a temp copy of the original data to be sorted; every time an item is selected, the copy of the item in the temp array should be zeroed.

 

I'd suggest doing something like this:

 lda #7
 sta psize
sort1:
 ldx psize
sort2a:
 txa
 ldy items,x
 dex
 bmi sortdone1
sort2b:
 cpy items,x  
 bcc sort2a
 dex
 bpl sort2b
sortdone1:
; A holds the present location of the maximum item
; Do something useful with it, then...
 dec psize
 bpl sort1

Note that the inner loop time is MUCH smaller than what you had before, since the important information is all kept in registers.

Link to comment
Share on other sites

Thank you all for your help. I'm going to implement the bubble sort and see what kind of results I get.

 

While we're on the subject, my objects each have 3 bytes of info (X,Y and state) needed for drawing. My first thought was that it would be quicker to have a secondary table that stored the indexes to the objects so that when I sort I only have to swap indexes in this table (instead of swapping all 3 bytes in the object table).

 

Are there any thoughts on this?

 

Thanks again!!

 

BTW - I'm working on a re-write of Wizard of Wor using intelligent flicker (that non-intelligent flicker is a killer!). I'm sure I could hack the original code, but I'm using this as a learning opportunity too. I hope to implement the intelligent flicker, add more mazes (including the special mazes like The Arena, Warlord and Pit levels), and the in-between "DOUBLE SCORE DUNGEON" transition scene. I'll probably have to use 8K but we'll see.

Link to comment
Share on other sites

While we're on the subject, my objects each have 3 bytes of info (X,Y and state) needed for drawing.  My first thought was that it would be quicker to have a secondary table that stored the indexes to the objects so that when I sort I only have to swap indexes in this table (instead of swapping all 3 bytes in the object table). 

922718[/snapback]

 

If it's only three bytes per object, use a selection sort and do the swap. If you're crunched for time, it may be useful to unroll some of the loops. This will cause some significant code bloat (I don't know what your target size is) but allow for a major speedup since you can avoid redundant loads/stores. I'll see if I can code a macro-based version for you.

Link to comment
Share on other sites

I had to resort to an archive to check out that article.

http://web.archive.org/web/20041106215212/...ants/sprite.htm

 

Interesting. I had written a 32 sprite mux for the C64 back in 1990. This certainly brings back memories. I wish I had kept some source code. Maybe some day I'd like to disassemble TMNT for the C64 and figure out what I did. I do remember being satisfied with the 'mapper' part of the code.

Link to comment
Share on other sites

Interesting. I had written a 32 sprite mux for the C64 back in 1990. This certainly brings back memories. I wish I had kept some source code.

923138[/snapback]

 

Doing a sprite multiplexer for the C64 is a bit different from doing one on the 2600, since the former has enough RAM to allow for radix-bucket sorting. Depending upon the application, the 2600 may as well, but it's hardly a given.

 

The basic approach to radix-bucket sorting would be something like this (assuming eight zones of 16 scan lines each; two sprites per zone)

 ldx #30 ; (Number of zones*2-1)*2
 lda #0
cllp:
 sta zoneptr+1,x
 dex
 dex
 bpl cllp  

 ldy #7 ; Number of sprites
lp:
 lda spritey,x ; abs,y mode
 lsr
 lsr
 and #$FC
 tax
 lda zoneptr+1,x
 beq storehere1
 inx
 inx
 lda zoneptr+1,x
 bne nothere1
storehere1:
; Store sprite shape into zoneptr,x
nothere1:
 lda spritey,x ; abs,y mode
 lsr
 adc #[sprite_height/2]
 lsr
 and #$FC
 tax
 lda zoneptr+1,x
 beq storehere2
 inx
 inx
 lda zoneptr+1,x
 bne nothere2
storehere2:
; Store sprite shape into zoneptr,x
nothere2:
 dey
 bpl lp

Note that there is no need for sorting, though steps must be taken to ensure that in case of 'zone overload' objects will flicker instead of disappearing entirely. If that is addressed [and there are ways of dealing with it] the above general approach can be a good one.

Link to comment
Share on other sites

the interesting part starts with the "flicker engine".....

 

One approach which works well if there are at most four objects/zone is to have alternate frames take the objects in ascending and descending order. If there are one or two objects in a zone, they'll appear solid. If three, two will flicker 30Hz and one not. If four, all will flicker 30Hz. If there are more than four, some will disappear (oops).

 

I don't know what the best way is to construct a flicker engine for dividing more than four objects among two sprites. In a game like Star Fire where both sprites are used as a pair, the algorithm is comparatively simple. The general case, however, is harder. There are a number of approaches I can see, but all of them require either icky amounts of space or icky amounts of time. Constraining objects to move in discrete zones [as in Pooyan :P or Millipede :)] simplifies things by making it more practical to search for objects in zone 1, then zone 2, etc.

 

and what is happening when sprites cross the "zones"...

923672[/snapback]

 

Well, the partial code I gave above handled part of the issue. You need to indicate whether a sprite is a 'new' sprite or a 'continuation', which can be handled using the X position. If zero, it's a continuation of a sprite; anything else, it's a new sprite.

Link to comment
Share on other sites

Hi there!

 

I don't know what the best way is to construct a flicker engine for dividing more than four objects among two sprites.  In a game like Star Fire where both sprites are used as a pair, the algorithm is comparatively simple.  The general case, however, is harder.

 

Actually the Star Fire Flickersort algorithm will work 100% the same, totally regardless of how many hardware sprites are actually available, as it'll always cause an even distribution of screen time for any object. And the kernel would still try to draw as many sprites as it can - only at the advantage of having more hardware sprites available to draw them.

 

Greetings,

Manuel

Link to comment
Share on other sites

Manuel...just went through the flicker sorter...

 

is is this part of the source?

 

;-----------------------------------------------------------
;FLICKER SORT SUB
;-----------------------------------------------------------
  
FlickerSort:
   LDX #MAXOBJECTS-2
SortLoop:
   LDY sprtsizetype+1,X   ; A-> 1st sprite type and size
   LDA ypos+1,X           ; A-> 1st sprite ypos
   SBC shapeheight,Y      ; A-> 1st sprite ypos - size
   SBC #$03               ; Fix skipping problem
   CMP ypos,X             ; 2nd Sprite drawable?
   BPL DontSwap           ; Y: All right!
SwapThis
   LDA xpos,X             ; N: Swap horizontal position
   LDY xpos+1,X
   STA xpos+1,X
   STY xpos,X
   LDA ypos,X               ; Swap vertical position
   LDY ypos+1,X
   STA ypos+1,X
   STY ypos,X
   LDA xymov,X            ; Swap horizontal movement
   LDY xymov+1,X
   STA xymov+1,X
   STY xymov,X
   LDA zmov,X            ; Swap horizontal movement
   LDY zmov+1,X
   STA zmov+1,X
   STY zmov,X
   LDA sprtsizetype,X            ; Swap size & type
   LDY sprtsizetype+1,X
   STA sprtsizetype+1,X
   STY sprtsizetype,X
DontSwap:
   DEX
   BPL SortLoop

Link to comment
Share on other sites

Actually the Star Fire Flickersort algorithm will work 100% the same, totally regardless of how many hardware sprites are actually available, as it'll always cause an even distribution of screen time for any object. And the kernel would still try to draw as many sprites as it can - only at the advantage of having more hardware sprites available to draw them.

924334[/snapback]

 

Right, but if there are an odd number of objects (especially if it's three) what may be the optimal flicker for equalizing objects' display time will not look the best. For example, if there are three objects, it will look better if one is displayed 100% of the time and the other two 50%, than if all three are displayed 66% of the time. For that matter, having all three at 50% would still be better than all three at 66%.

Link to comment
Share on other sites

Hm... why should 50% look better than 66%?

924541[/snapback]

 

The rate of flicker is much more noticeable than the duty cycle. This is why movie projectors use a 2-blade (for sound speed) or 3-blade (for silent speed) shutter. Using the 2- or 3-blade shutter reduces the duty cycle of the projection, but causes it to flicker at 48 or 54fps instead of 24 or 18.

 

An object which is shown at 50% duty cycle can flicker at 30Hz. An object which is shown at 66% duty cycle can flicker at, at best, 20Hz. Likewise, an object which is shown at 33% duty cycle can flicker at 20Hz; an object which is shown at 40% duty cycle would have a flicker-pattern rate of at best 12Hz.

Link to comment
Share on other sites

Hi there!

 

An object which is shown at 50% duty cycle can flicker at 30Hz.  An object which is shown at 66% duty cycle can flicker at, at best, 20Hz.  Likewise, an object which is shown at 33% duty cycle can flicker at 20Hz; an object which is shown at 40% duty cycle would have a flicker-pattern rate of at best 12Hz.

 

I see. But an object with 40% screen time would look better with a pattern like: 1-0-1-0-0 as with 1-1-0-0-0?

 

And if you have four objects, 1-0-1-0 would be better than 1-1-0-0, or?

 

Then... I think for 2 hardware sprites the Flickersort algorithm might improve the result when doing 2 iterations per frame. It'd do the 1-0-1-0 for four sprites. For three sprites it won't help any though.

 

With three objects, even if the flicker rate is the same on 66% vs 33%, wouldn't 66% look at least slightly improved over 33% on a real TV? Or would it just change the brightness?

 

Greetings,

Manuel

Link to comment
Share on other sites

With three objects, even if the flicker rate is the same on 66% vs 33%, wouldn't 66% look at least slightly improved over 33% on a real TV? Or would it just change the brightness?

Depends on the phosphorescence effect of the TV and how fast it changes the intensity level from 0 to 1 and 1 to 0 (which vary when switching e.g. from 0.5 to 1). Rather complicated IMO.

 

Does anybody know the average values here?

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...