4 to 10 scanlines just to sort 8 bytes?

matseng · May 7, 2019

Since it is like 35 years ago I seriously coded 6502 I'm currently a bit rusty, but I'm trying to figure out a decently fast sorting algorithm for small arrays like for a list of Y-values of objects that are to be displayed on the screen as the scanlines passes by.

This must be something that is quite common and already solved many times over for 2600 games.

I did my own version of a bubble sort that is kept sorted byte-by-byte as I move the Y-values from the objects into the array of sorted data that is used in the kernel display loop. It is significantly larger in code size than the standard bubble sorts I found for 6502, but it performs rather well both for data that is already pre-sorted and data that is in reverse order.

In the table below the first three entries are for my code the last two is something I found on the net. (This is for sorting 8 bytes)

; First value is the number of scanlines required for pre-sorted data,
; second is for data that is fully reverse order.
;
; Yellow  3.0 -  7.5 (369 bytes) Continous Unrolled 
; Orange  7.4 - 12.0 (262 bytes) Continous Subroutined 
; Red     4.0 -  9.0 (138 bytes) Continous Fallthrough
; Blue    2.4 - 29.9 ( 51 bytes) Bubblesort
; Green  12.1 - 14.0 ( 53 bytes) Optisort

But spending 4 to 9 scanlines of time for sorting just eight bytes seems like a very long time. I guess you guys have something much better for this, or is it what's actually expected time wise ?

Here's a visual comparison of the data from the above table:

[Edit: Seems like attachments are auto-displayed so I removed it from the post itself]

sorttest.asm

Edited May 7, 2019 by matseng

+splendidnut · May 8, 2019

Nice question! I've been wondering about what others have done in the sprite sorting department.

For Chaotic Grill, I'm currently using a sorting network. Code as follows:

;----------------------------------------------------
;--- Sprite sorting using sorting network
	
	MAC	SWAP	;17/21 cycles - 14 bytes
		LDX		<sortedSpriteList+{1}	;3	[3]
		LDY		<sortedSpriteList+{2}	;3	[6]
		LDA		<spriteY,x		;4	[10]
		CMP		<spriteY,y		;4	[14]
		BCC		.noSwap			;2,3    [16/17]
		STX		<sortedSpriteList+{2}	;3	[19]
		STY 	        <sortedSpriteList+{1}   ;3	[21]
.noSwap:
	ENDM
	
	MAC SORT_SPRITES
	;sort sprites using sorting network
	
	;--- first load sprites into sorted list, using original order
		LDX		#0
addSpritesToSortList:
		TXA
		STA		<sortedSpriteList,x
		INX
		CPX		<totalSprCnt		;change to 6 if including item
		BNE		addSpritesToSortList
		CPX		#6
		BNE		.sort5sprites
		JMP		.sort6sprites

.sort5sprites:
	;-- sorting network for 5 sprites:	- 21*9 worst case (189)
		SWAP 0,1
		SWAP 3,4
		SWAP 2,4
		SWAP 2,3
		SWAP 0,3
		SWAP 0,2
		SWAP 1,4
		SWAP 1,3
		SWAP 1,2
		JMP 	.doneSortSprites
	
.sort6sprites:
		;-- sorting network for 6 sprites:	21*12 worst case	(252)
		SWAP 1,2
		SWAP 0,2
		SWAP 0,1
		SWAP 4,5
		SWAP 3,5
		SWAP 3,4
		SWAP 0,3
		SWAP 1,4
		SWAP 2,5
		SWAP 2,4
		SWAP 1,3
		SWAP 2,3
	
.doneSortSprites:
	ENDM

I run the sort once per frame for flicker management.

But I'm thinking of switching to keeping the list ordered all the time by allowing the player/enemy movement logic to only change order when they move vertically... and then the only necessary thing would be swapping with a neighboring sprite. That would probably be the fastest way to handle it.

ChildOfCv · May 8, 2019

In general, your best bet is a sort that knows the rules of your situation and can exploit them.

For example, if the sprites are only moving by small amounts, you only have to check for an ordering change whenever one makes a vertical move, and then only for that item. Then you ought to be done with the sorting within a swap or two, since you not only know which direction to look, but also have a pretty good chance of only moving a position or two at once in the list.

explorer · May 8, 2019

If data are very closed to have sorted, one of the best algorithms is Insertion Sort.

For general data, one of the best is Quick Sort.

NoLand · May 9, 2019

Maybe using a binary tree as you go along in your logic and then just flatten the tree? However, you'd have to reserve some memory for this (assuming the tree may extend from the root-node to either side in its entirety) and traversing the tree may be just as expensive.

Sign In

4 to 10 scanlines just to sort 8 bytes?

Recommended Posts

matseng

Link to comment

Share on other sites

+splendidnut

Link to comment

Share on other sites

ChildOfCv

Link to comment

Share on other sites

explorer

Link to comment

Share on other sites

NoLand

Link to comment

Share on other sites

Join the conversation

Recently Browsing 0 members

Apps

My Activity Streams

More