enthusi

Lynx loader from scratch


One last note:

add "stz $fda0,x" in the copy loop to zero out the palette.

And yes, now the wheel is much much rounder ... :-)


just a hint:

The second stage in the microloader is not encrypted, so there is no real speed difference compared to your approach.

If you want to do a REAL optimization, you have to choose the filler bytes such that the multiplication is faster.


The challenge was to fit as much as possible into the first 50 bytes, since this is the minimum.

 

But, oh, I get your point, if the first stage is only a few bytes that loads the rest and we fill the remainder with optimal values, decryption plus additional loading might be quicker.

 

But to find this kind of optimized code one needs exact cycle counts. There is a guy in the 6502-FB group who made a simulator with lots of debugging features, but for the Apple ][.

 

I do not trust handybug's cycle count, but the decryption could be run in another simulator ...

 

Next challenge :-)

Edited by 42bs


Actually, it depends on how the multiplication is written, i.e. whether it cares if bits are set or not.

 

Anyway: with the microloader + unpacker for the first game binary you are already so fast that the "blob" from turning on disturbs the music ;-)


Yes :)

Could it be that the last byte of the 50 bytes has to be 0?

I seem to run into a problem/bug and that would explain it.


If the last byte is 0 it stops decryption. Otherwise it continues with the next bootloader block.


This is the final version that I now use, btw:

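enthusi_one_shot_loader
    dex                 ; X = $ff (assumes X = 0 on entry)
    txs                 ; stack pointer = $ff, so pushes start at $1ff
    ldx #31
l2
    lda code,x
    pha                 ; copy the 32 bytes of code to $1e0-$1ff, last byte first
    stz $fda0,x         ; zero out the palette
    dex
    bpl l2
    bmi code_start      ; always taken, X is $ff here
code
*=$1e0
code_start
    lda cart0           ; next cart byte: number of blocks to load
    sta blocks2load
load_a_full_block
    inc block           ; I use $03 which is initialised as 0 from BIOS
    lda block
    jsr $fe00           ; select block
    tay
pageloop
    lda cart0
target
    sta $200,y
    iny
    bne pageloop
    inc target+2        ; next destination page
    dex
    bne pageloop
    dec blocks2load
    bne load_a_full_block
ready ; we are at $0200 here where the main game starts
; alternatively I load two more bytes as start address (stored in stack) and use RTS in code to jump there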

 


You know that the 65C02 has "bra" ;-)

Also, you should init AUDIN (see Karri's comment)

Edited by 42bs


Yeah, right after BPL I still find BMI easier to read but that's just what I am used to :)

AUDIN is for the main code then (this loader right now is fixed to $200 anyway and for own projects).



 

The problem with AUDIN is relevant if you use bank-switching.

The loader must be in both banks (!), but your "main" application is likely on bank 0. Therefore one needs to init AUDIN, else it will not work on all Lynxes.


Off topic: Loading with bank-switching

This is a thought and I write it down, so it does not get lost :-)

If the code is split up between both banks (RAID0), loading could be even quicker, as the block needs to be selected only half as often.

 

That means: the first 1K is loaded from block x, bank 0; switch AUDIN; the second 1K is loaded from block x, bank 1.

There is no need to have a full 512K game; it should also work with smaller ones.

 

This trick might help playing a sample off the card: At least 0.256s at 8kHz ;-)
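
The 0.256 s figure follows from the numbers above if one assumes one byte per sample: 1K from each of the two banks gives 2048 bytes per block select, and at 8000 bytes per second that lasts 0.256 s. A quick check in Python (the function name is only for illustration, it is not from any Lynx tool):

```python
def playback_seconds(bytes_per_bank, banks, sample_rate_hz):
    """Seconds of 8-bit mono audio that can be streamed without selecting a new block."""
    return bytes_per_bank * banks / sample_rate_hz

# Assumption: 1K per bank per block, as in the example above, and 8000 samples/s.
one_bank = playback_seconds(1024, 1, 8000)
both_banks = playback_seconds(1024, 2, 8000)
print(one_bank, both_banks)
```

With a single bank the same block only covers half of that time before the next block select is due.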


Yes, but even block changing is fast enough for 8 kHz samples ;)

It would be a bit of fun to use it to load twice as many small files without byte-offset seeking. If you don't change the block but just AUDIN, you need a proper interleave for the banks, since the ripple counter continues ;-)



 

If you do not load full blocks, then yes, the counter poses a challenge.


For my current game there is a notable speed difference (well, only notable when you launch them side by side) between my own loader and the one lynxdir implements.

Mostly just on the leftover green of the palette though :)

But until I have use for it I tend to ignore AUDIN now.


Great findings! There are so many new ideas: interleaved banks to speed up access, keeping offsets at zero to speed up loading files, and a single-block loader.



That trick was already used in lynxer. You will find it in some of the old homebrew ROMs, but maybe by accident ;-)


Yepp, LYNXER had #ALIGN, but not interleaving of bank 0 and 1. At least none of the 68k versions did, and neither did the C lynxer (I think Matthias wrote it).


I am still not sure that you win a lot. As there is only one counter reset, you cannot read one bank (sample from interrupt) while the other is loading sprites etc. I am not even sure if there is a second counter.


There is only one counter. Using interleaving for arbitrary data seems complicated, as the counter increases on every read.

But reading a complete block works as the lower bits are zero after you have read the block on bank 0.
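
A toy model in Python of why the full-block case works out. It is only a sketch under assumptions: blocks are 1K (as in the RAID0 example in this thread), both banks share one block number and one byte counter, and the counter wraps to zero after a full block. The class and function names are made up for illustration.

```python
BLOCK_SIZE = 1024  # assumption: 1K blocks

class Cart:
    """Two banks sharing one block number and one byte counter (toy model)."""
    def __init__(self, bank0, bank1):
        self.banks = (bank0, bank1)
        self.block = 0
        self.counter = 0
        self.bank = 0
        self.block_selects = 0

    def select_block(self, block):
        self.block = block
        self.counter = 0              # selecting a block resets the counter
        self.block_selects += 1

    def read(self):
        byte = self.banks[self.bank][self.block * BLOCK_SIZE + self.counter]
        self.counter = (self.counter + 1) % BLOCK_SIZE  # every read advances it
        return byte

def load_striped(cart, n_blocks):
    """Read block x from bank 0, switch bank, read block x from bank 1."""
    out = bytearray()
    for block in range(n_blocks):
        cart.select_block(block)
        for bank in (0, 1):
            cart.bank = bank          # stands in for toggling AUDIN
            out += bytes(cart.read() for _ in range(BLOCK_SIZE))
    return bytes(out)

bank0 = bytes((i * 7) % 256 for i in range(2 * BLOCK_SIZE))
bank1 = bytes((i * 13 + 1) % 256 for i in range(2 * BLOCK_SIZE))
cart = Cart(bank0, bank1)
data = load_striped(cart, 2)
print(len(data), cart.block_selects)
```

In this model 4K is loaded with two block selects instead of four, because the counter is back at zero when the bank is switched.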


I see no problem with interleaving either.

Imagine complete 256-byte pages.

bank0    bank1
--------------
page1    X
X        page2
page3    X

You'd still have to set a new block after n pages, of course.

And X doesn't have to be empty; it could be just the same pattern, starting in bank 1 instead of bank 0.

You gain nothing except for causing some confusion/obfuscation ;-)

If you load full blocks in that fashion you effectively double the size of a block though.
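
The page table above can be written as a small mapping. This Python sketch assumes 256-byte pages and, purely as an example value, 4 pages per block; `page_location` is a hypothetical helper, and it counts pages from 0 while the table counts from 1.

```python
PAGE_SIZE = 256

def page_location(page_index, pages_per_block=4):
    """Return (block, bank, offset) for a page when consecutive pages alternate banks."""
    block, slot = divmod(page_index, pages_per_block)
    bank = slot % 2             # even slots live in bank 0, odd slots in bank 1
    offset = slot * PAGE_SIZE   # the shared counter has already advanced past earlier slots
    return block, bank, offset

layout = [page_location(i) for i in range(6)]
print(layout)
```

The slots marked X are the (block, bank, offset) combinations this mapping never returns, which is where a second stream starting in bank 1 would go.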


 

This was the final version, that I now use, btw:

enthusi_one_shot_loader
<snip>

 

Funny. I found this code, dated 2009 and called micro_loader, on my disk, but I am not sure who wrote it:

 

	.psc02                  ; turn on 65SC02 instruction set

        RCART_0 = $fcb2 ; cart data register

	.org    $0200

	ldx	#15
b0:	stz	$fda0,x
	dex
	bpl	b0
	;; size in pages
	lda	RCART_0
	sta	2
	;; dst $233
	lda	#2
	sta	4

	asl			; 4 pages per block
	sta	5
	
	ldy	#51		; already 51 bytes loaded in 1st block
b1:	
	lda	RCART_0
	sta	(3),y
	iny
	bne	b1
	inc	4		; next dst page
	dec	2
	beq	done
	dec	5		; next block pages
	bne	b1
	inc	0		; next block
	lda	0
	jsr	$fe00		; select
	bra	b1
done:

It was written in ca65 syntax.


Oh, never seen this. Interesting. Not a C64 guy if $00 is being used :-) Are 0, 1, 2 even initialized?


Interesting find. I remember seeing the file name at some point in time.

 

Leaving out the directory is something I briefly discussed with Wookie when we were trying to understand the obfuscation and RSA encryption phases years ago. Some consoles like PSX stream in objects from the CD at run time. So you could move around in your world and automatically load in new objects in front of you and discard old objects behind you. This problem is very similar to displaying nautical charts using OpenGL. You need to have fast access to objects that are nearby in order to get decent drawing speeds on 4K displays.

 

Once I get the eJagfest and "Shaken, not stirred" off my mind I could have a look at some experimental spatial engine that would allow creation of an RPG with dynamically loaded content. Perhaps Stardreamer could take a turn in that direction? To boldly go where no Atarian has ever gone before?
