Andrew Davie

Session 12: Initialisation


It's a bit like someone coming up to Yoda and thanking him for his advice and calling him "young master"... with no idea he's talking to the most powerful Jedi master of all.


 

 

Yes, that's a good 8-byte clear. It clears memory but doesn't set the stack pointer.

You should look at Session 24, "Some nice code", for a better one:

        ldx #0 
        txa 
Clear   dex 
        txs 
        pha 
        bne Clear

The above exits with stack pointer set to $FF, all memory zeroed, X and A zero.
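For anyone who wants to check that exit state without firing up an emulator, here is a minimal Python sketch of the loop. It rests on assumptions: `run_clear` is a hypothetical helper name, page zero is modeled as a flat 256-byte list (no TIA/RIOT mirroring), and only the handful of instructions used above are modeled.

```python
def run_clear():
    """Model of: ldx #0 / txa / Clear: dex / txs / pha / bne Clear."""
    mem = [0xAA] * 256          # junk standing in for power-up garbage (assumption)
    x = 0                       # ldx #0
    a = x                       # txa
    passes = 0
    while True:
        x = (x - 1) & 0xFF      # dex: the only instruction here that sets Z
        sp = x                  # txs: copies X to SP, flags unchanged
        mem[sp] = a             # pha: store A at SP...
        sp = (sp - 1) & 0xFF    # ...then decrement SP, flags unchanged
        passes += 1
        if x == 0:              # bne falls through once DEX produced 0
            break
    return mem, a, x, sp, passes

mem, a, x, sp, passes = run_clear()
print(all(v == 0 for v in mem), hex(sp), a, x, passes)
```

The last pass stores to location 0 and the PHA wraps SP around to $FF, which is why no separate stack setup is needed.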

 

how about

 

 ldx #$FF
 txs
 lda #$00
LOOP
 tsx
 pha
 bne LOOP 

Omegamatrix can stick a few extra PHAs in at the beginning of the loop, as long as the total number of PHAs per pass divides evenly into 256.
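The divisibility condition can be explored with a small Python model. This is a sketch under assumptions: `pushes_until_exit` is a hypothetical helper, `n` is the total number of PHAs per pass (n-1 before the TSX, one after), and page zero is treated as a flat 256-byte page.

```python
from math import gcd

def pushes_until_exit(n):
    """Model of: ldx #$FF / txs / lda #0 / LOOP: (n-1) x pha / tsx / pha / bne LOOP."""
    sp, pushes = 0xFF, 0
    cleared = set()
    while True:
        for _ in range(n - 1):          # the extra PHAs at the top of the loop
            cleared.add(sp)
            sp = (sp - 1) & 0xFF
            pushes += 1
        x = sp                          # tsx: sets Z from the copied value
        cleared.add(sp)                 # last pha of the pass, flags unchanged
        sp = (sp - 1) & 0xFF
        pushes += 1
        if x == 0:                      # bne falls through
            return pushes, sp, len(cleared)

for n in (1, 2, 3, 4, 8, 16):
    print(n, pushes_until_exit(n))
```

In this model every `n` still ends with SP=$FF and the whole page cleared, but the number of pushes is lcm(n, 256). When `n` divides 256 that is exactly one lap of 256 pushes; otherwise the loop goes around the page several times before TSX happens to read 0, which throws the speed gain away.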

 

 

 

 

Edited by bogax


 

 

Yes, that is a brilliant, optimal solution. Just stick a CLD on there and you are done.

 

 

I sometimes run into a situation where I need to reboot or switch to an entirely new kernel (say, from a title screen to a playing screen). The main issue to avoid is scanline bounces. The optimal code takes about 36 scanlines to complete, which is too long. I made a routine that saves about 26 scanlines, with the trade-off of using many more bytes. I tried to balance the byte cost against the number of scanlines gained, and this was the best balance I could find:

    cld
    lda    #0
    ldx    #$2C
    txs
.loopClear:
    pha
    pha
    pha
    pha
    pha
    pha
    tsx
    cpx    #$7E
    bne    .loopClear
    ldx    #$FF
    txs

 

like this (see above)

 

 ldx #$FF
 txs
 lda #$00
LOOP
 pha
 pha
 pha
 pha
 pha
 pha
 pha
 tsx
 pha
 bne LOOP
 cld
Edited by bogax


 

how about

 

 ldx #$FF
 txs
 lda #$00
LOOP
 tsx
 pha
 bne LOOP 

 

It's not better as it stands -- 9 bytes instead of 8. However, in the special case where you need extra speed it's definitely quicker.


 ldx #$FF
 txs
 lda #$00
LOOP
 tsx
 pha
 bne LOOP

Hmm... would this work?

 lax #$00
 dex
 txs
Loop
 tsx
 pha
 bne Loop

EDIT: or

 lax #$00
Clear dex
 txs
 pha
 bne Clear

Edited by LS_Dracon


Although LAX#imm is supposedly always stable when the argument is zero, undocumented opcodes are not supported on all hardware.

    cld
    lda    #0
    ldx    #$2C
    txs
.loopClear:
    pha
    pha
    pha
    pha
    pha
    pha
    tsx
    cpx    #$7E
    bne    .loopClear
    ldx    #$FF
    txs

 

I came up with a more optimized solution which I posted in my blog a while ago:

    cld
    lda    #0
    ldx    #CXCLR
    txs
    ldx    #28
.loopClearFaster:
    pha
    pha
    pha
    pha
    pha
    pha
    dex
    bpl    .loopClearFaster
    txs

@Bogax, the point of the above code is to balance speed against bytes used. By starting at CXCLR and working down, a lot more cycles are saved (plus there are fewer loop passes, thanks to stuffing in multiple PHAs). The above code only takes 10 scanlines and 48 cycles. That's pretty good performance. The compact code takes 36 scanlines and 22 cycles to complete.
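To see which addresses the CXCLR version actually touches, a short Python model helps. It is only a sketch: `faster_clear` is a hypothetical helper, `CXCLR = $2C` is taken from the usual vcs.h definitions, and the address space is treated as a flat page with no mirroring.

```python
CXCLR = 0x2C                            # value from the usual vcs.h (assumption)

def faster_clear():
    """Model of: ldx #CXCLR / txs / ldx #28 / loop: 6 x pha / dex / bpl loop / txs."""
    sp, x = CXCLR, 28
    written = []
    while True:
        for _ in range(6):
            written.append(sp)          # pha stores A (0) at SP
            sp = (sp - 1) & 0xFF
        x = (x - 1) & 0xFF              # dex
        if x & 0x80:                    # bpl falls through once X wraps to $FF
            break
    sp_before_txs = sp
    sp = x                              # txs: X is $FF at this point
    return written, sp_before_txs, sp

written, sp_before, sp_after = faster_clear()
print(len(written), hex(sp_before), hex(sp_after))
```

In this model the 29 passes push 174 bytes, covering $00-$2C and $7F-$FF. The range $2D-$7E is skipped entirely, which is where the time saving comes from, and the final TXS reuses the $FF left in X by the loop counter.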

 

 

You typically only need speed if you are switching kernels... say from a title screen to playing screen, and want to easily avoid scanline bounces.

Edited by Omegamatrix


Although LAX#imm is supposedly always stable when the argument is zero, undocumented opcodes are not supported on all hardware.

Yep.

It's safe on the Atari 2600, I assume, as many homebrews use this opcode. Actually, this and DCP.

 

 

 

Very nice!

Thanks but it's just your code with lax ;)


Although LAX#imm is supposedly always stable when the argument is zero, undocumented opcodes are not supported on all hardware.

I use LAX all the time, but have never used LXA #IMM as it was reportedly highly unstable. Looking at this page it might be true that loading zero might always work:

 

http://www.oxyron.de/html/opcodes02.html

 

 

note to LAX: DO NOT USE!!! On my C128, this opcode is stable, but on my C64-II it loses bits so that the operation looks like this: ORA #? AND #{imm} TAX.

 

I'm writing this opcode as "LXA" because that is how DASM compiles it.


LAX doesn't work with an immediate operand (at least in DASM), so I'm using LXA, which works fine in the emulator.

BTW, I'm testing these routines and having problems; they're not working.

TSX has to come before PHA, but that doesn't make sense to me...

	cld
	lxa #0
	txs
loop	tsx
	pha
	bne loop
Edited by LS_Dracon


 


 

The problem seems to be that your (and my) code exits with SP=0, whereas it should be $FF

Add another PHA at the end, like this...

 lxa #0
 txs
loop pha
 tsx
 bne loop
 pha

This starts with X=0 and copies that into SP. The first PHA writes 0 to location 0 and wraps SP to $FF, and we loop.

When SP is 1, the PHA writes 0 to location 1 and SP becomes 0, which TSX then loads into X, so the loop ends with SP=0.

The final PHA resets SP to $FF.

 

I haven't actually run this. But it looks reasonable. However, "LXA" is considered unstable and should probably not be used. And there's no LAX immediate as you have pointed out.

 

So...

 lda #0
 tax
 txs
loop pha
 tsx
 bne loop
 pha

It's not so elegant anymore: 9 bytes. But it does have the advantage of a quicker clear (roughly 512 cycles saved, since each pass takes 8 cycles instead of 10) at the cost of an extra byte.
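The trace and the cycle saving can be checked with a small Python model. This is a sketch, not a measurement: `nine_byte_clear` is a hypothetical helper, the timings are the standard documented 6502 cycle counts (PHA 3, TSX/TAX/TXS/DEX 2, immediate loads 2, BNE 3 taken and 2 not taken), and the branch is assumed not to cross a page boundary.

```python
def nine_byte_clear():
    """Model of: lda #0 / tax / txs / loop: pha / tsx / bne loop / pha."""
    mem = [0x55] * 256                      # junk to be cleared (assumption)
    a = 0
    x = a
    sp = x
    cycles = 2 + 2 + 2                      # lda #0, tax, txs
    while True:
        mem[sp] = a                         # pha (3 cycles)
        sp = (sp - 1) & 0xFF
        x = sp                              # tsx (2 cycles), sets Z
        cycles += 3 + 2
        if x == 0:
            cycles += 2                     # bne not taken
            break
        cycles += 3                         # bne taken, no page crossing assumed
    sp_at_exit = sp
    mem[sp] = a                             # final pha rewrites location 0...
    sp = (sp - 1) & 0xFF                    # ...and wraps SP to $FF
    cycles += 3
    return mem, sp_at_exit, sp, cycles

# Same arithmetic for the 8-byte version: ldx #0 / txa / 256 x (dex, txs, pha) / bne
eight_byte = 2 + 2 + 256 * (2 + 2 + 3) + 255 * 3 + 2

mem, sp_at_exit, sp, cycles = nine_byte_clear()
print(all(v == 0 for v in mem), hex(sp_at_exit), hex(sp), cycles, eight_byte - cycles)
```

Under these assumptions the 9-byte version comes out at 2056 cycles against 2563 for the 8-byte one, a saving of 507 cycles, so "about 512" is the right ballpark.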

 lax #0
 txs
loop pha
 tsx
 bne loop

 

There is a problem with the above code in that the stack pointer is left pointing to 0 after completion.

 



	cld
	lxa #0
	txs

loop	tsx
	pha
	bne loop

The branch in the loop is never taken, as the very first time through TSX brings a value of 0 to X. PHA does not affect any flags.

 

 

Edit: Andrew beat me to it.

Edited by Omegamatrix


Here's another one. :)

;25 scanlines + 18 cycles (1918 cycles total)
;A  = 0
;X  = 0
;Y  = random
;SP = $FF
;zp ram location $FF = random

    cld
    lda    #0
.loopClear:
    ldx    #$48          ; PHA opcode = $48
    txs
    inx
    bne    .loopClear+1  ; jump between operator and operand to do PHA

This sets the stack correctly, but leaves RAM location $FF untouched. Not clearing $FF is okay for me. It can be used for a random seed, and programmers often use JSR with the stack aligned to $FF anyhow. Starting at $48 instead of 0 or $FF makes the routine quicker.

Edit: I just realized the mirror of the TIA registers starts at $40, so I don't actually clear:

VSYNC
VBLANK
NUSIZ0
NUSIZ1
COLUP0
COLUP1

Most of these registers the programmer will set up during the program, so it's still not too bad as long as the user is aware that their initial state is unknown.
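The branch into the operand byte is easier to follow in a model where the $48 operand is simply treated as a PHA at the top of the loop. The Python below is a sketch under assumptions: `clear_from_0x48` is a hypothetical helper and memory is a flat 256-byte page, so on real hardware the writes to $48-$7F land on TIA registers $08-$3F through the mirror.

```python
def clear_from_0x48():
    """Model of: lda #0 / ldx #$48 / txs / inx / bne (into operand) -> pha / txs / inx / bne ..."""
    mem = [0xEE] * 256          # junk to be cleared (assumption)
    a = 0
    x = 0x48                    # ldx #$48 runs once; its operand byte doubles as PHA
    sp = x                      # txs
    x = (x + 1) & 0xFF          # inx
    while x != 0:               # bne .loopClear+1 lands on the $48 byte
        mem[sp] = a             # pha stores at the SP set by the previous txs
        sp = x                  # txs overrides the decrement PHA just did
        x = (x + 1) & 0xFF      # inx
    untouched = [addr for addr, v in enumerate(mem) if v != 0]
    return sp, x, untouched

sp, x, untouched = clear_from_0x48()
print(hex(sp), x, [hex(addr) for addr in untouched])
```

In this model the stores cover $48-$FE, SP and X end at $FF and 0, and the untouched addresses are $00-$47 plus $FF, which matches the description above: the low TIA registers and RAM location $FF are left alone.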

Edited by Omegamatrix


I believe I have just come up with an 8-byte solution that includes CLD and uses no illegal opcodes:

;39 scanlines + 65 cycles (3029 cycles total)
;A  = 0
;X  = 0
;Y  = random
;SP = $FF

    cld
.loopClear:
    ldx    #$0A          ; ASL opcode = $0A
    inx
    txs
    pha
    bne    .loopClear+1  ; jump between operator and operand to do ASL

It takes the most cycles of any solution, but clears all the TIA registers and RIOT ram. :)

Edited by Omegamatrix


 

The branch into mid-instruction, which is an ASL, is very clever.

However, I'm struggling to understand this. X is effectively initialised at 11 (first time), so that's where the first A value goes. But A is undefined -- effectively random.

From the second pass on you do an ASL every loop, so after 8 loops A is guaranteed to be 0. And you branch until the result is zero (effectively when X gets to 0). So you never clear locations 0 to 10.

And furthermore, locations 10 to 17 effectively have randomish data.

This code is bizarre, and this is my third attempt to analyse/respond.


 


Hi Andrew,

 

A=0 by the time it hits the TIA mirrors at $40-$7F. It doesn't matter what value A starts with, as it will be zero for the "second time through" as it clears the mirrored addresses. As a bonus you know the carry will also always end up being clear by the end of this routine.


Does the second routine make sense now?

 SP    REGISTER    VALUE (FROM ACCUMULATOR, which gets ASL'd)
$0B     REFP0      %XXXXXXXX
$0C     REFP1      %XXXXXXX0
$0D     PF0        %XXXXXX00
$0E     PF1        %XXXXX000
$0F     PF2        %XXXX0000
$10     RESP0      %XXX00000
$11     RESP1      %XX000000
$12     RESM0      %X0000000
$13     RESM1      %00000000
$14     RESBL      A=0 from now on

;writes continue to start of TIA mirrors

 SP    REGISTER    VALUE (FROM ACCUMULATOR)
$40     VSYNC       0
$41     VBLANK      0
$42     WSYNC       0
...

;Writes continue through ZP $80-$FF clearing RIOT RAM
;At end of routine TIA registers and RIOT RAM cleared,
;A=X=0, SP = $FF
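The table can be reproduced for every possible starting value of A with a short Python model. It is a sketch under assumptions: `asl_clear` is a hypothetical helper, addresses $40-$7F are folded onto $00-$3F to imitate the TIA mirror, and only the last value written to each register is tracked (strobe side effects are ignored).

```python
def asl_clear(a):
    """Model of: ldx #$0A / inx / txs / pha / bne (into operand) -> asl / inx / txs / pha / bne ..."""
    tia = [None] * 0x40             # last value written to each TIA address $00-$3F
    ram = [0xFF] * 0x80             # RIOT RAM at $80-$FF, filled with junk
    carry = None
    x = 0x0A                        # ldx #$0A runs once; its operand byte doubles as ASL A
    first = True
    while True:
        if not first:
            carry = a >> 7          # asl: bit 7 goes to carry...
            a = (a << 1) & 0xFF     # ...and A shifts left by one
        first = False
        x = (x + 1) & 0xFF          # inx: the instruction that sets Z for the bne
        sp = x                      # txs
        if sp < 0x80:
            tia[sp & 0x3F] = a      # pha; $40-$7F mirror $00-$3F
        else:
            ram[sp - 0x80] = a      # pha into RIOT RAM
        sp = (sp - 1) & 0xFF        # pha decrements SP after the store
        if x == 0:                  # bne falls through once INX wrapped to 0
            return tia, ram, a, x, sp, carry

results = [asl_clear(start) for start in range(256)]
print(all(set(tia) == {0} and set(ram) == {0} for tia, ram, *_ in results))
```

In this model, registers $0B-$12 first receive the partly shifted junk, but every TIA address is rewritten with 0 on the pass through the mirror, and the routine always ends with A=X=0, carry clear and SP=$FF whatever A held on entry.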


You typically only need speed if you are switching kernels... say from a title screen to playing screen, and want to easily avoid scanline bounces.

But why would you be clearing ram and registers at that point anyway? Of all the games I've altered to have more than a single kernel, I've never had to do it. Powerup only "requires" it because everything is in an unknown state...but even that is too broad of a statement to be using (i.e. you really only need to clear the stuff your regular game init routine misses, or gfx/aud registers that you won't be using at all).


Assuming LXA is as stable as LAX, removing DEX from the loop, and setting the stack to $FF, should this work?

We could test LXA on real hardware. I've been searching about it, and it seems the people who say it's not stable are confusing it with LAX.

   lxa #0 
   dex 
   txs 
loop 
   pha 
   tsx 
   bne loop

EDIT: Definitely unstable, and it's not "lax #imm": it ANDs A with X and loads the result into X.

Since neither X nor A starts as 0, it's not useful.

Edited by LS_Dracon


But why would you be clearing ram and registers at that point anyway? Of all the games I've altered to have more than a single kernel, I've never had to do it. Powerup only "requires" it because everything is in an unknown state...but even that is too broad of a statement to be using (i.e. you really only need to clear the stuff your regular game init routine misses, or gfx/aud registers that you won't be using at all).

It's just much easier to clean it all. IMHO it also makes the game a lot easier to troubleshoot.


EDIT: Definitely unstable, and it's not "lax #imm": it ANDs A with X and loads the result into X.

Since neither X nor A starts as 0, it's not useful.

Although LXA is unstable, it is possible that using 0 for the immediate value could be stable as Nukey described.

 

My notes describe LXA as:

AND byte with accumulator, then transfer accumulator to X register.

And the unstable behaviour is described as:

ORA #? AND #{imm} TAX

 

 

In either case the accumulator is AND'd with the immediate value right before TAX. As long as you are ANDing with 0 you should be okay. That being said I'd still be a little iffy to implement it. Who knows if the behaviour will be different on some consoles?
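The argument is just arithmetic, so it can be checked exhaustively. The Python below is a sketch of the "ORA #? AND #{imm} TAX" description only: `lxa_model` and its `magic` parameter (standing in for the unknown "?" constant) are hypothetical names, and nothing here says how a particular console actually behaves.

```python
def lxa_model(a, imm, magic):
    """Unstable-LXA description quoted above: ORA #magic, AND #imm, TAX."""
    result = (a | magic) & imm
    return result, result           # new A, new X

# With an immediate of 0 the unknown constant cannot leak into the result.
stable_with_zero = all(
    lxa_model(a, 0x00, magic) == (0, 0)
    for a in range(256)
    for magic in range(256)
)

# With any other immediate the result depends on the unknown constant.
depends_on_magic = lxa_model(0x00, 0xFF, 0x00) != lxa_model(0x00, 0xFF, 0xEE)

print(stable_with_zero, depends_on_magic)
```

Under that description, `lxa #0` always yields A=X=0, while any non-zero immediate lets the unknown constant through. Whether every 2600 CPU follows the description is exactly the open question.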
