
How to get a routine address pointer ?


shazz


Hello...

 

I'd like to get the address of a given point inside a routine.

 

lda #<MyRoutine
sta RoutinePtr
lda #>MyRoutine
sta RoutinePtr+1

 

gives the address of the routine start point. Ok.

 

But if I want the address of a point within this routine and try:

ldy #10

lda #<MyRoutine,Y
sta RoutinePtr
lda #>MyRoutine,Y
sta RoutinePtr+1

 

 

I get :

RoutinePtr = $0000....

 

What is the correct syntax ?

Edited by shazz

I've got a big routine that wastes cycles. Based on some criteria, I want to jump either to the beginning of the routine to waste all the cycles, or somewhere into the routine to waste fewer. (I need to waste cycles to get in sync.)

 

The routine is just something like:

 

WasteOddTime SUBROUTINE
nop ; 95 ;-> waste 95 cycles
nop ; 93
nop ; 91
nop ; 89
nop ; 87
...
rts

 

so my current buggy code is :

LDA #<WasteOddTime,Y ;-> offset like +1 to waste only 93 cycles
STA DelayRoutine
LDA #>WasteOddTime
STA DelayRoutine+1

Edited by shazz

There's no such instruction as LDA #immediate,Y. An immediate operand is a constant fixed at assembly time, so it can't be indexed.

 

You can make the STA instructions take 4 cycles, instead of 3, by doing this:

    lda #<MyRoutine
    sta.w RoutinePtr
    lda #>MyRoutine
    sta.w RoutinePtr+1

 

The .w forces DASM to use Absolute addressing (16-bit) instead of Zero Page addressing (8-bit).
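To make the difference concrete, here is the same store assembled both ways. This is only a sketch: placing RoutinePtr at $80 is an assumption, and the opcode and cycle figures in the comments are the standard 6502 ones.

```asm
        processor 6502

        seg.u vars              ; uninitialised zero-page RAM
        org $80
RoutinePtr  ds 2                ; assumed location of the pointer

        seg code
        org $F000
        sta   RoutinePtr        ; zero page: opcode $85, 2 bytes, 3 cycles
        sta.w RoutinePtr        ; absolute:  opcode $8D, 3 bytes, 4 cycles
```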

 

Why are you wasting 95 cycles?


Darrel, yes that works, but my question was how to get the address of a given offset after the label, like MyRoutine+10.

For the moment I get the label address and then do a 16-bit addition...

 

And I waste 95 cycles because the VCS has too many :)

No, seriously, this is my working but non-optimized moving 48px sprite routine, which needs the delay to sync the GRP0 writes with the TIA...

 

But yes why 95.... because it is (posX/3)+43 for posX=159.... but I guess 96-76 = 20 would be enough :)

 

Damn I really need a better formula to compute the delay required to draw the sprite...

 

My current formula :

 

In VBlank time

1. compute cycles to wait : ((posX/3)+43) % 76

2. if even, get the EvenNOPtable offset corresponding to this number of cycles

3. if odd, get the OddNOPtable offset corresponding to this number of cycles

4. store the address in DelayRoutine

 

Before scanline

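    STA WSYNC
    LDA #>[ScanLineLoop-1]  ; push the return address minus 1, high byte first
    PHA
    LDA #<[ScanLineLoop-1]  ; RTS adds 1 to the address it pulls
    PHA
    JMP (DelayRoutine)      ; the delay routine's RTS lands on ScanLineLoop

ScanLineLoop:
    ...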
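Steps 1 to 3 of the VBlank part could look roughly like this. It is only a sketch under stated assumptions: Cycles and IsOdd are hypothetical variables, and DivideBy3 is a slow placeholder that uses repeated subtraction. Since posX is at most 159, (posX/3)+43 is at most 96, so one conditional subtraction is enough for the % 76.

```asm
        processor 6502

        seg.u vars              ; uninitialised zero-page RAM
        org $80
posX        ds 1                ; sprite X position, 0..159
Cycles      ds 1                ; hypothetical: number of cycles to waste
IsOdd       ds 1                ; hypothetical: 1 if Cycles is odd, else 0

        seg code
        org $F000

ComputeDelay:
        lda posX
        jsr DivideBy3           ; A = posX / 3
        clc
        adc #43                 ; A = (posX/3) + 43, at most 96
        cmp #76
        bcc .noWrap             ; already below 76
        sbc #76                 ; carry is set after CMP, so this subtracts exactly 76
.noWrap
        sta Cycles
        and #1                  ; bit 0 tells odd from even
        sta IsOdd               ; the caller picks the odd or even NOP table from this
        rts

; Placeholder divide: repeated subtraction, slow but easy to check by hand
DivideBy3:
        ldx #$FF
        sec
.sub    inx                     ; count one more subtraction attempt
        sbc #3
        bcs .sub                ; loop while there was no borrow
        txa                     ; X = number of successful subtractions
        rts
```

For posX = 159 this gives 53 + 43 = 96, then 96 - 76 = 20, which is the value mentioned above.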

Edited by shazz

If the 10 is a constant value, let DASM do the work:

    lda #<[MyRoutine + 10]
    sta RoutinePtr
    lda #>[MyRoutine + 10]
    sta RoutinePtr+1

 

If it changes during runtime, use 16 bit addition:

 

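    lda #10
    jsr UpdateRoutinePtr
    ...
    lda #20
    jsr UpdateRoutinePtr
    ...

UpdateRoutinePtr:           ; A = offset (0..255) to add to MyRoutine
    clc
    adc #<MyRoutine         ; low byte + offset, carry out on overflow
    sta RoutinePtr
    lda #0
    adc #>MyRoutine         ; high byte + carry
    sta RoutinePtr+1
    rts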
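
If the offset is only ever one of a handful of known entry points, another option is to let DASM precompute every entry address and index a pair of lookup tables at runtime. A minimal sketch, assuming a hypothetical EntryIndex variable, hypothetical EntryLo/EntryHi tables, a stand-in body for MyRoutine, and three arbitrary offsets (none of these come from the posts above):

```asm
        processor 6502

        seg.u vars              ; uninitialised zero-page RAM
        org $80
RoutinePtr  ds 2                ; pointer later used by JMP (RoutinePtr)
EntryIndex  ds 1                ; hypothetical: selects an entry point, 0..2

        seg code
        org $F000

SetRoutinePtr:
        ldy EntryIndex
        lda EntryLo,y           ; low byte of the chosen entry address
        sta RoutinePtr
        lda EntryHi,y           ; high byte of the chosen entry address
        sta RoutinePtr+1
        rts

MyRoutine:                      ; stand-in body: 24 NOPs, then return
        ds 24, $EA
        rts

; Entry addresses, computed by DASM at assembly time
EntryLo:
        .byte <[MyRoutine + 0], <[MyRoutine + 10], <[MyRoutine + 20]
EntryHi:
        .byte >[MyRoutine + 0], >[MyRoutine + 10], >[MyRoutine + 20]
```

This trades two bytes of ROM per entry point for skipping the 16-bit addition at runtime.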


ok. Makes sense.

 

And about my current formula, it seems the computation alone takes a lot of time (especially dividing a big number by 3)... any idea?

Or more simply, is there a generic routine to waste n cycles? Like the sleep macro, but dynamic?

Edited by shazz

Yeah, I saw it, but it has too few comments for me, and when I see ;????? it makes me think this is magic, and I don't like it when it's magic :)

I knew I should have cleaned up the source code before posting it to the stella mailing list, as this 'problem' comes up every time someone looks at the code. ;)

 

Someone on the Stella mailing list had asked if it was possible to have a free-moving 48-pixel sprite. I was curious, so I simply took the 48-pixel sprite routine and quickly merged it with the single-cycle delay routine that Jim Nitchals had posted a little while before. Instead of doing proper cycle counting, I simply tweaked all the delay timer values and player position values until all the writes to the graphics registers lined up perfectly and I was able to move the sprite everywhere on the screen.

 

Back then I was still working with MS-DOS, which meant that I had to exit the text editor in order to compile the code and test the binary. Therefore I always used things like the ;????? as bookmarks for the place in the source code that I was currently working on. So there is nothing magical about those values. These are simply the lines where I had to tweak something to make the two routines work together properly. Sorry about the confusion. ;)


Therefore I always used things like the ;????? as bookmarks for the place in the source code that I was currently working on.

Oops, your bad. You should have used "; <--- resume here!" to bookmark those spots. ";?????" is supposed to be used only as an abbreviated version of ";????? WTF ?????" ;)


eh eh :) Funny story :)

 

By the way, don't take it badly, Eckhard: those magic comments are not the real reason. It's just that I'm not able to understand the single-cycle delay routine, which is the real "magic" of this piece of code, and I'm very impressed. Everything I'm playing around with to do the same relies either on huge tables or on huge computations, so when I see (without understanding) that you sync your code with a little table of $c9... I'm amazed...

 

I'll try to find Jim's post, maybe it will help.

But I guess I first need to have a working version with those drawbacks to understand how you did it.



This is Jim's original post:

http://www.biglist.c...3/msg00160.html

 

And this post illustrates the timing of the delay routine a little better:

http://www.biglist.c...3/msg00162.html


There's a divide by 3 in this topic. The page it references is gone, but you can get it via Wayback Machine.

 

I have a more optimized solution I discovered:

 

;---------------------------------------
;  Divide by 3
;  18 bytes, 30 cycles
;---------------------------------------
 sta  temp
 lsr
 adc  #21
 lsr
 adc  temp
 ror
 lsr
 adc  temp
 ror
 lsr
 adc  temp
 ror
 lsr
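
A brute-force check of a routine like this is cheap to write. The following is a sketch, not part of the original routine: it wraps the code above as a subroutine and tests that n - 3*q is 0, 1 or 2 for every n from 0 to 255. DivideBy3, CheckAll and quot are names made up for the sketch.

```asm
        processor 6502

        seg.u vars              ; uninitialised zero-page RAM
        org $80
temp    ds 1                    ; scratch byte used by the divide routine
quot    ds 1                    ; hypothetical: holds q, then 3*q

        seg code
        org $F000

DivideBy3:                      ; the routine quoted above, plus an RTS
        sta  temp
        lsr
        adc  #21
        lsr
        adc  temp
        ror
        lsr
        adc  temp
        ror
        lsr
        adc  temp
        ror
        lsr
        rts

; Returns C=0 if every n in 0..255 passes, else C=1 with X = failing n
CheckAll:
        cld                     ; the ADCs assume binary mode
        ldx  #0
.next   txa
        jsr  DivideBy3          ; A = q, the claimed n/3
        cmp  #86
        bcs  .fail              ; q above 85 can never be right
        sta  quot
        asl                     ; 2*q (no overflow, q <= 85)
        clc
        adc  quot               ; 3*q
        sta  quot
        txa
        sec
        sbc  quot               ; n - 3*q
        bcc  .fail              ; borrow: q was too large
        cmp  #3
        bcs  .fail              ; remainder above 2: q was too small
        inx
        bne  .next              ; loop until X wraps back to 0
        clc                     ; all 256 inputs passed
        rts
.fail   sec
        rts
```

The same harness can be pointed at a routine that is only meant to be accurate for a smaller range by stopping the loop at that limit instead of letting X wrap.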

 

 

At some point I will compile all of the solutions I have found into a single file, as I have discovered some more since that topic. Circus Atariage currently makes use of the divide by 18, and the divide by 3.


30 cycles !!!! Whoo maybe I'll be able to get rid of my huge table !!!!

 

Oh by the way it works, I can move my sprite where I want ! I discovered that a scanline routine must be aligned..... what a stupid bug !

 

I'm happy: it's far from optimized and consumes a lot of RAM and ROM, but it's mine :)

Let's optimize it now !

 

EDIT : added omegamatrix divide3 routine !!!! It roxx !

source8b.asm

source8b.bin

Edited by shazz

Replace your macro.h file with this one:

macro.h.zip

 

Then change PosObject to be this:

PosObject:   ; A holds X value
       sec ; X holds object, 0=P0, 1=P1, 2=M0, 3=M1, 4=Ball
       sta WSYNC
DivideLoop
       sbc #15        ; 2
       sbcs DivideLoop; 2    4
       eor #7         ; 2    6
       asl            ; 2    8
       asl            ; 2   10
       asl            ; 2   12
       asl            ; 2   14
       sta.wx HMP0,X  ; 5   19
       sta RESP0,X    ; 4   23 <- set object position
SLEEP12 rts            ; 6   29

 

The sbcs is a macro that drops in the bcs instruction and also validates that the branch stays within the same page. If it doesn't, the build fails with an error message.

 

There's a version of the macro for all 8 branch instructions, and another set of macros prefixed with d for branches that must go to a different page (for a 4-cycle branch).

 

Don't use the macros for every branch though, only for those with critical timing.
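
For readers without the attachment, a checked branch of this kind can be written in a few lines of DASM. This is only a guess at the general shape, not the contents of macro.h, and SAMEPAGE_BCS is a made-up name. It emits the bcs, then compares the page of the address following the branch with the page of the target and aborts the build if they differ.

```asm
        processor 6502

        MAC SAMEPAGE_BCS        ; hypothetical name, not from macro.h
        bcs {1}
        IF >. != >{1}           ; page after the branch vs page of the target
            ECHO "ERROR: branch to", {1}, "crosses a page boundary"
            ERR
        ENDIF
        ENDM

        seg code
        org $F000
DivideLoop:
        sbc #15
        SAMEPAGE_BCS DivideLoop ; branch and target both sit in page $F0
```

The check runs after the bcs has been emitted because the 6502 adds the extra cycle when the target is on a different page from the instruction that follows the branch.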


Thanks Darrel! Yes, very useful!!! (Strange, my macro.h is v106 but has different and fewer macros.)

In my case, though, modifying one line of my code made the BPL of my scanline loop cross a page boundary, so 77 cycles!!!! Unfortunate, but I learned :)

 

So with your help and omegamatrix, I really improved the code, no more RAM routine generation ! I use a delay table per cycle (and not per 2 cycles anymore)

 

Starts to look good ! I'm proud (of you :D) !

Now I can use the RAM for something else !

source8e.bin

source8e.asm

Edited by shazz

 

At some point I will compile all of the solutions I have found into a single file, as I have discovered some more since that topic. Circus Atariage currently makes use of the divide by 18, and the divide by 3.

 

I think it would be useful if you retained those routines that you could qualify for lesser accuracy.

I mean, if you have e.g. a divide by 3 that's accurate for dividends 0..x, where x is less than 255, that could still be useful even if it fails for numbers > x, as long as you know your dividend is always going to be <= x.

