
Problem using 2nd DLI



I have multiple DLIs on my "to learn" list, and thanks to the people who have sent me some of their code to study (obviously I need to do this)... but... I thought I'd try adding a 2nd DLI on the lower status bar of my game. When I do, the game crashes, trying to run code from an incorrect location.

The DLI pushes A/X/Y to the stack, then I have two routines; each ends by setting VDSLST to the other one, then pulls A/X/Y before the RTI.

 

Single DLI works fine. Displist also works fine.

 

First DLI set at top of screen, 2nd where the background turns grey:

post-19705-0-95575300-1525020139.png

What am I doing wrong?

Thanks :dunce:


You wrote:

push A/X/Y and pull A/X/Y

Are you doing it in this order?

 

You must push:

pha 
txa
pha
tya
pha

and pull:

pla
tay
pla
tax
pla

The best way, of course, is to show us the source code of your display list interrupt.

Yes I'm doing exactly this - single DLI works fine.

I work on XE so would have to get MAC on emulator to show code (will do tomorrow)

 

roughly this order of events:

flag to stop DLI running other code - DLI then just pushes pulls quits

point to first DLI

switch DL (has two lines with DLI set)

enables DLI (crashes here)

 

ok I type here....

 

DLICODE1

colour changes etc then

LDA # <DLICODE2

STA VDSLST

LDA # >DLICODE2

STA VDSLST+1

JMP QDLICODE

 

DLICODE2

colour changes etc then

LDA # <DLICODE1

STA VDSLST

LDA # >DLICODE1

STA VDSLST+1

JMP QDLICODE

 

QDLICODE

pulls A/X/Y

RTI

 

HA - as I've typed this it's occurred to me that the 2nd part is not pushing onto the stack, so it's then pulling incorrect values back as it exits - ok, will see if that's it tomorrow and report back!!

Thanks guys ;)


The problem with 2 or more DLIs is that you might be doing the task for one in the wrong place. What helps is using a VBlank Immediate routine that sets the first vector, initializes indexes etc (e.g. for colour changes it's often easy to use tables then just increment an index along the way).

 

Another method could be to check VCOUNT though it does add some time. Compare to the lowest expected value, then BCC to a section that initializes the variables used during the frame.
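For anyone following along, the VBlank approach above might look roughly like this. It is only a sketch: VBI and InterruptNumber are made-up names, DLICODE1 is the first handler from earlier in the thread, and the addresses are the standard OS equates (worth checking against your own equates file).

```asm
VDSLST  = $0200         ; OS display list interrupt vector
SETVBV  = $E45C         ; OS routine that installs a VBI vector
SYSVBV  = $E45F         ; exit point for an immediate VBI
InterruptNumber = $CB   ; hypothetical free page zero byte

; Install the immediate VBI once, during setup
        ldy #<VBI       ; low byte of the routine in Y
        ldx #>VBI       ; high byte in X
        lda #6          ; 6 = immediate VBI, 7 = deferred
        jsr SETVBV

; Runs every frame, before the first DLI can fire
VBI
        lda #<DLICODE1  ; point VDSLST back at the first handler
        sta VDSLST
        lda #>DLICODE1
        sta VDSLST+1
        lda #0          ; restart any per-frame index (assumed variable)
        sta InterruptNumber
        jmp SYSVBV      ; carry on with the OS vertical blank code
```

Resetting the vector here means a missed or late DLI can only upset one frame, since the next vertical blank puts everything back in step.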


...

 

HA - as I've typed this it's occurred to me that the 2nd part is not pushing onto the stack, so it's then pulling incorrect values back as it exits - ok, will see if that's it tomorrow and report back!!

Thanks guys ;)

It was that - It now works.

 

 

The problem with 2 or more DLIs is that you might be doing the task for one in the wrong place. What helps is using a VBlank Immediate routine that sets the first vector, initializes indexes etc (e.g. for colour changes it's often easy to use tables then just increment an index along the way).

 

Another method could be to check VCOUNT though it does add some time. Compare to the lowest expected value, then BCC to a section that initializes the variables used during the frame.

 

thanks rybags :thumbsup: I am using VCOUNT for some of the colour effects so I was pondering about also using it in the 2nd way you mention. I have some other examples to study from NRV and the Rain of Terror guys, heck the danger then is I think of something else to do with my game instead of getting what I have already finished in time :-o


The nice thing about manipulating VDSLST in readiness for the next interrupt is that it's done after the DLI code performs colour changes, etc (the timing-critical parts). As long as the interrupts aren't too closely placed, you have much greater leeway with what happens after you've finished writing to the hardware registers, so it's often best to opt for the fastest entry method even if the exit method is slower as a result. You can also - by trading off the expense of pushing multiple registers - use one of the index registers to access a look-up table of DLI addresses:

	inc InterruptNumber
	ldx InterruptNumber
	lda TableLo,x
	sta VDSLST
	lda TableHi,x
	sta VDSLST+1
	pla
	tax
	pla
	rti


TableLo
	.byte <DLI1, <DLI2, ...
TableHi
	.byte >DLI1, >DLI2, ...

If the last entry in the table is the address of the first DLI, VDSLST will automatically be set up for the first interrupt on the next frame, although you can reset InterruptNumber and VDSLST in the VBI as an extra insurance that things stay in sync if you really need to.

 

You can also prime all three CPU registers with values ahead of a write to WSYNC if things are really tight, and then just store A, X and Y in the required hardware registers.

 

About the only drawback with the VDSLST manipulation approach is that each entry point needs a copy of the register pushing code.
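The WSYNC trick mentioned above could be sketched like this. The colour values and the choice of registers are just examples, not anything from the poster's own code; the hardware addresses are the usual ANTIC/GTIA ones.

```asm
WSYNC   = $D40A         ; ANTIC: a write halts the CPU until horizontal blank
COLPF1  = $D017         ; GTIA colour registers
COLPF2  = $D018
COLBK   = $D01A

DLI
        pha             ; save all three registers on entry
        txa
        pha
        tya
        pha
        lda #$0E        ; load the three values first (example colours only)
        ldx #$94
        ldy #$00
        sta WSYNC       ; wait for the end of the current scan line
        sta COLPF1      ; then three back-to-back stores, 4 cycles each
        stx COLPF2
        sty COLBK
        pla             ; restore in reverse order
        tay
        pla
        tax
        pla
        rti
```

With everything preloaded, no loads have to happen between the WSYNC release and the register writes, which is where the time is tightest.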


If they are tightly packed together then the DLIs can share a common pull/exit sequence.

Typically the last thing you do is load the high byte of the next DLI's address. As long as that routine isn't in page zero the value is non-zero, so the Z flag is clear (STA doesn't change it) and you can use a branch (BNE) rather than a jump.

So expanding on what was written in post #6:

DLICODE1
  push regs
  colour changes etc then
  LDA #<DLICODE2
  STA VDSLST
  LDA #>DLICODE2
  STA VDSLST+1
  BNE QDLICODE

DLICODE2
  push regs
  colour changes etc then
  LDA #<DLICODE1
  STA VDSLST
  LDA #>DLICODE1
  STA VDSLST+1
  BNE QDLICODE

QDLICODE
  pulls regs
  RTI

If branching distances become an issue, move QDLICODE into the middle, between the DLI routines :)

 

Similarly, some time can be saved on the pushes/pulls: we tend to know that the DLIs aren't going to interrupt each other, so we can use self-modifying code.

So a macro for the setup can store the register values directly into the operand bytes of the immediate loads in the exit code:

.MACRO dli_entry
  STY QDLICODE+1
  STX QDLICODE+3
  STA QDLICODE+5
.ENDM

DLICODE1
  dli_entry
  colour changes etc then
  LDA #<DLICODE2
  STA VDSLST
  LDA #>DLICODE2
  STA VDSLST+1
  BNE QDLICODE

QDLICODE
  LDY #0
  LDX #0
  LDA #0
  RTI

DLICODE2
  dli_entry
  colour changes etc then
  LDA #<DLICODE1
  STA VDSLST
  LDA #>DLICODE1
  STA VDSLST+1
  BNE QDLICODE

Edited by Wrathchild

If you have 3 bytes left in ZP you can store A, X and Y there; it's even faster than the self-modifying code example above. I use it all the time (i.e., once every couple of years).
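A minimal sketch of the page zero version, assuming $CB-$CD are free in your program (SaveA/SaveX/SaveY are made-up names):

```asm
SaveA   = $CB           ; hypothetical free page zero bytes
SaveX   = $CC
SaveY   = $CD

DLI
        sta SaveA       ; 3 cycles each, 9 cycles on entry
        stx SaveX
        sty SaveY
        ; colour changes etc. go here
        ldy SaveY       ; 3 cycles each, 9 cycles on exit
        ldx SaveX
        lda SaveA
        rti
```

Entry costs 9 cycles here, against 12 for three absolute stores in the self-modifying version and 13 for PHA/TXA/PHA/TYA/PHA, and entry is the part that matters most for reaching the hardware registers quickly. As with the self-modifying version, this only works if DLIs never interrupt each other.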

Edited by pirx

I love these threads - there are so many ways of doing even the most simple of things on this cool little machine :)

 

Yeah, threads like this one explain why so many projects never get finished. Even outsiders get pointed toward a completely wrong decision.

There are far more relevant things to put in page 0, particularly math routines or sprite multiplexing.

But, who cares...


I love these threads - there are so many ways of doing even the most simple of things on this cool little machine :)

 

Yes, this is really useful. The great thing with assembler is you just think "I'll do that, hmm, I'll do that" and off you go; going from programming in BASIC to having all the RAM to play with really opens things up.

 

Unless any of the US magazines covered the more intermediate to advanced assembler stuff, I think there is a lot of information here that could make a good read for people (like me) who are "progressing". I have a bunch of books but they only go so far. It would be great to have more of this pulled together by topic somewhere, filling the gap between the existing books and commercial-quality games.

 

I've started writing an assembler column for a future Pro© Mag (if we get another one) or perhaps Excel mag to get people started / interested - but releases are few and far between so it will take ages for this to get anywhere.

 

So storing the registers using fewer instructions, as discussed, gives your first DLI code more time to operate: it takes effect sooner and gets more done in the time available. Perhaps at some point I will need this!

 

I'm getting an issue with the 2 DLIs getting out of sync: in my example screenshot the grey background is sometimes on the top half. I am looping on WSYNC in the first DLI and thought it might have run past the point where the 2nd DLI fires, but it's not that. Will take a look next session and use the VCOUNT check to switch between them, as discussed, if needed.
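One way the VCOUNT check could look, as a rough sketch only: both DLI lines use the same handler, which decides what to do from the current line, so VDSLST never changes and the two halves cannot swap. SPLIT and the colour values are made up; VCOUNT counts in units of two scan lines.

```asm
VCOUNT  = $D40B         ; ANTIC vertical line counter (scan line / 2)
COLBK   = $D01A         ; GTIA background colour
SPLIT   = 50            ; hypothetical VCOUNT value between the two DLI lines

DLI
        pha             ; only A is used, so only A is saved
        lda VCOUNT
        cmp #SPLIT
        bcs Lower       ; at or below the split: status bar section
        lda #$00        ; top of screen: example colour
        sta COLBK
        pla
        rti
Lower
        lda #$06        ; status bar: grey background
        sta COLBK
        pla
        rti
```

The compare and branch add a few cycles at the start of every interrupt, which is the cost mentioned earlier in the thread.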

 

p.s. emkay re Page 0 - I'm not at the point of using Sprite Multiplexing (or any serious math). I was delighted to have a single split of the players for my status bars in Ramp Rage and Space Fortress Omega ;)

Edited by therealbountybob

p.s. emkay re Page 0 - I'm not at the point of using Sprite Multiplexing (or any serious math). I was delighted to have a single split of the players for my status bars in Ramp Rage and Space Fortress Omega ;)

 

The point is: you should look at what the program does most often and what is most CPU intensive. That is what belongs in page 0, not some DLIs...


You have (at least) 128 bytes of page zero to do with as you please if you don't use the FP package. You can use the whole lot if you don't need the OS. I think DLIs - especially if time-critical - constitute a perfectly reasonable use of ZP for the purpose of storing registers. You're talking about "what is the most CPU intensive". Let's say a DLI fires twenty-four times per frame. That's 1,200 times a second on a PAL machine. What's your measure of CPU intensive? :)


Yeah, threads like this one explain why so many projects never get finished. Even outsiders get pointed toward a completely wrong decision.

There are far more relevant things to put in page 0, particularly math routines or sprite multiplexing.

But, who cares...

--- Deleted --- (has to be language barrier)

OK - I got into a bit of a mess here (on this forum) from "posting before thinking" syndrome, meaning I am often an asshole who speaks before thinking.

 

So here, I want to ask emkay an honest question. This is a case of people discussing fun techniques for making "better" code. You come in mid thread and say it's "wasted effort". My initial reaction was "why - we're not using Graphics 7".

 

I think this is an issue of where we mis-communicate on here. I want to be more conscious of perhaps offending people when I post. So - I am asking honestly. Why did this post warrant your reply emkay? I think we often try to convey the same message. We all want to get the most out of our precious little 40 year old machines. I think we perhaps "fight" because we argue over petty details, not realizing we all want the same goals.

 


I think this is an issue of where we mis-communicate on here. I want to be more conscious of perhaps offending people when I post. So - I am asking honestly. Why did this post warrant your reply emkay? I think we often try to convey the same message. We all want to get the most out of our precious little 40 year old machines. I think we perhaps "fight" because we argue over petty details, not realizing we all want the same goals.

Because you can read that in every "solution": "Use page 0"?

For DLIs in particular, it is not really useful.

ANTIC steals cycles right where DLIs are needed. Depending on the command, the CPU may still be doing something while RAM access is halted.

Outside the DMA range, the CPU can run at full speed (except for RAM refresh, as we know). Within the DMA range the loss or gain is rather pointless.

Calculations, sorting routines, etc. would benefit more from page 0, running outside the DMA range (or even in the VBI).

 

At post #14 the best answer was given already....


If we're going to qualify the suggestions already offered by actual active programmers, let's not lose sight of the fact that getting into the register changes often has to happen as quickly as possible. In that case, you want to store any required registers as efficiently as possible, using the fewest machine cycles. TXA, PHA is a five cycle sequence. STX ABS is a four cycle instruction. STX ZP takes three cycles. DMA overheads, etc, are not going to cancel out savings afforded by ZP usage.

 

Again: you have plenty of ZP space. No-one is dictating that ZP should be used by DLIs any more than anyone - aside from yourself - is saying that it should not. If the developer requires extra cycles and ZP is available or can be made available without adversely affecting other parts of the program, then have at it. ZP references are also more compact (1 byte shorter than ABS and of equal size to the TXA/PHA sequence), and performing a pass through software late in the development phase, deliberately moving absolute references to unused ZP addresses can save valuable space and further improve performance.

 

It amazes me that anything negative can be derived from such a positive thread (several developers, all offering their views based on their experience), but perhaps we should not be surprised.

Edited by flashjazzcat

I am enjoying this thread as an exercise in seeing how many ways this can be done. When I was trying to learn how to use DLIs, every example from the old magazines and De Re Atari pushed A, X and Y to the stack at entry and pulled them back right before the RTI. Even if you aren't using ZP, the cycle savings in this thread vs what you find in those old sources are invaluable. By my count, storing X and loading it back vs transferring it via the stack is 8 cycles vs 11 cycles:

; Save X at beginning

STX XTemp ; 4 cycles

; Load X before RTI

LDX XTemp ; 4 cycles

Total: 8 Cycles and 6 bytes of memory (placing XTemp in ZP would be 6 cycles and 4 bytes of Memory)

 

VS

 

; Save X at beginning

TXA ; 2 cycles

PHA ; 3 cycles

; Load X before RTI

PLA ; 4 cycles

TAX ; 2 cycles

 

 

Total: 11 cycles and 4 bytes of memory.

 

my cycle counting source is: https://www.atariarchives.org/alp/appendix_1.php

 

BTW, I think that if you did put XTemp in ZP, you could also use it in your VBI for temp storage for other purposes. When I say "temp storage" I am referring to a value that only is used during the same instance of the DLI/VBI in which it was created. Also, using XTemp outside of the DLI\VBI could be disastrous because an interrupt could happen at any time and use the wrong value. If my thinking is wrong on this, please let me know.
