Bankswitch per jmp/jsr/rts


33 replies to this topic

#1 SvOlli OFFLINE  

SvOlli

    Chopper Commander

  • 179 posts
  • Location:Hannover, Germany

Posted Wed May 1, 2013 12:24 PM

I just wanted to share this, as it has been in my head for a couple of days now, without any concrete use.

Jac! inspired me to write some code for almost transparent bankswitching that works with jmp and even jsr/rts.

First, here's the code for F4 bankswitching, which, of course, needs to be present at the same location in each bank:
.define HOTSPOTS $FFF4

.macro bankjsr _addr
  lda #>(_addr-1)
  ldy #<(_addr-1)
  jsr bankjmpcode
.endmacro

.macro bankjmp _addr
  lda #>(_addr-1)
  ldy #<(_addr-1)
  jmp bankjmpcode
.endmacro

.macro bankrts
  jmp bankrtscode
.endmacro

bankjmpcode:
  pha
  asl
  rol
  rol
  rol
  and #%00000111
  tax
  tya
  pha
  lda HOTSPOTS,x
  rts

bankrtscode:
  tsx
  lda $02,x
  asl
  rol
  rol
  rol
  and #%00000111
  tax
  lda HOTSPOTS,x
  rts
Disadvantages: bankjmp/bankjsr destroy all registers; bankrts leaves only Y intact.
Advantages: very flexible; bankrts also works when not called via bankjsr; no fixed RAM address is needed for the jmp destination.

For F6/F8 bankswitching, just reduce the number of "rol"s and the number of bits in the "and"s (and point HOTSPOTS at the first hotspot of that scheme).
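
To see why the asl/rol/and sequence picks the bank, here is a small Python model of it. This is only a sketch, not part of the original code: it assumes plain 8-bit 6502 shift semantics, and the helper names `asl`, `rol` and `bank_index` are made up for the illustration. `bank_bits` is 3 for F4, 2 for F6 and 1 for F8, matching the reduced "rol" and "and" counts.

```python
def asl(a):
    """ASL A: shift left; the old bit 7 ends up in the carry."""
    return (a << 1) & 0xFF, (a >> 7) & 1


def rol(a, carry):
    """ROL A: rotate left through the carry flag."""
    return ((a << 1) | carry) & 0xFF, (a >> 7) & 1


def bank_index(high_byte, bank_bits=3):
    """Model of: asl, then bank_bits times rol, then and #mask."""
    a, carry = asl(high_byte)
    for _ in range(bank_bits):
        a, carry = rol(a, carry)
    return a & ((1 << bank_bits) - 1)


# The sequence moves the top bank_bits bits of the high byte down to bit 0.
for high in range(256):
    assert bank_index(high, 3) == high >> 5  # F4: 8 banks
    assert bank_index(high, 2) == high >> 6  # F6: 4 banks
    assert bank_index(high, 1) == high >> 7  # F8: 2 banks

print(bank_index(0xF0))  # code assembled for $Fxxx selects hotspot index 7
```

In other words, the high byte of the target (or return) address names the bank, so with F4 each bank presumably has to be assembled for its own 8K-aligned address window ($1000, $3000, ... $F000 would be one such layout).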

#2 Gemintronic ONLINE  

Gemintronic

    Jason S. - Lead Developer & CEO

  • 8,837 posts

Posted Wed May 1, 2013 12:34 PM

Can the NES MMC chip strategy work on the Atari 2600? Wouldn't that make extra memory and bankswitching simpler at the cost of extra components?

#3 Thomas Jentzsch OFFLINE  

Thomas Jentzsch

    Thrust, Jammed, SWOOPS!, Boulder Dash, THREE·S, Star Castle

  • 22,729 posts
  • Always left from right here!
  • Location:Düsseldorf, Germany, Europe, Earth

Posted Thu May 2, 2013 8:44 AM

:idea: For Boulder Dash (and Star Castle) we use a macro when defining a subroutine. This macro defines a variable which contains the bank "<name>_BANK".

So when you call the subroutine, the macro generates the bank variable name from the subroutine name and then determines the bank from that variable.

I think if you follow that path, your code could become a bit simpler.

#4 SvOlli OFFLINE  

SvOlli

    Chopper Commander

  • Topic Starter
  • 179 posts
  • Location:Hannover, Germany

Posted Thu May 2, 2013 11:59 AM

For Boulder Dash (and Star Castle) we use a macro when defining a subroutine. This macro defines a variable which contains the bank "<name>_BANK".

So when you call the subroutine, the macro generates the bank variable name from the subroutine name and then determines the bank from that variable.

Also an interesting idea. It's like one of the golden rules of C/C++ programming: try to move as many of the calculations to compile-time instead of run-time.

I want to use that kind of bankswitching for my next (8k-or-bigger) demo.

Right now I'm using a jump-table, something like this:
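jumptable:
  .word part1-1, part2-1, part3-1, part4-1, reset
[...]
mainloop:
  jsr vblank
  lda index
  asl               ; two bytes per table entry
  tax
  lda jumptable+1,x ; push the high byte of (target-1) first
  pha
  lda jumptable,x   ; then the low byte
  pha
  rts               ; rts pulls the address and adds 1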
This kind of code can easily be extended to bankswitch the way I introduced here. The rest is just sugar coating, but I like the idea that jsr and rts work almost as expected. Best part is: the bankswitching rts code also works when not called via the bankswitching jsr.

#5 roland p OFFLINE  

roland p

    River Patroller

  • 2,395 posts
  • $23
  • Location:The Netherlands

Posted Thu May 2, 2013 12:14 PM

I hardcode all jumps so they are fast. The following fragment is in bank 3 and jumps to bank 4.

End of bank 3:

    org $3FDF
    rorg $FFDF

JMP_KERNEL_3
    nop BANK4       ; 3-byte nop, the read of BANK4 triggers the switch
JMP_KERNEL_3_BORDER_RIGHT_1
    nop BANK4
JMP_KERNEL_3_BORDER_RIGHT_2
    nop BANK4

End of bank 4:

    org [[JMP_KERNEL_3+3] & $0FFF + $4000]
    rorg [JMP_KERNEL_3+3]
    JMP KERNEL_3
    JMP KERNEL_3_BORDER_RIGHT_1
    JMP KERNEL_3_BORDER_RIGHT_2

In bank 3 I want to jump to KERNEL_3 in bank 4, so I use JMP JMP_KERNEL_3 there. It jumps to the nop, which triggers the bankswitch; bank 4 then becomes active and executes the JMP KERNEL_3 that sits at the address right after the nop.

Maybe the code can be cleaner, I just hacked around :) It has to be expanded if you want to rts back.

Edited by roland p, Thu May 2, 2013 12:16 PM.


#6 Mr SQL OFFLINE  

Mr SQL

    Stargunner

  • 1,746 posts

Posted Thu May 2, 2013 1:34 PM

SvOlli,
you might like the bankswitching model I've implemented in my ASDK Framework; it's incredibly simple, all proxy stubs, so there are no cycles to count.

I'll be teaching the method presently on my Learn Assembly thread but you'll get it just by looking at it :)

#7 Mr SQL OFFLINE  

Mr SQL

    Stargunner

  • 1,746 posts

Posted Thu May 2, 2013 2:55 PM

SvOlli,
I just added the chapter, you should read it just for fun - I guarantee you'll get a kick out of it :)

http://atariage.com/...k/#entry2746526

#8 SvOlli OFFLINE  

SvOlli

    Chopper Commander

  • Topic Starter
  • 179 posts
  • Location:Hannover, Germany

Posted Thu May 2, 2013 3:24 PM

SvOlli,
I just added the chapter, you should read it just for fun - I guarantee you'll get a kick out of it :)

Literately! ;)

Looking at your code, you pay an extra 2 * 7 bytes per bankswitching jsr/rts combination. Mine is 24 bytes in total + 6 bytes extra per jsr/rts. This means that at 3 calls both implementations use an equal amount of ROM. Starting with 4 calls, mine needs less space.
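
The break-even point can be checked with a few lines of Python. This only restates the byte counts quoted above (2 * 7 bytes per call pair for the proxy stubs, 24 bytes once plus 6 per call pair for the shared routine); the function names are made up for the illustration.

```python
def proxy_stub_bytes(calls):
    """ROM overhead when every jsr/rts pair costs 2 * 7 extra bytes."""
    return 2 * 7 * calls


def shared_routine_bytes(calls):
    """ROM overhead of a 24-byte shared routine plus 6 extra bytes per pair."""
    return 24 + 6 * calls


# Print the overhead of both approaches for 1 to 5 bankswitched calls.
for calls in range(1, 6):
    print(calls, proxy_stub_bytes(calls), shared_routine_bytes(calls))
```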

#9 Mr SQL OFFLINE  

Mr SQL

    Stargunner

  • 1,746 posts

Posted Thu May 2, 2013 4:21 PM

Literately! ;)

Looking at your code, you pay an extra 2 * 7 bytes per bankswitching jsr/rts combination. Mine is 24 bytes in total + 6 bytes extra per jsr/rts. This means that at 3 calls both implementations use an equal amount of ROM. Starting with 4 calls, mine needs less space.

Excellent points SvOlli, both models have their advantages and I would definitely switch to yours with a large number of calls when hitting space constraints!

You've got rolls in your code while I have action figures, but I'm surprised none of the advanced coders rolled their eyes at my previous chapters where I first avoid bitwise operations and then neatly dispose of Hexadecimal notation by submerging it in a bath of liquid Helium :)

Edited by Mr SQL, Thu May 2, 2013 4:21 PM.


#10 Omegamatrix OFFLINE  

Omegamatrix

    Quadrunner

  • 6,125 posts
  • Location:Canada

Posted Thu May 2, 2013 5:57 PM

Here is another way to bankswitch. I discovered it long ago, when a program of mine kept crashing intermittently. It involves branching from page $FF to page $FE when the low byte of the branch target is also the low byte of a hotspot. Since the high byte of the address gets updated afterwards, the hotspot gets triggered. I jump between all 4 banks at the beginning of overscan. You will have to open the debugger to take a look. Nothing but a green screen otherwise. ;)


Attached File  TestBank.zip   1.21KB   183 downloads
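
A rough Python model of why this works, assuming the documented NMOS 6502 behaviour that a taken branch crossing a page performs one extra read while the high byte of the program counter is not yet corrected. The function name and the example addresses are hypothetical and are not taken from TestBank.zip.

```python
def taken_branch_reads(pc_after_branch, offset):
    """Return (dummy_read, target) for a taken branch.

    pc_after_branch: address of the instruction following the branch.
    offset: signed 8-bit displacement (-128..127).
    dummy_read is None when no page boundary is crossed.
    """
    target = (pc_after_branch + offset) & 0xFFFF
    # The low byte is updated first; the high byte still holds the old page.
    unfixed = (pc_after_branch & 0xFF00) | (target & 0x00FF)
    dummy_read = unfixed if unfixed != target else None
    return dummy_read, target


dummy, target = taken_branch_reads(0xFF12, -28)
print(hex(dummy), hex(target))  # 0xfff6 0xfef6
```

In this example the branch lands at $FEF6, but on the way there the bus sees a read of $FFF6, which is a hotspot address in a 16K (F6) cartridge.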

#11 Thomas Jentzsch OFFLINE  

Thomas Jentzsch

    Thrust, Jammed, SWOOPS!, Boulder Dash, THREE·S, Star Castle

  • 22,729 posts
  • Always left from right here!
  • Location:Düsseldorf, Germany, Europe, Earth

Posted Fri May 3, 2013 3:24 AM

Also an interesting idea. It's like one of the golden rules of C/C++ programming: try to move as many of the calculations to compile-time instead of run-time.

Yup! And especially relevant for a console with so little CPU power.

#12 Thomas Jentzsch OFFLINE  

Thomas Jentzsch

    Thrust, Jammed, SWOOPS!, Boulder Dash, THREE·S, Star Castle

  • 22,729 posts
  • Always left from right here!
  • Location:Düsseldorf, Germany, Europe, Earth

Posted Fri May 3, 2013 3:27 AM

And in this kind of code can be easily extended to bankswitch the way I introduced here. The rest is just sugar coating, but I like the idea, that jsr and rts work almost like expected. Best part is: the bankswitching-rts code does also work, when not called using bankswitch-jsr.

You should have a look at my Star Castle code (link in my blog). There the task scheduler calls subroutines which then return to the scheduler. Or, to make things a bit more efficient (and complicated :)) directly continue with the next task.

#13 Andrew Davie OFFLINE  

Andrew Davie

    Stargunner

  • 1,782 posts
  • Dr.Boo
  • Location:Tasmania

Posted Fri May 3, 2013 6:22 AM

Per Thomas's comment, here's the basics of bankswitching under Boulder Dash™...
It's pretty simple, and fairly elegant to use...
First, the macro for defining 'subroutines'. Every routine that lives in a bank is defined using this macro...

                MAC DEFINE_SUBROUTINE       ; name of subroutine
BANK_{1}        = _CURRENT_BANK             ; bank in which this subroutine resides
                SUBROUTINE                  ; keep everything local
{1}                                         ; entry point
                ENDM


Here's the actual usage...


                DEFINE_SUBROUTINE ProcessExplosion
                ; contents...
                rts



and here's how we call it...

			 lda #BANK_ProcessExplosion
			 sta SET_BANK
			 jsr ProcessExplosion

So we can easily move the subroutine somewhere else into a different bank -- basically just cut/paste it to anywhere, and all the code gets it right. The bank is correct.
This also makes it very easy to create vector tables, because the BANK_label is automatically calculated for all subroutines. Easy.

Just one addition: The banks themselves are also 'auto-calculated' using the following macro...

                MAC NEWBANK                 ; bank name
                SEG {1}
                ORG ORIGIN
                RORG $F000
BANK_START      SET *
{1}             SET ORIGIN / 2048
ORIGIN          SET ORIGIN + 2048
_CURRENT_BANK   SET {1}
                ENDM



Cheers
A

#14 Mikes360 OFFLINE  

Mikes360

    Chopper Commander

  • 117 posts
  • Location:Mansfield, United Kingdom

Posted Fri Dec 13, 2013 2:52 AM

			 lda #BANK_ProcessExplosion
			 sta SET_BANK
			 jsr ProcessExplosion

This looks very nice but I am having trouble understanding the above. Is #BANK_ProcessExplosion the address of a bankswitch hotspot $1FF8 for example?

 

What is SET_BANK doing? The name implies it is actually doing the bankswitch.

 

When a bankswitch happens, I thought you ended up in the same place but in the new bank, so how can you guarantee jsr ProcessExplosion is in the correct place in the other bank?



#15 SeaGtGruff OFFLINE  

SeaGtGruff

    Quadrunner

  • 5,558 posts
  • Location:Georgia, USA

Posted Fri Dec 13, 2013 9:08 AM

Here's a method I use:

 

   MAC GOTO
   LDY #[(>{0}&$F0)-$10]/$20
   LDX #<{0}
   LDA #>{0}
   JMP Switch_and_Go
   ENDM
 
   MAC GOSUB
   LDY #[(>.&$F0)-$10]/$20
   STY Return_Bank
   LDY #[(>{0}&$F0)-$10]/$20
   LDX #<{0}
   LDA #>{0}
   JSR Switch_and_Go
   ENDM
 
   MAC RETURN
   JMP Switch_and_Return
   ENDM
 
Target = $80 ; or whatever
Target_Lo = Target
Target_Hi = Target+1
Return_Bank = $82 ; or whatever
 
; Then this goes just before the bankswitching hotspots, in all banks
 
Switch_and_Go
   STX Target_Lo
   STA Target_Hi
   LDA Select_Bank,Y
   JMP (Target)
 
Switch_and_Return
   LDY Return_Bank
   LDA Select_Bank,Y
   RTS
 
Select_Bank ; hotspots
   HEX FF FF FF FF ; this is for 16K ROMs

 

This assumes that each 4K bank has its own address range, with bank 0 at $1000, bank 1 at $3000, bank 2 at $5000, etc. To use:

 

   ; if you don't care about coming back to the original bank
   GOTO Whatever ; the GOTO macro will figure out what bank to switch to based on the address of Whatever
 
   ; or, if you want to return to the original bank
   GOSUB Whatever ; the GOSUB macro will figure out which bank you're currently in and save it, then figure out which bank you want to switch to
 
   ; to return from the GOSUB
   RETURN

 

I don't know if this is the "best" or "fastest" or "smallest" method, but the idea is to be as "transparent" as possible-- let the macros do the work of figuring out the bank numbers. The downside is that it doesn't allow for multiple levels of GOSUB calls, since there's only one Return_Bank variable. Also, it destroys whatever was in A, X, and Y.
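
The bank expression used in the GOTO/GOSUB macros can be checked with a short Python sketch. It assumes the layout described above (bank 0 at $1000, bank 1 at $3000, and so on) and integer division; `bank_of` is a made-up name.

```python
def bank_of(address):
    """Python version of the macro expression [(>addr & $F0) - $10] / $20."""
    high = (address >> 8) & 0xFF
    return ((high & 0xF0) - 0x10) // 0x20


# Print the bank number for one sample address in each of the four banks.
for address in (0x1000, 0x3ABC, 0x5FFF, 0x7000):
    print(hex(address), bank_of(address))
```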



#16 Mikes360 OFFLINE  

Mikes360

    Chopper Commander

  • 117 posts
  • Location:Mansfield, United Kingdom

Posted Mon Dec 16, 2013 8:17 AM

This method looks great, thanks for sharing it. I am also looking for a "transparent" approach and this is a very good example. I assume Andrew's method works with switching only the last 2k of ROM?



#17 SvOlli OFFLINE  

SvOlli

    Chopper Commander

  • Topic Starter
  • 179 posts
  • Location:Hannover, Germany

Posted Mon Dec 16, 2013 1:23 PM

Andrew's method only works with 3F or 3E bankswitching, which switches the first 2k of ROM; the last 2k are fixed.

 

While we're at it: the bankswitching I introduced in this topic is working fine in a demo (not a game) that's due for release in 2014. I've got a subdirectory for each bank and move code from bank to bank just by moving the file from one directory to another.

 

The code also got optimized a bit:

.macro bankjsr _addr
   lda #>(_addr-1)   ; high byte in A, pushed first (order corrected, see post #20)
   ldx #<(_addr-1)   ; low byte in X
   jsr bankjmpcode
.endmacro

.macro bankjmp _addr
   lda #>(_addr-1)
   ldx #<(_addr-1)
   jmp bankjmpcode
.endmacro

.macro bankrts
   jmp bankrtscode
.endmacro
 
bankjmpcode:
   pha
   txa
   pha
   ; slip through

bankrtscode:
   tsx
   lda $02,x
   asl
   rol
   rol
   rol
   and #%00000111
   tax
   lda HOTSPOTS,x
   rts

Now, Y is unchanged, and less ROM is used. :)



#18 Kylearan OFFLINE  

Kylearan

    Chopper Commander

  • 202 posts

Posted Tue Dec 17, 2013 3:35 AM

[...] a demo (not game) that's due for release in 2014.

 

Oooh, looking forward to this! If by any chance this happens to be at Revision, expect some competition. ;)  (Which would be entirely your own "fault", as your talk at this year's Revision prompted me to start coding a demo for the VCS.  :grin: )

 

Anyway, thanks to all for showing the different ways of doing bank switching; I'll borrow some ideas when it comes to linking my demo...



#19 enthusi OFFLINE  

enthusi

    Moonsweeper

  • 358 posts
  • Location:Potsdam, Germany

Posted Tue Dec 17, 2013 7:27 AM

Real VCS demo competition at Revision2014?

Must.... be... there....

Damn, I really hope I can make it.



#20 SvOlli OFFLINE  

SvOlli

    Chopper Commander

  • Topic Starter
  • 179 posts
  • Location:Hannover, Germany

Posted Tue Dec 17, 2013 4:30 PM

A short note on entry #17: as originally posted, the macros had "lda" and "ldx" swapped (A has to hold the high byte, X the low byte). Copy'n'paste error. Sorry, I wanted to edit this, but can't.

 



#21 Omegamatrix OFFLINE  

Omegamatrix

    Quadrunner

  • 6,125 posts
  • Location:Canada

Posted Thu Dec 19, 2013 8:39 PM

SvOlli, I think you can replace this with five LSRs and save a byte (same number of cycles as before):

   asl
   rol
   rol
   rol
   and #%00000111


#22 SvOlli OFFLINE  

SvOlli

    Chopper Commander

  • Topic Starter
  • 179 posts
  • Location:Hannover, Germany

Posted Fri Dec 20, 2013 1:23 AM

Nicely spotted,

but since the code is generated, I'm not sure if I'm gonna include this change. It's generated because it also works with F6 and F8 with only a few modifications:

 

F6:

   asl
   rol
   rol
   and #%00000011

F8:

   asl
   rol
   and #%00000001

And sticking to this pattern keeps the code more straightforward. On the other hand, if I should need that one byte in my demo, I know where to start... ;)



#23 Thomas Jentzsch OFFLINE  

Thomas Jentzsch

    Thrust, Jammed, SWOOPS!, Boulder Dash, THREE·S, Star Castle

  • 22,729 posts
  • Always left from right here!
  • Location:Düsseldorf, Germany, Europe, Earth

Posted Fri Dec 20, 2013 3:56 AM

To save space you could use BRK instead of JSR. Also the target address could be stored in the two bytes following the BRK. You just have to slightly adjust the return address then.


Edited by Thomas Jentzsch, Fri Dec 20, 2013 3:59 AM.


#24 enthusi OFFLINE  

enthusi

    Moonsweeper

  • 358 posts
  • Location:Potsdam, Germany

Posted Fri Dec 20, 2013 4:25 AM

Super Mario Bros. on the NES used the following routine. It is not competitive in size, but I just wanted to add it :)

ScreenRoutines:
      lda ScreenRoutineTask        ;run one of the following subroutines
      jsr JumpEngine
    
      .word InitScreen
      .word SetupIntermediate
      .word WriteTopStatusLine
...


JumpEngine:
       asl          ;shift bit from contents of A
       tay
       pla          ;pull saved return address from stack
       sta $04      ;save to indirect
       pla
       sta $05
       iny
       lda ($04),y  ;load pointer from indirect
       sta $06      ;note that if an RTS is performed in next routine
       iny          ;it will return to the execution before the sub
       lda ($04),y  ;that called this routine
       sta $07
       jmp ($06)    ;jump to the address we loaded


#25 Thomas Jentzsch OFFLINE  

Thomas Jentzsch

    Thrust, Jammed, SWOOPS!, Boulder Dash, THREE·S, Star Castle

  • 22,729 posts
  • Always left from right here!
  • Location:Düsseldorf, Germany, Europe, Earth

Posted Fri Dec 20, 2013 5:15 AM

SuperMarioBros on NES used that routine, which is not competitive in size...

It is just a matter of how often you call that subroutine.

SvOlli's macro requires 7 bytes/call, NES 5, mine 3. NES and mine have extra overhead in the called routine.

So to find the optimal solution you have to know how many calls you need and then do the math.



