Lean F4 boot framework


19 replies to this topic

#1 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • 197 posts

Posted Fri Mar 9, 2012 10:01 AM

I fiddled around with F4 (32k) bankswitching today and figured out how to boot everything without wasting space on stubs in every bank that jump to Start in the appropriate bank.
The idea is very simple: put Start in bank 7 and point the initialization vector to $1FFB in all the other banks. At $1FFB you put $4C (JMP abs). Fetching that opcode hits the bank 7 hotspot, so the cart switches to bank 7, the CPU reads the address at $1FFC-$1FFD (Start) from bank 7 and finally jumps to the initialization stuff, just as would happen if the machine had booted in bank 7.
The good thing with this is that it doesn't waste any space - it even uses space that would otherwise be wasted ($1FF4-$1FFB)! Just 18 bytes of overhead per bank - even less if you don't need JMPBank in some of them (saves six bytes).
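To make the boot path concrete, here is a small Python model of it. It is only a sketch, not an emulator. It assumes the usual F4 mapping in which any access to $1FF4-$1FFB selects banks 0-7, it assumes the cart returns the byte from the bank that was active before the switch, and START is a made-up example address.

```python
# Toy model of the F4 boot trick. Assumptions (not from the post): accesses to
# $1FF4-$1FFB select banks 0-7, the byte read at a hotspot comes from the bank
# that was active before the switch, and START is an arbitrary example address.
BANKS = 8
START = 0x1234          # hypothetical address of Start in bank 7
JMP_ABS = 0x4C

def make_rom():
    """Build 8 banks of 4K holding only the bytes that matter for booting."""
    rom = [bytearray(0x1000) for _ in range(BANKS)]
    for bank in range(BANKS - 1):
        rom[bank][0xFFB] = JMP_ABS                               # $1FFB: JMP abs
        rom[bank][0xFFC:0xFFE] = (0x1FFB).to_bytes(2, "little")  # reset vector
    rom[7][0xFFC:0xFFE] = START.to_bytes(2, "little")            # bank 7 reset vector
    return rom

class Cart:
    def __init__(self, rom, bank):
        self.rom, self.bank = rom, bank

    def read(self, addr):
        value = self.rom[self.bank][addr & 0x0FFF]
        if 0x1FF4 <= addr <= 0x1FFB:        # hotspot: switch after the read
            self.bank = addr - 0x1FF4
        return value

    def read_word(self, addr):
        return self.read(addr) | (self.read(addr + 1) << 8)

def boot(power_on_bank):
    """Follow the reset vector and at most one JMP abs; return (pc, bank)."""
    cart = Cart(make_rom(), power_on_bank)
    pc = cart.read_word(0x1FFC)             # CPU fetches the reset vector
    if pc == 0x1FFB:
        opcode = cart.read(pc)              # fetching $1FFB selects bank 7
        assert opcode == JMP_ABS
        pc = cart.read_word(pc + 1)         # operand is now read from bank 7
    return pc, cart.bank

for bank in range(BANKS):
    print(bank, [hex(v) for v in boot(bank)])
```

Whatever bank the model powers up in, it ends at START with bank 7 selected.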
This approach can be adapted for F6 and F8 as well. The code (feel free to use):
	processor 6502
	include vcs.h
	include macro.h

;FPS = 50 or 60
;PAL = 0 or 1
#if FPS==50		 ;PAL
VBLNK   equ 48
LINES   equ 228
OVERSCN equ 36
#else			   ;NTSC
VBLNK   equ 40
LINES   equ 192
OVERSCN equ 30
#endif

VBLNK64 equ VBLNK*19/16-1   ;value to set TIM64 to for VBLNK
OVERS64 equ OVERSCN*19/16   ;ditto for overscan

JMPBank equ $1FEE

	;18 byte bootstrap macro
	;Includes JMPBank routine and JMP to Start in Bank 7
	MAC END_SEGMENT
.BANK   SET {1}
	echo "Bank",.BANK,":", (JMPBank - *), "free"

	org JMPBank + (.BANK * 4096)
	rorg JMPBank
;Jump to fnptr in bank X
;Example usage:
;   SET_POINTER fnptr Address
;   ldx #Bank
;   jmp JMPBank
;
	;$1FEE-$1FF3
	nop $1FF4,X	 ;3 B
	jmp (fnptr)	 ;3 B
	;$1FF4-$1FFB
	.byte 0,0,0,0
	.byte 0,0,0,$4C ;JMP Start (reading the instruction jumps to bank 7, where Start's address is)
	;$1FFC-1FFF
	.word $1FFB
	.word $1FFB
	;Bank .BANK+1
	org $1000 + ((.BANK + 1) * 4096)
	rorg $1000
	ENDM

	;RAM
	SEG.U VARS
	org $80

fnptr   ds  2

	echo "RAM:", ($100 - *), "bytes left"

	;ROM
	SEG CODE

	;Bank 0
	org $1000
	rorg $1000

Dummy
	.byte $FF	   ;Dummy byte else DASM fails to assemble this correctly.
					;You can remove this when you have anything in Bank 0 at $1000

	END_SEGMENT 0

	END_SEGMENT 1

	END_SEGMENT 2

	END_SEGMENT 3

	END_SEGMENT 4

	END_SEGMENT 5

	END_SEGMENT 6

Start
	CLEAN_START

MainLoop
	VERTICAL_SYNC
	lda #VBLNK64
	sta TIM64T

WaitForVblankEnd
	lda INTIM
	bmi WaitForVblankEnd_Overflow
	bne WaitForVblankEnd
WaitForVblankEnd_Overflow
	lda #0
	sta VBLANK
	;NOTE: Don't set COLUBK before VBLANK has been turned off (above)
	;	  Otherwise you get ugly colors for the first few lines

	ldx #LINES
Kernel
	sta WSYNC
	stx COLUBK
	dex
	bne Kernel

	lda #OVERS64
	sta TIM64T
	lda #2
	sta VBLANK
WaitForOverscanEnd
	lda INTIM
	bmi WaitForOverscanEnd_Overflow
	bne WaitForOverscanEnd
WaitForOverscanEnd_Overflow
	jmp MainLoop

	echo "Bank",7,":", (JMPBank - *), "free"

	org JMPBank + $7000
	rorg JMPBank
;JMPBank
;Jump to fnptr in bank X
	;$1FEE-$1FF3
	nop $1FF4,X	 ;3 B
	jmp (fnptr)	 ;3 B
	;$1FF4-$1FFB
	.byte 0,0,0,0
	.byte 0,0,0,0
	;$1FFC-1FFF
	.word Start
	.word Start
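
The 19/16 factor in VBLNK64 and OVERS64 comes from the timing: a scanline lasts 76 CPU cycles and TIM64T counts down once every 64 cycles, and 76/64 = 19/16. The Python sketch below reproduces the constants. It assumes the assembler evaluates the expressions left to right with truncating integer arithmetic.

```python
# Assumed timing facts: 76 CPU cycles per scanline, TIM64T ticks every 64 cycles.
CYCLES_PER_LINE = 76
CYCLES_PER_TICK = 64

def tim64_value(scanlines, adjust=0):
    """Mirror the listing's expression scanlines*19/16 + adjust with integer math."""
    return scanlines * 19 // 16 + adjust

# Scanline counts copied from the listing above.
SETTINGS = {
    "NTSC": {"VBLNK": 40, "OVERSCN": 30},
    "PAL": {"VBLNK": 48, "OVERSCN": 36},
}

def constants(system):
    """Return the timer values the listing would load for the given system."""
    s = SETTINGS[system]
    return {
        "VBLNK64": tim64_value(s["VBLNK"], -1),
        "OVERS64": tim64_value(s["OVERSCN"]),
    }

for system in SETTINGS:
    print(system, constants(system))
```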

Edited by Tjoppen, Fri Mar 9, 2012 10:06 AM.


#2 Wickeycolumbus OFFLINE  

Wickeycolumbus

    River Patroller

  • 4,591 posts
  • Location:Michigan

Posted Sat Mar 10, 2012 11:50 AM

Nice trick :thumbsup:

#3 Omegamatrix OFFLINE  

Omegamatrix

    Quadrunner

  • 5,462 posts
  • Location:Canada

Posted Sat Mar 10, 2012 7:49 PM

Very clever! I only tested it in Stella. What I found is it works, but depends on the BRK vector in the last bank (which is fine). I have modified your code to show this. If the BRK vector is hit in Bank 7 then the colours will scroll instead of being stationary.


Attached File  TestBankTjoppen.zip   1.95KB   74 downloads


I think this was a very clever approach to use BRK in this fashion! The other BRK vectors in the other banks are still free too. :)

I also noticed in your code that you put in a dummy byte to have DASM compile the empty space correctly. I have also run into this bug, and used the same solution (even using a .byte $FF), ha ha

#4 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • Topic Starter
  • 197 posts

Posted Sun Mar 11, 2012 1:51 PM

Very clever! I only tested it in Stella. What I found is it works, but depends on the BRK vector in the last bank (which is fine). I have modified your code to show this. If the BRK vector is hit in Bank 7 then the colours will scroll instead of being stationary.


You seem to imply that BRK would happen randomly? Since it's under software control you can just do your BRK-reliant stuff in one of the other banks, since their BRK vector is free, as you pointed out.

Speaking of BRK, why do people use it? Is it to save ROM, since it's slower but smaller than JSR?

I also noticed in your code that you put in a dummy byte to have DASM compile the empty space correctly. I have also run into this bug, and used the same solution (even using a .byte $FF), ha ha


Yep, it's probably time to switch to that other, better assembler I read about on here.

#5 Wickeycolumbus OFFLINE  

Wickeycolumbus

    River Patroller

  • 4,591 posts
  • Location:Michigan

Posted Sun Mar 11, 2012 6:12 PM

Speaking of BRK, why do people use it? Is it to save ROM, since it's slower but smaller than JSR?


Yeah. Probably wouldn't really need to be used in an F4 game, but it comes in handy sometimes. Of course, you can have different break vectors in every bank, so that can come in handy too.

#6 Omegamatrix OFFLINE  

Omegamatrix

    Quadrunner

  • 5,462 posts
  • Location:Canada

Posted Sun Mar 11, 2012 6:41 PM

You seem to imply that BRK would happen randomly? Since it's under software control you can just do your BRK-reliant stuff in one of the other banks, since their BRK vector is free, as you pointed out.

It looks like in Stella it hits BRK every single time. You can see the address and processor status being pushed on the stack. Every time Stella opened it seemed to like starting in Bank 0, with the stack pointer at $FF. On a real 2600 I'm not sure if the stack pointer would always be at $FF, and I think the bank selection would be more random.

I tried the rom I modified on a Harmony cart today (single image mode), and BRK was hit 50% of the time, and the other 50% it was the Reset vector. Does anyone else get this result?



Speaking of BRK, why do people use it? Is it to save ROM, since it's slower but smaller than JSR?


Yes, for saving bytes. I've used this before in a Genesis controller hack of Kung Fu Master. A JMP was being done to go to an address at the end of the rom to perform a bankswitch. All bankswitching went to the same place, and it occurred in many places in the rom. So I used BRK, and then used three PLA's to reset the stack pointer (I didn't know where it was) and finally did the bankswitch. This added 16-19 cycles to the old code, which luckily could be spared, but saved 16 much needed bytes by using BRK instead of JMP.
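
As a rough check of those numbers, the Python sketch below adds up standard 6502 instruction sizes and cycle counts for the two approaches. It assumes the replaced instruction was a 3-byte JMP abs and that the three PLAs sit once in the shared handler. The post does not say how many call sites there were, so the sketch only reports the saving per call site.

```python
# Standard 6502 instruction costs: (bytes, cycles).
COST = {"JMP abs": (3, 3), "BRK": (1, 7), "PLA": (1, 4)}

def sequence_cost(instructions):
    """Sum (bytes, cycles) over a list of instruction names."""
    size = sum(COST[name][0] for name in instructions)
    cycles = sum(COST[name][1] for name in instructions)
    return size, cycles

old_bytes, old_cycles = sequence_cost(["JMP abs"])
# BRK at the call site, then three PLAs in the shared handler to undo the pushes.
new_bytes, new_cycles = sequence_cost(["BRK", "PLA", "PLA", "PLA"])

extra_cycles = new_cycles - old_cycles
bytes_saved_per_call_site = COST["JMP abs"][0] - COST["BRK"][0]

print(new_cycles, extra_cycles, bytes_saved_per_call_site)
```

Under these assumptions the BRK path costs 19 cycles in total, which is 16 more than the JMP it replaces. That is one possible reading of the 16-19 figure.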


Yep, it's probably time to switch to that other, better assembler I read about on here.


I don't mind DASM. Never really had big enough trouble with it to want to switch to another assembler. The biggest hitch for me is that all of my disassemblies, and 99% of the rest out there on the web, are meant to compile with DASM.

#7 Omegamatrix OFFLINE  

Omegamatrix

    Quadrunner

  • 5,462 posts
  • Location:Canada

Posted Sun Mar 11, 2012 7:11 PM

Little off topic, but I just had a thought. If you adjusted the stack pointer to one of the color registers you could do three color updates in 7 cycles. And you could also use SEI or CLI, CLV, etc to adjust the color provided by the status register.

We know:

COLUP0 = $06
COLUP1 = $07
COLUPF = $08
COLUBK = $09
CTRLPF = $0A


Something like:
STA COLUBK
STX COLUPF
STY COLUP1
SAX COLUP0
BRK ; pointed to COLUBK, or COLUPF

Would do 7 color updates in 19 cycles. However, since CTRLPF is also beside the color registers you might be able to do some amazing color changes by playing with the score bit, and incorporating the ball.
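
To see where the three writes land, the following Python sketch models only the stack side effects of BRK. It assumes standard 6502 behaviour: BRK takes 7 cycles and pushes the high byte of the return address, then the low byte, then the status register, decrementing the stack pointer after each push. The example return address and status value are made up.

```python
# TIA register addresses as listed in the post.
TIA = {0x06: "COLUP0", 0x07: "COLUP1", 0x08: "COLUPF", 0x09: "COLUBK", 0x0A: "CTRLPF"}

def brk_writes(sp, return_addr, status):
    """Return the (register, value) writes made by BRK's three pushes.

    Assumes standard 6502 behaviour: push PCH, PCL, then P, with the
    stack pointer decremented after each push.
    """
    pushes = [(return_addr >> 8) & 0xFF, return_addr & 0xFF, status & 0xFF]
    writes = []
    for value in pushes:
        writes.append((TIA.get(sp, hex(sp)), value))
        sp = (sp - 1) & 0xFF
    return writes

# Cycle counts: zero-page stores (STA/STX/STY/SAX) take 3 cycles, BRK takes 7.
STORE_CYCLES = 3
BRK_CYCLES = 7
total_cycles = 4 * STORE_CYCLES + BRK_CYCLES
total_updates = 4 + 3

# Example values (hypothetical): stack pointer at COLUBK, return address $F3A6,
# pushed status byte $34.
print(brk_writes(0x09, 0xF3A6, 0x34))
print(total_updates, "updates in", total_cycles, "cycles")
```

With the stack pointer at COLUBK, the pushes go to COLUBK, COLUPF and COLUP1 in that order.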

#8 eshu OFFLINE  

eshu

    Chopper Commander

  • 195 posts

Posted Sun Mar 11, 2012 7:28 PM

Little off topic, but I just had a thought. If you adjusted the stack pointer to one of the color registers you could do three color updates in 7 cycles. And you could also use SEI or CLI, CLV, etc to adjust the color provided by the status register.

We know:

COLUP0 = $06
COLUP1 = $07
COLUPF = $08
COLUBK = $09
CTRLPF = $0A


Something like:
STA COLUBK
STX COLUPF
STY COLUP1
SAX COLUP0
BRK ; pointed to COLUBK, or COLUPF

Would do 7 color updates in 19 cycles. However, since CTRLPF is also beside the color registers you might be able to do some amazing color changes by playing with the score bit, and incorporating the ball.



Grr I've been saving that one for ages - I've just posted a very early WIP of the game I'm planning to use it in at: http://www.atariage....ng-on-for-ages/

Here's an image showing how I'm planning to use it:
battlebg2.png

#9 Omegamatrix OFFLINE  

Omegamatrix

    Quadrunner

  • 5,462 posts
  • Location:Canada

Posted Sun Mar 11, 2012 8:07 PM

Grr I've been saving that one for ages - I've just posted a very early WIP of the game I'm planning to use it in at: http://www.atariage....ng-on-for-ages/

Here's an image showing how I'm planning to use it:
battlebg2.png


Are you using BRK there? Mind posting some code?


Screenshot looks beautiful, too. :)

Edited by Omegamatrix, Sun Mar 11, 2012 8:08 PM.


#10 eshu OFFLINE  

eshu

    Chopper Commander

  • 195 posts

Posted Sun Mar 11, 2012 8:23 PM

I've only actually got the bit that uses BRK coded on paper - but it's to get 6 colours in 12 pixels. It works like this:

P1|P1|PF|PF|PF|PF|P0|P0|M1|M1|P0|P0
--|--|--|--*--|--|--*--|--|--*--|--

* Are where BRK stores to COLUPF, COLUP1, COLUP0 in succession.

Edit: Obviously you have to control the address BRK is called from and set up the processor status appropriately.

My notes have 5F, 5D, 5B, 59, 57, 55 for the colours - so it must get called from $5B57 with the V, B, and I flags set for P=$54

Edit again: Which means I have the colours in the reverse order in the mockup above :) - I think it's probably the mockup that's wrong but my notes aren't very well organised :)

Edited by eshu, Sun Mar 11, 2012 8:36 PM.


#11 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • Topic Starter
  • 197 posts

Posted Mon Mar 12, 2012 7:03 AM

Little off topic, but I just had a thought. If you adjusted the stack pointer to one of the color registers you could do three color updates in 7 cycles. And you could also use SEI or CLI, CLV, etc to adjust the color provided by the status register.

I never thought of that - I'll have to cook up an effect based on it for Revision :)

#12 stephena OFFLINE  

stephena

    River Patroller

  • 2,489 posts
  • Stella maintainer
  • Location:Newfoundland, Canada

Posted Mon Mar 12, 2012 7:50 AM

It looks like in Stella it hits BRK every single time. You can see the address and processor status being pushed on the stack. Every time Stella opened it seemed to like starting in Bank 0, with the stack pointer at $FF. On a real 2600 I'm not sure if the stack pointer would always be at $FF, and I think the bank selection would be more random.

I tried the rom I modified on a Harmony cart today (single image mode), and BRK was hit 50% of the time, and the other 50% it was the Reset vector. Does anyone else get this result?


I would be interested in test ROMs and/or definitive results on this. Changing Stella to be more random is easy; I just want to make sure it reflects the actual machine.

#13 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • Topic Starter
  • 197 posts

Posted Mon Mar 12, 2012 3:03 PM

I would be interested in test ROMs and/or definitive results on this. Changing Stella to be more random is easy; I just want to make sure it reflects the actual machine.

Note that some demos like Tricade seem to rely on Stella booting in a specific bank and Harmony in the other (hence the two ROMs, AFAICT). In other words, you might want to disable such randomness for certain specific ROMs. If I were you I'd spider the VCS prods on pouet to automatically add all their MD5s to an exemption list (the F8 ones at least).

#14 stephena OFFLINE  

stephena

    River Patroller

  • 2,489 posts
  • Stella maintainer
  • Location:Newfoundland, Canada

Posted Mon Mar 12, 2012 3:18 PM

Note that some demos like Tricade seem to rely on Stella booting in a specific bank and Harmony in the other (hence the two ROMs, AFAICT). In other words, you might want to disable such randomness for certain specific ROMs. If I were you I'd spider the VCS prods on pouet to automatically add all their MD5s to an exemption list (the F8 ones at least).


Right, but I was hoping someone else would do that for me, and forward the ROM MD5s :)

I'm actually content to do whatever Harmony is doing, so there's only one type of behaviour to consider.

#15 kevtris OFFLINE  

kevtris

    Star Raider

  • 86 posts
  • Location:Flyover, USA

Posted Tue Mar 20, 2012 7:56 PM


I would be interested in test ROMs and/or definitive results on this. Changing Stella to be more random is easy; I just want to make sure it reflects the actual machine.

Note that some demos like Tricade seem to rely on Stella booting in a specific bank and Harmony in the other (hence the two ROMs, AFAICT). In other words, you might want to disable such randomness for certain specific ROMs. If I were you I'd spider the VCS prods on pouet to automatically add all their MD5s to an exemption list (the F8 ones at least).


I investigated this. The tricade and doctor ROMs come in two kinds: the "emulator" and "Real" versions. BOTH versions rely on the ROMs starting up in bank 1. If you start up in bank 0, the demo starts halfway through. I diff'd the two "doctor" ROMs ("emulator" and "real") and the only difference is which address is read in bank 1 to get into bank 0. The "emulator" version reads FFF8, and the "Real" version reads FFF9. Kinda interesting that the "Real" version should not work at all since it's already in bank 1, and will just result in it selecting bank 1 again then crashing. That 1 byte is the only difference between the two ROMs.

When the demo switches to bank 0, it starts executing garbage code too before the actual demo code runs. I guess they got it running and didn't investigate why it ran.

#16 batari OFFLINE  

batari

    )66]U('=I;B$*

  • 6,454 posts
  • begin 644 contest

Posted Tue Mar 20, 2012 8:48 PM

In general, it's not a good idea to put code in bankswitch hotspot locations unless you know the hardware on which it will run.

There are at least three kinds of hardware I can think of:

1. If the hardware is edge-triggered, it will switch before any values are fetched.
2. If the hardware is level-triggered, it will read the value first, then switch.
3. The hardware might ignore the contents (that is, the ROM will not be enabled, and it will not return a proper value at all.)

There are further complications. If there are multiple, sequential fetches in hotspots, type 1 will usually switch to the first bank in the sequence and read subsequent data from that bank, and type 2 will switch to the last, reading data from the current bank until the switch.

There is one exception: If the contents of hotspots are the same in all banks and you are reading only one byte, types 1-2 will work the same. This doesn't help you with type 3, though.

A trick that works in any case is a BRK at $1FF3, RESET vectors set to $1FF3, and an IRQ vector to Start in bank 1. A BRK does a dummy fetch of $1FF4 (discarding the value) so it will always start in bank 1 automatically regardless of the underlying hardware.
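
A toy Python model of this BRK trick, covering the three kinds of hardware listed above, might look like the sketch below. It assumes the F4 mapping in which accesses to $1FF4-$1FFB select banks 0-7, and it numbers banks from 0, so the dummy fetch of $1FF4 lands in what the model calls bank 0 (the first bank). START is a made-up address, and the stack pushes are left out because they go to RAM rather than to the cart.

```python
import random

# Toy F4 cart with the three hotspot behaviours described above.
# Assumption: accesses to $1FF4-$1FFB select banks 0-7 (banks numbered from 0).
START = 0x1200   # hypothetical address of Start in the target bank
BRK = 0x00

def make_rom(banks=8):
    rom = [bytearray(0x1000) for _ in range(banks)]
    for bank in rom:
        bank[0xFF3] = BRK                                    # BRK at $1FF3
        bank[0xFFC:0xFFE] = (0x1FF3).to_bytes(2, "little")   # RESET -> $1FF3
    rom[0][0xFFE:0x1000] = START.to_bytes(2, "little")       # IRQ/BRK -> Start
    return rom

class Cart:
    def __init__(self, rom, bank, kind):
        self.rom, self.bank, self.kind = rom, bank, kind

    def read(self, addr):
        hot = 0x1FF4 <= addr <= 0x1FFB
        if hot and self.kind == "edge":      # type 1: switch, then read
            self.bank = addr - 0x1FF4
        value = self.rom[self.bank][addr & 0x0FFF]
        if hot and self.kind == "level":     # type 2: read, then switch
            self.bank = addr - 0x1FF4
        if hot and self.kind == "float":     # type 3: switch, value is junk
            self.bank = addr - 0x1FF4
            value = random.randrange(256)
        return value

    def read_word(self, addr):
        return self.read(addr) | (self.read(addr + 1) << 8)

def boot(power_on_bank, kind):
    cart = Cart(make_rom(), power_on_bank, kind)
    pc = cart.read_word(0x1FFC)        # reset vector, same in every bank
    opcode = cart.read(pc)             # $1FF3 is not a hotspot
    assert opcode == BRK
    cart.read(pc + 1)                  # BRK's dummy fetch of $1FF4, discarded
    return cart.read_word(0x1FFE), cart.bank   # IRQ/BRK vector after the switch

results = {boot(b, k) for b in range(8) for k in ("edge", "level", "float")}
print(results)
```

Because the only value read from a hotspot is thrown away, all three hardware types give the same result in this model.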

#17 batari OFFLINE  

batari

    )66]U('=I;B$*

  • 6,454 posts
  • begin 644 contest

Posted Tue Mar 20, 2012 10:02 PM

Just a followup - why is there a $4C at $1FFB in all but the last bank?

As said, the only way code in hotspots works as expected in most hardware is if there is just one byte at a time (no sequential bytes in hotspots) and all banks have the same data. If there is also a $4C in the last bank, this should correct the issue and it would work on most hardware.
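
The difference can be illustrated with a few lines of Python. The sketch assumes the F4 mapping in which $1FFB selects bank 7 and compares the opcode an edge-triggered and a level-triggered cart would hand to the CPU, first with $00 at $1FFB in bank 7 (as in the original listing) and then with $4C there as well.

```python
# Toy model: which opcode does the CPU see at $1FFB on each kind of hardware?
# Assumption: accesses to $1FF4-$1FFB select F4 banks 0-7.
JMP_ABS, BRK = 0x4C, 0x00

def opcode_seen(boot_bank, kind, last_bank_byte):
    """Opcode fetched from $1FFB when the CPU starts in boot_bank (0-6).

    kind is "edge" (switch before the read) or "level" (read, then switch);
    last_bank_byte is the byte stored at $1FFB in bank 7.
    """
    byte_at_1ffb = [JMP_ABS] * 7 + [last_bank_byte]   # banks 0-6 hold $4C
    target = 0x1FFB - 0x1FF4                          # hotspot selects bank 7
    source = target if kind == "edge" else boot_bank
    return byte_at_1ffb[source]

def summary(last_bank_byte):
    """Set of opcodes seen over all boot banks, per hardware kind."""
    return {
        kind: {opcode_seen(b, kind, last_bank_byte) for b in range(7)}
        for kind in ("edge", "level")
    }

print("bank 7 holds $00:", summary(BRK))      # original listing
print("bank 7 holds $4C:", summary(JMP_ABS))  # with the suggested change
```

In the model, the edge-triggered case sees $00 (BRK) with the original layout, which would be consistent with the BRK reported earlier in the thread. With $4C in every bank both cases see the same JMP opcode.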

#18 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • Topic Starter
  • 197 posts

Posted Thu Mar 22, 2012 12:52 PM

3. The hardware might ignore the contents (that is, the ROM will not be enabled, and it will not return a proper value at all.)

Unlikely since this would require extra chip-enable logic compared to just outputting whatever is in ROM at that position.

A trick that works in any case is a BRK at $1FF3, RESET vectors set to $1FF3, and an IRQ vector to Start in bank 1. A BRK does a dummy fetch of $1FF4 (discarding the value) so it will always start in bank 1 automatically regardless of the underlying hardware.

Ah, that's rather clever.

Just a followup - why is there a $4C at $1FFB in all but the last bank?

As said, the only way code in hotspots works as expected in most hardware is if there is just one byte at a time (no sequential bytes in hotspots) and all banks have the same data. If there is also a $4C in the last bank, this should correct the issue and it would work on most hardware.

You're right, the last bank should have $4C too - I simply reasoned that it might not be required, but your argument is compelling. Works fine on the Harmony though.
I'm actually seeing some kind of strange problem with the F4 ROM I'm fiddling with atm. It's probably just a "normal" bug unrelated to the bankswitching code, but you never know..

#19 batari OFFLINE  

batari

    )66]U('=I;B$*

  • 6,454 posts
  • begin 644 contest

Posted Fri Mar 23, 2012 12:03 AM


3. The hardware might ignore the contents (that is, the ROM will not be enabled, and it will not return a proper value at all.)

Unlikely since this would require extra chip-enable logic compared to just outputting whatever is in ROM at that position.

F4 with conventional hardware already has a chip enable (inverted A12, usually) and it would only require extra logic inside the PLD, but generally this sort of logic is "free" so it would be possible. The reason why it might be done is because some legacy games write to bankswitch hotspots and disabling the ROM would avoid the output contention.

In an early version of Harmony bankswitching, I floated the bus because of this very reason but eventually decided against doing so because some homebrews do put data in hotspots. I figured the chance of damage was low.

#20 Omegamatrix OFFLINE  

Omegamatrix

    Quadrunner

  • 5,462 posts
  • Location:Canada

Posted Sun Apr 1, 2012 12:23 PM


Little off topic, but I just had a thought. If you adjusted the stack pointer to one of the color registers you could do three color updates in 7 cycles. And you could also use SEI or CLI, CLV, etc to adjust the color provided by the status register.

I never thought of that - I'll have to cook up an effect based on it for Revision :)


I've played around with BRK after this came up. It's interesting because it made me think of things in different ways. For example, an inline JMP makes it really easy to change the high byte of the program counter:

;current address $F3xx

JMP .next-$2000 ; go to the next instruction in code as if there was no jump at all
.next:

;address is now $D3xx

You can also jump into RAM to get $00xx as the high address for a black color, or use any of the other RIOT RAM mirrors such as $20xx, $40xx, etc. Running in RIOT RAM limits the low address thrown on the stack by BRK ($82-$FF, $00, $01), but it seems a fair trade-off.
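
The mirror trick relies on the 6507 bringing out only 13 address lines, so addresses that differ only in the top three bits reach the same location. A quick Python check of the arithmetic, using an arbitrary example address in the $F3xx page:

```python
# The 6507 drives only address lines A0-A12, so the bus sees addr & $1FFF.
ADDRESS_MASK = 0x1FFF

def bus_address(addr):
    """Address actually placed on the 13-bit bus."""
    return addr & ADDRESS_MASK

def mirror_jump(next_addr, offset=0x2000):
    """Target of 'JMP .next-$2000': same bus location, different high byte."""
    return (next_addr - offset) & 0xFFFF

next_addr = 0xF3A0                    # example address, chosen arbitrarily
target = mirror_jump(next_addr)

print(hex(target), hex(target >> 8))
print(bus_address(target) == bus_address(next_addr))

# The RAM mirrors work the same way: $2080 and $4080 both decode to $0080.
print(hex(bus_address(0x2080)), hex(bus_address(0x4080)))
```

The code keeps running from the same ROM bytes, but a BRK executed after the jump pushes $D3 instead of $F3 as the high byte.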

I realized now that I don't have to use SEC or CLC as the bit 0 is never used in the color registers. PLP might be a better option.

;SP at some RIOT ram location
LDX #COLUBK
PLP
TXS ; doesn't affect status register
BRK


Anyhow I made a small demo that is really just some playing around. In the demo I'm trying to keep the background color the same while doing color updates to pixels as quick as I can. I got 5 color updates in 5 pixels, but couldn't find a way to do 6 in 6 or more.


Attached File  TestColors.zip   2.16KB   82 downloads



