Anyway, on to the code. My goal was to create a simple multi-threading capability so that I could run the loader and the main game loop as separate threads. The 6502 is a tricky little CPU. To make multi-threading happen you need a separate stack for each thread, some zero page variables to hold each thread's saved stack pointer and the index of the current thread, and code that saves and restores thread contexts. To make all of this work, I created a small header file with a bare minimum implementation. It only supports two threads, but could be expanded to support up to 4. I wouldn't try to do more than 4 because the 6502 has only 256 bytes of stack space in total, and giving a thread fewer than 64 bytes of stack is flirting with danger.
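To illustrate the stack split described above, here is a minimal Python sketch. It is not part of the header, and the function name initial_stack_pointers is made up for this example. It assumes the stack page is divided into equal slices and that each task starts at the top of its slice, since the 6502 stack grows downward.

```python
def initial_stack_pointers(num_tasks, page_size=256):
    """Return (initial_sp, bytes_available) for each task.

    Assumes the 256-byte 6502 stack page is split into equal slices and
    each task starts at the top of its slice (the stack grows downward).
    """
    if num_tasks < 1 or page_size % num_tasks != 0:
        raise ValueError("page must divide evenly between tasks")
    slice_size = page_size // num_tasks
    return [(page_size - 1 - i * slice_size, slice_size) for i in range(num_tasks)]


for n in (2, 4):
    layout = initial_stack_pointers(n)
    print(n, "tasks:", ", ".join(f"${sp:02X} ({size} bytes)" for sp, size in layout))
```

With two tasks this gives $FF and $7F at 128 bytes each. With four tasks each one is down to 64 bytes, which is why going past 4 looks risky.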
Here's the code:
/*
* ClassicGameDev.com Example Task Switching
*/
// this function initializes the stack for a given task so that
// we can start the task by restoring back to it as if it had been
// put to sleep previously
inline initialize_task(sp, entry) {
tsx // get current stack pointer
txa // x -> a
tay // a -> y
ldx sp // put task stack pointer in x
txs // set stack pointer to task stack pointer
lda hi(entry) // get high byte of task entry address
pha // push high byte of task entry address on task stack
lda lo(entry) // get low byte of task entry address
pha // push low byte of task entry address on task stack
lda #$30 // initial cpu status with interrupt disable (bit 2) clear; $34 would leave IRQs masked
pha // push initial cpu status flags on task stack
lda #$0 // a = 0
pha // push initial a value on stack
pha // push initial x value on stack
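pha // push initial y value on stack
tsx // get the task stack pointer after the pushes
stx sp // save it so the first restore pulls this frame
tya // y -> a
tax // a -> x
txs // restore the stack pointer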
}
// this macro handles storing the current task's registers on its stack
inline save_task_context() {
pha
txa
pha
tya
pha
}
// this macro handles restoring a task's registers from its stack
inline load_task_context() {
pla // pop the y value
tay // initialize y
pla // pop the x value
tax // initialize x
pla // pop the a value
}
// this handles storing the current stack pointer in memory, updating the
// current task index in memory and then loading the new task's stack pointer
inline switch_tasks(cur_task) {
ldy cur_task // load current task index into y
tsx // put the current stack pointer in x
stx $00,y // store current stack pointer into current task sp var
// this updates the current task index
lda #1 // a = 1
eor cur_task // a = cur_task ^ a (this flips from 0 to 1/1 to 0)
sta cur_task // cur_task = a
tay // make y equal the new task index
ldx $00,y // load new task stack pointer from its task sp var
txs // set stack pointer to task stack pointer
}
// this function will start the threading system by restoring
// the given task context and then executing rti, which pulls the
// cpu status and the entry address. this can be called from a
// regular function to start the threading system.
inline start_all_tasks(first_task) {
ldy first_task // get the index of the first task
ldx $00,y // load its stack pointer into x
txs // set stack pointer to first task sp
load_task_context() // load task context from its stack
rti // pull cpu status and entry address (rts would land on entry+1)
}
// this is the interrupt handler to hook up to the timer IRQ to
// implement task switching
interrupt context_switch() {
// save the current task context on its stack
save_task_context()
// switch tasks
switch_tasks(current_task)
// load the task context from the new task's stack
load_task_context()
cli // clear the interrupt flag
}
I think the comments explain what the code does. The context_switch interrupt function at the bottom is called by a Lynx timer set to interrupt at roughly 30Hz. I picked that arbitrarily for now. It will probably need to be tweaked to get a good balance of execution times.
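As a sanity check on the frame layout, here is a small Python model. It is an illustration only, not HLAKit code, and the Stack class, the helper names and the entry address $0234 are invented for this sketch. It assumes standard 6502 behaviour: a push stores at the stack pointer and then decrements it, a pull increments first, and rti pulls the status byte followed by the low and high bytes of the return address. It also assumes the value kept in the task's sp variable is the stack pointer after the six pushes, and that the status byte is $30 so the interrupt disable bit is clear.

```python
class Stack:
    """Minimal model of the 6502 hardware stack in page 1."""

    def __init__(self, sp):
        self.mem = [0] * 256   # page 1 ($0100-$01FF)
        self.sp = sp           # 8-bit stack pointer

    def push(self, value):
        self.mem[self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF   # stack grows downward

    def pull(self):
        self.sp = (self.sp + 1) & 0xFF
        return self.mem[self.sp]


def build_initial_frame(stack, entry, status=0x30):
    """Push the frame in the same order as initialize_task."""
    stack.push(entry >> 8)     # high byte of entry address
    stack.push(entry & 0xFF)   # low byte of entry address
    stack.push(status)         # cpu status (assumed $30, interrupt disable clear)
    for _ in range(3):         # initial a, x, y
        stack.push(0)
    return stack.sp            # value to keep in the task's sp variable


def restore_and_rti(stack):
    """Pull y, x, a as load_task_context does, then model rti."""
    y, x, a = stack.pull(), stack.pull(), stack.pull()
    status = stack.pull()
    pc = stack.pull() | (stack.pull() << 8)   # low byte first, then high
    return {"a": a, "x": x, "y": y, "status": status, "pc": pc}


task = Stack(0x7F)                                   # task 1 starts at $7F
saved_sp = build_initial_frame(task, entry=0x0234)   # hypothetical entry address
regs = restore_and_rti(task)
print(f"saved sp=${saved_sp:02X} pc=${regs['pc']:04X} sp after rti=${task.sp:02X}")
```

The saved stack pointer ends up six bytes below the starting one ($79 for a task that starts at $7F), and an rti-style pull lands exactly on the entry address. An rts would add one to the pulled address instead.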
Here's the code that shows how to use the threading:
#include <task_switch.h>
#define IRQ_ADDR_HI $FFFF
#define IRQ_ADDR_LO $FFFE
byte task0_sp : $0 = #$ff // task 0 stack pointer storage
byte task1_sp : $1 = #$7f // task 1 stack pointer storage
byte current_task : $2 = #0 // current task index
#ram.org 0x0200
// make this function the one that the second stage loader
// loads and runs. it finishes initializing the system,
// initializes the tasks and starts the task switching
function noreturn startup() {
// disable interrupts
sei
// set the irq handler pointer to our context_switch function
lda hi(context_switch)
sta IRQ_ADDR_HI
lda lo(context_switch)
sta IRQ_ADDR_LO
// initialize one of the handy interrupts to fire at 30 Hz
initialize_timer()
// initialize the stacks for the two tasks
initialize_task(task0_sp, game_loop)
initialize_task(task1_sp, asset_loader)
// start the task system, this will set the CPU status flags to
// a known value for the first task which will clear the disable
// interrupts bit
start_all_tasks(current_task)
}
function noreturn game_loop() {
forever {
// this is the main game loop
}
}
function noreturn asset_loader() {
forever {
// this is the asset loader
}
}
#ram.end
I haven't run this on an actual Lynx just yet, only on a 6502 simulator. Right now it isn't as efficient as it could be. For instance, it uses 6502 opcodes exclusively when it could use 65C02 opcodes such as phx/phy and plx/ply, which push and pull X and Y directly instead of going through A. The HLAKit compiler only has support for the bare 6502 right now. When I add 65C02 support I'll update this.
The next step for this system is to use the bit test/set functions to create a simple semaphore that can be used to synchronize a queue of load requests between the main thread and the loader thread. Once I get that far, I'll load it up on my Lynx and then write up a much longer tutorial for the Lynx section on my wiki.
--Wookie
Edited by Wookie, Sat Sep 11, 2010 2:21 AM.














