Assembly on the 99/4A

marc.hull · May 26, 2010

Mathew

I read ALL the previous post and have come to the conclusion that I shot my mouth off before I knew what your intention was. I thought you were nay saying the interrupt when you were simply avoiding the built in routine. My apologies for assuming your meaning. I'll go back to lurking now............. ;-)

matthew180 · May 26, 2010

This is the highly anticipated (well, for Owen anyway. ;-) ) game loop post! There is a lot to cover, and because I already posted the VDP routines and the complete character set data, I'm only going to include the guts here. The complete code is in the .zip file, along with an XB program listing of what this code is doing.

This code is assembled with TI's E/A and runs on Classic99 without any problems. I will be breaking it down in the next few posts, but I wanted to get this out so anyone following along could start to mess with it.

Matthew

      DEF  MAIN

**
* VDP Memory Map
VDPRD  EQU  >8800             * VDP read data
VDPSTA EQU  >8802             * VDP status
VDPWD  EQU  >8C00             * VDP write data
VDPWA  EQU  >8C02             * VDP set read/write address

**
* Workspace
WRKSP  EQU  >8300             * Workspace
R0LB   EQU  WRKSP+1           * R0 low byte reqd for VDP routines
R1LB   EQU  WRKSP+3           * R1 low byte reqd for VDP routines
R2LB   EQU  WRKSP+5           * R2 low byte reqd for VDP routines

**
* VRAM Base Locations (must match the values set up in the
* set video mode subroutine.)
NAMETB EQU  >0000             * Name table base
PTRNTB EQU  >2000             * Pattern generator table base
COLRTB EQU  >0300             * Color table base


**
* General Workspace Use:
*
* R0  VDP routines will modify
* R1  VDP routines will modify
* R2  VDP routines will modify
* R3
* R4  Random number routine will modify
* R5  Random number routine will modify
* R6
* R7
* R8
* R9
* R10 Stack pointer
* R11 Return address
* R12
* R13
* R14 Local function state
* R15 Main state machine state


**
* Scratch pad RAM use - Variables
*
*           >8300             * Workspace
*           >831F             * Bottom of workspace
STACK  EQU  >8320             * Subroutine stack, grows up (8 bytes)
*           >8322             * The stack is maintained in R10 and
*           >8324             * supports up to 4 BL calls
*           >8326
TICK   EQU  >8328             * 1 tick every 16.6ms (rolls after 18.2 mins)
VSYNC  EQU  >832A             * 1 when VSYNC is detected, otherwise 0

* Random Number Memory Map
RAND16 EQU  >83C0             * 16-bit random number
RAND8  EQU  >83C1             * 8-bit random number


**
* Runtime Constants
* In an EA3 program these will be in 8-bit RAM, in a cartridge they
* will be in 8-bit ROM.
*
VSTAT  DATA >8000             * VDP vsync status
NUM01  DATA 1                 * 16-bit number 1


**
* Program execution starts here
MAIN   LIMI 0
      LWPI WRKSP

*      Initialize the call stack and Finite State Machine (FSM)
      LI   R10,STACK         * Set up the stack pointer
      LI   R15,STINIT        * Initial state, one-time initialization
      CLR  @TICK             * Clear the tick counter


**
*      Finite State Machine (FSM)
FSM00
      CI   R15,STQUIT        * WHILE R15 != STQUIT
      JNE  FSM10
STQUIT BLWP @>0000            * Quit

*      Since interrupts are disabled to prevent the console ISR and GPL
*      from wrecking things, use of the nice VDP interrupt is not possible,
*      however the VDP vsync signal can be polled which will be close enough
*      for a game and gives a good 1/60th of a second clock.
FSM10
      CLR  @VSYNC            * VSYNC indicator only active for a single cycle
      CLR  R1
      MOVB @VDPSTA,R1        * Reading clears the VDP sync indicator
      COC  @VSTAT,R1
      JNE  FSM20             * No VSYNC, skip updating the TICK

      INC  @TICK             * Increment the tick
      INC  @VSYNC            * Set the VSYNC indicator

*      Branch to the current state
FSM20
      B    *R15              * SWITCH R15


*      One time initialization
*
STINIT
      BL   @GMODE            * Set the graphics mode
      BL   @LSCS             * Load standard character set
      BL   @OTINIT           * One time initialization

      LI   R15,STRUN         * Set next state
      B    @FSM50            * BREAK


*      Main state when things are running, game is playing, etc.
*
STRUN
      BL   @PLOT
      B    @FSM50            * BREAK


*      Every state jumps here when complete so any necessary out-of-state
*      logic or decision making can happen if necessary.
FSM50

FSM99
      B    @FSM00            * WEND
*// MAIN


*********************************************************************
*
* <subroutine skeleton>
*
SKEL
      MOV  R11,*R10+         * Push return address onto the stack

*      Subroutine code here ...

      DECT R10               * Pop return address off the stack
      MOV  *R10,R11
      B    *R11
*// SKEL


*********************************************************************
*
* Plot a random character
*
PLOT
*      Only draw on the VSYNC
      C    @VSYNC,@NUM01     * If VSYNC is not active, return
      JEQ  PLOT01
      B    *R11
PLOT01
      MOV  R11,*R10+         * Push return address onto the stack

*      Get a random screen location
      BL   @RANDNO           * Get a random number (in R5)
      LI   R3,768
      CLR  R4                * Dividend will be R4,R5
      DIV  R3,R4             * Make a number between 0 and 767
      MOV  R5,R0             * Move to R0 for the VDP routine
      AI   R0,NAMETB         * Adjust to the name table base

*      Get a random character 40, 48, 56, 64
      BL   @RANDNO           * Get a random number (in R5)
      SRL  R5,14             * Make a number between 0 and 3
      SLA  R5,3              * Multiply by 8 (number is now 0, 8, 16, 24)
      A    @CHR040,R5        * Add to the base character
      MOV  R5,R1
      SWPB R1                * Remember, the MSB goes to the VDP!

      BL   @VSBW

      DECT R10               * Pop return address off the stack
      MOV  *R10,R11
      B    *R11
*// PLOT


**
* Table based tile patterns to help make things easier
*
* Format: Tile name (0 to 255), number of pattern bytes
*         Pattern Data, ...
*
* Use a label to set up references to specific tiles.  The
* generic labels here should be replaced with something
* meaningful.
*
* This data will be in 8-bit RAM for an EA3 program, and in
* 8-bit ROM for a cartridge.
*
* The "name" and "length" values could be bytes, but using
* full words makes the code easier.  If you have a lot of
* individual definitions, you may consider changing to BYTE.
*
DEFTBL
CHR040 DATA 40,8
      DATA >007E,>7E7E,>7E7E,>7E7E
CHR048 DATA 48,8
      DATA >007E,>7E7E,>7E7E,>7E7E
CHR056 DATA 56,8
      DATA >007E,>7E7E,>7E7E,>7E7E
CHR064 DATA 64,8
      DATA >007E,>7E7E,>7E7E,>7E7E
DEFEND

**
* Color data is straight up, since there can only be 32
* bytes total, a table format is not really necessary.
COLTBL DATA >7050,>90E0


*********************************************************************
*
* One-Time Initialization
*
OTINIT
      MOV  R11,*R10+         * Push return address onto the stack

*      Initialize tile pattern definitions
      LI   R1,DEFTBL         * Start of definition table
OTI01  MOV  *R1+,R0           * Move the character code into R0
      SLA  R0,3              * Mul by 8 to adjust offset into PGT
      AI   R0,PTRNTB         * Add pattern generator table base
      MOV  *R1+,R2           * Move the byte count into R2
      BL   @VMBW
      CI   R1,DEFEND
      JNE  OTI01             * Loop until end of table

*      Set colors
      LI   R0,COLRTB+5       * Start with color set 5 (char 40)
      LI   R1,COLTBL
      LI   R2,4
      BL   @VMBW

      DECT R10               * Pop return address off the stack
      MOV  *R10,R11
      B    *R11
*// OTINIT


*********************************************************************
*
* Sets the graphics mode
*
*                             * Bit value 128 64 32 16  8  4  2  1
*                             * Bit order   0  1  2  3  4  5  6  7
GMODE
      MOV  R11,*R10+         * Push return address onto the stack

      CLR  R0                * M3 is bit 6 and is off for Graphics I
      BL   @VWTR

*      This is the "busy" register
      LI   R0,>01E0          * 11100000 Graphics I
      BL   @VWTR             * 16K,No Blank,Enable Int,M1,M2,0,8x8,No Mag

      LI   R0,>0200          * Name Base Table to >0000 - >02FF (768 bytes)
      BL   @VWTR

      LI   R0,>030C          * Color Table to >0300 - >031F (32 bytes)
      BL   @VWTR

      LI   R0,>0404          * Pattern Generator Table
      BL   @VWTR             * >2000 - >27FF (2048 bytes)

      LI   R0,>0507          * Sprite Attribute Table
      BL   @VWTR             * >0380 - >03FF (128 bytes)

      LI   R0,>0605          * Sprite Pattern Table
      BL   @VWTR             * >2800 - >2BFF (1024 bytes)

      LI   R0,>0380          * Disable all sprite processing by writing
      LI   R1,>D000          * >D0 (208) to the vertical position of the
      BL   @VSBW             * first sprite entry

*      Set colors
      LI   R0,>07F4          * R7 is the text-mode color and border color
      BL   @VWTR             * White on dark blue

      LI   R0,>0300          * Start of color table
      LI   R1,>F400          * White on dark blue
      LI   R2,>0020          * All color table entries (32 bytes)
      BL   @VSMW

      DECT R10               * Pop return address off the stack
      MOV  *R10,R11
      B    *R11
*// GMODE


*********************************************************************
*
* Load a nice character set
*
LSCS
      MOV  R11,*R10+         * Push return address onto the stack

      LI   R0,>2000          * Start at the space character
      LI   R1,SCS1
      LI   R2,SCS1E-SCS1
      BL   @VMBW

      DECT R10               * Pop return address off the stack
      MOV  *R10,R11
      B    *R11
*// LSCS


*********************************************************************
*
* Generates a weak pseudo random number and places it in RAND16
*
* R4   - Destroyed
* R5   - 16-bit random number and stored in RAND16 for next round
*
RANDNO
      LI   R4,28643          * A prime number to multiply by
      MPY  @RAND16,R4        * Multiply by last random number
      AI   R5,31873          * Add a prime number
      MOV  R0,R4             * Save R0
      MOV  @TICK,R0          * Use the VSYNC tick to mix it up a little
      ANDI R0,>000F          * Check if shift count is 0
      JEQ  RAND01            * A 0 count means shift 16, which is a wash
      SRC  R5,0              * Mix up the number to break odd/even pattern
RAND01 MOV  R5,@RAND16        * Save this number for next time
      MOV  R4,R0             * Restore R0
      B    *R11
*// RANDNO

gameloop.zip

Opry99er · May 26, 2010

Don't lurk, Marc!!! we need your input--- I especially do. Now that Beryl is taking a turn for the assembly.

matthew180 · May 26, 2010

Mathew

I read ALL the previous post and have come to the conclusion that I shot my mouth off before I knew what your intention was. I thought you were nay saying the interrupt when you were simply avoiding the built in routine. My apologies for assuming your meaning. I'll go back to lurking now.............

Hey, no problem, it's all good. I'm trying to be factual and simply present my take on things, but if I screw something up I totally expect someone to call me out on it. ;-)

Matthew

+adamantyr · May 26, 2010

Hey, no problem, it's all good. I'm trying to be factual and simply present my take on things, but if I screw something up I totally expect someone to call me out on it.

For your demonstration, breaking up your program into a bunch of smaller subroutines connected by Branch and Links is pretty good.

My experience with my CRPG design, though, was that you gained a lot of mileage by figuring out what was truly a subroutine, and what could be done by just railing the code all into one large linear sequence. For example, I have multiple steps taken during travel mode. It loads map data, it loads character graphics, it sets up the mobile objects... but it's all one routine, broken down by symbolic labels into smaller pieces. Based on a status value, I redirect to an earlier or later position in the linear sequence, depending on what is needed. For example, status screens have their own character set, so checking for a character set change is a last step, so I can easily restore things without re-loading a map or replacing mobs.

I also put my one-time-only initialization at the very start of the code... and then use that space as buffer afterward. NOT a modern-day programming technique at all, but for a vintage system it makes perfect sense.

Adamantyr

matthew180 · May 26, 2010

I also put my one-time-only initialization at the very start of the code... and then use that space as buffer afterward. NOT a modern-day programming technique at all, but for a vintage system it makes perfect sense.

Adamantyr

*That* is a good idea! But, it only works for EA3 programs or those that are loaded. For a ROM cartridge, the code is, well, in ROM. :-) But, for a smaller game, the gain is probably small. For an RPG though, I could see a lot of useful space. Nice tip, thanks!

Matthew

Tursi · May 26, 2010

Looks like Marc came around to our side eventually. My point was exactly as Sometimes99er correctly emphasized (Thank you!). If you are using the interrupt as LIMI 2/LIMI 0, then it would have exactly the same effect to check the status byte yourself, and call your interrupt function yourself (thus bypassing the console ROM code).

MAINLP LIMI 2           * This is the only place the console interrupt code could run anyway!
       LIMI 0

versus

MAINLP MOV  @VDPST,R0   * get the VDP status register
       COC  @BLANK,R0   * test the vertical interrupt bit (which was reset by the read)
       JNE  NOBLNK      * bit was not set, so skip the call
       BL   @MYINT      * it was set, so call our custom interrupt code
NOBLNK
* carry on here

It's a little more code inline, but substantially less code executed, especially if you're only interested in your own function.

As for the scratchpad usage, I found KSCAN inconvenient since it uses the lower part of the scratchpad (>8374,>8375). I wrote a really simple ASCII-only replacement for TI Farmer - by disabling ints and using my own KSCAN the whole scratchpad is now available to me.

Note that doesn't HAVE to be a concern. If you store your variables carefully in scratchpad, or even in VDP, you need not worry about it. It's just another way to do things!

Anyway, I don't mean to derail it, I'm just a bit slow to get back in the conversation.

So back to the show!

Edited May 26, 2010 by Tursi

Opry99er · May 26, 2010

I'd be interested in seeing that KSCAN alternative. I'm not quite ready to use it yet, but I have a feeling I'll be needing all the extra memory I can get with this game. This is such a cool thread. :thumbsup:

sometimes99er · May 26, 2010

I think the console KSCAN uses this time delay

Time delay
0498 020C LI 12,>04E2 Loop counter
049A 04E2
049C 060C DEC 12
049E 16FE JNE >049C
04A0 045B B *11

That's probably why it is considered so slow. The reason for this I think is to let the keyboard matrix settle after being pulled (voltage).

If you're making your own KSCAN without a delay, you should call it only once per VDP interrupt. If you keep reading keys to detect a keypress, the real hardware might return wrong keys, as I understand it.

matthew180 · May 26, 2010

The "game loop" code I posted (and that I'll be going over in detail soon) polls the VDP status. This is almost verbatim from the FlyGuy II code:

FSM10
      CLR  @VSYNC            * VSYNC indicator only active for a single cycle
      CLR  R1
      MOVB @VDPSTA,R1        * Reading clears the VDP sync indicator
      COC  @VSTAT,R1
      JNE  FSM20             * No VSYNC, skip updating the TICK

      INC  @TICK             * Increment the tick
      INC  @VSYNC            * Set the VSYNC indicator

*      Branch to the current state
FSM20
      B    *R15              * SWITCH R15

Matthew

matthew180 · May 26, 2010

That's probably why it is considered so slow. The reason for this I think is to let the keyboard matrix settle after being pulled (voltage).

If you're making your own KSCAN without a delay, you should call it only once per VDP interrupt. If you keep reading keys to detect a keypress, the real hardware might return wrong keys, as I understand it.

The 99/4A does not debounce the keyboard?? Hmm. Well, I can already think of a better way to do this than using a delay loop. This will be an interesting topic when I get to keyboard handling.

A side note, Tursi's PS2 keyboard adapter should, by default, be providing key debouncing since he is controlling the 99/4A's keyboard connector with an IC instead of mechanical switches.

So, no debouncing... That would explain why the 99/4A has such sporadic keyboard input.

Another side note, for anyone who does not know what debouncing is: when you have a mechanical input to a computer or electronic circuit, like a key being pressed, the physical connection of the switch is electrically noisy. The metal contacts coming together and breaking apart can cause spikes and jumps, which to a fast computer can look like the switch was pushed multiple times. Debouncing *usually* means sampling the switch over time, until it holds the same value for a given amount of time, before accepting the input. The sample time depends, but it is usually something faster than a human but very slow to the computer, like 1ms or so.

Matthew

Tursi · May 27, 2010

As far as I understand the 99/4A does do debouncing, using the delay routine that Sometimes99er posted. I've seen it run when you press a key in the Classic99 debugger.

Fair point that my code does not do debounce, but for the purposes I'm using it (game input only) it should be fine, but you're right that I should try it on real hardware. In theory the only error from lack of debounce that you should see is repeated keys on a single keypress - the wrong key should not come up.

Correct that the PS/2 keyboards do their own debounce internally, you don't need to worry about it from the other end of the cable.

To see my KSCAN code just go grab the TI Farmer assembly source from the TI Farmer thread - all my XB support functions are in there, and Owen, just for you I left in the XB lines of code, so you can see how I translated them. Not that my approach is the only way, but I thought you might appreciate it.

matthew180 · May 28, 2010

The code I posted earlier is a basic game loop skeleton, but it could really be used for any assembly program I suppose. There seems to be a lot to it, but most of the code at this point is simply support routines.

If you are following along, you can download the source, assemble it with the E/A cartridge (either on the real 99/4A or with Classic99), and run it.

Over the next few posts I'll break down the code and finally start adding some interesting features. So, time to get started.

     DEF  MAIN

"DEF" is an assembler directive that exports a label so the loader can find it. The E/A or XB loader will add this name to the REF/DEF table so our code can be called by name, and here that name marks where our program begins. I used the label "MAIN" because that is pretty universal in the world of C, Windows, Unix, Mac, etc. programming as the name of a process's entry point.

**
* VDP Memory Map
VDPRD  EQU  >8800             * VDP read data
VDPSTA EQU  >8802             * VDP status
VDPWD  EQU  >8C00             * VDP write data
VDPWA  EQU  >8C02             * VDP set read/write address

These are assembler directives that let us use labels instead of numbers. Any place in the code where you see VDPRD, the assembler will replace it with >8800, etc. These are the hardware memory mapped locations for accessing the VDP. These values are used by the VDP routines I posted previously and included in the complete code download.

**
* Workspace
WRKSP  EQU  >8300             * Workspace
R0LB   EQU  WRKSP+1           * R0 low byte required by VDP routines
R1LB   EQU  WRKSP+3           * R1 low byte
R2LB   EQU  WRKSP+5           * R2 low byte

More equates to specify the workspace and the addresses of the low bytes for R0, R1, and R2. These come in handy particularly when dealing with the VDP routines because it is the MSB of a register that is sent to the VDP and the value we need a lot of the time is in the LSB of another register. The R0LB label is required by the VDP routines, the others are optional.
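As a rough illustration of why the low-byte equates help (a sketch, not part of the posted program): suppose a character code is sitting in the low byte of R2 and it needs to go to VSBW. The value 65 and the screen address are made up for the example, and VSBW is assumed to take the VDP address in R0 and the byte to write in the MSB of R1, which is how PLOT and GMODE call it.

```asm
       LI   R0,NAMETB         * VDP address: top-left corner of the screen
       LI   R2,65             * 65 (ASCII "A") ends up in the LOW byte of R2
       MOVB @R2LB,R1          * Copy R2's low byte into R1's HIGH byte
       BL   @VSBW             * Write the MSB of R1 to VRAM at address R0
```

The alternative is a MOV followed by SWPB, which is what PLOT does. The MOVB form saves an instruction, but it only works while the workspace pointer is set to WRKSP.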

**
* VRAM Base Locations (must match the values set up in the
* set video mode subroutine.)
NAMETB EQU  >0000             * Name table base
PTRNTB EQU  >2000             * Pattern generator table base
COLRTB EQU  >0300             * Color table base

Equates to use when calculating VRAM addresses. Using the equates allows the VDP tables to be moved without having to change a lot of code. We specify the base address of the various tables here and use the labels in our calculations. These labels must match the table locations set up in the "set video mode" subroutine.
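A minimal sketch of the kind of calculation these equates are meant for, assuming a row in R3 and a column in R6 (both register choices are hypothetical, picked to stay clear of the registers the VDP routines modify). Graphics I has 32 tiles per row, so the name table address is the base plus row times 32 plus column.

```asm
* Assumed inputs: R3 = row (0-23), R6 = column (0-31)
       MOV  R3,R0             * R0 = row
       SLA  R0,5              * R0 = row * 32 (32 tiles per row)
       A    R6,R0             * R0 = row * 32 + column
       AI   R0,NAMETB         * Add the name table base
       LI   R1,>2A00          * >2A is ASCII "*", placed in the MSB
       BL   @VSBW             * Write the tile name to VRAM
```

If the name table is ever moved, only the NAMETB equate (and the matching VDP register value in the set video mode subroutine) has to change.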

**
* Scratch pad RAM use - Variables
*
*           >8300             * Workspace
*           >831F             * Bottom of workspace
STACK  EQU  >8320             * Subroutine stack, grows up (8 bytes)
*           >8322             * The stack is maintained in R10 and
*           >8324             * supports up to 4 BL calls
*           >8326
TICK   EQU  >8328             * 1 tick every 16.6ms (rolls after 18.2 mins)
VSYNC  EQU  >832A             * 1 when VSYNC is detected, otherwise 0

* Random Number Memory Map
RAND16 EQU  >83C0             * 16-bit random number
RAND8  EQU  >83C1             * 8-bit random number

Here we are setting up equates that specify memory locations in the scratch pad RAM that we will be using. The WP will be loaded with >8300 and will use 32 bytes for the 16 general purpose registers.

Next will be 8 bytes used for the subroutine stack which will support a call depth of 4 (remember, addresses are 16-bit.)

The TICK count will be incremented every time the VDP issues a VSYNC, which happens 60 times a second on NTSC consoles and 50 times a second on PAL consoles. Assuming NTSC, that is an update every 16.6ms, and since a 16-bit counter holds 65536 values, the counter will roll over every 18.2 minutes (65536 / 60 is about 1092 seconds). This is fine since we are simply using it to determine how much (if any) time has elapsed since a previous event.
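Here is one way the elapsed-time check could look. This is only a sketch: LAST, its address, the label TOOSON and the 30-tick interval are all made up for the example and are not part of the posted code.

```asm
LAST   EQU  >832C             * Hypothetical: TICK value at the last event

       MOV  @TICK,R3          * R3 = current tick
       S    @LAST,R3          * R3 = ticks elapsed since the last event
       CI   R3,30             * Has half a second (30 ticks, NTSC) passed?
       JL   TOOSON            * Not yet (unsigned compare), skip the event
       MOV  @TICK,@LAST       * Remember when the event last ran
*      ... do the timed event here ...
TOOSON
```

Because the subtraction wraps around with the counter, the check keeps working across the 18.2 minute rollover, as long as the event is tested more often than once per rollover.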

The VSYNC variable will be set to 0 unless the VSYNC signal was received, at which point it will have a value of 1 for a single pass through the game loop. We can use this variable to quickly check for and synchronize to the VSYNC.

RAND16 and RAND8 store the random numbers generated by our random number generator subroutine. The address >83C0 is used because that is where the console stores a random number "seed", in the form of the amount of time the user took to "press any key" at the master title screen. This makes for a really good seed and there is no reason not to use it.
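RAND8 is simply the low byte of RAND16, so an 8-bit value can be picked up with a byte move. A small sketch, assuming it runs somewhere R11 has already been saved (BL overwrites it) and that R1 is free:

```asm
       BL   @RANDNO           * New 16-bit value in R5 and at RAND16
       CLR  R1                * Make sure the low byte of R1 is 0
       MOVB @RAND8,R1         * Low byte of RAND16 goes to the MSB of R1
       SRL  R1,8              * Shift it down: R1 is now 0 to 255
```

If the byte is headed straight for the VDP, the SRL can be dropped, since the VDP routines take their data from the MSB anyway.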

As our program grows, we will be reserving more and more of the scratch pad RAM.

**
* Runtime Constants
* In an EA3 program these will be in 8-bit RAM, in a cartridge they
* will be in 8-bit ROM.
*
VSTAT  DATA >8000             * VDP vsync status
NUM01  DATA 1                 * 16-bit number 1

These are assembler directives and simply reserve and initialize memory as specified. DATA reserves 16-bit values, BYTE and TEXT reserve 8-bit values. I'm using them as constants because in a cartridge they will be in a ROM file and therefore unchangeable. In a program designed to be loaded (like this one), we could actually write to these values since they will be in RAM, either the low 8K or high 24K of the 32K RAM expansion.

In both cases we are using memory locations to hold the data even though we are treating them as unchangeable values. So why not just use equates (you may be asking)? Good question. The reason is that we can use these memory locations in instructions where an immediate value cannot be used. Remember, an equate is just a "search and replace", but these labels represent real memory locations. For example, take NUM01 above. There are a lot of times when you need to compare a memory location to a number, and the "immediate" instructions only work with registers. The value "one" comes up a lot, as do other values which you will see as the program grows. There are several ways to code the check, and in the example code it is used to test if VSYNC is 0 or 1:

      C    @VSYNC,@NUM01
-or-
      MOV  @VSYNC,R1
      CI   R1,1
-or-
      MOV  @VSYNC,@VSYNC   * This trick uses the CPU "compare to zero"

      CI   @VSYNC,1        * ILLEGAL

The "MOV" trick is okay, but it only lets us test whether the value is zero or not zero. If we specifically need to test among other values, then it won't help. Also, MOV requires 4 memory accesses minimum but C only needs 3, so C will be faster.

**
* Program execution starts here
MAIN   LIMI 0
      LWPI WRKSP

This is where execution of our program will start. The first thing we do is shut off interrupts and leave them off. Next the WP is set up with the address we specified via the equate, which is >8300.

*      Initialize the call stack and Finite State Machine (FSM)
      LI   R10,STACK         * Set up the stack pointer

In this code R10 is used as a stack pointer. Since the TMS9900 CPU does not have stack support in the form of a real stack register, we will make our own. A stack is just a convention used to store and retrieve temporary data.

To set up a stack you simply set aside some memory and load a register with the first address. If we had a stack pointer we would load that, and use "push" and "pop" instructions. But we don't, so I picked R10 and the "pushing" and "popping" have to be done manually.

So, when we place a value on the stack (push), the data is copied to where the stack pointer (R10) is pointing, then the stack pointer is incremented or decremented depending on if your stack "grows" up or down in memory (up being towards bigger addresses.) In our case the stack grows up. It starts at address >8320 and ends at address >8327 (8 bytes):

            MSB   LSB
R10 ---+->  >8320 >8321
grows  +->  >8322 >8323
"up"   +->  >8324 >8325
       +->  >8326 >8327

Since we will be using our stack to store addresses, we will always "push" 16-bit values (words) on the stack, and remove (pop) 16-bit values off the stack. When a value is popped off the stack, the stack pointer is adjusted in the opposite direction of a push (so decremented in our case), and the data it then points to is copied to some designated register (or another memory location).

Using a stack like this allows us to have a few levels of subroutine calls (one advantage of BLWP over BL is that you don't need a stack, but you do need an entirely new workspace for each BLWP level.) There are three branching instructions in the TMS9900:

* BLWP: Branch and Load Workspace Pointer

* BL: Branch and Link

* B: Branch

We won't be using BLWP, so I'll leave it as an exercise for you to look it up. The B instruction is very simple, it unconditionally branches to the designated address. The B instruction is just like the unconditional jump instruction JMP, except JMP is restricted to jumps within -128 to +127 "words" away from the current location. This is because the location to jump to is stored as part of the JMP instruction's opcode as an offset, and there are only 8-bits to store the offset value (and the range of an 8-bit value (one byte) is 0 to 255 or -128 to +127.)

However, when B is used with symbolic addressing (B @LABEL), the opcode is immediately followed by a complete 16-bit value (one word) that specifies the address to branch to, so it can branch to any *evenly* addressable location in the 64K range of the TMS9900 CPU. Instructions are always on even addresses. So, use B when you need to jump far, and JMP when you are within range (the assembler will let you know if you try to JMP too far.) The main thing to remember is that B @LABEL uses 4 bytes to encode the instruction, while JMP only uses 2.
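The conditional jumps (JEQ, JNE, and so on) have the same short range as JMP, and there is no conditional form of B. A common workaround, sketched here with placeholder labels NEAR, FAR and SKIP that are not in the posted code, is to invert the test and hop over a B. PLOT uses the same pattern to return early.

```asm
*      Near target: a 2-byte jump is enough
       CI   R1,10
       JEQ  NEAR              * Target within -128..+127 words

*      Far target: invert the test and hop over a 4-byte B
       CI   R1,10
       JNE  SKIP              * Condition false, skip the branch
       B    @FAR              * Condition true, branch anywhere in 64K
SKIP
```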

So, that leaves us with BL, Branch and Link. The "branch" part is just like the B instruction. However, the "link" part of the instruction is what lets us use it for calling subroutines. To call a subroutine we need to remember where we are, jump to the address where the subroutine starts, then return to where we left off. So, to remember where we are before branching to a subroutine, we need to store the value in the program counter (PC), and that's exactly what the "link" part of BL does. The PC value, which by then points at the instruction following the BL, is placed in R11 (this cannot be changed, and whatever was in R11 is wiped out) and the branch is taken.

Now we are sitting in our subroutine and when we are done we need to "return" to the code that called the subroutine. Since we were careful not to destroy R11, it still holds the address of where we were before the BL call. Thus, we issue a B instruction using indirect addressing on R11, like this:

      B    *R11

Note: The assembler has a pseudo instruction "RT" that will be replaced with "B *R11". So any place you see "RT", it is the same as writing out B *R11.

The stack comes in to play when we need to call a subroutine from within a subroutine. Think about it, if we call a subroutine with BL, then that subroutine calls another with BL, the original return address is blown away unless we save it:

       BL   @SUB1    <--- stores the return address in R11
. . .
SUB1   code
       code
       code
WIPE   BL   @SUB2    <--- stores a new return address in R11, blowing away the previous one
       code
       B    *R11     <--- original return address is gone, R11 points just past WIPE
. . .
SUB2   code
       code
       code
       B    *R11     <--- returns to SUB1

To fix this, we have to store R11 in any subroutine that needs to call another. For assembly language, 2 or 3 levels is usually all you need. Any more than that and you need to rethink your program's organization. Thus, I set up a stack to support 4 levels of calls. In "bottom level" subroutines, i.e. those that don't need to BL to any other routine, you do not have to deal with the stack. Thus, in any subroutine that needs to call another subroutine, you do this:

      BL   @SUB1       <--- stores current PC in R11
. . .
SUB1   MOV  R11,*R10+   <--- "push" R11 onto the stack and use auto-increment to adjust stack
      code
      code
      BL   @SUB2       <--- stores current PC in R11, blowing away previous return value
      code

      DECT R10         <--- adjust stack pointer (pop)
      MOV  *R10,R11    <--- copy address back to R11
      B    *R11        <--- returns to original calling location
. . .
SUB2   code             <--- subroutine does not call any others, no stack required
      code
      code
      B    *R11        <--- returns to SUB1

I hope this is clear. If you are not familiar with the addressing modes of the TMS9900, you should read up on them a little so you better understand what is going on.

      LI   R15,STINIT        * Initial state, one-time initialization
      CLR  @TICK             * Clear the tick counter

In this code, R15 is used as the finite state machine (FSM) state variable. This just sets the initial value and clears the TICK counter.

An FSM is very simple really, and you deal with them every day without realizing it. For example, a stop light is an FSM. Basically an FSM has "states", and depending on the current state there are a fixed number of other states you could go to, meaning a fixed number of possibilities from where you are.

So, for a stop light, this would be the FSM:

state = red
timer = 10

forever

 dec timer

 when state is red:
   if timer = 0 then
     state = green
     timer = 10
   end if

 when state is green:
   if timer = 0 then
     state = yellow
     timer = 5
   end if

 when state is yellow:
   if timer = 0 then
     state = red
     timer = 10
   end if

end forever

Notice that it is illegal to go from yellow to green or from green to red. Once in a given state, based on the current state and external input, you decide the next state which includes staying in the current state. In a game for example, the "run game" state is maintained until all the lives are gone, at which point you would switch to the "attract mode" state or "enter initials" state if their score was high enough. So, the initial state for our game loop is STINIT or "state initialize".
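The stop light pseudocode translates almost line for line into something runnable. This is a sketch in Python, not part of the game code; the function name and the idea of returning a history list are my own additions for illustration.

```python
def run_stop_light(ticks):
    """Run the stop light FSM for `ticks` iterations and return each state."""
    state, timer = "red", 10
    history = []
    for _ in range(ticks):
        timer -= 1
        if state == "red":
            if timer == 0:
                state, timer = "green", 10
        elif state == "green":
            if timer == 0:
                state, timer = "yellow", 5
        elif state == "yellow":
            if timer == 0:
                state, timer = "red", 10
        history.append(state)   # state at the end of this pass
    return history
```

Running it for 25 ticks goes red, green, yellow and back to red, and there is never a yellow followed directly by a green.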

**
*      Finite State Machine (FSM)
FSM00
      CI   R15,STQUIT        * WHILE R15 != STQUIT
      JNE  FSM10
STQUIT BLWP @>0000            * Quit

This is the top level "forever" loop that contains the state machine. R15 holds the current state, and as long as it is not equal to STQUIT, we will jump to FSM10. BLWP @>0000 performs a power-on reset. >0000 is the address the CPU loads when power is applied, so we are doing the same thing. Currently there is no condition to set R15 to STQUIT, so to end the program you have to power off the console (the QUIT key won't work since interrupts are disabled and it is the console ISR that checks for that key combination.)

FSM10
      CLR  @VSYNC            * VSYNC indicator only active for a single cycle
      CLR  R1
      MOVB @VDPSTA,R1        * Reading clears the VDP sync indicator
      COC  @VSTAT,R1
      JNE  FSM20             * No VSYNC, skip updating the TICK

      INC  @TICK             * Increment the tick
      INC  @VSYNC            * Set the VSYNC indicator

This is the guts of the game loop! Not much to it is there? This stuff is just not really that complicated. First we clear the VSYNC indicator since it is only active for a single loop through the FSM. We also clear R1 because it is about to get the value of the VDP status register and only the MSB will be modified, and when we check the status with COC we need to make sure the LSB is clear.

The MOVB @VDPSTA,R1 reads the VDP's status register into the MSB of R1 and also clears the register in the VDP. COC (compare ones corresponding) checks the VSYNC indicator from the status register. VSTAT was set to >8000 so we are only testing the most significant bit in R1. If there is no VSYNC then we skip forward to the FSM. If the VSYNC indicator was set, we increment the TICK count and set the VSYNC variable to 1.
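If COC is new to you, this small Python model may help. It assumes the usual reading of the instruction: the "equal" status is set when every 1 bit in the source operand is also 1 in the destination. The function names are hypothetical.

```python
VSTAT = 0x8000  # same mask value the post gives for VSTAT

def coc(source, dest):
    """Model of COC: 'equal' when every 1 bit in source is also 1 in dest."""
    return (source & dest) == source

def vsync_seen(status_byte):
    """Put the status byte in the MSB of a cleared R1, then test the top bit."""
    r1 = (status_byte & 0xFF) << 8   # CLR R1 followed by MOVB @VDPSTA,R1
    return coc(VSTAT, r1)
```

Any status byte with its top bit set reports a VSYNC, no matter what the other status bits hold, which is why only the mask >8000 is needed.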

*      Branch to the current state
FSM20
      B    *R15              * SWITCH R15

This is the FSM selection. R15 always holds the address of where the code is for the current state. This instruction simply jumps to the current state, which initially is STINIT.

*      One time initialization
*
STINIT
      BL   @GMODE            * Set the graphics mode
      BL   @LSCS             * Load standard character set
      BL   @OTINIT           * One time initialization

      LI   R15,STRUN         * Set next state
      B    @FSM50            * BREAK

This is the one-time initialization. First we set up the graphics mode, then load a decent character set (I really don't like the default character set!), and finally call a one-time initialization function that will do program-specific stuff (since GMODE and LSCS are both pretty generic and meant to be reused.)

Finally we update R15 with the new state, which will be the "run" state. When the current state is done, we use a branch to jump to the bottom of the FSM for any additional processing that may need to happen. All states should jump to a single location! Just because this is assembly language does not mean we don't need to follow good program flow. We are basically performing code similar to C's WHILE loop and SWITCH statement.

*      Main state when things are running, game is playing, etc.
*
STRUN
      BL   @PLOT
      B    @FSM50            * BREAK

This is the whole "run" state. It calls PLOT which does all the work. Also, we never leave this state since we are not doing any user input and the program is just not complicated enough yet.

*      Every state jumps here when complete so any necessary out-of-state
*      logic or decision making can happen if necessary.
FSM50

FSM99
      B    @FSM00            * WEND
*// MAIN

This is the bottom of the FSM. You would place any "out of state" processing here if necessary, then the code branches back to the top. The FSM uses branch because as your game (or whatever you are writing) grows, the bottom of your FSM will be too far from the top to use JMP.
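Putting the pieces together, the whole loop can be modeled in a few lines of Python. This is only a sketch of the control flow: the state functions, the ctx dictionary, and feeding the loop a list of status bytes are all stand-ins I made up, with a function reference playing the role of the address in R15.

```python
def st_init(ctx):
    ctx["initialized"] = True      # stands in for GMODE, LSCS, OTINIT
    return st_run                  # LI R15,STRUN

def st_run(ctx):
    if ctx["vsync"]:               # PLOT only draws on a VSYNC pass
        ctx["plots"] += 1
    return st_run                  # we never leave the run state

def game_loop(status_bytes):
    """One pass through the loop per entry in status_bytes (hypothetical input)."""
    ctx = {"tick": 0, "vsync": 0, "plots": 0, "initialized": False}
    state = st_init                # LI R15,STINIT
    for status in status_bytes:
        ctx["vsync"] = 0           # CLR @VSYNC
        if status & 0x80:          # top bit of the VDP status byte
            ctx["tick"] += 1       # INC @TICK
            ctx["vsync"] = 1       # INC @VSYNC
        state = state(ctx)         # B *R15, then back to the top
    return ctx, state
```

Each state returns the next state, which mirrors loading R15 and branching to the single exit point at the bottom.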

*********************************************************************
*
* <subroutine skeleton>
*
SKEL
      MOV  R11,*R10+         * Push return address onto the stack

*      Subroutine code here ...

      DECT R10               * Pop return address off the stack
      MOV  *R10,R11
      B    *R11
*// SKEL

This is a skeleton subroutine to copy and paste when adding new subroutines. It contains the code necessary to manage the stack.

Here is the BASIC code we are reproducing below:

100 CALL CLEAR
110 CALL SCREEN(5)
120 A$="007E7E7E7E7E7E7E"
130 FOR I=40 TO 64 STEP 8
140 CALL CHAR(I,A$)
150 NEXT I
160 CALL COLOR(2,8,1)
170 CALL COLOR(3,6,1)
180 CALL COLOR(4,10,1)
190 CALL COLOR(5,15,1)
200 X=INT(RND*32)+1
210 Y=INT(RND*24)+1
220 C=INT(RND*4)*8+40
230 CALL HCHAR(Y,X,C)
240 GOTO 200

The PLOT subroutine. I got the idea from the Raspberry thread where the example of filling the screen with 4-color squares was used to demonstrate the speed. While even in assembly we can't get as fast as the Raspberry demo running on a GHz speed CPU, we do pretty good.

*********************************************************************
*
* Plot a random character
*
PLOT
*      Only draw on the VSYNC
      C    @VSYNC,@NUM01     * If VSYNC is not active, return
      JEQ  PLOT01
      B    *R11
PLOT01
      MOV  R11,*R10+         * Push return address onto the stack

I want to point out that this subroutine has an initial check to see if VSYNC is active, and if not it simply returns. That means this subroutine will only run once every 16.6ms, or 60 times a second. Thus, the screen takes a little while to fill up completely with squares. If the VSYNC is active, we jump down to push the return address on the stack because the rest of the subroutine will call other subroutines (the random number generator and VSBW.)

To see how fast assembly language can be, after you run this code once, comment out those first 3 lines so the routine runs every time it is called. The screen fills up in a few seconds! It's pretty cool and makes a nice effect.

*      Get a random screen location
      BL   @RANDNO           * Get a random number (in R5)
      LI   R3,768
      CLR  R4                * Dividend will be R4,R5
      DIV  R3,R4             * Make a number between 0 and 767
      MOV  R5,R0             * Move to R0 for the VDP routine
      AI   R0,NAMETB         * Adjust to the name table base

This code gets a 16-bit random number in R5 and divides it by 768. DIV leaves the quotient in R4 and the remainder (0 to 767) in R5, and the remainder is our screen location. Remember that the screen is really a linear block of memory 768 bytes long, so to get an X,Y location we only need 1 number, not 2! Once we have our number, we stuff it in R0 to prepare for the call to VSBW (which requires R0 to contain the VRAM address to write to.) We also add the name table offset to R0 so we are writing to the correct location. This is where using the equates comes in handy. We can generate our screen location as a 0-based index, then add that to the real base address of the name table.
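Here is the same arithmetic as a Python sketch, in case the DIV is hard to picture. The function names are hypothetical, and to_row_col is an extra I added to show how the single number maps back to a row and column of the 32x24 screen.

```python
NAMETB = 0x0000  # name table base, as in the equates

def screen_address(rand16):
    """Remainder of a 16-bit value divided by 768, offset by the name table base."""
    quotient, remainder = divmod(rand16 & 0xFFFF, 768)  # DIV leaves these in R4, R5
    return NAMETB + remainder

def to_row_col(address):
    """Convert a name table address back to a 0-based (row, column) pair."""
    offset = address - NAMETB
    return offset // 32, offset % 32
```

For example, a random value of >FFFF gives a remainder of 255, which is row 7, column 31.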

*      Get a random character 40, 48, 56, 64
      BL   @RANDNO           * Get a random number (in R5)
      SRL  R5,14             * Make a number between 0 and 3
      SLA  R5,3              * Multiply by 8 (number is now 0, 8, 16, 24)
      A    @CHR040,R5        * Add to the base character
      MOV  R5,R1
      SWPB R1                * Remember, the MSB goes to the VDP!

This is doing the same thing as above except we are getting a random number between 0 and 3 and using that to select 1 of 4 characters to display. The character value goes into the MSB of R1 for VSBW.
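The shifts are easy to check by hand with a small Python model. This is a sketch, assuming the first word at CHR040 is 40 as in the table; the function name is hypothetical.

```python
def random_tile_r1(rand16):
    """Return (character code, value of R1 after SWPB) for a 16-bit random value."""
    r5 = (rand16 & 0xFFFF) >> 14            # SRL R5,14 -> 0..3
    r5 = (r5 << 3) & 0xFFFF                 # SLA R5,3  -> 0, 8, 16, 24
    r5 = (r5 + 40) & 0xFFFF                 # A @CHR040,R5 (first word of the table is 40)
    r1 = ((r5 << 8) | (r5 >> 8)) & 0xFFFF   # MOV R5,R1 then SWPB R1
    return r5, r1
```

Only the top two bits of the random number matter here, so the four possible results are characters 40, 48, 56 and 64, each ending up in the MSB of R1.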

      BL   @VSBW

      DECT R10               * Pop return address off the stack
      MOV  *R10,R11
      B    *R11
*// PLOT

With R0 and R1 set up, we write the byte to VRAM which displays the character on the screen. Then we clean up the stack and return to the FSM.

DEFTBL
CHR040 DATA 40,8
      DATA >007E,>7E7E,>7E7E,>7E7E
CHR048 DATA 48,8
      DATA >007E,>7E7E,>7E7E,>7E7E
CHR056 DATA 56,8
      DATA >007E,>7E7E,>7E7E,>7E7E
CHR064 DATA 64,8
      DATA >007E,>7E7E,>7E7E,>7E7E
DEFEND

This is a table-based tile (character) pattern setup to help make things easier. The format is:

Tile name (0 to 255), number of pattern bytes

Pattern Data, ...

You can use a label to set up references to specific tiles if necessary. The generic labels here should be replaced with something meaningful. This data will be in 8-bit RAM for an EA3 program, and in 8-bit ROM for a cartridge. The "name" and "length" values could be bytes, but using full words makes the code easier. If you have a lot of individual definitions, you may consider changing to BYTE.

COLTBL DATA >7050,>90E0

The color data is so small, 32 bytes total for all 256 tiles, that using a table layout was overkill.

I tend to place the pattern DATA close to the subroutine that reads them, at least until the code is working, by which time I'm used to it being where it is, so I keep it there...

*********************************************************************
*
* One-Time Initialization
*
OTINIT
      MOV  R11,*R10+         * Push return address onto the stack

*      Initialize tile pattern definitions
      LI   R1,DEFTBL         * Start of definition table
OTI01  MOV  *R1+,R0           * Move the character code into R0
      SLA  R0,3              * Mul by 8 to adjust offset into PGT
      AI   R0,PTRNTB         * Add pattern generator table base
      MOV  *R1+,R2           * Move the byte count into R2
      BL   @VMBW
      CI   R1,DEFEND
      JNE  OTI01             * Loop until end of table

This code loads the pattern data from the table above. Since we are writing multiple bytes to the VDP via VMBW, R1 has to hold the address in CPU RAM of the data to write to the VDP, so we load that address into R1 first.

The first word in the table is the starting character that we are going to write a pattern for, so we move that value to R0 and auto-increment R1 past that word. The next word in the table is the number of bytes to write starting at the character code identified by the 1st word. So, the count goes to R2 and R1 is auto-incremented past that word. Now R1 is pointing at the start of the actual pattern data, R2 holds the count, and R0 holds the starting character.

Now, R0 needs two modifications. First, since each character requires 8 bytes of pattern data, we have to multiply the character code by 8 to get the proper offset into the pattern generator table. So we do that with SLA (shift left arithmetic). In case you do not know, shifting binary values left multiplies by 2, and shifting right divides by 2. This works the same way as moving the decimal point in decimal numbers multiplies or divides by 10. So, shifting left 3 positions multiplies by 8 (2x2x2). Then we add the pattern table base to R0 which is the final VRAM location for the specified character's pattern data.

Then we call VMBW to write the data, and finally check if we are at the end of the table. If not, we go back and start over reading the character code to write pattern data for, the number of bytes that follow, and the next set of pattern data.

Note that if you were defining patterns for consecutive characters, you would just include the pattern data and set the "count" value accordingly. You don't have to set up each character. In this case, the characters were spaced 8 apart to get each one in a different color group.
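The table walk can be modeled in Python as a sanity check of the address math. This is a sketch under a few assumptions: the table is flattened to one value per entry (tile name, count, then individual pattern bytes rather than words), VRAM is a plain dictionary, and load_patterns is a made-up name.

```python
PTRNTB = 0x2000  # pattern generator table base, as in the equates

SQUARE = [0x00, 0x7E, 0x7E, 0x7E, 0x7E, 0x7E, 0x7E, 0x7E]
# Same layout as DEFTBL: tile name, byte count, then the pattern bytes.
DEFTBL = [40, 8, *SQUARE, 48, 8, *SQUARE, 56, 8, *SQUARE, 64, 8, *SQUARE]

def load_patterns(table, vram):
    """Copy each entry's pattern bytes to PTRNTB + tile * 8 in a VRAM model."""
    i = 0
    while i < len(table):                      # CI R1,DEFEND / JNE OTI01
        tile, count = table[i], table[i + 1]   # the two MOV *R1+ reads
        i += 2
        address = PTRNTB + (tile << 3)         # SLA R0,3 then AI R0,PTRNTB
        for offset in range(count):            # the VMBW call
            vram[address + offset] = table[i + offset]
        i += count
    return vram
```

With the bases used here, character 40 lands at >2140 and character 64 at >2200.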

*      Set colors
      LI   R0,COLRTB+5       * Start with color set 5 (char 40)
      LI   R1,COLTBL
      LI   R2,4
      BL   @VMBW

This simply writes the color data and should be self explanatory by now. If not, ask questions...

GMODE
      MOV  R11,*R10+         * Push return address onto the stack

      CLR  R0                * M3 is bit 6 and is off for Graphics I
      BL   @VWTR

*      This is the "busy" register
      LI   R0,>01E0          * 11100000 Graphics I
      BL   @VWTR             * 16K,No Blank,Enable Int,M1,M2,0,8x8,No Mag

      LI   R0,>0200          * Name Base Table to >0000 - >02FF (768 bytes)
      BL   @VWTR

      LI   R0,>030C          * Color Table to >0300 - >031F (32 bytes)
      BL   @VWTR

      LI   R0,>0404          * Pattern Generator Table
      BL   @VWTR             * >2000 - >27FF (2048 bytes)

      LI   R0,>0507          * Sprite Attribute Table
      BL   @VWTR             * >0380 - >03FF (128 bytes)

      LI   R0,>0605          * Sprite Pattern Table
      BL   @VWTR             * >2800 - >2BFF (1024 bytes)

      LI   R0,>0380          * Disable all sprite processing by writing
      LI   R1,>D000          * >D0 (208) to the vertical position of the
      BL   @VSBW             * first sprite entry

*      Set colors
      LI   R0,>07F4          * R7 is the text-mode color and border color
      BL   @VWTR             * White on dark blue

      LI   R0,>0300          * Start of color table
      LI   R1,>F400          * White on dark blue
      LI   R2,>0020          * All color table entries (32 bytes)
      BL   @VSMW

This is a complete "set the VDP" subroutine. The comments should let you know what's going on. Basically it runs through every VDP write-only register and sets each to a specific value, which is the only way to know what is in the registers since they are write only. Sprites are disabled and finally the background (border) color is set. Also, all the character sets are defaulted to the same foreground/background color scheme.
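The register values and the table addresses in the comments are tied together by fixed multipliers. Here is a Python sketch of that relationship for Graphics I mode, using the multipliers I know for the 9918A (>400, >40, >800, >80, >800 for registers 2 through 6). The function name is hypothetical.

```python
def table_bases(reg2, reg3, reg4, reg5, reg6):
    """VRAM table addresses implied by VDP registers 2-6 in Graphics I mode."""
    return {
        "name": (reg2 & 0x0F) * 0x400,
        "color": (reg3 & 0xFF) * 0x40,
        "pattern": (reg4 & 0x07) * 0x800,
        "sprite_attr": (reg5 & 0x7F) * 0x80,
        "sprite_pattern": (reg6 & 0x07) * 0x800,
    }
```

Feeding it the low bytes loaded above (>00, >0C, >04, >07, >05) gives >0000, >0300, >2000, >0380 and >2800, which match the NAMETB, COLRTB and PTRNTB equates.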

LSCS
      MOV  R11,*R10+         * Push return address onto the stack

      LI   R0,>2000          * Start at the space character
      LI   R1,SCS1
      LI   R2,SCS1E-SCS1
      BL   @VMBW

      DECT R10               * Pop return address off the stack
      MOV  *R10,R11
      B    *R11
*// LSCS

This loads the "standard character set" from the data I posted very early on in this thread. The data is also included in the complete source .zip download. This is older code that I copied and pasted, so you can see it does not use the equates we set up for the VDP table locations. Note how R0 is loaded with a value that assumes the pattern generator table is at >2000. It is in this case, but we should really fix this to be consistent. The comment is also wrong: the data starts with character >00, not the space >20 (32 decimal).

      LI   R0,PTRNTB

There, that fixes it. :-)

I think that is it except for the RNG and VDP routines which have been covered already (the RNG has its own thread.) Next time I'll be adding support for reading the joystick so we can get some user input and I'm going to develop a "scrolling within a window" so Owen will have something to mess with.

Side Note: While I appreciate the feedback everyone has given, no one is asking questions... So, either everyone knows all this already, or no one is trying out the code. Either way, I'll continue to post, but I'd like to know if I'm going over stuff people want to learn about, or if this is helping anyone at getting started with assembly? I'm trying to get into the guts of the game stuff, but there was a lot of necessary boring evil that had to be gone through first.

Matthew

Edited May 28, 2010 by matthew180

Opry99er · May 28, 2010

Excellent explanations!!!!

sometimes99er · May 28, 2010

Marvellous. Absolutely brilliant. Generally very close to my own present TI style/framework, so I can't complain much. My "GMODE" is still "rolled out", while Mark Wills had a loop set all VDP registers years ago and it's actually saving quite a few bytes there. I don't know why I never got around to do it like that. I'll post the differences next week, if nobody else does.

I always wondered why JMP and B didn't exchange automatically whenever needed/possible and maybe with a note in the compile status output. Sometimes JNE etc. becomes out of range too. It's 2 extra bytes coming into play every time, and sometimes it all counts. Would be nice to leave this (trouble) to some later stage (optimization).

Hehe, I like you're using my little demo there. Nice spillover effect. Any small tricks of trade will have impact on my code.

:thumbsup:

matthew180 · May 28, 2010

Marvellous. Absolutely brilliant. Generally very close to my own present TI style/framework, so I can't complain much. My "GMODE" is still "rolled out", while Mark Wills had a loop set all VDP registers years ago and it's actually saving quite a few bytes there. I don't know why I never got around to do it like that. I'll post the differences next week, if nobody else does.

That is a good idea, I didn't think about putting the registers and values into DATA statements and setting them in a loop. It would certainly be smaller. However, for the tutorial it might have been more confusing and having it unrolled is probably good for learning and understanding. I'll roll it up into a loop with a DATA statement in a future evolution of the code.

I always wondered why JMP and B didn't exchange automatically whenever needed/possible and maybe with a note in the compile status output. Sometimes JNE etc. becomes out of range too. It's 2 extra bytes coming into play every time, and sometimes it all counts. Would be nice to leave this (trouble) to some later stage (optimization).

All the "jump" instructions are limited the same way. Their opcodes are specified like this:

 0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|            OPCODE             |         DISPLACEMENT          |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

mnemonic  opcode     meaning
-------------------------------------
 JEQ    00010011  Jump equal
 JGT    00010101  Jump greater than
 JH     00011011  Jump high
 JHE    00010100  Jump high or equal
 JL     00011010  Jump low
 JLE    00010010  Jump low or equal
 JLT    00010001  Jump less than
 JMP    00010000  Jump unconditional
 JNC    00010111  Jump no carry
 JNE    00010110  Jump not equal
 JNO    00011001  Jump no overflow
 JOC    00011000  Jump on carry
 JOP    00011100  Jump odd parity

BLWP, BL, B have this format:

 0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|            OPCODE                     |   Ts  |       S       |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Where Ts is a source address modifier, and S is the source address register (Td and D being for a destination.) The modifiers allow the addressing modes available on the TMS9900 as follows:

Ts or Td |  S or D   | Addressing mode
---------+-----------+-------------------------------------------
   00    | 0,1,...15 | Workspace register
   01    | 0,1,...15 | Workspace register indirect
   10    |     0     | Symbolic
   10    | 1,2,...15 | Indexed
   11    | 0,1,...15 | Workspace register indirect auto-increment

So, any time you see an instruction with Ts,S and/or Td,D all the addressing modes are available for that operand (either the source or the destination.) Since the branch instructions have a Ts,S in the opcode, the address to branch to is going to be represented by either a memory location or a register, both of which are a full 16-bits, and hence do not have the "distance" limitation of the jump instructions.
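To put numbers on the "distance" limitation: the 8-bit displacement is a signed count of words relative to the address after the jump instruction. The following Python sketch computes it and builds the instruction word. The function names are hypothetical and the addresses in the example are arbitrary.

```python
def jump_displacement(source, target):
    """Signed word displacement for a jump at `source`, or None if out of range."""
    delta = target - (source + 2)       # relative to the word after the jump
    if delta % 2:
        return None                     # instructions sit on even addresses
    disp = delta // 2
    return disp if -128 <= disp <= 127 else None

def encode_jump(opcode, source, target):
    """Build the 16-bit instruction word: 8-bit opcode, 8-bit displacement."""
    disp = jump_displacement(source, target)
    if disp is None:
        raise ValueError("target out of range, use B instead")
    return (opcode << 8) | (disp & 0xFF)
```

So a jump at >1000 can reach anything from >0F02 to >1100, and a JMP to itself (JMP $) encodes as >10FF. Anything farther away needs a B.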

Hehe, I like you're using my little demo there. Nice spillover effect. Any small tricks of trade will have impact on my code.

It is a cool little demo. It just struck me, kind of like FlyGuy did, and I wanted to see what assembly language could do with it.

Matthew

sometimes99er · May 28, 2010

That is a good idea, I didn't think about putting the registers and values into DATA statements and setting them in a loop. It would certainly be smaller. However, for the tutorial it might have been more confusing and having it unrolled is probably good for learning and understanding. I'll roll it up into a loop with a DATA statement in a future evolution of the code.

Mark had the data, not the registers in DATA. The loop was the VDP registers, sort of. The DATA was hence only 8 bytes.

sometimes99er · May 28, 2010

Sometimes when JNE becomes out of range, I just quickly change it to something like

JEQ $+

B “original label”

I just thought it would be nice if the assembler could do that back and forth. I have to get into all those instruction bits when compiling directly.

+InsaneMultitasker · May 28, 2010

Beware self-modifying code... I often do things like this versus using a stack...

 
SUB1   MOV  R11,@SUB1RT+2
      ...
SUB1RT B  @0

Matthew - this is an awesome thread you've got running. This is giving me some great ideas for a few programs I'd have trouble writing without getting out of my dsr/utility/input-driven serial-event processing mindset.

Edited May 28, 2010 by InsaneMultitasker

matthew180 · May 28, 2010

Beware self-modifying code... I often do things like this versus using a stack...
 
SUB1   MOV  R11,@SUB1RT+2
      ...
SUB1RT B  @0

Now THAT is totally awesome! I always forget that we can modify any memory address on our little machine (too much time coding on stupid "modern" computers I guess.) I have no problems with code like this, it is fast, compact, and totally understandable. For those who might not understand what is going on, I'll go over it in detail in another post. I'm going to have to change to this method of subroutine calling I think. :-)
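Until that post, here is a quick Python model of what the trick does. It is only a sketch: memory is a dictionary of words, the addresses are made up, and it assumes B @address assembles to the opcode word >0460 followed by the address word, so SUB1RT+2 is the operand of the B.

```python
B_SYMBOLIC = 0x0460  # opcode word for B @address (symbolic addressing)

def call_sub1(memory, sub1rt, return_address):
    """Model of SUB1: patch the operand of the B at SUB1RT, then 'return'."""
    memory[sub1rt + 2] = return_address   # MOV R11,@SUB1RT+2
    # ... body of the subroutine, free to BL elsewhere and clobber R11 ...
    return memory[sub1rt + 2]             # B @<patched address>
```

The return address lives inside the instruction itself, so no stack and no spare register are needed, but the code has to sit in writable memory.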

Matthew - this is an awesome thread you've got running. This is giving me some great ideas for a few programs I'd have trouble writing without getting out of my dsr/utility/input-driven serial-event processing mindset.

Thanks. Personally I get a lot from reading other people's code; see how they solved the problem and what little bits of cleverness I can get out of it. It does not have to be low level stuff either. Take FlyGuy for example and the way Codex generated a complete level from a single number. Totally awesome.

Definitely if you get stuck in a certain mind set, trying to write something completely different based on example code can help get you out of a rut. I can't wait to see what you come up with!

Matthew

+InsaneMultitasker · May 28, 2010

Beware self-modifying code... I often do things like this versus using a stack...
 
SUB1   MOV  R11,@SUB1RT+2
      ...
SUB1RT B  @0
Now THAT is totally awesome! I always forget that we can modify any memory address on our little machine (too much time coding on stupid "modern" computers I guess.) I have no problems with code like this, it is fast, compact, and totally understandable. For those who might not understand what is going on, I'll go over it in detail in another post. I'm going to have to change to this method of subroutine calling I think. :-)

Two caveats I might mention:

1. The return will 'fail' if the code runs from a Read-only memory device - you can't modify ROM in-line.

2. If you use this trick in every subroutine, you can call any sub from another sub. However, if a sub doesn't call another sub, you will incur a small, small performance hit for each iteration. For a while I lazily did the former - or was it for consistency?

Just returned home with some coffee and a slice of Cheesecake Factory cheesecake. If I can survive my food coma, I may just get into the TI stuff this afternoon. :cool:

Opry99er · May 29, 2010

So I'm making some cool progress and it's very much thanks to this thread. Started going through and just typing everything in. It's exciting to see something work after it's been assembled. I'm trying to learn this KSCAN stuff right now so I can make this thing a navigable map. I am really hoping to make that happen later today. Just got done playing and I'm about to crash hard. sleeeepy time!!! I'm hoping to wake up refreshed and ready to code, as I intend on making some significant progress!!! Thanks to Matthew for this thread and all the talented programmers who participate. If I ever get this off the ground and (maybe someday) playable, it will be due to you guys... Matthew, Marc, Adamantyr, sometimes, Mark, Tursi, and the rest. How damn lucky are we to have this kind of talent!!??

Opry99er · May 29, 2010

As far as I understand the 99/4A does do debouncing, using the delay routine that Sometimes99er posted. I've seen it run when you press a key in the Classic99 debugger.

Fair point that my code does not do debounce, but for the purposes I'm using it (game input only) it should be fine, but you're right that I should try it on real hardware. In theory the only error from lack of debounce that you should see is repeated keys on a single keypress - the wrong key should not come up.

Correct that the PS/2 keyboards do their own debounce internally, you don't need to worry about it from the other end of the cable.

To see my KSCAN code just go grab the TI Farmer assembly source from the TI Farmer thread - all my XB support functions are in there, and Owen, just for you I left in the XB lines of code, so you can see how I translated them. Not that my approach is the only way, but I thought you might appreciate it.

THANKS TURSI!!!! =) I am taking a glance now. I'm having a bit of a problem however... here's my source for my scroll. It's not working properly. It draws the 14x14 window, but displays a bunch of insanity instead of my map... it will move up and down fast as greased lightning, but the side to side are slow, and I think the math is bad somewhere... Any help you can give me would be great, guys... I'm really hoping to create a nice little walking tour of this world... so far, I just don't have the stuff. Please let me in on some efficiency help too. I know this can be done in a smaller code than what I've done here. =) Thanks!!!

DEF  START
  	REF  VSBW,VMBW,VWTR,KSCAN
WS 	EQU  >8300                  	* Workspace in scratch-pad
COLTAB EQU  >0380
CHRPAT DATA >0800
START  LWPI WS             			* Load workspaces
  	LI   R0,>0701       			* Set screen to white
  	BLWP @VWTR
  	CLR  R0             			* Clear screen
  	LI   R1,>2020       			* Inefficient but effective
  	LI   R2,768
CLOOP  BLWP @VSBW
  	INC  R0
  	DEC  R2
  	JNE  CLOOP
  	LI   R0,COLTAB              	* Populate color table
  	LI   R1,CLRSET
  	LI   R2,32
  	BLWP @VMBW
  	LI   R1,PATSET              	* Load PATSET address into R1
  	LI   R2,8           			* Set R2 to 8
PLOOP  MOV  *R1+,R0                	* Move value at R1 into R0, increment R1
  	CI   R0,>FFFF       			* Check if >FFFF (end patterns)
  	JEQ  DRWMAP         			* If so, jump to mapdraw
  	SLA  R0,3           			* Multiply R0 by 8 (offset in pattern table)
  	A	@CHRPAT,R0     			* Add pattern location to base
  	BLWP @VMBW                  	* Write pattern
  	AI   R1,8           			* Increment R1 by 8
  	JMP  PLOOP                  	* Loop pattern writing
LI   R12,MAPDAT     			*usable register to hold mapdat "status"
DRWMAP LI   R5,14                  	* Draw map
  	LI   R0,35                  	*Starting screen position
  	MOVB R12,R1   		*Load current mapdat location into R1
  	LI   R2,14                  	*Set print length
DLOOP  BLWP @VMBW                  	*print it
  	AI   R1,88                  	*carriage return for map data
  	AI   R0,32                  	*carriage return for screen
  	DEC  R5             			*loop counter
  	JNE  DLOOP                  	*if R5 is not 0, jump back to DLOOP to continue
****END DRAW SCREEN ROUTINE
LI R1,>0100       			*check for joystick 1 input
MOVB R1,@>8374              	*check for Y return
LP	BLWP @KSCAN    	*check for input
CLR  R1             			*clear register for Y return check
MOVB @>8376,R1              	*move Y return value into MSB of R1
CI   R1,>0400       			*is the Y value "4"? (up)
JNE  T1             			*if not, go to the next comparison
AI   R12,-88                	*move was "up", so subtract 88 from R12
JMP  DRWMAP         			*jumps to the draw routine
T1 	CI   R1,>FC00       			*compares MSB of R1 to >FC (down)
JNE  T2             			*if not, go to the next comparison
  	AI   R12,88         			*move was "down" so add 88 to R12
JMP  DRWMAP         			*jump to draw routine
T2	MOVB @>8377,R1            	*move X-return byte into R1
CI   R1,>0400       			*is the X value "4"? (right)
JNE  T3             			*if not, go to the next comparison
INC  R12                    	*move was "right" so add 1 to R12
JMP  DRWMAP         			*jump to draw routine
T3	CI   R1,>FC00       			*final comparison... is the X value >FC? (left)
JNE  LP             			*it was not, therefore no motion happened, jump to KSCAN
DEC  R12                    	*move was "left" so subtract 1 from R12
JMP DRWMAP                  	*jump to draw routine

LAVACLR.zip

Opry99er · May 29, 2010

Oh and just as a disclaimer... I'm using all the built in routines (VMBW, etc) just til I can get the hang of it. =) I definitely understand the usefulness of having custom routines, now that it's been explained to me by Matthew.

+InsaneMultitasker · May 29, 2010

       LI   R12,MAPDAT                         *usable register to hold mapdat "status"
DRWMAP  LI   R5,14                       * Draw map
       LI   R0,35                      *Starting screen position
       MOVB R12,R1             *Load current mapdat location into R1
       LI   R2,14

Quick observation: Your MOVB is suspect... did you intend a MOV?
