Assembly on the 99/4A


594 replies to this topic

#51 marc.hull

Posted Wed May 26, 2010 11:42 AM

Matthew,

I read ALL the previous posts and have come to the conclusion that I shot my mouth off before I knew what your intention was. I thought you were naysaying the interrupt when you were simply avoiding the built-in routine. My apologies for assuming your meaning. I'll go back to lurking now... ;-)

#52 matthew180 (topic starter)

Posted Wed May 26, 2010 11:45 AM

This is the highly anticipated (well, for Owen anyway. ;-) ) game loop post! There is a lot to cover, and because I already posted the VDP routines and the complete character set data, I'm only going to include the guts here. The complete code is in the .zip file, along with an XB program listing of what this code is doing.

This code assembles with TI's E/A and runs on Classic99 without any problems. I will be breaking it down in the next few posts, but I wanted to get this out so anyone following along could start to mess with it.

Matthew

       DEF  MAIN

**
* VDP Memory Map
VDPRD  EQU  >8800             * VDP read data
VDPSTA EQU  >8802             * VDP status
VDPWD  EQU  >8C00             * VDP write data
VDPWA  EQU  >8C02             * VDP set read/write address

**
* Workspace
WRKSP  EQU  >8300             * Workspace
R0LB   EQU  WRKSP+1           * R0 low byte reqd for VDP routines
R1LB   EQU  WRKSP+3           * R1 low byte reqd for VDP routines
R2LB   EQU  WRKSP+5           * R2 low byte reqd for VDP routines

**
* VRAM Base Locations (must match the values set up in the
* set video mode subroutine.)
NAMETB EQU  >0000             * Name table base
PTRNTB EQU  >2000             * Pattern generator table base
COLRTB EQU  >0300             * Color table base


**
* General Workspace Use:
*
* R0  VDP routines will modify
* R1  VDP routines will modify
* R2  VDP routines will modify
* R3
* R4  Random number routine will modify
* R5  Random number routine will modify
* R6
* R7
* R8
* R9
* R10 Stack pointer
* R11 Return address
* R12
* R13
* R14 Local function state
* R15 Main state machine state


**
* Scratch pad RAM use - Variables
*
*           >8300             * Workspace
*           >831F             * Bottom of workspace
STACK  EQU  >8320             * Subroutine stack, grows up (8 bytes)
*           >8322             * The stack is maintained in R10 and
*           >8324             * supports up to 4 BL calls
*           >8326
TICK   EQU  >8328             * 1 tick every 16.6ms (rolls after 18.2 mins)
VSYNC  EQU  >832A             * 1 when VSYNC is detected, otherwise 0

* Random Number Memory Map
RAND16 EQU  >83C0             * 16-bit random number
RAND8  EQU  >83C1             * 8-bit random number


**
* Runtime Constants
* In an EA3 program these will be in 8-bit RAM, in a cartridge they
* will be in 8-bit ROM.
*
VSTAT  DATA >8000             * VDP vsync status
NUM01  DATA 1                 * 16-bit number 1


**
* Program execution starts here
MAIN   LIMI 0
       LWPI WRKSP

*      Initialize the call stack and Finite State Machine (FSM)
       LI   R10,STACK         * Set up the stack pointer
       LI   R15,STINIT        * Initial state, one-time initialization
       CLR  @TICK             * Clear the tick counter


**
*      Finite State Machine (FSM)
FSM00
       CI   R15,STQUIT        * WHILE R15 != STQUIT
       JNE  FSM10
STQUIT BLWP @>0000            * Quit

*      Since interrupts are disabled to prevent the console ISR and GPL
*      from wrecking things, use of the nice VDP interrupt is not possible,
*      however the VDP vsync signal can be polled which will be close enough
*      for a game and gives a good 1/60th of a second clock.
FSM10
       CLR  @VSYNC            * VSYNC indicator only active for a single cycle
       CLR  R1
       MOVB @VDPSTA,R1        * Reading clears the VDP sync indicator
       COC  @VSTAT,R1
       JNE  FSM20             * No VSYNC, skip updating the TICK

       INC  @TICK             * Increment the tick
       INC  @VSYNC            * Set the VSYNC indicator

*      Branch to the current state
FSM20
       B    *R15              * SWITCH R15


*      One time initialization
*
STINIT
       BL   @GMODE            * Set the graphics mode
       BL   @LSCS             * Load standard character set
       BL   @OTINIT           * One time initialization

       LI   R15,STRUN         * Set next state
       B    @FSM50            * BREAK


*      Main state when things are running, game is playing, etc.
*
STRUN
       BL   @PLOT
       B    @FSM50            * BREAK


*      Every state jumps here when complete so any necessary out-of-state
*      logic or decision making can happen if necessary.
FSM50

FSM99
       B    @FSM00            * WEND
*// MAIN


*********************************************************************
*
* <subroutine skeleton>
*
SKEL
       MOV  R11,*R10+         * Push return address onto the stack

*      Subroutine code here ...

       DECT R10               * Pop return address off the stack
       MOV  *R10,R11
       B    *R11
*// SKEL


*********************************************************************
*
* Plot a random character
*
PLOT
*      Only draw on the VSYNC
       C    @VSYNC,@NUM01     * If VSYNC is not active, return
       JEQ  PLOT01
       B    *R11
PLOT01
       MOV  R11,*R10+         * Push return address onto the stack

*      Get a random screen location
       BL   @RANDNO           * Get a random number (in R5)
       LI   R3,768
       CLR  R4                * Dividend will be R4,R5
       DIV  R3,R4             * Make a number between 0 and 767
       MOV  R5,R0             * Move to R0 for the VDP routine
       AI   R0,NAMETB         * Adjust to the name table base

*      Get a random character 40, 48, 56, 64
       BL   @RANDNO           * Get a random number (in R5)
       SRL  R5,14             * Make a number between 0 and 3
       SLA  R5,3              * Multiply by 8 (number is now 0, 8, 16, 24)
       A    @CHR040,R5        * Add to the base character
       MOV  R5,R1
       SWPB R1                * Remember, the MSB goes to the VDP!

       BL   @VSBW

       DECT R10               * Pop return address off the stack
       MOV  *R10,R11
       B    *R11
*// PLOT


**
* Table based tile patterns to help make things easier
*
* Format: Tile name (0 to 255), number of pattern bytes
*         Pattern Data, ...
*
* Use a label to set up references to specific tiles.  The
* generic labels here should be replaced with something
* meaningful.
*
* This data will be in 8-bit RAM for an EA3 program, and in
* 8-bit ROM for a cartridge.
*
* The "name" and "length" values could be bytes, but using
* full words makes the code easier.  If you have a lot of
* individual definitions, you may consider changing to BYTE.
*
DEFTBL
CHR040 DATA 40,8
       DATA >007E,>7E7E,>7E7E,>7E7E
CHR048 DATA 48,8
       DATA >007E,>7E7E,>7E7E,>7E7E
CHR056 DATA 56,8
       DATA >007E,>7E7E,>7E7E,>7E7E
CHR064 DATA 64,8
       DATA >007E,>7E7E,>7E7E,>7E7E
DEFEND

**
* Color data is straight up, since there can only be 32
* bytes total, a table format is not really necessary.
COLTBL DATA >7050,>90E0


*********************************************************************
*
* One-Time Initialization
*
OTINIT
       MOV  R11,*R10+         * Push return address onto the stack

*      Initialize tile pattern definitions
       LI   R1,DEFTBL         * Start of definition table
OTI01  MOV  *R1+,R0           * Move the character code into R0
       SLA  R0,3              * Mul by 8 to adjust offset into PGT
       AI   R0,PTRNTB         * Add pattern generator table base
       MOV  *R1+,R2           * Move the byte count into R2
       BL   @VMBW
       CI   R1,DEFEND
       JNE  OTI01             * Loop until end of table

*      Set colors
       LI   R0,COLRTB+5       * Start with color set 5 (char 40)
       LI   R1,COLTBL
       LI   R2,4
       BL   @VMBW

       DECT R10               * Pop return address off the stack
       MOV  *R10,R11
       B    *R11
*// OTINIT


*********************************************************************
*
* Sets the graphics mode
*
*                             * Bit value 128 64 32 16  8  4  2  1
*                             * Bit order   0  1  2  3  4  5  6  7
GMODE
       MOV  R11,*R10+         * Push return address onto the stack

       CLR  R0                * M3 is bit 6 and is off for Graphics I
       BL   @VWTR

*      This is the "busy" register
       LI   R0,>01E0          * 11100000 Graphics I
       BL   @VWTR             * 16K,No Blank,Enable Int,M1,M2,0,8x8,No Mag

       LI   R0,>0200          * Name Base Table to >0000 - >02FF (768 bytes)
       BL   @VWTR

       LI   R0,>030C          * Color Table to >0300 - >031F (32 bytes)
       BL   @VWTR

       LI   R0,>0404          * Pattern Generator Table
       BL   @VWTR             * >2000 - >27FF (2048 bytes)

       LI   R0,>0507          * Sprite Attribute Table
       BL   @VWTR             * >0380 - >03FF (128 bytes)

       LI   R0,>0605          * Sprite Pattern Table
       BL   @VWTR             * >2800 - >2BFF (1024 bytes)

       LI   R0,>0380          * Disable all sprite processing by writing
       LI   R1,>D000          * >D0 (208) to the vertical position of the
       BL   @VSBW             * first sprite entry

*      Set colors
       LI   R0,>07F4          * VDP R7 is the text-mode color and border color
       BL   @VWTR             * White on dark blue

       LI   R0,>0300          * Start of color table
       LI   R1,>F400          * White on dark blue
       LI   R2,>0020          * All color table entries (32 bytes)
       BL   @VSMW

       DECT R10               * Pop return address off the stack
       MOV  *R10,R11
       B    *R11
*// GMODE


*********************************************************************
*
* Load a nice character set
*
LSCS
       MOV  R11,*R10+         * Push return address onto the stack

       LI   R0,>2000          * Start at the space character
       LI   R1,SCS1
       LI   R2,SCS1E-SCS1
       BL   @VMBW

       DECT R10               * Pop return address off the stack
       MOV  *R10,R11
       B    *R11
*// LSCS


*********************************************************************
*
* Generates a weak pseudo random number and places it in RAND16
*
* R4   - Destroyed
* R5   - 16-bit random number and stored in RAND16 for next round
*
RANDNO
       LI   R4,28643          * A prime number to multiply by
       MPY  @RAND16,R4        * Multiply by last random number
       AI   R5,31873          * Add a prime number
       MOV  R0,R4             * Save R0
       MOV  @TICK,R0          * Use the VSYNC tick to mix it up a little
       ANDI R0,>000F          * Check if shift count is 0
       JEQ  RAND01            * A 0 count means shift 16, which is a wash
       SRC  R5,0              * Mix up the number to break odd/even pattern
RAND01 MOV  R5,@RAND16        * Save this number for next time
       MOV  R4,R0             * Restore R0
       B    *R11
*// RANDNO


#53 Opry99er

Posted Wed May 26, 2010 11:47 AM

Don't lurk, Marc!!! :) we need your input--- I especially do. Now that Beryl is taking a turn for the assembly. :)

#54 matthew180

Posted Wed May 26, 2010 11:56 AM

marc.hull wrote:
> Matthew,
>
> I read ALL the previous posts and have come to the conclusion that I shot my mouth off before I knew what your intention was. I thought you were naysaying the interrupt when you were simply avoiding the built-in routine. My apologies for assuming your meaning. I'll go back to lurking now... ;-)

Hey, no problem, it's all good. I'm trying to be factual and simply present my take on things, but if I screw something up I totally expect someone to call me out on it. ;-)

Matthew

#55 adamantyr

Posted Wed May 26, 2010 12:21 PM

matthew180 wrote:
> Hey, no problem, it's all good. I'm trying to be factual and simply present my take on things, but if I screw something up I totally expect someone to call me out on it. ;-)


For your demonstration, breaking up your program into a bunch of smaller subroutines connected by Branch and Links is pretty good.

My experience with my CRPG design, though, was that you gained a lot of mileage by figuring out what was truly a subroutine, and what could be done by just railing the code all into one large linear sequence. For example, I have multiple steps taken during travel mode. It loads map data, it loads character graphics, it sets up the mobile objects... but it's all one routine, broken down by symbolic labels into smaller pieces. Based on a status value, I redirect to an earlier or later position in the linear sequence, depending on what is needed. For example, status screens have their own character set, so checking for a character set change is a last step, so I can easily restore things without re-loading a map or replacing mobs.

I also put my one-time-only initialization at the very start of the code... and then use that space as buffer afterward. NOT a modern-day programming technique at all, but for a vintage system it makes perfect sense.

Adamantyr

#56 matthew180

Posted Wed May 26, 2010 12:41 PM

adamantyr wrote:
> I also put my one-time-only initialization at the very start of the code... and then use that space as buffer afterward. NOT a modern-day programming technique at all, but for a vintage system it makes perfect sense.


*That* is a good idea! But, it only works for EA3 programs or those that are loaded. For a ROM cartridge, the code is, well, in ROM. :-) But, for a smaller game, the gain is probably small. For an RPG though, I could see a lot of useful space. Nice tip, thanks!

Matthew

#57 Tursi

Posted Wed May 26, 2010 2:39 PM

Looks like Marc came around to our side eventually. :) My point was exactly as Sometimes99er correctly emphasized (Thank you!). If you are using the interrupt as LIMI 2/LIMI 0, then it would have exactly the same effect to check the status byte yourself, and call your interrupt function yourself (thus bypassing the console ROM code).

MAINLP LIMI 2  * This is the only place the console interrupt code could run anyway!
 LIMI 0

versus

MAINLP MOV @VDPST,R0  * get the VDP status register
 COC @BLANK,R0  * test the vertical interrupt bit (which was reset by the read)
 JNE NOBLNK     * bit not set, so no vertical interrupt this time around
 BL @MYINT      * it was set, so call our custom interrupt code
NOBLNK
* carry on here

It's a little more code inline, but substantially less code executed, especially if you're only interested in your own function. ;)

As for the scratchpad usage, I found KSCAN inconvenient since it uses the lower part of the scratchpad (>8374,>8375). I wrote a really simple ASCII-only replacement for TI Farmer - by disabling ints and using my own KSCAN the whole scratchpad is now available to me.

Note that doesn't HAVE to be a concern. If you store your variables carefully in scratchpad, or even in VDP, you need not worry about it. It's just another way to do things!

Anyway, I don't mean to derail it, I'm just a bit slow to get back in the conversation. ;)

So back to the show! ;)

Edited by Tursi, Wed May 26, 2010 2:40 PM.


#58 Opry99er

Posted Wed May 26, 2010 2:47 PM

I'd be interested in seeing that KSCAN alternative. :) I'm not quite ready to use it yet, but I have a feeling I'll be needing all the extra memory I can get with this game. :) This is such a cool thread. :thumbsup:

#59 sometimes99er

Posted Wed May 26, 2010 3:17 PM

I think the console KSCAN uses this time delay

Time delay
0498 020C LI 12,>04E2 Loop counter
049A 04E2
049C 060C DEC 12
049E 16FE JNE >049C
04A0 045B B *11
That's probably why it is considered so slow. The reason for this I think is to let the keyboard matrix settle after being pulled (voltage).

If you're making your own KSCAN without a delay, you should call it only once per VDP interrupt. If you keep reading keys in a tight loop to detect a keypress, the real hardware might return wrong keys, as I understand it.

:)

#60 matthew180

Posted Wed May 26, 2010 3:38 PM

The "game loop" code I posted (and that I'll be going over in detail soon) polls the VDP status. This is almost verbatim from the FlyGuy II code:
FSM10
       CLR  @VSYNC            * VSYNC indicator only active for a single cycle
       CLR  R1
       MOVB @VDPSTA,R1        * Reading clears the VDP sync indicator
       COC  @VSTAT,R1
       JNE  FSM20             * No VSYNC, skip updating the TICK

       INC  @TICK             * Increment the tick
       INC  @VSYNC            * Set the VSYNC indicator

*      Branch to the current state
FSM20
       B    *R15              * SWITCH R15

Matthew

#61 matthew180

Posted Wed May 26, 2010 3:51 PM

sometimes99er wrote:
> That's probably why it is considered so slow. The reason for this I think is to let the keyboard matrix settle after being pulled (voltage).
>
> If you're making your own KSCAN without a delay, you should call it only once per VDP interrupt. If you keep reading keys in a tight loop to detect a keypress, the real hardware might return wrong keys, as I understand it.


The 99/4A does not debounce the keyboard?? Hmm. Well, I can already think of a better way to do this than using a delay loop. This will be an interesting topic when I get to keyboard handling.

A side note: Tursi's PS/2 keyboard adapter should, by default, be providing key debouncing since he is controlling the 99/4A's keyboard connector with an IC instead of mechanical switches.

So, no debouncing... That would explain why the 99/4A has such sporadic keyboard input.

Another side note, for anyone who does not know what debouncing is: when you have a mechanical input to a computer or electronic circuit, like a key being pressed, the physical connection of the switch is electrically noisy. The metal contacts coming together and breaking apart can cause spikes and jumps, which to a fast computer can look like the switch was pushed multiple times. Debouncing *usually* means sampling the switch over time and accepting the input only once it has held the same value for a given amount of time. The sample time varies, but it is usually something too short for a human to notice yet very long for the computer, like 1ms or so.
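To make the sampling idea concrete, here is a minimal sketch, not a tested routine. It assumes it is called once per VSYNC (so samples are about 16.7ms apart on NTSC, much coarser than 1ms), and it assumes the stack convention from the game loop code. RDKEY, KEYLST, KEYCNT, KEYOK and STABLE are all made-up names, the scratch pad addresses are picked arbitrarily, and RDKEY is only a stub standing in for a real keyboard read:

```asm
* Sketch only, all names below are hypothetical
KEYLST EQU  >8330             * Last raw sample
KEYCNT EQU  >8332             * Matching samples seen in a row
KEYOK  EQU  >8334             * Last accepted (debounced) key code
STABLE EQU  2                 * Matches required before accepting

DEBNCE MOV  R11,*R10+         * Push return address onto the stack
       BL   @RDKEY            * Raw key read, result in R1
       C    R1,@KEYLST        * Same as the previous sample?
       JEQ  DEB01
       MOV  R1,@KEYLST        * No, remember it and start counting again
       CLR  @KEYCNT
       JMP  DEB02
DEB01  INC  @KEYCNT           * One more matching sample
       MOV  @KEYCNT,R2
       CI   R2,STABLE         * Held steady long enough?
       JNE  DEB02
       MOV  R1,@KEYOK         * Yes, accept the key
DEB02  DECT R10               * Pop return address off the stack
       MOV  *R10,R11
       B    *R11

RDKEY  CLR  R1                * Stub: always reports "no key"
       B    *R11
```

The rest of the program would then read KEYOK instead of the raw key value.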

Matthew

#62 Tursi

Posted Thu May 27, 2010 2:26 PM

As far as I understand the 99/4A does do debouncing, using the delay routine that Sometimes99er posted. I've seen it run when you press a key in the Classic99 debugger. ;)

Fair point that my code does not do debounce, but for the purposes I'm using it (game input only) it should be fine, but you're right that I should try it on real hardware. In theory the only error from lack of debounce that you should see is repeated keys on a single keypress - the wrong key should not come up.

Correct that the PS/2 keyboards do their own debounce internally, you don't need to worry about it from the other end of the cable.

To see my KSCAN code just go grab the TI Farmer assembly source from the TI Farmer thread - all my XB support functions are in there, and Owen, just for you I left in the XB lines of code, so you can see how I translated them. Not that my approach is the only way, but I thought you might appreciate it.

#63 matthew180

Posted Fri May 28, 2010 12:05 AM

The code I posted earlier is a basic game loop skeleton, but it could really be used for any assembly program I suppose. There seems to be a lot to it, but most of the code at this point is simply support routines.

If you are following along, you can download the source, assemble it with the E/A cartridge (either on the real 99/4A or with Classic99), and run it.

Over the next few posts I'll break down the code and finally start adding some interesting features. So, time to get started.
      DEF  MAIN
"DEF" is an assembler directive that makes a label visible outside of our program. The E/A or XB loader will add this name to the REF/DEF table so our code can be found and called by name, which is how we specify where our program begins. I used the label "MAIN" because that is pretty universal in the world of C, Windows, Unix, Mac, etc. programming as the name of a process's entry point.

**
* VDP Memory Map
VDPRD  EQU  >8800             * VDP read data
VDPSTA EQU  >8802             * VDP status
VDPWD  EQU  >8C00             * VDP write data
VDPWA  EQU  >8C02             * VDP set read/write address
These are assembler directives that let us use labels instead of numbers. Any place in the code where you see VDPRD, the assembler will substitute >8800, and so on. These are the memory-mapped hardware locations for accessing the VDP. These values are used by the VDP routines I posted previously, which are included in the complete code download.

**
* Workspace
WRKSP  EQU  >8300             * Workspace
R0LB   EQU  WRKSP+1           * R0 low byte required by VDP routines
R1LB   EQU  WRKSP+3           * R1 low byte
R2LB   EQU  WRKSP+5           * R2 low byte
More equates to specify the workspace and the addresses of the low bytes for R0, R1, and R2. These come in handy particularly when dealing with the VDP routines because it is the MSB of a register that is sent to the VDP and the value we need a lot of the time is in the LSB of another register. The R0LB label is required by the VDP routines, the others are optional.
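As a rough illustration of why the low-byte equates help, here is a minimal sketch of setting the VDP write address and writing one byte. It is not the VSBW routine from the earlier post, just the same idea, and it assumes the workspace is at >8300 (so R0LB really is the low byte of R0), that R0 holds the VRAM address, and that the MSB of R1 holds the byte to write:

```asm
*      Sketch only: write the MSB of R1 to the VRAM address in R0
       MOVB @R0LB,@VDPWA      * Low byte of the address goes first
       ORI  R0,>4000          * Flag the address as a write address
       MOVB R0,@VDPWA         * High byte of the address goes second
       MOVB R1,@VDPWD         * Data byte comes from the MSB of R1
```

Without R0LB the same thing needs a SWPB before and after the first MOVB. Note that the ORI changes R0, which is one reason the register map lists R0 as modified by the VDP routines.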

**
* VRAM Base Locations (must match the values set up in the
* set video mode subroutine.)
NAMETB EQU  >0000             * Name table base
PTRNTB EQU  >2000             * Pattern generator table base
COLRTB EQU  >0300             * Color table base
Equates to use when calculating VRAM addresses. Using the equates allows the VDP tables to be moved without having to change a lot of code. We specify the base address of the various tables here and use the labels in our calculations. These labels must match the table locations set up in the "set video mode" subroutine.
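For example, a name table address for a given row and column could be calculated along these lines. This is only a sketch: using R3 for the row (0 to 23) and R4 for the column (0 to 31) is my assumption, and VSBW is assumed to take the address in R0 and the byte in the MSB of R1, as it does in the PLOT routine:

```asm
*      Sketch only: put an 'A' at the row in R3 and the column in R4
       MOV  R3,R0
       SLA  R0,5              * Row * 32 characters per row
       A    R4,R0             * Plus the column
       AI   R0,NAMETB         * Plus the name table base
       LI   R1,>4100          * 'A' (ASCII 65) in the MSB
       BL   @VSBW
```

If the name table ever moves, only the NAMETB equate (and the matching VDP register value) has to change.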

**
* Scratch pad RAM use - Variables
*
*           >8300             * Workspace
*           >831F             * Bottom of workspace
STACK  EQU  >8320             * Subroutine stack, grows up (8 bytes)
*           >8322             * The stack is maintained in R10 and
*           >8324             * supports up to 4 BL calls
*           >8326
TICK   EQU  >8328             * 1 tick every 16.6ms (rolls after 18.2 mins)
VSYNC  EQU  >832A             * 1 when VSYNC is detected, otherwise 0

* Random Number Memory Map
RAND16 EQU  >83C0             * 16-bit random number
RAND8  EQU  >83C1             * 8-bit random number
Here we are setting up equates that specify memory locations in the scratch pad RAM that we will be using. The WP will be loaded with >8300 and will use 32 bytes for the 16 general purpose registers.

Next will be 8 bytes used for the subroutine stack which will support a call depth of 4 (remember, addresses are 16-bit.)

The TICK count will be incremented every time the VDP issues a VSYNC which happens 60 times a second on NTSC consoles, and 50 times a second on PAL consoles. Assuming NTSC, that would be an update every 16.6ms, and since there are 65536 values in a 16-bit value, that means the counter will roll over every 18.2 minutes. This is fine since we are simply using it to determine how much (if any) time has elapsed since a previous event.
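As a sketch of what "time elapsed since a previous event" might look like in code, the routine below checks whether at least 30 ticks (about half a second on NTSC) have passed. LAST, DELAY, CHKTIM and CHKT01 are hypothetical names, and >832C is simply the next free scratch pad word in the map above:

```asm
LAST   EQU  >832C             * Hypothetical: TICK value at the last event
DELAY  EQU  30                * Hypothetical: 30 ticks, about 0.5s on NTSC

CHKTIM MOV  @TICK,R3          * Current tick count
       S    @LAST,R3          * Elapsed ticks, correct across a roll over
       CI   R3,DELAY
       JL   CHKT01            * Unsigned "less than", not time yet
       MOV  @TICK,@LAST       * Restart the interval
*      Timed event code here ...
CHKT01 B    *R11
```

Because the subtraction wraps at 16 bits just like the counter does, the roll over after 18.2 minutes (65536 ticks / 60 per second is about 1092 seconds) does not cause a problem, as long as the interval being measured is shorter than that.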

The VSYNC variable will be set to 0 unless the VSYNC signal was received, at which point it will have a value of 1 for a single pass through the game loop. We can use this variable to quickly check for and synchronize to the VSYNC.

RAND16 and RAND8 store the random numbers generated by our random number generator subroutine. The address >83C0 is used because that is where the console stores a random number "seed", in the form of the amount of time the user took to "press any key" at the master title screen. This makes for a really good seed and there is no reason not to use it.

As our program grows, we will be reserving more and more of the scratch pad RAM.

**
* Runtime Constants
* In an EA3 program these will be in 8-bit RAM, in a cartridge they
* will be in 8-bit ROM.
*
VSTAT  DATA >8000             * VDP vsync status
NUM01  DATA 1                 * 16-bit number 1
These are assembler directives and simply reserve and initialize memory as specified. DATA reserves 16-bit values, BYTE and TEXT reserve 8-bit values. I'm using them as constants because in a cartridge they will be in a ROM file and therefore unchangeable. In a program designed to be loaded (like this one), we could actually write to these values since they will be in RAM, either the low 8K or high 24K of the 32K RAM expansion.

In both cases we are using memory locations to hold the data even though we are treating them as unchangeable values. So why not just use equates (you may be asking)? Good question. The reason is that we can use these memory locations in instructions where an immediate value cannot be used. Remember, an equate is just a "search and replace", but these labels represent real memory locations. For example, take the NUM01 above. There are a lot of times when you need to compare a memory location to a number, and the "immediate" instructions only work with registers. The value "one" comes up a lot, as do other values which you will see as the program grows. There are several ways to code the check, and in the example code it is used to test if VSYNC is 0 or 1:
       C    @VSYNC,@NUM01
-or-
       MOV  @VSYNC,R1
       CI   R1,1
-or-
       MOV  @VSYNC,@VSYNC   * This trick uses the CPU "compare to zero"

       CI   @VSYNC,1        * ILLEGAL
The "MOV" trick is okay, but only lets us test if the memory location is zero or not zero. If we specifically need to test among other values, then it won't help. Also, MOV requires 4 memory accesses minimum but C only needs 3, so C will be faster.

**
* Program execution starts here
MAIN   LIMI 0
       LWPI WRKSP
This is where execution of our program will start. The first thing we do is shut off interrupts and leave them off. Next the WP is set up with the address we specified via the equate, which is >8300.

*      Initialize the call stack and Finite State Machine (FSM)
       LI   R10,STACK         * Set up the stack pointer
In this code R10 is used as a stack pointer. Since the TMS9900 CPU does not have stack support in the form of a real stack register, we will make our own. A stack is just a convention used to store and retrieve temporary data.

To set up a stack you simply set aside some memory and load a register with the first address. If we had a stack pointer we would load that, and use "push" and "pop" instructions. But we don't, so I picked R10 and the "pushing" and "popping" have to be done manually.

So, when we place a value on the stack (push), the data is copied to where the stack pointer (R10) is pointing, then the stack pointer is incremented or decremented depending on if your stack "grows" up or down in memory (up being towards bigger addresses.) In our case the stack grows up. It starts at address >8320 and ends at address >8327 (8 bytes):
            MSB   LSB
R10 --+->  >8320 >8321
 grows+->  >8322 >8323
 "up" +->  >8324 >8325
      +->  >8326 >8327
Since we will be using our stack to store addresses, we will always "push" 16-bit values (words) on the stack, and remove (pop) 16-bit values off the stack. When the values are popped off the stack, the data where the stack pointer is pointing is copied to some designated register (or another memory location), and the stack pointer adjusted the opposite direction of a push (so decremented in our case.)

Using a stack like this allows us to have a few levels of subroutine calls (one advantage of BLWP over BL is that you don't need a stack, but you do need an entirely new workspace for each BLWP level.) There are three branching instructions in the TMS9900:

* BLWP: Branch and Load Workspace Pointer
* BL: Branch and Link
* B: Branch

We won't be using BLWP, so I'll leave it as an exercise for you to look it up. The B instruction is very simple, it unconditionally branches to the designated address. The B instruction is just like the unconditional jump instruction JMP, except JMP is restricted to jumps within -128 to +127 "words" away from the current location. This is because the location to jump to is stored as part of the JMP instruction's opcode as an offset, and there are only 8 bits to store the offset value (and the range of an 8-bit value (one byte) is 0 to 255 or -128 to +127.)

However, when used with a label (B @LABEL), the B instruction's opcode is immediately followed by a complete 16-bit value (one word) that specifies the address to branch to, so it can branch to any *evenly* addressable location in the 64K range of the TMS9900 CPU. Instructions are always on even addresses. So, use B when you need to jump far, and JMP when you are within 127 words (the assembler will let you know if you try to JMP too far.) The main thing to remember is, B uses 4 bytes to encode the instruction, JMP only uses 2.

So, that leaves us with BL. The "branch" part is just like the B instruction. However, the "link" part of the instruction is what lets us use this instruction for calling subroutines. To call a subroutine we need to remember where we are, jump to the address where the subroutine starts, then return to where we left off. So, to remember where we are before branching to a subroutine, we need to store the value in the program counter (PC), and that's exactly what the "link" part of BL does. The current PC value, which is the address of the instruction following the BL, is placed in R11 (this cannot be changed, and whatever was in R11 is wiped out) and the branch is taken.
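Here is a small sketch that puts the three side by side. WAIT, BEEP and DONE are hypothetical labels that are not part of the game loop code, and the sizes in the comments are for the addressing modes shown:

```asm
*      Sketch only: WAIT, BEEP and DONE are hypothetical labels
       LI   R2,100
WAIT   DEC  R2                * Count down from 100
       JNE  WAIT              * Relative jump, 1 word, limited range
       BL   @BEEP             * Call, 2 words, return address goes in R11
       B    @DONE             * Absolute branch, 2 words, anywhere in 64K
*
BEEP   NOP                    * Subroutine body would go here
       B    *R11              * Return, 1 word (register indirect)
*
DONE   JMP  DONE              * Park here
```

The conditional jumps such as JNE have the same range limit as JMP, since they also store the offset in the opcode.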

Now we are sitting in our subroutine and when we are done we need to "return" to the code that called the subroutine. Since we were careful not to destroy R11, it still holds the address of where we were before the BL call. Thus, we issue a B instruction using indirect addressing on R11, like this:
       B    *R11
Note: The assembler has a pseudo instruction "RT" that will be replaced with "B *R11". So any place you see "RT", it is the same as writing out B *R11.

The stack comes in to play when we need to call a subroutine from within a subroutine. Think about it, if we call a subroutine with BL, then that subroutine calls another with BL, the original return address is blown away unless we save it:
       BL   @SUB1    <--- stores current PC in R11
. . .
SUB1   code
       code
       code
WIPE   BL   @SUB2    <--- stores current PC in R11, blowing away previous return value
       code
       B    *R11     <--- original return address is gone, R11 points just past WIPE
. . .
SUB2   code
       code
       code
       B    *R11     <--- returns to SUB1
To fix this, we have to store R11 in any subroutine that needs to call another. For assembly language, 2 or 3 levels is usually all you need. Any more than that and you need to rethink your program's organization. Thus, I set up a stack to support 4 levels of calls. In "bottom level" subroutines, i.e. those that don't need to BL to any other routine, you do not have to deal with the stack. Thus, in any subroutine that needs to call another subroutine, you do this:
       BL   @SUB1       <--- stores current PC in R11
. . .
SUB1   MOV  R11,*R10+   <--- "push" R11 onto the stack and use auto-increment to adjust stack
       code
       code
       BL   @SUB2       <--- stores current PC in R11, blowing away previous return value
       code

       DECT R10         <--- adjust stack pointer (pop)
       MOV  *R10,R11    <--- copy address back to R11
       B    *R11        <--- returns to original calling location
. . .
SUB2   code             <--- subroutine does not call any others, no stack required
       code
       code
       B    *R11        <--- returns to SUB1
I hope this is clear. If you are not familiar with the addressing modes of the TMS9900, you should read up on them a little so you better understand what is going on.
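
For anyone who would rather see the push and pop in a higher-level language first, here is a minimal Python sketch of the same bookkeeping. It is only a model, not TMS9900 code: the `Cpu` class, the `STACK_BASE` address, and the return addresses used below are made up for illustration.

```python
# Model of the R10 stack / R11 return-address convention (illustration only).
STACK_BASE = 0x8320      # hypothetical address where the stack starts


class Cpu:
    def __init__(self):
        self.mem = {}            # address -> 16-bit word
        self.r10 = STACK_BASE    # stack pointer, grows upward
        self.r11 = 0             # return address written by BL

    def bl(self, return_addr):
        """BL: whatever was in R11 is overwritten with the new return address."""
        self.r11 = return_addr

    def push_r11(self):
        """MOV R11,*R10+  -> store R11, then bump R10 by 2 (one word)."""
        self.mem[self.r10] = self.r11
        self.r10 += 2

    def pop_r11(self):
        """DECT R10 / MOV *R10,R11 -> back up one word, reload R11."""
        self.r10 -= 2
        self.r11 = self.mem[self.r10]


cpu = Cpu()
cpu.bl(0xA004)               # main calls SUB1; R11 = return address into main
cpu.push_r11()               # SUB1 saves it because it is about to call SUB2
cpu.bl(0xA104)               # SUB1 calls SUB2; R11 now points back into SUB1
after_inner_call = cpu.r11   # SUB2 returns with B *R11 and leaves R11 alone
cpu.pop_r11()                # SUB1 restores the original return address
print(hex(after_inner_call), hex(cpu.r11), hex(cpu.r10))
```

The point to notice is that R10 ends up back where it started, which is what keeps the pushes and pops balanced across nested calls.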

       LI   R15,STINIT        * Initial state, one-time initialization
       CLR  @TICK             * Clear the tick counter
In this code, R15 is used as the finite state machine (FSM) state variable. This just sets the initial value and clears the TICK counter.

A FSM is very simple really and you deal with them every day and don't realize it. For example, a stop light is a FSM. Basically a FSM has "states", and depending on the current state there are a fixed number of other states you could go to, meaning a fixed number of possibilities from where you are.

So, for a stop light, this would be the FSM:
state = red
timer = 10

forever

  dec timer

  when state is red:
    if timer = 0 then
      state = green
      timer = 10
    end if

  when state is green:
    if timer = 0 then
      state = yellow
      timer = 5
    end if

  when state is yellow:
    if timer = 0 then
      state = red
      timer = 10
    end if

end forever
Notice that it is illegal to go from yellow to green or from green to red. Once in a given state, based on the current state and external input, you decide the next state which includes staying in the current state. In a game for example, the "run game" state is maintained until all the lives are gone, at which point you would switch to the "attract mode" state or "enter initials" state if their score was high enough. So, the initial state for our game loop is STINIT or "state initialize".
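
The stop light pseudocode translates almost line for line into a runnable sketch. The version below is in Python rather than assembly, and the `TRANSITIONS` table and `run_stop_light` name are just one way to write it, not something from the game code.

```python
# Stop light finite state machine, following the pseudocode above.
# Each state maps to (next state, timer value to load for that next state).
TRANSITIONS = {
    "red": ("green", 10),
    "green": ("yellow", 5),
    "yellow": ("red", 10),
}


def run_stop_light(ticks):
    """Run the FSM for `ticks` passes and return the state after each pass."""
    state, timer = "red", 10
    history = []
    for _ in range(ticks):
        timer -= 1                      # dec timer
        if timer == 0:                  # only legal transitions are in the table
            state, timer = TRANSITIONS[state]
        history.append(state)
    return history


history = run_stop_light(30)
print(history[8], history[9], history[19], history[24])
```

Because the only way to change state is through the table, an illegal move such as yellow to green simply cannot be expressed.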

**
*      Finite State Machine (FSM)
FSM00
       CI   R15,STQUIT        * WHILE R15 != STQUIT
       JNE  FSM10
STQUIT BLWP @>0000            * Quit
This is the top level "forever" loop that contains the state machine. R15 holds the current state, and as long as it is not equal to STQUIT, we will jump to FSM10. BLWP @>0000 performs a power-on reset. >0000 is the address the CPU loads when power is applied, so we are doing the same thing. Currently there is no condition to set R15 to STQUIT, so to end the program you have to power off the console (the QUIT key won't work since interrupts are disabled and it is the console ISR that checks for that key combination.)

FSM10
       CLR  @VSYNC            * VSYNC indicator only active for a single cycle
       CLR  R1
       MOVB @VDPSTA,R1        * Reading clears the VDP sync indicator
       COC  @VSTAT,R1
       JNE  FSM20             * No VSYNC, skip updating the TICK

       INC  @TICK             * Increment the tick
       INC  @VSYNC            * Set the VSYNC indicator
This is the guts of the game loop! Not much to it is there? This stuff is just not really that complicated. First we clear the VSYNC indicator since it is only active for a single loop through the FSM. We also clear R1 because it is about to get the value of the VDP status register and only the MSB will be modified, and when we check the status with COC we need to make sure the LSB is clear.

The MOVB @VDPSTA,R1 reads the VDP's status register into the MSB of R1 and also clears the register in the VDP. COC (compare ones corresponding) checks the VSYNC indicator from the status register. VSTAT was set to >8000 so we are only testing the most significant bit in R1. If there is no VSYNC then we skip forward to the FSM. If the VSYNC indicator was set, we increment the TICK count and set the VSYNC variable to 1.
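
To make the COC test concrete, here is a small Python model of one pass through FSM10. The function names are invented for this sketch; the only facts taken from the code are the >8000 mask and the order of the instructions.

```python
VSTAT = 0x8000   # VSYNC flag once the status byte sits in the MSB of R1


def coc(mask, reg):
    """COC reports 'equal' when every 1 bit in mask is also 1 in reg."""
    return (reg & mask) == mask


def poll_vsync(status_byte, tick):
    """One pass through FSM10: returns (vsync, tick)."""
    r1 = 0x0000                                      # CLR  R1
    r1 = ((status_byte & 0xFF) << 8) | (r1 & 0xFF)   # MOVB @VDPSTA,R1 (MSB only)
    if coc(VSTAT, r1):                               # COC  @VSTAT,R1
        return 1, (tick + 1) & 0xFFFF                # INC  @TICK / INC @VSYNC
    return 0, tick                                   # JNE  FSM20


print(poll_vsync(0x80, 41))   # flag set
print(poll_vsync(0x1F, 41))   # flag clear, the other status bits are ignored
```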

*      Branch to the current state
FSM20
       B    *R15              * SWITCH R15
This is the FSM selection. R15 always holds the address of where the code is for the current state. This instruction simply jumps to the current state, which initially is STINIT.

*      One time initialization
*
STINIT
       BL   @GMODE            * Set the graphics mode
       BL   @LSCS             * Load standard character set
       BL   @OTINIT           * One time initialization

       LI   R15,STRUN         * Set next state
       B    @FSM50            * BREAK
This is the one-time initialization. First we set up the graphics mode, then load a decent character set (I really don't like the default character set!), and finally call a one-time initialization function that will do program-specific stuff (since GMODE and LSCS are both pretty generic and meant to be reused.)

Finally we update R15 with the new state, which will be the "run" state. When the current state is done, we use a branch to jump to the bottom of the FSM for any additional processing that may need to happen. All states should jump to a single location! Just because this is assembly language does not mean we don't need to follow good program flow. We are basically performing code similar to C's WHILE loop and SWITCH statement.

*      Main state when things are running, game is playing, etc.
*
STRUN
       BL   @PLOT
       B    @FSM50            * BREAK
This is the whole "run" state. It calls PLOT which does all the work. Also, we never leave this state since we are not doing any user input and the program is just not complicated enough yet.

*      Every state jumps here when complete so any necessary out-of-state
*      logic or decision making can happen if necessary.
FSM50

FSM99
       B    @FSM00            * WEND
*// MAIN
This is the bottom of the FSM. You would place any "out of state" processing here if necessary, then the code branches back to the top. The FSM uses branch because as your game (or whatever you are writing) grows, the bottom of your FSM will be too far from the top to use JMP.
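
Seen as a whole, the FSM is a WHILE loop wrapped around a SWITCH. The Python sketch below mirrors that shape, with the state held as a reference to the handler, the way R15 holds an address. Two things here are assumptions and not part of the posted code: the quit condition (the assembly version never sets STQUIT) and the `vsync_pattern` argument that stands in for polling the VDP.

```python
def st_init(ctx):
    ctx["log"].append("init")     # stands in for GMODE, LSCS and OTINIT
    return st_run                 # LI R15,STRUN


def st_run(ctx):
    if ctx["vsync"]:
        ctx["plots"] += 1         # stands in for BL @PLOT
    # Hypothetical exit: stop after `limit` plots so the sketch terminates.
    return st_run if ctx["plots"] < ctx["limit"] else None


def game_loop(vsync_pattern, limit):
    ctx = {"log": [], "plots": 0, "limit": limit, "vsync": 0, "tick": 0}
    state = st_init               # LI R15,STINIT
    frames = iter(vsync_pattern)
    while state is not None:      # CI R15,STQUIT / JNE FSM10
        ctx["vsync"] = next(frames, 1)   # FSM10: sample the VSYNC flag
        ctx["tick"] += ctx["vsync"]
        state = state(ctx)        # B *R15, then fall through FSM50/FSM99
    return ctx


result = game_loop([0, 1, 0, 0, 1], limit=3)
print(result["log"], result["plots"], result["tick"])
```

Every handler returns to the same place, the bottom of the loop, which is the single exit point the text argues for.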

*********************************************************************
*
* <subroutine skeleton>
*
SKEL
       MOV  R11,*R10+         * Push return address onto the stack

*      Subroutine code here ...

       DECT R10               * Pop return address off the stack
       MOV  *R10,R11
       B    *R11
*// SKEL
This is a skeleton subroutine to copy and paste when adding new subroutines. It contains the code necessary to manage the stack.

Here is the BASIC code we are reproducing below:
100 CALL CLEAR
110 CALL SCREEN(5)
120 A$="007E7E7E7E7E7E7E"
130 FOR I=40 TO 64 STEP 8
140 CALL CHAR(I,A$)
150 NEXT I
160 CALL COLOR(2,8,1)
170 CALL COLOR(3,6,1)
180 CALL COLOR(4,10,1)
190 CALL COLOR(5,15,1)
200 X=INT(RND*32)+1
210 Y=INT(RND*24)+1
220 C=INT(RND*4)*8+40
230 CALL HCHAR(Y,X,C)
240 GOTO 200

The PLOT subroutine. I got the idea from the Raspberry thread where the example of filling the screen with 4-color squares was used to demonstrate the speed. While even in assembly we can't get as fast as the Raspberry demo running on a GHz speed CPU, we do pretty good.
*********************************************************************
*
* Plot a random character
*
PLOT
*      Only draw on the VSYNC
       C    @VSYNC,@NUM01     * If VSYNC is not active, return
       JEQ  PLOT01
       B    *R11
PLOT01
       MOV  R11,*R10+         * Push return address onto the stack

I want to point out that this subroutine has an initial check to see if VSYNC is active, and if not it simply returns. That means this subroutine will only run once every 16.6ms, or 60 times a second. Thus, the screen takes a little while to fill up completely with squares. If the VSYNC is active, we jump down to push the return address on the stack because the rest of the subroutine will call other subroutines (the random number generator and VSBW.)

To see how fast assembly language can be, after you run this code once, comment out those first 3 lines so the routine runs every time it is called. The screen fills up in a few seconds! It's pretty cool and makes a nice effect.

*      Get a random screen location
       BL   @RANDNO           * Get a random number (in R5)
       LI   R3,768
       CLR  R4                * Dividend will be R4,R5
       DIV  R3,R4             * Make a number between 0 and 767
       MOV  R5,R0             * Move to R0 for the VDP routine
       AI   R0,NAMETB         * Adjust to the name table base
This code gets a 16-bit random number in R5 and divides it by 768, keeping the remainder as the screen location. DIV treats R4,R5 as a 32-bit dividend (which is why R4 is cleared first) and leaves the quotient in R4 and the remainder, a number between 0 and 767, in R5. Remember that the screen is really a linear block of memory 768 bytes long, so to get an X,Y location we only need 1 number, not 2! Once we have our number, we stuff it in R0 to prepare for the call to VSBW (which requires R0 to contain the VRAM address to write to.) We also add the name table offset to R0 so we are writing to the correct location. This is where using the equates comes in handy. We can generate our screen location as a 0-based index, then add that to the real base address of the name table.
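
The DIV step is easy to check outside the emulator. This Python sketch models the 32-bit by 16-bit divide and shows how the single linear index corresponds to a row and column. The function names are invented for illustration, and the overflow check mirrors the TMS9900 rule that the divide is abandoned when the divisor is not larger than the high word of the dividend.

```python
def div32(r4, r5, divisor):
    """Model DIV Rs,R4: dividend is R4:R5, quotient -> R4, remainder -> R5."""
    if divisor <= r4:
        # The CPU sets the overflow flag and leaves R4 and R5 untouched.
        raise OverflowError("quotient would not fit in 16 bits")
    dividend = (r4 << 16) | r5
    return dividend // divisor, dividend % divisor


def random_screen_address(rand16, name_table=0x0000):
    _quotient, remainder = div32(0, rand16 & 0xFFFF, 768)  # CLR R4 / DIV R3,R4
    return name_table + remainder                          # MOV R5,R0 / AI R0,NAMETB


def row_col(index):
    """32 columns per row, so one index carries both coordinates."""
    return index // 32, index % 32


print(random_screen_address(0xFFFF), row_col(random_screen_address(0xFFFF)))
```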

*      Get a random character 40, 48, 56, 64
       BL   @RANDNO           * Get a random number (in R5)
       SRL  R5,14             * Make a number between 0 and 3
       SLA  R5,3              * Multiply by 8 (number is now 0, 8, 16, 24)
       A    @CHR040,R5        * Add to the base character
       MOV  R5,R1
       SWPB R1                * Remember, the MSB goes to the VDP!
This is doing the same thing as above except we are getting a random number between 0 and 3 and using that to select 1 of 4 characters to display. The character value goes into the MSB of R1 for VSBW.
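
Here is the same shift-and-add sequence modeled in Python, one line per instruction, so the intermediate values can be inspected. `pick_tile` is a name made up for this sketch. It returns both the character code and the word that ends up in R1 after SWPB.

```python
def pick_tile(rand16):
    r5 = rand16 & 0xFFFF
    r5 >>= 14                                # SRL R5,14  -> 0..3
    r5 = (r5 << 3) & 0xFFFF                  # SLA R5,3   -> 0, 8, 16, 24
    r5 = (r5 + 40) & 0xFFFF                  # A @CHR040,R5 (word at CHR040 is 40)
    r1 = ((r5 << 8) | (r5 >> 8)) & 0xFFFF    # MOV R5,R1 / SWPB R1
    return r5, r1


for sample in (0x0000, 0x4000, 0x8000, 0xFFFF):
    code, r1 = pick_tile(sample)
    print(code, hex(r1))
```

Only the top two bits of the random number decide the character, which is why the shift count is 14.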

       BL   @VSBW

       DECT R10               * Pop return address off the stack
       MOV  *R10,R11
       B    *R11
*// PLOT
With R0 and R1 set up, we write the byte to VRAM which displays the character on the screen. Then we clean up the stack and return to the FSM.

DEFTBL
CHR040 DATA 40,8
       DATA >007E,>7E7E,>7E7E,>7E7E
CHR048 DATA 48,8
       DATA >007E,>7E7E,>7E7E,>7E7E
CHR056 DATA 56,8
       DATA >007E,>7E7E,>7E7E,>7E7E
CHR064 DATA 64,8
       DATA >007E,>7E7E,>7E7E,>7E7E
DEFEND
This is a table-based tile (character) pattern setup to help make things easier. The format is:

Tile name (0 to 255), number of pattern bytes
Pattern Data, ...

You can use a label to set up references to specific tiles if necessary. The generic labels here should be replaced with something meaningful. This data will be in 8-bit RAM for an EA3 program, and in 8-bit ROM for a cartridge. The "name" and "length" values could be bytes, but using full words makes the code easier. If you have a lot of individual definitions, you may consider changing to BYTE.

COLTBL DATA >7050,>90E0
The color data is so small, 32 bytes total for all 256 tiles, that using a table layout was overkill.

I tend to place the pattern DATA close to the subroutine that reads them, at least until the code is working, by which time I'm used to it being where it is, so I keep it there...

*********************************************************************
*
* One-Time Initialization
*
OTINIT
       MOV  R11,*R10+         * Push return address onto the stack

*      Initialize tile pattern definitions
       LI   R1,DEFTBL         * Start of definition table
OTI01  MOV  *R1+,R0           * Move the character code into R0
       SLA  R0,3              * Mul by 8 to adjust offset into PGT
       AI   R0,PTRNTB         * Add pattern generator table base
       MOV  *R1+,R2           * Move the byte count into R2
       BL   @VMBW
       CI   R1,DEFEND
       JNE  OTI01             * Loop until end of table
This code loads the pattern data from the table above. Since we are writing multiple bytes to the VDP via VMBW, R1 has to hold the address in CPU RAM of the data to write to the VDP, so we load that address into R1 first.

The first word in the table is the starting character that we are going to write a pattern for, so we move that value to R0 and auto-increment R1 past that word. The next word in the table is the number of bytes to write starting at the character code identified by the 1st word. So, the count goes to R2 and R1 is auto-incremented past that word. Now R1 is pointing at the start of the actual pattern data, R2 holds the count, and R0 holds the starting character.

Now, R0 needs two modifications. First, since each character requires 8 bytes of pattern data, we have to multiply the character code by 8 to get the proper offset into the pattern generator table. So we do that with SLA (shift left arithmetic). In case you do not know, shifting binary values left multiplies by 2, and shifting right divides by 2. This works the same way as moving the decimal point in decimal numbers multiplies or divides by 10. So, shifting left 3 positions multiplies by 8 (2x2x2). Then we add the pattern table base to R0 which is the final VRAM location for the specified character's pattern data.

Then we call VMBW to write the data, and finally check if we are at the end of the table. If not, we go back and start over reading the character code to write pattern data for, the number of bytes that follow, and the next set of pattern data.

Note that if you were defining patterns for consecutive characters, you would just include the pattern data and set the "count" value accordingly. You don't have to set up each character. In this case, the characters were spaced 8 apart to get each one in a different color group.
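
The table walk can be modeled in a few lines of Python to confirm where each pattern lands in VRAM. This is a sketch under two assumptions: VRAM is represented as a plain `bytearray`, and the VMBW routine leaves the data pointer just past the bytes it wrote, which the loop in OTINIT depends on.

```python
PTRNTB = 0x2000                      # pattern generator table base
PATTERN = [0x007E, 0x7E7E, 0x7E7E, 0x7E7E]

# Same layout as DEFTBL: character code, byte count, then the pattern words.
DEFTBL = []
for code in (40, 48, 56, 64):
    DEFTBL += [code, 8] + PATTERN


def load_patterns(table, vram):
    i = 0
    while i != len(table):                 # CI R1,DEFEND / JNE OTI01
        addr = (table[i] << 3) + PTRNTB    # SLA R0,3 / AI R0,PTRNTB
        count = table[i + 1]               # MOV *R1+,R2
        i += 2
        for n in range(count):             # the work VMBW does
            word = table[i + n // 2]
            vram[addr + n] = (word >> 8) if n % 2 == 0 else (word & 0xFF)
        i += (count + 1) // 2              # pointer now sits past the data
    return vram


vram = load_patterns(DEFTBL, bytearray(0x4000))
print(hex(PTRNTB + 40 * 8), vram[0x2140:0x2148].hex())
```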

*      Set colors
       LI   R0,COLRTB+5       * Start with color set 5 (char 40)
       LI   R1,COLTBL
       LI   R2,4
       BL   @VMBW
This simply writes the color data and should be self explanatory by now. If not, ask questions...
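
If the +5 and the four color bytes look arbitrary, this short Python sketch shows where they come from. In Graphics I each color table byte covers 8 consecutive characters, and BASIC's color numbers run 1 to 16 where the VDP's run 0 to 15. The helper names are invented for this example.

```python
COLRTB = 0x0300   # color table base


def color_entry_address(char_code):
    """One color byte per group of 8 characters."""
    return COLRTB + (char_code >> 3)


def basic_color_byte(fg, bg):
    """Convert CALL COLOR foreground/background numbers to a VDP color byte."""
    return ((fg - 1) << 4) | (bg - 1)


# Foreground/background pairs from lines 160-190 of the BASIC listing.
pairs = [(8, 1), (6, 1), (10, 1), (15, 1)]
print(hex(color_entry_address(40)), [hex(basic_color_byte(f, b)) for f, b in pairs])
```

The four results are the same bytes stored in COLTBL, and character 40 falls in entry 5, matching the COLRTB+5 start address.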

GMODE
       MOV  R11,*R10+         * Push return address onto the stack

       CLR  R0                * M3 is bit 6 and is off for Graphics I
       BL   @VWTR

*      This is the "busy" register
       LI   R0,>01E0          * 11100000 Graphics I
       BL   @VWTR             * 16K,No Blank,Enable Int,M1,M2,0,8x8,No Mag

       LI   R0,>0200          * Name Base Table to >0000 - >02FF (768 bytes)
       BL   @VWTR

       LI   R0,>030C          * Color Table to >0300 - >031F (32 bytes)
       BL   @VWTR

       LI   R0,>0404          * Pattern Generator Table
       BL   @VWTR             * >2000 - >27FF (2048 bytes)

       LI   R0,>0507          * Sprite Attribute Table
       BL   @VWTR             * >0380 - >03FF (128 bytes)

       LI   R0,>0605          * Sprite Pattern Table
       BL   @VWTR             * >2800 - >2BFF (1024 bytes)

       LI   R0,>0380          * Disable all sprite processing by writing
       LI   R1,>D000          * >D0 (208) to the vertical position of the
       BL   @VSBW             * first sprite entry

*      Set colors
       LI   R0,>07F4          * R7 is the text-mode color and border color
       BL   @VWTR             * White on dark blue

       LI   R0,>0300          * Start of color table
       LI   R1,>F400          * White on dark blue
       LI   R2,>0020          * All color table entries (32 bytes)
       BL   @VSMW
This is a complete "set the VDP" subroutine. The comments should let you know what's going on. Basically it runs through every VDP write-only register and sets each to a specific value, which is the only way to know what is in the registers since they are write only. Sprites are disabled and finally the background (border) color is set. Also, all the character sets are defaulted to the same foreground/background color scheme.
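
The register values and the table addresses in the comments are tied together by fixed multipliers, which is also why the equates have to match what GMODE sets. The Python sketch below decodes the same words GMODE passes to VWTR (register number in the high byte, value in the low byte). The multipliers are the Graphics I ones for the TMS9918A; `table_bases` is a made-up helper name.

```python
# Words loaded into R0 before each BL @VWTR in GMODE.
REGISTER_WORDS = [0x0000, 0x01E0, 0x0200, 0x030C, 0x0404, 0x0507, 0x0605, 0x07F4]

# VDP register -> bytes per unit of the register value (Graphics I).
MULTIPLIER = {
    2: 0x400,   # name table
    3: 0x40,    # color table
    4: 0x800,   # pattern generator table
    5: 0x80,    # sprite attribute table
    6: 0x800,   # sprite pattern table
}


def table_bases(words):
    bases = {}
    for word in words:
        reg, value = word >> 8, word & 0xFF
        if reg in MULTIPLIER:
            bases[reg] = value * MULTIPLIER[reg]
    return bases


for reg, base in sorted(table_bases(REGISTER_WORDS).items()):
    print(f"VR{reg} -> >{base:04X}")
```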

LSCS
       MOV  R11,*R10+         * Push return address onto the stack

       LI   R0,>2000          * Start at the space character
       LI   R1,SCS1
       LI   R2,SCS1E-SCS1
       BL   @VMBW

       DECT R10               * Pop return address off the stack
       MOV  *R10,R11
       B    *R11
*// LSCS
This loads the "standard character set" from the data I posted very early on in this thread. The data is also included in the complete source .zip download. This is older code that I copied and pasted, so you can see it does not use the equates we set up for the VDP table locations. Note how R0 is loaded with a value that assumes the pattern generator table is at >2000. It is in this case, but we should really fix this to be consistent. The comment is also wrong: the data starts with character >00, not the space >20 (32 decimal).
       LI   R0,PTRNTB
There, that fixes it. :-)

I think that is it except for the RNG and VDP routines which have been covered already (the RNG has its own thread.) Next time I'll be adding support for reading the joystick so we can get some user input and I'm going to develop a "scrolling within a window" so Owen will have something to mess with.

Side Note: While I appreciate the feedback everyone has given, no one is asking questions... So, either everyone knows all this already, or no one is trying out the code. Either way, I'll continue to post, but I'd like to know if I'm going over stuff people want to learn about, or if this is helping anyone at getting started with assembly? I'm trying to get into the guts of the game stuff, but there was a lot of necessary boring evil that had to be gone through first.

Matthew

Edited by matthew180, Fri May 28, 2010 12:11 AM.


#64 Opry99er OFFLINE  

Opry99er

    Quadrunner

  • 8,246 posts
  • Location:Cookeville, TN

Posted Fri May 28, 2010 12:13 AM

Excellent explanations!!!!

#65 sometimes99er OFFLINE  

sometimes99er

    River Patroller

  • 3,910 posts
  • Location:Denmark

Posted Fri May 28, 2010 1:47 AM

Marvellous. Absolutely brilliant. Generally very close to my own present TI style/framework, so I can't complain much. My "GMODE" is still "rolled out", while Mark Wills had a loop set all VDP registers years ago and it's actually saving quite a few bytes there. I don't know why I never got around to do it like that. I'll post the differences next week, if nobody else does.

I always wondered why JMP and B didn't exchange automatically whenever needed/possible and maybe with a note in the compile status output. Sometimes JNE etc. becomes out of range too. It's 2 extra bytes coming into play every time, and sometimes it all counts. Would be nice to leave this (trouble) to some later stage (optimization).

Hehe, I like you're using my little demo there. Nice spillover effect. Any small tricks of trade will have impact on my code.

:thumbsup:

#66 matthew180 OFFLINE  

matthew180

    River Patroller

  • Topic Starter
  • 2,383 posts
  • Location:Castaic, California

Posted Fri May 28, 2010 8:33 AM

Marvellous. Absolutely brilliant. Generally very close to my own present TI style/framework, so I can't complain much. My "GMODE" is still "rolled out", while Mark Wills had a loop set all VDP registers years ago and it's actually saving quite a few bytes there. I don't know why I never got around to do it like that. I'll post the differences next week, if nobody else does.

That is a good idea, I didn't think about putting the registers and values into DATA statements and setting them in a loop. It would certainly be smaller. However, for the tutorial it might have been more confusing and having it unrolled is probably good for learning and understanding. I'll roll it up into a loop with a DATA statement in a future evolution of the code.

I always wondered why JMP and B didn't exchange automatically whenever needed/possible and maybe with a note in the compile status output. Sometimes JNE etc. becomes out of range too. It's 2 extra bytes coming into play every time, and sometimes it all counts. Would be nice to leave this (trouble) to some later stage (optimization).

All the "jump" instructions are limited the same way. Their opcodes are specified like this:
  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|            OPCODE             |         DISPLACEMENT          |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

mnemonic  opcode     meaning
-------------------------------------
  JEQ    00010011  Jump equal
  JGT    00010101  Jump greater than
  JH     00011011  Jump high
  JHE    00010100  Jump high or equal
  JL     00011010  Jump low
  JLE    00010010  Jump low or equal
  JLT    00010001  Jump less than
  JMP    00010000  Jump unconditional
  JNC    00010111  Jump no carry
  JNE    00010110  Jump not equal
  JNO    00011001  Jump no overflow
  JOC    00011000  Jump on carry
  JOP    00011100  Jump odd parity

BLWP, BL, B have this format:
  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|            OPCODE                     |   Ts  |       S       |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Where Ts is a source address modifier, and S is the source address register (Td and D being for a destination.) The modifiers allow the addressing modes available on the TMS9900 as follows:
Ts or Td    S or D         Addressing mode
---------+-----------------------------------------
   00    | 0,1,...15 | Workspace register
   01    | 0,1,...15 | Workspace register indirect
   10    |     0     | Symbolic
   10    | 1,2,...15 | Indexed
   11    | 0,1,...15 | Workspace register indirect auto-increment
So, any time you see an instruction with Ts,S and/or Td,D all the addressing modes are available for that operand (either the source or the destination.) Since the branch instructions have a Ts,S in the opcode, the address to branch to is going to be represented by either a memory location or a register, both of which are a full 16 bits, and hence do not have the "distance" limitation of the jump instructions.
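
The "distance" limitation falls straight out of the 8-bit displacement field: it is a signed count of words, added to the PC after the PC has already moved past the jump. The Python sketch below encodes a jump the way an assembler would and rejects targets that do not fit. `encode_jump` and the addresses used are illustrative only.

```python
JMP, JNE = 0x10, 0x16     # opcode bytes from the table above


def encode_jump(opcode, pc, target):
    """Return the 16-bit instruction word for a jump located at address pc."""
    offset = target - (pc + 2)          # PC already points at the next word
    if offset % 2:
        raise ValueError("jump targets must be word aligned")
    disp = offset // 2                  # displacement is counted in words
    if not -128 <= disp <= 127:
        raise ValueError("out of range, a B instruction is needed")
    return (opcode << 8) | (disp & 0xFF)


print(hex(encode_jump(JMP, 0xA000, 0xA000)))   # a jump to itself
print(hex(encode_jump(JNE, 0xA000, 0xA100)))   # farthest forward target
```

So a jump can reach from 254 bytes behind the instruction to 256 bytes ahead of it, and anything beyond that needs B with its full 16-bit address.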

Hehe, I like you're using my little demo there. Nice spillover effect. Any small tricks of trade will have impact on my code.

It is a cool little demo. It just struck me, kind of like FlyGuy did, and I wanted to see what assembly language could do with it.

Matthew

#67 sometimes99er OFFLINE  

sometimes99er

    River Patroller

  • 3,910 posts
  • Location:Denmark

Posted Fri May 28, 2010 9:16 AM

That is a good idea, I didn't think about putting the registers and values into DATA statements and setting them in a loop. It would certainly be smaller. However, for the tutorial it might have been more confusing and having it unrolled is probably good for learning and understanding. I'll roll it up into a loop with a DATA statement in a future evolution of the code.

Mark had the data, not the registers in DATA. The loop was the VDP registers, sort of. The DATA was hence only 8 bytes.

;)

#68 sometimes99er OFFLINE  

sometimes99er

    River Patroller

  • 3,910 posts
  • Location:Denmark

Posted Fri May 28, 2010 9:23 AM

Sometimes when JNE becomes out of range, I just quickly change it to something like
JEQ $+
B “original label”
I just thought it would be nice if the assembler could do that back and forth. I have to get into all those instruction bits when compiling directly.

:)

#69 InsaneMultitasker OFFLINE  

InsaneMultitasker

    Stargunner

  • 1,693 posts

Posted Fri May 28, 2010 9:55 AM

Beware self-modifying code... I often do things like this versus using a stack... ;)

SUB1   MOV  R11,@SUB1RT+2
       ...
SUB1RT B  @0


Matthew - this is an awesome thread you've got running :) This is giving me some great ideas for a few programs I'd have trouble writing without getting out of my dsr/utility/input-driven serial-event processing mindset :)

Edited by InsaneMultitasker, Fri May 28, 2010 9:59 AM.


#70 matthew180 OFFLINE  

matthew180

    River Patroller

  • Topic Starter
  • 2,383 posts
  • Location:Castaic, California

Posted Fri May 28, 2010 10:59 AM

Beware self-modifying code... I often do things like this versus using a stack... ;)

SUB1   MOV  R11,@SUB1RT+2
       ...
SUB1RT B  @0

Now THAT is totally awesome! I always forget that we can modify any memory address on our little machine (too much time coding on stupid "modern" computers I guess.) I have no problems with code like this, it is fast, compact, and totally understandable. For those who might not understand what is going on, I'll go over it in detail in another post. I'm going to have to change to this method of subroutine calling I think. :-)

Matthew - this is an awesome thread you've got running :) This is giving me some great ideas for a few programs I'd have trouble writing without getting out of my dsr/utility/input-driven serial-event processing mindset :)


Thanks. Personally I get a lot from reading other people's code; see how they solved the problem and what little bits of cleverness I can get out of it. It does not have to be low level stuff either. Take FlyGuy for example and the way Codex generated a complete level from a single number. Totally awesome.

Definitely if you get stuck in a certain mind set, trying to write something completely different based on example code can help get you out of a rut. I can't wait to see what you come up with!

Matthew

#71 InsaneMultitasker OFFLINE  

InsaneMultitasker

    Stargunner

  • 1,693 posts

Posted Fri May 28, 2010 1:48 PM

Beware self-modifying code... I often do things like this versus using a stack... ;)

SUB1   MOV  R11,@SUB1RT+2
       ...
SUB1RT B  @0

Now THAT is totally awesome! I always forget that we can modify any memory address on our little machine (too much time coding on stupid "modern" computers I guess.) I have no problems with code like this, it is fast, compact, and totally understandable. For those who might not understand what is going on, I'll go over it in detail in another post. I'm going to have to change to this method of subroutine calling I think. :-)

Two caveats I might mention:
1. The return will 'fail' if the code runs from a Read-only memory device - you can't modify ROM in-line.
2. If you use this trick in every subroutine, you can call any sub from another sub. However, if a sub doesn't call another sub, you will incur a small, small performance hit for each iteration. For a while I lazily did the former - or was it for consistency?
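
Both the trick and the first caveat can be seen in a small Python model. Everything here is a stand-in: the `Memory` class, the `SUB1RT` address, and the return address are hypothetical, and a ROM is modeled simply as memory that ignores writes. The one hardware fact used is that B with a symbolic operand assembles to the word >0460 followed by the target address.

```python
B_SYMBOLIC = 0x0460      # opcode word for B @addr
SUB1RT = 0xA010          # hypothetical address of the SUB1RT label


class Memory:
    def __init__(self, rom=False):
        self.words = {}
        self.rom = rom

    def load(self, addr, value):
        """What the loader (or the ROM programmer) puts in memory."""
        self.words[addr] = value

    def write(self, addr, value):
        """A CPU write; a ROM simply ignores it."""
        if not self.rom:
            self.words[addr] = value


def call_sub1(mem, return_addr):
    r11 = return_addr                  # BL @SUB1 puts the return address in R11
    mem.write(SUB1RT + 2, r11)         # SUB1   MOV R11,@SUB1RT+2
    # ... body of SUB1, free to BL elsewhere and trash R11 ...
    return mem.words[SUB1RT + 2]       # SUB1RT B @<whatever the operand holds>


ram, rom = Memory(), Memory(rom=True)
for mem in (ram, rom):
    mem.load(SUB1RT, B_SYMBOLIC)
    mem.load(SUB1RT + 2, 0x0000)       # the "@0" placeholder operand

print(hex(call_sub1(ram, 0xA004)), hex(call_sub1(rom, 0xA004)))
```

From RAM the subroutine returns to its caller. From ROM the operand is never patched, so the branch goes to >0000.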

Just returned home with some coffee and a slice of Cheesecake Factory cheesecake. If I can survive my food coma, I may just get into the TI stuff this afternoon. :cool:

#72 Opry99er OFFLINE  

Opry99er

    Quadrunner

  • 8,246 posts
  • Location:Cookeville, TN

Posted Sat May 29, 2010 1:32 AM

So I'm making some cool progress and it's very much thanks to this thread. :) Started going through and just typing everything in. It's exciting to see something work after it's been assembled. :) I'm trying to learn this KSCAN stuff right now so I can make this thing a navigable map. I am really hoping to make that happen later today. Just got done playing and I'm about to crash hard. :) sleeeepy time!!! I'm hoping to wake up refreshed and ready to code, as I intend on making some significant progress!!! Thanks to Matthew for this thread and all the talented programmers who participate. If I ever get this off the ground and (maybe someday) playable, it will be due to you guys... Matthew, Marc, Adamantyr, sometimes, Mark, Tursi, and the rest. How damn lucky are we to have this kind of talent!!??

#73 Opry99er OFFLINE  

Opry99er

    Quadrunner

  • 8,246 posts
  • Location:Cookeville, TN

Posted Sat May 29, 2010 11:52 AM

As far as I understand the 99/4A does do debouncing, using the delay routine that Sometimes99er posted. I've seen it run when you press a key in the Classic99 debugger.

Fair point that my code does not do debounce, but for the purposes I'm using it (game input only) it should be fine, but you're right that I should try it on real hardware. In theory the only error from lack of debounce that you should see is repeated keys on a single keypress - the wrong key should not come up.

Correct that the PS/2 keyboards do their own debounce internally, you don't need to worry about it from the other end of the cable.

To see my KSCAN code just go grab the TI Farmer assembly source from the TI Farmer thread - all my XB support functions are in there, and Owen, just for you I left in the XB lines of code, so you can see how I translated them. Not that my approach is the only way, but I thought you might appreciate it.




THANKS TURSI!!!! =) I am taking a glance now. I'm having a bit of a problem however... here's my source for my scroll. It's not working properly. It draws the 14x14 window, but displays a bunch of insanity instead of my map... it will move up and down fast as greased lightning, but the side to side are slow, and I think the math is bad somewhere... Any help you can give me would be great, guys... I'm really hoping to create a nice little walking tour of this world... so far, I just don't have the stuff. Please let me in on some efficiency help too. I know this can be done in a smaller code than what I've done here. =) Thanks!!!
DEF  START
   	REF  VSBW,VMBW,VWTR,KSCAN
WS 	EQU  >8300                  	* Workspace in scratch-pad
COLTAB EQU  >0380
CHRPAT DATA >0800
START  LWPI WS             			* Load workspaces
   	LI   R0,>0701       			* Set screen to white
   	BLWP @VWTR
   	CLR  R0             			* Clear screen
   	LI   R1,>2020       			* Inefficient but effective
   	LI   R2,768
CLOOP  BLWP @VSBW
   	INC  R0
   	DEC  R2
   	JNE  CLOOP
   	LI   R0,COLTAB              	* Populate color table
   	LI   R1,CLRSET
   	LI   R2,32
   	BLWP @VMBW
   	LI   R1,PATSET              	* Load PATSET address into R1
   	LI   R2,8           			* Set R2 to 8
PLOOP  MOV  *R1+,R0                	* Move value at R1 into R0, increment R1
   	CI   R0,>FFFF       			* Check if >FFFF (end patterns)
   	JEQ  DRWMAP         			* If so, jump to mapdraw
   	SLA  R0,3           			* Multiply R0 by 8 (offset in pattern table)
   	A	@CHRPAT,R0     			* Add pattern location to base
   	BLWP @VMBW                  	* Write pattern
   	AI   R1,8           			* Increment R1 by 8
   	JMP  PLOOP                  	* Loop pattern writing
	LI   R12,MAPDAT     			*usable register to hold mapdat "status"
DRWMAP LI   R5,14                  	* Draw map
   	LI   R0,35                  	*Starting screen position
   	MOVB R12,R1   		*Load current mapdat location into R1
   	LI   R2,14                  	*Set print length
DLOOP  BLWP @VMBW                  	*print it
   	AI   R1,88                  	*carriage return for map data
   	AI   R0,32                  	*carriage return for screen
   	DEC  R5             			*loop counter
   	JNE  DLOOP                  	*if R5 is not 0, jump back to DLOOP to continue
****END DRAW SCREEN ROUTINE
	LI R1,>0100       			*check for joystick 1 input
	MOVB R1,@>8374              	*check for Y return
LP	BLWP @KSCAN    	*check for input
	CLR  R1             			*clear register for Y return check
	MOVB @>8376,R1              	*move Y return value into MSB of R1
	CI   R1,>0400       			*is the Y value "4"? (up)
	JNE  T1             			*if not, go to the next comparison
	AI   R12,-88                	*move was "up", so subtract 88 from R12
	JMP  DRWMAP         			*jumps to the draw routine
T1 	CI   R1,>FC00       			*compares MSB of R1 to >FC (down)
	JNE  T2             			*if not, go to the next comparison
   	AI   R12,88         			*move was "down" so add 88 to R12
	JMP  DRWMAP         			*jump to draw routine
T2	MOVB @>8377,R1            	*move X-return byte into R1
	CI   R1,>0400       			*is the X value "4"? (right)
	JNE  T3             			*if not, go to the next comparison
	INC  R12                    	*move was "right" so add 1 to R12
	JMP  DRWMAP         			*jump to draw routine
T3	CI   R1,>FC00       			*final comparison... is the X value >FC? (left)
	JNE  LP             			*it was not, therefore no motion happened, jump to KSCAN
	DEC  R12                    	*move was "left" so subtract 1 from R12
	JMP DRWMAP                  	*jump to draw routine


#74 Opry99er OFFLINE  

Opry99er

    Quadrunner

  • 8,246 posts
  • Location:Cookeville, TN

Posted Sat May 29, 2010 11:54 AM

Oh and just as a disclaimer... I'm using all the built in routines (VMBW, etc) just til I can get the hang of it. =) I definitely understand the usefulness of having custom routines, now that it's been explained to me by Matthew

#75 InsaneMultitasker OFFLINE  

InsaneMultitasker

    Stargunner

  • 1,693 posts

Posted Sat May 29, 2010 12:40 PM

        LI   R12,MAPDAT                         *usable register to hold mapdat "status"
DRWMAP  LI   R5,14                       * Draw map
        LI   R0,35                      *Starting screen position
        MOVB R12,R1             *Load current mapdat location into R1
        LI   R2,14    
Quick observation: Your MOVB is suspect... did you intend a MOV? ;)
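
To see why the MOVB matters here, this Python sketch compares what ends up in R1 for each instruction. The register contents are invented for the example: `0xA234` stands in for wherever MAPDAT happens to be, and `0x2020` is what the clear-screen loop left in R1.

```python
def mov(src, dst):
    """MOV Rs,Rd: the whole 16-bit word is copied."""
    return src & 0xFFFF


def movb(src, dst):
    """MOVB Rs,Rd: only the high byte is copied, the low byte of Rd survives."""
    return (src & 0xFF00) | (dst & 0x00FF)


r12 = 0xA234    # hypothetical address held in R12
r1 = 0x2020     # leftover contents of R1

print(hex(movb(r12, r1)), hex(mov(r12, r1)))
```

With MOVB the low byte of the address is lost, so VMBW would read its data from the wrong place, which fits the garbage showing up in the window.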



