Assembly on the 99/4A



#26 matthew180 OFFLINE  

matthew180

    River Patroller

  • Topic Starter
  • 2,375 posts
  • Location:Castaic, California

Posted Sat May 22, 2010 10:28 PM

Thanks retroclouds. I'm glad someone is getting something from these posts. :-)


Okay, on to the TMS9918A Video Data Processor, or VDP from here on out.

The VDP does a few things for the 99/4A:

1. Generates the video we see.
2. Creates the "GROMCLK" (GROM Clock) signal that is used by the TMS9919 sound generator.
3. Provides 16K of RAM that can be used for general purpose data storage.
4. Generates the VSYNC interrupt signal that is sent to the TMS9901.

The VDP is an 8-bit device and in the 99/4A is wired to the upper half (MSB) of the TMS9900's data bus. The VDP is also memory-mapped in the 99/4A and responds to 4 specific addresses:
VDPRD  EQU  >8800             * VDP read data
VDPSTA EQU  >8802             * VDP status
VDPWD  EQU  >8C00             * VDP write data
VDPWA  EQU  >8C02             * VDP set read/write address
Like many of the memory-mapped devices in the 99/4A, the VDP addresses are not fully decoded, so the VDP will respond to other addresses as well, but these are the universally accepted ones.

The VDP is an asynchronous device, which means it does not run on the main 99/4A clock of 3MHz. Also, the VDP has no way to tell the 99/4A that a write operation is complete or that read data is ready on the bus. It all comes down to timing, which has been the topic of many long threads (e.g. how fast can you reliably write to the VDP?). I'm not going to go into it here because it takes a lot of explaining, none of which gets us any closer to writing games. However, I will cover the basics of what happens when you read and write.

Oh yeah, first I have to explain a little where the VDP "sits" in the whole scheme of things. Using high tech ASCII characters, this is what it looks like (I'm using code tags to keep stuff lined up):
TMS9900 <--+
           |
 [16-bit data bus (bi-directional)]
           |
           +--> 256-byte scratch pad RAM
           |
           +--> ROMs
           |
         <<|>> wait state generator
           |
           +--> 16 to 8-bit multiplexer <--+
           |                               |
           |                               +--> cartridge port
           |                               |
           |                               +--> PEB port
           |
           +--[upper 8-bits]--> TMS9918A <--> 16K Video RAM
This only shows the data bus, and it is not totally complete, but it is just to get the idea across of how the VDP hooks to the CPU and where the 16K of VDP RAM (VRAM) is located. You can see that everything other than the scratch pad RAM and ROMs incurs the wait states, and also that to access the 16K of VRAM we have to go *through* the VDP.

The VDP's CPU interface consists of 3 input control pins (inputs to the VDP) and a bi-directional 8-bit data bus. The control pins are wired to the CPU's *address* bus in such a way that when you read or write to the special memory-mapped addresses listed above, the 3 control pins will have the proper signals to perform the specific operation.

To get data to and from the VDP, there are two kinds of reads and three kinds of writes:

1. Read the status register
2. Read a data byte
3. Write to a register
4. Write to (set up) the VDP address register
5. Write a data byte


The VDP has eight *write-only* registers that we use to set up the video mode and various "tables" in the VRAM (i.e. sprites, patterns, etc.), but we'll cover that later. There is one read-only register called the status register that we use to test for (and clear) the vertical sync signal. To control where data is read and written in the VRAM there is an auto-incrementing address register that we can set. So, to sum up, we have:

* One read-only status register
* Eight write-only registers to set operating modes and data table locations
* One auto-incrementing address register

The status and write-only registers are 8-bits, while the address register is 14-bits which is required to address all 16K of the VDP RAM. Setting up the address register requires us to write to the VDP twice, but we'll get to that in a moment.


The status register gives us four pieces of information:

1. If the VSYNC occurred.
2. If any two sprites coincide.
3. If there are 5 or more sprites on a line.
4. The number of the 5th sprite on a line.

The flag positions in the status register are set up like this:
Bit 0   1    2    3    4    5    6    7
+----+----+----+----+----+----+----+----+
|    |    |    |    |    |    |    |    |
|  I | S5 |  C |  fifth sprite number   |
|    |    |    |    |    |    |    |    |
+----+----+----+----+----+----+----+----+
Once any of these flags are set in the status register, they stay set until the status register is read, which then resets all the flags in the status register. So, reading the status register on a regular basis is necessary if you want to detect when any of these events happens more than once. The VSYNC is a good example. In case you're not clear, the VSYNC (vertical sync) is a pulse that is generated by the video timing logic and happens 60 times in 1 second (for NTSC.) Its primary function is to tell the monitor to sweep back to the starting position and get ready to display another video frame. The VDP makes this signal available to us in two ways:

1. Via the status register.
2. Via an interrupt output.

Unfortunately the VSYNC interrupt output generated by the VDP goes to the TMS9901 and triggers the console's interrupt service routine which resides in ROM. There is really not a whole lot we can do about that due to the way the 99/4A is wired, or without going through a lot of crazy hoops with the TMS9901 (and you really don't want to do that.)

However, what we can do is stop interrupts from being delivered to the CPU with the familiar "LIMI 0" instruction, then test for the VSYNC signal by reading the VDP's status register in a loop (our game loop is a good place). While this is not as convenient as using the interrupt, it does keep the console ISR out of our hair. Why is that important? Because the console ISR does a lot of stuff, and besides taking up time doing things we don't care about during our game, the ISR uses the heck out of the 256 bytes of scratch pad RAM. So, if we have our game variables set up in scratch pad, letting the console ISR run would trash our game.


To read the status register, we use the "VDP read status" address, like this:
        MOVB @>8802,R1
The MSB of R1 now contains the contents of the status register, and the status register is now cleared. Why the MSB of R1? Because if you remember, the VDP is wired to the upper 8-bits of the CPU's data bus, so when moving data between the VDP and a register, the MSB will always be the byte sent. Also, the MOVB instruction, when dealing with a register, always operates on the MSB.

Now we can test the bits in R1 to see if an interrupt occurred, if two sprites collided (note that we don't know *which* sprites, just that two did collide), and if there were 5 or more sprites on a single horizontal line.
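
As a rough sketch of that polling idea (not the final routine from this series), the loop below waits for VSYNC and then checks the coincidence flag. The FLAGx names and the NOHIT label are made up for this example. Because every read clears the flags inside the VDP, each read is OR'ed into R2 so a collision seen on an earlier pass through the loop is not lost:

```asm
VDPSTA EQU  >8802             * VDP status port, as defined earlier
FLAGI  EQU  >8000             * I  (VSYNC) flag, bit 0 of the status byte
FLAG5S EQU  >4000             * 5S (fifth sprite) flag, bit 1 (not used below)
FLAGC  EQU  >2000             * C  (coincidence) flag, bit 2

       LIMI 0                 * Keep the console ISR out of the picture
       CLR  R2                * R2 collects every flag seen while polling
VWAIT  MOVB @VDPSTA,R1        * Status into the MSB of R1 (clears the VDP flags)
       SOC  R1,R2             * OR this read into R2 so nothing is lost
       ANDI R1,FLAGI          * Keep only the VSYNC flag
       JEQ  VWAIT             * Zero means no VSYNC yet, so poll again
       ANDI R2,FLAGC          * VSYNC seen; did any two sprites coincide?
       JEQ  NOHIT             * No collision was flagged
*      Collision handling would go here
NOHIT  NOP                    * The rest of the game loop continues here
```

The masks only touch the MSB of the register, so whatever happens to be in the LSB of R1 or R2 does not matter.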


Like reading the status register, reading a byte from VDP memory is done via the "read data" address:
        MOVB @>8800,R1
Now the MSB of R1 will contain the byte from VRAM that was pointed to by the VDP's address register. The VDP address register was then auto-incremented so the next time we read we will get the next byte in VRAM. This is a very handy feature and means that once we set up the address register we can read and write blocks of data rapidly.

Note: all these different VDP "ports" (memory mapped addresses) are necessary because of the TMS9900's read-before-write nature.


To write to VRAM, we use the "write data" port:
        MOVB R1,@>8C00
This will write the MSB of R1 to the byte in VRAM at the address pointed to by the VDP's address register. The address register is then auto-incremented so the next write will go to the next byte in VRAM.
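
Since the address register bumps itself after every access, moving a block is just a loop around that one MOVB. Here is a minimal sketch, assuming the matching read or write address has already been set, that R1 points at a buffer in CPU RAM, and that R2 holds a non-zero byte count (those register choices are mine for this example). It leaves the timing questions mentioned earlier aside:

```asm
VDPRD  EQU  >8800             * VDP read data port, as defined earlier
VDPWD  EQU  >8C00             * VDP write data port, as defined earlier

* CPU RAM to VRAM: needs a VDP *write* address set beforehand
WLOOP  MOVB *R1+,@VDPWD       * Send one byte; the VDP advances its own address
       DEC  R2                * One less byte to go
       JNE  WLOOP             * Loop until the count reaches zero

* VRAM to CPU RAM: needs a VDP *read* address set beforehand
RLOOP  MOVB @VDPRD,*R1+       * Fetch one byte into the buffer
       DEC  R2
       JNE  RLOOP
```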


The last thing we need to be able to do is set the VDP's internal address register so we can read and write data to specific locations in the VRAM. Setting up the address register is a little more involved because it requires two bytes to represent the 14-bit VRAM address we want, and because the VDP can be told that we are setting up an address for reading or writing. That last point can be a little confusing because, since there is only one address register in the VDP, why do we set it differently for reading or writing?

In actuality, setting the read or write address does the same thing (gets a 14-bit address into the address register), however when setting the read address the VDP will do a "pre-fetch" of the data at the address we set up, then auto-increment the address. Then when we do a read, the pre-fetched byte will be given to us and the next byte will be pre-fetched and the address auto-incremented.

Thus the address stored in the address register is 1 byte ahead of the byte we are actually getting. The VDP does this to help make sure it can give us the data as quickly as possible, since it has no way to tell the CPU to "wait, I don't have the data yet". The CPU expects the data byte to be available within a certain amount of time, and if it is not, the value you get will not be what is actually in VRAM. The sequence in the VDP goes something like this:

* 1st byte sent to set read/write address via port >8C02
* VDP loads byte to LSB of address register
* 2nd byte sent to set read/write address via port >8C02
* VDP loads byte to MSB of address register
* VDP fetches the byte specified by the address register
* VDP auto-increments the address register

The address register is now 1 address after the address we just set up, but the byte we get will be correct since it was fetched before the auto-increment.

Now, when we read:
* VDP gives us pre-fetched byte
* VDP fetches the byte specified by the address register
* VDP auto-increments the address register


Now, when we set up a "write address", the VDP does not auto-increment the address register because we are providing the data to be written. There is nothing to "pre-fetch", so the address register stays at the address we just sent:

* 1st byte sent to set read/write address port >8C02
* VDP loads byte to LSB of address register
* 2nd byte sent to set read/write address port >8C02
* VDP loads byte to MSB of address register

Now when we write, this happens:

* One byte sent to the VDP write data address port >8C00
* VDP stores the byte at the address in VRAM pointed to by the address register
* VDP auto-increments the address register


None of this is a problem really, but look at what happens when we *do* the opposite of what we told the VDP we were going to do. If we set up a "read" address, then instead of reading we do a write (this is perfectly legal), the address we will be writing to is one higher in memory than where we think. This is because when we set up the VDP read address, the VDP pre-fetched the byte and auto-incremented the address register, assuming we were going to read. If you are not aware of this "feature", then you may wind up pulling your hair out trying to figure out why your screen is 1 tile off or your patterns are screwed up.

Reading after setting up a write address does not really have a damaging effect, other than the byte will not be pre-fetched and will thus take longer for the VDP to get the data. Hmm. I wonder... I might be revisiting this after some testing. It might be possible that, when reading, the VDP always assumes the pre-fetch was performed and just hands us whatever data happened to have been in the pre-fetch buffer. I need to test this.

However, if you always perform a read after setting a read address, or write after setting a write address, then you will always read and write the data you expect, to the address you expect.


So, how does the VDP know we are setting up a read or write address if there is only one memory mapped port "VDP set read/write address" at >8C02? The answer is, the VDP looks at the upper two bits of the 2nd address byte we send. Since the VDP address register is 14-bits, the 1st byte (8-bits) plus 6-bits from the 2nd byte are used to form the address. The two most significant bits of the 2nd byte we send inform the VDP that this is a read or write address:
|               2nd byte                |               1st byte                |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| t0 | t1 | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7 | a8 | a9 | a10| a11| a12| a13|
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
|  type   |                   14-bit VDP address register                       |

Here is the truth table for the two "type" bits:

00 : Setting up a read address
01 : Setting up a write address
10 : Writing to a write-only register
11 : Illegal / undefined

Here is the code to set up a read address:
WRKSP  EQU  >8300             * Where we will set the WP
R0LB   EQU  WRKSP+1           * Memory address where R0's LSB will be

. . .
       LWPI WRKSP
. . .

       LI   R0,384            * R0 contains the address we want to set up
       MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM read address
       ANDI R0,>3FFF          * Set read/write bits 0 and 1 to read (00)
       MOVB R0,@VDPWA         * Send high byte of VDP RAM read address

A few things to understand here. First, I'm using the fact that the CPU's general purpose registers are memory-based and I'm using a memory-to-memory move to send the low byte. This could be done a lot of different ways, and something common you will see is this:
       SWPB R0
       MOVB R0,@VDPWA
       SWPB R0
       ANDI R0,>3FFF
       MOVB R0,@VDPWA

Remember, we have to send the LSB to the VDP first. The SWPB (SWaP Bytes) instruction will do just that, swap the register's low and high bytes. So the LSB of the address we want to set up is now in the MSB of R0, and is sent to the VDP. Then the second SWPB puts the bytes back to their original form and the MSB of the address we want to set up is sent. Personally I don't really like this method, but it works and you might see it in code out there in the wild. There were other reasons the SWPB was used that have to do with timing, but that's a longer story.

The ANDI R0,>3FFF instruction masks out the upper two bits and makes sure they are zero, which indicates to the VDP that we are setting up a read address. If we assume that the programmer will never set a VDP address greater than 14-bits, then the upper two bits will always be zero and we can remove this instruction. This is what I personally do since it is quicker and the VDP functions we are developing will be used a lot in games. Thus, my version of setting up a VDP read address can be reduced to two instructions and assumes the VDP address to set up has been loaded into R0:
       MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM read address
       MOVB R0,@VDPWA         * Send high byte of VDP RAM read address


Setting a write address is exactly the same except we make sure that the two "type" bits are 01 by using an ORI (OR Immediate) instruction to set the bits. Actually this will not work if the most significant bit of R0 was already 1, but again I assume that the programmer will not load an address into R0 that is greater than the 14-bits used by the VDP:
       MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
       ORI  R0,>4000          * Set read/write bits 0 and 1 to write (01)
       MOVB R0,@VDPWA         * Send high byte of VDP RAM write address
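
One possible way to package the two snippets above is as a pair of small subroutines called with BL. This is only a sketch: the names VSRA and VSWA are made up here, and it assumes the workspace sits at >8300 so that R0LB really is the low byte of R0. The write version just sets the type bits and falls through into the read version:

```asm
VDPWA  EQU  >8C02             * VDP set read/write address port
WRKSP  EQU  >8300             * Assumed workspace location (set with LWPI)
R0LB   EQU  WRKSP+1           * Low byte of R0

* VSWA: set a VDP write address.  VSRA: set a VDP read address.
* In: R0 = 14-bit VRAM address.  Called with BL, returns through R11.
VSWA   ORI  R0,>4000          * Type bits 01 = write, then fall into VSRA
VSRA   MOVB @R0LB,@VDPWA      * Low byte first
       MOVB R0,@VDPWA         * Then the high byte with the type bits
       RT                     * Back to the caller (same as B *R11)

* Example call from the main program: VRAM address 384 (>0180) for writing
       LI   R0,384
       BL   @VSWA             * Sends >80, then >41 (>0180 OR >4000 = >4180)
```

Note that VSWA leaves the write bit set in R0, so the caller gets back >4180 rather than >0180.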


At this point you are ready to rock. You can start reading and writing bytes to the VRAM. The only thing we did not do is specifically set a graphics mode and fix up the various tables in VRAM by writing to the VDP's write-only registers. But, since this post is already long enough, that will come in another post.
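
To tie the read and write pieces together, here is a small sketch that reads one byte from VRAM, adds 1 to it, and writes it back to the same place. The address is set a second time before the write precisely because the read left the address register one past the byte we fetched. The workspace at >8300 is an assumption, and the NOP is just a cautious pause before the first read; whether it is really needed is part of the timing discussion I skipped:

```asm
VDPRD  EQU  >8800             * VDP read data port
VDPWD  EQU  >8C00             * VDP write data port
VDPWA  EQU  >8C02             * VDP set read/write address port
WRKSP  EQU  >8300             * Assumed workspace location (set with LWPI)
R0LB   EQU  WRKSP+1           * Low byte of R0

       LI   R0,384            * VRAM address of the byte to change
       MOVB @R0LB,@VDPWA      * Low byte of the read address
       MOVB R0,@VDPWA         * High byte, type bits 00 = read
       NOP                    * Cautious pause before reading (see note above)
       MOVB @VDPRD,R1         * Fetch the byte; the address register moves on
       AI   R1,>0100          * Add 1 to the byte held in the MSB of R1
       MOVB @R0LB,@VDPWA      * Send the same low byte again
       ORI  R0,>4000          * This time with type bits 01 = write
       MOVB R0,@VDPWA         * High byte of the write address
       MOVB R1,@VDPWD         * The byte goes back to 384, not 385
```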

Next I'll present a complete set of VDP routines that use the same calling convention (using R0, R1, and R2) as the routines available in the E/A or XB cartridges. We'll even add a new routine of our own that makes VRAM initialization fast and easy.

Matthew

#27 sometimes99er OFFLINE  

sometimes99er

    River Patroller

  • 3,852 posts
  • Location:Denmark

Posted Sat May 22, 2010 11:58 PM

Very nice. Thank you. :)

#28 Opry99er OFFLINE  

Opry99er

    Quadrunner

  • 8,246 posts
  • Location:Cookeville, TN

Posted Sun May 23, 2010 1:50 AM

Dude--- I can read the same info a hundred times, but until you or Filip explain it to me, it makes no sense. This is excellent!!! I hope you know these tutorials are ALL going on my site and I hope to make an Assembly Anecdote paperback someday just for myself. :)

#29 retroclouds OFFLINE  

retroclouds

    Stargunner

  • 1,521 posts
  • Location:Germany

Posted Sun May 23, 2010 2:21 AM

This is really turning into something very useful.
I'd love to see all those articles combined in a text-only or PDF for offline viewing.

Thinking about it; you could also add a "cookbook" to go with it.
With recipes/small examples for dealing with the common TI game development tasks: setting up a game loop, check for sprite collision, ...
You could then refer to the corresponding articles for getting all required background information.

Either way, very nice Matthew! :D

#30 Vorticon OFFLINE  

Vorticon

    River Patroller

  • 2,670 posts
  • Location:Eagan, MN, USA

Posted Sun May 23, 2010 6:32 AM

I'm actually copying and pasting them into a Word document because they are so good :)

#31 matthew180 OFFLINE  

Posted Sun May 23, 2010 8:50 AM

Thanks for the feedback guys, that's pretty much the motivation to keep posting. :-)

I was writing all this stuff down in preparation for a book on programming in general, with a focus on the 9900 and 99/4A. However, while I was writing in private, no one was getting the info and I was worried that, like most of my projects, I would not release it until it was perfect and that would end up being *never*. So, I decided to just start posting and see where it goes.

Hopefully we'll get through the necessary background soon and on to the gaming stuff!

Matthew

#32 Opry99er OFFLINE  

Posted Sun May 23, 2010 11:42 AM

This is very important for me--- these basic explanatory tutorials. Each time you post, and each time I have time to run through a chapter of SPECTRA, I learn more than I did from Molesworth, Lottrup, and the rest. The more I can grasp early on, the more the later tutorials will make sense. Since the language for Beryl Reichardt is as yet undetermined, I am very glad these are getting posted. By the time graphics and character data, etc are finished, I may very well know enough to move forward in assembly. I had started to think--- what if the game engines and such could be put on cart and the map data could be on diskette? Make a package out of it!! Kind of like Tunnels of Doom--- Might even be possible to use cassette instead of diskette--- although that would be kinda tough.... CS1 is slow, but if all the necessary loadable DATA could fit onto cassette in "world-by-world" segments, it's possible. :) Just thinking out loud. Thanks for your posts Matthew!!!

#33 matthew180 OFFLINE  

Posted Mon May 24, 2010 12:27 AM

Okay, the last thing we need to cover on the VDP is the eight write-only registers. Here they are:
REG 0
+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |
|  0  |  0  |  0  |  0  |  0  |  0  |  M3 |  EV |
|     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+
  MSB                                       LSB


REG 1
+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |
|4/16K|BLANK|  IE |  M1 |  M2 |  0  | SIZE| MAG |
|     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+
  MSB                                       LSB


REG 2
+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |
|  0  |  0  |  0  |  0  |NAME TABLE BASE ADDRESS|
|     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+
  MSB                                       LSB


REG 3
+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |
|           COLOR TABLE BASE ADDRESS            |
|     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+
  MSB                                       LSB


REG 4
+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |PATTERN GENERATOR|
|  0  |  0  |  0  |  0  |  0  |BASE ADDRESS     |
|     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+
  MSB                                       LSB


REG 5
+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |
|  0  |   SPRITE ATTRIBUTE TABLE BASE ADDRESS   |
|     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+
  MSB                                       LSB


REG 6
+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     | SPRITE PATTERN  |
|  0  |  0  |  0  |  0  |  0  | GENERATOR BASE  |
|     |     |     |     |     | ADDRESS         |
+-----+-----+-----+-----+-----+-----+-----+-----+
  MSB                                       LSB


REG 7
+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |
|     TEXT COLOR 1      |     TEXT COLOR 0 /    |
|     |     |     |     |     BACKGROUND COLOR  |
+-----+-----+-----+-----+-----+-----+-----+-----+
  MSB                                       LSB
There is a lot of wasted space in the registers (lots of zeros), but I suspect that has to do with making the internal circuitry more efficient.

Registers 0 and 1 are the most "busy" and set up a lot of the VDP's options, video mode, etc., so let's break it down.

REG 0: EV bit: This bit sets whether or not the External Video is enabled. Bet you didn't know that the 9918A could support external video! However, in the 99/4A the external video input pin is not connected, so we can't use the feature, thus, EV should always be zero.

REG 0: M3
REG 1: M1
REG 1: M2

These three bits control the video mode, which can be Graphics I, Graphics II, Multicolor, or Text mode. The mode is set like this:
M1 M2 M3
0  0  0   Graphics I mode
0  0  1   Graphics II mode
0  1  0   Multicolor mode
1  0  0   Text mode
Why they used 3-bits to represent 4 modes I have no idea (they could have used just 2 bits). Maybe there were originally going to be more than 4 modes...

REG 1: 4/16K bit: Tells the VDP how much RAM is attached to the VDP. 0 means 4K, 1 means 16K. This bit should always be set to 1 for the 99/4A, unless you only want to use 4K of VRAM.

REG 1: BLANK bit: When set to 0 it causes the display to only show the border (background) color. The console ISR sets this to 0 after no key has been pressed for 300 seconds or so. We will normally want to have this flag set to 1.

REG 1: IE bit: Interrupt Enable. A 1 lets the VDP drive its interrupt output (the signal that goes to the TMS9901), a 0 turns that output off. The interrupt flag in the status register is set either way, so polling the status register works regardless of this bit.

REG 1: SIZE: This is how we set the sprite size. A 0 selects 8x8 sprites, a 1 selects 16x16 sprites.

REG 1: MAG: Just like SIZE. A 0 selects 1X sprites, a 1 selects 2X sprites.
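
If you would rather not work the REG 1 bit pattern out by hand every time, one option is to name each flag with an EQU and add them up. The symbol names below are invented for this example (they are not from any TI manual), and the sketch assumes the assembler accepts expressions built from previously defined symbols. IE is the interrupt enable bit:

```asm
V16K   EQU  >80               * 4/16K: 1 = 16K of VRAM
VDISP  EQU  >40               * BLANK: 1 = display on, 0 = border color only
VINTE  EQU  >20               * IE: 1 = VDP interrupt output enabled
VM1    EQU  >10               * M1 mode bit (text mode)
VM2    EQU  >08               * M2 mode bit (multicolor mode)
VSIZE  EQU  >02               * SIZE: 1 = 16x16 sprites
VMAG   EQU  >01               * MAG: 1 = 2X sprites

* Graphics I (M1 = M2 = 0), 16K, display on, IE on, 16x16 sprites at 1X
R1GFX1 EQU  V16K+VDISP+VINTE+VSIZE     * = >E2
* Text mode (M1 = 1), 16K, display on, IE on
R1TEXT EQU  V16K+VDISP+VINTE+VM1       * = >F0
```

M3 lives in REG 0, so the Graphics II mode would need a similar constant (>02) for that register.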


Okay, so that's it for REG 0 and REG 1 and the jumble of flags in them. The rest of the registers have to do with table locations in VRAM, except for REG 7, which sets the foreground and background colors for text mode. The background color also serves as the border color.

I've pretty much covered how the VDP uses the various tables, so I won't rehash that here (if you don't remember, look back at the posts on page 1 of this thread.) The 9918A is very flexible in that the tiles and sprites are all generated from patterns in VRAM. Most other systems of that era (and even today) have character ROMs, so you can't redefine the tile patterns!

There are 3 primary tables the VDP needs to draw the tiles on the screen:

1. The tile "names"
2. The tile "patterns"
3. The tile "colors"

Each table's location in VRAM must be designated with a 14-bit address, so the bits in the registers are used to form part of the address, with the other bits coming from other sources like the current x,y location of the screen raster, the tile name, and other such things. I won't cover the table address construction in detail since that is what the VDP's datasheet is for, but I will try to explain why each "base" address in the write-only register is multiplied by a certain value.

In the 14-bit address that designates where the name table is located, the four bits from REG 2 fit in like this:
REG 2
  0    1    2    3    4    5    6    7
+----+----+----+----+----+----+----+----+
|  0 |  0 |  0 |  0 | n0 | n1 | n2 | n3 |
+----+----+----+----+----+----+----+----+
                      |    |    |    |
                      \/   \/   \/   \/
                    +----+----+----+----+----+----+----+----+----+----+----+----+----+----+
Name table location | n0 | n1 | n2 | n3 |    |    |    |    |    |    |    |    |    |    |
                    +----+----+----+----+----+----+----+----+----+----+----+----+----+----+
                      0    1    2    3    4    5    6    7    8    9    10   11   12   13
The four name table bits make up the four most significant bits of the address for the location of the name table. The lower ten bits are formed from part of the x,y location of the raster as it moves across the monitor screen. Thus, if the value in REG 2 is >00 (0000 0000), the name table address will be >0000 (00 0000 0000 0000). If we set the value in REG 2 to >01 (0000 0001), then the name table address becomes >0400 (00 0100 0000 0000). So, just by changing the name table base register by 1, the name table moved from address >0000 to >0400. If we change REG 2 to >02 then the name table will be at >0800, and so on.

This is why you will read that the values in the various table set up registers are multiplied by certain values: these bits are used to form the most significant bits of the table's address in VRAM. To continue the example with the name table, to take 4-bits to 14-bits, we have to shift to the left 10 times, and every shift multiplies the number by 2. So 2^10 is 1024 or >0400. Hmm, that looks familiar. :-) So, whatever value is in the name table base register will effectively be multiplied by >0400 to make the name table base address:
              REG 2:         0000 1111
Shift left 10 times: 11 1100 0000 0000
Thus, the name table can be located in VRAM on any >0400 boundary. Since the maximum address that can be represented with 14-bits is >3FFF, there are 16 places in VRAM where we can locate the name table:
      REG 2    VRAM Address
1.  0000 0000  >0000
2.  0000 0001  >0400
3.  0000 0010  >0800
4.  0000 0011  >0C00
5.  0000 0100  >1000
6.  0000 0101  >1400
7.  0000 0110  >1800
8.  0000 0111  >1C00
9.  0000 1000  >2000
10. 0000 1001  >2400
11. 0000 1010  >2800
12. 0000 1011  >2C00
13. 0000 1100  >3000
14. 0000 1101  >3400
15. 0000 1110  >3800
16. 0000 1111  >3C00
You may have noticed that >0400 is 1024 in decimal, or 1K, and since there is 16K of VRAM, that means 16 possible locations. Now, the screen is made up of 32x24 tiles which is 768 locations, each of which contains a "name" to indicate what tile to display at that location. A "name" is a single byte, and thus a value between 0 and 255 (>00 and >FF). So, the name table is 768 bytes long and located on a 1K boundary.
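
The same shift can be done in code if you ever need to convert between a REG 2 value and a name table address at run time. A tiny sketch, using row 7 of the table above as the example values:

```asm
* REG 2 value to name table address: shift left 10 (multiply by >0400)
       LI   R1,>0006          * REG 2 value 0000 0110
       SLA  R1,10             * R1 = >1800

* Name table address back to a REG 2 value: shift right 10
       LI   R1,>1800          * Name table address
       SRL  R1,10             * R1 = >0006
```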

Since a lot of the VRAM is not needed in all but the Graphics II mode, you can have multiple name tables set up in VRAM and do double or triple buffering, i.e. display one name table while writing to another, then you can switch the whole display simply by changing the value in REG 2!

The pattern generator table and color table are set up the same way, except the pattern generator only has 3-bits to locate the table, and the color table has 8-bits:
  0    1    2    3    4    5    6    7    8    9    10   11   12   13
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| p0 | p1 | p2 |    |    |    |    |    |    |    |    |    |    |    |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+

  0    1    2    3    4    5    6    7    8    9    10   11   12   13
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| c0 | c1 | c2 | c3 | c4 | c5 | c6 | c7 |    |    |    |    |    |    |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
Thus the pattern generator can only be in 8 possible locations in VRAM, on 2K boundaries. Since the pattern generator table has to hold patterns for all 256 possible "names", and each pattern is an 8x8 pixel tile, 8 bytes are required for each "name". So, 256 * 8 = 2048 (>0800) bytes, or 2K bytes long.

The color table is the most flexible since all 8-bits of REG 3 are used in making the color table location address. However, the color table is short, only 32 (>20) bytes in length. Each byte in the color table defines the foreground and background color for eight "names" in the name table. This is where the "color sets" come from. The 1st byte in the color table defines the color for "names" 0 to 7. The 2nd byte for "names" 8 to 15, and so on. The color table can be located anywhere in VRAM on a 64 (>40) byte boundary, for a total of 256 possible locations.
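
For instance, giving every color set the same colors is just 32 writes of one byte. This sketch assumes the color table has been placed at >0380 (that is, >0E in the color table base register, since >0E times >40 is >0380) and that the workspace is at >8300; both are choices made for this example only. In each color table byte the upper four bits are the foreground color and the lower four bits the background color:

```asm
VDPWD  EQU  >8C00             * VDP write data port
VDPWA  EQU  >8C02             * VDP set read/write address port
WRKSP  EQU  >8300             * Assumed workspace location (set with LWPI)
R0LB   EQU  WRKSP+1           * Low byte of R0
COLTAB EQU  >0380             * Assumed color table address for this example

       LI   R0,COLTAB+>4000   * Color table address plus write type bits (01)
       MOVB @R0LB,@VDPWA      * Low byte first
       MOVB R0,@VDPWA         * Then the high byte
       LI   R1,>F400          * >F4 in the MSB: white (F) on dark blue (4)
       LI   R2,32             * One byte per set of eight "names"
CLOOP  MOVB R1,@VDPWD         * Write the same color byte each time
       DEC  R2
       JNE  CLOOP             * Until all 32 sets are filled
```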

The sprite tables work exactly the same way, with the sprite attribute table being almost as flexible (it uses 7-bits) in location as the color table, and the sprite generator table being exactly the same as the pattern generator table with only 3-bits to specify the table location.

Note that any of these tables can overlap and it is up to you to make sure they do not. Or, in some cases you may want them to overlap! For example, you can set the tile and sprite pattern generator tables to the same base value and use the same pattern definitions for both tiles and sprites. This saves VRAM if you need the space. If not, your sprites can use their own set of patterns separate from those used to display tiles!
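
One way to keep the layout straight is to write the table addresses down once and let the assembler do the divisions. The layout below is only an example I picked so that nothing collides except the deliberate sharing of one pattern table by tiles and sprites. It also assumes your assembler supports the / operator in expressions; each expression is written so it comes out the same whether the assembler evaluates strictly left to right or not. Each word is in "register number, value" form, ready to be sent to the VDP:

```asm
NAMTAB EQU  >0000             * Name table, >0000 to >02FF   (>0400 boundary)
SATTAB EQU  >0300             * Sprite attributes, >0300 on  (>0080 boundary)
COLTAB EQU  >0380             * Color table, >0380 to >039F  (>0040 boundary)
PATTAB EQU  >0800             * Tile patterns, >0800 to >0FFF (>0800 boundary)
SPRTAB EQU  >0800             * Sprite patterns share the tile patterns

VREGS  DATA NAMTAB/>0400+>0200   * REG 2 = >00
       DATA COLTAB/>0040+>0300   * REG 3 = >0E
       DATA PATTAB/>0800+>0400   * REG 4 = >01
       DATA SATTAB/>0080+>0500   * REG 5 = >06
       DATA SPRTAB/>0800+>0600   * REG 6 = >01
```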


Writing to the VDP registers is just as easy as setting up the VDP's internal auto-incrementing address (covered in my previous post.) It is actually the same except for one difference, we have to make sure we set the upper two bits of the 2nd byte to "10" (one zero, *not* the number ten) to let the VDP know we want to write to a register. The lower 3-bits of the 2nd byte designate the register we want to write to. And finally, the 1st byte we sent will be the value written into the register. Here is the sequence:

* 1st byte sent to set read/write address via port >8C02
* 2nd byte sent to set read/write address via port >8C02
* VDP checks if the top two bits of the 2nd byte are "10", and if so writes the 1st byte to the register designated by the lower 3-bits of the 2nd byte.
|               2nd byte                |               1st byte                |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
|  1 |  0 |  0 |  0 |  0 | r0 | r1 | r2 |       DATA TO WRITE TO REGISTER       |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
|  type   | must be zero |   register   |
Just like when setting up a read or write address, the two "type" bits indicate what to do. In this case "10" means write to a register. r0, r1, and r2 make a number between 0 and 7:

000 = REG 0
001 = REG 1
010 = REG 2
011 = REG 3
100 = REG 4
101 = REG 5
110 = REG 6
111 = REG 7

This is just normal binary and should be familiar to you by now. Here is the code we can use to write to a register:
       LI   R0,>0201          * Set REG 2 to 00000001 (locate name base table at >0400)
       MOVB @R0LB,@VDPWA      * Send low byte (value) to write to VDP register
       ORI  R0,>8000          * Set up a VDP register write operation (10)
       MOVB R0,@VDPWA         * Send high byte (address) of VDP register
The LSB of R0 contains the value we want to write to the register, the MSB contains the register to write to.
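
The same two-byte sequence can be run in a loop to set all eight registers from a table. This is only a sketch: the labels VREGS, VDPINI, and VDPILP and the register values are assumptions for illustration (a Graphics I style setup), and interrupts are assumed to be off so nothing else touches the VDP between the two bytes:
VREGS  BYTE >00,>E0,>00,>0E   * REG 0-3: mode, mode, name table >0000, color table >0380
       BYTE >01,>06,>01,>F5   * REG 4-7: patterns >0800, sprite attributes >0300, sprite patterns >0800, colors

VDPINI LI   R1,VREGS          * Point at the table of register values
       LI   R0,>8000          * "10" type bits plus register number 0 in the MSB
VDPILP MOVB *R1+,@VDPWA       * 1st byte: the value for this register
       MOVB R0,@VDPWA         * 2nd byte: >80 plus the register number
       AI   R0,>0100          * Next register number
       CI   R0,>8800          * Past REG 7?
       JNE  VDPILP            * No, write the next one
Because the register number lives in the MSB of R0, adding >0100 steps through >80 to >87, which keeps the "must be zero" bits at zero.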

Okay, that's it for the VDP! Not so hard was it? We will cover sprites in detail a little later, since I'm sure you (like me) are itching to get going on something useful. In my next post I'll present a complete set of VDP functions, then it will be on with doing something more fun.

Matthew

Edited by matthew180, Mon May 24, 2010 12:32 AM.


#34 matthew180 OFFLINE  

matthew180

    River Patroller

  • Topic Starter
  • 2,375 posts
  • Location:Castaic, California

Posted Mon May 24, 2010 11:55 AM

Okay, so here are the VDP functions I promised. If you are reading this to learn assembly, then I *highly recommend* you type in this code instead of just copying and pasting. I know it sucks, but when you type each instruction you are forced to look at it in detail and it helps drive home what is going on. Also, you get more familiar with actually writing code, which is what you are trying to do anyway. Personally I always learn something or resolve a problem by typing in the examples.

I put this code at the bottom of my programs; just above the character definitions I posted before is a good place.
*********************************************************************
*
* VDP Single Byte Write
*
* R0   Write address in VDP RAM
* R1   MSB of R1 sent to VDP RAM
*
* R0 is modified, but can be restored with: ANDI R0,>3FFF
*
VSBW   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
       ORI  R0,>4000          * Set read/write bits 14 and 15 to write (01)
       MOVB R0,@VDPWA         * Send high byte of VDP RAM write address
       MOVB R1,@VDPWD         * Write byte to VDP RAM
       B    *R11
*// VSBW


*********************************************************************
*
* VDP Single Byte Multiple Write
*
* R0   Starting write address in VDP RAM
* R1   MSB of R1 sent to VDP RAM
* R2   Number of times to write the MSB byte of R1 to VDP RAM
*
* R0 is modified, but can be restored with: ANDI R0,>3FFF
*
VSMW   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
       ORI  R0,>4000          * Set read/write bits 14 and 15 to write (01)
       MOVB R0,@VDPWA         * Send high byte of VDP RAM write address
VSMWLP MOVB R1,@VDPWD         * Write byte to VDP RAM
       DEC  R2                * Byte counter
       JNE  VSMWLP            * Check if done
       B    *R11
*// VSMW


*********************************************************************
*
* VDP Multiple Byte Write
*
* R0   Starting write address in VDP RAM
* R1   Starting read address in CPU RAM
* R2   Number of bytes to send to the VDP RAM
*
* R0 is modified, but can be restored with: ANDI R0,>3FFF
*
VMBW   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
       ORI  R0,>4000          * Set read/write bits 14 and 15 to write (01)
       MOVB R0,@VDPWA         * Send high byte of VDP RAM write address
VMBWLP MOVB *R1+,@VDPWD       * Write byte to VDP RAM
       DEC  R2                * Byte counter
       JNE  VMBWLP            * Check if done
       B    *R11
*// VMBW


*********************************************************************
*
* VDP Single Byte Read
*
* R0   Read address in VDP RAM
* R1   MSB of R1 set to byte from VDP RAM
*
VSBR   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM read address
       MOVB R0,@VDPWA         * Send high byte of VDP RAM read address
       MOVB @VDPRD,R1         * Read byte from VDP RAM
       B    *R11
*// VSBR


*********************************************************************
*
* VDP Multiple Byte Read
*
* R0   Starting read address in VDP RAM
* R1   Starting write address in CPU RAM
* R2   Number of bytes to read from VDP RAM
*
VMBR   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM read address
       MOVB R0,@VDPWA         * Send high byte of VDP RAM read address
VMBRLP MOVB @VDPRD,*R1+       * Read byte from VDP RAM
       DEC  R2                * Byte counter
       JNE  VMBRLP            * Check if finished
       B    *R11
*// VMBR


*********************************************************************
*
* VDP Write To Register
*
* R0 MSB    VDP register to write to
* R0 LSB    Value to write
*
VWTR   MOVB @R0LB,@VDPWA      * Send low byte (value) to write to VDP register
       ORI  R0,>8000          * Set up a VDP register write operation (10)
       MOVB R0,@VDPWA         * Send high byte (address) of VDP register
       B    *R11
*// VWTR

These functions are used exactly like the ones TI came up with for the E/A and XB carts. R0 holds the VRAM address of where to read or write. R1 holds the value to write, or receives the value that was read. Or, in the case of the multiple-byte read and write functions (VMBR / VMBW) R1 holds the CPU address of where the data is read from or written to. R2 holds the count of the number of bytes to read or write when using VMBR or VMBW. The functions I provided are commented, and you should recognize the register use if you have messed with any assembly before.
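
For example, here is a rough sketch of putting a message on the screen with VMBW and then reading one character back with VSBR. It assumes the name table is at >0000, that ASCII patterns are already loaded in the pattern table, and that the workspace is at >8300 so R0LB can be equated to >8301 (skip the EQUs if you already have them). WRKSP, MSG, and the screen position are made up for the example:
WRKSP  EQU  >8300             * Workspace in scratch pad RAM (assumed)
R0LB   EQU  WRKSP+1           * Address of the low byte of R0

       LWPI WRKSP
       LI   R0,332            * Row 10, column 12: (10 * 32) + 12
       LI   R1,MSG            * CPU RAM address of the text
       LI   R2,11             * Number of bytes in MSG
       BL   @VMBW             * Copy the text to the name table

       LI   R0,332            * R0 was changed by VMBW, so load it again
       BL   @VSBR             * MSB of R1 now holds >48 ('H')

* Data, keep it with the rest of your data and out of the code path
MSG    TEXT 'HELLO WORLD'
       EVEN
Note that the routines count R2 down to zero, so a count of zero on entry would wrap around and write or read 65536 bytes.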

The main difference with my functions is that I do not use BLWP to call them. Why not? Well, it comes down to memory use and speed (and I am a speed freak.)

BLWP means "Branch and Load Workspace Pointer". It is a way to "branch" to the designated subroutine, but it also loads the workspace pointer with a new value. That means the called subroutine will be using a different workspace base address and thus will not clobber any of the values you have in your own workspace.

When the subroutine is complete, it returns and restores the workspace we had set up before the instruction. While this may sound good and all, there are a few downsides to BLWP:

* There has to be a new workspace somewhere, the address of which is loaded into the workspace pointer. That means either someplace in scratch pad RAM (the desired location for any workspace) or in 8-bit RAM (which is slow slow slow.) And, if the new workspace is in the scratch pad RAM, then that is another 32 bytes of scratch pad we cannot use for our game. It also means we have to know *what* that new workspace pointer value will be, so we need extra knowledge about the subroutine and how it will work.

* The instruction itself is slow because of what it has to do. It is in the top 6 slowest instructions at 26 clock cycles in the best case.

* There is some additional setup to call subroutines with BLWP.

* The instruction and setup takes additional memory.

All that just to keep R0, R1, and R2 unaffected by the VDP subroutines. To me it is not worth it, so I simply use BL (Branch and Link) to call the subroutines and understand that the values in R0, R1, and R2 may be modified by the subroutines. But that is okay, I loaded those registers before I called the subroutine, so I know what was in them.
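
To make the difference concrete, here is a minimal sketch of the two calling styles side by side. The labels MYWS, MYVEC, MYSUB, and MYSUB2 are made up, and the subroutine bodies are left as comments:
* BLWP version: needs a 32-byte workspace and a two-word vector
MYWS   BSS  32                * Workspace used only by the subroutine
MYVEC  DATA MYWS,MYSUB        * Vector: new workspace pointer, then new program counter

MYSUB  MOV  *R13,R0           * R13 holds the caller's WP, so this fetches the caller's R0
       MOV  @2(R13),R1        * Fetch the caller's R1
*      ... do the work in our own registers ...
       RTWP                   * Return, restoring the caller's WP, PC, and status

       BLWP @MYVEC            * The call itself

* BL version: no workspace, no vector, registers are shared with the caller
*      ... do the work directly on R0, R1, R2 ...
MYSUB2 B    *R11              * Return through the address BL saved in R11

       BL   @MYSUB2           * The call itself
The BLWP version has to reach back through R13 for its parameters, which is part of the extra setup and time mentioned above.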

This is where register use really comes into play, and how you decide to use the 16 general purpose registers. When I was just starting out with assembly (which was also when I was learning to program altogether), I tried to keep certain values in certain registers because I did not really translate the idea of variables (like in XB) to assembly. For example, I would do stuff like: R4 is the X location of the player, R5 the Y location, R6 the score, etc. Needless to say, I ran out of registers really quick!

That is the wrong way to think about registers though. You must have variables in your program and those variables are stored in RAM. The registers are used to manipulate the variables, run loops, control flow, call subroutines, etc. My code habits and register use are now somewhat like this:

R0, R1, and R2 are always used for VDP interaction.

R3, R4, R5, and R6 are typically general purpose and always in flux. I always use R3 and R4 for the MUL and DIV instructions for some reason.

R7 is used when I call the random number subroutine. I think I do this because of reading the Tombstone City code back in the day.

R8 and R9 are just extra.

R10 holds my pseudo stack pointer to help in calling a few levels of subroutines.

R11 is used by the BL instruction to save the current PC value so the subroutine can return.

R12 is used by the CRU instructions and necessary for keyboard and joystick checking.

R13, R14, R15 are just extra unless you are going to use BLWP, in which case they are all used.


These are just my conventions, and I recommend you develop your own as you use assembly language. Everyone will do things a little differently and that's okay.
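
Since BL always overwrites R11, a subroutine that calls another subroutine has to save its own return address first. One possible way to do that with R10 as a simple stack pointer is sketched below. This is an assumption about how such a stack could work, not necessarily the exact scheme used in my programs, and STACK and SUBA are made-up names:
STACK  BSS  16                * Room for 8 return addresses (size is arbitrary)

       LI   R10,STACK         * Set up the stack pointer once at startup

SUBA   MOV  R11,*R10+         * Push our return address, R10 moves up by 2
       BL   @VSMW             * This BL overwrites R11
       DECT R10               * Step back to the saved address
       MOV  *R10,R11          * Pop it into R11
       B    *R11              * Return to our caller
The caller of SUBA still loads R0, R1, and R2 before the call, exactly as it would for VSMW itself.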


The other thing I wanted to cover is the new VDP subroutine that does not have a TI equivalent. Several people have come up with the same thing independently, and it is a natural progression once you start to understand what is going on.

The new subroutine is called VSMW which stands for "VDP Single Byte Multiple Write" and is included in the code above. It is useful when you want to write the same byte value to multiple VRAM addresses, like when initializing VRAM, i.e. clearing the screen.

You will typically see "clear the screen" examples written something like this:
       LI   R0,>0000          * Start at upper left corner of the screen, assume name table is at >0000
       LI   R1,>2000          * Write >20 (32 decimal), remember the MSB is sent to the VDP
       LI   R2,768            * 768 screen locations (32x24 tiles)
CLS    BLWP @VSBW             * Write the space
       INC  R0                * Next screen location
       DEC  R2                * Decrement the counter
       JNE  CLS               * If not done (zero), jump to CLS

The main problem is, VSBW sets up the VDP address "every time" it is called. Now, knowing what we do about the VDP and its auto-incrementing address register, once we set the address to >0000 and write the first byte, the VDP's address register will auto-increment and already be pointing to the next address we need to write to. It does not need to be set again, but this is what happens when you use VSBW.

So, why not use VMBW? Because VMBW requires us to provide an *address* of where to copy data from, not a single value to write repeatedly. VMBW is for copying blocks of data from CPU RAM to VRAM.

Let's write a custom CLS program to see what all needs to be done:
       LI   R0,>0000          * Start at upper left corner of the screen, assume name table is at >0000
       LI   R1,>2000          * Write >20 (32 decimal), remember the MSB is sent to the VDP
       LI   R2,768            * 768 screen locations (32x24 tiles)
* Set the VDP address ONCE
       MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
       ORI  R0,>4000          * Set read/write bits 14 and 15 to write (01)
       MOVB R0,@VDPWA         * Send high byte of VDP RAM write address

CLS    MOVB R1,@VDPWD         * Write the >20 byte to VRAM, VDP's internal address register auto-increments
       DEC  R2                * Decrement our counter
       JNE  CLS               * If not done (zero), jump to CLS
We still have to keep track of the number of bytes we have sent to the VDP, but we don't have to set the address again, we know it is auto-incrementing in the VDP, and after 768 writes, every location in the name table will contain >20 and the screen will be clear.

So, what we need is a cross between VSBW and VMBW, a subroutine that will write the same value like VSBW, but set up the address once and write it multiple times, like VMBW. That's what VSMW does, and our final code segment would look something like this:
       LI   R0,>0000          * Start at upper left corner of the screen, assume name table is at >0000
       LI   R1,>2000          * Write >20 (32 decimal), remember the MSB is sent to the VDP
       LI   R2,768            * 768 screen locations (32x24 tiles)
CLS    BL   @VSMW             * Write the space 768 times
Kind of nice (I think.) Note that we are using BL instead of BLWP, which means the call to the subroutine will be faster. Also, we no longer have a loop to deal with since that is taken care of for us by VSMW.

So there you go, a set of handy and fast VDP routines that you can use in your games. I have been using these for a few years and have tested them recently in the FlyGuy II game (see the thread in this forum if you are interested.) You also got to see how to access the VDP directly, which is sometimes necessary in certain situations, and you should not be scared to do stuff like that, it is perfectly okay. The general VDP routines are just that, general. If you have a special case or time critical loop, by all means use the VDP directly!

In the next post we can finally get some code that you can assemble and get something on the screen! I can't wait! :-)

Matthew

#35 retroclouds OFFLINE  

retroclouds

    Stargunner

  • 1,521 posts
  • Location:Germany

Posted Mon May 24, 2010 12:11 PM

As always; very cool :cool:

#36 adamantyr OFFLINE  

adamantyr

    Stargunner

  • 1,103 posts

Posted Mon May 24, 2010 12:50 PM

Not just a speed freak, but an absolute enemy of the ISR. :) Perhaps you should break down what the ISR is doing and how to do everything it does without actually using the ROM-based routine?

In my CRPG, I keep the BLWP's for my video routines. The main reason I do this is because I don't want the position and counts to be destroyed and have to be restored every time. I could probably write around that, but I want to minimize code space usage. In this instance, BLWP is a minor trade-off of speed for space.

I do use direct VDP addressing when I have a serious need for speed, like when I update a portion of the screen that requires it to "jump" addresses rather than just linearly write. (Like when plotting the right-rail statistics and the battle map.) I also pre-process all the display data into a CPU memory buffer prior to writing, so the video routine is not doing any comparison other than loop checks.
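
For readers following along, a write that "jumps" addresses could be sketched with the BL-based VMBW from Matthew's post like this. It is only an illustration, not code from the CRPG: the 8-byte width and the label WINLP are made up, and the data is assumed to be pre-built in a CPU RAM buffer with the rows stored back to back:
* R0 = VRAM address of the window's top-left corner
* R1 = CPU buffer holding the rows back to back
* R3 = number of rows (must not be zero)
WINLP  LI   R2,8              * Bytes per row in this sketch
       BL   @VMBW             * Write one row, R1 advances past it
       ANDI R0,>3FFF          * Remove the write bits VMBW added to R0
       AI   R0,32             * Same column, next screen row
       DEC  R3                * Row counter
       JNE  WINLP             * More rows to go
If this loop lives inside a subroutine that was itself called with BL, R11 has to be saved before the first BL @VMBW.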

Adamantyr

#37 Opry99er OFFLINE  

Opry99er

    Quadrunner

  • 8,246 posts
  • Location:Cookeville, TN

Posted Mon May 24, 2010 1:57 PM

Matthew, THANK YOU!!!! This is great stuff! I'll be typing these in one by one tonight. :)

#38 matthew180 OFFLINE  

matthew180

    River Patroller

  • Topic Starter
  • 2,375 posts
  • Location:Castaic, California

Posted Mon May 24, 2010 4:10 PM

Not just a speed freak, but an absolute enemy of the ISR. :) Perhaps you should break down what the ISR is doing and how to do everything it does without actually using the ROM-based routine?


Haha! I don't like the console ISR for sure, but it does have its place. If you are doing XB, then it is useful to take care of the house keeping and such. But IMHO it has no place in a game or assembly language program. Its "offerings" are not enough to warrant using it.

What the ISR does, okay, here are the basics:

* First it has to try and figure out *what* caused the interrupt in the first place. This is due to all interrupts in the 99/4A being tied together! Way to go, 99/4A designers. The 9900 supports 16 levels of interrupts, but that does us absolutely no good. So, the ISR checks if the interrupt was the VDP. If not, it checks DSR interrupt routines. If not, the cause was the cassette, so you are dumped to the cassette routine.

* If the cause was the VDP, then:

** The sprite movement is processed.
** The sound is processed.
** The QUIT key is checked.
** The check is made to see if the screen can be disabled due to no recent key press.
** The "user ISR" hook is checked, and if active, control is passed to the address specified.

All that and the ISR tromps all over the scratch pad RAM. The only thing that might be useful to an assembly programmer is the sound playing, but we can easily check the VDP status register and update our own sound lists every 1/60th of a second. The XB sound player I wrote for Owen demonstrates all this and it is really pretty easy.
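
A minimal sketch of that status check, assuming interrupts are left disabled and using the VDPSTA equate from the earlier post. SNDUPD is a hypothetical routine that steps your own sound list by one tick:
       LIMI 0                 * Interrupts stay off, so the console ISR never runs
VWAIT  MOVB @VDPSTA,R0        * Read the VDP status register (reading also clears it)
       ANDI R0,>8000          * Keep only the frame flag, the leftmost bit
       JEQ  VWAIT             * Not set yet, keep polling
       BL   @SNDUPD           * Once per frame: update our own sound list
Reading the status register also clears the sprite collision and fifth sprite flags, so copy the byte somewhere before the ANDI if you need them.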

The auto sprite movement, well, good for XB I guess, but in a game you are going to be controlling all of that anyway, and you can be more efficient.

Blanking the screen is not typically something you want to do in a game, and that is really easy to do as well if you need to.

I'll be demonstrating most of these things in the posts to come.

Matthew

Edited by matthew180, Wed Jul 24, 2013 8:00 AM.


#39 marc.hull OFFLINE  

marc.hull

    Stargunner

  • 1,113 posts
  • Location:Oklahoma CIty.

Posted Mon May 24, 2010 9:36 PM

perhaps you should explain how to turn off those undesirable features of the ISR so that someone doesn't get scared to use the hook....

#40 matthew180 OFFLINE  

matthew180

    River Patroller

  • Topic Starter
  • 2,375 posts
  • Location:Castaic, California

Posted Tue May 25, 2010 10:14 AM

perhaps you should explain how to turn off those undesirable features of the ISR so that someone doesn't get scared to use the hook....


Right, I forgot. You can disable the sprite, sound, and QUIT key processing by setting specific bits in address >83C2. The console ISR will still be executed though, and any DSR interrupts can still run, the screen will still be blanked if no key is pushed, and the ISR still has a lot of code to run through making the checks, and it still requires specific use of the scratch pad RAM.

Here are the bits to use to disable what we can (if you are going to use the ISR):
Address >83C2:
   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
| ALL | SPR | SND | QUIT|       DON'T CARE      |
+-----+-----+-----+-----+-----+-----+-----+-----+

ALL: set to 1 to skip sprite, sound, and QUIT key processing
SPR: set to 1 to skip sprite processing only
SND: set to 1 to skip sound processing only
QUIT: set to 1 to skip the QUIT key test
Even though an interrupt is nice to have, *IMO* it is not worth using in a 100% assembly program. I'm not trying to advocate one way or another, I'm simply trying to give the learner the information to make their own decisions. I hope no one is ever *scared* to try anything in their programs. Nothing will blow up, so look at it as an adventure into the unknown. Honestly, that's what makes programming fun (at least for me.)
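
As a small sketch of using the flag byte at >83C2 shown above, this turns off the sprite movement and the QUIT key test while leaving the sound processing alone. ISRCTL is a made-up name for the address:
ISRCTL EQU  >83C2             * ISR control byte (name is an assumption)

       LI   R0,>5000          * Bits 1 and 3 set (>40 + >10): skip sprites and QUIT
       MOVB R0,@ISRCTL        * Sound processing stays enabled
The bits are numbered from the left as in the diagram, so ALL is >80, SPR is >40, SND is >20, and QUIT is >10.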

I simply dislike how books explain using sound by setting certain values in various memory addresses, then there is some code like this:

LIMI 2
LIMI 0

And the explanation says "... quickly enable, then disable interrupts so the sound can be heard ..." WHAT?? Never could I find an explanation about what the heck interrupts had to do with the sound and how that worked. Now I do know, it was the ISR's sound player code, which is not such a big deal either. Very simple actually and just as easy to do it ourselves so we have better control (especially for games.)

When using assembly with XB, then definitely yes, the ISR and the user hook are your lifeline. But I wrote an 8-hour post about that already.

Matthew

#41 retroclouds OFFLINE  

retroclouds

    Stargunner

  • 1,521 posts
  • Location:Germany

Posted Tue May 25, 2010 12:33 PM

Nice write-up.

And yes, turn off the ISR if you're going 100% assembly language. It will make your life a lot easier.

Actually I had the first early Pitfall prototypes completely running from the ISR hook.
I was so convinced I was doing the right thing. What was I thinking??!! :-o

#42 Tursi OFFLINE  

Tursi

    River Patroller

  • 4,690 posts
  • Location:BUR

Posted Tue May 25, 2010 2:40 PM

That's actually not so unusual! :) Many of Nintendo's NES titles run entirely from vertical blank (see the thread here on the Mario Bros source, for instance).

However, I've been thinking about it since reading Matthew's post this morning. At first, I was thinking that I'd write a contrary view to give both sides, but after a lot of thought, and indeed my current experience with the TI Farmer port, I think I agree. For a game, you are better off disabling interrupts and not using any console functions. For one, it frees up the entire scratchpad RAM for you (at only 256 bytes, it's silly to 'reserve' parts of it, especially if you aren't using the ISR code and just wanted the interrupt itself).

Since most TI games run with interrupts disabled, and do LIMI 2/LIMI 0 when they are ready to deal with the ISR, you can easily replace that with a check of the VDP status bit yourself.. if it's set, then a blank has occurred and you can process your end of screen details. If it's not, then there was nothing to do yet, carry on. You gain even more doing it this way in that the ISR function no longer wastes half your blanking time deciding whether to do anything. ;)

So although this is indeed a new stance for me, I also agree. ASM titles on the TI are better off NOT using the console interrupt code. ;) The benefits far outweigh the small inconvenience of checking it yourself.

#43 Opry99er OFFLINE  

Opry99er

    Quadrunner

  • 8,246 posts
  • Location:Cookeville, TN

Posted Tue May 25, 2010 3:31 PM

That's initially how I thought it worked--- build the program around the ISR.. It seemed to make a lot of sense--- but now I am beginning to understand that it's not quite that cut and dried. I am also beginning to see that if I wanted to do "Honeycomb Rapture" in assembly, it would be easy as pie. :) However, I would not use the ISR to accomplish it, now that I know more about how this stuff works. I'm looking forward to learning more as I get better in my assembly demos. So far I've been able to successfully cobble together a few working demos--- none of which are games or even CLOSE to games... But with these tutorials, I'm starting to see that the "big bad" assembly language is really nothing more than applying standard logic (just as in XB) to a more sophisticated and compartmentalized set of requirements. Physically moving values to and from registers and doing the VDP writes and reads manually. It's actually fun. :)

#44 matthew180 OFFLINE  

matthew180

    River Patroller

  • Topic Starter
  • 2,375 posts
  • Location:Castaic, California

Posted Tue May 25, 2010 7:19 PM

FlyGuy II pretty much runs only during the vertical retrace, but only because there is really not much to the game. The game loop polls the VDP status register and branches in one of two ways depending on if the VSYNC bit was set. The branch that gets taken when the VSYNC was not detected could be used to do lots of other game stuff other than drawing the screen, and actually I should use that time to choose the new locations for the spiders. If I ever expand the spider AI, that would be the time to run that code.
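
The shape of that two-way branch can be sketched roughly as follows. This is not the actual FlyGuy II code; DRAW and THINK are hypothetical routines and VDPSTA is the status equate from the earlier post:
GAMELP MOVB @VDPSTA,R0        * Read and clear the VDP status
       ANDI R0,>8000          * Isolate the frame flag, the leftmost bit
       JEQ  NOSYNC            * Clear: no VSYNC since the last read
       BL   @DRAW             * Hypothetical: screen and sprite updates
       JMP  GAMELP
NOSYNC BL   @THINK            * Hypothetical: AI, scoring, anything off-screen
       JMP  GAMELP
The flag stays set until the status register is read, so a slightly late check still sees it, but THINK should do its work in short slices so the screen updates are not delayed too far into the frame.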

I'll be providing a basic vsync-polling game loop framework soon.

Matthew

#45 marc.hull OFFLINE  

marc.hull

    Stargunner

  • 1,113 posts
  • Location:Oklahoma CIty.

Posted Tue May 25, 2010 10:33 PM

You are forgetting about the ability to schedule in real time.....

#46 Tursi OFFLINE  

Tursi

    River Patroller

  • 4,690 posts
  • Location:BUR

Posted Tue May 25, 2010 11:21 PM

You are forgetting about the ability to schedule in real time.....


I don't think so...?

#47 marc.hull OFFLINE  

marc.hull

    Stargunner

  • 1,113 posts
  • Location:Oklahoma CIty.

Posted Wed May 26, 2010 7:40 AM

You are forgetting about the ability to schedule in real time.....


I don't think so...?


Perhaps I did not understand your verbiage so some clarification.....

If your all's stance is that the Console ISR leaves something to be desired, then I agree.

It seemed to me though that there was a consensus forming that the interrupt function was not of value in games. If that is the general opinion then I'd have to disagree. Having some experience in writing games that make use of the VDP interrupt I can say for a fact that it makes life much easier and some tasks actually do-able especially integrated music and sound effects.

The scratch pad argument I can understand to a point. If you have code that needs to run at full speed such as a bitmap scroll then by all means that becomes valuable real estate but AFAIK the ISR only humps about 36 bytes of the 256 available and only 4 of those are non contiguous. So it's an easy task to restore those 2 words and move about your business.

The speed argument depends on your MO. If you are running out of CPU RAM and need to squeeze every bit of juice out of the console then the ISR could cause a little drag. At this point it becomes a benefits vs cost argument. If on the other hand you are running out of the cart space and are constantly having to fetch and store data to VDP or read from GROM then I think your ISR speed argument becomes a matter of stepping over a dollar to save a dime.

I understand we all have our motives and favorite configurations for the way we do things and I am not trying to stir up a mess but since I am already on this soap box....

If a game programmer is interested in writing games with lots of integrated sound and graphics that operate at a reasonable speed then the best course of action is to get out of the cart space and take advantage of the 32K of CPU RAM, not remove a valuable resource like interrupts to save a few Ms.

End of diatribe, feel free to burn me at the stake for cart heresy ;-)

#48 matthew180 OFFLINE  

matthew180

    River Patroller

  • Topic Starter
  • 2,375 posts
  • Location:Castaic, California

Posted Wed May 26, 2010 9:02 AM

Marc,

Your reasons are perfectly fine. I'm simply presenting things "my way" (note the subtitle of the thread) and I'm sure what I do will go against everyone else who has ever coded in assembly language at some point (see my first post.) Our individual coding styles are products of our habits, experiences, and practices, and will be as unique as there are people in the world. However, in my opinion too many books and programmers depend on the ISR and E/A VDP routines like they are some sort of magic, and that's simply not the case. But no matter if you use the ISR or not, you should absolutely know what it is doing and how.

The trade-offs and reasons for using the ISR, or not, will absolutely depend on the situation, project goals, constraints, etc. And yes, plenty of games use the ISR for sprites and sound, Tombstone City and MunchMan are two examples. However, both of those programs also use the bad practice of "delay loops" to adjust timing, and thus would become unplayable on an upgraded machine (32K of 16-bit RAM or a clock accelerated machine for example) which is becoming more important these days.

One thing I want to point out though, the ISR uses more than a few words of scratch pad RAM. The first thing it does is set the GPL workspace, so you lose 32 bytes at the top from >83E0 to >83FF. You also have to respect all the various memory locations checked by the ISR, like the time-out counter, ISR control word, user hook, sound control addresses, any DSR's that might run, etc. There is much more use of the scratch pad than you indicated. The E/A manual gives a decent breakdown, and there is not much left over, especially if you are interfacing with XB (might as well kiss scratch pad goodbye.)

Matthew

#49 marc.hull OFFLINE  

marc.hull

    Stargunner

  • 1,113 posts
  • Location:Oklahoma CIty.

Posted Wed May 26, 2010 9:50 AM


I am getting the feeling we are all on the same page here, Matthew, perhaps just coming at it from different angles, so let me state this plainly for the record...

I agree that in assembly games the console's auto-motion and sound processing routines are not very well done and should most likely be turned off. I believe they work in a VDP read/modify/VDP write fashion, and therefore do much more work than a routine that works on RAM and writes to VDP all at once.


I completely disagree that the VDP interrupt hook should not be used as it is too valuable a tool in a game environment. I think this is the source of the confusion.
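
To make the two points above concrete, here is a rough sketch of turning off the console's auto-motion and sound processing while still using the interrupt hook. It assumes the ISR control byte lives at >83C2 with >40 disabling sprite motion and >20 disabling the sound list, and that the console ISR calls the hook at >83C4 with BL while on the GPL workspace. Those details are as I understand them from the E/A manual and should be verified. MYISR and FRAMES are hypothetical names, and FRAMES must sit in RAM (32K expansion or scratch pad), not cart ROM.

ISRCTL EQU  >83C2             * ISR control byte (assumed bit layout)
USRISR EQU  >83C4             * User ISR hook address

       LIMI 0                 * Keep the ISR out while changing its inputs
       LI   R0,>6000          * >40 = no auto-motion, >20 = no sound list
       MOVB R0,@ISRCTL        * MOVB stores the high byte of R0
       LI   R0,MYISR
       MOV  R0,@USRISR        * Console ISR should now call MYISR each frame

MAIN   LIMI 2                 * Open a window for a pending interrupt
       LIMI 0                 * Close it again before touching the VDP
*      ... game logic goes here ...
       JMP  MAIN

MYISR  INC  @FRAMES           * Runs on the GPL workspace, so keep it short
       RT                     * R11 holds the return into the console ISR

FRAMES DATA 0                 * Frame counter, must be in RAM

The hook itself costs one word at >83C4. The rest of the scratch pad cost comes from whatever the console ISR still touches on its own.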

Additionally, I am fairly certain that the interrupt routine only uses the high 32 bytes for the ISR workspace, plus 2 words for the time-out counter and one other location that I can't recall right now. I don't have my explorer manual and can't remember exactly how big my scratch pad code was, so I can't be certain. Most of the others are part and parcel of the high 32 bytes for the ISR workspace. I am fairly certain of this because a game I am writing, which uses most of the scratch pad for a scroll routine, only requires mending two words after the ISR. As far as DSRs and XB go, that is most likely a moot point in assembly gaming.

I think we are all more or less of the same mind here, perhaps not getting our points across very well. I only jumped in because the general tone seemed to be that it is unwise to use the VDP interrupt at all because of the speed and memory impact. If I misread (which is looking like what I did) then I apologize. If I didn't misread then perhaps you should weigh the facts again because you would be throwing out a valuable tool.

Either way, I agree everyone has their own style and we can all coexist (except for Kentucky hill-billies ;-) At any rate, I have enjoyed your write-ups and even stole your VWTR routine for my own use.

Keep up the good work!

Marc

#50 sometimes99er OFFLINE  

sometimes99er

    River Patroller

  • 3,852 posts
  • Location:Denmark

Posted Wed May 26, 2010 11:17 AM

Instead of using the ISR like this

LIMI 2
LIMI 0

You could simply check the VDP status register yourself.

Both methods have the ability to schedule in real time (sync with the frame update).
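
A minimal sketch of the polling approach, assuming interrupts stay disabled for the whole program so the console ISR never reads the status register first. VDPSTA is the status address from earlier in the thread. The VSYNC and MAIN labels are just for illustration.

VDPSTA EQU  >8802             * VDP status (read only)

       LIMI 0                 * ISR must stay off or it clears the flag first
MAIN   BL   @VSYNC            * Block until the VDP signals a new frame
*      ... once-per-frame work goes here ...
       JMP  MAIN

VSYNC  MOVB @VDPSTA,R0        * Reading status also clears the frame flag
       ANDI R0,>8000          * Frame flag is the MSB of the status byte
       JEQ  VSYNC             * Not set yet, keep polling
       RT

Note that the same read also clears the collision and fifth-sprite bits, so copy R0 somewhere before the ANDI if the game needs them. A read that lands right as the flag is being set can reportedly miss a frame, so treat this as a starting point rather than a guaranteed 60 Hz tick.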

;-)



