donlebeau
  1. Well, I tried to set things up so that they were self-throttling, and I did that a few different ways. The first way was how I handled the countdown counters. They were decremented during the video interrupt, and if one reached zero a request would be put into a queue. Later on, in the background, the request would be pulled out of the queue and processed. Only after that would the countdown counter be reset so that the video interrupt could start counting it down for the next time. A few interrupts could go by while the request was being processed and no harm would be done. The second way was how I managed the queues themselves. In the game, updating the video graphics was always the top priority, because if that slows down the player notices it immediately. So I always processed the draw and font rotation stuff more often than anything else. A simplified version of my background loop went something like this:
- Process all the Draw requests
- Process some Think requests
- Process all the Font Rotation requests
- Process all the Draw requests
- Process half of the Create Object requests
- Process up to three Sound requests
- Process all the Font Rotation requests
- Process all the Draw requests
- Process up to three Sound requests
- Process all the Font Rotation requests
- Process all the Draw requests
- Process all the Create Object requests
- Process all the Font Rotation requests
- Process all the Draw requests
- Process up to three Sound requests
- Process all the Font Rotation requests
- Process all the Draw requests
- Back to the top
My actual loop was around 100 steps long and I spaced the workload out over time. I had calculated how long each operation would take, so I knew how many operations of each type I could do every second. I used that information to average out the workload over time. At the end of each loop, I would evaluate the workload in each queue and might do some special things to catch up.
Like if the Sound queue was filling up, I might process four at a time next time through the loop. Some operations, like rotating the fonts and making the sounds, took almost no time at all, so I could safely do a lot of them at once. Drawing took a lot of time and was the most time critical, so I serviced the Draw queue as often as possible.

You'll notice that I don't run the Think modules very often. In the 100 step loop I only serviced them four times. Thinking takes time, but you really don't have to do it very often. A ship can decide it needs to get closer and then just check its position once a second until it gets close enough to decide to do something else. Many of the decisions are trivial and don't take any time to do: "Am I out of ammo? Start dodging." The Think Modules were divided up into chunks, and any time I got a request in the Think queue for a ship I ran one chunk for that ship. Most of these chunks took very little time to run, but a few of them - like the ones that needed to check Line of Sight - took a long time. So I created the concept of Big Thinks and Little Thinks. Big Thinks took a long time and were more strategic. Little Thinks took almost no time and were more tactical or logistical. For example, a Big Think would decide that it can see the player and wants to fire at him. Then a number of Little Thinks would run to make the ship fire until it ran out of ammo or it was time for another Big Think to check the LoS again. Notice that when I processed the Think queue, I processed "some" requests. What I actually did was keep processing the Think requests until either the Think queue was empty or I had processed a Big Think. As soon as I processed a Big Think, I knew I had used up a lot of time, so I saved the rest of the work for later.
That spaced out the workload in the Think queue nicely, and it also had an interesting byproduct: Because I always checked LoS by using a Big Think before a ship started firing, it automatically spaced out the firing times so that I wasn't creating a lot of new objects at the same time. I enhanced that effect by only processing half of the Create Object queue right after I ran the Think Modules. If a lot of fire requests were made, I would space them out evenly over the next few video interrupts. I used to have problems with the game "pulsing" before I started doing it that way. It had a lot of slowdowns when the work would pile up.

During the video interrupt, I also controlled the number of requests I put into each queue during a single interrupt. Maybe on a certain interrupt ten ships are scheduled to think. After I decrement the countdown counters and make the requests for the first five, I stop checking the counters and save the rest for later. "Later" will be two interrupts from now because I alternate even and odd ships every interrupt. When I first started doing that, I had problems with some ships always getting delayed and falling behind, so I changed the direction that I process the ships in from low to high and then high to low. If I had eight ship slots, here's how they would be processed:
Interrupt 1: 1,3,5,7
Interrupt 2: 2,4,6,8
Interrupt 3: 7,5,3,1
Interrupt 4: 8,6,4,2
Since the AI was responsible for firing bullets, when it slowed down because of load, the ships would shoot less often, and in a few seconds the number of objects on the screen would drop and everything would speed up again. What was nice about this is that the player never really notices that it's happening because there's so much stuff flying around on the screen. To complement this, all the ships checked how many objects were on the screen, and if it was starting to get full they would maneuver instead of firing.
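The "process some Think requests" rule above - keep pulling work until the queue is empty or one Big Think has been processed - can be sketched in C like this. This is an illustrative reconstruction, not the original 6502 code: the queue size, the function names, and the even-ID-means-Big-Think rule are all made up for the example.

```c
/* Sketch of "process some Think requests": pull requests until the queue
   is empty or one expensive Big Think has been run, then stop. */

#define QSIZE 256                     /* power of two: byte index self-wraps */
static unsigned char think_q[QSIZE];
static unsigned char think_put = 0, think_get = 0;

/* Hypothetical rule for the example: even ship IDs need a slow
   Line-of-Sight check, making them Big Thinks. */
static int is_big_think(unsigned char ship_id)
{
    return (ship_id % 2) == 0;
}

void queue_think(unsigned char ship_id)
{
    think_q[think_put++] = ship_id;   /* unsigned char wraps at 256 */
}

/* Returns how many requests were processed this pass. */
int process_some_thinks(void)
{
    int processed = 0;
    while (think_get != think_put) {
        unsigned char ship_id = think_q[think_get++];
        processed++;
        if (is_big_think(ship_id))
            break;                    /* used up a lot of time: stop here */
    }
    return processed;
}
```

The key point is the early `break`: one slow request ends the pass, so the expensive work is automatically spread across several trips through the background loop.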
  2. Hi everybody! Sorry it's been so long. Real Life has been crazy and I haven't had time to post. I'm stuck on an airplane right now so I have some time.

Ring Buffers in Gauntlet: Now that we've covered ring buffers in general, let's look at how I used them to keep everything scheduled in Gauntlet. The timing of everything in the game is driven off of the video refresh interrupt, which occurs sixty times a second. During each interrupt, I decide if there's any "work" to do and request that work to be done later on by stuffing a request into a ring buffer. There is one ring buffer for each type of "work" that the game can do - drawing on the screen, making a sound, rotating a font, running a think module, or creating a new object. That makes five ring buffers. (It's been a while, but I'm pretty sure there's five. Age.) Each of these queues holds 256 bytes, and the Put and Get indexes are located in Page Zero. The data stored in each queue is always an object (ship) index, and by putting that object index into a queue, I'm requesting to do that queue's function on that object. For example, if I want to draw ship number 7 on the screen, I stuff a 7 into the Draw queue. Some time later on, the background routine will check the Draw queue, see the request, and draw the ship on the screen. For the sake of discussion, let's refer to each object as a "ship" and its index number as its "ID". The player ship is ID 0 and is usually handled separately rather than through the queues. In many cases, there are separate routines to handle the player ship's functions and the other ships' functions. Each ship has a countdown counter in it for each of the queues. During the video interrupt, that counter is decremented, and if it reaches zero, that ship's ID is stuffed into the appropriate queue. After the queue processes the request, it puts a count into the ship's countdown timer so that it will be scheduled for processing again later on.
I update every object 30 times a second, so I alternate processing even and odd ship IDs every interrupt. This allows each ship to stay on the screen for at least one full video interrupt before it is redrawn. Otherwise, the ships tend to flash because they're being drawn while the screen is updating. One of the advantages of ring buffers is that they can accept data from multiple sources and that data can be removed by multiple routines. This feature is used in Gauntlet to keep everything synchronized. For example, let's look at the Draw queue.

First, let's talk a little bit about the drawing routine. The draw operation always consists of two parts. First the ship is erased from its old position on the screen, then it is drawn in its new position. While the ship is being erased, the pixels are being compared to the last font that was drawn at that position. If any pixels are missing or a different color, it means that something overwrote that pixel and a collision has been detected. Likewise, when the ship is drawn at the new position, each pixel is checked before drawing on it to ensure that it is black; otherwise a collision is detected. So one of the main functions of the draw routine is to detect collisions. When a ship moves, it is erased from one position and drawn in a different one. Many times we want to check for a collision even when a ship is not supposed to move. In that case, the ship is erased and redrawn in the same location and the results of the collision are checked. So even a ship that never moves, like a turret, is constantly being redrawn in order to check for collisions. In short, the only difference between a moving ship and a stationary one is the coordinates of the old and new locations given to the draw routine. The Draw queue takes data from three sources and is processed by one Draw routine. The three sources are:
1. The video interrupt.
2. The Font Handler routine.
3. The Think Modules.
During the video interrupt, the Movement Ring Counters (remember them?) are updated, and if one of them rolls over, the ship moves. The coordinates of the new location are saved in the ship's data and its index is put into the Draw queue. When the Draw routine is executed, the ship will move because its new and old coordinates are different. This draw request comes from interrupt space. Also during the video interrupt, the Font Rotation counter is decremented. If it reaches zero, a request will be put into the Font Handler's queue so that the fonts can be updated. When the Font Handler processes the request, it may determine that it's time to either refresh the font (to check for collisions) or to display a new font (for animation). Either case will cause it to put a request into the Draw queue. This draw request comes from background space. The last place that requests can come from is the Think Modules. Just like the Font Handler, there is a Think counter that is decremented during the video interrupt. If it hits zero, a request is put into the Think queue for later processing. When the Think Module runs, there are a number of situations that can cause the ship to be redrawn. For instance, if it "explodes", a font list for an explosion is loaded into the ship's data and a Draw request is made. This draw request also comes from background space. As you can see, the Draw ring buffer allows me to take asynchronous requests from both the foreground and background and handle them sequentially.

Basically, the Gauntlet program consists of two parts: The video interrupt sets the timing for everything and puts requests into the queues. The background routine sits in a loop monitoring the queues and calling the handlers whenever there's any work to do. The background task also checks the workload across the queues and does some load leveling, but outside of that it's pretty dumb. All the real work is done by the five queue handlers.
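The countdown-counter scheme described above - the interrupt decrements a per-ship counter, a zero stuffs the ship's ID into a queue, and the handler reloads the counter only after doing the work - can be sketched in C. This is my reconstruction for one queue only; the names, ship count, and the reload value of 2 (for roughly 30 updates a second at 60 interrupts a second) are illustrative, not from the game.

```c
/* One queue's worth of the interrupt-driven scheduling: the interrupt
   counts down, the background does the work and re-arms the counter. */

#define NSHIPS 16
#define QSIZE  256                    /* one-byte indexes self-wrap */

static unsigned char draw_q[QSIZE];
static unsigned char draw_put = 0, draw_get = 0;

/* 0 means "request outstanding, waiting to be processed". */
static unsigned char draw_countdown[NSHIPS];

/* Called sixty times a second by the video interrupt. */
void video_interrupt(void)
{
    for (int id = 0; id < NSHIPS; id++) {
        if (draw_countdown[id] == 0)
            continue;                 /* already queued; don't double-request */
        if (--draw_countdown[id] == 0)
            draw_q[draw_put++] = (unsigned char)id;  /* request the work */
    }
}

/* Called from the background loop: "process all the Draw requests". */
int process_draw_queue(void)
{
    int processed = 0;
    while (draw_get != draw_put) {
        unsigned char id = draw_q[draw_get++];
        /* ...erase and redraw ship `id` here... */
        draw_countdown[id] = 2;       /* re-arm only after processing */
        processed++;
    }
    return processed;
}
```

Because the counter is re-armed only after the request is serviced, a few interrupts can go by while the background catches up and no duplicate requests pile up - the self-throttling behavior described in the posts.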
  3. I think Gauntlet runs on the later machines, but I'm not sure. Never heard that it didn't. By the time those machines came out I had already moved on to other things. I'm not actually "turning off" the OS. I just stopped using the parts of it that I didn't need. Like for the video, I used Atari's video interrupt code to drive the video controller and daisy-chained it with my own routine to handle the Gauntlet stuff, so I didn't touch the page zero locations that Atari's video interrupt code needed. But I didn't use any of Atari's draw or graphics routines, so I could use all those page zero locations. Some pieces of the OS I never used, like the disk utilities, so I could freely use all the page zero locations associated with them. At first I was pretty timid about using any of the "reserved" Atari locations. It just took me a while to realize that I could use most of them as long as I didn't need the functions I broke. The only stuff I left alone were the locations needed by the video interrupt and the keyboard. I drove the sound chip and read the joystick ports directly. Gauntlet doesn't use anything else - it takes the joystick and keyboard as input and uses the video and sound as output, so everything else was up for grabs. I wrote the game on an 800 and tested it on both the 400 and 800. When the 1200 came out, a friend tested it on that. I don't know what other machines it runs on. As long as Atari didn't move around the data in the Reserved area, it should be ok.
  4. Sorry, haven't had a lot of time to post lately between work and moving. Here's some random (and hopefully useful) stuff I wrote just before I got into the ring buffer discussion:

Page Zero Memory: This was the single biggest challenge that I faced during development. It was a constant problem and it didn't go away until I took some pretty drastic measures. The 6502 has a special area of memory called "Page Zero" memory. It's a block of 256 bytes that can be addressed with a single byte address instead of the normal two bytes. Page zero instructions are almost twice as fast as their normal counterparts, so you try to put all of your time critical variables in page zero. Of course, Atari also puts all their time critical variables there, and they leave you about 50 bytes to use for your own programs. Sheesh, it's like they think they own the machine or something! Of course I ran out of page zero space very early in the project - in the second week. It was a constant challenge to find more page zero space or to find usable alternatives. At first, I searched for variables that were outside of the "safe" zone but weren't actually used by the OS. I wrote a utility that would constantly monitor the page zero memory block and tell me what locations changed. I'd run this in the background while I ran all kinds of applications and see which of the "reserved" locations Atari actually used. Then I had my utility constantly change those locations while I ran all the applications again to see if anything broke. That netted me another 50 or so bytes. Initially I "played nice", but I still needed more room. So I started looking at the functions I didn't need to run Gauntlet and then stole all their page zero locations. I had written my own graphics and sound systems which drove the hardware directly, so those went first. Then the floppy disk system and a few others that weren't used after the game started running. That's where I was when the shareware version shipped.
After that I got pretty desperate. I completely rewrote my draw routines to be faster AND use less page zero memory. Took about two months of head banging, but I ended up with something that was really slick. It still wasn't enough, so I completely turned to the dark side. I realized that Atari had no business using ANY of MY precious page zero space! It was my machine now and I didn't need any of their stupid software anyways! So I gutted all the Atari code I could and wrote my own to drive all the hardware directly. I think that the only piece of Atari software that I ended up keeping was the keyboard driver, because it just wasn't worth rewriting. Outside of that, there is no Atari code active when Gauntlet is running, and I use all of Page Zero except for the 50 or so bytes that the hardware needs.

Random stuff: When I started working on the game, I bought the Atari Assembly Language Cartridge and immediately overflowed the symbol table. So I had to buy their Professional Assembler on disk for $90. Found out that the disk was copy protected, so my first project was to defeat the copy protection using a hex editor, since I didn't want to risk using my $90 disk and ruining it. Good thing too, because I wore out about five assembler disks over the course of the project. Ah, those were the days! When objects are cloaked, they are drawn on the screen using all black pixels. Collision detection still works because they erase pixels on the object they collide with and they detect the non-black pixels of other objects. A lot of the ships have black pixels as part of their fonts so they look good and the collision detection still works. I didn't figure out that trick until pretty late in development, so the earlier ships I designed are "uglier" than the later ones. Unfortunately, cloaked objects can't damage each other, but it's not really a problem since you can't see them "cheating".
Once in a while you'll see two of them uncloak on top of each other and promptly explode, but that's the price you pay for being sneaky.
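The page-zero monitoring utility described above is essentially a snapshot-and-diff loop. Here's a guess at the idea, sketched in C for readability rather than 6502 assembly; the function name and the way changes are reported are mine, not from the original utility.

```c
/* Snapshot-and-diff sketch of a memory change monitor: compare the live
   256-byte page against a saved snapshot, report the offsets that
   changed, and update the snapshot so each change is reported once. */

#define PAGE_SIZE 256

/* Returns the number of changed locations; their offsets are written
   into `changed` (which must hold at least PAGE_SIZE entries). */
int diff_page(const unsigned char *page, unsigned char *snapshot,
              unsigned char *changed)
{
    int count = 0;
    for (int i = 0; i < PAGE_SIZE; i++) {
        if (page[i] != snapshot[i]) {
            changed[count++] = (unsigned char)i;
            snapshot[i] = page[i];    /* re-arm for the next pass */
        }
    }
    return count;
}
```

Run repeatedly in the background while exercising the OS, a loop like this shows which "reserved" locations are actually touched - and the second trick from the post (deliberately corrupting suspect locations to see what breaks) is just the same loop run in reverse.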
  5. Ring Buffers step 4 - Make it work reliably and fast: Whew! We're finally at the fun section! I'd like to cover a couple of general tips first and then focus more on the 6502 specific stuff.

- Make the ring buffer functions "inline" or use macros. In C you can declare your functions as "inline" so that instead of them being called, the code is inserted directly in line with the surrounding code. This saves all the overhead of calling and returning from the function and pushing and popping the parameters off the stack. In assembly language you can make a macro instead of a routine to handle your ring buffer. You save on the function overhead, and macros ensure that all the places that use the ring buffer have the same code. You can just put ring buffer Put and Get instructions directly in your code without using macros, but that's dangerous. Say you have five places that put stuff into the buffer and you didn't use macros. If you decide to add a feature like a counter and don't catch one of those places, it's going to take you a long time to figure out what's going on.

- Know exactly what type of functionality you'll need and only add those essential features. I can't stress this one enough! For a ring buffer, every feature adds overhead every time a piece of data enters and leaves the buffer, so it piles up fast. Almost all the optimization you do will be done at design time. That's when you determine what features you absolutely need and how time critical everything must be. The first question you have to ask yourself is "How big should the buffer be?". This is the most important question for a number of reasons, and depending on your application it will be the easiest or hardest to answer. For example:
1. If you have a set number of items that will be stored in your buffer, the answer is easy - make the buffer larger than the number of items you need to hold.
2. If there is an unlimited number of items - like from a communication stream - you must calculate how many items could possibly be in the buffer at any one time based on the rate that they arrive and the rate that you can take them out of the buffer. This can get tricky because you have to account for things like flow control, or if there's no flow control, your math had better be correct or you're going to have problems.
3. If there is going to be flow control, you must determine how fast the flow control reacts and adjust your high and low water marks to accommodate that behavior. For example, if it takes 10 items before the flow is shut off, make sure your high water mark gives you space for them.
4. Know the characteristics of your input and output data streams. For instance, your input might receive blocks of 256 items. If you implement flow control on that input, you might have to leave room for a whole block of data above your high water mark. Or you might need to size your buffer as an even multiple of 256 to handle the blocks smoothly.
5. Beware of "deathgrips" when designing your flow control. Suppose your output takes commands of 20 characters at a time - it waits until there's 20 characters in the buffer and then takes them all. You implement flow control on this buffer to throttle the input and decide to set the high water mark at fifteen. You are dead. How? Fifteen characters come in and you hit the high water mark, so you tell the input to stop. Now the input stops - forever. Your output is looking at the buffer count and is waiting for 20 characters. It has only fifteen, so it won't take any characters out of the buffer. You will never reach your low water mark and turn the input back on until you start taking items out of the buffer. Hence you're in what's called a Deadly Embrace or a DeathGrip. This is an obvious example of this type of problem, but I've seen some very subtle variations of it, so it's something to look out for.
6.
No amount of clever design can account for taking data out of the buffer slower than you put it in. If you can't throttle the input, make sure your process consumes data faster than it arrives. This may seem obvious, but I've seen too many people try to handle a throughput problem by first making bigger buffers, and then after that doesn't work, trying some elaborate double buffering scheme or some other solution that only postpones the inevitable.

Magic Buffer Sizes: The size of the buffer has the biggest impact on ring buffer performance because some buffer sizes allow you to use tricks to eliminate a lot of buffer overhead. Rule of thumb: After determining your minimum buffer size requirement, you want to increase it to the next practical magic buffer size, if possible. What's a "Magic Buffer Size"? It's a buffer size that lets you take advantage of the properties of binary to eliminate or optimize the wrapping of your Put and Get indexes. Let's go back to our example of PutData() with a buffer size of 100:

```c
#define TESTBUFSIZE 100

char TestBuf[TESTBUFSIZE];
int TestPut = 0;

void PutData(char value)
{
    TestBuf[TestPut++] = value;
    TestPut = TestPut % TESTBUFSIZE;
}
```

We put the character in the buffer and then increment and wrap the TestPut index. Let's increase the buffer size to the next "Magic Buffer Size" of 256 and see what happens:

```c
#define TESTBUFSIZE 256

char TestBuf[TESTBUFSIZE];
int TestPut = 0;

void PutData(char value)
{
    TestBuf[TestPut++] = value;
    TestPut = TestPut % TESTBUFSIZE;
}
```

Look at that! I'll give you a moment to regain your composure because of the awesomeness of this example... What? It doesn't look that much different? But I used the "Magic Buffer Size" and now it's blindingly fast... Oh, it isn't? Oh, I'm sorry. It's how you use the Magic Buffer Size that makes it fast, and I haven't done anything yet. First, what makes the Magic Buffer Size of 256 so great? It happens to fall on a byte boundary, or in other words, it's eight bits wide.
What's also eight bits wide? A char (or byte). A byte holds a value of 0 to 255. Here's what's cool: If you have a byte with 255 in it and you add 1 to it, what does it do? It wraps around to 0! That's because a byte holds eight bits and it takes nine bits to "say" 256 in binary (100000000b), and the ninth bit just falls off the end. So bytes can be used as self wrapping indexes. That means that you don't have to wrap the index for a 256 byte buffer if you use a byte to hold the index. Let's see how the code looks now:

```c
#define TESTBUFSIZE 256

char TestBuf[TESTBUFSIZE];
unsigned char TestPut = 0;

void PutData(char value)
{
    TestBuf[TestPut++] = value;   /* TestPut wraps from 255 back to 0 */
}
```

We've changed the TestPut index to an unsigned char (it has to be unsigned - a plain char may be signed, which would break the wrap) and were then able to completely eliminate the wrapping line. You can do this for both byte (256) and word (65536) sized buffers, although word sized buffers are used less often because they're pretty big. The rule of thumb is that if you can use either a 256 or a 65536 sized buffer, use it and then make your indexes either bytes or words so that they're self wrapping. That's the way to make the fastest ring buffers. What can you do if you can't use either of these sizes? The next best thing is to round up to the next bit boundary and change the way that you wrap your Put and Get indexes. For example, say that your minimum buffer size is 100 and you don't have the space to allocate 256 bytes for this buffer. The next higher bit boundary from 100 is 128 (whose indexes fit in seven bits). Here's our first pass at this new approach:

```c
#define TESTBUFSIZE 128

char TestBuf[TESTBUFSIZE];
unsigned char TestPut = 0;

void PutData(char value)
{
    TestBuf[TestPut++] = value;
    TestPut = TestPut % TESTBUFSIZE;
}
```

Ok, the only things that we changed were the TESTBUFSIZE, and we made TestPut an unsigned char because 128 fits easily into a byte. To optimize, we can change the way that we wrap the TestPut index. Right now we use the modulo operator, but we're going to change it to masking bits.
(This optimization is really faster and cleaner in 6502 assembly language, but you still pick up some speed in C, although the code doesn't look any "faster".) What we want to do is emulate the "self-wrapping" behavior of bytes. They work because when you increment a number, if it gets too big, the highest bit "falls off the end" and the number wraps back to zero. We can do the same thing by masking off the highest unused bits after we increment the value. Here's the code:

```c
#define TESTBUFSIZE  128
#define TESTWRAPMASK 0x7F

char TestBuf[TESTBUFSIZE];
unsigned char TestPut = 0;

void PutData(char value)
{
    TestBuf[TestPut++] = value;
    TestPut = TestPut & TESTWRAPMASK;
}
```

The first thing that we did was define TESTWRAPMASK to be the lower seven bits of the byte (7F in hex or 01111111 in binary). Seven bits gives us a value range of 0-127. Then instead of wrapping the TestPut index using the modulo operator, we AND it with TESTWRAPMASK to lop off the unused bit. (Again, this simplifies the code in assembly language and speeds it up, although the C code doesn't look much different.) So your first choice is to use the byte or word boundary to make a "self-wrapping" index. If you can't do that, see if you can make your buffer size on the next highest bit boundary and use masking to wrap your indexes. Luckily the bit boundaries span a bunch of useful sizes, so you can do this a lot. For instance, if you need a 1000 byte buffer, you can size it as 1024 bytes, and the performance gain more than makes up for "wasting" the extra 24 bytes. There's no way you can get the same performance gain by "spending" those 24 bytes elsewhere, either in memory or code.

- One more tip: on the 6502, try to use Page Zero for Put and Get indexes and other buffer controls. This is going to get you a big performance gain. If you have a small buffer and can fit it into Page Zero, that's even better, but you usually don't have the space to do that.
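The posts above only ever show the Put side. For completeness, here's what the matching Get side and a count check might look like using the same mask trick. This is my sketch, not code from the original posts; GetData and DataAvailable are names I made up.

```c
/* Mask-wrapped ring buffer with both ends: Put, Get, and a count that
   works across the wrap because both indexes stay in 0-127. */

#define TESTBUFSIZE  128
#define TESTWRAPMASK 0x7F             /* lower seven bits */

char TestBuf[TESTBUFSIZE];
unsigned char TestPut = 0;
unsigned char TestGet = 0;

void PutData(char value)
{
    TestBuf[TestPut++] = value;
    TestPut = TestPut & TESTWRAPMASK; /* lop off the unused bit */
}

char GetData(void)
{
    char value = TestBuf[TestGet++];
    TestGet = TestGet & TESTWRAPMASK;
    return value;
}

int DataAvailable(void)
{
    /* The subtraction may go negative, but masking it brings the
       result back to the correct modulo-128 difference. */
    return (TestPut - TestGet) & TESTWRAPMASK;
}
```

Note that with this scheme the buffer can hold at most 127 items, not 128: if Put ever catches up to Get, a full buffer becomes indistinguishable from an empty one.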
Addressing Gymnastics: For those rare people who can modify the hardware for performance purposes - you can make the hardware do the index wrapping for you. Here's an example from a typesetter I worked on. It's a good example of how you can "think outside the box" and use some very simple concepts to get some pretty impressive results. The typesetter had an embedded 68000 controller, and luckily I had a good rapport with the hardware designer, so we worked together to get the system as fast as possible. A page consisted of 3mb of data. These pages came across as packets of 256 bytes. I could process this data faster than it came in, but I didn't have a lot of spare CPU overhead. We decided that it would be really cool to offload the CPU by DMAing the data directly into the buffer - which uses no CPU cycles. (DMA = Direct Memory Access. In effect, you tell the DMA controller to move a block of data from one location to another and it uses unused bus cycles to move the data without affecting the CPU.) It was a great idea, but here's the problem: The DMA controller just moves the entire block of data, and it has no concept of Put indexes or buffer wrapping or any of that stuff. It's very fast and very dumb at the same time. If you happened to tell it to move the block of data too close to the end of the buffer, you were hosed, because it would smear whatever data lived right after your buffer. So here's what we did. We made a special 4mb area in the address space for our ring buffer. We tied the address lines together so that we had four consecutive 1mb blocks of address space that all addressed the same 1mb block of physical memory. In effect, I had one 1mb block of memory mirrored into four consecutive 1mb address ranges. In other words, if I started filling up the memory with 4mb, it would overwrite the same 1mb area four times. Usually you fix problems like this, but in this case it was exactly what we wanted.
I allocated my ring buffer as 4mb long and my indexes were words, but I never actually had to wrap them. Here's how it worked: Suppose I told the DMA controller to move a block of data near the end of the "first" 1mb of the buffer. It would move the block of data into the buffer and some of the data would overflow the buffer and go into the "second" 1mb block. But since the address lines were tied together, it actually "overflowed" back into the beginning of the buffer. In effect, the address lines would do the address wrapping for me and I could "wrap" the index of the DMA controller without actually having access to it. That meant that I could blast a block of data into the buffer and it would self wrap if it went off the end. I could also use another DMA channel to pull the data out of the buffer, which I happened to take in 512 byte blocks. Because of the mirrored address space, this operation was also "self wrapping". Why did I choose to use four 1mb blocks for the buffer? Because a page consisted of 3mb of data, and because I could keep up, I could reset my Put and Get pointers to zero, stream in the entire page and process it, and then reset them back to zero for the next page. That's the only index wrapping that I did myself - once per page instead of six million times per page. I used a buffer count to tell when there was enough data in the buffer to DMA out my 512 byte block. Again, I didn't have to update this count for every byte; I just added 256 to it every time I DMAed in a block and subtracted 512 from it every time I pulled one out. The initial design calculations said we would use about 30% of the CPU time just moving the data from the communications port to the formatting area. After using this technique, we used 0% of the CPU and the data arrived almost instantly. Anyways, that's about it for ring buffers. Next we'll see how they were used in Gauntlet and why they were central to the design.
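The mirrored-address trick above can be mimicked in software: with a power-of-two buffer, masking every store address makes a block copy "wrap" automatically, just as the tied address lines wrapped the DMA writes. This is my scaled-down software analogue of the idea, not the typesetter code; all sizes and names are illustrative.

```c
/* Software analogue of hardware address mirroring: every store goes
   through a mask, so a block copy near the end of the buffer spills
   back to the beginning instead of smearing adjacent memory. */

#define PHYS_SIZE 1024                /* the one "physical" block     */
#define PHYS_MASK (PHYS_SIZE - 1)     /* works because 1024 = 2^10    */

static unsigned char ring[PHYS_SIZE];

/* Blast `len` bytes in starting at logical offset `pos`; the mask
   makes the copy self-wrapping, so the caller never checks the end. */
void blast_in(unsigned pos, const unsigned char *src, unsigned len)
{
    for (unsigned i = 0; i < len; i++)
        ring[(pos + i) & PHYS_MASK] = src[i];
}
```

The difference from the hardware version is cost: here the mask is applied by the CPU on every byte, whereas the tied address lines did it for free on every bus cycle, which is why the typesetter approach used 0% of the CPU.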
  6. Interesting approach, Heaven. The thing I always liked about doing the state machine with the tests and branches is that it's very readable and easy to follow. It's also extremely fast as long as you don't have a huge number of states. If the number of states gets too big, then I would go to either an array of vectors where I keep the state as an index into the array, or a "Sleep" system with a task manager like I did in Gauntlet. As far as copying the monster variables in and back out, in Gauntlet I found that most of the time it was faster to just work on them in place instead of moving them. To move each variable in and out, you have two regular memory calls and two page zero calls, so it takes roughly the same time as doing three regular memory calls just to load up and save a variable. That means that if you don't access that variable more than three times during your routine, it's actually slower to move it into page zero first than to just access it directly. Add to that the fact that you're moving in all 32 bytes each time and probably not using half of them during each call, and that extra overhead really piles up. In Gauntlet I used buffers for each variable, like XPOS[100] to store the X position for my 100 ships. Then to access it I would just load the ship number into the X register and use LDA XPOS,X to get at the info directly. It was faster that way, and since I wasn't operating on copies of the data, I could have the interrupt and background routines operating on different parts of the ship's data structure at the same time. Some variables were owned by the interrupt and others by the background routines, and any routine could get at the data in the same ship without checking if the object was locked by another process or worrying that the data would be overwritten because another process was working on its own copy and was just about to write it back. It also allowed me to use the data index number as a "ship ID" that I could then pass around everywhere.
Once a routine got the "Ship ID", it had access to all its data directly. I then used the evil ring buffers to queue these Ship IDs between processes, so for example when the video interrupt determined that it was time to redraw a ship, it just stuffed the Ship ID into the Draw queue and it was done. Later on the Draw module would grab the Ship ID from its queue and would do the work. Again, that's what I did for Gauntlet. That doesn't make the approach "better", but it suited Gauntlet because I had over 50 variables per ship and had to be able to access them from the video interrupt and from the background simultaneously.
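For the curious, here's roughly what the buffers-per-variable idea plus a ship-ID queue looks like sketched in C (the real thing was 6502 assembly using LDA XPOS,X, and the names, sizes, and the stand-in "draw" work here are made up for illustration):

```c
#define MAX_SHIPS 100
#define DRAWQSIZE 256

/* One array per variable, so the interrupt and the background code can
 * each own different variables of the same ship and work in place. */
unsigned char XPOS[MAX_SHIPS];
unsigned char YPOS[MAX_SHIPS];

/* The queue passes ship IDs around, never the ship data itself. */
unsigned char DrawQ[DRAWQSIZE];
unsigned char DrawPut = 0;     /* 8-bit indexes self-wrap at 256 */
unsigned char DrawGet = 0;

void RequestDraw(unsigned char ship)   /* interrupt side: queue the ID */
{
    DrawQ[DrawPut++] = ship;
}

int DrawPending(void)
{
    return DrawPut != DrawGet;
}

void ProcessDraws(void)                /* background side: work in place */
{
    while (DrawPending()) {
        unsigned char ship = DrawQ[DrawGet++];
        XPOS[ship] += 1;               /* stand-in for the real draw work */
    }
}
```

The DrawPut++ on an unsigned char is the "size the buffer on a byte boundary" trick from another post in this thread: the index wraps from 255 back to 0 all by itself.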
7. Hi potatohead, Glad you're getting something out of it. It's actually fun writing this stuff up and kinda neat to see it laid out in an organized fashion instead of just floating around in my head. I was going to cover state machines somewhat when I discussed the AI and think modules. The method I used there was basically a task manager approach and is one way to implement a state machine. It worked great for the 6502 and I've used it elsewhere as well.

Maybe ten years ago I ran across another method to implement a state machine in C that was amazingly simple, fast, and bombproof at the same time. I was writing a camera controller for a videoconference system at the time and saw how it was done in the old code. It was one of those "Man, I've been doing it the hard way all these years!" type of discoveries. It's just one of those simple approaches that you normally wouldn't think of to implement something so potentially complex. Since you asked, I'll cover it after I show how I did the Gauntlet AI. It's one of those "basic toolkit" techniques you find yourself using all over the place, so I'm hoping someone here finds it useful.

doctorrclu: My lawyer said two things (between chuckles). The first was that Atari had more lawyers than I had relatives. The second was that in his experience with the little guy suing a large company like Atari, their first line of defense is to tie things up legally and constantly delay every step in the process by getting extensions to wear you down. He said not to expect to see the inside of a courtroom for at least two years and after spending almost a hundred thousand in legal fees. Two lawyer friends of mine pretty much said the same thing, you've got to have deep pockets and lots of persistence to win a case like this. But you gave me an idea. Next time someone asks where the name Gauntletak came from, I'll say it's Klingon for Gauntlet.
8. Shamus: Actually Atari was trying to cash in on the popularity of Gauntlet with their crappy elf game. Glad you're getting something out of the ring buffer discussion.

I had a ring buffer bug in Gauntlet that drove me crazy for almost two months. With just the right conditions, very odd things would happen. I had accidentally allocated my buffers as 255 elements instead of the 256 that I was using (I had indexes of 0-255 in my head at the time.) This made each buffer "share" its end locations with its neighbors. So stuff I put into the Think queue could end up in the Clear queue and stuff I put into the Draw queue could end up in the Fire queue. This one drove me crazy! The game would be running fine and all of a sudden one of the ships would just vanish, or a bullet would launch a big ship out of itself. It seemed like requests were getting sent to the wrong queues, and it took me a while to realize that the "odd things" always happened to functions in neighboring queues. As soon as I looked at the queues themselves, it was obvious what I had done. And I felt pretty stupid for not checking that when I first saw the problem.

And I feel for you for having to rip out all that clever code. Had to do that too many times. And it hurts more when it's clever stuff you did.

doctorclu: (From the Demo spiel) >> Taknukes.. basically Nukes right? And the mobber was the Robot ship?<< Yeah, by the time I published the registered version I had forgotten what I had called them in the demo blurb. >>Over time you learn that is when they die the fastest << Shush you! >>Found it interesting that flares got points for blowing up the landscape here too, but since you only got a limited number, not as noticable.<< Guess that bug was in the demo version too. Ammo that the player fires isn't supposed to give you points when you fire it. There's a flag that some dummy was supposed to set when he decided to let the player fire flares.
Of course that was one of those 2am "Wouldn't it be cool if you could shoot these" type of ideas that was really easy to throw in and I never went back and checked if I did it right.

TRIDEX: Odd fact: I was reading a book in Gordon Dickson's Dorsai series at the time and it mentioned Tridex and how it was used to clear landing sites and trees. It sounded cool having something that exploded along a plane, so I put it in. It's still my favorite ammo type.

Sorry, I mentioned it before, but Gauntlet is such a memory hog that I can't just let 12k worth of memory sit around. As soon as the game starts I wipe the text area clean and use it for something else. That's why I said in the manual "Any time you want to read the manual portion of this text, just reboot the game and type "M"." You couldn't see the manual after you started playing because I ate it.

Here's answers to your questions:

1) GAUNTLETAK.. what does the last "AK" stand for? I wanted the new name to have the word Gauntlet in it for consistency and I tried to think up something that wasn't too lame. Gauntletak was the least lame of the bunch. I started with "Gauntlet Attack" and shortened it. Tried Gauntattack and that sucked. The "letak" in Gauntletak sounds like "attack" and the "Tak" reminded me of Tactics, so I kept it. But it still makes me gag every time I see it.

2) The level SCREENS were the same, though on the GAUNTLETAK version they were more polished. I think I was able to fit more terrain data in the registered version so I could make them more detailed. Don't remember doing it though...

3) In the demo you get BOLT and whatever weapon you choose at the start. Not like the full version where you get all weapons, and all weapons are UNLIMITED! I always hated running out of ammo, so I changed it. But then you could just spew missiles so I fixed that by creating the jammer ships. That kept the balance.

4) The demo version greets you on the first screen with mines and kinda builds up.
The full version greets you with mines and things you might not see till 4 screens in on the demo version. In the demo version the order of battle is set for every screen. In the full version, it's randomized so you don't know what to expect and it's a lot more replayable. The full version also adds more enemies sooner because I figured anyone buying it would be a fairly good player and I didn't want to bore them.

5) Those jammer ships are not in the registered version. LOL! I hate those. Though brilliant. But you can fire unlimited missiles at them.

Except for the <50 registrations I got: No donations. No friend referrals. No high score entries. It was like I had thrown Gauntlet into a huge black hole. Most of the registrations came in in the first three months and then I got maybe one a month for the next year. After that it dried up completely and I stopped renting the P.O. box. I had moved and once a month I would drive two hours to find it empty. Not fun. I got maybe 50 letters from broke teenagers telling me what a good game it was and I cherished those because it meant that the game wasn't a total waste of time and that some people were really enjoying it.

Actually the time line was:
1. I copyrighted the name Gauntlet six months before releasing it.
2. I released Gauntlet by sending it out to every BBS I could find.
3. One month later, Atari comes out with their Gauntlet arcade game.
4. I call a lawyer and tell him I'd like to sue Atari for copyright infringement.
5. I can still hear him laughing today.
6. Figured I had to change Gauntlet's name so it wouldn't get confused with the arcade game.

I think my personal best was around 32K, but I didn't know about the flare trick. If I did know about the flare trick my score would still have been around 32k because I would have fixed it. About 20K - 30K were the average scores from the Atari Users group I was in. These guys were "playtesters" so they couldn't enter the contest.
The odd thing was I had banked the Gauntlet money in its own account and if someone entered the contest, they would have won and I would have sent it to them. (They would have ended up making more from Gauntlet than I had. That's a pretty odd thought.) I really was hoping Gauntlet would do well and I had set everything up properly for tax purposes. The money was sitting there until I finally shut down the P.O. Box after the letters stopped coming in.
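By the way, here's a tiny C recreation of that 255-vs-256 queue bug from the start of this post. A struct stands in for the back-to-back memory layout the queues had, and the queue names are just for illustration:

```c
struct Queues {
    unsigned char ThinkBuf[255];   /* the bug: meant to be 256       */
    unsigned char ClearBuf[255];   /* sits right behind it in memory */
};

/* Store one request at "ThinkBuf[index]" through a flat view of the
 * memory, the way an 8-bit indexed store saw it on the 6502. */
void ThinkPutAt(struct Queues *q, unsigned char index, unsigned char value)
{
    unsigned char *mem = (unsigned char *)q;  /* raw RAM view */
    mem[index] = value;    /* index 255 lands one byte past ThinkBuf... */
}                          /* ...which is ClearBuf[0], the neighbor queue */
```

With the arrays one byte short, the 8-bit index 255 drops a Think request straight into the Clear queue: exactly the "requests going to the wrong queues" symptom.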
9. Congrats Doctorclu! An amazing effort for sure! Actually, you are. The suicidal ships never made it into the game, only the smart ones did. My post was worded poorly and it sounded like I let a lot of the losers in. I didn't. I was just surprised at the number of ships I did that actually worked. So you earned every point.
  10. Wow, this is pretty cool. Don't remember what my highest score was, but I don't think I could touch you guys! Man, I don't remember half of those ships! I had a bunch of them I was trying out and only the smart ones got in. Didn't realize so many of them made the final cut. A lot of the designs seemed smart, but in actual practice they were only good at killing themselves. Tell you what. To make it interesting, an autographed copy of the title page drawing goes to the winner. Be the first person on your planet to own one. Oops.
11. Great discussion. I actually fall on both sides of this issue and there's no hard and fast rule. Each project is different. I'm very aware of that anytime I type out the "mantra". But it does make people stop and think about when and why to optimize, which is a Good Thing.

David makes some really valid points. It's experience that tells you when and where to optimize. In my case, a lot of that experience amounted to "I'll never do that again!" Sometimes it's hard to explain to the company president that he's going to lose the sale because you made the code too clever and you can't possibly put that new feature in on time. They never seem interested in how clean the code is and how fast it runs. Some people...

I hope nobody gets the impression from my posts that I think that optimization is a bad thing. It isn't. Doing it prematurely, needlessly, or cryptically is a bad thing. I've worked with some programmers who write the "fastest" code first and then spend way too much time rewriting sections of it because they didn't have the whole picture before they worried about throughput. The major optimizations should be done either during design like David said, or after the application is "working" and you can identify the real bottlenecks. Again, there's stuff that you can do along the way, but usually the biggest gains come when you're looking at the overall picture, either while you're designing the system or after you have something to actually measure performance on.

David also has the best line: "Optimizing blindly can be worse than not optimizing at all." I should make a poster out of that! I've run projects that turned into nightmares because one of the programmers totally optimized his section (which was nowhere near the time critical path) and then later on everything he wrote had to be thrown out and rewritten because a customer requirement changed and his code was totally inflexible.
Most of the projects I've worked on lately have to be very flexible because each customer has different requirements. In that environment, optimization is second to flexibility. Anyways, it's great talking about this stuff and discussing different approaches.

Oh HappyMonster, I actually used ring buffers in Gauntlet. Trust me. Really. I'm not kidding. (I actually didn't intend to spend so much time on ring buffers, but I didn't think I could describe how I used them without everyone knowing what they were. Didn't think it was going to take so long, but they're useful enough to be worth spending the time on.) They're central to the core of Gauntlet and are basically the part that I built the rest of the game around.

Sorry for the slower updates. I'm at my new house and still don't have internet yet, so I'm writing my posts during lunch. Hoping to get past this ring buffer stuff and back onto the fun stuff soon...
12. Ring Buffers step 3 - Make it work reliably: So far, we have a ring buffer that works, but it's pretty dangerous. We can tell if it's empty, but have no way to tell if it's full or almost full. That may be ok under one of the following conditions:

1. You can always take items out of the buffer at least as fast as they get put into it.
2. There is only a set number of items that can be in the buffer, and the buffer is large enough to hold all of them.

If neither of these conditions apply, you must somehow control how fast the buffer gets filled. This is important because once a ring buffer overflows, you don't just lose a couple of items, you lose an entire buffer's worth of data! If the buffer size is 100 and you stuff 101 items into it, the Put index will equal the Get index and you will now have an "empty" buffer. Stuff 102 items into it and the Put index will pass the Get index and you will have 1 item left in the buffer.

The first step to controlling how fast the buffer gets filled is to keep track of how much data is in the buffer. Let's add a counter to our test buffer:

    #define TESTBUFSIZE 100
    char TestBuf[TESTBUFSIZE];
    int TestPut = 0;
    int TestGet = 0;
    int TestCount = 0;

    bool BufferIsEmpty() { return ( TestPut == TestGet ); }

    bool BufferHasSpace() { return ( TestCount < TESTBUFSIZE - 1 ); }

    void PutData(char value)
    {
        TestCount++;
        TestBuf[TestPut++] = value;
        TestPut = TestPut % TESTBUFSIZE;
    }

    char GetData()
    {
        char value = TestBuf[TestGet++];
        TestGet = TestGet % TESTBUFSIZE;
        TestCount--;
        return value;
    }

    main()
    {
        char i;
        char temp;
        for ( i = 1; i <= 10; i++ ) { PutData( i ); }
        while ( !BufferIsEmpty() )
        {
            temp = GetData();
            printf("\nRead data value %d. There are %d items left in the buffer", temp, TestCount);
        }
        printf("\nDone!");
    }

Ok, we've added a new variable TestCount which will keep track of how much data is in our buffer. There are ways to calculate this on the fly, but all of them are klutzy and slow and not worth the effort. Using TestCount is easy, we increment it whenever we put data into the buffer and decrement it whenever we take data out of the buffer. We've also added a BufferHasSpace() function to see if our buffer has any space left in it. Note that it checks if there's a maximum of 99 items in the buffer even though the buffer is sized to hold 100 items. Your ring buffer normally holds one less than the size of the buffer. This is because the BufferIsEmpty() function checks if the Put and Get indexes are equal, and they will be equal when the buffer is empty or full.

You can use either the Put vs Get test or the TestCount to determine if the buffer is empty and each has its advantages. If you've added TestCount, use that because it's faster ( TestCount == 0 means empty) and your buffer can hold one more element - change the test in BufferHasSpace() to be ( TestCount < TESTBUFSIZE ) to get that extra byte. If you haven't added a buffer counter, just use the Put vs Get test. Why not always use the counter? Because it's a big overhead hit. You're incrementing and decrementing the counter for every byte of data that goes through the buffer. That adds up, if you pass a megabyte of data through your buffer, you're performing 2 million extra operations. Depending on the CPU, that may be too costly.

By adding the counter, the routine that's putting data into the buffer can check if it's full before calling PutData(). That keeps the buffer from overflowing, but it's usually not good enough. What happens if the buffer is full? You usually can't hold on to the data until "later" unless you have another buffer somewhere, so your only option is to drop the data on the floor and hope the buffer won't be full by the time the next piece of data arrives. Since most applications can't afford to lose data, you usually have to implement...

Flow Control: If your data source can possibly overrun your buffer, you have to be able to tell it to slow down before it's too late.
Most communications devices have the ability to turn the source on and off with some form of handshaking, but few of them happen instantly. If data has already arrived when the buffer is full, it's too late. You must determine when the buffer is "getting too full" and tell the input to stop sending, and then tell it to start sending again when there's "enough room".

For this example, let's make two imaginary functions to start and stop the input and call them StartInput() and StopInput(). We don't really care what they do, but somehow they tell the sender to start and stop sending the data. For better performance we're also going to add a flag called InputStopped to keep track of what state the input is in.

The next thing we'll need is a point to tell us when to stop the input. How full do we want the buffer to get before we stop the input? This point is called the HighWaterMark. After the buffer hits the HighWaterMark, we want to empty it out to make room for more data. There's a point where we'll consider it "empty enough" to start accepting data again. This point is called the LowWaterMark.

Depending on your application, the high and low water marks will be adjusted to work well with your input stream. If, for instance, your input will send about ten more characters before reacting to the StopInput() call, you must tell it to stop the input before you have less than ten spaces left in the buffer. Or if your input sends bursts of 50 characters, you won't tell it to start sending until you have room for 50 characters. For our example, let's assume that we'll stop the input when our buffer is 90% full and then start it again when it's 10% full.

Here's the code with the flow control added:

    #define TESTBUFSIZE 100
    char TestBuf[TESTBUFSIZE];
    int TestPut = 0;
    int TestGet = 0;
    int TestCount = 0;
    bool InputStopped = false;

    #define HighWaterMark 90
    #define LowWaterMark 10

    bool BufferIsEmpty() { return ( TestPut == TestGet ); }

    bool BufferHasSpace() { return ( TestCount < TESTBUFSIZE - 1 ); }

    void PutData(char value)
    {
        TestCount++;
        TestBuf[TestPut++] = value;
        TestPut = TestPut % TESTBUFSIZE;
        if ( InputStopped == false )
        {
            if ( TestCount >= HighWaterMark )
            {
                StopInput();
                InputStopped = true;
            }
        }
    }

    char GetData()
    {
        char value = TestBuf[TestGet++];
        TestGet = TestGet % TESTBUFSIZE;
        TestCount--;
        if ( InputStopped == true )
        {
            if ( TestCount <= LowWaterMark )
            {
                StartInput();
                InputStopped = false;
            }
        }
        return value;
    }

Ok, first let's look at the PutData() function. After buffering the character, we check if the input has been stopped. If it hasn't, we check if the buffer is too full and tell the input to stop if it is, and then set our InputStopped flag to true. We test if ( InputStopped == false ) first because if we already told the input to stop, we don't want to call StopInput() again. This is a bit of an optimization because it's faster to test ( InputStopped == false ) than to test ( TestCount >= HighWaterMark ). Also, calling StopInput() repeatedly is probably not a good thing. We don't know what this function does, so let's just call it once to ensure we don't waste any time or break anything. (Some communications protocols send messages to the transmitter and you don't want to spam them needlessly with redundant packets.)

The GetData() function does the same thing only in reverse. After it gets a byte of data it checks to see if the input was stopped. If it was, it checks to see if it's time to start it up again. If the buffer is empty enough, it calls StartInput() and clears InputStopped. Now the buffer will start filling up again.

You now know enough to handle just about every ring buffer application you'll ever run across.
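If you want to watch the water marks do their thing, here's a little self-contained harness along the same lines as the code above. StartInput() and StopInput() are just stand-ins that count how many times they get called, so you can check that each one only fires once per transition:

```c
/* A little self-contained harness for the water marks.
 * StartInput()/StopInput() are stand-ins that just record the calls. */
#define TESTBUFSIZE 100
#define HIGHWATER   90
#define LOWWATER    10

char TestBuf[TESTBUFSIZE];
int  TestPut = 0, TestGet = 0, TestCount = 0;
int  InputStopped = 0;
int  StopCalls = 0, StartCalls = 0;      /* so we can see each transition */

void StopInput(void)  { StopCalls++;  }  /* pretend-handshake the sender  */
void StartInput(void) { StartCalls++; }

void PutData(char value)
{
    TestCount++;
    TestBuf[TestPut++] = value;
    TestPut = TestPut % TESTBUFSIZE;
    if (!InputStopped && TestCount >= HIGHWATER) {
        StopInput();                     /* buffer getting too full       */
        InputStopped = 1;
    }
}

char GetData(void)
{
    char value = TestBuf[TestGet++];
    TestGet = TestGet % TESTBUFSIZE;
    TestCount--;
    if (InputStopped && TestCount <= LOWWATER) {
        StartInput();                    /* empty enough, open the tap    */
        InputStopped = 0;
    }
    return value;
}
```

Push 90 bytes in and the input stops exactly once; drain it down to 10 and it starts exactly once.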
We haven't optimized anything yet and we'll cover some ways to do that in the next section. After that we'll look at how ring buffers were used in Gauntlet (finally!) along with some examples of why they were necessary.
13. Bennet, you must have been reading ahead. Will get to that in the "reliably" and "fast" sections. Yup, Gauntlet used five of those 256 byte queues. Some of them were massively huge compared to the amount of data they actually handled, but I needed the speed more than anything else. One of the reasons that Gauntlet was such a memory hog is that usually to make something faster you have to use more memory or make duplicate optimized sections of code. Both approaches eat up your available memory pretty fast.

Sizing buffers on a byte or word boundary (256 or 65536 entries, so the index wraps all by itself) is always my first choice. If I can afford the space I just do it. It's just so much faster and cleaner than the other methods. If I can't do that, I'll use the AND'ing method if I can get away with it. You get a big performance boost on older CPUs, but not as much on today's machines with their prefetches and single cycle math operations. It's probably not any faster than using MOD on the current crop of CPUs, but the code looks cleaner and it just feels better when you're being "clever".

It's kind of weird because I got into programming on the "ground floor" and watched some dramatic changes over the years. When I first started, you programmed in machine language - not "assembly language", you did the work of the assembler yourself on paper and then entered all the bytes in using front panel switches. Then they made assemblers which saved you a ton of work. You didn't have any disassemblers or source code debuggers but that was ok because most of us could just read the machine code directly. The big leap came with the C compilers. For the first time one line of code could put out many assembly language instructions. You were getting further away from the hardware and it was not a comfortable feeling. With machine or assembly you knew exactly what instructions the CPU would run and how many cycles it would take.
Now it wasn't obvious, and a lot of us would print listings with the assembly code interspersed with the C code just so we could see what was "really going on". After a while we knew exactly what code the C compiler would spit out and we would write "the most efficient code", which sometimes looked pretty odd. But we knew exactly what the CPU was running and that was all that mattered. If you wanted to do anything time critical, you still used assembly language because of its speed and predictability. But for everything else C worked just fine.

Then along came C++ with its objects and overriding and all kinds of great programming features. Now we're in a world where a simple statement like "a = b;" could spew out thousands of assembly language instructions. We're so far from the hardware that we don't even think about it. And CPUs have gotten so fast and compilers so good that code efficiency almost doesn't matter. I work on time-critical stuff today and it's creating objects on the fly, passing them around like there's no tomorrow, calling down ten levels to get some status (using text strings!!!) and all kinds of outlandishly inefficient stuff. But when I run this stuff against a performance meter it doesn't even register - it's just that fast. Sometimes we look at the stuff we're writing today and we just have to say to ourselves "Get over it" and move on. The world has moved on, and I guess you have to shed those dinosaur roots sometimes...
14. Yeah, it's the same thing. Thanks for pointing that out!

Last night I was wondering if I was wasting my time on the ring buffer description since there were probably some good tutorials already on the net. I searched around and found a number of them, but they were all either totally theory or way too cryptic to be understandable. Also the "conventional" way to use Put and Get indexes makes it impossible to do the CPU architecture optimizations that actually make ring buffers useful. Made me feel a little better knowing that the stuff I'm writing isn't just a re-hash of other information that's already available. I kind of feel bad for anyone that tries to learn about ring buffers using one of these other tutorials. Even the stuff that was understandable was flat out wrong (if you ever want to optimize the performance).

It also strikes me as strange because I'm not presenting anything "new" here. I know a lot of other engineers that do it this way and we share tips that we've picked up over the years. I guess that most of the people who really know this stuff never have time to write tutorials, so it's left to academics to write long-winded white papers packed with theory and little practical experience. That shouldn't really surprise me because I've worked with too many PhD's who spout endless trivia about "software design practices" but couldn't program their way out of a paper bag. Those were the guys that wrote specs that caused fist fights.

Anyways, I'm moving this weekend so I won't be able to finish working on the next section until next week. Had hoped to get it finished by now, but it's still half done. Have a nice weekend everybody! Go <Insert your favorite team here>!!!!
15. Ring Buffers step 2 - Make it work: Now that we have our ring buffer, we can start making it work. But before we continue I wanted to address naming conventions. We've been working with a sample ring buffer, but most applications have more than one. To keep my code readable, I name each of my ring buffers after the process that takes the data out of it. For example, in Gauntlet the ring buffer used to hold think module requests is named like this:

    #define THINKBUFSIZE 100
    char ThinkBuf[THINKBUFSIZE];
    int ThinkPut = 0;
    int ThinkGet = 0;

Since the ring buffer is named after the consumer of the data, you immediately know you are making a request to the Think Module when you see the data getting put into a ring buffer.

Scary performance rant: As a side note, ring buffers are very flexible and each one is usually optimized for a specific use. Because of this, there's a good chance that your running code won't have any "generic" ring buffers in it. For instance the ring buffer that you use to get data off of a serial port will be vastly different than the one that you use to request an object's action. Each of these will have only the features that they require and nothing extra. Because of their nature, every feature that you add to a ring buffer has a huge impact on its throughput. Many times the performance of the ring buffers is what makes or breaks the system. With today's blindingly fast CPUs and optimizing compilers, the problem isn't as bad as it used to be. You can write horrendously inefficient code and get away with it most of the time, but "back in the day" every CPU cycle counted and some people (like me) made a living pushing every ounce of performance out of the existing hardware. Yes, today you can have generic ring buffers with all the bells and whistles in your toolbox. I have one that I built using the STL library that I use everywhere. It's horribly inefficient, but for most applications I can just drop it in and it performs well enough.
But if I drop it in and it kills the syste performancee, then I have to write a custom one without all the unessasary fat. (BTW, If anyone's interested in my STL ring buffer code, let me know and I'll send it to you. If you have to ask what STL is, don't worry, you won't need it. ) The point is that you must understand ring buffers enough to know what features you really need and how to optimize the ring buffer if it ends up being too slow for your application. An important thing to keep in mind as we go forward is that every ring buffer performance and reliability issue is solved during design time. You must know your performance requirements and your ring buffer's place in the architecture first in order to determine the required functionality, buffer size, flow control, and minimum throughput. It is possible to calculate all this stuff ahead and determine if your ring buffer will "work" before even writing a line of code. Sometimes you might have to take drastic measures like redesigning the hardware to get the performance that you need. That's the kind of stuff you better know ahead of time, not later on when the hardware is cast in stone. Using ring buffers: Ring buffers follow these four basic rules: 1. You put data into into them by storing the data at the Put indexed location and then incrementing the Put index. 2. You get data out of them by incrementing the Get index and reading the data out of the Get indexed location. 3. Both the Put and Get indexes must wrap around to the beginning of the buffer every time they pass the end. 4. If the Put index equals the Get index, the buffer is empty. That's everything you need to know to implement a basic ring buffer. Let's see how this looks in C. For clarity, I've written this as wordy as possible. We can distill it down a bit after we discuss some common C notation. (I want to make sure everyone understands what is going on and nobody is stumbling around because of the C language itself.) 
#define TESTBUFSIZE 100 char TestBuf[TESTBUFSIZE]; int TestPut = 0; int TestGet = 0; bool BufferIsEmpty() { if ( TestPut == TestGet ) { return True; } else { return False; } } void PutData(char value) { TestBuf[TestPut] = value; TestPut = TestPut + 1; if ( TestPut > TESTBUFSIZE ) { TestPut = 0; } } char GetData() { TestGet = TestGet + 1; if ( TestGet > TESTBUFSIZE ) { TestGet = 0; } return TestBuf[TestGet]; } Ok, we've defined a ring buffer and created functions to find out if the buffer is empty, to put data into it, and to get data from it. Let's explain what we did and clean up the code as we go along. First is the function to determine if the buffer is empty: bool BufferIsEmpty() { if ( TestPut == TestGet ) { return True; } else { return False; } } This function IsBufferEmpty() compares the TestPut index to the TestGet index and returns True if they are same and False if they are different. This follows rule number 4 above. The function returns a boolian value which can be either True or False. This code can be cleaned up because the test in the "if" statement ( TestPut == TestGet ) also returns a True or a False. Most programmers would just return the result of the test directly and would rewrite the function as: bool BufferIsEmpty() { return ( TestPut == TestGet ); } ... which means the same thing and is a lot cleaner (trust me on this). So we have a function to determine if the buffer is empty. It can be used like this: if ( BufferIsEmpty() ) { printf("The Buffer is empty\n"); } else { printf("The Buffer is not empty\n"); } Now let's look at the function to put data into the buffer: void PutData(char value) { TestBuf[TestPut] = value; TestPut = TestPut + 1; if ( TestPut > TESTBUFSIZE ) { TestPut = 0; } } We pass in a byte value. The first line "TestBuf[TestPut] = value;" puts that value into the buffer location indexed by the Put index. The next line "TestPut = TestPut + 1;" increments the Put index. 
That implements rule number 1 on how to put data into the ring buffer. Because we incremented the Put index, we must now check to see if we should wrap it around to the beginning of the buffer. This is rule number 3. If we don't do this, it's not a ring buffer anymore. It's a memory fill routine. The most obvious way to do this is to test if the Put index is greater than the size of the buffer and to set it to zero if it is. That's what the code starting with the "if" statement does. Let's clean up this code a bit. In C you can increment a variable directly. The statement "TestPut = TestPut + 1;" can be rewritten as "TestPut++;". The "++" means to increment the variable. Where you put the "++" is also important. If it appears before the variable name "++var" the variable is incremented before the variable isuse. If it appears after the variable name "var++" the variable is incremented after the variable is used. For example: int xvar = 5; int X = ++xvar; int yvar = 5; int y = yvar++; In this example x = 6 and y = 5. Both xvar and yvar end up as 6. Got it? We now have: void PutData(char value) { TestBuf[TestPut] = value; TestPut++; if ( TestPut > TESTBUFSIZE ) { TestPut = 0; } } We can furthur clean up the code by putting the "TestPut++" inside the brackets in the first line. This works because TestPut will be used before it is incremented. The code will function exactly the same as the original. Here's our final cleaned up version: void PutData(char value) { TestBuf[TestPut++] = value; if ( TestPut > TESTBUFSIZE ) { TestPut = 0; } } Now let's look at the function that gets the data out of the ring buffer: char GetData() { TestGet = TestGet + 1; if ( TestGet > TESTBUFSIZE ) { TestGet = 0; } return TestBuf[TestGet]; } Using the same code cleanup methods we get: char GetData() { if ( ++TestGet > TESTBUFSIZE ) { TestGet = 0; } return TestBuf[TestGet]; } Notice that the statement "if ( ++TestGet > TESTBUFSIZE )" increments the Get index before it uses it. 
We save the value into a local variable before incrementing and wrapping the Get index, because once the index has been wrapped it no longer points at the slot we just read. Getting this order wrong is how you end up reading data from off the end of the buffer.

Now we have a working ring buffer, but before we continue I'd like to clean it up somewhat. The method that we use to wrap the indexes is simple, but it's not the best method. There is a cleaner way to do it if the CPU supports it. The 6502 doesn't have an instruction for it, but I'm showing this to you because any current CPU does:

Here's the old way and the new way side-by-side:

Old:
    if ( Index >= BUFSIZE ) {
        Index = 0;
    }

New:
    Index = Index % BUFSIZE;

We're using C's modulo operator "%" to wrap our buffer. The modulo operator divides the first value by the second value and returns only the remainder as the result. Here are some of the values "%" will return using 100 for the buffer size:

1 % 100 returns 1
99 % 100 returns 99
100 % 100 returns 0
101 % 100 returns 1

As you can see, "%" will wrap the index just the way we want it to. It can also be faster, because it's straight-line arithmetic instead of a branch like an "if" statement is. The win is biggest when the buffer size is a power of two, since the compiler can then turn the modulo into a single AND instruction (integer division itself isn't cheap). It took me years before I saw this trick; I always used to do it with the "if".

Here's the final (I mean it this time!) ring buffer code along with some test code to run it:

#include <stdio.h>
#include <stdbool.h>

#define TESTBUFSIZE 100

char TestBuf[TESTBUFSIZE];
int TestPut = 0;
int TestGet = 0;

bool BufferIsEmpty() {
    return ( TestPut == TestGet );
}

void PutData(char value) {
    TestBuf[TestPut++] = value;
    TestPut = TestPut % TESTBUFSIZE;
}

char GetData() {
    char value = TestBuf[TestGet++];
    TestGet = TestGet % TESTBUFSIZE;
    return value;
}

int main() {
    char i;
    char temp;

    for ( i = 1; i <= 10; i++ ) {
        PutData( i );
    }

    while ( !BufferIsEmpty() ) {
        temp = GetData();
        printf("\nRead data value %d", temp);
    }

    printf("\nDone!\n");
    return 0;
}

If you have any questions so far, let me know.
After this we're going to add a feature or two and then we'll "Make it work reliably". Right now it works, but it's nowhere near bombproof (can you figure out why?). Thanks again to Potatohead and Goochman who are just about to tell me how to make the code blocks work.