Everything posted by gregallenwarner

  1. Units for microseconds are denoted by 'us', with the letter 'u' approximating the Greek letter mu. Hope that helps.

     This is actually the very assumption I've been challenging in my head before coming here with my original post. There actually IS an atomic test-and-set instruction in the 9900! The atomic test-and-set is the key to a non-blocking synchronization method. It allows a thread to attempt to acquire a mutex without blocking: if the thread fails to acquire the mutex (because some other thread jumped in there first), it can be aware of this, branch off somewhere other than its critical section, keep working on non-shared data, and come back to try the mutex again later. With the existing blocking methods, the unsuccessful thread simply blocks when it cannot acquire the mutex, completely unaware that it is temporarily halted in its task. The other benefit of test-and-set is that threads no longer need to bother with modifying the interrupt mask. You can safely synchronize threads without any of them having to muck around with LIMI instructions!

     So what is this atomic test-and-set that supposedly already exists in the 9900? I've actually already mentioned it at the top of this post:

            SLA  R0,1

     That's Shift Left Arithmetic, right? Yes, but it can also serve as an atomic test-and-set for non-blocking synchronization. Here's how. Suppose we set up a mutex in memory somewhere. Let's call the label MUTEX. In this system, a mutex must be a 16-bit word, initialized to >8000 by the kernel. A set bit in position 0 indicates the mutex is free and is not owned by any thread. To acquire the mutex, the user thread must set this bit to zero, so that other threads will see that bit 0 is clear and know that the mutex is owned by someone else. But how can we do this atomically? And furthermore, a test-and-set operation atomically changes the value AND tests what the value used to be before the change!

     Here's where Shift Left Arithmetic comes in. The user thread shifts this mutex one bit to the left. The '1' in bit position 0 falls off the end, and zeros shift in from the right. This has the effect of setting the mutex bit to 0, locking the mutex. (Additionally, since this mutex exists in memory, we need to quickly relocate our WP to that memory location, perform the shift, and immediately move our WP back again, like so...)

     MUTEX  DATA >8000
            ...
            ...
     * Acquire the mutex
            LWPI MUTEX
            SLA  R0,1
            LWPI WP          * Whatever our local workspace is

     So even if multiple threads try to synchronize on this mutex at the same time, each thread is performing SLA R0,1 on this memory location, so only new zeros are ever getting shifted in, no matter how many threads attempt to grab it. As long as your multitasking kernel is sure to keep every thread's workspace local to itself, this works fine.

     But what about the test portion of the procedure? Well, it's already been performed by the SLA instruction! We want to know the value of mutex bit 0 before the shift occurred. The SLA instruction stores the value shifted out from position 0 in the carry bit of the Status register! Bingo! We have all we need! Since a context switch preserves the Status register local to each thread, and because SLA is atomic, we have assurance that only 1 thread will ever acquire the mutex! If two threads are in contention for the same mutex, because we are shifting by one bit each time, only 1 thread's Status register will grab the '1' bit in position 0 as it falls off the left side of the register. Everyone else will get zeros. So now we can use JNC to jump somewhere else if we were unlucky enough to miss the mutex!

     MUTEX  DATA >8000
            ...
            ...
     * Acquire the mutex
            LWPI MUTEX
            SLA  R0,1
            LWPI WP
            JNC  OTHER
            ...
            ...
     * Critical section of code
            ...
     OTHER  ...
     * Do something else if we didn't acquire the mutex.

     If non-blocking synchronization isn't needed, we can use this same method to simulate blocking, by jumping back to try to acquire the mutex again if we missed it.

     MUTEX  DATA >8000
            ...
            ...
     GRAB   LWPI MUTEX
            SLA  R0,1
            LWPI WP
            JNC  GRAB
            ...
            ...
     * Critical section of code
            ...

     Releasing the mutex is simple. Once the thread that owns the mutex is absolutely sure it has finished accessing shared memory, it releases the mutex by writing >8000 back to it:

     * (From local workspace)
     * Release the mutex
            LI   R0,>8000
            MOV  R0,@MUTEX
     * ( -or- From mutex workspace)
            LI   R0,>8000
            LWPI WP          * Back to local workspace

     That's what I've been working on. I've not tested this yet on real hardware, but that's why I wanted to confirm some theory about interrupts with you all here, and it seems to make sense. Non-blocking synchronization is a key and revolutionary factor when it comes to more modern approaches to parallelism, and now it's possible on the 9900 thanks to the atomic test-and-set instruction that's been hiding right under our noses the whole time! And it's faster than surrounding mutex-acquiring code with LIMI 0 and LIMI 2, since LWPI takes only 10 clock cycles vs. LIMI's 16 cycles. Let me know what you all think of this theory.
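     The reasoning in this post can be modelled in a few lines of Python. This is only a sketch of the argument, not 9900 code and not something tested on hardware: `sla1`, `Mutex`, and `contend` are names made up here, and the model assumes what the thread concluded, namely that SLA completes before any interrupt is serviced, so each acquire attempt is one indivisible step and each thread keeps its own carry bit.

```python
from itertools import permutations

FREE = 0x8000  # bit 0 (the leftmost bit in TI numbering) set means "free"


def sla1(word):
    """Model SLA Rn,1 on a 16-bit word; return (new_word, carry_out)."""
    carry = (word >> 15) & 1           # the bit that falls off the left end
    return (word << 1) & 0xFFFF, carry


class Mutex:
    """Hypothetical model of the one-word mutex described in the post."""

    def __init__(self):
        self.word = FREE

    def try_acquire(self):
        # Assumed atomic, like the single SLA instruction it stands for.
        self.word, carry = sla1(self.word)
        return carry == 1              # carry set: this caller got the mutex

    def release(self):
        self.word = FREE               # write >8000 back


def contend(order):
    """Let threads attempt the mutex in the given order; return the winners."""
    mutex = Mutex()
    return [name for name in order if mutex.try_acquire()]


if __name__ == "__main__":
    for order in permutations(["A", "B", "C"]):
        print(order, "->", contend(order))
```

     Whatever order the attempts arrive in, only the first one sees the carry set; every later attempt shifts a zero out and gets a clear carry, which is the JNC case in the listings above.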
  2. Thanks for all the info! So, just to recap: every instruction on the TMS9900, even one that involves multiple memory accesses, is treated as an atomic operation by the CPU's interrupt-handling circuitry. I'm investigating non-blocking multitasking on the TMS9900, managed by a preemptive multitasking kernel. To my knowledge it's never been done before, due to the lack of an atomic test-and-set operation such as TCMB and TSMB on the TMS99105. Has anybody investigated non-blocking multitasking on the TMS9900?
  3. Interesting, I had never heard about the interrupt mask being decreased by 1. Does this mean that if I use LIMI 2, as convention has always dictated, my interrupt handler can be interrupted by yet another interrupt, since 2 decreases to 1 and all interrupts are hardwired as level 1 in the TI? Should we be using LIMI 1 to ensure interrupts can't interrupt interrupts?
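     The question can be worked through with a tiny Python model. This is a sketch that assumes the masking rule as the TMS9900 data manual is usually read: a request is recognised when its level is less than or equal to the mask, and on acceptance the mask is loaded with one less than the level being serviced (rather than the old mask minus one). The function names are invented for illustration, and the rule should be checked against the manual before relying on it.

```python
def accepts(mask, level):
    """A request at `level` is recognised only if level <= mask (assumed rule)."""
    return level <= mask


def mask_after_accept(level):
    """Mask loaded on acceptance: level - 1, with level 0 staying at 0 (assumed)."""
    return max(level - 1, 0)


def nested_possible(limi_value, level=1):
    """Can a second interrupt at the same level preempt the handler?

    `level=1` reflects the post's statement that all TI interrupts are
    hardwired as level 1.
    """
    if not accepts(limi_value, level):
        return False                       # the first interrupt never gets in
    handler_mask = mask_after_accept(level)
    return accepts(handler_mask, level)


if __name__ == "__main__":
    for limi in (1, 2):
        print("LIMI", limi, "nested level-1 interrupt possible:",
              nested_possible(limi))
```

     Under that assumption the handler runs with a mask of 0 whether the program used LIMI 1 or LIMI 2, so a second level-1 interrupt would wait until the handler returns and the saved Status register is restored.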
  4. I have a question regarding the nature of TI interrupts. When precisely can an interrupt occur? Here's my example. Suppose I have a single instruction: SLA R0,1. Is this instruction atomic? The TMS9900 manual states that when an interrupt is raised, the CPU services the interrupt after the completion of the current instruction. But I'm not clear on what constitutes a single instruction. With the shift instruction above, the CPU needs to read in the memory word at the location of the Workspace Pointer. Then it needs to shift that word by 1 place, affecting the Status bits. Then it needs to write the result back out to memory, which in and of itself consists of another read-before-write, thanks to the TI's architecture design. Is each of those steps treated as an individual instruction by the CPU? In other words, if an interrupt is raised at any point in between these steps, is it possible for the shift instruction to be interrupted halfway through? Or does the CPU treat this whole process as one atomic instruction, uninterruptible until its completion?
  5. Can you explain this recent "discovery" of scanline effects in more detail? What was discovered about the TI that lets you do this? I was under the impression that the video hardware did not issue scanline interrupts, only field interrupts 50/60 times a second.
  6. I'll dig up my spec document for you all, but off the top of my head, I believe I multiplexed the *WE and *CRUCLK line. Consider that you're never writing to both memory and a CRU device at the same time. My TI-side interface would multiplex these two signals, and then the PEB-side would look at the other lines to figure out whether this was a memory write or a CRU write, and pulse the appropriate line. That's a very simplified explanation, but I've got more details to explain the rest once I post my spec doc. If the goal is to remove the RAM from the PEB, then there would certainly be an opportunity to combine 32K of RAM onto my TI-side interface card. 32K static RAM chips are a few cents, and you can get them in tiny surface mount packages that take up almost no room. Would be worth designing it right in, along with a jumper to disable it in case you prefer to use a SAMS in the PEB.
  7. Something like this doesn't really serve a need for me, since I don't want to tie myself to a strict 32K RAM expansion, as I hope to build myself a SAMS clone someday for the full 1MB of RAM in the PEB. But certainly I can see this being of use to others. Something that worries me is standing the PEB's firehose cable up on its edge like that. It's a rather bulky and heavy cable, and I'm afraid its weight would put considerable strain and torque on the TI's edge connector. The firehose cable's TI-side connector, terrible as it is, was designed with a little foot to keep it elevated so it doesn't strain the connector. Things like strain relief are always a big part of designing a connector.
  8. There would definitely need to be an interface on the TI's end to do the translation/multiplexing that allows 44 pins to be cut down to 40 (fewer than that, actually, since not all positions in an IDE cable are usable), but with modern PLD's, this logic won't take up very much space at all.
  9. I have been writing a specification over the past year for connecting the TI to the PEB using a standard off-the-shelf 40-pin IDE hard drive ribbon cable. My plan is to buy a 3-foot round IDE cable (the twisted pairs are separated and bound together in a rubber moulding) to make the appearance much more tidy, and use that to connect the two. I had to do some clever things to get everything to fit on the smaller IDE cable, but it should be doable. My specification is pretty much complete, just need to build a prototype and test it. Might be useful, as IDE cables are a dime a dozen and easy to find anywhere, eliminating the need for any special connectors or cables.
  10. Hey, all! Thanks for the renewed interest in this project. Here's the long overdue update: I recently got married and moved, and my new house had to accommodate both my stuff and my wife's, so the basement area which I plan on converting into my TI lab still hasn't been set up yet. However, this project is still alive and well in my head. I am toying with the idea of building a replacement PEB interface card. I am also well into the final stages of designing a way to connect the PEB to the TI using an ordinary 40-pin IDE hard drive ribbon cable. (Yes, I know that's not enough pins, I'm using some clever techniques to make it work.) The ultimate plan is to be able to buy an off-the-shelf 3-foot hard drive cable and use it to connect your PEB. You can even get round IDE cables to make your desk less messy. So my thinking is this: the PEB-side connector card for this new interface doesn't need to be very big with today's modern chips, so there's enough room to relocate my Speech-to-PEB circuit to this new card. In other words, the new TI-to-PEB connection card would have an optional card edge connector where you could stick a speech synth, if you had one. So it'd kill two birds with one stone. How's that sound? Would using a standard IDE cable be better for the PEB? And having the connection card pulling double duty with the Speech Synth to save PEB slots as well?
  11. It's not only possible, it's very practical. FPGA's are perfect for a small batch where you don't want to wire together a mountain of discrete logic, and you don't have the funds to roll an ASIC chip. Grab yourself a cheap FPGA dev board and try it out!
  12. Probably not the issue, but does it have anything to do with the interrupt line? I recall when I performed my HDX mod, you need to cut the card's interrupt line, as the card won't have a valid interrupt handler loaded when you first install it. It can be reconnected after loading the DSR. Just a shot in the dark.
  13. I'd like to extend the capabilities of the linking loader for my own purposes. For one, I'd like to implement the assembler directives, PSEG, DSEG, and CSEG, to make use of them. I saw on Thierry Nouspikel's site that he had a bit of code that patched in that functionality, but I'd like to have that functionality rolled into the loader itself, not just a patch job. I also want to remove the linking loader's reliance on other EA utilities, such as DSRLNK, VSBW, VSBR, etc. I'd like for it to be completely stand-alone.
  14. I'm interested in studying how the Option 3 Linking Loader works, and so I was wondering if anybody possessed a copy of the EA Utility routines disassembly or source? I have a dump of the low memory, but disassembling it by hand is turning out to be a PAIN! Or if somebody knows an easy way to disassemble code, I'd be very appreciative. Thanks!
  15. Why not an FPGA? Like Matthew said, with an FPGA, you could get the whole system in one chip. Plus, CPLD's are so expensive, and when you consider the cost per logic block, FPGA's blow CPLD's way out of the water.
  16. Here's a quick TI-BASIC program which I just wrote to implement Langton's Ant, which is a type of cellular automaton, similar to Conway's Game of Life. If you don't know what cellular automata are, they are very simple machines created from very simple rule sets, which display complex emergent behavior. Langton's Ant is played on an infinite grid of black or white cells. The rule is: the "ant" lives on one cell at a time. It checks the color of the cell. If the color is white, it flips the color to black, turns right, and moves forward one step. If the cell was black, however, it flips it to white, turns left instead, and walks forward one step. Then the whole sequence repeats indefinitely.

      My implementation here isn't on an infinite grid, unfortunately, due to the TI's 32x24 cell screen buffer, but nonetheless you can start to see some of the interesting patterns begin to emerge as the program runs. I made the screen loop back on itself horizontally and vertically, but the simulation can run for a good few thousand steps before the ant wraps around and collides with the pattern being generated. You can play with certain aspects of the program, such as changing the rules, changing the playing field so it doesn't wrap around on itself, or perhaps creating an initial starting pattern for the ant to traverse.

      100 CALL CHAR(32,"0000000000000000")
      110 CALL CHAR(33,"FFFFFFFFFFFFFFFF")
      120 CALL COLOR(1,2,16)
      130 CALL SCREEN(16)
      140 CALL HCHAR(1,1,32,768)
      150 X=16
      160 Y=12
      170 DX=0
      180 DY=-1
      190 CALL GCHAR(Y,X,C)
      200 IF C=32 THEN 210 ELSE 360
      210 CALL HCHAR(Y,X,33)
      220 NDX=-DY
      230 DY=DX
      240 DX=NDX
      250 X=X+DX
      260 Y=Y+DY
      270 IF X>32 THEN 280 ELSE 290
      280 X=X-32
      290 IF X<1 THEN 300 ELSE 310
      300 X=X+32
      310 IF Y>24 THEN 320 ELSE 330
      320 Y=Y-24
      330 IF Y<1 THEN 340 ELSE 190
      340 Y=Y+24
      350 GOTO 190
      360 CALL HCHAR(Y,X,32)
      370 NDY=-DX
      380 DX=DY
      390 DY=NDY
      400 GOTO 250
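      For anyone who wants to experiment away from the TI, the same rules can be sketched in Python. This is a rough port, not the original program: `run_ant` is a made-up name, the grid is held as a set of black cells instead of screen characters, and it keeps the listing's 1-based 32x24 coordinates, start position, initial heading (up), and wrap-around.

```python
WIDTH, HEIGHT = 32, 24  # same playing field as the TI screen


def run_ant(steps, x=16, y=12, dx=0, dy=-1):
    """Run the ant for `steps` moves; return (black_cells, position, heading)."""
    black = set()                        # every cell starts white
    for _ in range(steps):
        if (x, y) in black:
            black.remove((x, y))         # black cell: flip to white...
            dx, dy = dy, -dx             # ...and turn left
        else:
            black.add((x, y))            # white cell: flip to black...
            dx, dy = -dy, dx             # ...and turn right
        # Step forward, wrapping at the edges (coordinates are 1-based).
        x = (x - 1 + dx) % WIDTH + 1
        y = (y - 1 + dy) % HEIGHT + 1
    return black, (x, y), (dx, dy)


if __name__ == "__main__":
    cells, pos, heading = run_ant(1000)
    for row in range(1, HEIGHT + 1):
        print("".join("#" if (col, row) in cells else "."
                      for col in range(1, WIDTH + 1)))
    print("ant at", pos, "heading", heading)
```

      Because y grows downward on the screen, "turn right" maps (dx, dy) to (-dy, dx), which is the same arithmetic as lines 220 to 240 of the BASIC listing, and "turn left" matches lines 370 to 390.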
  17. With a 74LS612 memory mapper. See this page for the schematics: http://www.unige.ch/medecine/nouspikel/ti99/superams.htm 1 MB > 32 KB
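      As a rough illustration of how a mapper like this stretches a 16-bit address, here is a small Python sketch. It assumes 4 KB pages, with the top four CPU address bits selecting one of the mapper's 16 registers and each register supplying the page number; the function name, the 8-bit page numbers, and the register contents are illustrative only, and the actual wiring is on the linked schematics page.

```python
PAGE_SIZE = 4096          # 4 KB pages: 12 offset bits
NUM_REGISTERS = 16        # one mapper register per 4 KB slice of the 64 KB space


def translate(registers, cpu_addr):
    """Map a 16-bit CPU address to a physical address via the page registers."""
    reg_index = (cpu_addr >> 12) & 0xF      # top 4 bits pick the register
    offset = cpu_addr & 0xFFF               # low 12 bits pass straight through
    return registers[reg_index] * PAGE_SIZE + offset


if __name__ == "__main__":
    # Hypothetical setup: identity mapping, then point >A000 at the last page.
    registers = list(range(NUM_REGISTERS))
    registers[0xA] = 0xFF                   # assumes 8-bit page numbers
    print(hex(translate(registers, 0xA123)))
    print(256 * PAGE_SIZE, "bytes reachable with 8-bit page numbers")
```

      With 256 possible page numbers of 4 KB each, the reachable memory comes to 1 MB, even though the CPU still only sees 64 KB at any one time.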
  18. coolio, I'm a computer programmer by profession. Java mostly, though I'm pretty capable in a number of languages. It's tough finding a local programmer position since so many programming jobs get outsourced overseas these days. I'm checking in some hospitals around here, since they often have in-house programmers. Thanks for the best wishes guys! I'll let you know if anything develops.
  19. Hey, just to give you guys an update on this project: Back in June I got laid off from my job, so I've had to pack up my house and move, and I've been on the job hunt ever since. Got some good prospects though... However, that means I'm currently without a workshop, so this project unfortunately will need to be placed on hold till I have a suitable place to set everything back up. The development portion is pretty much finished, so all that's left is ordering parts and manufacturing. I thank you guys for your patience on this. It hasn't been easy with all of life's complications being thrown my way. I don't want to make excuses though. I'll be sure to focus on getting this project completed for you guys soon as I find out where I'll be setting up my new lab. Thanks again!
  20. I've got a weird PEB. On the front, it's a push-button. However, on the back, I have the nicer short fingers of the rocker switch version, but I also have the easily accessible fuse of the push-button version. Best of both worlds.
  21. As for the audio, most TV's and monitors with VGA inputs have a 1/8" (3.5mm) stereo audio jack associated with the VGA port. I believe the TI's AV cable uses an RCA plug for the audio. Correct me if I'm wrong. If you can find an RCA to 3.5mm adapter, you should be able to connect up the sound with no need for a separate external speaker.
  22. For whatever it's worth, in my opinion, the F18A is a fine device, completely surpassing any and all expectations. That's just my opinion. What I see in all the talk about the hypothetical F38A and the back and forth discussion is ultimately an expectation that overlooks the fact that the F18A is Matthew's hobby and labor of love. In all truthfulness, I feel the reason he built it wasn't to serve others, rather he built it for himself because it was something he wanted to do. The fact that it's a product he can sell and others can get benefit out of it is tangential. At least that's the way I see it. Matthew may not even know it, but he is directly responsible for where I am today in electronics. Before discovering the F18A, I knew nothing about FPGA's, and couldn't even fathom ever understanding how to design for them, let alone build a fully functioning product. Now, FPGA's are all I care about, and I'm obsessed with learning as much as I can about them. Today, I still can't see myself making something as complicated as the F18A, but I know now that it's within my reach, solely because of the inspiration I received by reading Matthew's blog. Matthew had an idea that he wanted to create, and he took it upon himself to sit down, teach himself FPGA programming, and he built his idea. He went out and did it. That's been incredibly motivating to me. I'm motivated to build things that I would like to see exist. Like the speech synthesizer adapter, if it's something I want, I go out and build it. If others are interested in it as well, then that's a nice addition. It feels good to make something that others are interested in. But ultimately, it's my project, and I get to decide what features and requests do and don't get rolled into the final product. The same goes for Matthew's F18A. 
Sure, it fails to meet some of the expectations of users who would like to see it implement more of the 9938's feature set, but it's Matthew's project, and he's under no obligation to implement anybody's requests. If he were on your payroll, then that would be a different story. But for now, any requests he does choose to implement should be seen as a bonus, not as a selective resistiveness to people's requests. Matthew's not here to dance to everyone else's tune. He's simply making something that he wants to make, and if anybody else is interested in it as well, then they get to share in the enjoyment of it. If somebody wants an F38A, as Matthew has already said, go out and do it. There's nothing stopping you. That realization has motivated me to get to where I am today, and will get me to where I will be in the future. That's just my opinion on the matter. Also, just wanted to let Matthew know that he's a direct inspiration to me in my work.
  23. Ha ha! Well, between work and having to be out of town for family, it's been tough to find time on the weekends to finish testing these new prototypes. I'm gonna force myself to do it one night this week, so I can get these out for field testing. I'll keep you guys up to date, and hopefully post another video soon. -Greg W.