


gregallenwarner last won the day on May 11 2015

gregallenwarner had the most liked content!

Community Reputation

200 Excellent

About gregallenwarner

  • Rank
    Chopper Commander

Recent Profile Visitors

4,428 profile views
  1. Units for microseconds are denoted by 'us', with the letter 'u' approximating the Greek letter mu. Hope that helps.

     This is actually the very assumption I've been challenging in my head before coming here with my original post. There actually IS an atomic test-and-set instruction in the 9900!

     The atomic test-and-set is the key to a non-blocking synchronization method. It lets a thread attempt to acquire a mutex without blocking: if the thread fails to acquire the mutex (because some other thread jumped in there before it was able to), the thread can be aware of this, branch off somewhere other than its critical section, keep working on non-shared data, and then come back and attempt to acquire the mutex again later. With the existing blocking methods, the unsuccessful thread simply blocks when it cannot acquire the mutex, completely unaware that it is temporarily halted in its task.

     The other benefit of test-and-set is that threads no longer need to bother with modifying the interrupt mask. You can safely synchronize threads without any thread having to muck around with LIMI instructions!

     So what is this atomic test-and-set that supposedly already exists in the 9900? I've actually already mentioned it at the top of this post:

            SLA  R0,1

     That's Shift Left Arithmetic, right? Yes, but it can also serve as an atomic test-and-set for non-blocking synchronization. Here's how.

     Suppose we set up a mutex in memory somewhere. Let's call the label MUTEX. In this system, a mutex must be a 16-bit word, initialized to >8000 by the kernel. A set bit in position 0 (the leftmost bit, in TI's numbering) indicates the mutex is free and is not owned by any thread. To acquire the mutex, the user thread must set this bit to zero, so that other threads will see that bit 0 is clear and know that the mutex is owned by someone else. But how can we do this atomically? And furthermore, the test-and-set operation atomically changes the value AND tests what the value used to be before the change!

     Here's where Shift Left Arithmetic comes in. The user thread shifts this mutex one bit to the left. The '1' in bit position 0 falls off the end, and zeros shift in from the right. This has the effect of setting the mutex bit to 0, locking the mutex. (Additionally, since this mutex exists in memory, we need to quickly relocate our WP to that memory location, perform the shift, and immediately move our WP back again, like so...)

     MUTEX  DATA >8000
            ...
            ...
     * Acquire the mutex
            LWPI MUTEX
            SLA  R0,1
            LWPI WP          * Whatever our local workspace is

     So even if multiple threads try to synchronize on this mutex at the same time, each thread is performing SLA R0,1 on this memory location, so only new zeros are ever getting shifted in, no matter how many threads attempt to grab it. As long as your multitasking kernel is sure to keep every thread's workspace local to itself, this works fine.

     But what about the test portion of the procedure? Well, it's already been performed by the SLA instruction! We want to know the value of mutex bit 0 before the shift occurred. The SLA instruction stores the value shifted out from position 0 in the carry bit of the Status register! Bingo! We have all we need! Since a context switch preserves the Status register local to each thread, and because SLA is atomic, we have assurance that only 1 thread will ever acquire the mutex! If two threads are in contention for the same mutex, because we are shifting by one bit each time, only 1 thread's Status register will grab the '1' bit in position 0 as it falls off the left side of the register. Everyone else will get zeros. So now we can use JNC to jump somewhere else if we were unlucky enough to miss the mutex!

     MUTEX  DATA >8000
            ...
            ...
     * Acquire the mutex
            LWPI MUTEX
            SLA  R0,1
            LWPI WP
            JNC  OTHER
            ...
            ...
     * Critical section of code
            ...
     OTHER  ...
     * Do something else if we didn't acquire the mutex.

     If non-blocking synchronization isn't needed, we can use this same method to simulate blocking, by jumping back to try to acquire the mutex again if we missed it.

     MUTEX  DATA >8000
            ...
            ...
     GRAB   LWPI MUTEX
            SLA  R0,1
            LWPI WP
            JNC  GRAB
            ...
            ...
     * Critical section of code
            ...

     Releasing the mutex is simple. Once the thread that owns the mutex is absolutely sure it is finished accessing shared memory, it releases the mutex by writing >8000 back to it:

     * (From local workspace)
     * Release the mutex
            LI   R0,>8000
            MOV  R0,@MUTEX
     * ( -or- From mutex workspace)
            LI   R0,>8000
            LWPI WP          * Back to local workspace

     That's what I've been working on. I've not tested this yet on real hardware, but that's why I wanted to confirm some theory about interrupts with you all here, and it seems to make sense. Non-blocking synchronization is a key and revolutionary factor when it comes to more modern approaches to parallelism, and now it's possible on the 9900 thanks to the atomic test-and-set instruction that's been hiding right under our noses the whole time! And it's faster than surrounding mutex-acquiring code with LIMI 0 and LIMI 2, since LWPI takes only 10 clock cycles vs. LIMI's 16 cycles.

     Let me know what you all think of this theory.
  2. Thanks for all the info! So, just to recap: every instruction in the TMS9900, even one that is comprised of multiple memory-access steps, is treated as an atomic operation by the interrupt handling circuitry in the CPU. I'm investigating the nature of non-blocking multitasking in the TMS9900, managed by a preemptive multitasking kernel, and to my knowledge, it's never been done before, due to the lack of an atomic Test-and-Set operation, such as TCMB and TSMB in the TMS99105. Has anybody investigated non-blocking multitasking in the TMS9900?
  3. Interesting, I had never heard about the interrupt mask being decreased by 1. Does this mean that, if I use LIMI 2 as convention has always dictated, my interrupt handler can be interrupted by yet another interrupt, since 2 decreases to 1 and all interrupts are hardwired as 1 in the TI? Should we be using LIMI 1 to ensure interrupts can't interrupt interrupts?
  4. I have a question regarding the nature of TI interrupts. When precisely can an interrupt occur? Here's my example: Suppose I have a single instruction: SLA R0,1 Is this instruction atomic? The TMS9900 manual states that when an interrupt is raised, the CPU services the interrupt after the completion of the current instruction. But I'm not clear on what constitutes a single instruction. With the shift instruction above, the CPU needs to read in the memory word at the location of the Workspace Pointer. Then, it needs to shift that word by 1 place, affecting the Status bits. Then, it needs to write the result back out to memory, which in and of itself consists of another read-before-write, thanks to the TI's architecture design. Is each of those steps treated as an individual instruction by the CPU? In other words, if an interrupt is raised at any point in between these steps, is it possible for the shift instruction to be interrupted halfway through? Or does the CPU treat this whole process as one atomic instruction, uninterruptible until its completion?
  5. Can you explain this recent "discovery" of scanline effects in more detail? What was discovered about the TI that lets you do this? I was under the impression that the video hardware did not issue scanline interrupts, only field interrupts 50/60 times a second.
  6. I'll dig up my spec document for you all, but off the top of my head, I believe I multiplexed the *WE and *CRUCLK lines. Consider that you're never writing to both memory and a CRU device at the same time. My TI-side interface would multiplex these two signals, and then the PEB-side would look at the other lines to figure out whether this was a memory write or a CRU write, and pulse the appropriate line. That's a very simplified explanation, but I've got more details to explain the rest once I post my spec doc. If the goal is to remove the RAM from the PEB, then there would certainly be an opportunity to combine 32K of RAM onto my TI-side interface card. 32K static RAM chips are a few cents, and you can get them in tiny surface mount packages that take up almost no room. It would be worth designing it right in, along with a jumper to disable it in case you prefer to use a SAMS in the PEB.
  7. Something like this doesn't really serve a need for me, since I don't want to tie myself to a strict 32K RAM expansion, as I hope to build myself a SAMS clone someday for the full 1MB of RAM in the PEB. But certainly I can see this being of use to others. Something that worries me is standing the PEB's firehose cable up on its edge like that. It's a rather bulky and heavy cable, and I'm afraid its weight would put considerable strain and torque on the TI's edge connector. The firehose cable's TI-side connector, terrible as it is, was designed with a little foot to keep it elevated so it doesn't strain the connector. Things like strain relief are always a big part of designing a connector.
  8. There would definitely need to be an interface on the TI's end to do the translation/multiplexing that allows 44 pins to be cut down to 40 (fewer than that, actually, since not all positions in an IDE cable are usable), but with modern PLDs, this logic won't take up very much space at all.
  9. I have been writing a specification over the past year for connecting the TI to the PEB using a standard off-the-shelf 40-pin IDE hard drive ribbon cable. My plan is to buy a 3-foot round IDE cable (the twisted pairs are separated and bound together in a rubber moulding) to make the appearance much more tidy, and use that to connect the two. I had to do some clever things to get everything to fit on the smaller IDE cable, but it should be doable. My specification is pretty much complete, just need to build a prototype and test it. Might be useful, as IDE cables are a dime a dozen and easy to find anywhere, eliminating the need for any special connectors or cables.
  10. Hey, all! Thanks for the renewed interest in this project. Here's the long overdue update: I recently got married and moved, and my new house had to accommodate both my stuff and my wife's, so the basement area which I plan on converting into my TI lab still hasn't been set up yet. However, this project is still alive and well in my head. I am toying with the idea of building a replacement PEB interface card. I am also well into the final stages of designing a way to connect the PEB to the TI using an ordinary 40-pin IDE hard drive ribbon cable. (Yes, I know that's not enough pins, I'm using some clever techniques to make it work.) The ultimate plan is to be able to buy an off-the-shelf 3-foot hard drive cable and use it to connect your PEB. You can even get round IDE cables to make your desk less messy. So my thinking is this: the PEB-side connector card for this new interface doesn't need to be very big with today's modern chips, so there's enough room to relocate my Speech-to-PEB circuit to this new card. In other words, the new TI-to-PEB connection card would have an optional card edge connector where you could stick a speech synth, if you had one. So it'd kill two birds with one stone. How's that sound? Would using a standard IDE cable be better for the PEB? And having the connection card pulling double duty with the Speech Synth to save PEB slots as well?
  11. It's not only possible, it's very practical. FPGAs are perfect for a small batch where you don't want to wire together a mountain of discrete logic, and you don't have the funds to roll an ASIC chip. Grab yourself a cheap FPGA dev board and try it out!
  12. Probably not the issue, but does it have anything to do with the interrupt line? I recall when I performed my HDX mod, you need to cut the card's interrupt line, as the card won't have a valid interrupt handler loaded when you first install it. It can be reconnected after loading the DSR. Just a shot in the dark.
  13. I'd like to extend the capabilities of the linking loader for my own purposes. For one, I'd like to implement the assembler directives, PSEG, DSEG, and CSEG, to make use of them. I saw on Thierry Nouspikel's site that he had a bit of code that patched in that functionality, but I'd like to have that functionality rolled into the loader itself, not just a patch job. I also want to remove the linking loader's reliance on other EA utilities, such as DSRLNK, VSBW, VSBR, etc. I'd like for it to be completely stand-alone.
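
As a rough illustration of the SLA-based test-and-set described in the first post above, here is a small Python model rather than 9900 code. It assumes, as that post does, that SLA R0,1 completes as one indivisible step. The names sla1, Mutex, try_acquire, release and contend are made up for this sketch, and nothing here has been checked against real hardware.

```python
FREE = 0x8000  # bit 0 in TI numbering is the most significant bit


def sla1(word):
    """Model SLA Rn,1 on a 16-bit word: return (shifted word, carry)."""
    carry = (word >> 15) & 1           # the bit that falls off the left end
    return (word << 1) & 0xFFFF, carry


class Mutex:
    """Hypothetical model of the MUTEX word from the post."""

    def __init__(self):
        self.word = FREE               # kernel initializes the mutex to >8000

    def try_acquire(self):
        # One indivisible step, standing in for the single SLA instruction.
        self.word, carry = sla1(self.word)
        return carry == 1              # carry set: caller owns the mutex (JNC falls through)

    def release(self):
        self.word = FREE               # same effect as MOV R0,@MUTEX with R0 = >8000


def contend(order):
    """Let each thread id in `order` attempt the mutex once; return the winners."""
    m = Mutex()
    return [tid for tid in order if m.try_acquire()]
```

For example, contend([2, 0, 1]) returns [2]: whichever thread shifts first catches the 1 bit in carry, and every later attempt shifts out a 0 until the owner calls release().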