
TMS-9900 CP/M?


87 replies to this topic

#76 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,734 posts
  • Location:Flyover State

Posted Tue Sep 5, 2017 9:38 AM

Suit yourself. I'm just saying I don't know who came to that conclusion originally.

I've seen it repeated a lot... and I've never seen a reference to who originally said it.
That doesn't make it true.



#77 apersson850 OFFLINE  

apersson850

    Moonsweeper

  • 436 posts

Posted Mon Sep 11, 2017 1:11 PM

Yes, looking at the TMS 9900 timing diagram, one can see that it uses two clock cycles to access memory, even when there's no wait state involved. It seems it should be able to access an internal register faster than that.

Does anyone have any similar performance data for the TI 990/9 CPU?



#78 pnr OFFLINE  

pnr

    Space Invader

  • 27 posts

Posted Sat Sep 16, 2017 5:41 PM

Can anyone even cite who originally said this?

 

 

I think I can answer this: Daren Appelt, the engineer who designed the TI960, TI980 and TI990 mini computers.

 

First of all, think of the context of the time: most of the 16-bit minis were designed in the late 60's using the then-new TTL logic chips and core memory.

 

The CPUs were split over 5-20 boards in a rack, and typically clocked at between 3 and 10 MHz (the maximum speed that such a context would allow), leading to a typical data path cycle time of 200-350 ns. The core memory had a typical cycle time of 800 to 1200 ns. Often the designs would use an analog delay line to measure out the access time of the core memory and stall the CPU as it waited for data from memory. The PDP-11, the Nova, the HP2100, the 316, etc. all shared these characteristics. Things like caches were too expensive to implement and did not appear on this class of computer until the mid/late 70's.

 

In the 1960's, having registers in memory was not unusual in mini computers, due to the cost of building registers from individual flip-flops: adding 16 registers of 16 bits each would have taken an entire board full of 7475 TTL chips. For example, the PDP-8 (which pre-dates the TTL era) had its registers in memory. Often only a few hardware registers existed, augmented by a bank of easily addressed memory locations (e.g. on the Nova; the 6502's "page zero" is an echo of this).

 

Then came semiconductor memory in the early 70's (and Daren had some involvement in that within TI, holding a few patents etc.). This soon made it possible to build main memory with a cycle time of 250-500 ns, the same as the cycle time of the 5-20 board CPUs of the era. Now, suddenly, it became possible to have registers in memory and not pay a hefty price in performance. In this context the maxim "memory is as fast as CPU registers" was true, and the 990 was proposed and designed in 1973.

 

The novelty in the TI990 design was the idea of a workspace pointer and placing the register bank anywhere in memory. This was mostly useful in the context of the main production computer languages of the era, Cobol and Fortran. In these languages there was no recursion and each subroutine could be allocated its own block of registers. This was an optimisation over what happened on e.g. an IBM 360, where each subroutine started with an instruction to save the registers to a block of memory, and ended with an instruction to reload them from that memory block. The TI990 was successful as a Cobol office machine, with some 100,000 units sold (incl. the desktop System 200/300/etc.).
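The workspace idea is easy to model. Here is a hypothetical Python sketch (the names `Workspace` and `memory` and the addresses are mine; the real TMS9900 keeps registers at WP + 2n in byte-addressed memory, but word addressing keeps the sketch short): the "registers" are just sixteen consecutive words in RAM, and giving a subroutine its own register bank is nothing more than changing the workspace pointer.

```python
# Hypothetical model of the TI990/TMS9900 workspace-pointer scheme.
# "Registers" R0-R15 are sixteen consecutive 16-bit words in memory;
# the workspace pointer (WP) says where that block starts.
# Simplified to word addressing (the real chip uses WP + 2n, byte-addressed).

memory = [0] * 65536  # word-addressed main memory

class Workspace:
    def __init__(self, wp):
        self.wp = wp  # base address of this 16-register block

    def read(self, n):
        return memory[self.wp + n]

    def write(self, n, value):
        memory[self.wp + n] = value & 0xFFFF

main = Workspace(0x0100)
main.write(3, 0x1234)  # caller's R3

# A BLWP-style call just points WP at a fresh block; the caller's
# registers stay untouched in memory -- no save/restore loop needed.
sub = Workspace(0x0200)
sub.write(3, 0xBEEF)   # subroutine's R3, a different memory word

assert main.read(3) == 0x1234  # caller's R3 survived the "context switch"
assert sub.read(3) == 0xBEEF
```

This is exactly the contrast with the IBM 360 style described above: the save/reload instructions at subroutine entry and exit disappear, because each routine's registers already live in their own patch of memory.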

 

Daren was also the person who noticed that the TI990 CPU could fit on a single chip with the then current technology, and it came to be in 1975. This was also the undoing of the design: with the CPU shrunk from several boards in a rack to a single chip, adding CPU-based registers became almost zero cost and again had a speed advantage. The TI990 design hence had a very short window in computer history where it made sense.

 

In the 80's several designs were tried that had multiple register banks integrated in the CPU, notably in the RISC world (e.g. the Sun Sparc). As time progressed, it turned out that having a smallish block of registers at the traditional instruction set level, an optimising compiler and microcode with advanced pipeline techniques (register renaming etc.) is the best route to performance for our current software base and it squeezed out all other designs.



#79 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,734 posts
  • Location:Flyover State

Posted Sun Sep 17, 2017 11:05 AM

Well, I'm pretty sure the speed of flip-flops increased at the same rate as SRAM, since SRAM is made from flip-flops.
DRAM uses a transistor and a capacitor, so... maybe faster... but I have my doubts due to the required refresh.
You would gain a slightly faster access time, only to lose it to regular refresh stalls.
So I think this is still a board size/cost issue rather than a speed issue.



#80 mizapf OFFLINE  

mizapf

    River Patroller

  • 2,583 posts
  • Location:Germany

Posted Sun Sep 17, 2017 1:42 PM

SRAM is typically much faster than DRAM; in particular, you can find SRAM in cache memory. DRAM requires a periodic refresh; it is much simpler to build (1T1C cell = one transistor, one capacitor), but you have to actually charge/discharge the capacitor of the cell.



#81 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,734 posts
  • Location:Flyover State

Posted Sun Sep 17, 2017 5:45 PM

SRAM is typically much faster than DRAM; in particular, you can find SRAM in cache memory. DRAM requires a periodic refresh; it is much simpler to build (1T1C cell = one transistor, one capacitor), but you have to actually charge/discharge the capacitor of the cell.

My thought as well, but I don't know what things were like in the late 60s.  

I think DRAM would still be slower but...I have nothing to base that on from that time period.



#82 Ksarul OFFLINE  

Ksarul

    River Patroller

  • 4,215 posts

Posted Sun Sep 17, 2017 9:10 PM

Note that core memory was still in heavy use until about 1975 or so. . .as the first widely successful DRAM chips didn't show up until 1973. The earliest Intel DRAM chips came in 1970, but they had a lot of issues at first. That said, a lot of design decisions made in the early to mid 1970s still reflected an environment with core. Hard-core military and science applications still used a lot of it even into the mid 1980s (Challenger's computers used core memory until the accident that destroyed it in 1986). It was a time of technology transition, so a lot of what was going on then didn't have a long shelf life, engineering-wise.



#83 apersson850 OFFLINE  

apersson850

    Moonsweeper

  • 436 posts

Posted Thu Sep 21, 2017 2:02 AM

What pnr writes above makes sense all the way. And even if flip-flops got faster too, the speed limitation came from the CPU being spread across a number of circuit boards, so that didn't really help.

Then the TMS 9900 just put the same design on one chip, without taking into account that such an integration kind of invalidated the whole idea.

 

Take a look here for a patent application by TI in 1975. Mr. Appelt didn't file it, but he's referenced, as he had applied for a patent for an asynchronous bus communication system, later known as TILINE.


Edited by apersson850, Thu Sep 21, 2017 2:12 AM.


#84 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,734 posts
  • Location:Flyover State

Posted Thu Sep 21, 2017 12:25 PM

What pnr writes above makes sense all the way. And even if flip-flops got faster too, the speed limitation came from the CPU being spread across a number of circuit boards, so that didn't really help.

Then the TMS 9900 just put the same design on one chip, without taking into account that such an integration kind of invalidated the whole idea.

 

Take a look here for a patent application by TI in 1975. Mr. Appelt didn't file it, but he's referenced, as he had applied for a patent for an asynchronous bus communication system, later known as TILINE.

Actually, I don't think it does.
1. It does not account for the read-modify-write requirement that guarantees RAM has to be slower.
2. The 990/10 ran at 4.5 MHz. You shouldn't have bus speed issues from board to board until much higher speeds.
3. If there were speed issues across boards, that would also impact accessing RAM, which would be... on other boards.



#85 pnr OFFLINE  

pnr

    Space Invader

  • 27 posts

Posted Mon Sep 25, 2017 3:58 AM

I think read-modify-write has a back story of its own, although somewhat related. There are three drivers for this:

1. In TI990/PDP11 style machines ("2-operand machines") the MOV instruction is typically in a group with arithmetic instructions (add, sub, and, or, etc.). The arithmetic instructions require the 2nd operand to be fetched, modified and written. To keep the microcode simple MOV followed the same set of steps, even though the fetch was redundant.

2. Core memory has destructive read (read erases the word) and requires write-back. Write could only set bits (and not clear them), so for writing a word it was first erased. As core memory dictated a read-write cycle anyway, making MOV a read-modify-write operation was not an additional cost. High performance machines used interleaved core memory banks, so that the next word could be read, whilst the previous one was being rewritten.

3. The TILINE bus could only operate on full words (unlike DEC's Unibus, which had signals to write a half-word). With the mini computer industry's thinking moving from word-addressed machines to byte-addressed machines around 1970, this was arguably an understandable mistake. As a result, byte instructions on a TI990 needed a read before write in any case (and, as under 1. above, byte instructions shared microcode with their word counterparts).

All in all, in the early 70's designing machines with a fixed read-modify-write cycle was an understandable and common performance vs. cost trade off.
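Point 1 above can be illustrated with a toy sketch (hypothetical Python, not real TI990 microcode): MOV walks the same read-modify-write micro-sequence as the arithmetic instructions, so it issues a read of the destination it is about to overwrite.

```python
# Toy illustration of point 1: MOV sharing a read-modify-write
# micro-sequence with the arithmetic group. The addresses and the
# two-instruction repertoire are made up for the example.

memory = {0x10: 5, 0x20: 7}
bus_ops = []  # trace of memory-bus cycles, in order

def rmw(op, src, dst):
    """One shared micro-sequence: read src, read dst, modify, write dst."""
    a = memory[src]; bus_ops.append(("read", src))
    b = memory[dst]; bus_ops.append(("read", dst))  # redundant for MOV
    result = a + b if op == "ADD" else a            # MOV just ignores b
    memory[dst] = result; bus_ops.append(("write", dst))

rmw("MOV", 0x10, 0x20)

# MOV still read the destination word it was about to overwrite:
assert bus_ops == [("read", 0x10), ("read", 0x20), ("write", 0x20)]
assert memory[0x20] == 5
```

Keeping MOV on the shared sequence costs one bus cycle per MOV, but saves a separate microcode path, which is exactly the performance vs. cost trade-off described above.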

 

Running much faster than 5 MHz was hard. Yes, the individual TTL components can be clocked at 20+ MHz, but a more complicated system could not. Remember that these CPUs would draw several amps of power, making the environment 'noisy'. For a signal to propagate and settle from a register latch, through a multiplexer, through a 74181 ALU, through a shifter, through a buffer and back to a register latch (the "datapath") would easily take 100 ns. For the microinstruction to be decoded and the control signals to settle would also take some 100 ns; in a high end design the two things could be somewhat overlapped. Even a high end mini (like the 1973 PDP11/45) had a cycle time of about 120 ns. However, it seems that the original TI990 was intended as a machine with the programming convenience of a PDP11 but the cost structure of a Nova, and it used a relatively straightforward datapath design.

 

As to the third point, that RAM and registers would be equally affected: yes, that is exactly what Daren Appelt's thinking was.

 

There is one area where the equal-speed argument was wrong, even in a 1973 context. Hardware registers can be dual or triple ported, i.e. one register can be read simultaneously with another being read or written (think 74170 or 74172 TTL). In a more complex (i.e. expensive) data path design this can substantially reduce the number of micro-cycles needed to complete an instruction. This advantage cannot be achieved with the registers in core or semiconductor memory. The later 990/12 brings the registers into the CPU, acting as a cache, and exploits this opportunity (see attached paper).
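To make the micro-cycle argument concrete, here is a toy cycle count for a register-to-register add (assuming one micro-cycle per access; these are illustrative numbers, not measured 990 timings):

```python
# Toy micro-cycle count for "ADD Rs,Rd" under two register designs.
# Assumption: one micro-cycle per register access; illustrative only.

def add_cycles_memory_registers():
    # Registers live in (single-ported) memory, so the three accesses
    # must happen strictly one after another.
    return 1 + 1 + 1  # read Rs, then read Rd, then write Rd

def add_cycles_dual_ported_registers():
    # A dual-ported register file (think 74170/74172) can deliver
    # Rs and Rd in the same micro-cycle.
    return 1 + 1  # read Rs and Rd in parallel, then write Rd

assert add_cycles_memory_registers() == 3
assert add_cycles_dual_ported_registers() == 2
```

Even with memory exactly as fast as registers per access, the single port serializes the accesses, which is why the ported register file wins.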

 

You may want to read about the PISC1 demo CPU (or even build one): it is very simple (22 TTL chips) and demonstrates all these design trade offs:

http://www.bradrodri...rs/piscedu2.htm

 

 

Attached Files



#86 kl99 OFFLINE  

kl99

    Dragonstomper

  • 676 posts
  • Location:Vienna, Austria

Posted Mon Sep 25, 2017 6:29 AM

 

Thanks for sharing the article.



#87 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,734 posts
  • Location:Flyover State

Posted Mon Sep 25, 2017 8:55 AM

The read modify write has NOTHING to do with core.

The CPU has to read from RAM to know current contents, and writing back is just updating changes back to RAM.  
It also has to be done without something else accessing the RAM in between which could cause unexpected results.

If you don't modify a register, you don't have to write back to RAM.

If you look at other processors or microcontrollers, any time you have bit oriented instructions that work on memory, they also have to perform a read modify write.

Here is a clock cycle analysis of TMS9900 instructions; it backs up my claim.
Notice that no write corresponding to the reads of the opcodes exists.
http://www.unige.ch/...i99/tms9900.htm
 

I looked at the Wiki article for magnetic core memory.
Magnetic core memory started out at about $1 per bit!!!

"And how much memory would you like with this computer?"

"I'd like 1K please."
"Okay, 1024 x 12-bit words x $1 per bit = $12,288"
"Um... I'd like that to be 100 instead."

If that were when magnetic core memory was invented in the late 40s and were adjusted to today's dollars... it's over 9 times that. That's over $100,000 for 1K words!
And that would have been cheaper than the tubes it replaced.
Prices supposedly dropped to about 1 cent per bit in the end, but that would only have been after the intro of DRAM and SRAM, which are smaller and easier to deal with.
I'm sure this is part of the reason CPUs used 12 bit words instead of 16 at the time.
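For what it's worth, the arithmetic checks out like this (assuming the $1/bit figure and a rough 9x inflation factor, both taken from the discussion above):

```python
# Sanity check of the core-memory cost estimate.
# Assumed figures: $1 per bit in the late 1940s, roughly 9x
# inflation to today's dollars.

bits = 1024 * 12        # 1K of 12-bit words
cost_then = bits * 1    # dollars, at $1 per bit
cost_now = cost_then * 9  # rough inflation adjustment

assert cost_then == 12288   # about $12k in period dollars
assert cost_now == 110592   # on the order of $100k today
```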

The more I look at computer history, the more I come to the conclusion that from a technology and price standpoint, personal computers arrived at almost the earliest time they could have.
 



#88 JamesD OFFLINE  

JamesD

    Quadrunner

  • 7,734 posts
  • Location:Flyover State

Posted Mon Sep 25, 2017 1:18 PM

BTW, both DRAM (Intel) and SRAM (Fairchild?) were introduced in 1970.





