TMS-9900 CP/M?

JamesD · September 5, 2017

Suit yourself. I'm just saying that I don't know who came to that conclusion originally. I've just seen it repeated in many cases.

I've seen it repeated a lot... and I've never seen a reference to who originally said it.

That doesn't make it true.

apersson850 · September 11, 2017

Yes, looking at the TMS 9900 timing diagram, one can see that it uses two clock cycles to access memory, even when there's no wait state involved. It seems that it should be possible to access an internal register faster than that.

Does anyone have any similar performance data for the TI 990/9 CPU?

pnr · September 16, 2017

Can anyone even cite who originally said this?

I think I can answer this: Daren Appelt, the engineer that designed the TI960, TI980 and TI990 mini computers.

First of all, think of the context of the time: most of the 16-bit mini's were designed in the late 60's using the then new TTL logic chips and core memory.

The CPU's were split over 5-20 boards in a rack, and typically clocked at between 3 and 10 MHz (the maximum speed that such a context would allow), leading to, say, a typical data path cycle time of 200-350 ns. The core memory had a typical cycle time of 800 to 1200 ns. Often the designs would use an analog delay line to measure out the access time of the core memory and stall the CPU as it waited for data from memory. The PDP-11, the Nova, the HP2100, the 316, etc. all shared these characteristics. Things like caches were too expensive to implement and did not appear on this class of computer until the mid/late 70's.

In the 1960's having registers in memory was not unusual in mini computers, due to the cost of building registers from individual flip-flops: adding 16 registers of 16 bits each would have taken an entire board full of 7475 TTL chips. For example, the PDP-8 (which pre-dates the TTL era) had its registers in memory. Often only a few H/W registers existed, augmented by a bank of easily addressed memory locations (e.g. on the Nova; the 6502 "page 0" is an echo of this).

Then came semiconductor memory in the early 70's (and Daren had some involvement in that within TI, holding a few patents etc.). This soon had the potential to build main memory with a cycle time of 250-500 ns, the same as the cycle time of the 5-20 board CPU's of the era. Now, suddenly, it became possible to have registers in memory and not pay a hefty price in performance. In this context, the maxim "memory is as fast as CPU registers" was true, and the 990 was proposed and designed in 1973.
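To put rough numbers on that, here is a small sketch using only the cycle-time ranges quoted above. The function name and the idea of comparing best and worst cases are my own framing, not something from a TI document.

```python
# Cycle-time ranges in nanoseconds, taken from the figures quoted in this post.
CPU_CYCLE_NS = (200, 350)     # multi-board TTL data path
CORE_CYCLE_NS = (800, 1200)   # core memory
SEMI_CYCLE_NS = (250, 500)    # early-70s semiconductor memory


def slowdown_range(mem_ns, cpu_ns):
    """Return (best, worst) ratio of memory cycle time to CPU cycle time."""
    best = min(mem_ns) / max(cpu_ns)    # fastest memory against slowest CPU
    worst = max(mem_ns) / min(cpu_ns)   # slowest memory against fastest CPU
    return best, worst


core = slowdown_range(CORE_CYCLE_NS, CPU_CYCLE_NS)
semi = slowdown_range(SEMI_CYCLE_NS, CPU_CYCLE_NS)
print(f"core memory:   {core[0]:.1f}x to {core[1]:.1f}x the CPU cycle")
print(f"semiconductor: {semi[0]:.1f}x to {semi[1]:.1f}x the CPU cycle")
```

With these ranges core memory is always more than twice as slow as the data path, while the semiconductor range straddles 1x, which is the window in which registers in memory cost little.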

The novelty in the TI990 design was the idea of a workspace pointer and placing the register bank anywhere in memory. This was mostly useful in the context of the main production computer languages of the era, Cobol and Fortran. In these languages there was no recursion and each subroutine could be allocated its own block of registers. This was an optimisation over what happened on e.g. an IBM360, where each subroutine started with an instruction to save the registers to a block of memory, and ended with an instruction to reload them from that memory block. The TI990 was successful as a Cobol office machine, with some 100,000 units sold (incl. the desktop System 200/300/etc.).
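A minimal sketch of the workspace idea, for those who have not programmed the 9900. The class `Tiny990` and its method names are made up for illustration; the behaviour follows the BLWP/RTWP instructions as I understand them (the old WP, PC and status end up in R13-R15 of the new workspace), so treat it as a toy model and not a description of the 990 microcode.

```python
class Tiny990:
    """Toy model: 16 registers live in memory at the workspace pointer (WP)."""

    def __init__(self, size_words=32768):
        self.mem = [0] * size_words   # one list entry per 16-bit word
        self.wp = 0                   # workspace pointer (byte address)
        self.pc = 0                   # program counter (byte address)
        self.st = 0                   # status register

    def reg_addr(self, n):
        return self.wp + 2 * n        # Rn is simply the word at WP + 2n

    def get_reg(self, n):
        return self.mem[self.reg_addr(n) // 2]

    def set_reg(self, n, value):
        self.mem[self.reg_addr(n) // 2] = value & 0xFFFF

    def blwp(self, vector):
        """Call with a fresh register block: vector holds new WP, then new PC."""
        old_wp, old_pc, old_st = self.wp, self.pc, self.st
        self.wp = self.mem[vector // 2]
        self.pc = self.mem[vector // 2 + 1]
        self.set_reg(13, old_wp)      # return context lands in the NEW workspace
        self.set_reg(14, old_pc)
        self.set_reg(15, old_st)

    def rtwp(self):
        """Return: restore the caller's status, PC and WP from R15, R14, R13."""
        self.st = self.get_reg(15)
        self.pc = self.get_reg(14)
        self.wp = self.get_reg(13)    # restore WP last: it moves the registers


cpu = Tiny990()
cpu.wp, cpu.pc = 0x0100, 0x2000
cpu.set_reg(1, 42)                    # caller's R1
cpu.mem[0x0040 // 2] = 0x0200         # vector: callee workspace
cpu.mem[0x0040 // 2 + 1] = 0x3000     # vector: callee entry point
cpu.blwp(0x0040)
cpu.set_reg(1, 7)                     # callee's R1 is a different memory word
cpu.rtwp()
print(cpu.get_reg(1))                 # caller's R1 was never touched
```

No register is copied on the call: switching subroutines only changes which block of memory the register numbers refer to.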

Daren was also the person who noticed that the TI990 CPU could fit on a single chip with the then current technology, and it came to be in 1975. This was also the undoing of the design: with the CPU shrunk from several boards in a rack to a single chip, adding CPU-based registers became almost zero cost and again had a speed advantage. The TI990 design hence had a very short window in computer history where it made sense.

In the 80's several designs were tried that had multiple register banks integrated in the CPU, notably in the RISC world (e.g. the Sun Sparc). As time progressed, it turned out that having a smallish block of registers at the traditional instruction set level, an optimising compiler and microcode with advanced pipeline techniques (register renaming etc.) is the best route to performance for our current software base and it squeezed out all other designs.

JamesD · September 17, 2017

Well, I'm pretty sure that the speed of flip flops increased at the same rate as SRAM since SRAM is made with flip flops.
DRAM uses a transistor and a capacitor, so... maybe faster... but I have my doubts due to the required refresh.
You would have a slightly faster access time only to be replaced with regular big waits.
So I think this is still a board size/cost rather than speed issue.

+mizapf · September 17, 2017

SRAM is typically much faster than DRAM; in particular, you can find SRAM in cache memory. DRAM requires a periodic refresh; it is much simpler to build (1T1C cell = one transistor, one capacitor), but you have to actually charge/discharge the capacitor of the cell.

JamesD · September 17, 2017

My thought as well, but I don't know what things were like in the late 60s.

I think DRAM would still be slower but...I have nothing to base that on from that time period.

+Ksarul · September 18, 2017

Note that core memory was still in heavy use until about 1975 or so, as the first widely successful DRAM chips didn't show up until 1973. The earliest Intel DRAM chips came in 1970, but they had a lot of issues at first. That said, a lot of design decisions made in the early to mid 1970s still reflected an environment with core. Hard core military and science applications still used a lot of it even into the mid 1980s (Challenger's computers used core memory until the accident that destroyed it in 1986). It was a time of technology transition, so a lot of what was going on then didn't have a long shelf life, engineering-wise.

apersson850 · September 21, 2017

What pnr writes above makes sense all the way. And even if flip-flops got faster too, the speed limitation came from the CPU being spread across a number of circuit boards, so that didn't really help.

Then the TMS 9900 just put the same design on one chip, without taking into account that such an integration kind of invalidated the whole idea.

Take a look here for a patent application by TI in 1975. Mr. Appelt didn't file it, but he's referenced, as he had applied for a patent for an asynchronous bus communication system, later known as TILINE.

JamesD · September 21, 2017

Actually, I don't think it does.

1. It does not account for the read-modify-write requirement that guarantees RAM has to be slower.

2. The 990/10 ran at 4.5MHz. You shouldn't have bus speed issues from board to board until much higher speeds.

3. If there were speed issues across boards, that would also impact accessing RAM which would be... on other boards.

pnr · September 25, 2017

I think read-modify-write has a back story of its own, although somewhat related. There are three drivers for this:

1. In TI990/PDP11 style machines ("2-operand machines") the MOV instruction is typically in a group with arithmetic instructions (add, sub, and, or, etc.). The arithmetic instructions require the 2nd operand to be fetched, modified and written. To keep the microcode simple MOV followed the same set of steps, even though the fetch was redundant.

2. Core memory has destructive read (read erases the word) and requires write-back. Write could only set bits (and not clear them), so for writing a word it was first erased. As core memory dictated a read-write cycle anyway, making MOV a read-modify-write operation was not an additional cost. High performance machines used interleaved core memory banks, so that the next word could be read, whilst the previous one was being rewritten.

3. The TILINE bus could only operate on full words (unlike DEC's unibus that had signals to write a half-word). With the mini computer industry's thinking moving from word addressed machines to byte addressed machines around 1970 this was arguably an understandable mistake. As a result, byte instructions on a TI990 needed read before write in any case (and as under 1. above, byte instructions shared microcode with their word counterparts).

All in all, in the early 70's designing machines with a fixed read-modify-write cycle was an understandable and common performance vs. cost trade off.
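Points 2 and 3 can be sketched as a toy model. Everything here (`CoreMemory`, `store_byte`, the cycle counter) is a made-up illustration and not how any particular controller was built; the byte order assumes the 990 convention that the even address is the high byte of the word.

```python
class CoreMemory:
    """Toy core store: reading a word erases it, so the controller rewrites it."""

    def __init__(self, size_words):
        self.words = [0] * size_words
        self.cycles = 0                 # full read/write-back cycles performed

    def cycle(self, addr, modify=None):
        """One core cycle: destructive read, then write back (maybe modified)."""
        value = self.words[addr]
        self.words[addr] = 0            # the read has erased the cores
        new_value = value if modify is None else modify(value) & 0xFFFF
        self.words[addr] = new_value    # write half of the cycle sets bits again
        self.cycles += 1
        return value


def store_byte(mem, byte_addr, byte):
    """Byte store over a word-only bus: merge into the word, write whole word."""
    def merge(word):
        if byte_addr & 1:               # odd address: low byte of the word
            return (word & 0xFF00) | (byte & 0xFF)
        return (word & 0x00FF) | ((byte & 0xFF) << 8)
    mem.cycle(byte_addr >> 1, merge)


mem = CoreMemory(16)
mem.cycle(3, lambda _old: 0x1234)       # word store (MOV): the read is redundant
store_byte(mem, 6, 0xAB)                # byte store: the read is needed
print(hex(mem.cycle(3)), mem.cycles)
```

In this model a plain read, a word store and a byte store each cost exactly one memory cycle, which is why a fixed read-modify-write sequence was not seen as wasteful with core.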

Running much faster than 5MHz was hard. Yes, the individual TTL components can be clocked at 20+ MHz, but a more complicated system could not. Remember that these CPU's would draw several amps of current, making the environment 'noisy'. For a signal to propagate and settle from a register latch, through a multiplexer, through a 74181 ALU, through a shifter, through a buffer and back to a register latch (the "datapath") would easily take 100ns. For the microinstruction to be decoded and control signals to settle would also take some 100ns; in a high end design the two things could be somewhat overlapped. Even a high end mini (like the 1973 PDP11/45) would have a cycle time of about 120ns. However, it seems that the original TI990 was intended as a machine with the programming convenience of a PDP11 but with the cost structure of a Nova and it used a relatively straightforward datapath design.

As to the third point, that RAM and registers would be equally affected: yes, that is exactly what Daren Appelt's thinking was.

There is one area where the equal speed argument was wrong, even in a 1973 context. Hardware registers can be dual or triple ported, i.e. one register could be read simultaneously with another being read or written (think 74170 or 74172 TTL). In a more complex (i.e. expensive) data path design this can substantially reduce the number of micro-cycles needed to complete an instruction. This advantage cannot be achieved with the registers in core or semiconductor memory. The later 990/12 brings the registers into the CPU, acting as a cache, and exploits this opportunity (see attached paper).
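As a back-of-the-envelope illustration of the port argument, here is a deliberately crude counting model. The function `micro_cycles` and its parameters are hypothetical; real micro-cycle counts depend on the whole data path and are not taken from any 990 documentation.

```python
from math import ceil


def micro_cycles(reads, writes, read_ports=1, write_ports=1, shared_port=False):
    """Toy count of register-access micro-cycles for one instruction.

    shared_port=True models registers held in ordinary memory: a single port
    carries every read and every write, one access per cycle.
    """
    if shared_port:
        return reads + writes
    return ceil(reads / read_ports) + ceil(writes / write_ports)


# Two-operand add: read two registers, write one result back.
in_memory = micro_cycles(reads=2, writes=1, shared_port=True)
dual_read = micro_cycles(reads=2, writes=1, read_ports=2)
print(in_memory, dual_read)
```

The model ignores overlapping the write-back with the next instruction's reads, which a multi-ported register file would also allow and which would widen the gap further.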

You may want to read about the PISC1 demo CPU (or even build one): it is very simple (22 TTL chips) and demonstrates all these design trade offs:

http://www.bradrodriguez.com/papers/piscedu2.htm

990_12_article.pdf

kl99 · September 25, 2017

thanks for sharing the article.

JamesD · September 25, 2017

The read modify write has NOTHING to do with core.

The CPU has to read from RAM to know current contents, and writing back is just updating changes back to RAM.
It also has to be done without something else accessing the RAM in between which could cause unexpected results.

If you don't modify a register, you don't have to write back to RAM.

If you look at other processors or microcontrollers, any time you have bit oriented instructions that work on memory, they also have to perform a read modify write.

Here is a clock cycle analysis of TMS9900 instructions, it backs up my claim.
Notice that no write corresponding to the reads of the opcodes exists.
http://www.unige.ch/medecine/nouspikel/ti99/tms9900.htm

I looked at the Wiki article for magnetic core memory.
Magnetic core memory started out at about $1 per bit!!!

"And how much memory would you like with this computer?"

"I'd like 1K please."
"Okay, 1024 x 12 bit words x $1 = $12,288"
"Um... I'd like that to be 100 instead."

If that were when magnetic core memory was invented in the late 40s and were adjusted to today's dollars... it's over 9 times that. It's over $110,000 for 1K words!
And that would have been cheaper than the tubes it replaced.
Prices supposedly dropped to about 1 cent per bit in the end, but that would have only been after the intro of DRAM and SRAM which are smaller and easier to deal with.
I'm sure this is part of the reason CPUs used 12 bit words instead of 16 at the time.

The more I look at computer history, the more I come to the conclusion that from a technology and price standpoint, personal computers arrived at almost the earliest time they could have.

JamesD · September 25, 2017

BTW, both DRAM (Intel) and SRAM (Fairchild?) were introduced in 1970.

+Vorticon · May 15, 2021

Having a little fun?

+TheBF · May 16, 2021

So help me get this.

Is CPM running on the PI and the 99 is like a terminal?

wolhess · May 16, 2021

Super cool,

your cpm emulation is another tipi application for the TI-99/4a.

Now we have access to a lot of text based applications from the cpm world.

Maybe there is a MSDOS emulator available for the pi too?

RickyDean · May 16, 2021

2 hours ago, TheBF said:

So help me get this.

Is CPM running on the PI and the 99 is like a terminal?

That is pretty much it. About the same as the two cpm cards produced for the ti99.

+TheBF · May 16, 2021

53 minutes ago, wolhess said:

Maybe there is a MSDOS emulator available for the pi too?

Looks like it

Play Classic Games using DOSBox on the Raspberry Pi - Pi My Life Up

+Vorticon · May 17, 2021

Not much different than the Z80 card for the Apple 2, albeit here the Z80 is being emulated by the Rpi.

Tursi · May 17, 2021

17 hours ago, wolhess said:

Super cool,

your cpm emulation is another tipi application for the TI-99/4a.

Now we have access to a lot of text based applications from the cpm world.

Maybe there is a MSDOS emulator available for the pi too?

Hmm. DOSBox should run on it, but it'll use the video display. So we just need to add some code to copy the DOS window over to the TI - custom terminal! Probably want F18A so you can have 80 column text.

("just", I say... )

+Vorticon · May 17, 2021

20 hours ago, wolhess said:

Super cool,

your cpm emulation is another tipi application for the TI-99/4a.

Now we have access to a lot of text based applications from the cpm world.

Maybe there is a MSDOS emulator available for the pi too?

Unfortunately the TIPI telnet implementation is not complete, so some applications have some issues with the display. Wordstar is one that comes to mind.

+jedimatt42 · May 17, 2021

I was under the impression vt100 was not the de facto terminal type used on CP/M systems. Thus there are pages like this out there:

http://canal.chez.com/CPM/ws3.htm

Or maybe the WordStar used was already configured correctly like one of these:

https://schorn.ch/altair_7.php

----

Are you using this emulator?

https://github.com/MockbaTheBorg/RunCPM

I would be interested in providing a purpose built 4A client, and configuration. And maybe, if it makes any sense to do so, some file interoperability.

JB · May 18, 2021

19 hours ago, jedimatt42 said:

I was under the impression vt100 was not the de facto terminal type used on CP/M systems. Thus there are pages like this out there:

http://canal.chez.com/CPM/ws3.htm

.

"VT100 up to 5 colors"?

That's a bit impressive, given the VT100's a B/W terminal.

+jedimatt42 · May 18, 2021

4 hours ago, JB said:

"VT100 up to 5 colors"?

That's a bit impressive, given the VT100's a B/W terminal.

Yep, vt100 compatibility with differentiating features must have been all the rage for a period in the terminal market.

+Vorticon · May 18, 2021

On 5/17/2021 at 9:30 AM, jedimatt42 said:

I was under the impression vt100 was not the de facto terminal type used on CP/M systems. Thus there are pages like this out there:

http://canal.chez.com/CPM/ws3.htm

Or maybe the WordStar used was already configured correctly like one of these:

https://schorn.ch/altair_7.php

----

Are you using this emulator?

https://github.com/MockbaTheBorg/RunCPM

I would be interested in providing a purpose built 4A client, and configuration. And maybe, if it makes any sense to do so, some file interoperability.

Not really. In the heyday of CP/M, each machine had its own terminal settings, which is why one needs to install most CP/M applications to match the machine running them.
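To make that concrete, here is a small sketch of the same "move the cursor" request on two common terminal families. The VT100 and ADM-3A sequences are the standard ones as far as I know; the `TERMINALS` table is a hypothetical stand-in for what an install program patches into the application, not WordStar's actual patch format.

```python
ESC = "\x1b"


def vt100_goto(row, col):
    """ANSI/VT100 cursor position: ESC [ row ; col H, both 1-based decimal."""
    return f"{ESC}[{row + 1};{col + 1}H"


def adm3a_goto(row, col):
    """Lear Siegler ADM-3A style: ESC = then row and col as single chars + 32."""
    return f"{ESC}={chr(row + 32)}{chr(col + 32)}"


# Hypothetical install table: the application is patched with one of these.
TERMINALS = {"vt100": vt100_goto, "adm3a": adm3a_goto}

for name, goto in TERMINALS.items():
    print(name, repr(goto(0, 0)))   # same request, different bytes on the wire
```

An application installed for one family sends bytes the other will not understand, and a terminal emulation that implements only part of the chosen family shows up as exactly the kind of display glitches described here.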

WordStar runs fine under VT100, but there are issues when editing text with things like the first line disappearing or scrolling problems, which I am assuming are related to an incomplete emulation by TIPI.

And yes, that is the emulator I am using.

A specific 4A client would be totally awesome!
