Posts posted by pnr

  1. Let's take a look at the remaining entry point, for opcodes in the >03xx range.

     

    It starts thus:

    ; Start of table entry 0806 (opcodes 03xx)
    ; Only >0301 (CR) and >0302 (MM) are valid on a 99110
    ;
    0B1C 0285   CI   R5, >0302       ; CR or MM opcode?
    0B1E 0302
    0B20 155F   JGT  >0BE0           ; no: test extension & exit
    0B22 1301   JEQ  >0B26           ; for CR clear R5 (as a flag)
    0B24 04C5   CLR  R5
    0B26 024F   ANDI R15, >07FF      ; clear status bits
    0B28 07FF
    0B2A C0BE   MOV  *R14+, R2       ; fetch second opcode word
    0B2C 0206   LI   R6, >0004       ; four byte operands
    0B2E 0004
    

    It only accepts CR and MM; all other opcodes in the group are passed on to the extension test. Later on we need an easy test for CR versus MM, so R5 is cleared for CR to serve as a flag. The status bits ST0-ST4 are cleared, as we saw with the 0Cxx opcodes. Then the second opcode word is fetched and R6 is preloaded with the auto-increment constant.

     

    Next it prepares the source operand:

    0B30 C042   MOV  R2, R1          ; extract src bits
    0B32 0241   ANDI R1, >003F
    0B34 003F
    0B36 0101   EVAD R1              ; calculate src address
    0B38 1601   JNE  >0B3C           ; if Ts = 3, autoincrement src ptr
    0B3A A686   A    R6, *R10 

    This uses the EVAD instruction, which is discussed in data sheet section 7.3.3.5. This instruction takes a 6 bit operand field and calculates the actual address of the operand. If the modifier bits signify *Rn+ the EQ bit is set (for a source operand) and a pointer to Rn is loaded in R10. Because we are dealing with 32 bit operands the register is auto-incremented by 4 bytes.
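
    To make the field extraction concrete, here is a minimal Python sketch (a model, not the ROM's code). It assumes the usual 9900-style layout of a 6-bit operand field: two mode bits (T) followed by four register bits. The function names are made up for illustration.

```python
def split_operand_fields(word2):
    """Return ((Ts, S), (Td, D)) from the second opcode word of CR/MM."""
    src = word2 & 0x003F             # same mask as ANDI R1, >003F
    dst = (word2 & 0x0FC0) >> 6      # same mask as ANDI R2, >0FC0
    return (src >> 4, src & 0xF), (dst >> 4, dst & 0xF)


def is_autoincrement(t):
    """T = 3 selects the *Rn+ addressing mode."""
    return t == 3


# Example: source *R1+ (Ts=3, S=1), destination *R2 (Td=1, D=2)
(ts, s), (td, d) = split_operand_fields(0x04B1)
print(ts, s, td, d, is_autoincrement(ts), is_autoincrement(td))
```

    In the macro code only the T = 3 case needs extra work, because EVAD computes the address but leaves the register increment to the macro code.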

     

    It proceeds with preparing the destination operand:

    0B3C C008   MOV  R8, R0          ; save source address during 2nd EVAD
    0B3E 0242   ANDI R2, >0FC0       ; extract dst bits
    0B40 0FC0
    0B42 0102   EVAD R2
    0B44 1B04   JH   >0B4E           ; if Td = 3, autoincrement dst ptr
    0B46 C145   MOV  R5, R5          ; for MM, increment is 8
    0B48 1301   JEQ  >0B4C
    0B4A 0A16   SLA  R6, 1
    0B4C A646   A    R6, *R9 

    This code is pretty much the same. For auto-increment in the destination field the A> status flag (ST0) is set and the associated pointer is in R9. Because MM has a 64 bit result, the auto-increment is upped to 8 bytes. I'm not sure why the code uses two calls on EVAD, as this instruction can do both src and dst at the same time. If anybody sees a good reason for this, please post your observations.
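
    The increment rule for both operands can be summarised in a few lines of Python. This is only a sketch of what the listing does, with a made-up function name; it assumes Ts/Td = 3 means *Rn+ as usual.

```python
def autoincrement_bytes(is_mm, ts, td):
    """Return (source increment, destination increment) in bytes."""
    src_inc = 4 if ts == 3 else 0                 # A R6, *R10 with R6 = 4
    if td == 3:
        dst_inc = 8 if is_mm else 4               # SLA R6, 1 doubles it for MM
    else:
        dst_inc = 0
    return src_inc, dst_inc


print(autoincrement_bytes(False, 3, 3))   # CR with both operands *Rn+
print(autoincrement_bytes(True, 3, 3))    # MM with both operands *Rn+
```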

     

    With the operand access prepared the code moves on to actually fetch the operands:

    0B4E C085   MOV  R5, R2          ; move opcode to R2
    0B50 C200   MOV  R0, R8          ; restore source address
    0B52 C038   MOV  *R8+, R0        ; fetch S to R0,R1 and D to R4,R5
    0B54 C058   MOV  *R8, R1
    0B56 C117   MOV  *R7, R4
    0B58 C167   MOV  @2(R7), R5      
    0B5A 0002 

    The source operand is fetched into R0,R1 and the destination into R4,R5. As we will see later, these registers are chosen for good reason. It does mean that we have to move the (first word of the) opcode out of the way to R2.

     

    The choice to move R0 back to R8 is significant. Indirect addressing via R7/R8 makes the CPU generate special bus status codes: a WS cycle if Td/Ts was zero during EVAD, and a DOP/SOP cycle if Td/Ts was not zero (see section 7.3.3.4). This way, external hardware cannot tell whether an instruction is a macro instruction or implemented in microcode.

     

    The data sheet is a bit vague, but there seems to be a mechanism that makes this work when two EVAD instructions are used as well.

     

    Finally we get to execution:

    0B5C 0706   SETO R6              ; set the MM / CR flag
    0B5E C082   MOV  R2, R2          ; was opcode CR?
    0B60 1604   JNE  >0B6A
    0B62 0224   AI   R4, >8000       ; change sign of D
    0B64 8000
    0B66 0460   B    @>0830          ; perform CR = S+(-D) without store
    0B68 0830
    0B6A 0460   B    @>0900          ; perform MM
    0B6C 0900 

    The execution path of CR is partly shared with AR and that of MM with MR. Hence a flag (R6) is set to keep track of which paths to follow.

     

    CR is evaluated by calculating S+(-D), and suppressing storage of the result - just the status bits are set.
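
    The sign change with AI R4, >8000 works because adding >8000 modulo 2^16 simply toggles the top bit of the word. A small Python check of that arithmetic (the function name is mine, and the sketch only looks at the first word of D, where the listing flips the sign):

```python
def flip_sign_word(high_word):
    """Model AI R4, >8000: 16-bit add that toggles the sign bit."""
    return (high_word + 0x8000) & 0xFFFF


# Adding >8000 and XOR-ing with >8000 agree for every 16-bit value.
assert all(flip_sign_word(w) == (w ^ 0x8000) for w in range(0x10000))
print(hex(flip_sign_word(0x4110)), hex(flip_sign_word(0xC110)))
```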

  2. So did I understand correctly that the 990/12 implemented these floating point instructions natively in hardware?

     

    The answer depends on what you mean by native.

     

    Yes: the 990/12 had microcode for all these instructions, and also for the double precision variants (AD, SD, MD, etc.).

     

    No: the 990/12 did not have specialized data paths to support floating point, and the microcode calculated the results using the normal 16 bit data path. Simply put, the microcode did the same operations as the macro code on a 99110. Of course, not having to fetch opcodes etc. it runs faster in microcode.

     

    One could say this is "low end native". An example of high end native would be an FPU co-processor as existed for the PDP11:

    http://www.psych.usyd.edu.au/pdp-11/11_34_fpp.html

    As far as I know there never was such a specialized FPU for the TI990. The co-processor interface on the 99xxx CPU suggests that TI was thinking about it at the time. Had the series been more successful we could have seen a 99-series FPU chip, perhaps with capabilities like the Intel 8231:

    http://www.cpu-galaxy.at/cpu/ram%20rom%20eprom/other_intel_chips/other_intel-Dateien/8231A_datasheet.pdf

  3. I'm not a fan of hardware supported virtual memory or kernel vs. user memory in older retro systems like this. Resources are limited enough already, and making something that works with existing hardware and/or software is typically a priority.

     

    I agree with the sentiment, but it also depends on one's retro reference. My retro interest is in building small boards that work like the 16-bit minis of the late 70's: the TI990, the PDP11 and the Nova/Eclipse. And actually, an early 80's home machine like the Cortex is not all that different (neither is the Geneve). My software reference is mostly early Unix, but also the Marinchip stuff:

    http://www.stuartconner.me.uk/mini_cortex/mini_cortex.htm

    https://www.fourmilab.ch/documents/marinchip/

     

    For my next project I'm thinking about a 99000 based board with an MMU modeled on that of the Nova/Eclipse (essentially a fancy mapper).

     

    But I'm also interested to hear about other concepts from back-in-the-day and other people's ideas for today's retro projects.

     

     

    Other 8-bit systems like the MSX have already solved this for a 64K address space CPU, so how does the memory-mapper work in that system? Maybe the best thing would be to copy something that is proven, working, and supported in other systems.

     

    Did a bit of Googling. MSX2 seems to have used a very simple mapping scheme with four 16KB blocks paged into up to 4MB ram (i.e. 8 bit page address). Many machines seem to have implemented a 4 bit page address, using two 74LS170 chips to implement the mapper. The Cortex and Geneve mappers appear more capable.
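
    As I understand the scheme from that quick reading, the translation amounts to the following Python sketch. The names are invented for illustration and this is not taken from any MSX documentation: four page registers, one per 16KB block, each holding an 8 bit page number.

```python
PAGE_SIZE = 16 * 1024            # 16KB blocks


def mapper_translate(page_regs, logical):
    """Translate a 16-bit logical address via four 8-bit page registers."""
    block = (logical >> 14) & 0x3            # top two address bits pick the register
    page = page_regs[block] & 0xFF           # 8 bit page address
    return page * PAGE_SIZE + (logical & 0x3FFF)


regs = [3, 2, 1, 0]
print(hex(mapper_translate(regs, 0x4001)))
```

    With 256 pages of 16KB each, the highest reachable physical address is 4MB - 1, which matches the 4MB figure above.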

     

     

    Having support at the CPU level via enhanced instructions is key to making a memory-mapper usable, i.e. jump or branch instructions that know how to reach a destination address longer than 64K. Otherwise having to deal with managing the pages manually becomes a PITA (although no more of a hassle than dealing with a bank-switched cart). But to deal with that you are now looking at a situation of where to store the extra address bits, and how does the memory appear to the software. Intel dealt with this by using segments on the 8088/8086.

     

    Yes. This is what makes the 99000 so interesting: such instructions can be created in macrocode.

     

    One possible solution is the early Unix overlay system. Here the tool chain keeps track of what code lives on what overlay and automatically switches between overlays as functions get called and return. It is mostly transparent to the programmer. This is how Ultrix-11 can run a 150KB kernel on a 16 bit PDP11.

     

    Here's another idea that is specific to the 99000:

     

    - All code must live on word boundaries, i.e. all target addresses for B, BL and BLWP are even. Effectively, bit 15 is not used and wasted.

    - We could use that bit to hold an extra code address bit, above A0.

    - Assume a machine laid out for separate I/D spaces; the PSEL bit is another address bit in I space, but ignored in D space (i.e. I space is 128KB, and D space is 64KB)

     

    Now we can make three new macro instructions, "long" branches: LB, LBL and LBWP. Each of these behave as the normal branches, except that they look at bit 15 of the destination, set PSEL accordingly and then make the jump. So with these instructions we can branch anywhere in the 128KB code space. Total program space 128KB + 64KB = 192KB.
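
    A rough Python model of the address calculation such a long branch would do. This is purely a sketch of the idea: LB, LBL and LBWP are hypothetical instructions, and so are the function names. In TI bit numbering bit 15 is the least significant bit.

```python
def split_long_target(target):
    """Split a 16-bit branch target into (PSEL, even PC)."""
    psel = target & 0x0001        # bit 15 (LSB) carries the extra address bit
    pc = target & 0xFFFE          # the real PC is always even
    return psel, pc


def code_address(target):
    """17-bit I-space address, with PSEL as the bit above A0."""
    psel, pc = split_long_target(target)
    return (psel << 16) | pc


print(hex(code_address(0x1234)), hex(code_address(0x1235)))
```

    The same 16 bit word therefore reaches two different 64KB halves of I space depending on its lowest bit, which is where the 128KB code space comes from.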

  4. I've now analysed the entry code for group >0Cxx. This group has most of the floating point instructions, see the 99110 annex of the data sheet for details.

    It has revealed an undocumented opcode: 'XIT'

     

    The macro code begins thus:

    ; entry point for opcodes 0Cxx
    ;
    0B6E 0285   CI   R5, >0C3F        ; zero or one operand opcode?
    0B70 0C3F
    0B72 1506   JGT  >0B80            ; jump if one operand
    0B74 2560   CZC  @>0BD4, R5       ; valid zero operand instruction?
    0B76 0BD4
    0B78 161C   JNE  >0BB2            ; no: test for 'XIT'
    0B7A 0245   ANDI R5, >0006        ; if valid zero operand opcode,
    0B7C 0006                         
    0B7E 1011   JMP  >0BA2            ; go fetch FPAC & go to opcode routine
    ..
    0BD4 0039   DATA >0039            ; opcode bit test pattern
    

    The above is all straightforward.
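
    For those who want to see the bit test spelled out, here is a small Python sketch of it (my own model with an invented function name, not ROM code). CZC sets EQ when all bits selected by the mask are zero in R5, so with the >0039 pattern only >0C00, >0C02, >0C04 and >0C06 pass.

```python
ZERO_OPERAND_MASK = 0x0039       # DATA word at >0BD4


def zero_operand_index(opcode):
    """Return the branch table offset for a valid zero operand opcode, else None."""
    if opcode > 0x0C3F:                  # CI R5, >0C3F / JGT: one operand opcode
        return None
    if opcode & ZERO_OPERAND_MASK:       # CZC @>0BD4, R5 / JNE: not valid
        return None
    return opcode & 0x0006               # ANDI R5, >0006


for op in (0x0C00, 0x0C02, 0x0C04, 0x0C06, 0x0C0E):
    print(hex(op), zero_operand_index(op))
```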

     

    The next macro code section proceeds to handle the source operand:

    0B80 C2C5   MOV  R5, R11          ; handle one operand case
    0B82 0245   ANDI R5, >01FF        ; isolate <src> bits
    0B84 01FF
    0B86 0105   EVAD R5               ; calculate EA
    0B88 1609   JNE  >0B9C            ; Ts = 3 ?
    0B8A 024B   ANDI R11, >FFC0       ; mask out operand bits
    0B8C FFC0
    0B8E 028B   CI   R11, >0C80       ; opcode is CIR?
    0B90 0C80
    0B92 1602   JNE  >0B98            ; yes: autoincrement by 2, else by 4
    0B94 05DA   INCT *R10
    0B96 1002   JMP  >0B9C
    0B98 A6A0   A    @>0B2E, *R10     ; >0B2E contains 4
    0B9A 0B2E
    0B9C 0855   SRA  R5, 5            ; calculate switch index from opcode
    0B9E 0225   AI   R5, >0006
    0BA0 0006
    

    This macro code section uses the special macro opcode "EVAD" (evaluate address). This instruction is documented in section 7.3.3.5 of the data manual. EVAD analyzes the operand bits in the instruction and calculates the address of the source operand. This address is placed in R8. If the operand uses the *Rx+ format, a pointer to Rx is placed in R10 and the EQ status bit is set.

     

    As floating point numbers are 32 bit, i.e. 4 bytes, registers are auto-incremented by 4. The only exception is CIR, which has an integer word operand and increments by 2.

     

    Then the macro code proceeds with:

    0BA2 C01D   MOV  *R13, R0         ; fetch FPAC into local R0,R1
    0BA4 C06D   MOV  @2(R13), R1
    0BA6 0002
    0BA8 C08F   MOV  R15, R2          ; save status & clear ST0-ST4
    0BAA 024F   ANDI R15, >07FF       ; (status is saved to restore ST3-ST4 as needed)
    0BAC 07FF
    0BAE 0165   BIND @>0BBE(R5)       ; jump to specific opcode routine
    0BB0 0BBE
    
    ; branch table for 0Cxx group (first 4 are zero operand)
    ;
    0BBE 09CE   DATA >09CE            ; CRI
    0BC0 08E4   DATA >08E4            ; NEGR
    0BC2 09D2   DATA >09D2            ; CRE
    0BC4 0A86   DATA >0A86            ; CER
    0BC6 081E   DATA >081E            ; AR
    0BC8 0A80   DATA >0A80            ; CIR
    0BCA 0814   DATA >0814            ; SR
    0BCC 08F4   DATA >08F4            ; MR
    0BCE 0946   DATA >0946            ; DR
    0BD0 08D2   DATA >08D2            ; LR
    0BD2 08D8   DATA >08D8            ; STR
    

    This code fetches the floating point accumulator ("FPAC") from the user's R0,R1 and places this in our local R0,R1. Then it clears ST0-ST4 in the user's status register. The various instructions will set these bits as needed. The original status register is saved in R2 because some instructions only affect ST0-ST2 and must hence restore ST3-ST4.

     

    Then it uses the 99000 specific BIND (Branch Indirect) instruction to jump to a specific opcode handler routine via a jump table.
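
    The index calculation for the one operand opcodes can be checked with a short Python sketch. Some assumptions on my part: apart from CIR (>0C80, taken from the listing) the opcode values are assumed to follow the 990/12 encoding, R5 is assumed to be unchanged by EVAD, and the word fetch through the table is assumed to ignore the lowest address bit (operand bit 5 ends up there after the shift).

```python
TABLE_BASE = 0x0BBE
TABLE = {0x0BBE: "CRI", 0x0BC0: "NEGR", 0x0BC2: "CRE", 0x0BC4: "CER",
         0x0BC6: "AR", 0x0BC8: "CIR", 0x0BCA: "SR", 0x0BCC: "MR",
         0x0BCE: "DR", 0x0BD0: "LR", 0x0BD2: "STR"}


def table_entry(opcode):
    """Model ANDI R5, >01FF / SRA R5, 5 / AI R5, >0006 / BIND @>0BBE(R5)."""
    index = ((opcode & 0x01FF) >> 5) + 6
    address = (TABLE_BASE + index) & 0xFFFE   # assumption: word access ignores the LSB
    return TABLE[address]


for op in (0x0C40, 0x0C80, 0x0CC0, 0x0D00, 0x0D40, 0x0D80, 0x0DC0):
    print(hex(op), table_entry(op))
```

    The constant 6 skips over the first three of the four zero operand entries; the shifted opcode bits supply the rest of the offset.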

     

    That leaves the mysterious undocumented 'XIT'. It is actually rather boring:

    ; Test and implement XIT
    ;
    0BB2 C1C5   MOV  R5, R7           ; test for XIT (>0C0E and >0C0F)
    0BB4 0917   SRL  R7, 1            ;   XIT is a no-op
    0BB6 0287   CI   R7, >0607
    0BB8 0607
    0BBA 1612   JNE  >0BE0            ; no: test for extension & exit
    0BBC 0380   RTWP                  ; macro processing complete
    

    XIT is an instruction on the TI990/12 and is also a NOP there. It is used as part of floating point handling by the TI990 Fortran compiler, to create code that would run both on machines with (the /12) and without (the /10) native floating point (explanation courtesy of Dave Pitts).
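
    The shift-and-compare in the listing accepts exactly two opcode values. In Python terms (a one-line model with an invented name):

```python
def is_xit(opcode):
    """Model SRL R7, 1 / CI R7, >0607: true for >0C0E and >0C0F only."""
    return (opcode >> 1) == 0x0607


print([hex(op) for op in range(0x0C00, 0x0C40) if is_xit(op)])
```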

     

    The Fortran compiler would for example generate:

          BLWP    @F$RITP
          LR      *R9+
          AR      *R9
          STR     *R8
          XIT
    

    On a 990/10 the "F$RITP" routine is a floating point library that reads the instructions following the BLWP and emulates the floating point hardware. When it sees an XIT instruction it stops emulating and returns. Hence "exit interpreter" or XIT. On a 990/12 the "F$RITP" routine would be empty (i.e. do an RTWP immediately) and the 990/12 hardware would execute the floating point code natively. When it saw the XIT it would treat it as a NOP.

     

    It would seem that the 99110 implemented the "XIT" instruction for the exact same purpose.

     

  5. Argh ... they just didn't learn. It could not have been too hard to add a bus code for upper/lower byte transfer.

    How does it manage to do a MOVB in 4 clock cycles when it has to do a read first? The 9995 does not need it, as I said, since it addresses single bytes.

     

    ??

     

    99000:                        9995:
    1. read opcode                1. read opcode high
    2. read source word           2. read opcode low
    3. read destination word      3. read source byte
    4. write destination word     4. write destination byte

     

    Note that there is no visible ALU cycle as it overlaps with instruction pre-fetch. Both the 9995 and the 99000 use this trick.

     

    It is all documented in sections 10.6.3 and 10.6.4 of the datasheet. Actually, section 10.6.4 is so detailed that it is almost a listing of the microcode.

  6. I just saw that the 99000 has 16 data lines, but after a short look over the specs, I could not find information about byte handling. Does it use read-before-write like the 9900? The 9995 has the advantage that it uses an 8-bit data bus, which means it can change single bytes in memory without RBW.

     

    Yes, for MOVB it does read-before-write. For MOV, CLR, SETO, etc. it does not. MOVB R0,R1 on the 99000 executes in 4 clocks, same as 9995 (but the clock is up to twice as fast).

     

     

    This specification will make for an interesting read. Are the specifications online somewhere, and if so would you have a link?

     

    I found & read the specifications on WHTech:

    http://ftp.whtech.com//datasheets and manuals/99-8 Computer/TI-99_8 Mapper Specifications 03-23-1983.pdf

    http://ftp.whtech.com//datasheets and manuals/99-8 Computer/TI-99_8 The Mapper And Us 05-26-1982.pdf

     

    It has an interesting approach. Some design choices I find strange, but that is probably because I don't understand the 99/8 context very well.

     

    - It divides logical memory in sixteen 4KB blocks, with each block translated to a physical address through adding a 24-bit base value to the 12 lower logical address bits.

     

    - It has to multiplex the physical address bus because the MMU is a single chip device and the designers ran out of pins: even with the multiplexed bus it needs 64 pins. To me, it would have made sense to include the dynamic ram control on the device as well and turn the multiplexing into an advantage, but there probably was a good reason not to do this.

     

    - It adds a wait state during which the mapper can do its magic. That is a big cost, even taking into account that the dram would require a wait state of its own (33% slowdown). The wait state is probably necessary because (i) it uses full addition of the base register (rather than just replacing the top bits). The benefit of full addition is unclear to me, but it takes precious time to perform; (ii) it uses this state to output the top half of the physical address.

     

    - The MMU holds a single map with 16 entries, 4 bytes per entry = 64 bytes. Loading such a big map creates a significant cost to switching from one map to another and this MMU uses a solution I have never seen before in this form. There is a separate static RAM chip on the processor bus that holds 8 'images' of 64 bytes for the MMU. Upon writing a control byte to the MMU, the MMU requests the bus (using hold/holda) and transfers one of the images in or out of the MMU using DMA, speeding up the transfer significantly. Complex but certainly cool.

     

    - The protection bits work as you already mentioned. Interestingly, an illegal instruction fetch or memory read is not blocked, but proceeds until the interrupt is recognised. This leaves security holes, but the purpose was perhaps more to quickly interrupt crashed programs than to provide OS security. An illegal write is blocked though, allowing for shared read-only memory blocks. The extra MMU wait state helps here, because it creates ample time to prepare for blocking the /WE signal. If I understand the documents correctly, access to the static ram with mapper images and to the mapper control word is not subject to MMU write protection.
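
    To illustrate the address translation from the first point, here is a minimal Python sketch of how I read the specification. The function name is invented, and the wrap-around to 24 bits is my assumption, not something I checked in the documents.

```python
def map_address(base_regs, logical):
    """Translate a 16-bit logical address using sixteen 24-bit base values."""
    block = (logical >> 12) & 0xF          # one of sixteen 4KB blocks
    offset = logical & 0x0FFF              # 12 lower logical address bits
    # Full addition of the base, as opposed to just replacing the top bits.
    return (base_regs[block] + offset) & 0xFFFFFF   # assumption: wraps at 24 bits


bases = [0] * 16
bases[2] = 0x123456                        # base need not be 4KB aligned
print(hex(map_address(bases, 0x2ABC)))
```

    Because the base is added rather than concatenated, a block can start at any physical address, which is the extra flexibility that the adder (and the wait state) pays for.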

     

    I guess that on a 9995 all the protection stuff has limited effectiveness anyway: since there is no supervisor mode, all programs have the same full rights as the OS.

     

    However, it may be possible to add this to the 9995 (and 9900) with external hardware: a supervisor bit could be created in a CRU register (e.g. 74LS259). If the bit is set, the system goes to user mode, enabling mapping and enforcing protection bits. CRU operations are blocked in user mode. If there is a protection violation the hardware resets the CPU and the CRU register. Reset will abort the current instruction immediately (and actually save the user's WP, PC and ST in the reset workspace). A user program could call into supervisor mode by generating such a reset deliberately, for instance via the RSET instruction and some external hardware.

     

    Figuring out a usable, safe interrupt mechanism for such a setup is a challenge, though.

     

  7. which is the only part that we don't have logic diagrams for, so this is deduced from the specifications

    This specification will make for an interesting read. Are the specifications online somewhere, and if so would you have a link?

     

    There are no spare bits left in the command word to indicate such a mode.

     

    It is possible to use a two-word opcode, where the first shifts the CPU into another mode, a bit like it works on a Z80 or 8086.

     

    Something like this already happens in the 99000 instruction set for the 32 bit add, subtract and shift instructions (AM, SM, SRAM, SLAM, see page 73 of the data sheet).

     

    One could imagine one prefix that makes the data 32 bit instead of 16, and another one that makes the address 32 bit instead of 16. Without the prefix a 16 bit address would refer to its own segment. This is then pretty similar to 'near' and 'far' pointers on a 8086.

     

    Using macro code one could actually prototype a lot of this stuff on a 99000.

     

  8. <Re-posted from "99110 ROM disassembly" because it is off topic there>

     

    I suppose the TMS99105 still does not have what it would take to implement virtual memory, i.e. no support for page faults and generic restartable instructions?

    In short:

    - No, it does not have such support out of the box

    - But maybe it can. Because of the 'registers in RAM' architecture, I think it might be possible with the help of external hardware (also for the 9900 & 9995)

    But what is the point of demand paging when virtual memory space (64KB) is so much smaller than physical memory (say 1MB)? Maybe it only makes sense in the reverse situation. Also, does a TLB make sense when virtual memory is small?

    Next to address translation, the other purpose of a MMU is memory protection. How to implement that on a 99xx/99xxx is an interesting question too.

    As far as I saw, there is still a limit at 16 bit addresses, maybe a second bank, but not more. (I think there is a map bit.) It is a pity that the TMS architecture does not allow for more.

    Yes, from a non-kernel program viewpoint, space is limited to 16 bits. The 99000 has two kludges to make it somewhat 17-18 bit like.

    - Separation of instruction space and data space (not used on TI990 mini's). This was used with great success on PDP11 mini's and early Unix.

    - The PSEL bit. Most mini's of the era had two memory spaces (kernel/user) driven by the supervisor bit in the status register. TI separated the two functions into separate bits, but when using a 74LS612 mapper or the TI990 MMU this is not fully exploited and the two bits move in tandem. With some new macro instructions PSEL could be made more useful.

  9. This is a very interesting and broad topic. It will be fun to discuss, but let's use a separate thread, e.g. "Designing MMU's"

     

    Below are some short comments that can be used to kick off such a thread.

     

    I suppose the TMS99105 still does not have what it would take to implement virtual memory, i.e. no support for page faults and generic restartable instructions?

     

    In short:

    - No, it does not have such support out of the box

    - But maybe it can. Because of the 'registers in RAM' architecture, I think it might be possible with the help of external hardware (also for the 9900 & 9995)

     

    But what is the point of demand paging when virtual memory space (64KB) is so much smaller than physical memory (say 1MB)? Maybe it only makes sense in the reverse situation. Also, does a TLB make sense when virtual memory is small?

     

    Next to address translation, the other purpose of a MMU is memory protection. How to implement that on a 99xx/99xxx is an interesting question too.

     

    As far as I saw, there is still a limit at 16 bit addresses, maybe a second bank, but not more. (I think there is a map bit.) It is a pity that the TMS architecture does not allow for more.

     

    Yes, from a non-kernel program viewpoint, space is limited to 16 bits. The 99000 has two kludges to make it somewhat 17-18 bit like.

    - Separation of instruction space and data space (not used on TI990 mini's). This was used with great success on PDP11 mini's and early Unix.

    - The PSEL bit. Most mini's of the era had two memory spaces (kernel/user) driven by the supervisor bit in the status register. TI separated the two functions into separate bits, but when using a 74LS612 mapper or the TI990 MMU this is not fully exploited and the two bits move in tandem. With some new macro instructions PSEL could be made more useful.

     

    Let's continue in a new thread.

  10. I have at least one TMS99000 chip here at the house, on a strange little single-board machine that was given to me by a former TI employee in Germany. All I have is the fully-populated board--not the rest of the associated machine.

     

     

    Interesting. Could you post a picture of the board and of the chip markings?

  11. My 'new' 99105 arrived from China. It has the 99110 ROM :^)

     

    After reading out the ROM, I can confirm that it is byte-for-byte identical to the ROM in speccery's chip.

     

    That further supports the theory that - at least later in the chip's lifecycle - fully functional 99110 silicon was put in some of the packages marked 99105.

  12.  

    Is the PRIV bit negative logic? (Possibly makes sense, because the bit is set to 0 for TMS99xx, which would turn its programs to non-privileged.)

     

    Yes it is negative logic and confusingly named. I think USER would have been a better name:

     

    - When the bit is 0 the CPU is in privileged mode and can execute all instructions.

    - When the bit is 1 the CPU is in unprivileged mode and I/O type instructions become restricted. Attempts to make the bit 0 again become illegal as well.

     

    The only way to get back to privileged mode is via a reset, interrupt or XOP. As that code is normally controlled by the operating system it can retain control.

     

    See section 6 of the data sheet for details.

  13. And here is the code for the 8th slot in the vector table. This slot (>0Fxx, >07xx) handles only the LDS (>0780) and LDD (>07C0) opcodes.


    ; entry point for the 0Fxx and 07xx opcodes
    ; only the 74LS612 mapper variant of LDD and LDS are recognized
    ;
    0AF6 2560   CZC  @>0B18, R5    ; is the opcode >0780 or >07C0?
    0AF8 0B18
    0AFA 1672   JNE  >0BE0         ; no: test for extension & exit
    0AFC 27E0   CZC  @>0B1A, R15   ; are we in user mode?
    0AFE 0B1A
    0B00 1303   JEQ  >0B08
    0B02 0300   LIMI >0000         ; yes: set up PRIVOP error
    0B04 0000                      ;  (will cause INT #2 after the RTWP)
    0B06 0380   RTWP
    0B08 0283   CI   R3, >C000     ; is this a first LDS?
    0B0A C000
    0B0C 1303   JEQ  >0B14
    0B0E 0283   CI   R3, >6000     ; is this a first LDD?
    0B10 6000
    0B12 1601   JNE  >0B16
    0B14 C08E   MOV  R14, R2       ; save address+2 of first LDS/LDD in a sequence
    0B16 0384   RTWP4              ; return & defer interrupt
    
    0B18 F83F   DATA >F83F         ; reverse bit pattern of LDD/LDS
    0B1A 0100   DATA >0100         ; PRIV bit in ST register
    


     

    This has two interesting points.

     

    LDD and LDS are only valid when in system mode (PRIV bit not set). If they are called from user mode (PRIV bit set) a privilege violation error must be created. This is achieved by using LIMI 0. This is a hardware instruction that is also only valid in system mode and will set the PRIVOP error bit. This normally causes an INT #2 to occur immediately. However, macrocode cannot be interrupted, so the interrupt remains pending. Only after we return to normal code with the RTWP is the interrupt honored.
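
    The decision logic at the top of this entry point can be modelled in a few lines of Python. This is a sketch with invented names; the two constants are the DATA words from the listing.

```python
LDS_LDD_MASK = 0xF83F     # DATA word at >0B18
PRIV_BIT = 0x0100         # DATA word at >0B1A


def classify(opcode, status):
    """Return which path the entry code takes for an opcode in this slot."""
    if opcode & LDS_LDD_MASK:        # CZC @>0B18, R5 / JNE >0BE0
        return "extension test"
    if status & PRIV_BIT:            # CZC @>0B1A, R15 fails: user mode
        return "PRIVOP error"        # LIMI 0, then RTWP
    return "accept"                  # continue at >0B08


print(classify(0x0780, 0x0000), "/", classify(0x07C0, 0x0100), "/", classify(0x0F80, 0x0000))
```

    Note that the mask test on its own would also pass values such as >0740; the model relies on the fact that only opcodes dispatched to this table slot ever get here.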

     

    Saving the address+2 of the first LDD/LDS in a sequence is an obscure feature that can be used when implementing TI990/12 style interruptible instructions in an external macro ROM (also see bottom of page 107 of the data sheet). When such an instruction is used in combination with LDD/LDS and allows itself to be interrupted, it will save its progress in a checkpoint register and reset the saved PC in R14 to the address of the first LDD/LDS. After the interrupt has finished, the instruction will restart from the first LDD/LDS (setting up the hardware assists) and the interruptible instruction will restart from its checkpoint.

     

    The 990/12 assembler manual has more information about interruptible instructions and checkpoint registers.

  14. If you use an Apple Computer, you can use the TI-Disk Manager for disassembling code for nearly all processors of the 99xxx family.

    It has an interactive tool (Disassembler Editor) for producing clean source code.

    That is very interesting. How would I best use this tool for the job at hand? Does it support the extra 99xxx instructions? And how about the (obscure) macrostore specific instructions (EVAD, the interrupt jumps, the RTWP variants)? Can the disassembler tool be used on a stand alone basis?
  15. Now that the internal ROM of the 99110 processor has been read out, it becomes interesting to see what the TI engineers had put in it.

     

    This thread is intended for posts about recreating the source code for the ROM. It is meant to be wide in scope and include discussion about the best tools for this specific job, the ins and outs of floating point formats and their implementation on the 99xx and 99xxx, etc. All contributions towards these topics are most welcome.

     

    This first post includes a binary dump (the .bin file) and a quick disassembly using xda99 (which was the first tool I came across).

     

    The entry table (see section 7.3.1 of the data sheet) is:

                AORG >0800
    ; macrostore entry vectors (see table 7 of datasheet)
    ;
    0800 0BE0   DATA >0BE0         ; entry point for 00xx opcodes
    0802 0BE0   DATA >0BE0         ; entry point for 01xx opcodes
    0804 0BE0   DATA >0BE0         ; entry point for 02xx opcodes
    0806 0B1C   DATA >0B1C         ; entry point for 03xx opcodes
    0808 0B6E   DATA >0B6E         ; entry point for 0Cxx opcodes
    080A 0B80   DATA >0B80         ; entry point for 0Dxx opcodes
    080C 0BE0   DATA >0BE0         ; entry point for 0Exx opcodes
    080E 0AF6   DATA >0AF6         ; entry point for 0Fxx + 07xx opcodes
    0810 0BE0   DATA >0BE0         ; entry point for two-word opcodes
    0812 0BE0   DATA >0BE0         ; entry point for macro XOP's
    


     

    Most of the entries refer to the exit code at >0BE0. This code implements the extension interface documented in section 7.3.6 of the data sheet.

    ; unimplemented instructions jump here to check for external macro ROM.
    ; Officially, 0BE0-0BFF was reserved for factory test code
    ;
    0BE0 C1E0   MOV  @>1000, R7       ; test macro location >1000 for >AAAA magic    
    0BE2 1000
    0BE4 0287   CI   R7, >AAAA
    0BE6 AAAA
    0BE8 1602   JNE  >0BEE            ; if not present exit
    0BEA 0460   B    @>1002           ; jump to external macro code
    0BEC 1002
    0BEE 0382   RTWP2                 ; return & trigger ILLOP interrupt
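
    In other words, the exit code does the following, shown here as a Python sketch. The memory is modelled as a plain dict of word values and the function name is mine.

```python
MAGIC_ADDRESS = 0x1000
MAGIC_VALUE = 0xAAAA
EXTERNAL_ENTRY = 0x1002


def extension_target(macro_memory):
    """Return the external macro ROM entry address, or None for ILLOP."""
    if macro_memory.get(MAGIC_ADDRESS) == MAGIC_VALUE:   # CI R7, >AAAA
        return EXTERNAL_ENTRY                            # B @>1002
    return None                                          # RTWP2: illegal opcode


print(extension_target({0x1000: 0xAAAA}), extension_target({}))
```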
    

    macrorom.bin

    macrorom.txt

  16.  

    I didn't think the 99000 existed as a specific chip, i thought it was just a name for the 99xxx series of chips, the 99105, 99110 and the probably never made 99120.

     

     

    That's what I thought for a long time as well. However, chips marked TMS99000 (or at least marked TMP99000 and TMX99000) do exist:

    http://www.cpu-world.com/forum/viewtopic.php?t=19180&start=0

    The coding looks like preproduction samples, but the date codes (1983) suggest that these were much later. Interestingly, one of the pics shows a TMP99000B (not A). If memory serves me right, Ksarul has a chip marked "TMS99000" in his collection, but I'm not 100% sure.

     

    When I write about a "99000 ROM" I mean the ROM in the chip that was used in the TI990/10A. The pictures of surviving TI990/10A CPU boards are too low-res to read the marking or have other issues (like glare on the lid), for example here:

    http://img11.hostingpics.net/pics/890850tm990_10a.jpg

    The TI990/10A board that Dave Pitts has in his collection has a chip with completely faded markings.

     

    In summary, I don't know for sure what the markings on the CPU chip in a TI990/10A were, and I use "99000" as a best guess.

  17.  

    I am not sure if I understood your assumptions correctly - is your working assumption that some 99105 chips are actually 99000 chips without any support for macro code? I thought that we've been lucky with some 99105 chips behaving as 99110, but that a 99105 would still be a 99105 as a minimum.

     

    In short, my assumption is that the 99000, 99105 and 99110 all have macro code support and differ only in what is inside the macro ROM. We have yet to find a true 99105 chip with a blank ROM: so far we have three 99105's that have 99110 ROMs inside, one that most likely has a 99000 ROM inside, and two that are as yet unidentified (one is your second chip, which also seems to have the 99000 ROM; the other is JH's chip, which I suspect has a 99110 ROM).

     

    The longer answer is that the 99xxx was designed in the closing days of manual mask production, with masks being taped out by hand on huge mylar sheets. In that context it does not make sense to have multiple designs, other than ROM contents.

     

    Some two years ago I thought that there perhaps never was a 99000 and that the TI990/10A actually used a 99105 chip. This idea has now reversed: perhaps there never was a 99105 ROM, but all are 99000 or 99110 ROMs. Maybe in the early years 99105's included chips that were fully functional except for an error in the ROM: this would have improved yields and thus reduced cost. On the other hand, the ROM covers only ~10% of the chip surface area, so functional chips with only an error in the ROM would be relatively rare among defects.

     

    As time progressed the yield must have gone up, perhaps as high as 80-90% in the second half of the 80's. If TI persisted with having no specific 99105 ROM that would mean that a very large share of 99105's currently floating around have functional 99000 or 99110 ROMs inside (and by 1985 it would have made no sense for TI to put new money into the 99xxx, at that point they were just producing them to recoup the investment). That theory seems to fit with what we observe. Based on findings so far I would say that the odds are 50-50 or perhaps 60-40 in favour of finding a chip with a 99110 ROM.

  18. This gets puzzling.

     

    Maybe not. Internal macro store accesses do not generate RD# signals. What I think you are seeing on 99110 silicon is the single read of external location >1000. If your 2nd chip does not have the extension interface at all, it will not read location >1000 (or apparently any other location) but simply return with an ILLOP exit. If it has no extension interface, there is no obvious way of reading out its macro rom.

     

    I've now tested through all macro jump table entries (first instruction for each entry only) and with any possible magic word at location >1000. For my chip I did not find an extension interface this way either.

     

    To see if you have 99000 silicon in your second 99105 you could try the LMF instruction (>0320). If I'm not mistaken, that instruction should make at least 3 accesses to parallel I/O space, which your logic analyzer can trigger on if you add A0 and BST3 as inputs (though it may be wise to test first that LMF is accepted as a valid instruction on your chip).

     

    That leaves the question of how they factory tested a 99000. Maybe they felt that the LDS/LDD/LMF instructions were simple enough that they could check the chip's responses in test scenarios that covered every instruction for these in its macro ROM code. That would be the hard way. The easy way would have been to have some sort of back door to read out the macro ROM and simply test that each word has the data in it that it should have.

     

    The easy way ties in with having the last 16 words reserved for factory testing, which suggests those 16 words might have a backdoor. The question then becomes how to trigger this backdoor. Maybe TI implemented a macro instruction that reads and returns a macrostore word, or implemented a macro XOP (i.e. with ST11=1) for that.

     

    Maybe I'm just clutching at straws here: maybe the 99000 is simply a locked box.

  19. Very interesting. I don't know why but I assumed that LDD/LDS would be hardwired instructions - but apparently they are not.

     

    You correctly assumed that because they invert the PSEL bit for one or two memory accesses. That has to be done in hardware. It would seem that LDD and LDS are a mix of hardware and macro code.

     

    First, the hardware maintains 3 bits of information about LDD/LDS sequences, see the table on page 107 of the data sheet. If you look at the individual states for LDD and LDS (page 84) you'll see that these bits are updated in cycle #2 of an LDD or LDS. An MID trap follows, which is listed on page 91. In cycle #7 of the trap these bits are then copied to the top of R3, where macro code can inspect them.

     

    The hardware bits are also used on the next SOP or DOP bus access, and it is during these two that PSEL is inverted. This is why LDD/LDS does not work when accessing registers: those have bus code WS and not SOP or DOP (see the bottom of page 106 to the top of page 108). Macro code can generate SOP/DOP cycles by using indirect accesses via R6 and R12 respectively (page 43).

     

    I've just read those pages again and the last section on page 107 makes sense for the first time, and explains the copy of R14 to R2 in the LDD/LDS code.

     

    ====

     

    On a 99000 (i.e. for a TI990/10A mini computer) the macro code will be different. This mini does not have a 74LS612 style mapper, but a more complex MMU (physically it is a 64 pin ULA on the CPU board). This MMU is a CRU device and has two base maps, one for when PSEL is 0 and one for when PSEL is 1. Each map is 6 words long.

     

    On a TI990 both LDS and LDD take a single argument, which is a third, temporary memory map. My hypothesis is that on a 99000 the macro code will copy out the temporary map to the third MMU map. What I don't understand is how the MMU knows when to use this third map when it only has the single PSEL bit to work with. Perhaps it assumes that LDD/LDS will only be used when PSEL is 0 and that the first SOP/DOP cycle with PSEL=1 after loading the third map must use the third map instead of the second.

     

    It gets complicated, because on a 990/10A the sequence LDS-LDD means that the first LDS modifies the LDD to go long distance for its temporary map. To change both source and destination for another instruction one must use the sequence LDD-LDS. Complicated sequences like LDS-LDD-LDS or LDS-LDD-LDS-LDS would all appear valid and have specific meaning.

     

    Hopefully I can find a backdoor to my 99105 chip (which I assume has 99000 silicon) and list the actual macro code. This weekend I'll set up a test that works its way through all magic numbers on macro location >1000: maybe there is one that works...

  20. This is the macro code for LDD/LDS on a 99110:

    0AF6 2560   CZC  @>0B18, R5    ; Is this a 99110 style LDD/LDS instruction?
    0AF8 0B18
    0AFA 1672   JNE  >0BE0         ; No: test for external macro code & exit
    
    0AFC 27E0   CZC  @>0B1A, R15   ; Are we in supervisor mode?
    0AFE 0B1A
    0B00 1303   JEQ  >0B08      
    0B02 0300   LIMI >0000         ; No: cause a PRIVOP error & return
    0B04 0000
    0B06 0380   RTWP
    
    0B08 0283   CI   R3, >C000     ; Is this a first LDS?
    0B0A C000
    0B0C 1303   JEQ  >0B14
    0B0E 0283   CI   R3, >6000     ; Is this a first LDD?
    0B10 6000
    0B12 1601   JNE  >0B16
    0B14 C08E   MOV  R14, R2       ; Save address of first LDD/LDS in sequence in R2
    0B16 0384   RTWP4              ; Return from macrostore & skip interrupt test
    
    0B18 F83F   DATA >F83F         ; reverse bit pattern of LDD/LDS
    0B1A 0100   DATA >0100         ; location of supervisor bit in status register
    
    

    I'm not sure why it saves the address of the first LDD or LDS in R2 and then seems to do nothing with it.

     

    It shows that on a 99110 the code tests specifically for opcodes >0780 and >07C0. On my 99105 the full TI990 range of these instructions seems to be valid. This implies that it has different macro code (i.e. not 99110 code with a faulty macro ROM).
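
To make the control flow of the listing easier to follow, here is a rough Python model of it. It is a sketch under the assumptions stated in the listing comments (R5 holds the opcode, R15 the status register, R3 the LDD/LDS sequence bits); the function names and the outcome strings are mine, not TI's.

```python
OPCODE_MASK = 0xF83F   # DATA word at >0B18
PRIV_BIT = 0x0100      # DATA word at >0B1A
FIRST_LDS = 0xC000     # R3 value tested at >0B08
FIRST_LDD = 0x6000     # R3 value tested at >0B0E


def czc_equal(mask, value):
    """CZC sets EQ when every bit selected by mask is zero in value."""
    return (value & mask) == 0


def ldd_lds_macro(opcode, status, r3, r14, r2=0):
    """Walk the routine at >0AF6 and return (outcome, r2)."""
    if not czc_equal(OPCODE_MASK, opcode):
        return ("extension_test", r2)      # JNE >0BE0
    if not czc_equal(PRIV_BIT, status):
        return ("privop", r2)              # LIMI >0000 / RTWP
    if r3 in (FIRST_LDS, FIRST_LDD):
        r2 = r14                           # MOV R14, R2
    return ("return", r2)                  # RTWP4


for op in (0x0780, 0x07C0, 0x0781, 0x07C5):
    print(f">{op:04X}", ldd_lds_macro(op, 0x0000, FIRST_LDS, 0x2000))
```

The mask >F83F only passes opcodes with no bits set outside >07C0, so any LDD/LDS encoding with a non-zero operand field in the low six bits falls through to the extension test. The mask on its own would also pass a few other values (such as >0740); whether those can reach this entry point depends on the jump table, which this sketch does not model.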

     

  21. This morning was my first use of xda99, so we're about in the same position. The command line I used was:

     

    ./xda99.py macrorom.bin -a 0800 -f 0814

     

    -a 0800 means assume the bin file is based at 0800, -f 0814 means real code starts at location 0814.
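
If you want to look at the words before >0814 (the jump table) without a disassembler, the address arithmetic is simple enough to do by hand in Python. This is just a sketch: `read_words` is a helper I made up for illustration, and the image below is a zero-filled stand-in that only contains the three table words quoted earlier in this thread, not the real macrorom.bin.

```python
import struct

BASE = 0x0800    # address of the first byte in the image (-a 0800)
START = 0x0814   # first code address (-f 0814)


def read_words(image, base, first, last):
    """Return [(address, word)] for big-endian words in [first, last)."""
    out = []
    for addr in range(first, last, 2):
        off = addr - base                       # file offset of this word
        (word,) = struct.unpack_from(">H", image, off)
        out.append((addr, word))
    return out


# Stand-in image: only the table words at >080E..>0812 are real values
# taken from the listing in this thread; everything else is zero.
image = bytearray(START - BASE)
struct.pack_into(">3H", image, 0x080E - BASE, 0x0AF6, 0x0BE0, 0x0BE0)

for addr, word in read_words(image, BASE, 0x080E, START):
    print(f"{addr:04X} {word:04X}")
```

With the real file you would replace the stand-in with `open("macrorom.bin", "rb").read()`; the 9900 family is big-endian, hence the `">H"` format.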

     

    The timing issue indeed sounds more likely. If you have a counter for the time that the AUMS and AUMSL bus codes appear, they should count all the time: those bus codes are also used for internal ALU cycles and most instructions include one or more of those. Only the appearance of a simultaneous RD# or WR# makes it a macro store cycle.

  22. macrorom.txt

    Without further ado, ladies and gentlemen (well guess mostly gentlemen here), I give you the dump of the TMS99110 macrostore ROM:

     

    Wow, great! I've been wanting to do that since 1982 or thereabouts. :)

     

    Please find attached a quick disassembly with xda99. It does not handle the extra 99000 instructions and has those as DATA, but that is minor.

     

    Most macro instruction groups simply go to location >0BE0, which has:

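    0BE0 C1E0   MOV  @>1000, R7    ; fetch word at macro location >1000
    0BE2 1000
    0BE4 0287   CI   R7, >AAAA     ; does it hold the >AAAA magic?
    0BE6 AAAA
    0BE8 1602   JNE  >0BEE         ; no: exit
    0BEA 0460   B    @>1002        ; yes: jump to external macro code
    0BEC 1002
    0BEE 0382   DATA >0382 (= RTWP2 = exit macrostore & generate ILLOP interrupt)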
    

    Interestingly, 0BE0 is the first word of the 16 words reserved for factory test code. I wonder if all 99000 have this code there, perhaps testing for different magic. My 99105 does not respond to placing >AAAA at location >1000, but perhaps it tests for >5A5A or perhaps for >AAAA at location >2000 or some such.

     

     

     

    After having success I also tested my other TMS99105 CPU. To my disappointment the debug counters did not move at all, indicating that this CPU does not support macrostore features at all...

     

    Hmm... that is indeed an unexpected result. It seems like the CPU is operating in baseline mode, where it simply ignores having macrostore. Any chance that you had the APP# line grounded during that test? (this would place the chip in baseline mode). Another possibility is that this is a true 99105 chip, but that does not quite square with its support for LDD and LDS.

     

    I was trying to figure out how many batches of 99000 chips were produced during its lifetime, at least to an order of magnitude.

     

    First an estimate of how many chips were produced. As far as I know the 99000 had two major applications: the TI990 minicomputer (and its later desktop variant BusinessSystem 200/300/etc.), and industrial applications. Over the years I have come across some models of Siemens PLC's that used them, some Fluke lab equipment and apparently a pair of 99105's was used in the guidance system of an early model of cruise missile.

     

    When it moved to the S1500 workstations in the late 80's, TI claimed that it had an installed base of over 100,000 TI990 (incl. variants) systems. If half of that was based on the 99000 chip, that makes 50K chips. All the industrial applications seem to be things that sell in the hundreds to thousands, not tens of thousands. Maybe all industrial applications together also account for 50K chips, for a total of about 100K.

     

    I think the die is about 8x8mm; on a 4 inch wafer you get about 100, on a 6 inch wafer about 250 of those. For 100K chips that makes for between 400 and 1000 wafers. Wafers are made in batches (all with the same set of masks), with back in the 80's perhaps 10 wafers per batch (my guesstimate). That's 40 to 100 batches, assuming 100% yield. In reality yield was perhaps 50%, so roughly 80 to 200 batches between 1981 and 1991. That is rather less than I would have thought.

     

    Still, roughly 8 to 20 batches per year seems high enough to make specific silicon for each of the 99000, 99105 and 99110.
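
The back-of-the-envelope estimate can be redone as a short calculation. Every input here is one of the guesses from the post (100K chips, 100 or 250 dies per wafer, 10 wafers per batch, 50% yield), so treat the outputs as order-of-magnitude figures only; the function name is just for illustration.

```python
def batches(chips, dies_per_wafer, wafers_per_batch, yield_fraction):
    """Number of batches needed to get `chips` good dies."""
    good_per_wafer = dies_per_wafer * yield_fraction
    wafers = chips / good_per_wafer
    return wafers / wafers_per_batch


CHIPS = 100_000          # guessed lifetime production
WAFERS_PER_BATCH = 10    # guesstimate from the post

for label, dies in (("4 inch", 100), ("6 inch", 250)):
    for y in (1.0, 0.5):
        n = batches(CHIPS, dies, WAFERS_PER_BATCH, y)
        print(f"{label} wafer, yield {y:.0%}: about {n:.0f} batches")
```

Changing the yield or the wafers-per-batch guess by a factor of two moves the result by the same factor, which is why only the order of magnitude is meaningful.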

     

     
