apersson850

Members
  • Content Count

    1,063
Everything posted by apersson850

  1. Yes, it's a combination of things. TI decided that the TMS 9900 should mimic the TTL CPU in their 990 range. Excellent idea, they would be able to build cheaper minis. TI developed the 990 series with a stack implementation, floating point instructions etc. (990/12). Excellent idea which increased performance. TI developed the TMS 99000 microprocessor to incorporate these ideas in a chip. Excellent idea, they would be able to build more powerful minis at lower cost. TI suddenly realized that minis were not selling like hot cakes any longer. They figured out that they needed to find a new market for their chips. Excellent idea, would keep production up. TI decided to put a microprocessor, designed to run a minicomputer, into a small home computer, with an ugly keyboard, locked-in software and high cost, to compete with emerging computers that sometimes also had ugly keyboards, but were simpler to make and had open architectures. Not too excellent an idea, to say the least. As we know by now, it did crash, too. But processors with specific tasks aren't anything new. I've been involved in designing the CPU architecture for a processor which was dedicated to robotics, for example. You could use that for anything, if you wanted to, but it would be pretty awkward. For coordinate transformations it was excellent, however, as it among other things supported floating point trigonometric instructions in hardware. It would have failed miserably in a home computer. I haven't checked now, but from memory, I think the other parts in the chipset for the processor family came with the TMS 99000. The TMS 9900 was intended to be used in minis with CRU I/O only, but the higher end models also had TI-LINE. That's a parallel bus, somewhat similar to the Unibus. I'd have to look around to find the details, and don't have time for that right now.
  2. Nice to know. I'm not that much into using simulators/emulators, since I'm of the opinion that the reason for using the TI 99/4A is to use it for fun. And to me, "fun" involves both software and hardware. Thus I prefer the real hardware, and that's why I have some of my own ideas built into my TI as well. There was no real-time clock available on the market when I built mine, for example, and I still don't think there's any general I/O board. Mine has 40 inputs and 32 outputs, plus eight analog input channels. But now I'll re-fix my own RAM disk for the p-system, then see if I can get that old Horizon RAMdisk into the living crowd again, and if so, see if I can integrate that too in the Pascal environment. It should work.
  3. Never heard of it. But it makes sense that they would have planned such a chip, just like there was both the 8088 and the 8086 from Intel.
  4. But TI did make a chipset to support the TMS 9900. But several of these chips were also in 64-pin DIL fashion, so they didn't fit well into small home computers. They worked better in the minicomputer series they were originally designed for. However, these minis quickly became obsolete when the personal computer's capacity quickly increased. Something which wasn't that obvious in the 1970's, when they were designed. Remember that when the first IBM PC came out, a hard drive was optional. You could get one with five Megabyte capacity. Well, there was one with ten Megabyte too, but nobody understood why. The world's accumulated software available for it wasn't ten Megabytes... The other chipset, which did fit in home computers, was based on the lower-end mini's interface, the CRU. Being serial, it was way before its time, and thus not fast enough. This was not the time of Gigabit/s serial USB transfer, but a time where parallel transfer was faster. No, the TMS 9900 did well in the TI 990/4 and TI 990/5, where it replaced the expensive and complex CPU used in the TI 990/9. Just like the TMS 99000 in the TI 990/10A was a good replacement for the 5½ card CPU used in the TI 990/10. Since the minicomputers soon were replaced with microcomputers, the personal computers, as we call them today, the minis went into process control in many places. When these kinds of systems are handled with traditional computers, they normally rely on interrupt service to an extreme rate. It's not at all what you see in a typical desktop computer today. For that market, the context switch was much more important than it would be for a home or personal computer. The problem with the TMS 9900 was more that it was designed with an aim to fit into a much more special slot than a general desktop computer. Thus it didn't do well in such a computer either. If they had had the TMS 9995 available for the TI 99/4, it would have fit in much better there. 
With a CPU doing most of its instructions in four cycles, running at 12 MHz, when most of the other manufacturers had 8-bit designs internally, or 16-bit designs running at one third of the frequency, it would have been a different story. Now they searched for a market for the TMS 9900 that wasn't there. It was too special. So it failed from a sales point of view, for sure.
  5. Because PC, ST and WP are used by the microprograms, and they don't access memory. Sometimes engineers are very conservative. It was "known" that the registers must be internal to the CPU, because it was the only thing you could think of to begin with. At the time when TI took the more radical approach of moving the registers to the memory, other manufacturers were contemplating what to do with their internal registers. The Apollo project had shown that efficient embedded systems required prioritized tasks and quick context switching. Fewer registers are quicker to save, but more of them are better to work with. When TI considered moving the registers to memory, some other manufacturers discussed having multiple conventional register files. If you think this is a minor question, you should know that people have left one company and formed another over discussions like these. The 990/9 architecture had quite a lot of cycles per instruction. So it's like you say, with a specific architecture, you could just as well go with "registers" in memory. The TMS 9900 mimics the operation of the 990/9, so it does the same thing. It took until the TMS 9995 to make the internal architecture more efficient. Then the downside of having a register file in memory is more obvious. Where the TMS 9900 is accessing memory about one third of the clock cycles, the TMS 9995 is almost constantly reading or writing to memory. It's always easier to be clever after the fact. It could be worthwhile to realize that these decisions were taken several years before a technical manager at what was then Televerket Radio in Sweden (the national phone company's technical department, a department that was instrumental in pushing the RDS FM system into service for the first time in the world) stated that "anything above 2400 bits/second on conventional copper phone wires is a physical impossibility". When I eventually gave up ADSL for fiber we had 24000000 bits/second...
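To make the quick context switch concrete, here is a minimal Python sketch of registers living in memory. It models only the workspace pointer idea: register Rn sits at WP + 2*n, and a BLWP-style switch stores the old WP, PC and ST in R13 to R15 of the new workspace. The class name, the dictionary used as memory and all addresses are assumptions made for illustration; this is not an emulator, and details such as the exact PC value saved are simplified.

```python
# Minimal model of workspace registers held in main memory (illustration only).
# A dict keyed by byte address stands in for RAM; all addresses are made up.

class Cpu:
    def __init__(self):
        self.mem = {}   # byte address -> 16-bit word
        self.wp = 0     # workspace pointer
        self.pc = 0     # program counter
        self.st = 0     # status register

    def reg_addr(self, n):
        # Register Rn is simply the word at WP + 2*n.
        return self.wp + 2 * n

    def read_reg(self, n):
        return self.mem.get(self.reg_addr(n), 0)

    def write_reg(self, n, value):
        self.mem[self.reg_addr(n)] = value & 0xFFFF

    def blwp(self, vector):
        # The vector holds the new WP and the new PC in two consecutive words.
        old_wp, old_pc, old_st = self.wp, self.pc, self.st
        self.wp = self.mem[vector]
        self.pc = self.mem[vector + 2]
        # The return linkage goes into R13-R15 of the NEW workspace.
        self.write_reg(13, old_wp)
        self.write_reg(14, old_pc)
        self.write_reg(15, old_st)

    def rtwp(self):
        # Restore the caller's context from R13-R15 of the current workspace.
        caller_wp = self.read_reg(13)
        self.pc = self.read_reg(14)
        self.st = self.read_reg(15)
        self.wp = caller_wp


cpu = Cpu()
cpu.wp, cpu.pc = 0x8300, 0xA000
cpu.write_reg(4, 0x1234)      # caller's R4
cpu.mem[0xA100] = 0x8320      # vector: new workspace
cpu.mem[0xA102] = 0xA200      # vector: new entry point
cpu.blwp(0xA100)
cpu.write_reg(4, 0xBEEF)      # callee's R4 is a different memory word
cpu.rtwp()
```

Nothing is copied when the context changes: all sixteen registers of the caller stay where they were in memory, and only three words are saved. That is the gain; the cost is that every register access is a memory access.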
  6. When TI designed the architecture for the TI 990/9, which then was implemented on a chip as the TMS 9900, the memory technology available at that time was equal in speed to the CPU. Thus there was nothing to gain from internal registers, but a lot to lose. The decision was correct at that time, but then development changed the playing field.
  7. For cartridge ROM yes, but both BASIC (with E/A module) and Extended BASIC support loading their assembly routines as relocatable code. That implies that you don't need to think about where to load the code. With relocatable code, you could have a library A, and two software packages B and C, which you want to make available to BASIC. Both B and C require A to be loaded, but neither B nor C requires the other one to be present to work. This then implies that you can let the BASIC program load A, then B and finally C. Or load A, then C and finally B. Or even make the program load none, and only load A and B if B is needed, or load A and C if C is needed, or augment A and C with B if that is needed later. This automates the loading of the different assembly routines. Using absolute code as support to BASIC/Extended BASIC only makes sense to speed up loading. But it doesn't make life easier for the programmer. For automation, as I hinted above, BASIC can handle loading the necessary code files. When using the E/A module, it's also possible to include REF LOADER in your program, then set up a PAB for file access and then BLWP @LOADER to let an assembly program use the linking loader to load more assembly routines.
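A small Python sketch of why the load order doesn't matter with relocatable code: each file simply lands at the next free address, wherever that happens to be. The sizes and the first free address are invented for the example and do not describe any real package or the actual loader's memory layout.

```python
# Hypothetical sizes in bytes for library A and packages B and C.
SIZES = {"A": 0x200, "B": 0x180, "C": 0x100}

def load_in_order(names, first_free=0xA000):
    """Place each relocatable file at the next free address, in the order given."""
    placed = {}
    addr = first_free
    for name in names:
        if name in placed:      # already in memory, don't load it twice
            continue
        placed[name] = addr
        addr += SIZES[name]
    return placed

abc = load_in_order(["A", "B", "C"])   # B at >A200, C at >A380
acb = load_in_order(["A", "C", "B"])   # C at >A200, B at >A300
only_b = load_in_order(["A", "B"])     # C never loaded at all
```

B ends up at a different address in the first two cases, and the program using it never has to know. With absolute code, each of these layouts would need its own assembled version.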
  8. I hope I'll have that missing part here within a week. But maybe not, since I may have to go to Poland for some work next week. This work thing just consumes too much time...
  9. RoLB and ROLB are the same, but R0LB is not. It should be R0LB in this case. And yes, END marks the physical end of what should be assembled, not the logical end of the main program. When you assemble a program, the assembler is reading your source and makes an object code out of it. Instructions like SWPB R4 can be assembled immediately, as they are completely defined. But an instruction like B @KALLE can't, if you don't know what the value of KALLE is. Each time the assembler program encounters such an instruction, it will remember that there's a reference to KALLE that's unresolved. When it later finds KALLE in the program, it can fill in the value of KALLE in the instruction, and the problem is solved. While the assembler is running, it will build a symbol table with all names defined in the program, and eventually it will have a value associated with all of them. If not, you'll get an error at the end of the process, since then there are unresolved references which the assembler can't resolve. You can make code absolute or relocatable. When it's absolute, it's always possible to fully resolve all addresses during assembly. You know for example that the code should load at >A000, so you know that an address, like KALLE, that's >10 bytes down the code must be >A010. The CPU can only execute absolute code. But when the code is relocatable, you don't know where it will load. It could be at >A000, but it could also be at >2BC4. Or any other address. The only thing you know about KALLE then is that it's +>10 bytes down from the origin of the code. This is where tagged object code and the linking loader come in. The assembler produces tagged code, where each word is tagged with information describing the content. An instruction like SWPB R4 will be tagged with the "load as it is" tag. No changes needed. But the address KALLE will be tagged with the "add +>10 to the start address of the code" tag. 
Thus the assembler will resolve the address relative to the code segment start, and then the loader will finally resolve the address by adding the code segment loading address. When that's completed, the code in memory is now absolute and can be executed. But the loader supplied with the TI 99/4A is more clever than that. It's not only a loader capable of loading relocatable code, but it's a linking loader. That implies that you can load code that contains addresses that are completely unknown at assembly time. If you have a reference to KALLE, but nowhere in your code define KALLE, then KALLE is unresolved. But by including the directive REF KALLE, you can tell the assembler that you are aware that KALLE is needed in the program, but KALLE is in another program, so you'll not know until at load time what KALLE is. The assembler will then tag references to KALLE with the "resolve at load time" tag. This is where the linking loader comes in. When it finds a reference to KALLE, it will look at a table called the REF/DEF table to find out the address of KALLE, and put that address into memory where the reference to KALLE is. Now the code has become absolute and can be executed. So where does the definition of KALLE come from? Well, in this example it's assumed that two things are true. First, you have written another program, that contains the definition of KALLE. That piece of code also contains the directive DEF KALLE, to indicate that the definition of KALLE is external, and thus made available to other programs. Second, you have loaded this code, with DEF KALLE inside, before you load the code that contains REF KALLE. When we look at the routines like VSBW and similar, it's the E/A system that pre-loads, and defines, some routines for you. They include VSBW. In the example above, you got away with REF VSMW at assembly time, because it simply tells the assembler that it's up to the loader to fix the VSMW reference. 
But at load time, it would have exploded, since the label VSMW wasn't actually externally DEFined by any other program. This whole concept is a powerful one, since it allows you to build libraries of frequently used functions, and include external definitions in these libraries. They may then be externally referenced from other programs, so that you can load the same library as one code file, and use it from different other code files. This is exactly what the E/A system provides for you. Since the tagged object code loader can load relocatable code at any address it finds convenient, it doesn't matter how big the library is. Your program will always load after it, regardless of where that's in memory. This was a capability inherited from the system used in the TI 990 minicomputers, and was more powerful than what was available in several contemporary home computers.
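The three kinds of tags described above can be sketched in a few lines of Python. This is a simplified model, not the real tagged object format or the real loader: the tag names ABS, REL and REF, the module layout and the function name are made up for the illustration, and, as in the description above, a DEF must be loaded before any REF to it.

```python
def link_load(module, load_addr, memory, ref_def):
    """Load one relocatable module at load_addr; return the next free address."""
    # Publish the module's DEF'd names as absolute addresses in the REF/DEF table.
    for name, offset in module["defs"].items():
        ref_def[name] = load_addr + offset
    addr = load_addr
    for tag, value in module["words"]:
        if tag == "ABS":            # "load as it is"
            word = value
        elif tag == "REL":          # "add the start address of the code"
            word = load_addr + value
        elif tag == "REF":          # "resolve at load time" via the REF/DEF table
            if value not in ref_def:
                raise LookupError(f"unresolved reference: {value}")
            word = ref_def[value]
        else:
            raise ValueError(f"unknown tag: {tag}")
        memory[addr] = word & 0xFFFF
        addr += 2
    return addr


memory, ref_def = {}, {}

# A tiny "library": CLR R0 followed by B *R11, with KALLE DEF'd at offset 2.
library = {"defs": {"KALLE": 2},
           "words": [("ABS", 0x04C0), ("ABS", 0x045B)]}

# A "program": SWPB R4, then B @KALLE (external), then B @<its own start>.
program = {"defs": {},
           "words": [("ABS", 0x06C4),
                     ("ABS", 0x0460), ("REF", "KALLE"),
                     ("ABS", 0x0460), ("REL", 0x0000)]}

next_free = link_load(library, 0xA000, memory, ref_def)
end = link_load(program, next_free, memory, ref_def)

# A misspelled external name assembles fine but fails here, at load time.
broken = {"defs": {}, "words": [("ABS", 0x0420), ("REF", "VSMW")]}
try:
    link_load(broken, end, memory, ref_def)
    load_failed = False
except LookupError:
    load_failed = True
```

After loading, memory holds only absolute addresses: the word after the first B @ is >A002 (where KALLE ended up), and the relocatable word is >A004 (where the program itself was placed).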
  10. Well, the file uploaded in post #117 tells you most of it. I just forgot one piece of setup code, that must be loaded first. I'll come back with that. Once that's in place too, it will work, provided there's a card with the given CRU address that implements at least the sector read/write subprogram. The p-system has its own file system, so it uses only direct sector access from the normal TI system. My own RAMdisk is large enough to host the compiler, editor and filer. It can also take a text and code file for a Pascal program. It makes a big difference, in cutting the compile time in half.
  11. Typing in the code I do intend to do. I'm almost done with it. But the question was if I should spend the effort to explain the details about how to add a RAMdisk, when chances are that perhaps not a single one of you will ever use it, or maybe even bother to read it. The only thing I didn't have documented on paper was the memory loader, which loads the DSR from a code file, produced by the p-system assembler, into the memory of my own RAMdisk. You can't do that with a normal loader, since you have to activate the RAMdisk DSR, using the appropriate CRU bit, to be able to load the code. But it's fairly simple to do a basic such loader. Using a RAMdisk with the p-system makes the compiler run about twice as fast, as it's heavily segmented. The floppies are running constantly as long as the compiler is running, but if you have the compiler on RAMdisk and the code on a floppy, the floppy will actually stop once in a while.
  12. From a functional point of view, it took Borland to get to Turbo Pascal 4 before they were on par with UCSD Pascal. But performancewise they were far ahead. The Turbo Pascal compiler was capable of compiling large source files much quicker than most other options. Execution speed was also a lot faster, especially compared to what the TI 99/4A is capable of doing. The program I mentioned above did a "benchmark" calculation in two minutes. When the same benchmark was running on a PC AT, which implies 12 MHz 80286 with 80287 floating point co-processor (which the program utilized, since it executed thousands of floating point operations), it took the PC three seconds to do the same job. The OS was DOS 3, I think, or maybe even earlier. AutoCAD was at version 2 at least, at this time. But before this program was developed, the same calculation took a half to a full week to do manually. Thus waiting for two minutes or three seconds didn't really matter. Both were so much faster than the manual procedure that it was a revolution anyway. It made it possible to return a commercial offer for an install in half a day after the request for a quotation, instead of after a week. Customers actually complained about the quotations, as they thought it could not reasonably have been processed properly in such a short time.
  13. The main objective of UCSD Pascal is to be able to provide a full Pascal implementation, and then some, within the constraints of a 32 K RAM + 16 K video RAM computer. And it does. But that implies that when there's a tradeoff between speed and code size, small code size has to win most of the time. The p-system for the 99/4A is improved in such a way that it supports one entire system disk in GROM, on the p-code card. Thus it's "better" from that point of view than most of the p-code implementations on contemporary machines, since the system disk is as fast as a RAMdisk and means that (large) files like SYSTEM.PASCAL and SYSTEM.MISCINFO don't have to occupy your 90 K system disk. Remember that 90 K disks were the only thing that existed when the p-code card was born. Interestingly, and perhaps because of this intense disk activity, the p-system does take some moves to make I/O more efficient. It does this by doing a pre-scan of I/O units when the system boots. Thus, when for example a disk access occurs, it already knows the CRU address of the card and the entry address of the sector read/write subprogram. It's the same with the RS232 device. It's pre-scanned and data is stored in the system. Thus, for each I/O request, there's not the normal scanning of all cards in the machine to find the correct card and the correct functionality, as it already knows where it is. But this is also why a RAMdisk doesn't work by itself in the p-system. Although the p-system itself is designed to allow for at least six disk drives (it can be expanded, but that's the default), the 99/4A implementation only fills in three of the disk unit data structures with the required information. Hence it's never looking for any other drives. To make a fourth disk drive work (like if you add a CorComp disk controller), you only need to copy the data for one of the first three drives to the fourth location. 
The disk drives are identical, in that the controller uses the same read/write code for all of them, regardless of whether they are three or four. So copy the basic data and then do a UNITCLEAR command on that drive, and it's ready to use. But for a RAMdisk you need to prepare by first having the RAMdisk DSR in place (some drives have that in RAM, like my own design). Then you need to not only copy the normal disk data, but modify it, as you can't use the same CRU base nor sector read/write entry address for the RAMdisk. Third, and here we are in line with normal disks, you need to fill in the data that indeed is identical for all disks, just for yet another unit and fourth, you can now execute the UNITCLEAR command and the p-system will see the RAMdisk as just any other drive. If you add two physically different RAMdisks, then you have to do it again for the second one, as it will also have a different CRU base and probably a different entry address for the code. Since the p-system was designed to be flexible about this, it's doable, but since the TI implementation is a bit short ended, it takes some extra steps to make it work. And due to this pre-scanning for efficiency optimization, you need to do that same thing after the fact, as the short ended implementation prevents the system from doing it by itself. Normally, the p-system allows you to change the whole operating system by providing updates to the SYSTEM.PASCAL file on the p-code card. If you have such a file on your root volume (normally DSK1, or unit #4: as it's called in the p-system world), it will take over from the SYSTEM.PASCAL file on the card. You don't have to replace the whole file, but just the segment you want to update. But this device pre-scan is done by code in ROM on the p-code card itself, and runs before the system has really started, when the BIOS and input/output system's data structures are built, so you can't wedge anything into that sequence. 
Your modifications don't start running until this is already completed. That's why this data has to be inserted after the system has bootstrapped. Still, you can write a program that does it automatically, and even chain that to the startup code, so it's not too much of a deal, once you've figured out what and how to do. But that took me quite a while once... Regarding the other question: I once wrote a 4000 line program in Pascal on the 99/4A. I later ported it to a PC, using Turbo Pascal 4.0. The only real change was including uses dos to handle some operating system specific things (like getting the current date from the system) and modifying the key codes returned by system keys (F1-F12) on the DOS keyboard. The remaining parts of the code ran unaltered. The program used text only. If there were graphics, it would have been different. But apart from that, it kind of had everything you can expect, like file and printer I/O, configuration file handling, dynamic memory allocation, floating point arithmetic etc.
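The difference between adding a fourth floppy and adding a RAMdisk can be sketched in Python as follows. This is only a model of the idea: the record layout, the field names, the entry addresses and the RAMdisk CRU base are placeholders, and the real p-system tables are not laid out like this.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class UnitEntry:
    cru_base: int          # CRU base of the card that serves this unit
    sector_io_entry: int   # entry address of the sector read/write subprogram
    common: tuple          # data that is identical for every disk unit

# What the boot-time pre-scan is assumed to have filled in: three floppy units.
FLOPPY = UnitEntry(cru_base=0x1100, sector_io_entry=0x4000, common=("disk", 512))
units = [FLOPPY, FLOPPY, FLOPPY, None, None, None]   # six slots, three in use

def add_floppy(units, slot, template_slot=0):
    # A diskette is a diskette: a plain copy of an existing entry is enough.
    units[slot] = units[template_slot]

def add_ramdisk(units, slot, cru_base, sector_io_entry, template_slot=0):
    # Copy the common data, but replace the card-specific values.
    units[slot] = replace(units[template_slot],
                          cru_base=cru_base,
                          sector_io_entry=sector_io_entry)

add_floppy(units, 3)                                         # fourth floppy
add_ramdisk(units, 4, cru_base=0x1400, sector_io_entry=0x4100)  # RAMdisk
# On the real system, a UNITCLEAR on each new unit would follow here.
```

A second, physically different RAMdisk would need its own add_ramdisk call with its own CRU base and entry address, which matches the extra work described above.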
  14. Around post 75 in this thread, we talked about adding a RAMdisk to use with the p-system. In post 117, I referred to a document about how to do that, which I once prepared. However, some who actually tried couldn't get it to work. A while ago, when I got inspired by a benchmark thread here, I attempted to actually run my p-system again. By mistake I erased the diskette where I had my RAMdisk software. As I couldn't find a copy, I've now searched my archives and did at least find a printout of everything related to creating a RAMdisk within the p-system. The p-system is designed to improve speed by pre-building tables to access various devices faster. But it's only prepared to have three disks when used on the 99/4A, as no more could be controlled by the TI controller. When adding a fourth, as you can do with a CorComp controller, you have to copy some disk drive values in a table entry for DSK1 to where DSK4 should have been. That's enough, as a diskette is a diskette. But when adding a RAMdisk, some values aren't identical to those used by normal disks. In the document referred to in post 117, I forgot that part. That's why people couldn't get it to work. But now I found that program too, so I can see how to fix that as well. Is anybody at all interested? I mean, would anybody really do this? I'm asking because I can prepare a document that covers the whole thing, but I'm not going to go through that work just for fun. Somebody has to actually want it, or I'll do other things. I've by coincidence acquired an old Horizon RAMdisk. Since my own runs as the fifth drive in my p-system, I could perhaps add this one as the sixth, as an exercise. It's broken from a hardware point of view, so I need to fix that first. I'll see...
  15. "When you think too much about something, you may create problems that weren't there from the beginning." (Winnie the Pooh)
  16. I've written software that switches between text mode and bit-map (Graphics II) mode. Pascal starts and stops in text mode, and then I wanted to display diagrams while the program was running. So I went to bit-map for a while, then back again. I've also gone from text to Graphics I and back to text, for similar reasons. I think it was because I needed sprites for something. I don't remember exactly.
  17. I started using the expansion box as soon as I wanted more than the basic machine. But I got the box with RS-232, TI disk controller, 32 K RAM expansion and two double sided drives in it, plus the speech synthesizer outside the box, in one fell swoop. Later, I exchanged the disk controller to a CorComp model, to get double density and four drives. I built an extra box to house two more drives. After that, I designed my real time clock card and a digital/analog I/O card, which fits into the PEB, as well as the 16-bit 64 K RAM expansion inside the console. That's about it. Any sidecar expansion was never on the table. The table got full anyway, even if my own expansion box stacks on top of the TI one.
  18. I wasn't writing games (there are many of them already) when I ran out of memory, but was designing a program which would need a large database to work. The main database could be on floppy disk without too much time penalty, but I couldn't even fit the index to the database in RAM, and then finding what I needed took too long. Now I have a Horizon RAMdisk lying around. It isn't functional, but if I spend time fixing it, then that program would be doable. But now it's not really needed any longer, as my TI activity is very low.
  19. That's how I do it when I use bit-mapped graphics in Pascal. The UCSD p-system for the 99/4A isn't intended for that mode. It supports the normal graphics mode, text mode (default) and multicolor, but not bit-map, due to the memory requirement. But provided the p-system hasn't already reserved too much memory in VDP RAM, you can modify a location called "interpreter's memory pointer" to tell it that most of VDP RAM is unavailable. Then it's actually possible to run bit-mapped graphics on the 99/4A too. The system will automatically load Pascal programs to the memory expansion instead.
  20. A 32 sector file is enough to fit in 8 K code. Since the BSAVE/BLOAD in RXB seems to always save from and load to 8 K RAM, there's no need to keep track of any load address. The normal memory image files are 33 sectors, since they can hold 8 K code plus the address, byte count and link indicator (to link to the next file, if the memory image is more than 8 K). I once wrote a program, or rather tried to, that would have needed somewhere like 200-256 K RAM to run well. I had to abandon it, since it was too slow when doing all data processing via disk. That's the largest I've done, and I haven't heard anybody making anything bigger, even if that of course very well could have happened.
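The sector counts follow from simple arithmetic, sketched here in Python under two assumptions: sectors hold 256 bytes, and the memory image header is three 16-bit words (address, byte count and link indicator), i.e. 6 bytes. The function name is made up for the example.

```python
SECTOR_BYTES = 256
HEADER_BYTES = 3 * 2     # assumed: three 16-bit words in the memory image header

def sectors_needed(nbytes):
    """Round a byte count up to whole sectors."""
    return (nbytes + SECTOR_BYTES - 1) // SECTOR_BYTES

raw_8k = sectors_needed(8 * 1024)                    # 8 K code, no header
image_8k = sectors_needed(8 * 1024 + HEADER_BYTES)   # 8 K code plus header
```

With these assumptions, raw_8k is 32 and image_8k is 33: the six header bytes are enough to spill into one more sector.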
  21. Correct. And you have to figure out yourself which of the memory accesses are slower (8-bit) and which are faster (16-bit). Thus you need to know about the hardware architecture of the 99/4A to get it right. Look at MOV R0,R1. You have four memory accesses here: fetch instruction, fetch source data, fetch destination data, write destination data. If the instruction and the workspace both are in the internal RAM in the console, address >8300 - >83FF, then that's it. You have the basic timing of the instruction there, 14 cycles. Since all memory accesses require two cycles, which is the minimum, there are no additional delays. But it's only the small RAM in the console and the ROM chips in the console that are on a 16-bit wide bus with no additional wait states. When the CPU is accessing the memory in the expansion box, this memory is on a bus that's 8-bit wide. To make that possible, there is extra circuitry in the console, which splits up the 16-bit memory access into two 8-bit accesses. The circuit then puts these two parts together and presents them to the CPU as one 16-bit word. Hence the two cycle memory access becomes a four cycle one. But it doesn't end there. Each of these 8-bit memory access cycles also has one extra wait state. So from a memory point of view, outside in the PEB, we're talking about 8-bit access with one wait state. But from the CPU's point of view, it looks like each 16-bit access is slowed down by four wait states. So if we look at the MOV instruction again, and pretend that both workspace and code are in 32 K RAM in the PEB, then suddenly all four of the memory accesses occur in slow RAM. Thus you must add 16 cycles that are wasted, in addition to the 14 that the instruction itself uses. But even if the code and WS are in fast RAM, the instruction MOV *R0,R1 adds four cycles, since the CPU must first fetch R0, then the address R0 is pointing at. 
Now if that address is in slow RAM, you need to add another four cycles for that access. If the instruction instead is MOV R0,*R1, and R1 is pointing at slow RAM, then both the indirect fetch of the destination and the store there add four cycles of wait states each. If you autoincrement, then you need to write to the register after reading it, and if the register then is in slow RAM that's even worse.
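The cycle counts above can be put into a small Python calculation. The figures (14 base cycles for MOV, four more for an indirect operand, four wait cycles per access to 8-bit memory) are the ones given in the text; the function and variable names are invented for the example.

```python
WAIT_CYCLES_PER_SLOW_ACCESS = 4   # per 16-bit access that goes to 8-bit memory

def instruction_cycles(base_cycles, accesses_in_slow_ram):
    """Base timing plus four wait cycles for every access that hits slow RAM."""
    return base_cycles + WAIT_CYCLES_PER_SLOW_ACCESS * accesses_in_slow_ram

MOV_BASE = 14        # MOV R0,R1 with four memory accesses
INDIRECT_EXTRA = 4   # extra cycles for one indirect operand

# MOV R0,R1 with code and workspace in the console's 16-bit RAM.
mov_all_fast = instruction_cycles(MOV_BASE, 0)

# MOV R0,R1 with code and workspace in the 32 K RAM expansion: all four slow.
mov_all_slow = instruction_cycles(MOV_BASE, 4)

# MOV *R0,R1, code and workspace fast, R0 pointing into slow RAM: one slow access.
mov_src_indirect = instruction_cycles(MOV_BASE + INDIRECT_EXTRA, 1)

# MOV R0,*R1, code and workspace fast, R1 pointing into slow RAM:
# the fetch of the destination and the store there are both slow.
mov_dst_indirect = instruction_cycles(MOV_BASE + INDIRECT_EXTRA, 2)
```

This gives 14, 30, 22 and 26 cycles respectively, so the same MOV more than doubles in time when everything sits in the expansion memory.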
  22. Then you have TI's original meaning of the phrase. But they refer to when you transfer a piece of software from the tagged object code file format to the memory image format used by E/A option 5. This is essentially the same thing, but here the files are supposed to load from disk, or even cassette, instead of loading from various ROM devices.
  23. That makes sense, since then it's only the instruction fetch you win. And each memory access is two cycles in 16-bit memory and six in 8-bit memory.
  24. And since these programs still load into RAM when they execute, you don't have to consider the constraints that are necessary for programs that run in ROM only. If you happen to have one of these consoles that are modified to have the RAM expansion internally, you need nothing else than the console and the cartridge.