speccery

Members
  • Content Count

    544
Everything posted by speccery

  1. From the album: TI99FPGA

    This uses 512K of paged cartridge ROM and 120K of GROM, all served from the SDRAM of the FPGA board. The main menu.
  2. Those are great comments. Maybe the TMS9995 does indeed do some internal processing at 12MHz. The only "solid" counterargument I have is this: if the processor were operating internally at 12MHz, why would the external memory cycles always need to be done at 3MHz granularity? In other words, if you need a wait state for a memory device, you go in 333 ns increments, and that has a huge impact. At zero wait states the memory access time would need to be something like 160ns, and if you can't do that, the next option is about 500ns (one 333ns wait state). That is a huge step, even with memories of the day. If wait states could be added in roughly 83ns increments (1/12MHz), it would have been more efficient.
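To put rough numbers on the granularity argument above, here is a minimal Python sketch. The 160 ns zero-wait budget is the approximate figure from this post, the 12 MHz step is the hypothetical finer granularity being discussed, and the function names are made up for illustration.

```python
import math

CLKOUT_HZ = 3_000_000      # machine-cycle rate seen on the external bus
INTERNAL_HZ = 12_000_000   # hypothetical finer wait-state step (assumption)

def allowed_access_ns(zero_wait_ns, step_hz, wait_states):
    """Access-time budget after inserting wait states of 1/step_hz each."""
    return zero_wait_ns + wait_states * 1e9 / step_hz

def wait_states_needed(memory_ns, zero_wait_ns, step_hz):
    """Smallest number of wait states that covers a memory's access time."""
    if memory_ns <= zero_wait_ns:
        return 0
    return math.ceil((memory_ns - zero_wait_ns) * step_hz / 1e9)

if __name__ == "__main__":
    for name, hz in (("3 MHz steps", CLKOUT_HZ), ("12 MHz steps", INTERNAL_HZ)):
        n = wait_states_needed(200, 160, hz)
        budget = allowed_access_ns(160, hz, n)
        print(f"{name}: {n} wait state(s), budget {budget:.0f} ns")
```

With a 200 ns memory both step sizes need one wait state, but the access budget becomes about 493 ns with 3 MHz steps versus about 243 ns with 12 MHz steps, which is the inefficiency described above.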
  3. I am not sure if that is true; at least all the timing diagrams seem to indicate that the processor works from a single clock, CLKOUT. Modern processors (and even old RISC processors) typically do all of their processing from a single clock. The TMS9900 has a four-phase clock, so you could say the individual phases run at 12MHz even there. I don't know if the TMS9995 has a similar arrangement internally, with phased clocks not visible to the outside.

    8-bit processors normally use multiple clock cycles through the 8-bit ALU to handle 16-bit quantities. As an example, the Z80 has 16-bit arithmetic instructions, but those take more clock cycles (or T-states, as Zilog calls them). So a 16-bit operation on the Z80 like ADD HL,DE would internally be divided into something like ADD L,E followed by ADC H,D (the latter addition takes into account the carry that the first addition might create). On the 6502 this is also evident: for example, you can use 16-bit addresses with 8-bit offsets (e.g. LDA $4050,X), but that instruction takes an extra cycle if the address calculation yields a carry from the 8-bit offset addition. So the ALU would be used twice for address calculation in that case. If I remember correctly, the 6502 has some undocumented behavior related to this, so in some circumstances the carry is not properly propagated and the addition wraps around within the lowest 8 bits. Apparently it was used by some clever game programmers to make copy protection algorithms and such.

    Actually, starting from the Pentium Pro, Intel added a feature called PAE (Physical Address Extension), which allowed the 32-bit architecture to address more than 4GB (it expanded the physical memory address space to 36 bits). Basically the virtual memory page tables gave you additional address bits, similarly to the memory banking on the TI. But that gets messy; like you say, 64-bit platforms are the way to go.

    Yes it can, but 16-bit operations become multi-cycle operations internally. That, by the way, also adds to the internal complexity of the processor, as you have to operate the machinery multiple times to get through an instruction. The TMS9900 is a nice and clean architecture compared to the 8-bit processors. Having said that, there are nice 8-bit processors too; Atmel's AVR architecture is a good modern example of that.
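The byte-at-a-time arithmetic described above can be sketched in Python. This is only an illustrative model of the idea (an 8-bit adder used twice, with a carry passed in between), not a cycle-accurate description of either chip; all function names are hypothetical.

```python
def add8(a, b, carry_in=0):
    """One pass through an 8-bit adder: returns (8-bit sum, carry out)."""
    total = (a & 0xFF) + (b & 0xFF) + carry_in
    return total & 0xFF, total >> 8

def add16_via_8bit_alu(hl, de):
    """Model ADD HL,DE as ADD L,E followed by ADC H,D."""
    low, carry = add8(hl & 0xFF, de & 0xFF)           # ADD L,E
    high, carry_out = add8(hl >> 8, de >> 8, carry)   # ADC H,D uses the carry
    return (high << 8) | low, carry_out

def indexed_address(base, x):
    """Model 6502-style absolute,X addressing.

    Returns (effective address, page_crossed). A page crossing means the
    high byte needed a second adder pass, which is where the extra cycle
    mentioned in the post comes from.
    """
    low, carry = add8(base & 0xFF, x)
    high = ((base >> 8) + carry) & 0xFF
    return (high << 8) | low, bool(carry)

if __name__ == "__main__":
    print(hex(add16_via_8bit_alu(0x12FF, 0x0001)[0]))
    print(indexed_address(0x40F0, 0x20))
```

The carry returned by the first `add8` call is the only thing linking the two halves, which is why the 16-bit result costs a second trip through the adder.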
  4. That is sad indeed. As an example, the move instructions MOV (16 bit) and MOVB (8 bit) take the same number of clock cycles on the TMS9900, despite the fact that there is no need to do a destination read-before-write when dealing with 16-bit operands. This was fixed for the TMS9995 and its successors.

    With regard to the 8088 / 8086 comparison that matthew180 illustrated, the TMS9995 would be similar to the 8088 and the TMS9900 to the 8086 from an external bus width point of view. However, the TMS9995 is also an architectural enhancement, so in practice it is much faster despite having an 8-bit external bus. It also has an internal memory block (256 bytes) that can be accessed at 16-bit width, and it does not do unnecessary memory cycles. It is actually interesting to compare the number of clock cycles it takes the TMS9900 and the TMS9995 to do the same 16-bit operation. Take the MOV R1,*R2 instruction (write the contents of workspace register 1 to the address designated by workspace register 2), assume zero wait states for both processors, and also assume that the TMS9995's internal RAM is not used (worst case conditions), so that it has to go through the 8-bit external bus. It is still faster: on the TMS9995 that instruction takes 8 clock cycles, on the TMS9900 it takes 18 clock cycles.

    As an aside, one thing that often causes confusion with the TMS9995 is its cycle time: the processor is clocked at 12MHz, but that gets divided internally by 4, so the machine cycles are 333 ns each and thus similar to the TMS9900. A memory cycle on the TMS9995 takes only 1 machine cycle, whereas on the TMS9900 it takes 4 cycles, so there is a big difference there.
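As a quick worked check of the figures above, here is a small Python sketch. It assumes both processors run with the same 333 ns cycle and uses the cycle counts quoted in this post; the names in the snippet are illustrative only.

```python
CYCLE_NS = 1e9 / 3_000_000  # about 333 ns per cycle (assumed equal for both CPUs)

# Cycle counts for MOV R1,*R2 as quoted in the post (zero wait states,
# TMS9995 going through its 8-bit external bus).
MOV_CYCLES = {"TMS9900": 18, "TMS9995": 8}

def mov_time_us(cpu):
    """Execution time of MOV R1,*R2 in microseconds for the given CPU."""
    return MOV_CYCLES[cpu] * CYCLE_NS / 1000.0

def speedup():
    """How many times faster the TMS9995 executes this instruction."""
    return MOV_CYCLES["TMS9900"] / MOV_CYCLES["TMS9995"]

if __name__ == "__main__":
    for cpu in MOV_CYCLES:
        print(f"{cpu}: {mov_time_us(cpu):.2f} us")
    print(f"speedup: {speedup():.2f}x")
```

Under these assumptions the instruction takes about 6.0 µs on the TMS9900 and about 2.67 µs on the TMS9995, a factor of 2.25.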
  5. The Pipistrello board is $155: http://pipistrello.saanlima.com/index.php?title=Welcome_to_Pipistrello The problem is more the availability of the buffer board ($20), which contains the level shifters. This is out of stock: http://saanlima.com/store/index.php?route=product/product&path=66&product_id=55 In addition to those you need the edge connector for the TI side port, and a bunch of wires to connect the two. I used a perfboard on which I soldered the edge connector and pin headers, and then I used jumper wires (female-female) to connect the two boards. The overall cost of all of the above is probably less than $200 with shipping, but you've got the issue of the out-of-stock part and then the wiring. In addition, the buffer board will run out of buffers before this is ready, so I am planning to mount one or two 74LVC245 buffers on the perfboard to be able to handle the remaining signals. This is because there are a few signals I want to be able to drive from the FPGA to the TI (LOAD, CRUIN, maybe READY) in addition to the current signals, and there are no available buffers going in that direction. In addition, at least CRUIN probably needs a three-state driver, so an additional buffer is required there. Alternatively those could perhaps be implemented with discrete transistors. If this whole thing were a custom board with a smaller FPGA, the price would probably be less than $100, maybe even less than $50 as far as the component costs are concerned. But then a custom PCB and assembly would be required.
  6. And then I realized that one of the binaries was the SAMS memory test V4.0. It was time to become fully SAMS compatible. So I added one more wire between the side connector and the FPGA (the CRUCLK* signal) and modified the FPGA in three ways: (1) page register visibility is controlled by CRU bit 1E00, same as Super AMS; (2) paging enable is controlled by CRU bit 1E01, same as Super AMS; (3) a non-standard CRU bit 1E03, when set, enables paging of the entire 64MB memory range (16-bit page numbers). By default this last bit is reset, which reduces the paged area to 1 megabyte with byte-sized page numbers; that is what Super AMS compatibility requires. And now the design passes the memory test.
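The effect of the paging-enable and wide-page bits can be sketched with a small Python model. This is an illustrative assumption about how the translation behaves, not the actual VHDL: the identity mapping when paging is off, the 16-entry register array and the function name are all hypothetical. The 4K page size and 64MB SDRAM size come from the project summary.

```python
PAGE_SIZE = 4 * 1024              # 4K pages
SDRAM_SIZE = 64 * 1024 * 1024     # 64MB of SDRAM

def physical_address(cpu_addr, page_regs, paging_enabled, wide_pages):
    """Translate a 16-bit CPU address to an SDRAM address.

    page_regs      -- 16 page numbers, one per 4K slot of the CPU address space
    paging_enabled -- models the paging enable bit (CRU bit 1E01 in the post)
    wide_pages     -- models the non-standard bit (CRU bit 1E03 in the post)
    """
    slot, offset = divmod(cpu_addr & 0xFFFF, PAGE_SIZE)
    if not paging_enabled:
        page = slot                 # assumption: identity mapping when paging is off
    else:
        page = page_regs[slot]
        if not wide_pages:
            page &= 0xFF            # byte-sized page numbers: 256 * 4K = 1MB
    return (page * PAGE_SIZE + offset) % SDRAM_SIZE

if __name__ == "__main__":
    regs = list(range(16))
    regs[0xA] = 0x1234
    print(hex(physical_address(0xA123, regs, True, False)))
    print(hex(physical_address(0xA123, regs, True, True)))
```

With the wide-page bit cleared only the low byte of the page number is used, so the same register value lands inside the first megabyte, which matches the compatibility behavior described above.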
  7. So I then went in exactly the opposite direction from what I had planned - no software development today - instead VHDL development... I looked again at the cool FlashROM99 project and realized that I have been stupid. Facepalm time. What actually is the FlashROM99? Banked RAM in the cartridge port area. Check. Needs the 32K memory extension for some binaries. Check. A capability to load raw binaries to the banked cartridge RAM. Check (with my USB loader). What else? Nothing. Nothing. I have basically had this capability for something like two weeks without realizing it. Aaarrgh!!! Well, I guess it is better to discover this late rather than never... So I downloaded the binaries and tested them. All the ones I tested worked.
  8. From the album: TI99FPGA

    Pipistrello FPGA board mess of wires interconnect to the side port of the TI-99/4A.

    © Erik Piehl

  9. From the album: TI99FPGA

    Super AMS Memory tester first successful run on the pipistrello FPGA based memory extension.

    © Erik Piehl

  10. Thank you for those words. Motivation works both ways! I really want to make this something that others can replicate and hopefully find useful. The easiest way would be to make an adapter board for the Pipistrello, but it would be cooler and cheaper if it was a bespoke board.
  11. Thank you for that comment. Actually I want to move the GROM contents over to the SDRAM too; it is overkill to use the FPGA on-chip block RAM for this purpose. So my plan is to set aside a region of SDRAM and have the GROM accesses use that region, similarly to what is done with the cartridge ROMs. Once that is done there is abundant memory available, so I'll make the GROMs 8K each. I wanted to keep them at 6K as long as they're implemented with the FPGA block RAM, to minimize block RAM use. That is not so important for the Pipistrello board as it has a big FPGA, but once this project is moved to a dedicated board the FPGA will likely be much smaller, and then it would become important. A stupid question: where can I get the ROM images for the modules you listed? I'd like to try them out. Actually, now that I think about it, XB 2.7 is included with classic99, so that one I probably have already.
  12. This is probably a good moment to summarize the project status. To start, I'd just like to point out that this is a hobby project, so I did not really set specific goals for myself, other than having fun with the TI and expanding it to a point that makes my configuration (the bare console) useful (as in having fun), hopefully resulting in something that can be easily replicated by others. That last step in practice probably means designing a custom board.

    As of today, the following functionality is implemented:

    • SDRAM interface, allowing the entire 64MB of memory to be accessed by the TI.
    • Memory paging unit, which breaks the processor's 64K address space into 4K pages. The location of each page in the 64MB memory can be set independently. In practice this relates to the 32K memory expansion portion of the address space visible to the CPU, but the entire 64MB is accessible. This unit is compatible with the SAMS extension system, with the exception that CRU support is not there yet and thus the banking registers are always visible in the processor's address space. Also, the way in which memory paging is enabled is currently non-standard.
    • Accesses to the cartridge ROM area (TI's address range 6000-7FFF, 8 kilobytes) are remapped to SDRAM. There is support for 74LS378 style non-inverted banking of the cartridge ROM area. Extended Basic uses this type of banking too. There are 128 banks overall, thus 1 MB of memory can be reached through the cartridge ROM window. In the SDRAM that area is from 16MB to 17MB. Note that this area is not specifically reserved for cartridge ROM; it just so happens that cartridge ROM addresses get combined with the cartridge ROM page number and an offset of 16M is added. That same memory area can also be accessed through the memory paging unit, using addresses that belong to the 32K memory extension.
    • The FPGA implements the GROMs of the Extended Basic cartridge internally. In other words, there is a 4*6K=24K GROM compatible memory, currently loaded with the Extended Basic GROM contents.
    • There is a memory mapped SPI port, connected to the micro SD card on the Pipistrello board. Thus support for SD cards (i.e. mass memory) is there, once I get around to writing some software for the TI. This is still untested, but the SPI port design is mostly what I've used earlier in other projects. The port runs at 12MHz, which may be too much for some SD cards (I have earlier used the port at 6MHz).
    • Finally there is a memory access state machine, connected to a USB port serial channel. This allows access to the 64MB SDRAM from a PC. The PC can read and write the RAM concurrently with the TI itself using the memory. So far I have mostly used this capability to load Extended Basic ROMs, game ROMs or my own memory dumper ROM from memory location 16M onwards, making it look as if the corresponding cartridge had been plugged in.
    • A bunch of debug registers, allowing the TI to read the status of the SDRAM from certain memory locations (for example, to find out that the longest SDRAM access cycles are about 28 cycles at 100MHz).

    Currently that's it. There is no speedup for anything yet. Speedup could be realized by also implementing the console GROMs in the FPGA, and then removing the original GROM chips from the console. The console's own GROMs determine the speed of GROM accesses by delaying the CPU, making them horribly slow. The FPGA GROMs operate immediately; there is no need for any wait states. There is no video circuitry implemented, although that could be done, as the Pipistrello has an HDMI output port and there are examples of how to drive it.

    That's the overview. The short answer is that the entire 64MB of memory is available for programs. I hope the above lengthy explanation of the current specs answers the memory question. Note that to go over 32K of RAM, programs need to support that specifically by modifying the paging registers as the program runs. You could for example keep 28K (7 pages) in fixed locations in the 64MB memory, and use the remaining 4K as a window to the rest of the memory. You would just change the page register for that 4K area as the program runs.

    A lot can be done already with the current hardware configuration. I think I want to enable the SD card support next, which means that I'll probably be spending time writing software rather than VHDL code for the FPGA. The sky is almost the limit here: the FPGA has so much capacity left that it could implement graphics circuitry, coprocessors for various tasks, etc. Plus the entire circuitry of a TI-99/4A a few times over.
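The cartridge-window mapping and the 4K window idea from this summary can be sketched in Python as follows. The constants come from the post (8K window at 6000-7FFF, 128 banks, 16M offset, 4K pages); the function names are hypothetical, and how the bank number gets latched by the 74LS378 style scheme is deliberately left out, so the bank is simply passed in as a parameter.

```python
CART_BASE = 0x6000                      # cartridge ROM window 6000-7FFF
CART_WINDOW = 8 * 1024                  # 8K visible at a time
CART_BANKS = 128                        # 128 * 8K = 1MB
CART_SDRAM_OFFSET = 16 * 1024 * 1024    # banks live at 16MB..17MB in SDRAM
PAGE_SIZE = 4 * 1024                    # paging unit granularity

def cart_sdram_address(cpu_addr, bank):
    """SDRAM address hit by a cartridge-area access with the given bank."""
    if not CART_BASE <= cpu_addr < CART_BASE + CART_WINDOW:
        raise ValueError("address outside the cartridge ROM window")
    offset = cpu_addr - CART_BASE
    return CART_SDRAM_OFFSET + (bank % CART_BANKS) * CART_WINDOW + offset

def window_page_for(sdram_addr):
    """Page number and in-page offset for reaching sdram_addr via a 4K window."""
    return divmod(sdram_addr, PAGE_SIZE)

if __name__ == "__main__":
    first = cart_sdram_address(0x6000, 0)
    last = cart_sdram_address(0x7FFF, CART_BANKS - 1)
    print(hex(first), hex(last))
    print(window_page_for(first))
```

In this model the first cartridge byte sits at page 4096 of the paging unit's view, which illustrates the remark that the same memory is reachable both through the cartridge window and through a paged 4K window.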
  13. I posted another update to hackaday. Advances come in small steps. I did get the external RAM now fully working with my code, so I am starting to build up some confidence that the memory interface is finally reliable. I had great fun working on the memory dumper app in TMS9900 assembly. As part of the very small improvements it now copies the smaller font definitions from GROM to VDP pattern tables, making the hex dump much more readable.
  14. This week was a travel week... But I got home and had a little time to work on the TI. I added the SPI interface to the FPGA; this was a straight port of my previous VHDL SPI port code. I have no test code for it yet. The SPI connects to the micro SD slot on the board. But I did spend some time writing TMS9900 assembly code, and wrote a simple hex dumper for the TI. It is also a simple testbed for assembly coding, with code to write to the screen and read the keyboard (just to change the dump address). I posted a picture of that below (actually not yet, since I don't know how to do that and don't have time to find out right now). Unfortunately I noticed that there are probably still some problems with the memory interface - placing the CPU workspace into external RAM yields strange behavior. Running machine code from external RAM while keeping the CPU workspace in the internal 16-bit scratchpad works (or seems to work). My next plan is to port my existing code for reading data from a FAT16 formatted SD card to the TI, and to implement some memory test code to understand what might be wrong with the memory interface.
  15. That's a nice way to come up with an alias - why didn't I think of that...

    I made a simple modification to the ascart.asm example (added a register display in hex) and assembled it with xas99.py. I used classic99 to debug it (yes, I did manage to make a bug in just a few lines: MOV R3,4 is not quite the same as LI R3,4, which is what I wanted - no warnings; I'd prefer that the R prefix were required and not optional for registers) and got it to run both on classic99 and on the actual hardware, using my USB serial loader to the FPGA. This actually marks the first assembly program I've written (modified in this case) to run on an actual TI! I don't dare to count how many years passed between owning my first TI and this day...

    I don't have an issue with open sourcing my code; I think it would be fun to get some collaboration going. I feel I've used so many open source projects that it is about time to contribute something back. But a development hardware platform would be needed. What I am using (the Pipistrello + buffer board + wires + connector) is great but a tad expensive, and a design based on this FPGA would be hard to manufacture. At the prototyping level, having so many jumper wires from the side connector to the FPGA board makes it a little hard to construct the system and move it around. I have been thinking about creating another iteration of my SD processor board: replacing the CPLD with an FPGA, adding level shifters, upgrading the microcontroller, adding a RAM chip. It would become quite a different board from my previous one. Anyway, that would effectively become a nanoPEB on steroids. For the first iteration of such a board I would probably shy away from SDRAM and just go with regular SRAM - the board would be complex enough without having to deal with SDRAM signals.

    Erik
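On the MOV R3,4 versus LI R3,4 mix-up mentioned in this post: in TMS9900 assembly a bare number in a register operand position is taken as a workspace register, so MOV R3,4 copies R3 into R4 rather than loading the constant 4. A tiny Python model of the two instructions makes the difference visible; the workspace is modeled as a plain list and the function names are made up for illustration.

```python
def mov(regs, src, dst):
    """MOV Rsrc,Rdst: copy one workspace register into another."""
    regs[dst] = regs[src]

def li(regs, dst, value):
    """LI Rdst,value: load a 16-bit immediate into a workspace register."""
    regs[dst] = value & 0xFFFF

if __name__ == "__main__":
    ws = [0] * 16          # sixteen workspace registers R0..R15
    ws[3] = 0xBEEF

    mov(ws, 3, 4)          # what "MOV R3,4" does: R4 <- R3
    print(hex(ws[3]), hex(ws[4]))

    li(ws, 3, 4)           # what was intended: R3 <- 4
    print(hex(ws[3]), hex(ws[4]))
```

Both forms are valid instructions, which is why the assembler has nothing to warn about.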
  16. Hi Lee, thanks for sharing your experience. Now I also have to take a deeper look at your project, which looks like a great one in its own right! Before jumping over to assembly programming I actually used Forth for a while to learn programming (on my ZX Spectrum back in the day). I've done two Forth implementations myself, one for the Amiga 500 and another for one of my self-made single-board computers (based on the MC68HC11 processor). As you probably know, there are many FPGA implementations of Forth processors (or stack machine processors which pretty much run Forth as their machine language). Hardware implementations of stack machines on FPGAs are small and efficient, and typically run at great speeds, as the memories for the return and parameter stacks can be stored on-chip and accessed concurrently. I haven't done any work in that field myself, but this too would be another interesting avenue. The Gameduino VGA board for the Arduino boards is FPGA based and has a simple stack machine as a co-processor. That stuff is open source and easy to integrate into other designs - including this FPGA project... Erik
  17. Thanks for sharing your experiences. Gotta like your nickname "Asmusr"! Erik
  18. Thanks a lot for the pointer. I just cloned the repository and it seems to be packed with great stuff!
  19. Real life has kept me busy this week. I unfortunately injured my leg a little last week, to the point that I've got a cast on for a while now. Not great, as I also need to travel for work the entire next week.... And when I can't do sports it feels like my brain is working at half speed. Oh well. I don't think I'll have a lot of time to do much in the coming week, but if I do, I think I need to get some TI assembly programming going. On that note, I'd like to ask you programmers: what toolchains do you use for TI cross assembly development? I have done a lot of assembly programming on many different processors, but not so much for the TMS9900. So far I have only done TMS9995 assembly programming for the breadboard etc, so not on the TI-99/4A platform. I've used asm994a and as99 (a unix style cross assembler) to make ROM firmware. I have to say that I don't really like asm994a; I prefer command line tools and makefiles to automate the workflow. Having said that, I think it would make sense for me to write software such as hardware device drivers using toolchains that are commonly used by the community, so that if I come up with something useful, that work could easily be used by others. The choice of toolchain would mostly affect the assembler directives and perhaps the way the assembler handles arithmetic expressions; the code itself would obviously not really change regardless of the assembler. Opinions? This forum thread is probably not the best place for this question...
  20. I've played with the TMS9995 processors on a breadboard (with FPGA support) and on a single-board computer. The '95 is much faster thanks to internal architecture improvements, but the instruction set is basically the same, with a couple of additional instructions. Since this whole thread is essentially about having more memory and thus paging, that step is in a sense already done, although I don't know what the paging system on the Geneve looks like. It can't be too different though. I also have two TMS99105 chips waiting; it would be interesting to hook one of them up to an FPGA otherwise acting as a TI-99/4A. Those run at more than twice the speed of the '95, but are still from the same era and true TI silicon.