
Erik's ET-PEB


speccery


If you need a program that uses AMS I can provide one for testing. It uses under 256k so it should be fine.

 

 

Thanks adamantyr, that would be great! I much prefer known working software to my own test code, so that I know what I am actually debugging... If you have something that uses under 256k, that's perfect.

 

I haven't written the disk DSR support for this system yet. This limits the kind of software I can load currently - raw binary images work best. My plan is to port my existing FPGA system DSR setup to the new ET-PEB platform. There is a bit of work, since the FPGA system uses a PC as a file server, and the MCU does not have those kinds of resources at its disposal - I cannot load large files to RAM for preprocessing, as I don't have that much RAM. However, now that I think about this, as an intermediate step it would probably be useful to do a version which uses pretty much the same PC server software that I use with the FPGA system. That would give me a higher level of confidence in the hardware platform before writing the full disk support with the MCU.


Just a layman's question here in regards to "I cannot load large files to RAM for preprocessing".

 

You mention using FAT filesystem so I'm assuming FAT16 and not FAT32.

Could you partition, say, a 4GB SD card into two 2GB partitions - one for use as you originally intended and the other (formatted whichever way works best) for loading large files for preprocessing, like a virtual RAM drive acting as a staging area?


You mention using FAT filesystem so I'm assuming FAT16 and not FAT32.

 

It is actually both FAT32 and FAT16, courtesy of the library I am using. Thus large partitions and files are supported.

 

My comment was not very precise. My disk support for the FPGA system is based on the ideas implemented by Classic99 and borrows a few functions from there. That implementation parses the record based TI-99/4A files so that the entire file is preloaded to RAM, and then records are returned to the TI-99/4A as requested by the running application. This approach does not work on the ET-PEB, as the MCU only has 8K of RAM. Thus I need to rewrite that section to deal with files in a piecemeal fashion, which is in fact how the original TI-99/4A code must have done it.
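
To illustrate the piecemeal idea, here is a rough sketch in Python (for readability only, this is not the MCU firmware). It assumes the stored file is a 128-byte TIFILES header followed by 256-byte sectors, and that variable-length records are encoded as a length byte followed by the data, with a >FF byte ending the used part of a sector. The function names are made up for the example. Only one sector is ever held in memory.

```python
import io

HEADER_SIZE = 128      # TIFILES header length (assumed layout)
SECTOR_SIZE = 256      # TI disk sector size
END_OF_SECTOR = 0xFF   # assumed marker ending the records in a sector


def iter_variable_records(stream):
    """Yield records one at a time, holding only a single sector in memory."""
    stream.seek(HEADER_SIZE)              # skip the TIFILES header
    while True:
        sector = stream.read(SECTOR_SIZE)
        if len(sector) < SECTOR_SIZE:
            return                        # no more complete sectors
        pos = 0
        while pos < SECTOR_SIZE:
            length = sector[pos]
            if length == END_OF_SECTOR:
                break                     # rest of this sector is unused
            yield sector[pos + 1:pos + 1 + length]
            pos += 1 + length


def build_test_image(records_per_sector):
    """Build a tiny in-memory file image for the demo (hypothetical helper)."""
    image = bytearray(HEADER_SIZE)
    for records in records_per_sector:
        sector = bytearray()
        for rec in records:
            sector += bytes([len(rec)]) + rec
        sector.append(END_OF_SECTOR)
        sector += bytes(SECTOR_SIZE - len(sector))   # pad to a full sector
        image += sector
    return io.BytesIO(bytes(image))
```

Because the reader is a generator, the caller can hand one record at a time to the TI and never needs more than a sector-sized buffer, which is the property that matters with 8K of RAM.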

 

Now that I am writing this, I could actually also implement file support using the external RAM (which is shared with the TI). The RAM chip has 512K bytes, and half of that is for the TI expanded memory. The remaining 256K is currently used for the DSR code and the page table, leaving most of the 256K unused. The MCU can access this memory. I have never had an actual PEB or used TI with 5.25 inch disks, but I recall disk images being 170K in size, so surely no legacy TI file would be bigger than that - which would enable preloading... But I think it would be cool to enable things that could not be done before, such as streaming data from large files to TI's memory, enabling long audio (or video) playback for example, so having support for files larger than 170k would be interesting. The MCU can easily do this type of streaming in the background, with zero CPU impact to the TMS9900: the data just appears in its memory.


Some random ramblings...

 

I start with my conclusion, which is an obvious one: CPLDs (Complex Programmable Logic Devices) are not the same as FPGAs (Field Programmable Gate Arrays).

 

I have been debugging memory accesses from the processors to the memory of the ET-PEB. At the center of my design is the CPLD chip, which performs many functions, chief among them being memory arbitration. I've now spent many more hours than I anticipated debugging the CPLD design (probably 10 - 20 hours by now) before I came up with a test setup that vividly shows on the logic analyser screen what is going on when problems occur. And what happens is that the behaviour is simply not what my VHDL code or simulation says. Hence the debugging has been taking a long time, but at least now I have a way to see what goes wrong when things do not work. That in turn means I will be diving even deeper into the rabbit hole - into the specifics of the CPLD chip I am using.

 

To go back in time a little, my feature complete design initially took about 139 macrocells out of the 144 available. I knew from the beginning that I would be pushing the logic chip to near maximum usage, so that is not surprising. There were many signal routing problems, which are typical when a programmable logic chip becomes too full. With careful arrangement I was able to make a version of the logic design that appeared to work - but only mostly. Since then I have made two bigger design changes. First I implemented a clock divider, to run the logic at half the speed. At 25MHz that is still fast enough to do what I need to get done. That made the problems much less frequent, but the issues did not completely disappear. Then I came up with a much more clever way to arrange a portion of the logic, which brought the macrocell usage down to 128 and also eased the routing issues.

 

After those changes the TMS9900 side of the memory interface seems completely stable. I ran the AMS memory test 12 consecutive times without any failures (each run takes a while and I ran out of patience to run more tests), so the paging over the 256K memory range works as designed. But the MCU side access to the memory just did not work fully, and now I was finally able to observe the root cause on the logic analyser: a portion of the signals function as designed, while one of the key signals (the address multiplexer control) works correctly 99.9% of the time from the MCU side but mysteriously fails the rest of the time. It is strange, because it is a very simple signal and other activities triggered from the same logic state do work.

 

It goes like this for memory writes: when the MCU's ARM core wants to access the TMS9900's memory, it writes a sequence of four bytes, containing the address, data and desired activity (write). These writes occur over an 8-bit data bus to the CPLD. After the last write the memory controller marks a write request pending. If the bus controller of the RAM is idle, it will start to serve the write request and output a bunch of signals, including address, data and control signals, to the RAM. These signals are output as the bus controller steps through a sequence of states. Now the weird thing is that at a certain state two signals are set as the logic transitions to the next state. Out of these signals one is output properly, while the other is not. This would sound like a timing problem. But the internal logic of the CPLD can run at over 100MHz (something like 178MHz), so at 25MHz one would think there would be plenty of time for things to settle, even if generating the signal would mean logic propagation through multiple macrocells...
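
The write sequence can be modelled roughly as below. This is only a Python sketch of the idea: the post does not give the byte layout, so the packing (a write flag plus the top address bits in the first byte, then two address bytes, then the data byte) and all names are assumptions made for illustration.

```python
CMD_WRITE = 0x80          # hypothetical flag bit marking a write request
RAM_SIZE = 512 * 1024     # the shared RAM chip holds 512K bytes


def pack_write_request(address, data):
    """Pack a 19-bit RAM address and one data byte into four bus writes."""
    if not 0 <= address < RAM_SIZE:
        raise ValueError("address outside the 512K RAM")
    if not 0 <= data <= 0xFF:
        raise ValueError("data must fit in one byte")
    return bytes([
        CMD_WRITE | (address >> 16),   # write flag + address bits 18..16
        (address >> 8) & 0xFF,         # address bits 15..8
        address & 0xFF,                # address bits 7..0
        data,                          # byte to store
    ])


class MemoryControllerModel:
    """Toy model: a request becomes pending only after the fourth byte."""

    def __init__(self):
        self.ram = bytearray(RAM_SIZE)
        self.received = []
        self.pending = None

    def bus_write(self, byte):
        """One write over the 8-bit bus from the MCU."""
        self.received.append(byte)
        if len(self.received) == 4:
            cmd, mid, low, data = self.received
            self.received = []
            if cmd & CMD_WRITE:
                address = ((cmd & 0x07) << 16) | (mid << 8) | low
                self.pending = (address, data)

    def serve_if_idle(self):
        """Stand-in for the bus controller stepping through its states."""
        if self.pending is not None:
            address, data = self.pending
            self.ram[address] = data
            self.pending = None
```

The model deliberately separates "request pending" from "request served", since in the real design the bus controller may be busy with a TMS9900 access when the fourth byte arrives.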

 

Which brings me back to the FPGA vs CPLD conclusion: I have never had this type of a problem with FPGAs, with a design as simple as this one is. I am a bit torn between the personal challenge of debugging this to the end - and just redesigning the board with an actual FPGA with plenty of capacity for this design and other features. The downside of the FPGA board design is that it requires more support chips to do level conversion between 5V and 3.3V, and FPGAs also require multiple voltage regulators. Those combined mean that a four layer board becomes a necessity. I was trying to avoid that with a simple CPLD design. It seems now though that the time saved in a simpler board design is lost in the lesser capability of the chip and synthesis tools...


Wow. It looks like you are at a level of debugging the CPLD to prove that the CPLD is not working according to its specifications.

Could it be a compatibility issue between the CPLD and the chips it interacts with? (I am no expert to name the correct ones)

In the end you have to decide which technology you want to go forward with.


I think I have now found at least one major source of problems: my clock timing constraint format was somehow not good for the tools. After fixing that and running timing analysis again, it gives me a maximum clock speed of 39.37MHz for the design, so running it at 50MHz is outside specifications.
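
The arithmetic behind that statement, as a small Python sketch (the helper names are mine): a maximum clock of 39.37MHz corresponds to a minimum period of about 25.4ns, while a 50MHz clock only allows 20ns per cycle.

```python
def min_period_ns(fmax_mhz):
    """Shortest clock period the design tolerates, in nanoseconds."""
    return 1000.0 / fmax_mhz


def meets_timing(clock_mhz, fmax_mhz):
    """True when the chosen clock is no faster than the reported maximum."""
    return clock_mhz <= fmax_mhz


FMAX_MHZ = 39.37   # figure reported by the timing analysis

# Slack is the clock period minus the minimum period; negative means too fast.
for clock in (50.0, 33.0, 27.0):
    period = 1000.0 / clock
    slack = period - min_period_ns(FMAX_MHZ)
    verdict = "ok" if meets_timing(clock, FMAX_MHZ) else "too fast"
    print(f"{clock:5.1f} MHz: period {period:5.2f} ns, slack {slack:+6.2f} ns, {verdict}")
```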

 

I have built another board, but I have not soldered in all the test connectors, so I have not been able to run it against the logic analyser. So far that board has behaved pretty much identically at the 50MHz clock, despite having a faster CPLD chip on board. I have also tested that board running at 27MHz, and at the lower clock speed I get different errors. But I did this 27MHz test very quickly yesterday, without measurements, before finding the timing constraint error.

 

Looking at the generated equations I also found out that the clock divider works differently than I thought; in practice it does little to help. I need to get back to this tomorrow with a fresher mind. I don't have 3.3V oscillators at frequencies other than 50MHz (the 27MHz oscillator is a 5V model, although it seems to work at 3.3V, but who knows). So I am thinking of hooking up one of my FPGA boards and using the clock synthesiser on the FPGA to output whatever clock I want to the CPLD, removing the local oscillator completely. Something like this cannot be done with a CPLD. Once I get that wired up I could try running the system at around 33MHz. Hopefully tuning the design at a lower clock speed will yield the expected results and let me finally start working more on the software side, writing more meaningful code than just specific test tools.

 

The good news is that finally I can stop looking at the VHDL code - tweaking it will not bring me the performance I need in this design. The other good news is that the second board I built earlier this week was a pretty fast build, taking just over 2 hours to assemble. With two boards I can pretty much exclude specific chip errors, so one less thing to suspect.

Edited by speccery

FINALLY - almost there! After quite a bit of trial and error testing, I made a few discoveries, and now have a nearly stable version. I say a nearly stable version as I need to do more testing.

 

One of the problems I have had is that this has been very hard to debug, as making changes hasn't kept me on a steady path. It has been unclear what has caused the problems, and it has been hard to do systematic debugging. I now tried to eliminate as many variables as possible and go from there. So I made a test setup where

  • the CPLD design does not support accesses from the TI-99/4A but only from the ARM side, which has been more problematic
  • the regulator is unmounted and the board is driven from a bench power supply at 3.2V - it consumes about 110mA
  • the faster 7ns CPLD is used
  • the clock speed is lowered to 27MHz

After some debugging I was able to make this entirely stable; memory tests from the ARM side were working fine. I was surprised that it did not work straight away, but I fixed this by adding an intermediate state to allow more time for the address lines to stabilise.

Then I started to add back commented out VHDL code. The first step was to bring in the address multiplexer. That proved to be surprisingly difficult. A very simple 1 line change to the source code resulted in completely unpredictable behaviour - but this time I had the logic analyser connected in a way that I saw what was happening. The result was that the state machine went completely haywire: it was writing to RAM continuously on alternating clock cycles, and my debug signals clearly showed that the state machine was not staying on the designed path. This was despite the design meeting the timing constraint with ease. So clearly it is very easy to go from a working version to a non-working version by changing nothing substantial.

 

With this in mind I added many more states to the state machine, so that every state does very little work. The logic I had in mind was that this way it is easier for the logic synthesis to output signals, such as WE# for RAM, as that is only output on a few states, rather than outputting the signal as a more complex equation. This approach worked, and for the first time the system became pretty much completely stable.
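
The "many small states" idea can be sketched as a behavioural model in Python. This is not the VHDL from the project, and the state names are invented for the example. The point is that each output, such as WE#, is decoded from the current state alone, so its equation stays trivial.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    SET_ADDRESS = auto()   # drive the address lines, nothing else
    SETTLE = auto()        # extra state so the address can stabilise
    ASSERT_WE = auto()     # the only state in which WE# is low
    RELEASE_WE = auto()
    DONE = auto()


# Every non-idle state has exactly one successor.
NEXT_STATE = {
    State.SET_ADDRESS: State.SETTLE,
    State.SETTLE: State.ASSERT_WE,
    State.ASSERT_WE: State.RELEASE_WE,
    State.RELEASE_WE: State.DONE,
    State.DONE: State.IDLE,
}


def step(state, write_pending):
    """Advance the state machine by one clock cycle."""
    if state is State.IDLE:
        return State.SET_ADDRESS if write_pending else State.IDLE
    return NEXT_STATE[state]


def we_n(state):
    """WE# is active low and depends on the current state only."""
    return 0 if state is State.ASSERT_WE else 1


def run_write_cycle():
    """Return the WE# level seen in each clock cycle of one write."""
    state, trace, pending = State.IDLE, [], True
    while True:
        state = step(state, pending)
        pending = False               # the request is consumed on the first step
        trace.append(we_n(state))
        if state is State.IDLE:
            return trace
```

In this model WE# goes low for exactly one cycle per write, and no output depends on more than the state register.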

 

After this successful test I applied the same logic design and ARM firmware to my other board, with the lower speed CPLD, 50MHz crystal and built in regulator. That completely works from the ARM side, but not 100% from the TI side. However, this is to be expected, as the max clock frequency for this design using the faster logic chip is 47MHz, so with the slower speed chip especially I am running outside specs. I have some more oscillators on order (for 24MHz); once I get them I think I will be able to have reliable operation on the slower speed CPLD as well.


I'm just venturing into the CPLD world for the Dragon's Lair cart... the mappings I need are simple but there's a lot to learn. I have been reading your descriptions as a sort of warning for the stuff I should watch out for. ;)

Altera's Quartus II (if you're using that CPLD brand/family) makes it easy to scoop up a ton of logic and it doesn't get really hard until you're trying to track down errant timing issues in sequential logic - still learning VHDL :-o , but it's not so bad on the combinational logic side.

I got most of my gremlins tracked down this weekend on my PEB I/O + Hyper-AMS (modified Thierry's design) ...



 

 

Once the design is right (schematics + parts) doing another board layout is just some additional work, so this would be interesting. Having the speech synthesiser in there simultaneously would be a lot harder. Perhaps the ET-PEB board could be on the bottom, connecting to the side port, and there could be a stacked new board with the original TMS5220 chip and the two TMS6100 ROMs on top.

I did bookmark this one ages ago for the code to emulate the non-existent TMS6100s https://www.waitingforfriday.com/?p=30implementation appears to work for TI's handling with small modifications. Maybe one day I'll get that incorporated into a new MB just like the TI-99/8 did.

Just wanted to make you aware of the link/code.



 

 

Thanks a lot for sharing the information - another interesting project to look at!


It's been a while since my last update. I've been busy and traveling, but now back and I've had a little time to work on the ET-PEB.

I finally found out the source of my problems with the CPLD. The Xilinx synthesis tool XST has a bunch of algorithms it uses to synthesise certain language constructs. In my case I've been a little puzzled about why my state machine sometimes has its states represented in compact representation (i.e. if the state machine has up to 16 states the state variable has 4 bits) and sometimes in one-hot representation (as I learned it is called, see this Xilinx document for more information). In the one-hot representation the state vector for 16 states has 16 bits, one of which is set and the rest are zero. I was thinking this is a cool way to represent states, and probably yields good results with CPLD chips. CPLDs have way less routing capacity than FPGAs.
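
The difference between the two encodings is easy to show with a few lines of Python (a plain illustration with helper names of my own, not anything produced by XST):

```python
def compact_width(num_states):
    """Bits needed for a binary (compact) state register."""
    return max(1, (num_states - 1).bit_length())


def compact_code(index, num_states):
    """Binary code of state number `index`, as a bit string."""
    return format(index, f"0{compact_width(num_states)}b")


def one_hot_code(index, num_states):
    """One-hot code of state number `index`: a single bit set."""
    return format(1 << index, f"0{num_states}b")


def illegal_codes(num_states, one_hot):
    """Count register values that correspond to no defined state."""
    width = num_states if one_hot else compact_width(num_states)
    return 2 ** width - num_states


for i in (0, 5, 15):
    print(i, compact_code(i, 16), one_hot_code(i, 16))
```

With 16 states the compact register has 4 flip-flops and every bit pattern is a valid state, while the one-hot register has 16 flip-flops and 65520 patterns that match no state at all. On a macrocell-limited CPLD the extra 12 flip-flops alone are a noticeable cost.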

 

My problem all along has been instability. I make a small change, and then the whole design goes haywire. What I discovered this morning (it's winter break here :) ) is that whenever the synthesis tool decides to create one-hot representation, the design just does not work - pretty much at all. After figuring this out and becoming aware of these parameters, I edited the XST parameters so that it always uses the "compact mode" for finite state machine representation. Actually the document I mentioned earlier does say that the one-hot representation is not recommended for CPLD chips. Why the synthesis still creates designs with that algorithm every now and then - I don't know. And why it seemingly arbitrarily changes algorithms is another open question - well, I guess it depends on the designed circuit details. The really annoying thing is that the synthesis process does not create any error messages when it creates a version of the logic circuit that does not work. Of course if it did that, I would have discovered the problems earlier... Anyway, the good news is that with "compact mode" the messages are also more meaningful: it will now for example tell me when it decides to buffer a signal so that it can fit into a functional block.

 

A quick background: The CPLD contains 8 functional blocks or FBs, the FBs contain macrocells doing the actual logic. A macrocell has only so many input variables it can use for logic equation computation, and often with a more complex design it cannot fit the design as-is. In these cases it will break a logic equation between multiple macrocells and that requires buffering. Or that's what I think it is doing.

 

Now that I got back to a track where gravity again applies, I could make systematic changes and optimise the logic design. I was able to remove quite a few intermediate variables I had previously used to make the CPLD fitting process successful. This work resulted in a bunch of simplifications, which is nice as the design now uses fewer macrocells (now at 115 macrocells or 80% utilisation). Another great result is that the simpler design runs a whole lot faster, with timing analysis now indicating it would run at nearly 80MHz! I have two prototype boards I use for testing, one of them running at the original design frequency of 50MHz and the other at 24MHz. They now behave identically.

 

But the best part is that this thing now finally started to work: memory tests from both the ARM micro controller and TI-99/4A side now pass, every time. On the TI side my sole test program has been the SAMS memory tester v4.0. It now happily marches through the 256K of expanded memory.

 

I still have something to sort out though - concurrent accesses sometimes do not work. More specifically, when the ARM is writing to memory while the TI side is both reading and writing to memory, it sometimes (but quite rarely) messes up the TI memory accesses. Reading from the ARM side concurrently does not seem to have any impact on the TI side. I think I just have to modify the access windows slightly to solve this remaining issue. Here the problem starts to be that despite the CPLD now having more available macrocells, it still has some very congested parts. It seems that any modification requiring one more input signal in a certain part of the design results in the design not fitting into the CPLD. At least I now get that error every time...

 

I should have some more time this week to work on the design, so I am hopeful that something actually usable starts to get ready!


Continued to work on the project - on multiple fronts: board design, CPLD design, ARM code and TI DSR code. I have designed a new version of the PCB; it's a 4 layer design that I sent to manufacturing. The form factor is also nicer. Due to the Chinese new year it will probably be a while before any boards show up. On this one I tried to pay a bit more attention to signal quality and the power supply side, so assuming the board turns out as intended it will allow me to see if the problems I have experienced are partially due to poor PCB design. If this board turns out fine, it will pave the way for my own FPGA boards, as those you cannot design without going to four layers.

 

I've tweaked the CPLD design a bit more, simplifying it more, I am now at 111 macrocells. That now starts to leave a good amount of headroom, although the busy areas of the design remain busy, and nearly all pins are used up, so I'm not sure if the extra macrocells can be used for something meaningful before disturbing the rest of the design.

 

The most interesting progress has been on the software front, where I started to implement disk support. I modified the very simple DSR routine I originally wrote for the TMS99105 based TI-99/4A clone to run on the real TI-99/4A iron. All the heavy lifting is actually done by the ARM micro controller; the DSR merely passes the disk access commands to the ARM, which then accesses the SD card, handles the FAT file system and the TIFILES headers and pushes the data to TI's memory map. For this application I decided to split the 8K disk DSR area at >4000 into two 4K blocks. The first 4K is code for the TMS9900 (only 406 bytes used at the moment) and the second 4K is used as an interprocessor communication area between the TMS9900 and the ARM. The TI DSR writes command requests into the RAM area, and the ARM occasionally polls the memory and then fulfils the requests by reading or writing to the SD card. This marks the first time the two processors are working together for real, although still under debug code. I had to seriously slow down the ARM code and add synchronisation, as even my slow implementation was pushing data to the memory of the TMS9900 so fast that it could not keep up, even if the only task the TMS9900 was doing was just copying the loaded data to VDP memory (where it needs to go on the TI-99/4A I/O architecture - often it then gets copied back to main memory).
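
A minimal model of that request/poll handshake, written in Python purely as a sketch. The mailbox layout (status byte, opcode, name length, name) and the status codes are assumptions for illustration and not the actual protocol; the opcode value 5 is the standard PAB opcode for LOAD, but whether the real mailbox reuses PAB opcodes is also an assumption.

```python
DSR_BASE = 0x4000        # start of the 8K DSR window on the TI side
MAILBOX_BASE = 0x5000    # second 4K block, used as the communication area
STATUS_IDLE, STATUS_REQUEST, STATUS_DONE = 0, 1, 2   # hypothetical codes
OP_LOAD = 5              # PAB opcode for LOAD


class SharedRam:
    """Stand-in for the RAM that both processors can reach."""

    def __init__(self):
        self.mem = bytearray(0x2000)      # the 8K window at >4000

    def read(self, addr):
        return self.mem[addr - DSR_BASE]

    def write(self, addr, value):
        self.mem[addr - DSR_BASE] = value & 0xFF


def dsr_post_request(ram, opcode, name):
    """TI side: fill in the command first, set the request flag last."""
    ram.write(MAILBOX_BASE + 1, opcode)
    ram.write(MAILBOX_BASE + 2, len(name))
    for i, ch in enumerate(name.encode("ascii")):
        ram.write(MAILBOX_BASE + 3 + i, ch)
    ram.write(MAILBOX_BASE, STATUS_REQUEST)


def arm_poll(ram, handler):
    """ARM side: called periodically; serves at most one request per call."""
    if ram.read(MAILBOX_BASE) != STATUS_REQUEST:
        return False
    opcode = ram.read(MAILBOX_BASE + 1)
    length = ram.read(MAILBOX_BASE + 2)
    raw = bytes(ram.read(MAILBOX_BASE + 3 + i) for i in range(length))
    handler(opcode, raw.decode("ascii"))
    ram.write(MAILBOX_BASE, STATUS_DONE)  # tells the DSR the work is finished
    return True
```

Writing the status byte last on the TI side, and only after the work is finished on the ARM side, is the kind of synchronisation that keeps one processor from reading a half-written request or half-delivered data.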

 

The good news, actually great news I think, is that I have been now able to both load and save Basic programs from the SD card! So I've been running XBDEMO and XBGAME on the real iron, under extended Basic and its 32K expanded memory provided by the ET-PEB. By the way - I still don't know how one interacts with anything on the XBGAME, it tells to press E and F and then something should happen - I guess. The disk support is still limited to the simple load and save operations, the more sophisticated operations such as record access will need more time, as on the micro controller side I don't have a lot of memory to work with, so I will need to refactor the software a bit to make it fit.

 

I also keep in mind FPGA alternatives, I have done test synthesis of the ET-PEB logic design for the Xilinx Spartan 6 LX9 FPGA chip and for the small Lattice ICE40HX1K FPGA. The latter one is very small FPGA, and this simple design takes about 20% of its logic capacity (this FPGA has 8K of SRAM on board which is not used in my test synthesis). Lattice also has ICE40HX4K FPGA which has about three times the logic capacity of the smaller chip. The cool part is that both of these Lattice FPGAs cost less than the CPLD I have been using, although they are not 5V tolerant like the CPLD and thus require some buffer chips. But for the ET-PEB design the end result would probably cost about the same, and they would provide much more space for interesting extra features. Yes - the feature creep is always there...


Over the weekend I turned my attention from the ET-PEB hardware debugging to software. The hardware will need more testing, but I am waiting for the new PCBs to arrive before continuing extensive hardware testing - there is plenty of work on the software side too. Also the hardware has been getting stable enough for software development.

By the way, one small learning for me was that it would be beneficial to keep the dimensions of the PCB under 10cm per side - the new PCB is now over 10cm long while a lot narrower, but going over 10cm in either direction means a bit more cost. Well, this is still prototyping - and learning along the way. I actually quite enjoyed doing the board layout - so I will probably do yet another one after debugging/testing the new boards. I may switch to micro SD cards to reduce board size. I will also remove the debug headers and maybe even the JTAG header for the CPLD, as my plan is to implement the CPLD programmer in the MCU. That will make the board much more versatile and also reduce size.
On the software side I had some time during the weekend to work on the DSR and microcontroller code to build SD support. I have had some legacy code from my FPGA project, where the PC acted as a file server for the TI-99/4A. That code base was influenced by Tursi's great work on Classic99 and actually reused a portion of that code. But now, running the SD card FAT/FAT32 filesystem on the microcontroller side I had to rewrite pretty much all the code to support a different memory model, since the microcontroller has very limited RAM and Flash ROM. After quite a few attempts and bug fixes I did get it working. My test case has been running the Editor/Assembler cartridge (off FinalGROM99) and testing loading and saving editor files in both variable and fixed record file formats, as well as compiling Matthew's YING assembler program using the ET-PEB as the memory expansion and mass storage device.
I got all of this to the point where I have been able to make my test case work, in other words E/A works: it is able to load the editor program to expanded memory, as well as the assembler. The assembler was able to compile and produce working binary files (i.e. tagged TI object code), as well as listing files, so this is getting to a nice level of functionality. I did get bitten (again) by the fact that the E/A cartridge does not care to close the source files before using the same "file handle", i.e. VDP memory address, again for another file. I fixed that by just special casing this behaviour - if an already open file handle is being used again for another open command, the micro controller will automatically close the previous file before opening the new file.
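
The special case can be sketched like this in Python. The class and field names are invented for the example and the real firmware will look different; the sketch only shows the rule "an open on a handle that is already in use closes the old file first".

```python
class FileTable:
    """Open files keyed by the VDP address the TI uses as a file handle."""

    def __init__(self):
        self.open_files = {}
        self.log = []          # records actions, for illustration only

    def open(self, vdp_address, name):
        if vdp_address in self.open_files:
            # The handle is reused without a close: close the old file first.
            self.close(vdp_address)
        self.open_files[vdp_address] = name
        self.log.append(("open", name))

    def close(self, vdp_address):
        name = self.open_files.pop(vdp_address, None)
        if name is not None:
            self.log.append(("close", name))


table = FileTable()
table.open(0x1000, "DSK1.SRC1")
table.open(0x1000, "DSK1.SRC2")   # same handle, no close in between
print(table.log)
```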
To summarise, the ET-PEB now supports
- Saving and loading DSR operations (tested with saving and loading short basic programs and loading of binary files)
- Open/Close/Read/Write/Restore DSR operations on record based files, such as the ones produced and consumed by the E/A cartridge. I've tested both fixed and variable length records to the extent used by E/A.
- Mapping of DSK1/DSK2/DSK3 to separate directories on the SD card. Files are stored with TIFILES 128-byte headers
- 32K memory expansion (standard non-paged memory expansion)
- 256K SAMS style paged memory (only tested by running the AMSTEST4 program many times, which repeatedly succeeds)
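
As an illustration of the drive mapping, a small Python sketch. The directory names and the function names are assumptions; the post only says that each DSKn maps to its own directory and that files carry a 128-byte TIFILES header.

```python
from pathlib import PurePosixPath

# Hypothetical directory names on the SD card, one per emulated drive.
DRIVE_DIRS = {"DSK1": "DSK1", "DSK2": "DSK2", "DSK3": "DSK3"}
TIFILES_HEADER_SIZE = 128
SECTOR_SIZE = 256


def sd_path(ti_name, root="/"):
    """Translate a TI 'device.filename' string into a path on the SD card."""
    device, _, filename = ti_name.partition(".")
    if device not in DRIVE_DIRS or not filename:
        raise ValueError(f"unsupported name: {ti_name!r}")
    return PurePosixPath(root) / DRIVE_DIRS[device] / filename


def payload_offset(sector_index):
    """Byte offset of a 256-byte sector inside the stored file."""
    return TIFILES_HEADER_SIZE + SECTOR_SIZE * sector_index
```

In this sketch a bare "DSK1." is rejected, which matches the current state where the directory itself cannot be opened yet.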
There are still some lacking features, for instance it is not yet possible to open DSK1. to enumerate the files on the disk. From where I am at the moment adding that support is not hard.
One challenge I am facing now is that the 32K Flash memory of the microcontroller is pretty much used up; I have a few hundred bytes left. My code has a lot of debug print statements, and removing those will save perhaps one or two kilobytes of code space, but it will be a squeeze. I will probably move over to the LPC1347 chip, which is pin compatible but has 64K of Flash, to give some breathing room. Despite being pin compatible this chip has a somewhat different set of peripherals on board, meaning that for example the general purpose I/O control registers and pin configuration in general need to be set up differently. So there is a bit of porting work to be done to support this chip, but not much.
Edited by speccery

 

 

My test case has been running the Editor/Assembler cartridge (off FinalGROM99) and testing loading and saving editor files in both variable and fixed record file formats, as well as compiling Matthew's YING assembler program using the ET-PEB as the memory expansion and mass storage device.

 

This is actually a pretty good test - the E/A system does a few non-intuitive things and for a long time Classic99's system wasn't able to run the assembler successfully even though many other things appeared to work. :)


It's fascinating to follow the blow-by-blow synopsis of your ET-PEB project coming together. What you're doing and your level of understanding of the processes is so far above my head, but you put it in layman's terms quite well.

-Ed

Edited by Ed in SoDak

Great work, Erik! All your TI work is really impressive and both inspiring and anti-inspiring at the same time :)

I wound up here because I've been thinking about cloning the PEB and some common peripherals with an ICE40 and LPC4337 to bridge past and present and take my TI knowledge to the next level. Now you've gone and done it first and at best I'll only ever appear to have copied your work :-D

 

But when I do get a little further along with my project and if I have trouble writing or debugging my DSRs, maybe you would be gracious enough to answer some of my questions. It is really wonderful to see the progress you are making, but then again after your FPGA softcpu this should be quite easy for you!



I imagine a literal tiny PEB would be something attractive. A small empty box with 8 or so cartridge ports that plugs into the side expansion port. The FDC, 32K, RS232, SAMS, etc. are nothing more than cartridge sized PCBs that plug into each slot. To the TI it would appear as a normal sized PEB but "miniaturized" - that would be interesting indeed. A less expensive mini box, customizable with mini cards as per the end user's need. ??? But I digress - I think there's a dreamers topic here someplace for such musings.

Edited by Sinphaltimus
