Erik's Tiny PEB


66 replies to this topic

#51 speccery OFFLINE  

speccery

    Moonsweeper

  • Topic Starter
  • 286 posts

Posted Mon Feb 5, 2018 5:28 AM

If you need a program that uses AMS I can provide one for testing. It uses under 256k so it should be fine.

 

 

Thanks adamantyr, that would be great! I much prefer known working software to my own test code, so that I know what I am actually debugging... If you have something that uses under 256k, that's perfect. 

 

I haven't written the disk DSR support for this system yet. This limits the kind of software I can load currently - raw binary images work best. My plan is to port my existing FPGA system DSR setup to the new ET-PEB platform. There is a bit of work involved, since the FPGA system uses a PC as a file server, and the MCU does not have those kinds of resources at its disposal - I cannot load large files to RAM for preprocessing as I don't have that kind of RAM. However, now that I think about this, as an intermediate step it would probably be useful to do a version which uses pretty much the same PC server software that I use with the FPGA system. That would give me a higher level of confidence in the hardware platform before writing the full disk support with the MCU.



#52 Sinphaltimus OFFLINE  

Sinphaltimus

    River Patroller

  • 2,293 posts
  • Distracted at the Keyboard
  • Location:Poconos, PA

Posted Mon Feb 5, 2018 6:29 AM

Just a layman's question here in regards to "I cannot load large files to RAM for preprocessing".

 

You mention using FAT filesystem so I'm assuming FAT16 and not FAT32.

Could you partition, say, a 4GB SD card into 2GB partitions - one for use as you originally intended and the other partition (formatted whichever way works best) for loading large files for preprocessing, like a virtual RAM drive, as a sort of staging area?

 



#53 speccery OFFLINE  

speccery

    Moonsweeper

  • Topic Starter
  • 286 posts

Posted Mon Feb 5, 2018 9:21 AM

You mention using FAT filesystem so I'm assuming FAT16 and not FAT32.

 

It is actually both FAT32 and FAT16, courtesy of the library I am using. Thus large partitions and files are supported.

 

My comment was not very precise. My disk support for the FPGA system is based on the ideas as implemented by Classic99 and borrows a few functions from there. That implementation parses the record-based TI-99/4A files so that the entire file is preloaded to RAM, and then records are returned to the TI-99/4A as requested by the running application. This approach does not work on the ET-PEB, as the MCU only has 8K of RAM. Thus I need to rewrite that section to deal with files in a piecemeal fashion, which is in fact how the original TI-99/4A code must have done it.
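To illustrate the piecemeal idea, here is a minimal Python sketch. It is not the ET-PEB firmware (which runs on the MCU), and it makes simplifying assumptions: a hypothetical headerless image, 256-byte sectors, and fixed-length records that never span a sector boundary. The function name and layout are made up for illustration. The point is only that a single sector-sized buffer is enough to serve any record on request.

```python
import io

SECTOR_SIZE = 256  # assumed sector size for this sketch


def build_image(records, record_length):
    """Pack fixed-length records into sectors, padding each sector with zeros."""
    per_sector = SECTOR_SIZE // record_length
    out = bytearray()
    for i in range(0, len(records), per_sector):
        sector = b"".join(records[i:i + per_sector])
        out += sector.ljust(SECTOR_SIZE, b"\x00")
    return bytes(out)


def read_fixed_record(f, record_number, record_length):
    """Return one record while holding only a single sector in memory."""
    per_sector = SECTOR_SIZE // record_length
    sector = record_number // per_sector
    offset = (record_number % per_sector) * record_length
    f.seek(sector * SECTOR_SIZE)
    buf = f.read(SECTOR_SIZE)  # the only buffer needed, regardless of file size
    return buf[offset:offset + record_length]


if __name__ == "__main__":
    recs = [bytes([i]) * 80 for i in range(7)]
    image = io.BytesIO(build_image(recs, 80))
    print(read_fixed_record(image, 4, 80) == recs[4])
```

The memory needed stays at one sector no matter how large the file is, which is what makes this shape of code plausible on an MCU with 8K of RAM.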

 

Now that I am writing this, I could actually also implement file support using the external RAM (which is shared with the TI). The RAM chip has 512K bytes, and half of that is for the TI expanded memory. The remaining 256K is currently used for the DSR code and the page table, leaving most of the 256K unused. The MCU can access this memory. I have never had an actual PEB or used TI with 5.25 inch disks, but I recall disk images being 170K in size, so surely no legacy TI file would be bigger than that - which would enable preloading... But I think it would be cool to enable things that could not be done before, such as streaming data from large files to TI's memory, enabling long audio (or video) playback for example, so having support for files larger than 170k would be interesting. The MCU can easily do this type of streaming in the background, with zero CPU impact to the TMS9900: the data just appears in its memory.



#54 Asmusr OFFLINE  

Asmusr

    River Patroller

  • 2,580 posts
  • Location:Denmark

Posted Mon Feb 5, 2018 10:51 PM

 

Thanks adamantyr, that would be great! I much prefer known working software to my own test code, so that I know what I am actually debugging... If you have something that uses under 256k, that's perfect. 

 

 

Dungeons of Asgard is also using 128K SAMS.



#55 adamantyr OFFLINE  

adamantyr

    Stargunner

  • 1,263 posts

Posted Tue Feb 6, 2018 1:28 AM

Oh yes, that's a good one to use as well! Mine is incomplete and I found bugs. :P

#56 speccery OFFLINE  

speccery

    Moonsweeper

  • Topic Starter
  • 286 posts

Posted Thu Feb 8, 2018 9:41 AM

Some random ramblings...

 

I'll start with my conclusion, obvious as it is: CPLDs (Complex Programmable Logic Devices) are not the same as FPGAs (Field Programmable Gate Arrays).

 

I have been debugging memory accesses from the processors to the memory of the ET-PEB. At the center of my design is the CPLD chip, which performs many functions, chief among them memory arbitration. I've now spent many more hours than I anticipated debugging the CPLD design (probably 10 - 20 hours by now), before I came up with a test setup that vividly shows on the logic analyser screen what is going on when problems occur. And what happens is that the behaviour is simply not what my VHDL code or simulation says. Hence the debugging has been taking a long time, but at least now I have a way to see what goes wrong when things do not work. That in turn means I will be diving even deeper into the rabbit hole - into the specifics of the CPLD chip I am using.

 

To go back in time a little, my feature complete design initially took about 139 macrocells out of the 144 available ones. I knew from the beginning that I would be pushing the logic chip to near maximum usage, so that is not surprising. There were many signal routing problems, which are typical when a programmable logic chip becomes too full. With careful arrangement I was able to make a version of the logic design that appeared to work - but only mostly. Since then I have made two bigger design changes. First I implemented a clock divider, to run the logic at half the speed. At 25MHz that is still fast enough to do what I need to get done. That did make the problems much less frequent, but the issues did not completely disappear. Then I came up with a much more clever way to arrange a portion of the logic, which brought the macrocell usage down to 128, and also eased routing issues.

 

After those changes the TMS9900 side of the memory interface seems completely stable. I ran the AMS memory test 12 consecutive times without any failures (each run takes a while and I ran out of patience to run more tests), so the paging over the 256K memory range works as designed. But the MCU side access to the memory just did not work fully, and now I was finally able to observe the root cause on the logic analyser: a portion of the signals function as designed, while one of the key signals (the address multiplexer control) works correctly 99.9% of the time from the MCU side but mysteriously fails the rest of the time. It is strange, because it is a very simple signal and other activities triggered from the same logic state do work.

 

It goes like this for memory writes: when the MCU's ARM core wants to access the TMS9900's memory, it writes a sequence of four bytes, containing the address, data and desired activity (write). These writes occur over an 8-bit data bus to the CPLD. After the last write the memory controller marks a write request pending. If the bus controller of the RAM is idle, it will start to serve the write request and output a bunch of signals, including address, data and control signals, to the RAM. These signals are output as the bus controller steps through a sequence of states. Now the weird thing is that at a certain state two signals are set as the logic transitions to the next state. Out of these signals one is output properly, while the other is not. This would sound like a timing problem. But the internal logic of the CPLD can run at over 100MHz (something like 178MHz), so at 25MHz one would think there would be plenty of time for things to settle, even if generating the signal would mean logic propagation through multiple macrocells...
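As a rough model of the four-byte request described above, here is a Python sketch. The byte layout is purely an assumption for illustration (the post only says that four bytes carry address, data and the write command): low address byte, middle address byte, a hypothetical write flag plus the top three address bits, and finally the data byte. The names `encode_write` and `RequestLatch` are made up.

```python
WRITE_FLAG = 0x80        # hypothetical command bit, not taken from the real design
RAM_SIZE = 512 * 1024    # the external RAM size mentioned in the thread


def encode_write(address, data):
    """Pack a write request into four bytes (assumed layout, see above)."""
    if not 0 <= address < RAM_SIZE:
        raise ValueError("address out of range")
    if not 0 <= data <= 0xFF:
        raise ValueError("data must fit in one byte")
    return bytes([
        address & 0xFF,
        (address >> 8) & 0xFF,
        WRITE_FLAG | ((address >> 16) & 0x07),
        data,
    ])


class RequestLatch:
    """Model of the receiving side: collect four bytes, then mark a write pending."""

    def __init__(self):
        self.received = []
        self.pending = None  # (address, data) once a full request has arrived

    def write_byte(self, value):
        self.received.append(value)
        if len(self.received) == 4:
            low, mid, high, data = self.received
            self.received = []
            if high & WRITE_FLAG:
                address = ((high & 0x07) << 16) | (mid << 8) | low
                self.pending = (address, data)


if __name__ == "__main__":
    latch = RequestLatch()
    for b in encode_write(0x5A123, 0x42):
        latch.write_byte(b)
    print(latch.pending)
```

In this model the request only becomes pending after the fourth byte, matching the description that the memory controller marks the write pending after the last write.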

 

Which brings me back to the FPGA vs CPLD conclusion: I have never had this type of a problem with FPGAs, with a design as simple as this one is. I am a bit torn between the personal challenge of debugging this to the end - and just redesigning the board with an actual FPGA with plenty of capacity for this design and other features. The downside of the FPGA board design is that it requires more support chips to do level conversion between 5V and 3.3V, and FPGAs also require multiple voltage regulators. Those combined mean that a four layer board becomes a necessity. I was trying to avoid that with a simple CPLD design. It seems now though that the time saved in a simpler board design is lost in the lesser capability of the chip and synthesis tools...



#57 kl99 OFFLINE  

kl99

    Dragonstomper

  • 720 posts
  • Location:Vienna, Austria

Posted Fri Feb 9, 2018 7:55 AM

Wow. It looks like you are at the level of debugging the CPLD to prove that the CPLD is not working according to its specifications.

Could it be a compatibility issue between the CPLD and the chips it interacts with? (I am no expert to name the correct ones)

In the end you have to decide which technology you want to go forward with.



#58 speccery OFFLINE  

speccery

    Moonsweeper

  • Topic Starter
  • 286 posts

Posted Fri Feb 9, 2018 3:04 PM

I think I have now found at least one major source of the problem, which is that my clock timing constraint format was somehow not good for the tools. After fixing that and running timing analysis again, it gives me a maximum clock speed of 39.37MHz for the design, so running it at 50MHz is outside specifications.
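The arithmetic behind that statement can be spelled out with a small Python sketch. The helper names are made up; the frequencies are the ones mentioned in this thread. A maximum clock of 39.37MHz means the slowest path needs roughly 25.4ns, while a 50MHz clock only provides 20ns per cycle.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(clock_mhz, fmax_mhz):
    """Positive when the clock period is longer than the slowest path needs."""
    return period_ns(clock_mhz) - period_ns(fmax_mhz)


if __name__ == "__main__":
    FMAX_MHZ = 39.37  # maximum clock reported by timing analysis
    for clock in (50.0, 33.0, 27.0):
        print(f"{clock:5.1f} MHz: period {period_ns(clock):6.2f} ns, "
              f"slack {slack_ns(clock, FMAX_MHZ):+6.2f} ns")
```

A negative slack at 50MHz is what "outside specifications" means here, while 33MHz and 27MHz both leave a positive margin according to the same analysis.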

 

I have built another board, but I have not soldered in all the test connectors, so I have not been able to run it against the logic analyser. So far that board has behaved pretty much identically with a 50MHz clock, despite it having a faster CPLD chip on board. I have also tested that board running at 27MHz, and at the lower clock speed I get different errors. But I did this 27MHz test very quickly yesterday, without measurements, before finding the timing constraint error.

 

Looking at the generated equations I also found out that the clock divider works differently than I thought; in practice it does little to help. I need to get back to this tomorrow with a fresher mind. I don't have 3.3V oscillators at frequencies other than 50MHz (the 27MHz oscillator is a 5V model, although it seems to work at 3.3V, but who knows). So I am thinking of hooking up one of my FPGA boards, using the clock synthesiser on the FPGA to output whatever clock I want to the CPLD, and completely removing the local oscillator. Something like this cannot be done with a CPLD. Once I get that wired up I could try running the system at around 33MHz. Hopefully tuning the design at a lower clock speed will yield the expected results and let me finally start working more on the software side, writing more meaningful code than just specific test tools.

 

The good news is that finally I can stop looking at the VHDL code - tweaking it will not bring me the performance I need in this design. The other good news is that the second board I built earlier this week was a pretty fast build, taking just over 2 hours to assemble. With two boards I can pretty much exclude specific chip errors, so one less thing to suspect.


Edited by speccery, Fri Feb 9, 2018 3:06 PM.


#59 speccery OFFLINE  

speccery

    Moonsweeper

  • Topic Starter
  • 286 posts

Posted Sun Feb 11, 2018 5:37 AM

FINALLY - almost there! After quite a bit of trial and error testing, I made a few discoveries, and now have a nearly stable version. I say a nearly stable version as I need to do more testing.

 

One of the problems I have had is that this has been very hard to debug, as making changes hasn't kept me on a steady path. It has been unclear what has caused the problems, and it has been hard to do systematic debugging. This time I tried to eliminate as many variables as possible and go from there. So I made a test setup where

  • the CPLD does not support accesses from the TI-99/4A, but only from the ARM side, which has been more problematic
  • the regulator is unmounted and the board is driven from a bench power supply at 3.2V - it consumes about 110mA
  • the faster 7ns CPLD is used
  • the clock speed is lowered to 27MHz

After some debugging I was able to make this entirely stable, with memory tests from the ARM side working fine. I was surprised that it did not work straight away, but fixed this by adding an intermediate state to allow more time for the address lines to stabilise.

Then I started to add back the commented out VHDL code. The first step was to bring in the address multiplexer. That proved to be surprisingly difficult. A very simple 1 line change to the source code resulted in completely unpredictable behaviour - but this time I had the logic analyser connected in a way that let me see what was happening. The result was that the state machine went completely haywire: it was writing to RAM continuously on alternating clock cycles, and my debug signals clearly showed that the state machine was not staying on the designed path. This despite the system meeting the timing constraint with ease. So clearly it is very easy to go from a working version to a non-working version by changing nothing substantial.

 

With this in mind I added many more states to the state machine, so that every state does very little work. The logic I had in mind was that this way it is easier for the logic synthesis to output signals, such as WE# for RAM, as that is only output on a few states, rather than outputting the signal as a more complex equation. This approach worked, and for the first time the system became pretty much completely stable.
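The idea of many small states can be modelled in a few lines of Python. This is only a behavioural sketch, not the VHDL from the project, and the state names are hypothetical. Each state does one small thing, and WE# becomes a trivial function of the current state (active low in exactly one state) rather than a more complex equation.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    DRIVE_ADDRESS = 1
    ADDRESS_SETTLE = 2   # extra state giving the address lines time to stabilise
    DRIVE_DATA = 3
    ASSERT_WE = 4
    RELEASE_WE = 5
    DONE = 6


# Straight-line sequence for a single write; every step changes very little.
NEXT_STATE = {
    State.IDLE: State.DRIVE_ADDRESS,
    State.DRIVE_ADDRESS: State.ADDRESS_SETTLE,
    State.ADDRESS_SETTLE: State.DRIVE_DATA,
    State.DRIVE_DATA: State.ASSERT_WE,
    State.ASSERT_WE: State.RELEASE_WE,
    State.RELEASE_WE: State.DONE,
}


def we_n(state):
    """WE# is active low and depends on the current state only."""
    return 0 if state is State.ASSERT_WE else 1


def run_write():
    """Step through one write cycle, recording (state name, WE#) per clock."""
    state = State.IDLE
    trace = []
    while True:
        trace.append((state.name, we_n(state)))
        if state is State.DONE:
            return trace
        state = NEXT_STATE[state]


if __name__ == "__main__":
    for name, we in run_write():
        print(f"{name:15s} WE#={we}")
```

The cost of this style is a longer cycle (seven clocks in this model), which is the trade-off against simpler per-signal logic.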

 

After this successful test I applied the same logic design and ARM firmware to my other board, with the lower speed CPLD, 50MHz crystal and built in regulator. That completely works from the ARM side, but not 100% from the TI side. However, this is to be expected, as the max clock frequency for this design using the faster logic chip is 47MHz, so with the slower speed chip especially I am running outside specs. I have some more oscillators on order (for 24MHz); once I get them I think I will be able to have reliable operation on the slower speed CPLD as well.



#60 Tursi OFFLINE  

Tursi

    River Patroller

  • 4,965 posts
  • HarmlessLion
  • Location:BUR

Posted Sun Feb 11, 2018 5:54 PM

I'm just venturing into the CPLD world for the Dragon's Lair cart... the mappings I need are simple but there's a lot to learn. I have been reading your descriptions as a sort of warning for the stuff I should watch out for. ;)



#61 helocast OFFLINE  

helocast

    Chopper Commander

  • 130 posts
  • Location:Amarillo, TX

Posted Sun Feb 11, 2018 10:27 PM

I'm just venturing into the CPLD world for the Dragon's Lair cart... the mappings I need are simple but there's a lot to learn. I have been reading your descriptions as a sort of warning for the stuff I should watch out for. ;)

Altera's Quartus II (if you're using that CPLD brand/family) makes it easy to scoop up a ton of logic and it doesn't get really hard until you're trying to track down errant timing issues in sequential logic - still learning VHDL :-o , but it's not so bad on the combinational logic side.

I got most of my gremlins tracked down this weekend on my PEB I/O + Hyper-AMS (modified Thierry's design) ...



#62 helocast OFFLINE  

helocast

    Chopper Commander

  • 130 posts
  • Location:Amarillo, TX

Posted Sun Feb 11, 2018 10:57 PM

 

 

Once the design is right (schematics + parts) doing another board layout is just some additional work, so this would be interesting. Having the speech synthesiser in there simultaneously would be a lot harder. Perhaps the ET-PEB board could be on the bottom, connecting to the side port, and there could be a stacked new board with the original TMS5220 chip and the two TMS6100 ROMs on top.

I did bookmark this one ages ago for the code to emulate the non-existent TMS6100s: https://www.waitingforfriday.com/?p=30 - the implementation appears to work for TI's handling with small modifications. Maybe one day I'll get that incorporated into a new MB just like the TI-99/8 did.

Just wanted to make you aware of the link/code.



#63 speccery OFFLINE  

speccery

    Moonsweeper

  • Topic Starter
  • 286 posts

Posted Mon Feb 12, 2018 3:09 AM

I did bookmark this one ages ago for the code to emulate the non-existent TMS6100s: https://www.waitingforfriday.com/?p=30 - the implementation appears to work for TI's handling with small modifications. Maybe one day I'll get that incorporated into a new MB just like the TI-99/8 did.

Just wanted to make you aware of the link/code.

 

 

Thanks a lot for sharing the information - another interesting project to look at!



#64 Ksarul OFFLINE  

Ksarul

    River Patroller

  • 4,354 posts

Posted Mon Feb 12, 2018 6:42 AM

Actually, Unicorn has some TMS6100 chips. . .so they're not quite unobtainium yet.



#65 speccery OFFLINE  

speccery

    Moonsweeper

  • Topic Starter
  • 286 posts

Posted Mon Feb 19, 2018 8:22 AM

It's been a while since my last update. I've been busy and traveling, but now back and I've had a little time to work on the ET-PEB.
 
I finally found out the source of my problems with the CPLD. The Xilinx synthesis tool XST has a bunch of algorithms it uses to synthesise certain language constructs. In my case I've been a little puzzled about why my state machine sometimes has its states represented in compact representation (i.e. if the state machine has up to 16 states the state variable has 4 bits) and sometimes in one-hot representation (as I learned it is called, see this Xilinx document for more information). In the one-hot representation the state vector for 16 states has 16 bits, one of which is set and the rest are zero. I was thinking this is a cool way to represent states, and probably yields good results with CPLD chips. CPLDs have way less routing capacity than FPGAs.
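The difference between the two representations is easy to show with a short Python sketch (an illustration of the encodings only, not of what XST does internally; the function names are made up). For 16 states the compact form needs 4 state bits, while one-hot needs 16, which matters on a chip where every state bit costs a macrocell.

```python
def compact_bits(n_states):
    """Number of state bits needed for a compact (binary) encoding."""
    return max(1, (n_states - 1).bit_length())


def compact_code(index, n_states):
    """Binary code for state `index`, zero-padded to the compact width."""
    return format(index, "0{}b".format(compact_bits(n_states)))


def one_hot_code(index, n_states):
    """One bit per state; only the bit for state `index` is set."""
    return format(1 << index, "0{}b".format(n_states))


if __name__ == "__main__":
    n = 16
    print("compact bits:", compact_bits(n), " one-hot bits:", n)
    for i in (0, 3, 15):
        print(i, compact_code(i, n), one_hot_code(i, n))
```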

 

My problem all along has been instability. I make a small change, and then the whole design goes haywire. What I discovered this morning (it's winter break here  :) ) is that whenever the synthesis tool decides to create a one-hot representation, the design just does not work - pretty much at all. After figuring this out and becoming aware of these parameters, I edited the XST parameters, so that it always uses the "compact mode" for finite state machine representation. Actually the document I mentioned earlier does say that the one-hot representation is not recommended for CPLD chips. Why the synthesis still creates designs with that algorithm every now and then - I don't know. And why it seemingly arbitrarily changes algorithms is another open question - well, I guess it depends on the details of the designed circuit. The really annoying thing is that the synthesis process does not create any error messages when it creates a version of the logic circuit that does not work. Of course if it did that, I would have discovered the problems earlier... Anyway the good news is that with "compact mode" the messages are also more meaningful: it will now for example tell me when it decides to buffer a signal, so that it can fit into a functional block.

 

A quick background: The CPLD contains 8 functional blocks or FBs, the FBs contain macrocells doing the actual logic. A macrocell has only so many input variables it can use for logic equation computation, and often with a more complex design it cannot fit the design as-is. In these cases it will break a logic equation between multiple macrocells and that requires buffering. Or that's what I think it is doing.

 

Now that I got back to a track where gravity again applies, I could make systematic changes and optimise the logic design. I was able to remove quite a few intermediate variables I had previously used to make the CPLD fitting process successful. This work resulted in a bunch of simplifications, which is nice as the design now uses fewer macrocells (now at 115 macrocells or 80% utilisation). Another great result is that the simpler design runs a whole lot faster, with timing analysis now indicating it would run at nearly 80MHz! I have two prototype boards I use for testing, one of them running at the original design frequency of 50MHz and the other at 24MHz. They now behave identically.
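For reference, the utilisation figures quoted over the course of this thread work out as follows. This is a small Python sketch using only the numbers mentioned in the posts; the even split of macrocells between functional blocks is an assumption.

```python
TOTAL_MACROCELLS = 144   # available macrocells, as stated earlier in the thread
FUNCTION_BLOCKS = 8      # functional blocks, as stated earlier in the thread


def utilisation_percent(used, total=TOTAL_MACROCELLS):
    """Share of the macrocells in use, as a percentage."""
    return 100.0 * used / total


if __name__ == "__main__":
    history = [
        ("initial feature complete design", 139),
        ("after rearranging part of the logic", 128),
        ("after forcing compact state encoding", 115),
    ]
    for label, used in history:
        print(f"{label}: {used}/{TOTAL_MACROCELLS} "
              f"= {utilisation_percent(used):.1f}%")
    # Assumes the macrocells are divided evenly between the functional blocks.
    print(TOTAL_MACROCELLS // FUNCTION_BLOCKS, "macrocells per functional block")
```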

 

But the best part is that this thing now finally started to work: memory tests from both the ARM micro controller and TI-99/4A side now pass, every time. On the TI side my sole test program has been the SAMS memory tester v4.0. It now happily marches through the 256K of expanded memory.

 

I still have something to sort out though - concurrent accesses sometimes do not work. More specifically, when the ARM is writing to memory while the TI side is both reading and writing to memory, it sometimes (but quite rarely) messes up the TI memory accesses. Reading from the ARM side concurrently does not seem to have any impact on the TI side. I think I just have to modify the access windows slightly to solve this remaining issue. Here the problem starts to be that despite the CPLD now having more available macrocells, it still has some very congested parts. It seems that any modification requiring one more input signal in a certain part of the design results in the design not fitting into the CPLD. At least I now get that error every time...

 

I should have some more time this week to work on the design, so I am hopeful that something actually usable starts to get ready!



#66 speccery OFFLINE  

speccery

    Moonsweeper

  • Topic Starter
  • 286 posts

Posted Mon Feb 19, 2018 8:23 AM

Oh and I forgot to thank pnr for taking the time to review and comment on the design! That was helpful and motivating!



#67 kl99 OFFLINE  

kl99

    Dragonstomper

  • 720 posts
  • Location:Vienna, Austria

Posted Mon Feb 19, 2018 9:57 AM

Hi Erik!
Congrats on your progress! Seems like you solved the big mystery of the instability issue.

Such a persistent issue that is not even related to the actual project can be de-motivating.

I am enjoying very much your updates!





