Spaced Cowboy

Community Reputation

416 Excellent

About Spaced Cowboy

  • Birthday 05/17/1969

Profile Information

  • Gender
  • Location
    San Jose, CA
  • Interests
    Electronics, CNC machines, Saltwater fish, general creative stuff

  1. Much as I would love to give some positive news on that, with the current status of the CV19 pandemic, I've had to change jobs. My last day is in fact tomorrow, and it's been (understandably, I hope you can agree) very hectic over the last few months preparing for this eventuality, which I could see coming a mile off. The family, and keeping bread on the table, come first, so this is shelved for the immediate future, sorry. I'd love to keep it going, but there's not much call for old guys in Silicon Valley as it is, so I just don't have the time right now - I simply *have* to make sure that the new job is a success.
  2. All that sampling is turning out to be very useful. I've got some verilog code now:

    `timescale 1ns/1ns

    ////////////////////////////////////////////////////////////////
    // States that the bus-monitor can be in
    ////////////////////////////////////////////////////////////////
    `define BS_NUM                  2
    `define BS_WAIT_FALLING_CLOCK   `BS_NUM'h0
    `define BS_WAIT_READ_ADDR       `BS_NUM'h1
    `define BS_WAIT_RISING_CLOCK    `BS_NUM'h2
    `define BS_WAIT_READ_DATA       `BS_NUM'h3

    ////////////////////////////////////////////////////////////////
    // Bus monitor module.
    //
    // Todo:
    //  - Make it manage writes as well as reads
    //  - Take input from external mux for what to write to bus
    //  - Handle setting IRQ appropriately
    //  - Handle memory apertures and detection
    ////////////////////////////////////////////////////////////////
    module busa8
        (
        input               clk,        // CPLD clock @ 100 MHz
        input               a8clk,      // A8 clock @ ~1.8MHz
        input               rw_n,       // A8 read/write signal
        input               halt_n,     // A8 /HALT signal
        input               irq_n,      // A8 /IRQ signal
        input               rd5,        // A8 rd5 cartridge signal
        input               s5_n,       // A8 /S5 cartridge select
        input               rsrvd,      // unused
        input               cctl_n,     // A8 /CCTL signal
        input       [15:0]  addr,       // A8 address bus
        input               extsel_n,   // A8 /EXTSEL signal
        input       [7:0]   data,       // A8 data bus
        input               rst_n,      // A8 /RST signal
        input               rd4,        // A8 rd4 cartridge signal
        input               s4_n,       // A8 /S4 signal
        input               mpd_n,      // A8 Math-Pak Disable (/MPD) signal
        input               ref_n,      // A8 Dram refresh (/REF) signal
        input               D1xx_n,     // A8 access to $D1xx

        output  reg [15:0]  busAddr,    // buffered address for this cycle
        output  reg [7:0]   busData,    // buffered data for this cycle
        output  reg         busDlist,   // busData is display-list data
        output  reg         busScreen,  // busData is screen memory data
        output  reg         busChar,    // busData is character-data
        output              busDram,    // in a dram-refresh cycle
        output              busHalt     // in a /HALT cycle
        );

        ////////////////////////////////////////////////////////////
        // Local state
        ////////////////////////////////////////////////////////////

        // Display-list related
        reg [15:0]  dlistAddr;          // Current address of the display list
        reg [15:0]  screenAddr;         // Address of next screen byte
        reg         lmsLo;              // read the LMS low byte
        reg         lmsHi;              // Read the LMS high byte

        // Bus timing related, start with clock going low
        reg [4:0]   delay;              // Clocks to wait until doing something
        reg         inRefresh;          // Whether we're in a dram refresh cycle
        reg [1:0]   inHalt;             // Whether we're in a halt cycle

        ////////////////////////////////////////////////////////////
        // State machine for the video detection
        ////////////////////////////////////////////////////////////
        reg [`BS_NUM-1:0] busState;     // State machine

        ////////////////////////////////////////////////////////////
        // Sync a8clk to the FPGA clock using a 3-bit shift register
        // to avoid metastability due to the different clock rates
        ////////////////////////////////////////////////////////////
        reg [2:0] clkDetect;
        always @(posedge clk)
            if (rst_n == 1'b0)
                clkDetect <= 3'b0;
            else
                clkDetect <= {clkDetect[1:0], a8clk};

        ////////////////////////////////////////////////////////////
        // We want to know about rising/falling edges to handle bus
        // traffic timing
        ////////////////////////////////////////////////////////////
        wire clkRising  = (clkDetect[2:1]==2'b01);
        wire clkFalling = (clkDetect[2:1]==2'b10);

        ////////////////////////////////////////////////////////////
        // map the refresh and halt signals
        // halt cycles are one-clock delayed, refresh is this cycle
        ////////////////////////////////////////////////////////////
        assign busDram = inRefresh;
        assign busHalt = inHalt[1];

        ////////////////////////////////////////////////////////////
        // Monitor the bus
        ////////////////////////////////////////////////////////////
        always @ (posedge clk)
            if (rst_n == 1'b0)
                begin
                    busState    <= `BS_WAIT_FALLING_CLOCK;
                    delay       <= 5'h0;
                    busData     <= 8'b0;
                    busAddr     <= 16'h0;
                    busDlist    <= 1'b0;
                    busScreen   <= 1'b0;
                    busChar     <= 1'b0;
                end
            else
                begin
                    case (busState)
                        // Everything is synced off the falling 8-bit clk,
                        // where all the signals are reset
                        `BS_WAIT_FALLING_CLOCK:
                            begin
                                // At the start of the clock cycle, reset things
                                busDlist    <= 1'b0;
                                busScreen   <= 1'b0;
                                busChar     <= 1'b0;

                                if (clkFalling)
                                    begin
                                        delay       <= 5'h12;
                                        busState    <= `BS_WAIT_READ_ADDR;
                                    end
                            end

                        // We've waited 180ns, sufficient for the address to
                        // be stable on the bus, and the /HALT and /REF
                        // signals to be asserted
                        `BS_WAIT_READ_ADDR:
                            begin
                                delay <= delay - 1;
                                if (delay == 5'h0)
                                    begin
                                        busState    <= `BS_WAIT_RISING_CLOCK;
                                        busAddr     <= addr;
                                        inHalt      <= {inHalt[0],!halt_n};
                                        inRefresh   <= !ref_n;
                                        delay       <= 5'h0D;
                                    end
                            end

                        // We now re-sync to the rising clock signal rather
                        // than dead-reckon
                        `BS_WAIT_RISING_CLOCK:
                            begin
                                if (clkRising)
                                    begin
                                        // Halt is 1-cycle delayed
                                        delay       <= 5'h12;
                                        busState    <= `BS_WAIT_READ_DATA;
                                    end
                            end

                        // We've waited another 180ns, sufficient for the
                        // data to be stable on the bus
                        `BS_WAIT_READ_DATA:
                            begin
                                if (busAddr == 16'h0230)
                                    begin
                                        dlistAddr[7:0] <= data;
                                        $display("Set DL:Lo to %x", busData);
                                    end
                                else if (busAddr == 16'h0231)
                                    begin
                                        dlistAddr[15:8] <= data;
                                        $display("Set DL:Hi to %x", busData);
                                    end
                                else if (addr == dlistAddr)
                                    begin
                                        $display("DL data %x @ %x", busData, dlistAddr);
                                        busData     <= data;
                                        busDlist    <= 1'b1;
                                        dlistAddr   <= dlistAddr+1;
                                    end
                                else if (addr == screenAddr)
                                    begin
                                        $display("Screen data %x @ %x", busData, screenAddr);
                                        busData     <= data;
                                        busScreen   <= 1'b1;
                                    end
                                else if (inHalt[1] && !inRefresh)
                                    begin
                                        $display("Char data %x @ %x", busData, busAddr);
                                        busChar     <= 1'b1;
                                        busData     <= data;
                                    end

                                busState <= `BS_WAIT_FALLING_CLOCK;
                            end
                    endcase
                end

        ////////////////////////////////////////////////////////////
        // Handle the Load-Memory-Scan instructions in the
        // display-list stream
        ////////////////////////////////////////////////////////////
        always @ (posedge clk)
            if (rst_n == 1'b0)
                begin
                    lmsLo       <= 1'b0;
                    lmsHi       <= 1'b0;
                    screenAddr  <= 16'h0;
                end
            else if (busDlist)
                begin
                    if (busData[3:0] == 0 && (lmsLo == 1'b0) && (lmsHi == 1'b0))
                        begin
                            lmsLo   <= 1'b0;
                            lmsHi   <= 1'b0;
                        end
                    else if (lmsHi)
                        begin
                            screenAddr[15:8]    <= busData;
                            lmsHi               <= 1'b0;
                            $display("Screen: %x%x", data, screenAddr[7:0]);
                        end
                    else if (lmsLo)
                        begin
                            screenAddr[7:0] <= busData;
                            lmsHi           <= 1'b1;
                            lmsLo           <= 1'b0;
                        end
                    else
                        begin
                            lmsLo <= busData[6];
                        end
                end
            else if (busScreen)
                begin
                    screenAddr <= screenAddr + 1;
                end

    endmodule

     ... that captures the state of the bus in a manner useful to interpreting the bus traffic as ANTIC data. The above code gets me results like this for the top of the power-on BASIC-enabled screen (where it just prints READY):

    DL data 41 @ 9c3d
    DL data 20 @ 9c3e
    DL data 9c @ 9c3f
    Screen: 9c20
    Set DL:Hi to 9c [9c40]
    Set DL:Lo to 20 [9c40]
    DL data 70 @ 9c20
    DL data 70 @ 9c21
    DL data 70 @ 9c22
    DL data 42 @ 9c23
    DL data 40 @ 9c24
    DL data 9c @ 9c25
    Screen: 9c40
    Screen data 00 @ 9c40
    Screen data 00 @ 9c41
    Screen data 00 @ 9c42
    Screen data 00 @ 9c43
    Screen data 00 @ 9c44
    Screen data 00 @ 9c45
    Screen data 00 @ 9c46
    Screen data 00 @ 9c47
    Screen data 00 @ 9c48
    Screen data 00 @ 9c49
    Screen data 00 @ 9c4a
    Screen data 00 @ 9c4b
    Screen data 00 @ 9c4c
    Screen data 00 @ 9c4d
    Screen data 00 @ 9c4e
    Screen data 00 @ 9c4f
    Screen data 00 @ 9c50
    Screen data 00 @ 9c51
    Screen data 00 @ 9c52
    Screen data 00 @ 9c53
    Screen data 00 @ 9c54
    Screen data 00 @ 9c55
    Screen data 00 @ 9c56
    Screen data 00 @ 9c57
    Screen data 00 @ 9c58
    Screen data 00 @ 9c59
    Screen data 00 @ 9c5a
    Screen data 00 @ 9c5b
    Screen data 00 @ 9c5c
    Screen data 00 @ 9c5d
    Screen data 00 @ 9c5e
    Screen data 00 @ 9c5f
    Screen data 00 @ 9c60
    Screen data 00 @ 9c61
    Screen data 00 @ 9c62
    Screen data 00 @ 9c63
    Screen data 00 @ 9c64
    Screen data 00 @ 9c65
    Screen data 00 @ 9c66
    Screen data 00 @ 9c67
    DL data 02 @ 9c26
    Screen data 00 @ 9c68
    Screen data 00 @ 9c69
    Screen data 32 @ 9c6a
    Screen data 25 @ 9c6b
    Screen data 21 @ 9c6c
    Screen data 24 @ 9c6d
    Screen data 39 @ 9c6e
    Screen data 00 @ 9c6f
    Screen data 00 @ 9c70
    Screen data 00 @ 9c71
    Screen data 00 @ 9c72
    ...

     ... which might not mean much until you realise that the '32','25','21','24','39' spell out READY in the internal character set.

     The bus signals look like: [image] which is exactly what I'd expect. I need to merge in the fetch-character-data that's also happening (I've done it in a different source file) and I'd be able to reconstitute a GRAPHICS 0 screen.

     [Edit] And here's the character data being fetched for the second pixel-row of READY, the first pixel row is all zeros...

    Char data 00 @ e001
    Char data 00 @ e001
    Char data 00 @ e191
    Char data 7c @ e129
    Char data 7e @ e109
    Char data 18 @ e121
    Char data 78 @ e1c9
    Char data 66 @ e001
    Char data 00 @ e001
    Char data 00 @ e001
    Char data 00 @ e001
    Char data 00 @ e001
    Char data 00 @ e001
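
     As a rough cross-check of the two logs above, here is a minimal Python sketch of how the screen codes relate to both the letters and the character-data addresses. It assumes the character set sits at the standard $E000 base with eight bytes per glyph, and that only the $00-$3F block of internal codes (which maps onto ASCII $20-$5F) is involved; the function names are made up for illustration.

```python
CHARSET_BASE = 0xE000  # assumption: standard OS character set location


def internal_to_ascii(code):
    """Convert an internal screen code in the 0x00-0x3F block to ASCII."""
    if not 0x00 <= code <= 0x3F:
        raise ValueError("this sketch only handles internal codes 0x00-0x3F")
    return chr(code + 0x20)


def glyph_row_addr(code, row, base=CHARSET_BASE):
    """Address of one pixel row (0-7) of a glyph: base + code * 8 + row."""
    if not 0 <= row <= 7:
        raise ValueError("row must be 0-7")
    return base + (code & 0x7F) * 8 + row


# Screen codes taken from the 'Screen data' log lines at 9c6a..9c6e
codes = [0x32, 0x25, 0x21, 0x24, 0x39]
word = "".join(internal_to_ascii(c) for c in codes)
row1_addrs = [glyph_row_addr(c, 1) for c in codes]

print(word)
print([format(a, "04x") for a in row1_addrs])
```

     The addresses it produces for pixel row 1 (e191, e129, e109, e121, e1c9) are the same ones that show up in the character-data log, and a space (code 0) lands on e001.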
  3. I wanted to test out PCBway as an alternative to Seeed for assembly, so I sent off the board that will plug into the back of the XL/XE as a simple test of the procedures etc. It's just come back and it looks pretty nicely done: [image]

     The back row is for test purposes, it's just the pinouts of the port. The front holes are the cartridge expansion (probably through a right-angle connector) and audio-in (for the XEL or anywhere else you can get stereo line-level audio).

     The connector on the side is a surface-mount mini-SAS connector with a guide-shield for cable insertion. It comes like this, and then you break off the plastic clip and it turns into a two-piece part. The mini-SAS cable looks like: [image] and is really intended to be an internal cable, but it has a locking snap on the top, and I think it'll work well. I get 36 wires down a relatively flexible connector, and it's a lot easier to work with than an IDC cable.

     When you snap off the plastic cover, the board looks like: [image] and connecting it all together, it looks like: [image]
  4. Aaaand edit: I'm an idiot. The address-bus values *between* those above were constant at $E001, so I was assuming that was the CPU (being halted every other clock) constantly accessing the same location. That's not the case.

     Looking at the traces after Antic has stopped /HALTing the 6502, ... it's clear to see that the CPU is what is accessing $F306,7,8,9,A,... Not Antic. Which means Antic is accessing the $E001 address a lot. On the XE, $E000 is the standard domestic character-set offset, and I'm assuming the XEL is booting into BASIC (because I didn't have a keyboard or screen attached) and getting the character-based screen data.

     Looking at other access patterns, it seems this is Antic reading a byte representing a row in a character (in this case the next-to-top row of a space character, whose data is stored at $E000 through $E007). There are other sequences where Antic holds /HALT low for 80 clocks at a time, interleaving 40 read-character-at-screen-position fetches with 40 bytes describing this scan-line's representation of that character. It's all becoming a bit clearer now.

     This is all going to feed into a script I'm writing which will take these bus-traces as input and attempt to parse out the screen display from them, based on triggering callbacks from events it discovers in the trace sequences (clk going high, /halt going low etc.). Doing it in software is a lot easier than doing it in hardware, and I can then just port the algorithm to the hardware world once I know it works in software.
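
     The callback idea could look something like the Python sketch below. It is not the actual script: the sample format (one tuple per analyser sample with clk, /HALT, address and data), the event names and the demo values are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical sample layout: one entry per logic-analyser sample
Sample = namedtuple("Sample", "clk halt_n addr data")


class TraceParser:
    """Walks a list of samples and fires callbacks on signal edges."""

    def __init__(self):
        self.handlers = {}

    def on(self, event, fn):
        self.handlers.setdefault(event, []).append(fn)

    def _fire(self, event, sample):
        for fn in self.handlers.get(event, []):
            fn(sample)

    def run(self, samples):
        prev = None
        for s in samples:
            if prev is not None:
                if prev.clk == 0 and s.clk == 1:
                    self._fire("clk_rising", s)
                if prev.clk == 1 and s.clk == 0:
                    self._fire("clk_falling", s)
                if prev.halt_n == 1 and s.halt_n == 0:
                    self._fire("halt_asserted", s)
                if prev.halt_n == 0 and s.halt_n == 1:
                    self._fire("halt_released", s)
            prev = s


# Record what is on the bus at each falling clock edge while /HALT is low.
# A real decoder would also need to allow for /HALT being one cycle delayed.
fetches = []
parser = TraceParser()
parser.on("clk_falling",
          lambda s: fetches.append((s.addr, s.data)) if s.halt_n == 0 else None)

demo = [  # made-up samples, not taken from a real trace
    Sample(0, 1, 0xF306, 0xA9),
    Sample(1, 1, 0xF306, 0xA9),
    Sample(0, 0, 0xE001, 0x00),
    Sample(1, 0, 0xE001, 0x00),
    Sample(0, 0, 0xE191, 0x7C),
    Sample(1, 1, 0xF307, 0x00),
]
parser.run(demo)
print([(format(a, "04x"), format(d, "02x")) for a, d in fetches])
```

     Keeping the edge detection separate from the handlers means the same handler logic can later be compared one-to-one against the states of a hardware state machine.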
  5. So, it seems my 130XE is indeed expletive (ahem) ... not fully functional, at least with respect to its external address bus. Here's the screenshot of an 1088XEL trace: [image] ... where I happened to catch Antic doing a refresh of the 256'th column, so the address bus is all-on. This image also shows Antic doing memory-fetches for the graphics data, immediately after the DRAM refreshes.

     I am a little confused over the accesses at the moment though, it looks as though Antic's memory accesses are to:

       • $F30F - unknown
       • $02FC - Internal hardware value for last key pressed ?
       • $F310 .. $F311 .. $F312 .. $F313 .. $F314 .. all unknown
       • $F3FD - unknown
       • $F2FD .. $F2FE .. $F2FF .. $F300 .. $F301 .. all unknown

     All of these seem to be inside the OS ROM (and not where the charset is at $E000 .. $E3FF) unless my edition of "Mapping The Atari" is out of date, or unless the XEL replaces the ROM somehow ? And yes, I checked that the low bit of the high nibble of the address wasn't always being set - once bitten, twice shy. Still, the reality is that the traces are what they are - so I guess I just have to figure out what they mean now.
  6. So I didn't intend to have another post this quickly, but this is kind of interesting. Since I have the bus traces from an actual 130XE, I'm writing a bus-decoder in Objective-C on the Mac before I try to do it in verilog - it's far faster to debug it in software when I can run tests as part of the build process. Once I know I have a working algorithm, I can then translate that to verilog for the CPLD.

     Annnnyhoo - I noticed something odd... Under no circumstance was bit 4 of the address vector ever being set. I thought that was a bit weird, so this morning I went and wired everything up again (40 flying leads!) in case I'd missed one somehow in the previous run. Nope. I still didn't get anything on A4 (bit 1 of the second nibble). So I wondered whether the analyser interface wire was broken, and changed out the flying lead to a different port (this thing can sample 256 lines in parallel at 2ns resolution!) - still nothing on A4. So I un-wired everything, took the interface board ... back to my desk and used a continuity tester to make sure that A4 (3rd pin along on the top row) was connected to the pin correctly, and there hadn't been a fabrication error. Yep, that beeped. So then I double-checked that it *was* in fact the third pin along on the top row that I ought to be sampling for A4, which seems to be the case. Which leads me to believe that my 130XE is buggered, or at least its A4 line to the outside world is - unless there is some unbeknownst-to-me reason why A4 should never be high ?

     As an aside, this version of the breakout board has /REF and /HALT brought out onto it, and also vends a cartridge port interface, so I can plug in a cartridge and look at how /MPD and /EXTSEL work when something is actually plugged in. Here's ANTIC doing the RAM refresh... ... It looks as though /HALT is asserted one clock before /REF is, and the bus refresh address changes when /REF is active.
This is how I found the problem, actually - I was looking at Antic counting up the columns, and it does 9 (weirdly, not 8..) accesses about every 64uS, heavily front-loaded within those 64uS, meaning it cycles through refreshing every row roughly every 1.8ms (256 addresses at 9 per 64uS line is about 28 lines). Looking at the address though, it went $FFCC $FFCD $FFCE $FFCF $FFE0 $FFE1 (so the $xxEx ought to have been $xxDx) and looking around, I saw it was a pattern. A previous sequence was $FFAE $FFAF $FFC0, ... where the $xxCx really ought to have been $xxBx. So, I think I have a broken external port. I've got my 1088XEL and I'll bring that in and see if I see the same thing (which would be *very* weird) and go from there...
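
     A check like this is easy to automate over a captured trace. The Python sketch below is one possible way to do it, not the Objective-C decoder mentioned above: it ORs all sampled addresses together and reports any bit that was never seen high (and, symmetrically, never seen low). The function name is made up; the sample list is just the refresh addresses quoted in this post.

```python
def stuck_bits(addresses, width=16):
    """Return (never_high, never_low) bit positions across all samples."""
    mask = (1 << width) - 1
    seen_high = 0
    seen_low = 0
    for a in addresses:
        seen_high |= a & mask       # bits observed as 1 at least once
        seen_low |= ~a & mask       # bits observed as 0 at least once
    never_high = [b for b in range(width) if not (seen_high >> b) & 1]
    never_low = [b for b in range(width) if not (seen_low >> b) & 1]
    return never_high, never_low


refresh = [0xFFCC, 0xFFCD, 0xFFCE, 0xFFCF, 0xFFE0, 0xFFE1,
           0xFFAE, 0xFFAF, 0xFFC0]
never_high, never_low = stuck_bits(refresh)
print("never high:", never_high)
print("never low: ", never_low)
```

     On these nine samples, bit 4 is the only one never seen high. The never-low list is not meaningful on such a short run of refresh addresses (the upper bits are all-ones by construction), so in practice this would be run over a long capture containing ordinary CPU traffic.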
  7. All of those things. If you read back in the thread, some of the things I think you can do:

       • "Memory Apertures" allow arbitrary remapping of the 32MB of SDRAM on-board into the 6502's memory map.
       • We have access to the /HALT line on the bus. We also know the video requirements for the current ANTIC scanline, so we can predict when ANTIC will want access. The rest of the time, we can /HALT the CPU and read/write from/to memory ourselves. The STM is the ultimate co-processor.
       • ANTIC uses the bus to fetch the display list and display data - so we can reconstitute the video display just from the bus accesses, which are conveniently signaled by ANTIC using the /HALT line (and not the /REF line, which we also have access to). This is the basis for the video-out-to-HDMI plan.
       • I think it might be feasible to write software-only peripherals on the R-Pi. It ought to be possible, I think, to have a program running on the R-Pi which has access (via an API) over the USB-bus to the STM32 in some standardized way, and the STM would then vend that software "Device" to the XL/XE as a peripheral sinking/sourcing data.
       • The CPLD has an embarrassment of riches with regard to i/o pins, so I put 4 PMOD interfaces in there. All they need is a cable to be brought out to the outside world. There's also an internal expansion bus (to both CPLD and CPU) to link in anything else in the future that needs more oomph than the stock slots provide for.
       • Speaking of the CPLD, I have a 6502 design that runs at ~50MHz on that CPLD, and there's sufficient SRAM hanging off the CPLD to make it a full-blown co-processor...
       • Networking is actually pretty trivial - you need a daemon on the linux box, and an API in cc65 (or Action! or whatever) that the N: driver can implement. Anything over open, close, read, write can be done with XIO, or you could go native and just memory-map things - so you set up a memory page, 128 bytes is buffer, 128 bytes for control structures, and the driver just reads/writes to those memory locations, which causes transactions to happen to the Pi.
       • Hard disks (USB-attached) are equally simple - just a buffer for data and a transactional API for the transport - it's actually pretty much the same as the network model. The Pi has 2x USB-2 and 2x USB-3 ports exposed...
       • I've plumbed the audio lines through, so if you can get line audio into the cabling on the Atari side (trivial with a 1088XEL, the pins are right there, a bit more difficult with the stock hardware) you can have audio sent over the HDMI link, in stereo even.
       • Using something like a raspberry pi to reconstitute the display leads to some interesting possibilities - for example when the XL is putting out one of the APAC modes, we could recognize and render it correctly (ie: no black lines on alternate rows).. Things like that.
       • To be honest, just having the gargantuan 32 Mbytes of RAM from the STM32 at hand could lead to some pretty freaking cool things.

     I think the Pi is too good a deal to miss out on, so I expect most people to go for it, but you don't *have* to have it - the STM is perfectly capable of running the slots without the Pi being present. If you *do* go for the Pi, though, the goal is to have the best damn developer machine for the Atari that you could reasonably get in real hardware - deploying the binary can be done over the parallel bus, there are 2 HDMI ports (one for the XL display, the other a high-res linux interface for text-editing/compiling) and a relatively fast CPU to work with. I've been playing with the Pi-4 for a few days, and it really is remarkably zippy, considering its cost and form-factor.
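
     To make the memory-mapped page idea from the networking item a little more concrete, here is a small Python model of a 256-byte page split into a 128-byte data buffer and a 128-byte control area. Only the 128/128 split comes from the post; the control-field layout (command, status, length, handle) and the command value are purely hypothetical placeholders.

```python
import struct

PAGE_SIZE = 256
BUF_OFFSET, BUF_SIZE = 0, 128      # first half: data buffer
CTL_OFFSET = 128                   # second half: control structures

# Hypothetical control header: command, status, payload length, handle
CTL_HEADER = struct.Struct("<BBBB")
CMD_WRITE = 0x02                   # made-up command code


class SharedPage:
    """Models the page the 6502-side driver would read and write."""

    def __init__(self):
        self.mem = bytearray(PAGE_SIZE)

    def post_write(self, handle, payload):
        """Driver side: copy data into the buffer, then fill in the header."""
        if len(payload) > BUF_SIZE:
            raise ValueError("payload larger than the 128-byte buffer")
        self.mem[BUF_OFFSET:BUF_OFFSET + len(payload)] = payload
        CTL_HEADER.pack_into(self.mem, CTL_OFFSET,
                             CMD_WRITE, 0, len(payload), handle)

    def pending(self):
        """Device side: decode the header and fetch the payload."""
        cmd, _status, length, handle = CTL_HEADER.unpack_from(self.mem, CTL_OFFSET)
        return cmd, handle, bytes(self.mem[BUF_OFFSET:BUF_OFFSET + length])


page = SharedPage()
page.post_write(handle=1, payload=b"GET / HTTP/1.0\r\n")
print(page.pending())
```

     Writing the payload before the header mirrors the usual ordering for this kind of mailbox, so that whatever is watching the control bytes never sees a command whose data is not yet in place.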
  8. One thing I forgot to add in to the above price is the cost of the XL/XE interface board. This is pretty minimal, just a pass-through from one port-type (Atari cartridge) to another (mini-SAS), but it has some high-ish cost items on it (the mini-SAS connector is $7 and the Atari cartridge port is $4.50 from DigiKey, though given that that is through-hole, it might be a lot cheaper to source from eBay). So figure adding another $18-ish for the board and cable.
  9. So, just to poke my head up over the trenches for a brief moment, the project isn't abandoned... I just got the impression that it was going to turn out to be too expensive for any significant take-up. To that end, I've been (off and on, it's been a busy year) rejigging it with cost in mind from the get-go, as opposed to thinking about that at the end. It's now a 4-layer (not a 6 or 8-layer) board, and the parts are significantly cheaper. There are of course different trade-offs that have had to be made, so read on.

     I've just run the latest incarnation through the cost estimator at PCBway (preferred over Seeed for cost), and got per-board prices of $99 (quantity:5), $65 (quantity:50) and $55 (quantity:100). This is for all the SMT work assembled, and includes the cost of the SMT parts. There's a fair few optional through-hole parts on the board (figure another $10 or so per board), and to keep the costs down I've been assuming that kits with the SMD done, but the thru-hole parts left for the buyer to do, would be acceptable. If you're not into soldering at all, I'm sure enterprising individuals would be willing to step into the gap. Closer to the time, I'll get costs for a full assembly, and we can take a view then. For now the P.o.R is to only do SMT assembly (anything with red pads on the diagram below). There's a lot of green-holed through-hole parts, but they're not all necessary - most people will only want the slots, the power-in, and maybe the 3 R-Pi connectors. One benefit is that if you just want the slots and memory capabilities, you're pretty much done once you've added the slots and (probably) a case.

     The board currently looks like the below, and the basic design is to have a cable (I'm using a mini-SAS cable) from the back of the XL/XE to this motherboard. We can put the cartridge connector into that back-of-the-computer module, so there's no need to duplicate it on the motherboard. There's a CPLD to manage the fast turnaround times for the /MPD and /EXTSEL signals, and an STM32 handles the slots. Each slot has a dedicated UART with which it can talk to the STM32 whenever it wants, and the STM32 will schedule all the traffic into an ordered sequence of instructions to the host Atari. It seems to me that using a UART (which can be set to run at 115,200 or 1 Mbaud or 4 Mbaud by pulling pins low on the slot connector) is pretty much foolproof - every MCU under the sun has a UART facility.

     There are a few mentions of the R-Pi4 on the annotated image below; that's because there's an optional R-Pi holder on the underneath of the board on the left-hand side. You can see the extent of the Pi where the dashed/dotted line is. The Pi is connected to the STM32 via the built-in 480 Mbit/sec USB link, which the STM will use to send the video signal (that the CPLD decoded from the Atari's bus activity) down. The R-Pi4 has enough grunt (I think) to take the video data coming in over USB and zoom it up using the GPU/OpenGL to give a full-screen HDMI interface to the XL. Oh, and you get all the facilities that Linux offers too...

     The drawback is that a Raspberry Pi takes longer to boot (about 8 seconds on my Pi) than the XL/XE do, so you'll have to switch on the expansion board before you switch on the computer. To be honest, I seem to remember having to do that with anything I plugged into my old ST back in the day... I think this is a reasonable trade-off for more than halving the price of the expansion kit. You pays your money, and ...

     To set expectations, I'm hoping that the all-in price (including a 1GB R-Pi4, the cables to connect it in, and SMT assembly) will come to ~$130, assuming there's sufficient interest for the quantity-50 price.

     I'm also hoping to get myself a rather nice 3D printer this month (it's bonus week, so the yearly toy gets bought...) and I'm definitely keeping an eye on being able to print the case - this printer can do 27x15cm prints...

     Anyway, that's the update. Updates are not going to be anywhere near as frequent as at the start of this thread, but the project has certainly not been forgotten.
  10. Yep. I ran the numbers on using an Arm microprocessor when doing my Parallel port expansion project as well, and given the interrupt latencies, the bus requirement to respond with /MPD and/or /EXTSEL within 48 nanoseconds of getting a valid address is a pretty hard requirement. 50ns is a 20MHz clock. It was all ok to do it in hardware in the old days, because you didn't need the flexibility of programmable memory apertures or what have you. It's a bit different trying to do that in a software world...

     In the end I went with an FPGA. That way I could get very fast responses, and also have direct access to memory, giving me interesting opportunities for memory management.

     I'm still (slowly) working on it, even though the project thread went radio-silent. I think the cost was a problem (and I think I've found a way around that, roughly halving the cost at Q50) and I've got a board out with Seeed now to test out some of the connectors. Anyway, just poking my head in - and letting people know I'm still playing/thinking about the PBXL, even though it might not seem that way.
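
     The arithmetic behind that is simple enough to put in a few lines of Python. The 48ns window and the 20MHz figure are from the post; the other clock rates are just example values, and the helper names are invented for this sketch.

```python
def period_ns(freq_hz):
    """Clock period in nanoseconds."""
    return 1e9 / freq_hz


def whole_cycles(freq_hz, window_ns):
    """Number of complete clock cycles that fit inside the window."""
    return (window_ns * freq_hz) // 1_000_000_000


WINDOW_NS = 48  # time allowed to drive /MPD or /EXTSEL after a valid address

for f in (20_000_000, 100_000_000, 400_000_000):  # example clock rates
    print(f"{f / 1e6:6.0f} MHz: period {period_ns(f):5.1f} ns, "
          f"{whole_cycles(f, WINDOW_NS)} whole cycles in {WINDOW_NS} ns")
```

     Even a few hundred MHz only leaves a couple of dozen cycles at most to notice the address, decide and respond, which is very little once interrupt entry is counted, whereas clocked logic in an FPGA or CPLD only needs a handful.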
  11. Me too - I have a gigabit fiber connection, and a 64TB server with plenty of space on it yet. Happy to trickle it in, if that helps your bandwidth, and happy to set it up as {ftp,http}.us.pigwa.net if you want ?

     [edit] Ok, mirroring has begun, and it's slowly trickling in (bandwidth-limited on the fetching side, and with delays between fetches as well). I don't want to put any strain on the source. As things come in, this'll get more useful, but the site is now up at http://atari-archive.net/
  12. Just a ping to say I'm still alive, I haven't given up, and it's 6:30 am and I'm posting from work, which is why there's not been much progress on the board this last month. I have a deadline to meet in a few weeks, and hopefully after that I'll find the time to carry on working on it - but between my work schedule, school activities, and planning the kids' birthday next month, there's just been no time.
  13. Further updates: The SD card interface isn't playing ball. I think there's something wrong with the middleware I was planning on using, which means I'll either be trying to debug the SD card library I have, or writing one from scratch, which is a bit disappointing. The card responds to several calls, happily tells us its configuration, type and required settings, but then a call to get the card status always returns an error. The code is a bit opaque, so I don't think it'll be too easy to fix. We'll see.

     The other side of the board, of course, is the FPGA. There's a fair amount of prep-work involved in bringing up the FPGA - specifying all the i/o pins as being linked to a given named port in the top-level module via constraints, and specifying all the voltage specifications etc. On top of that, there's the actual module code to be written. In my (admittedly, limited) experience creating verilog for FPGAs, it's a good idea to have an overall diagram specifying how all these things are going to interact, so the first draft of that looks like: [image]

     There are going to be a few clock-domains implemented using the DCM tiles in the FPGA, so that I only need a single clock input (which runs at 50MHz). The figures are only estimates of what I think I'll be able to get the code to do, so they may well change... I know I want SPI5 (input SPI from the ARM chip) to be as fast as possible so I can do oversampling and recover the clock relatively accurately. I know the host bus interface (purple) will only need relatively slow access, since that's a <2 MHz clock on the Atari. I also know I can synthesize a 6502 core at ~100 MHz, so that'll set the green clock speed. The ARM can receive SPI at up to 100MHz, so SPI6 can piggy-back on that. I want the decode and Antic code to be "as fast as possible", but that'll be limited by the ANTIC part I suspect. We'll see. Encouragingly, it's not an overly-complicated design. 
There's really only 4 paths through the design for data to flow, which means I can separate things nicely. I'll use wide-bit FIFOs, so I can encode data+context into a single word, and this is made easier by the FPGA allowing the use of Block-RAMs as FIFOs with widths up to 72-bits, 512 words deep... plenty for combined data+what-to-do-with-the-data information. With that out of the way, I have an input/output constraints file to write...
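
     As an illustration of the data+context idea, here is a Python sketch that packs a bus byte, its address and a tag saying what the byte is into a single FIFO word. The 72-bit maximum width comes from the post; the field positions and the tag values are hypothetical, chosen only to show the packing.

```python
WORD_BITS = 72                      # maximum Block-RAM FIFO width per the post

# Hypothetical tag values describing what to do with the data byte
KIND_DLIST, KIND_SCREEN, KIND_CHAR = 1, 2, 3

# Hypothetical layout: [27:24] kind, [23:8] address, [7:0] data
def pack_word(kind, addr, data):
    """Combine context and data into one FIFO word."""
    if not (0 <= kind <= 0xF and 0 <= addr <= 0xFFFF and 0 <= data <= 0xFF):
        raise ValueError("field out of range")
    return (kind << 24) | (addr << 8) | data


def unpack_word(word):
    """Split a FIFO word back into (kind, addr, data)."""
    return (word >> 24) & 0xF, (word >> 8) & 0xFFFF, word & 0xFF


word = pack_word(KIND_CHAR, 0xE191, 0x7C)
print(hex(word), unpack_word(word))
```

     This layout only uses 28 of the available bits, so there would be plenty of room left for things like a scan-line counter or timestamps if the consumer needs them.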
  14. Quick update on the board-bringup. So the first thing to do, once the ubiquitous LED has lit, is get serial i/o working on the debug port. I duly connected it all up, told the app to io_writechar(huart2, 'X'), hit run, and .... nothing happened. No X for me. It took a bit of head-scratching to figure out the problem - I'm using the same connector as I use at work (a 10-pin JTAG pinout) because then I have access to all the cool hardware set up for board bringup. To get that to work with the open-source tools though, I'm going through a 20->10-way JTAG adapter, and that board doesn't propagate the serial line signals to the correct pins. No real problem, I soldered a couple of wires to the back of the board and connected it to a different serial->usb converter, so I still have serial output.

     Next up was the second serial port. This was a screw-up on my part. The circuit is fine, up until the DB-9 port, where I routed TX to the RX pin and vice versa. I must have looked at that circuit 1000 times, and I still didn't catch it. For this revision, it's not too important, I'll just run one of these between the board and the cable, and swap the lines over.

     Now that writing bare bytes to the serial port was up and running, I wanted printf() output (formatted text out). This is really useful in debugging to log something, so it's always one of the first things I bring up. A simple call to printf() produced nothing, which is odd, because printf() calls into _write(), which you in turn have to implement so that it sends characters to the correct serial port via the __io_putchar() call. I'd done that, but I could see that the code wasn't being called (I put a breakpoint in it) by printf(). More head-scratching... There is the concept of a 'weakly linked stub' in programming, where a default (usually empty) chunk of code will be linked in if user-supplied code of the same name isn't supplied. 
I thought that somehow my own code wasn't being called, and the weakly-linked code (which does nothing in this instance) was getting the call instead. I spent quite some time figuring out that this wasn't in fact the case. Eventually, I was poring over the assembly output and I realized the code for _write() wasn't being compiled into the binary, even though I could see it ... *right there* in the IDE. If you've figured it out, you're ahead of me... The IDE generated this code (syscalls.c) for you, puts it into the source tree, but doesn't actually compile it until you move it to a directory named 'Src'. Wonderful. So, moving the file over, recompiling, and flashing it down to the board, and we have a working printf. "Hello World" rules again. So, serial out of the way, next to verify the clock speed was actually what I thought it was. There's a function library that comes with STM chips called the HAL (Hardware Abstraction Layer). One of the HAL_RCC_xxx calls will return the frequency in Hz, so it was just a matter of formatting the output using printf() into MHz for ease of reading... Finally (for this post), I wanted to check out the SDRAM. Using the data sheet, I calculated the parameters for this SDRAM, running it at 100 MHz (which is the max speed for the STM32H7). Enabled byte-lanes, burst-mode, set CAS and RAS timings, and various other things. Tried to access the RAM, and ... nothing. The CPU hit a bus exception and jumped to its exception handler. So, a bit more reading up on SDRAM - I generally use an SRAM on microcontrollers if I need more memory (I generally don't need anywhere near the memory available on this board), so I'd overlooked that you have to initialise an SDRAM with a particular sequence of commands. Laying down those commands in the correct order with the correct delays, and configuring the SDRAM MODE register to match what the STM32H7 would be sending, I could read and write to locations in the SDRAM memory space. 
The boot sequence is currently pretty short, but it looks like:

    SoS Booting.
    SoS Compile date: Dec 18 2018 @ 15:08:54
    SoS Booting at 400.00 MHz
    SoS SDRAM Memory ok at both BASE and LAST

     All of which is duly printed out on boot. The remaining peripherals are:

       • USB (which I'm going to leave for now, getting the HID service up and running is not a trivial task)
       • The SD card
       • The video output
       • The SPI interfaces to the FPGA
       • The SPI interface to the slots
       • The i/o expander

     But so far, so good. Nothing has gone wrong that isn't recoverable.

     Simon.
  15. I'm not sure if this is worth a post from the not-me perspective, but it's a big deal for *me*: I got SW4STM32 up and running, with openocd debugging configured, so I can code in Eclipse (which isn't my favourite IDE, but it's tolerable). What it does get me is a graphical interface to gdb, letting me interactively single-step through code running on the STM32H7 on my board. As an embedded developer by trade, this is one of those "thank ${deity} it works!" moments. Sure, you *can* use gdb/lldb from the commandline, but it's much nicer to have all the information presented in the UI...

     I was a bit concerned because I usually use a work-provided debugger which properly handles the NRST signal, but the standard ST-Link hardware doesn't do that. Fortunately there's an option in SW4STM32 to do a software-reset rather than rely on hardware, and this worked on my STM32H7 chip. You need the reset to do the flash programming, so it's kinda important.

     This way, once you have a board and have git-cloned the repository, anyone can update the firmware and flash it down to the board using free-to-use open source tools, and the cheap ST-Link2 programmer with a 20-10 way adapter.