CyranoJ Posted January 25, 2019
The code can't be completing before it even detects it has started, as it's copying 256 bytes around. What if the VI (level0) triggers at that point?
SainT (Author) Posted January 25, 2019 (edited)
On 1/25/2019 at 11:15 AM, CyranoJ said: What if the VI (level0) triggers at that point?
Good call! It's all in a very tight loop copying 256 bytes from cart to main RAM to test saturated reads from RISC to RAM, so the chance of the normally extremely unlikely event of the video interrupt triggering between two instructions is bizarrely high... Adding an sr=2700, sr=2000 around the start loop SEEMS to have sorted it.
I did have it sitting in soak, constantly reading to the cart RAM and then re-testing, for hours yesterday without an issue, but since adding a proper progress bar the timing of the code has changed and the lock-ups became more frequent (but still quite infrequent). It's looking good though. I'm hoping this may actually be the last headache on the way to proper, stable, interrupt-driven loading from the memory card.
For those who are interested in the mechanism, it's very much the same as the CD-ROM. You request a number of bytes to be read from a file and then you get an interrupt when data is ready. Data is presented in a 32-byte memory-mapped area in the cartridge memory space ($dfffc0) and is read in blocks like this from the GPU. When the block (FIFO buffer) is read, you clear the bit to acknowledge the data and the next FIFO read is kicked off. Data is actually sent in 512-byte packets, and there is also a packet available / packet ack handshake system as well.
It's basically abstracted as a synchronous call in the read file function at the moment, but I'll also add asynchronous methods which just leave the GPU running. At the moment it's stopped and started per read, but it could be set up as a full interrupt-driven block request system.
Finally I may see some light at the end of the tunnel. If this is completely stable it's also good news for CD emulation... as there have been so many little issues with just getting the cart data loading reliably, I'm not surprised there were issues with the CD data reads.
I'll now leave this soaking for the rest of the day...
Edited October 25, 2019 by SainT
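The two-level handshake described in the post above (32-byte FIFO blocks inside 512-byte packets, each with its own acknowledge) can be sketched as a small Python simulation. This is only an illustration under assumptions: the FakeCart class, its flag names, and the read_file helper are hypothetical and are not taken from the real firmware or menu code; only the 32-byte window and the 512-byte packet size come from the post.

```python
FIFO_BYTES = 32      # size of the memory-mapped window (from the post)
PACKET_BYTES = 512   # size of one packet (from the post)


class FakeCart:
    """Hypothetical stand-in for the cartridge side of the handshake."""

    def __init__(self, data: bytes):
        self._data = data
        self._pos = 0
        self._packet_left = 0
        self.fifo = b""
        self.fifo_ready = False
        self.packet_ready = False
        self._next_packet()

    def _next_packet(self):
        # Start a new packet if any file data remains.
        self._packet_left = min(PACKET_BYTES, len(self._data) - self._pos)
        self.packet_ready = self._packet_left > 0
        if self.packet_ready:
            self._fill_fifo()

    def _fill_fifo(self):
        # Present the next block of the current packet in the FIFO window.
        n = min(FIFO_BYTES, self._packet_left)
        self.fifo = self._data[self._pos:self._pos + n]
        self._pos += n
        self._packet_left -= n
        self.fifo_ready = True

    def ack_fifo(self):
        # Reader clears the "data ready" bit; this kicks off the next fill.
        self.fifo_ready = False
        if self._packet_left > 0:
            self._fill_fifo()

    def ack_packet(self):
        # Reader acknowledges the whole packet; the next one may start.
        self.packet_ready = False
        self._next_packet()


def read_file(cart: FakeCart):
    """Synchronous read: drain every FIFO block of every packet."""
    out = bytearray()
    fifo_reads = 0
    packets = 0
    while cart.packet_ready:
        while cart.fifo_ready:
            out += cart.fifo      # copy the window contents to "main RAM"
            fifo_reads += 1
            cart.ack_fifo()
        packets += 1
        cart.ack_packet()
    return bytes(out), fifo_reads, packets


if __name__ == "__main__":
    payload = bytes(range(256)) * 4          # 1024 bytes of test data
    data, fifo_reads, packets = read_file(FakeCart(payload))
    print(len(data), fifo_reads, packets)
```

In this model a full 512-byte packet takes 16 FIFO reads, so the number of acknowledges per packet is fixed and only the last packet of a file can be short.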
SainT (Author) Posted January 25, 2019
Well, it helped! Less frequent, but still sometimes a hang. And this time with no interrupts enabled, so the screen blanks too! Fun.
42bs Posted January 25, 2019
SainT said: The only odd thing I am seeing now is that sometimes the RISC doesn't start. I basically have this code --
        move.l  #RISCGO,G_CTRL      ; set the GO bit to start the RISC
.waitStart:
        move.l  G_CTRL,d0
        and.l   #RISCGO,d0
        beq.s   .waitStart          ; spin until GO reads back as set
.waitStop:
        move.l  G_CTRL,d0
        and.l   #RISCGO,d0
        bne.s   .waitStop           ; spin until GO reads back as clear
I've made sure the RISC is stopped before loading code and starting it up. I've added some border colour changes to see where it's hanging, and it's in the .waitStart loop waiting for the RISC to be running. Has anyone seen this before?

I looked up my code, and I do enable GPU/DSP with interrupts disabled (OK, I do not use the 68k for interrupts, hence the "stop"):
        lea     $00F02114,a3        ; GPU control register
        move.l  #startGPU,-4(a3)    ; GPU program counter sits 4 bytes below
        IF SOUND = 1
        move.l  #0<<11|1,$00F1A114  ; start the DSP
        ENDIF
        move.l  #0<<11|1,(a3)       ; start the GPU
.loop:
        stop    #$2000              ; sleep until an interrupt arrives
        tst.l   ResetFlag(a6)
        beq.s   .loop
Anyway, you should not read the GPU registers in a tight loop. There is an erratum about this I think, but I cannot find it yet. There should also be no reason to start the GPU in a loop.
42bs Posted January 25, 2019
SainT said: Good progress being made here. The interrupt stuff seems pretty stable now. I only enable, handle and disable interrupts from RISC code, and I've not had any odd hang-up issues on the RISC now. I also found a hardware bug in my ASIC which was causing weird behavior. The Jaguar hardware doesn't correctly de-assert the chip-selects when accessing external memory space from the RISC chips. It seems to behave OK from the 68k.
Do you use phrase loads?
SCPCD Posted January 25, 2019 (edited)
SainT said: I also found a hardware bug in my ASIC which was causing weird behavior. The Jaguar hardware doesn't correctly de-assert the chip-selects when accessing external memory space from the RISC chips. It seems to behave OK from the 68k. I'm not sure if it's because it's doing contiguous reads, so the external bus is saturated and the chip select doesn't get de-asserted, or if the external get is lazy and doesn't de-assert until a different address space is accessed.
As described in the "Technical reference v8", section "Timing Diagram", and probably in another document, you can't rely on Chip Select to latch the address: there is a pipeline mechanism on ROM1 access, so the Chip Select remains asserted during the whole "burst" transfer.
With the 68k it never happens, since the memory controller adds an extra cycle to "translate" the Jaguar bus to the 68k bus width, and so a bus release is made.
By contrast, with the GPU, Blitter or OP, you are 100% certain to encounter successive reads/writes in "burst mode" at one time or another.
With the DSP, I don't remember whether I made any ROM1 tests in the past, so I can't say if it's case 1 or case 2.
Edited January 25, 2019 by SCPCD
42bs Posted January 25, 2019
SainT said: Well, it helped! Less frequent, but still sometimes a hang. And this time with no interrupts enabled, so the screen blanks too! Fun.
Which CPU does the OP-list reloading? It sounds like the 68k does it: if the interrupts are disabled, you have no OBL reloading and the screen goes off!
SainT (Author) Posted January 25, 2019
42bs said: I looked up my code, and I do enable GPU/DSP with interrupts disabled (OK, I do not use the 68k for interrupts, hence the "stop"):
        lea     $00F02114,a3        ; GPU control register
        move.l  #startGPU,-4(a3)    ; GPU program counter sits 4 bytes below
        IF SOUND = 1
        move.l  #0<<11|1,$00F1A114  ; start the DSP
        ENDIF
        move.l  #0<<11|1,(a3)       ; start the GPU
.loop:
        stop    #$2000              ; sleep until an interrupt arrives
        tst.l   ResetFlag(a6)
        beq.s   .loop
Anyway, you should not read the GPU registers in a tight loop. There is an erratum about this I think, but I cannot find it yet. There should also be no reason to start the GPU in a loop.

OK, I'll switch to a flag for stopping and see how that affects it. I had this originally, but switched to polling RISCGO to make sure it had stopped. It's just polling while waiting for it to start as well, rather than setting the GO bit in a loop.
SainT (Author) Posted January 25, 2019
42bs said: Which CPU does the OP-list reloading? It sounds like the 68k does it: if the interrupts are disabled, you have no OBL reloading and the screen goes off!
Yep, screen blanking with interrupts off is to be expected. It's just a very basic setup for letting me handle what I need for the menu.
42bs Posted January 25, 2019
SainT said: For those who are interested in the mechanism, it's very much the same as the CD-ROM. You request a number of bytes to be read from a file and then you get an interrupt when data is ready. Data is presented in a 32-byte memory-mapped area in the cartridge memory space ($dfffc0) and is read in blocks like this from the GPU. When the block (FIFO buffer) is read, you clear the bit to acknowledge the data and the next FIFO read is kicked off. Data is actually sent in 512-byte packets, and there is also a packet available / packet ack handshake system as well.
What data rate do you expect from the SD card? For a high rate I'd suggest doing phrase reads into GPU RAM until one or two 512-byte chunks are read, and then blitting them to main RAM.
SainT (Author) Posted January 25, 2019
42bs said: What data rate do you expect from the SD card? For a high rate I'd suggest doing phrase reads into GPU RAM until one or two 512-byte chunks are read, and then blitting them to main RAM.
At the moment the data rate is being crippled somewhat by the microcontroller. The silicon is a bit crap, and the best I can get is about 8 MHz on the SPI line. Then at the moment it's being read in, then written out, which again halves the rate. With all the overhead, it's coming in at about 300 KB/s, which is just OK for CD!
I may be able to actually get the DMA hardware on the micro to read from one SPI channel and output to another, rather than me doing it via a buffer, so I may be able to double it. But either way, it's not bad. The data rate limitation certainly isn't on RAM writes.
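The figures in the post above can be sanity-checked with a little arithmetic. The sketch below assumes a single-data-line SPI link that moves one bit per clock and takes "300 KB/s" to mean 300,000 bytes per second; the function name spi_payload_rate and its passes parameter are made up for illustration.

```python
def spi_payload_rate(clock_hz: int, passes: int = 1) -> float:
    """Upper bound in bytes/s for data that must cross the SPI link
    `passes` times one after the other (read in, then written out = 2)."""
    bits_per_byte = 8
    return clock_hz / bits_per_byte / passes


raw = spi_payload_rate(8_000_000)                # one pass over the link
relayed = spi_payload_rate(8_000_000, passes=2)  # read in, then write out
observed = 300_000                               # figure quoted in the post
efficiency = observed / relayed                  # share left after overhead

print(f"raw ceiling:     {raw:,.0f} bytes/s")
print(f"relayed ceiling: {relayed:,.0f} bytes/s")
print(f"observed share:  {efficiency:.0%}")
```

Under these assumptions the relayed ceiling is half the raw one, which is why overlapping the read and the write with DMA could roughly double the throughput, as the post suggests.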
SainT (Author) Posted January 25, 2019
SainT said: OK, I'll switch to a flag for stopping and see how that affects it. I had this originally, but switched to polling RISCGO to make sure it had stopped. It's just polling while waiting for it to start as well, rather than setting the GO bit in a loop.
Initial testing with a stop flag seems promising. Tried with a stop flag just in the data read and a poll on the copy, and got a hang quite quickly. Changed to a stop flag in the copy as well and it's been going well. I can believe that polling the GPU control register at the wrong moment could feasibly screw something up, so I'm hopeful. More soak testing...
LinkoVitch Posted January 25, 2019
42bs said: What data rate do you expect from the SD card? For a high rate I'd suggest doing phrase reads into GPU RAM until one or two 512-byte chunks are read, and then blitting them to main RAM.
Would a phrase read make sense, given that the cart bus and the GPU memory bus are both 32-bit, not 64? Not sure if a split phrase read is more efficient or not than a regular long read? (I don't know the answer, hence asking the question.)
SainT (Author) Posted January 25, 2019
LinkoVitch said: Would a phrase read make sense, given that the cart bus and the GPU memory bus are both 32-bit, not 64? Not sure if a split phrase read is more efficient or not than a regular long read? (I don't know the answer, hence asking the question.)
I'm not sure if you can do a loadp / storep from cart to DRAM. There are scoreboard issues with phrase read / write as well. So in general, it's probably not useful...
42bs Posted January 25, 2019
LinkoVitch said: Would a phrase read make sense, given that the cart bus and the GPU memory bus are both 32-bit, not 64? Not sure if a split phrase read is more efficient or not than a regular long read? (I don't know the answer, hence asking the question.)
Phrase support is rather crippled, as _any_ read will destroy the high long. So if you have interrupts in the GPU, there is no way of doing so.
Other than this, I have to admit, I use it only in the OP-list update (interrupt). And my lists are short. So I have no "hard" figures to judge from. But the code is shorter B-)
42bs Posted January 25, 2019
LinkoVitch said: Would a phrase read make sense, given that the cart bus and the GPU memory bus are both 32-bit, not 64?
GPU/Blitter DRAM bus is 64-bit. Cart bus can be 8, 16 or 32 IIRC.
LinkoVitch Posted January 25, 2019
42bs said: GPU/Blitter DRAM bus is 64-bit. Cart bus can be 8, 16 or 32 IIRC.
I was referring more to the physical aspects of the bus(es). IIRC the GPU only has a 32-bit bus. It is within the Tom silicon, which obviously has the 64-bit bus attached and used by the blitter and OP, but the GPU bus is still only 32-bit, is it not?
SainT (Author) Posted January 25, 2019
The GPU has full 64-bit access to DRAM afaik, but has a 32-bit bus for local RAM and blitter access etc...
42bs Posted January 25, 2019
SainT said: The GPU has full 64-bit access to DRAM afaik, but has a 32-bit bus for local RAM and blitter access etc...
Right. From the docs:
The graphics subsystem transfers data to or from external memory by becoming the master of the co-processor bus. This bus has a 64-bit (phrase) data path, and a 24-bit address, with byte resolution. This bus has multiple masters, and ownership of it is gained by a bus request/acknowledge system, which is prioritised, i.e. ownership can be lost during a request (but not during a memory cycle). The graphics subsystem actually contains two bus masters, the Graphics Processor and the Blitter.
The graphics subsystem also acts as a slave on the IO bus. This bus normally has a 16-bit data path, and allows external processors to access memory and registers within the graphics subsystem. As the data path within the graphics subsystem is 32-bit, all reads and writes must be in pairs.
SainT (Author) Posted January 25, 2019
I've managed to read, then re-read and verify every Jag ROM there is back to back... twice... without any hangs or failures. Looking promising.
42bs Posted January 25, 2019
SainT said: I've managed to read, then re-read and verify every Jag ROM there is back to back... twice... without any hangs or failures. Looking promising.
So up to what size of SD card will you support? Large enough for _all_ games?
SainT (Author) Posted January 25, 2019
Anything FAT32 formatted -- so up to 32 GB. That should cover all ROMs and CDs.
Hastor Posted January 25, 2019
SainT said: Anything FAT32 formatted -- so up to 32 GB. That should cover all ROMs and CDs.
I've formatted things up to 1 TB as FAT32. There was a point where that's all a Wii would read, and it worked fine. I did have to use software other than Windows to do it, since Windows doesn't give the option at that size. So would a FAT32-formatted 64 GB card work, or is there still a size limit even if it is FAT32? I'm sure there's stuff I don't understand here. Not that it's needed, just curious.
42bs Posted January 25, 2019
Hastor said: I've formatted things up to 1 TB as FAT32. There was a point where that's all a Wii would read, and it worked fine. I did have to use software other than Windows to do it, since Windows doesn't give the option at that size. So would a FAT32-formatted 64 GB card work, or is there still a size limit even if it is FAT32? I'm sure there's stuff I don't understand here. Not that it's needed, just curious.
Same here. FAT32 has plenty of space: it is 2^32-4 clusters! So 2 TB with 512-byte clusters. But since the games are multiples of 2 MB, one can increase the cluster size. But then the FATFS (let me guess: ChanFS), or rather the µC, must be able to handle it.
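As a rough check on the cluster arithmetic, the sketch below computes the largest FAT32 volume for a few cluster sizes. It rests on two details of the FAT32 on-disk format rather than on anything in this thread: only 28 bits of each 32-bit FAT entry hold a cluster number, and the total sector count in the boot sector is a 32-bit field. The 512-byte sector size is an assumption, the handful of reserved cluster values is ignored, and the name fat32_max_bytes is made up for illustration.

```python
FAT32_CLUSTER_BITS = 28   # top 4 bits of each 32-bit FAT entry are reserved
MAX_SECTORS = 2 ** 32     # total-sector count is a 32-bit field
SECTOR_BYTES = 512        # assumed sector size


def fat32_max_bytes(cluster_bytes: int) -> int:
    """Approximate upper bound on volume size for a given cluster size."""
    by_clusters = (2 ** FAT32_CLUSTER_BITS) * cluster_bytes
    by_sectors = MAX_SECTORS * SECTOR_BYTES
    # Whichever limit is hit first bounds the volume.
    return min(by_clusters, by_sectors)


for cluster_bytes in (512, 4096, 32768):
    gib = fat32_max_bytes(cluster_bytes) / 2 ** 30
    print(f"{cluster_bytes:>6}-byte clusters: about {gib:,.0f} GiB")
```

On these assumptions a 64 GB card fits within FAT32 even at the smallest cluster size, so the 32 GB figure looks like a formatting-tool limit (as Hastor notes for Windows) rather than a limit of the filesystem itself. Whether the cart's firmware handles larger cards is a separate question that this arithmetic cannot answer.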
LinkoVitch Posted January 25, 2019
42bs said: Right. From the docs:
The graphics subsystem transfers data to or from external memory by becoming the master of the co-processor bus. This bus has a 64-bit (phrase) data path, and a 24-bit address, with byte resolution. This bus has multiple masters, and ownership of it is gained by a bus request/acknowledge system, which is prioritised, i.e. ownership can be lost during a request (but not during a memory cycle). The graphics subsystem actually contains two bus masters, the Graphics Processor and the Blitter.
The graphics subsystem also acts as a slave on the IO bus. This bus normally has a 16-bit data path, and allows external processors to access memory and registers within the graphics subsystem. As the data path within the graphics subsystem is 32-bit, all reads and writes must be in pairs.

I just found the page, and looking at the diagram I still think the GPU only has 32-bit external access. The diagram beneath that text doesn't label things clearly. At the bottom is the 64-bit system bus, below the GPU bus gateway. The blitter connects directly to the 64-bit side of the gateway. Above the gateway is the 32-bit local GPU bus, which connects to the GPU core and GPU local RAM (as well as the blitter, I imagine for speed and convenience).
I believe that to make, say, a phrase read/write, the GPU would fetch or write twice to main RAM. I doubt the gateway does a 64-bit r/w operation, and I don't think it forms part of the RISC core.
SCPCD probably knows 100%, given his tinkering.