SCPCD

Members
  • Posts

    136
  • Joined

Profile Information

  • Gender
    Male
  • Location
    France
  • Interests
    Informatic and Electronic

SCPCD's Achievements

Chopper Commander (4/9)

188

Reputation

  1. All this work is very interesting. :) I see in the jaguar-sdk that bin/linux/gdbjag uses m68k-coff-gdb. Is it available as a .exe somewhere for Windows?
  2. I think that:
     - the picture is really converted to 16-bit
     - in your object list, you specify a 24-bpp bitmap
     - the video mode is 16-bpp
     Together, these make it display properly in "not flipped" mode, as all 16-bit pixels are written back into the line buffer in the correct order (PIXx = one 16-bpp pixel): [PIX1,PIX2] [PIX3,PIX4] ....
     But in flipped mode, 32-bit data (as 24-bpp is in fact 32-bpp) is written back into the line buffer, so each write carries 2 consecutive 16-bpp pixels. This shows up as one column out of two being "reversed", which makes those artifacts: ...[PIX3,PIX4][PIX1,PIX2] instead of the proper: ...[PIX4,PIX3][PIX2,PIX1]
     To resolve the problem, I think you just need to set the bitmap to 16-bpp (and possibly correct the size values). On the Jaguar, you can't have a 24-bpp bitmap in 16-bit mode (or a 16-bpp bitmap in 24-bpp mode). The parameter is in the VMODE register.
  3. The blitter generates an interrupt when it goes to the "idle" state. :)
  4. Pin 1 is an "LVI" (Low Voltage Indicator): https://pdf1.alldatasheet.com/datasheet-pdf/view/12076/ONSEMI/MC34163.html It is intended to be used as a reset signal if the output power is too low, but it's not used on the Jaguar (the Jaguar has another dedicated reset circuit). Test point 87 is made available for diagnostics only.
  5. I don't think that the CLR instruction is the "reverse" of TAS, as CLR sets the flags according to the destination value (which is always 0). TAS sets the flags depending on what was there and what is there now. I would imagine that, to reduce transistor count, the CLR instruction probably shares all or part of its state machine with another instruction, like NEG for example (as they both have the same size field, effective address modes and timing).
  6. The title screen and the warp level are way smoother than on real hardware too, judging from the previous video. It doesn't seem cycle accurate (at least for the Blitter), as the Game Over screen is only limited by Blitter bandwidth and there isn't any frame limiting in this part of the T2K code (more fps = faster effect).
  7. From what I quickly read, they all say the same thing: GPU in main is 10x slower than GPU in internal. That is totally different from the "GPU in main is slower than 68k" that you claim we are saying...
  8. Yes, the time is different for each one: in sub-16-bit modes, the OP does more data shifts (e.g. in 8bpp, it does 3 shifts instead of 1 shift for 16-bit), which adds extra cycles for each phrase, as the OP can only write 2x16bpp at a time into the line buffer. Extra cycles are also spent on the CLUT read. By default, the OP locks the bus during the data shifting, but this can be disabled by a bit in the Bitmap Object description, allowing other CPUs to take the bus during large sub-16-bpp bitmaps. I haven't found any timing diagram in my archive, but I can make new ones for those cases if needed.
     What I said is for unscaled bitmaps. For scaled bitmaps it's totally different, as the OP writes 1 pixel at a time into the line buffer instead of 2, and this significantly increases the time needed to draw the bitmap into the line buffer.
     I mean that a frame is 1/60Hz = 16.6ms, and the OP can only work during 15.252ms, compared to the PSX where the GPU can work the full 16.6ms to render a frame. So the OP has 1.4ms less time to render a screen.
  9. Yes, 32x32 is more realistic: 14 cycles + ((32pix-4pix)*2 Bpp / 8)*3 cycles/phrase = 14 cycles + 7 phrases*3 cycles/phrase = 35 cycles. 63.55µs*26.59MHz/35 = 48 objects per line. Dividing the screen into 224/32 = 7 bands, we get around 7*48 = 336 sprites of 32x32 16bpp.
  10. Is there a bug in the matrix? https://forums.atariage.com/topic/332827-neo-geo-to-jaguar-ports-in-321-go/?do=findComment&comment=5053480 As explained on my website about F.Act.S: the 2nd screen has 1887 visible moving OP sprites. The 1st one has 1900 visible and grows up to 2090 visible moving OP sprites with RMW enabled.
      The OP has advantages and also disadvantages. The disadvantage is that it has only around 64µs to draw the whole line, because it works in a line buffer, whereas the PSX uses a framebuffer and has the full frame time (16.6ms) to render all sprites (as already said by CJ). In pure rendering time, the OP can work for approx. 63.55µs*240 lines = 15.252ms. That's around 1.4ms less than the PSX.
      The most optimised object is a 4-pixel-wide 16bpp one that takes 14 cycles to execute. This allows a theoretical maximum of 120 objects per line (63.55µs*26.59MHz/14 = 120). The wider the object is (each extra bitmap phrase read takes 3 extra cycles), the fewer objects you can have per line. It's not magic: it's mathematics.
      For a 128-pixel-wide 16bpp object it will be around: 14 cycles + ((128pix-4pix)*2 Bpp / 8)*3 cycles/phrase = 14 cycles + 31 phrases*3 cycles/phrase = 107 cycles. 63.55µs*26.59MHz/107 = 15 objects. The theoretical maximum on the OP is 15x 128-wide 16-bit objects per line. (I say "theoretical" because we don't take into account branches or DRAM refresh, which probably reduce this number.)
      In theory, it can do 15 objects/line*240 lines = 3600 objects of 128 wide x 1 high in 16-bit, but does it make any sense to do 1-pixel-high objects? Group those simple objects into taller ones and it comes to approximately a theoretical 15 objects of 128x128 16-bit on screen, maybe a little more depending on the position of those objects, but it won't be anything spectacular. Way behind those 3000 or 700 lol.
  11. In the listed code, GT is defined with ccdef as $15, which is "Not Negative and Not Zero"; that's why I think it does one loop fewer than it should 😕
  12. I think that the compiler error is that there is no real way to do a signed GT on the DSP/GPU: https://www.mirari.fr/9d0k The NNNZ test can be wrong for some signed values, as the overflow flag doesn't exist in the DSP/GPU. Another thing: how does it compile "for (i = 0; i<2; ++i)" (does it push 2-1 into r20)? Maybe I'm not reading the code and counting properly, but I feel like it does 1 loop fewer than it should?
  13. Replacing the FPM DRAM memory controller with an SDRAM memory controller will not change the face of the world for the Jaguar, simply because:
      - the 68k is way slower than the FPM DRAM and only does random read-write accesses => so no performance increase here
      - the DSP/GPU don't do FPM DRAM accesses => so no performance increase here
      - the Blitter does FPM DRAM accesses only when configured in write-only mode; all other modes are random read-write accesses => so no performance increase in textured mode, only in flat/gouraud
      - the OP does FPM DRAM accesses for bitmaps => the performance will increase significantly for wide bitmaps
      Random read/write on SDRAM is not free, and SDRAM also needs a higher refresh rate. As is, Doom, Skyhammer, etc. will not get the boost you think they will, as the bottleneck is not the bandwidth of the memory but the lack of cache for each CPU.
  14. Not all FPM DRAM can reach that speed; it all depends on the speed grade of the memory given by the manufacturer (and the price you are ready to pay for the component). I'm just saying that the parts used in the Jaguar have a speed grade of 70ns (and I think there are also some 80ns ones), which gives an FPM cycle of 45ns, so by definition they can't reach 25MHz. The ~100MBps limitation is due to the speed grade of the DRAM used in the Jaguar and not to the fact that there is a 68k.
  15. The Hitachi 70ns DRAM that is used in some Jaguars is rated at 45ns for the FPM cycle; this implies 2 cycles @26.59MHz and has absolutely nothing to do with the 68k.
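
The DRAM timing argument in posts 14 and 15 can be checked with a few lines of arithmetic. This is only a rough sketch: the 45ns FPM cycle and the 26.59MHz clock come from the posts, but the 8-byte (64-bit) bus width is an assumption that the posts do not state, and the function names are made up for illustration.

```python
import math

CLOCK_MHZ = 26.59      # system clock quoted in the posts
FPM_CYCLE_NS = 45.0    # FPM cycle time of the 70ns parts, per the posts
BUS_BYTES = 8          # ASSUMPTION: 64-bit data bus (not stated in the posts)


def clocks_per_fpm_access(fpm_ns: float, clock_mhz: float) -> int:
    """Whole system clocks needed to cover one FPM page-mode cycle."""
    clock_period_ns = 1000.0 / clock_mhz
    return math.ceil(fpm_ns / clock_period_ns)


def max_fpm_rate_mhz(fpm_ns: float) -> float:
    """Highest access rate the DRAM itself allows in page mode."""
    return 1000.0 / fpm_ns


def peak_bandwidth_mb_s(fpm_ns: float, clock_mhz: float, bus_bytes: int) -> float:
    """Peak page-mode bandwidth: one bus-wide transfer per FPM access."""
    return bus_bytes * clock_mhz / clocks_per_fpm_access(fpm_ns, clock_mhz)


if __name__ == "__main__":
    print(clocks_per_fpm_access(FPM_CYCLE_NS, CLOCK_MHZ), "clocks per FPM access")
    print(round(max_fpm_rate_mhz(FPM_CYCLE_NS), 1), "MHz max page-mode rate")
    print(round(peak_bandwidth_mb_s(FPM_CYCLE_NS, CLOCK_MHZ, BUS_BYTES)), "MB/s peak")
```

Under these assumptions one clock period is about 37.6ns, which is shorter than the 45ns FPM cycle, so every page-mode access has to take two clocks, whatever else sits on the bus.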
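
The object processor budget worked out by hand in posts 9 and 10 can be reproduced with a short script. This is a sketch of that arithmetic only, using the figures quoted in the posts (14 cycles for a 4-pixel 16bpp object, 3 cycles per extra phrase, 63.55µs per line at 26.59MHz); the function names are hypothetical, and like the posts it ignores branch objects and DRAM refresh.

```python
LINE_TIME_US = 63.55          # time available per scanline, from the posts
CLOCK_MHZ = 26.59             # system clock, from the posts
BASE_CYCLES = 14              # cost of a 4-pixel-wide 16bpp object
CYCLES_PER_EXTRA_PHRASE = 3   # cost of each additional bitmap phrase read
PHRASE_BYTES = 8              # one phrase = 64 bits
BYTES_PER_PIXEL = 2           # 16bpp


def object_cycles(width_px: int) -> int:
    """Cycles to process one unscaled 16bpp object (width a multiple of 4, >= 4)."""
    extra_phrases = (width_px - 4) * BYTES_PER_PIXEL // PHRASE_BYTES
    return BASE_CYCLES + extra_phrases * CYCLES_PER_EXTRA_PHRASE


def objects_per_line(width_px: int) -> int:
    """Theoretical number of such objects that fit in one scanline."""
    cycles_per_line = LINE_TIME_US * CLOCK_MHZ   # µs * MHz = cycles
    return int(cycles_per_line // object_cycles(width_px))


def sprites_on_screen(size_px: int, visible_lines: int) -> int:
    """Square sprites on screen when the display is split into bands."""
    bands = visible_lines // size_px
    return bands * objects_per_line(size_px)


def render_window_ms(visible_lines: int) -> float:
    """Total time per frame during which the OP can draw."""
    return LINE_TIME_US * visible_lines / 1000.0


if __name__ == "__main__":
    for width in (4, 32, 128):
        print(width, object_cycles(width), objects_per_line(width))
    print(sprites_on_screen(32, 224), "sprites of 32x32")
    print(render_window_ms(240), "ms of OP drawing time per frame")
```

The script gives the same figures as the posts: 120, 48 and 15 objects per line for 4, 32 and 128 pixel widths, and 336 sprites of 32x32 on a 224-line screen.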
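
The point in posts 11 and 12 about a signed GT built only from the N and Z flags can be illustrated with a small model. This is a simplified sketch, not the actual GPU/DSP instruction semantics: it assumes a 32-bit subtract of `a - b` that exposes only a negative flag and a zero flag, and the helper names are invented for the example.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def nz_flags_of_sub(a: int, b: int) -> tuple[bool, bool]:
    """Return (N, Z) for the wrapped 32-bit result of a - b; no overflow flag is modelled."""
    result = (a - b) & MASK32
    return bool(result & SIGN32), result == 0


def gt_by_nnnz(a: int, b: int) -> bool:
    """'Greater than' decided as Not Negative and Not Zero, as in the posts."""
    negative, zero = nz_flags_of_sub(a, b)
    return (not negative) and (not zero)


if __name__ == "__main__":
    cases = [(5, 3), (3, 5), (2147483647, -1), (-2147483648, 1)]
    for a, b in cases:
        print(a, b, "NNNZ says", gt_by_nnnz(a, b), "truth is", a > b)
```

For small values the NNNZ test agrees with a real signed comparison. It goes wrong as soon as the subtraction overflows: 2147483647 > -1 is reported as false, and -2147483648 > 1 is reported as true.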