Everything posted by Tursi
-
Our own super, maga, wonder, ultra cart board thing...
Tursi replied to matthew180's topic in TI-99/4A Development
Right... only reason I'm asking the way I am is that my code specifically has no access to the CPU memory bus, but Retroclouds is right, I have plenty of RAM I won't be using. I guess I'll make it available on one of the GROM bases, and if people want to use it, then it's there. As GRAM it WILL be different from the writable GROM at the rest of the bases, since they will need to use a sequence to erase the flash before writing is possible.
-
Our own super, maga, wonder, ultra cart board thing...
Tursi replied to matthew180's topic in TI-99/4A Development
The Atmega1284's purpose on that board is solely GROM emulation - it's the ultimate extension of my single-chip GROM emulator taken to the ridiculous. The goal (which I need to finish coding as my next project) is to make it writable from the TI (tricky but doable), and to make it respect multiple GROM bases (easily feasible). (GROM read already works just fine). Making it respond as RAM was not something we planned on, since the flash chip will live at the cartridge memory address. But I should have lots of RAM free - if I made some RAM accessible through the GROM port, would that still be valuable?
-
Some tactics to remember:
-You can jump or avoid the plates by climbing a pole. Plates will drop through the hole made when a pole rises through the platform, so watch out if you are below.
-Knives are thrown either high or low: high knives can be ducked, low knives you must jump or avoid.
-Since you never know if the guy on the end will attack, be ready any time you see him.
On higher levels there is a greater chance of both appearing on your level and attacking. Plates spin at a consistent speed, so hitting up more than once is good for effect but doesn't have any effect on the gameplay. You can do this to reset a wobbly plate if you get to it in time. It is impossible on higher levels to get them all in one life, but the best strategies for best score aren't determined yet.
-
You can always review the FAQ here: http://harmlesslion.com/hl4m/viewtopic.php?f=12&t=330 or the manual here: http://www.harmlesslion.com/software/skunk Anyone who wants to develop the USB ports is welcome to write the drivers, it's a development tool, after all. Much more productive than sneering that they aren't already done.
-
Although the Jagware Flash cart is not yet released, we can see they are aimed at two different markets. Skunkboard is aimed at developers. It allows you to upload software to the Jaguar, test it, communicate with it, and reset the Jaguar, all from your keyboard over USB. The Jagware flash cart, as I understand it, is aimed at publishers, people who need an inexpensive way to release their product. It can also be used for development, using a parallel port and a BJL cable, but it's unclear how much software support for this will be included. Both devices are capable of running existing software and betas, although Skunk blocks some current titles and the Jagware flash does not. Both devices use similar technology and will have similar incompatibilities. Jagware's flash is much cheaper, so if that's all you want to do, maybe it's the better choice. It might even be cheap enough to buy several and leave the betas on them (final price not being set yet). Skunkboard has got a few years in the field now, so we have a bit of proof of concept going. It's a more expensive device but this is because it's designed to do more. And as always we're willing to share the datasheets (not that we need to, the chips are labelled) to anyone who wants to develop the USB ports further.
-
Skunkboard is kind of off topic, so I'll keep it brief, but to answer the questions that keep coming up, no official Atari titles or betas are deliberately locked out of the Skunkboard BIOS. Only software that current developers have requested be locked. Without the permission of those developers, I am bound by my agreements not to be more specific. I would expect people to respect that. If you flash one of these titles to either bank, you will not be permitted to run ANY software until you overwrite it with something else - this is to prevent backdoor workarounds. The failure mode is a red screen. If you are trying to launch via JCP, you will also get the message "Unauthorized. You must flash a different rom to proceed. (Remember to reset the jag with 'jcp -r'!)" Skunkboard's purpose is development. Therefore, if you are developing a title that triggers this message, get ahold of me. I will either fix my BIOS or tell you how to fix your code not to trigger it. That's pretty easy. The signatures are very specific. If you are trying to run an old beta or commercial title that triggers it, you can ask nicely and I'll try to help, but I have always made it clear that my policy is NO SUPPORT of these titles, and they either just work or don't work. But I can take a look, and either modify my BIOS (if it's severe) or hack the beta (if it's not) to make it work. Nobody has shown me a file that falsely triggers the unauthorized message, however (I have tried an Atari Karts beta, I don't know which one is "the" Atari Karts beta). That said, I /have/ seen some crashes make it look like an unauthorized result, but resetting the Jaguar and trying to launch again (with the DPad) showed that it was just a crash, and not a true false positive. There's not much I can do about code that scribbles all over memory or code that won't work on the board at all, but others are welcome to try and fix them.
-
I assume you're talking about the Skunkboard? It's a bit surprising; the "ban list" has not been released, but I was under the impression that the only banned game was Battlesphere. It could be a false positive. Most likely. But people tend to whine and moan more than try and fix problems. Nobody has ever sent me any information on false positives for me to fix, and I've tested it against every file in my collection numerous times. That said, I don't obtain every possible file either, I rely on people to let me know when there is a problem.
-
Thanks for the hotel tip, Matthew! I'm booked now at the same Best Western. I was going to go Orrington just to avoid being lost, but you're right, it really is a simple walk! Now I just have to hope I have anything at all to show.
-
The GPL COINC routine just does a distance compare, Owen, we already taught you how to do that in assembly.
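For anyone following along, a distance compare of this kind can be sketched in C as below. This is only an illustration, not the GPL routine itself: the function name `sprites_coincide`, the per-axis comparison and the `tolerance` parameter are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: two sprites "coincide" when their positions are
 * within `tolerance` pixels of each other on both axes. */
static bool sprites_coincide(int y1, int x1, int y2, int x2, int tolerance)
{
    return abs(y1 - y2) <= tolerance && abs(x1 - x2) <= tolerance;
}
```

The same two subtract, absolute-value and compare steps map directly onto a handful of 9900 instructions.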
-
There actually was an agenda of sorts last year. It's not right to put all the blame on Hal there, because the PLAN was for a full day - the library cut that plan in half and he didn't want to tell people they couldn't present. I think that's reasonable given some people travelled just to present. I was frustrated, too, but credit where credit's due.
-
I still don't know why you added the technically correct 4 sprite limit... It is going to take some serious convincing to get me to see why I would want to have the limit set by default. I like my idea of a register that you can set a limit if you want it. Because, I have one program that relied on it for correct graphics, and it bugged me. Also because someone was being twitty about it in the mailing list. My point is, either have it set on by default (with a register to DISABLE the flicker, rather than enable it), or don't include it at all. Having an option to turn it on doesn't help because I can't see anyone writing software for a chip without sprite limits but wanting to turn them on -- said software won't work with the real 9918. Software on the real 9918 that DOES expect the limit won't work correctly on yours, because it doesn't know to set the new register you are proposing. There's so little that relies on it that my personal thinking is that either of these options is equally desirable. But saying you need to set a register to ENABLE 4-sprite-on-a-line -- I can't think of a case that would be used. Not sure what you mean by "tiles that flicker when you don't want the to"? Because I wrote "titles", not "tiles". ie: games like the one you often quote, Munchman. No boards. I have a working single-chip emulation of GROMs that runs on AVRs at full speed. AVRs are not pin-compatible with GROMs, unfortunately (at least none I've found are), but you could wire them into the console. In fact one Mega168 can replace two GROMs. I haven't released it yet because I need to code up the write mode, and for larger AVRs, multiple GROM bases. Soon, I hope.. my last day at work is Friday, then I'm out of the country for a couple of weeks, then I need to knuckle down.
-
Sega EVERDRIVE Flash Cart Official Mass Order Thread
Tursi replied to the.golden.ax's topic in Buy, Sell, and Trade
Mine arrived today, thanks Ax!
-
hehe, this is moving along beautifully!
-
Great to see you announce this!! If I had a choice, I'd like to see the 4-sprite on a line limit active by default if it's there at all, but I only have one project that used it deliberately (to mask out sprites on certain lines). Classic99 survived for a very long time not supporting the limit without any real issues, too. I think that only software which expects the limit will be written for it, so a switch to enable it probably is not very useful. That said, enhancing existing titles that flicker when you don't want them to is nice, too. Perhaps we could release a hacked GROM that lets you select certain VDP modes from the master title page?
-
No, you set an address without the prefetch inhibit bit, but you write without reading first. A huge amount of confusion about the way the 9918 address counter works was caused by TI telling you to "set a read address" or "set a write address". When you get down to the lowest levels, there is no such thing: the bit that you set for a "write" address is actually a prefetch inhibit. So the normal approach to set a "write" address already stored in R1 (and leaving it untouched in the end) is something like this:

    VDPWB  DATA >4000
           SOC  @VDPWB,R1      * Make it a 'write' address
           SWPB R1             * LSB first
           MOVB R1,@VDPWA      * Write to VDP address register
           SWPB R1             * Get MSB and delay
           MOVB R1,@VDPWA      * Write to VDP address register
           SZC  @VDPWB,R1      * Get rid of the 'write' bit

This works just as well, and is slightly faster:

           DEC  R1             * Make one less to account for VDP prefetch
           SWPB R1             * LSB first
           MOVB R1,@VDPWA      * Write to VDP address register
           SWPB R1             * Get MSB and delay
           MOVB R1,@VDPWA      * Write to VDP address register
           INC  R1             * Get back the original value

In emulation it only works on emulators that get the VDP prefetch correct. Emulators that get it wrong will also fail on the Diagnostic cartridge memory "checkerboard test", and the game Popeye will leave graphical glitches when a bottle is thrown.

Knowing about the way that VDP address register works can help your code a little, too, since you can freely change between reads and writes without changing the address register, if your data layout happens to work with that. For that matter, if you need to skip one or two bytes of VDP memory, it's generally faster to just read them than to set the address explicitly again (since it takes two VDP writes to set the address again). You can do this even if you are writing data, you just have to be careful of the address counter, since reads and writes increment it at different times (reads increment before you read the data due to prefetch, writes after you write it).
For instance, let's say I want to move two sprites in the sprite table, but not touch color or character data. The sprite table layout is: Y, X, Char, Color. (For simplicity, assume sprite one X and Y are in R0,R1, sprite two X and Y are in R2,R3, and the SAL is at >0300):

    * another trick - pre-swapping the defined address saves a SWPB in the code.
    * Note I've set >4000 here for write.
    SALTAB EQU  >0043
           LI   R5,SALTAB      * Get address of sprite attribute list
           MOVB R5,@VDPWA      * pre-swapped, don't need the first SWPB
           SWPB R5             * get MSB and delay
           MOVB R5,@VDPWA      * write MSB - address is now set.
    * On a stock 99/4A we don't need to delay unless we use register-only addressing
    * to access the VDP in the very next instruction, but you can if you are nervous
    * or want to work on accelerated machines.
           MOVB R1,@VDPWD      * write sprite 0 Y. No delay needed between writes on a stock 99/4A
           MOVB R0,@VDPWD      * write sprite 0 X. The address pointer now points to Spr0.Char, we want to skip two.
           MOVB @VDPRD,R5      * read garbage from prefetch and increment address pointer.
    *                            Prefetch now has Spr0.Char and address is Spr0.Color
           MOVB @VDPRD,R5      * read Spr0.Char from prefetch. Prefetch now has Spr0.Color and address is Spr1.Y.
    * No delay needed on stock 99/4A between reads unless you are using register-only
    * addressing (even that is on the edge).
           MOVB R3,@VDPWD      * write sprite 1 Y. Address counter increments as you expect.
           MOVB R2,@VDPWD      * write sprite 1 X. We're done.

The trick, really, is that the VDP doesn't have a "read mode" or a "write mode". It has an address register and a prefetch register, and a read port and a write port. Accessing the read port returns the prefetch register, fetches the data at the address register into the prefetch register, then increments the address register. Accessing the write port writes the data byte to memory at the address register, then increments the address register (it may store the byte temporarily in the prefetch register, I need to test that still).
There is a caveat to all the above, though. Later versions of the chip had separate read and write address pointers, meaning that these tricks will NOT work on the 9938 or 9958. If you want to be compatible then you do need to think of it in terms of how TI specified a "read address" and a "write address".
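The register model described in this post (for the 9918, not the 9938/9958) can be sketched as a small C simulation. This is an illustration under stated assumptions rather than a hardware-accurate model: the names `vdp_t`, `vdp_set_addr`, `vdp_read`, `vdp_write` and `move_two_sprites` are made up for the example, the prefetch is assumed to happen immediately when an address is set without the >4000 bit, and the untested detail about writes passing through the prefetch register is left out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the 9918 address/prefetch behaviour described above. */
typedef struct {
    uint8_t  vram[0x4000];  /* 16K of video RAM */
    uint16_t addr;          /* 14-bit address register */
    uint8_t  prefetch;      /* prefetch (read-ahead) register */
} vdp_t;

/* Set the address register. Bit >4000 is treated as "prefetch inhibit". */
static void vdp_set_addr(vdp_t *v, uint16_t value)
{
    v->addr = value & 0x3FFF;
    if (!(value & 0x4000)) {               /* "read" address: prefetch now */
        v->prefetch = v->vram[v->addr];
        v->addr = (v->addr + 1) & 0x3FFF;
    }
}

/* Read port: return the prefetch, refill it, then increment the address. */
static uint8_t vdp_read(vdp_t *v)
{
    uint8_t out = v->prefetch;
    v->prefetch = v->vram[v->addr];
    v->addr = (v->addr + 1) & 0x3FFF;
    return out;
}

/* Write port: store at the address register, then increment it. */
static void vdp_write(vdp_t *v, uint8_t data)
{
    v->vram[v->addr] = data;
    v->addr = (v->addr + 1) & 0x3FFF;
}

/* Move two sprites (Y, X, Char, Color layout) without touching Char/Color:
 * two dummy reads skip the two bytes instead of setting the address again. */
static void move_two_sprites(vdp_t *v, uint16_t sal,
                             uint8_t y0, uint8_t x0, uint8_t y1, uint8_t x1)
{
    vdp_set_addr(v, sal | 0x4000);  /* "write" address, no prefetch */
    vdp_write(v, y0);
    vdp_write(v, x0);
    (void)vdp_read(v);              /* skips Spr0.Char */
    (void)vdp_read(v);              /* skips Spr0.Color */
    vdp_write(v, y1);
    vdp_write(v, x1);
}
```

In this model, calling `vdp_set_addr(&v, addr - 1)` without the >4000 bit and then `vdp_write` lands the byte at `addr`, which is the DEC/INC trick from the post.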
-
Well, a similar trick to Thierry's with Compare is that you can increment /two/ registers in one instruction:

    C  *R1+,*R2+    -- increment each register by 2
    CB *R1+,*R2+    -- increment each register by 1

and since it supports full addressing modes, you can increment memory locations as well as registers. The idea is you get two for one of whatever it is.

One I've used a few times was to use the parity status bit to test a bit - sort of. It's not often thought of, but if you know that the number of set bits will change in a byte, you can use JOP rather than masking and comparing, since the MOVB will set the parity bit if there are an odd number of set bits in the moved byte. (Doesn't work with words!)

Storing a commonly used memory address or even a commonly used number in a register can make a huge difference in the size of your program. Smaller programs on the 9900 execute faster. We have 15 general purpose registers - put them to work!

Registers should pretty much always be in scratchpad if you care at all about performance of the code, or you use no registers. Not really a trick, I guess, but pretty important. Likewise, if you can spare the space, copying commonly used code to scratchpad becomes a performance win after about 4 instructions.

This one is common to all architectures, but tail recursion is a common optimization, and it works very well on the 9900 where there's no stack. A common way to deal with a subroutine that calls a subroutine is to move R11 to another temporary register or memory location. Something like this:

    SUB1 DO_SOME_WORK
         MOV  R11,R10     * Save return address
         BL   @SUB2
         B    *R10
    SUB2 DO_SOME_OTHER_WORK
         B    *R11

If your second subroutine call is the last thing you do, then don't bother saving the return address. Branch to the subroutine instead and it will use YOUR R11 when it returns. Saves memory, and instructions. This does assume that SUB2 doesn't need R11 to point into SUB1 for any reason.
    SUB1 DO_SOME_WORK
         B    @SUB2       * SUB2 will return for us
    SUB2 DO_SOME_OTHER_WORK
         B    *R11

Another common-to-all-archs one reminds that shifting is much faster than multiply and divide, if you are doing a power of two. For instance, SLA R1,2 is much faster and saves a register over multiplying R1 by 4. If you are not doing a power of two to multiply, it is commonly reputed faster if you can break it into two power-of-two multiplies (done as shifts) and an addition. For example, to multiply R1 by 10, use a second register to make a copy, multiply one by 8 and one by 2, then add them, like so:

    * 10 can be done in powers of two as 8 + 2
    R1X10 MOV  R1,R2
          SLA  R1,3       * multiply by 8
          SLA  R2,1       * multiply by 2
          A    R2,R1      * put result in R1

Of course, the TI has a funny architecture. All other things being equal, the above doesn't work out for multiplies. The above code takes 14+18+14+14 cycles and 14 memory accesses, plus 8 bytes. In 8-bit RAM, regs in scratchpad, that would total 76 cycles. The MPY would probably look like this:

    R1X10 LI   R0,10
          MPY  R0,R1

This takes 12+52 cycles and 8 memory accesses, plus 6 bytes. In 8-bit RAM, regs in scratchpad, it would total 76 cycles. Note it's the same, and the MPY takes less code space. If you don't need to load the 10 (ie: it's already loaded elsewhere), the MPY can actually be faster. Worse, if registers are in 8-bit RAM, the shift/shift/add approach takes 116 cycles, while the MPY approach only goes up to 96 cycles. So the rule there is SLA if it's a power of 2, this is much faster and saves a register, but if you need two shifts the MPY is nearly always the better choice, unintuitively.

DIV, on the other hand, has a best case of 92 cycles (unless it overflows), but I don't think two shifts and an add work out.. but if you can find a way to avoid DIV your code will appreciate it. On the other hand, DIV is the slowest instruction on the chip and could be used for delays.

Remember that almost every instruction that touches memory does an implicit compare against zero.
If you can make that meaningful, you can skip explicit compares after most operations. For instance, JNE/JEQ jump on exactly zero, JLT/JGT jump based on the status of the highest bit (treating it as a sign bit), JOP jumps based on the number of '1' bits if it was a byte operation. Arranging your data will almost always give you the best savings, if you can.

I've used this VDP trick once or twice - it works on hardware but not so much on some of the emulators. Remember that the only difference between setting a read address and setting a write address is whether the prefetch occurs. So if you are desperate and don't care about compatibility, setting a read address one less than the one you want will still have the correct result because the prefetch will bump the address up before you write, and it is faster than the operations needed to set (and clear) a bit. For instance:

    DEC R1 / INC R1    - 10 cycles each (write address as a read address one less than desired)
    ORI R1 / ANDI R1   - 14 cycles each plus extra program memory read (slowest method)
    XOR R1 / XOR R1    - 14 cycles each plus extra memory read
    SOCB / SZCB        - 14 cycles each plus extra memory read

Unfortunately, to reiterate, this trick is not compatible with some emulation.
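The multiply totals quoted in this post are consistent with a simple accounting model: base instruction cycles plus 4 extra cycles for every word access that goes to 8-bit RAM. The C sketch below assumes that model (the 4-cycle penalty is inferred from the figures given here, and the function names are illustrative only) and also shows the shift/shift/add form of the multiply by 10.

```c
#include <stdint.h>

/* Assumed penalty: 4 extra cycles per word access to memory on the 8-bit bus. */
enum { WAIT_PER_ACCESS = 4 };

/* Total cycles = base instruction cycles + wait states for slow accesses. */
static unsigned total_cycles(unsigned base_cycles, unsigned slow_accesses)
{
    return base_cycles + WAIT_PER_ACCESS * slow_accesses;
}

/* x * 10 computed as (x << 3) + (x << 1), mirroring the SLA/SLA/A sequence.
 * Results wrap at 16 bits, like a 9900 register. */
static uint16_t times10_shift(uint16_t x)
{
    return (uint16_t)((uint16_t)(x << 3) + (uint16_t)(x << 1));
}
```

With registers in scratchpad only the instruction words are slow (4 words for the shift version, 3 for LI plus MPY), so `total_cycles(60, 4)` and `total_cycles(64, 3)` both give 76. With registers in 8-bit RAM every access is slow, so `total_cycles(60, 14)` gives 116 against `total_cycles(64, 8)` at 96.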
-
My involvement does end at "is it up?"... but dropping Rich a note that we're talking about it again with suggestions might be a good idea. I'm still 5 years behind on updating my own page. But as far as I know Ninerpedia shouldn't need any work, just contributors, yes?
-
Please fill out and submit the following...
-
Please, go ahead! It's just lack of input so far.
-
99er.net has been around for longer than most of the other TI sites out there, and both Rich and I have access to keep it running and maintain it. It has been ignored a lot for not being actively promoted, but I don't see the fear that it'll disappear. Should Rich decide not to host it any longer he knows there's an open offer to move it to my server. If backups are a concern, we can add a way to download the wiki if it's not already there.
-
I second that, Hal's an interesting guy online, but he's pretty damn fun to hang out with RL. Especially after a few beers.
-
That's got it covered, Owen
-
Oh, my going isn't predicated on finishing my project, only on time/money. I will try to get it booked this week and report back, but this has been a very expensive month.
-
How does that work? The PEB needs a computer attached to drive it... is there some trick I'm missing (or does the Nano Peb operate standalone?)
-
I'm not up to date with what the kids are using for their devkits lately, so I don't know what library functions you have available to you. You know your addresses, so it's pretty easy. To use pseudo-code, just create a buffer in CPU memory 8 bytes long. Read VRAM into the buffer, then write VRAM from it. It is possible to do it without an intermediary buffer, but because you have to change the VDP address twice for every byte, it's (comparably) very slow. If 8 bytes RAM is critical to you, and speed isn't, then just do it one byte at a time. Anyway, something like this:

    unsigned char vram_buf[8];      /* defines an 8-byte buffer in CPU RAM */
    get_vram(0x480, vram_buf, 8);   /* assuming VRAM source is first parameter */
    put_vram(0x408, vram_buf, 8);   /* assuming VRAM dest is first parameter */

I made a lot of assumptions there, though. But I do have an actual example that I use in my code, though with all the extra defines it looks longer...

    // used for patcpy
    unsigned char tmpbuf[8];

    // base address in VDP RAM of the pattern table
    #define gPATTERN 0x1000

    // copies a VDP character pattern
    void patcpy(int8 from, int8 to) {
        cvu_vmemtomemcpy(tmpbuf, ((int)from<<3)+gPATTERN, 8);
        cvu_memtovmemcpy(((int)to<<3)+gPATTERN, tmpbuf, 8);
    }

I'm using the library from http://www.colecovision.eu/ColecoVision/development/libcv.shtml for that one (though I've been adapting everything to resemble TI Extended BASIC, so slowly been replacing the functions, for better or for worse). Still, the names should be straightforward enough to adapt to what you have. I made the temporary buffer a global because I share it with several other functions, including my screen scroll. The #define sets the base address of the pattern table. I need this because, as you see, my patcpy function takes character numbers, rather than addresses. This does cost a little more CPU time - there's a shift and an addition in each function call, but it's straightforward to remember. And that's it.. the first line copies 8 VDP bytes into the tmpbuf, and the second line copies them back out to a new location. hth
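To try the idea without any devkit, here is a self-contained C sketch that runs against a mock VRAM array. Everything in it is a stand-in: `mock_vram`, `vram_to_mem`, `mem_to_vram`, `pattern_addr` and the two `patcpy_*` functions are hypothetical names for illustration, not part of libcv or any other library. It shows both the buffered copy and the one-byte-at-a-time variant mentioned above.

```c
#include <stdint.h>
#include <string.h>

#define PATTERN_BASE 0x1000u        /* pattern table base, as gPATTERN in the post */

static uint8_t mock_vram[0x4000];   /* stand-in for the real 16K of VRAM */

/* Hypothetical stand-ins for a library's VRAM block transfer calls. */
static void vram_to_mem(uint8_t *dst, uint16_t src, unsigned n)
{
    memcpy(dst, &mock_vram[src], n);
}

static void mem_to_vram(uint16_t dst, const uint8_t *src, unsigned n)
{
    memcpy(&mock_vram[dst], src, n);
}

/* Each character pattern is 8 bytes, so the address is base + ch * 8. */
static uint16_t pattern_addr(uint8_t ch)
{
    return (uint16_t)(PATTERN_BASE + ((uint16_t)ch << 3));
}

/* Buffered copy: one block read into CPU RAM, one block write back out. */
static void patcpy_buffered(uint8_t from, uint8_t to)
{
    uint8_t buf[8];
    vram_to_mem(buf, pattern_addr(from), 8);
    mem_to_vram(pattern_addr(to), buf, 8);
}

/* Byte-at-a-time copy: needs only one byte of CPU RAM, but on real hardware
 * the VDP address has to be set twice per byte, so it is much slower. */
static void patcpy_bytewise(uint8_t from, uint8_t to)
{
    for (unsigned i = 0; i < 8; i++) {
        uint8_t b;
        vram_to_mem(&b, (uint16_t)(pattern_addr(from) + i), 1);
        mem_to_vram((uint16_t)(pattern_addr(to) + i), &b, 1);
    }
}
```

On real hardware the two transfer helpers would be replaced by whatever your library provides; only the address arithmetic and the order of the two copies carry over.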
