7.16mhz 1200XL

dmlloyd · September 6, 2008

I don't see the XL7 as a platform to run existing games at high speed.

Oh I don't know about that. Some games should continue to run at the same speed but just *better* overall. Particularly games that render frame by frame (Flight Simulator II comes to mind, or perhaps the converted Knight Lore?), should be able to render more FPS making for a smoother gameplay experience.

peteym5 · September 7, 2008

Something I am wondering about as a retro-game developer: how fast can you write to the ANTIC/GTIA/PIO registers now with a 7.16 MHz CPU? I know these chips still need to run on the original bus. Do they get updated once per clock cycle now? I am asking because I am wondering whether all the color, player/missile position, and other registers can be changed in a DLI fast enough that the changes do not spill over several scan lines. You usually can change between 3 and 4 registers before the next change starts appearing on another scan line.

I have experimented with instructions that take fewer CPU cycles, like storing the X and Y registers in page 0 instead of pushing them to the stack, and using a self-modifying NMI routine that jumps directly to the DLI routine instead of being vectored through memory location 512. But you still can only do so much with a DLI, or else you notice a color changing mid-line or on the next line.

Someone said that a 65816 CPU can change these as fast as the equivalent of 1 clock cycle on the main bus. I know it takes 4 cycles to store something outside of page 0 (absolute addressing, no indexing), so loading immediate values and storing them probably won't be faster than that limit. I know people were looking to run these 65816 chips faster, but now I can see where it may start running into issues with the hardware area by attempting to change more than one register per main bus cycle.

Rybags · September 7, 2008

Hardware registers would be the same story as ROM - 1 "long" or 4 "short" cycles.

So, what I said before about being able to do much more colour changes per scanline isn't quite as rosy as we might think.

I guess it'll bring up a whole new aspect of cycle counting. Potentially you could "lose" 3 or 7 short cycles for a store operation depending on when the instruction starts.

In any case, GTIA always has that half "long" cycle lag between when you store a colour and when it comes into effect.

Edited September 7, 2008 by Rybags

+bob1200xl · September 7, 2008

Can you make some code that will test this? You will not be able to store consecutively to I/O hardware but I would expect that you will complete more changes than a stock machine. You write the routines - I'll try it for you...

Bob

Rybags · September 7, 2008

You'd need "optimised" RAM-based code, such that your stores to hardware regs don't waste the "before" cycles, i.e. 1, 2, 3.

Normally, if you want to do multiple colour changes then you load 2 or 3 registers and store in quick succession.

I'd guess in this instance, you might benefit most by just using 2 registers in some cases, since LDA #data and LDX #data will use 4 cycles exactly.

Then your store instruction is another 4 cycles each.

Assuming we've started that sequence on cycle 0, then the STA gtiareg instruction would also start on cycle 0, with the actual store on cycle 3 which means it is delayed until cycle 0 again (1 cycle wasted).

Same again with the STX gtiareg.

If for example we've optimised the code such that the first store happens exactly on cycle 0, we still have the situation where the second store would happen on cycle 3. But, we have saved 1 short cycle compared to the first situation.

Maybe using A, X and Y and staggering stores and subsequent reloads in a certain fashion, the cycle wastage could be minimised.

Of course, there's also DMA and Refresh to consider, so it all gets a bit complicated.

Still, it's a significant saving in that you can execute an LDA # and LDX # instruction sequence in half the time the normal 6502 takes for a single LDA # instruction.

With this, I am assuming that any read/write operations to HW Regs or ROM are always delayed such that they occur on Cycle 0.
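
The walk-through above can be checked with a small model. This is a Python sketch, not 6502 code, and it rests entirely on the assumptions made in this thread, none of which had been confirmed on hardware at this point: one "long" cycle equals four "short" ones, a store to a hardware register stalls until short cycle 0, and the write then occupies a whole long cycle. The names `run` and `SHORT_PER_LONG` are invented for illustration, and DMA and refresh are not modelled.

```python
SHORT_PER_LONG = 4  # assumption: one slow-bus ("long") cycle = four fast ("short") cycles

# (mnemonic, 6502 cycle count, True if the final cycle writes to a hardware register)
LDA_IMM = ("LDA #", 2, False)
LDX_IMM = ("LDX #", 2, False)
STA_HW = ("STA $Dxxx", 4, True)
STX_HW = ("STX $Dxxx", 4, True)


def run(sequence, start=0):
    """Return (elapsed short cycles, short cycles wasted waiting for cycle 0)."""
    t = start
    wasted = 0
    for _name, cycles, writes_hw in sequence:
        if not writes_hw:
            t += cycles                   # RAM-only instruction runs at full speed
            continue
        t += cycles - 1                   # opcode and address fetches run at full speed
        wait = (-t) % SHORT_PER_LONG      # stall until the next long-cycle boundary
        wasted += wait
        t += wait + SHORT_PER_LONG        # assumed: the write takes a whole long cycle
    return t - start, wasted


if __name__ == "__main__":
    print(run([LDA_IMM, LDX_IMM, STA_HW, STX_HW], start=0))  # loads, then both stores
    print(run([STA_HW, STX_HW], start=0))  # stores only, started on cycle 0
    print(run([STA_HW, STX_HW], start=1))  # started so the first write lands on cycle 0
```

Under these assumptions each store started on cycle 0 wastes one short cycle, and shifting the start so that the first write lands on cycle 0 saves exactly one of them, which agrees with the reasoning above.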

Edited September 7, 2008 by Rybags

peteym5 · September 7, 2008

You probably can just use one register: do LDA #xx, STA $Dxxx, which takes 6 cycles total. If you send to the hardware area at 4x speed, it would be updating the registers every 1.5 clock cycles. We can have Bob or someone do DLI tests on a 65816 and see what happens; after all, we are all guessing at this point.
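
As a rough upper bound on what that arithmetic would buy, here is a small Python sketch. It assumes the idealised case described in the post (every cycle of the LDA/STA pair runs at the full speed-up, with no wait states on the hardware write) and ignores ANTIC DMA and refresh, so real figures would be lower. The 114 machine cycles per scan line is the stock Atari figure; `pairs_per_scanline` is a hypothetical helper name.

```python
CYCLES_PER_SCANLINE = 114  # machine (slow-bus) cycles per scan line on a stock Atari
PAIR_CYCLES = 2 + 4        # LDA #imm (2 cycles) + STA absolute (4 cycles)


def pairs_per_scanline(speedup):
    """Return (slow-bus cycles per LDA/STA pair, pairs that fit in one scan line).

    Idealised: assumes the whole pair runs `speedup` times faster and that
    nothing else (DMA, refresh, wait states) steals cycles.
    """
    slow_cycles_per_pair = PAIR_CYCLES / speedup
    return slow_cycles_per_pair, int(CYCLES_PER_SCANLINE // slow_cycles_per_pair)


if __name__ == "__main__":
    print(pairs_per_scanline(1))  # stock 6502
    print(pairs_per_scanline(4))  # idealised 4x CPU
```

If hardware writes turn out to be held to the slow bus, the 4x row overstates the gain, which is why a test on the real machine is needed.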

ataridano · September 7, 2008

First of all, after reading the answers to some of the questions posted here, I have to say that I agree with keeping this project simple instead of trying to add features like extra memory, etc. It seems like a shame not to use the 65816's extra addressing ability, but keeping the scope of the project smaller just seems better. That said, there are still a few questions that I thought I'd throw out there:

-I understand why there may be problems with this board working with a 130XE because of its memory scheme, but why would it not work with the 400, 800, 65XE, and XEGS? If it won't work, it won't work... I was just wondering what differences there were in the hardware.

-Is PBI access done at high speed or slow speed? If high speed, could there be problems with devices being too slow or cables being too long?

-How hard would it be to un-install? Do I just stick the RAMs back in their sockets and the old CPU back in? Are there any board mods (such as cut traces)?

ndary · September 7, 2008

Bob,

Here is a ZIP file with a few games in ATR files... can you test them on the XL7 machine?

Does Colossus Chess calculate moves faster? Does Mercenary draw its wireframe graphics faster? Do the wireframe graphics in Assault Force move faster? Do the snooker or Leaderboard Golf games improve their performance?

Games_to_Test_on_XL7.zip

+bob1200xl · September 7, 2008

OK - I'd like to do DLI tests but STA $Dxxx is going to give me an error, I think. Does anyone have the code to try DLIs?

Bob

+bob1200xl · September 7, 2008

The XE machines should work, it's just that they will need different mechanical and electrical layout. I don't think you will be able to use the same board and you may have some added or modified logic. I don't know - have not looked in an XE. Same thing goes for a 400/800, only more of a challenge.

The PBI is an open issue. You could run them either from RAM or ROM, at high speed or low, but existing devices would all run low. The 1200XL7s we have so far do not have PBIs.

You have to cut traces on the 1200XL in order to isolate the SO pin. You also have to disable the old 3.58mhz clock by removing components. (you could cut a trace if you prefer) Five wires are then added to the bottom of the board. Pull the memory circuits. Plug in the new CPU board.

To remove the upgrade, plug the old memory ICs back in, remove the added wires, add a wire jumper to SO, re-install the clock components (a transistor and a resistor), and plug in the old CPU.

Bob

tebe · September 7, 2008

bob1200xl, how do the programs in the attachment work?

prg65816.zip

Rybags · September 7, 2008

Quick and dirty DLI program (load from BASIC - the ATR only contains the SAVEd Basic program, and not DOS).

The DLI attempts to reuse the same player in 5 instances per scanline.

I guesstimated the delays so it probably won't work perfectly without some tweakage on your part.

To make it easy, just change the values of H1 through H5 in the program. They correspond to Player X positions onscreen.

Just edit lines 30-70 and GOTO 30 once you've run the program already and want to tweak the values.

By tweaking the HPos values, you might be able to get them all onscreen. Probably best to do them in left to right order (ie H1, H2 etc).

If anything, I'd probably guess that they need to be separated a little more from what they are currently set at - onscreen values range from 48 to 200.

XL7Test.zip

The DLI code is fairly simple too - here's hoping it will kinda work.

0600: 48        PHA             ;save A
0601: 8A        TXA
0602: 48        PHA             ;save X
0603: 8D 0A D4  STA $D40A       ;WSYNC - wait for the end of the current scanline
0606: 98        TYA
0607: 48        PHA             ;save Y
0608: A0 20     LDY #$20        ;repeat for 32 scanlines
060A: A9 30     LDA #$30        ;first position (H1)
060C: 8D 0A D4  STA $D40A       ;WSYNC
060F: 8D 00 D0  STA $D000       ;HPOSP0
0612: A9 50     LDA #$50        ;second position (H2)
0614: A2 08     LDX #$08        ;delay loop, 8 passes
0616: CA        DEX
0617: D0 FD     BNE $0616
0619: 8D 00 D0  STA $D000       ;HPOSP0
061C: A9 70     LDA #$70        ;third position (H3)
061E: A2 04     LDX #$04        ;delay loop, 4 passes
0620: CA        DEX
0621: D0 FD     BNE $0620
0623: EA        NOP             ;fine-tune the delay
0624: EA        NOP
0625: 8D 00 D0  STA $D000       ;HPOSP0
0628: A9 70     LDA #$70        ;fourth position (H4)
062A: A2 04     LDX #$04        ;delay loop, 4 passes
062C: CA        DEX
062D: D0 FD     BNE $062C
062F: EA        NOP             ;fine-tune the delay
0630: EA        NOP
0631: 8D 00 D0  STA $D000       ;HPOSP0
0634: A9 90     LDA #$90        ;fifth position (H5)
0636: A2 04     LDX #$04        ;delay loop, 4 passes
0638: CA        DEX
0639: D0 FD     BNE $0638
063B: 8D 00 D0  STA $D000       ;HPOSP0
063E: 88        DEY             ;next scanline
063F: D0 C9     BNE $060A
0641: A9 00     LDA #$00        ;move the player offscreen
0643: 8D 00 D0  STA $D000       ;HPOSP0
0646: 68        PLA
0647: A8        TAY             ;restore Y
0648: 68        PLA
0649: AA        TAX             ;restore X
064A: 68        PLA             ;restore A
064B: 40        RTI
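
For anyone tweaking the delay constants, the length of each LDX/DEX/BNE loop in the listing can be worked out from standard 6502 instruction timings. The Python sketch below is only a calculator, with a made-up helper name (`delay_loop_cycles`); it assumes stock timing, with the branch staying inside one page as it does at $06xx, and two colour clocks (HPOS units) per machine cycle.

```python
COLOUR_CLOCKS_PER_CYCLE = 2  # stock machine: HPOS units advance 2 per CPU cycle


def delay_loop_cycles(count, nops=0):
    """Cycles for: LDX #count / DEX / BNE loop, plus optional trailing NOPs.

    Standard 6502 timings: LDX # = 2, DEX = 2, BNE = 3 when taken and
    2 when it falls through, NOP = 2. No page crossing is assumed.
    """
    if not 1 <= count <= 255:
        raise ValueError("count must be 1..255 (LDX #0 would loop 256 times)")
    ldx, dex, bne_taken, bne_not_taken, nop = 2, 2, 3, 2, 2
    loop = ldx + count * dex + (count - 1) * bne_taken + bne_not_taken
    return loop + nops * nop


if __name__ == "__main__":
    for count, nops in [(8, 0), (4, 2), (4, 0)]:
        cycles = delay_loop_cycles(count, nops)
        print(count, nops, cycles, cycles * COLOUR_CLOCKS_PER_CYCLE)
```

On the XL7 these figures are in fast CPU cycles, so the same loops would cover proportionally less of the scan line when the code runs from RAM.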

peteym5 · September 7, 2008

I will write some ML DLI examples with a test screen this afternoon. gonna try changing all 9 color registers.

potatohead · September 7, 2008

I agree with simple too.

Just got done catching up on this thread.

I like it! IMHO, the extra speed will enable lots of interesting and new things without really changing the machine all that much.

Watching for now to see what works and what does not. At some point in the future, I would very much like to mod a machine, maybe an 800 XL, or 130 XE, and enjoy this.

+bob1200xl · September 7, 2008

This is what you see if you run it under Atari BASIC using a ROM OS... the 'players' blink at maybe 10 Hz.

This is what you see if you run a RAM OS using Atari BASIC... no blinking.

Bob

Curt Vendel · September 7, 2008

See, now this would be quite desirable for BBS operations or potentially (with the ethercart) for making the 1200XL an FTP or web server. If the CPLD code could be made available, it would allow us all to tinker around and potentially create variations of the code to have mega-speed 1200XLs where graphics/compatibility aren't an issue.

As for the board as a whole - I LIKE IT !!! :-)

Curt

There is no such provision for doing that - it would mean re-programming the CPLD. You'd be on your own there...

Bob

I'd buy one too!

Would it be possible to vary the speed? I can imagine situations where 200% or even 125% of original speed would be desirable...

+bob1200xl · September 7, 2008

I have no problem with releasing the code but I will wait until ABBUC is over. If someone would like to make these things for people, they may not want the code out there - then, we have to decide if we want products or playthings.

Glad you like it!

Bob

+bob1200xl · September 7, 2008

I haven't looked at this in hardware, but I think what is happening is that we are executing the code ahead of the scan line. If I set line 30 to H1=40 and line 50 to H3=44, I see the whole player at 44 and the player at 40 in only the first scan line. It appears that we set HPOS way before the scan line reaches that point - eventually setting it to zero before any players are drawn.

Bob
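
If the code really is running ahead of the beam, one way to pick new delay constants is to stretch each loop so that it covers the same amount of scan-line time. The Python sketch below does that arithmetic. It assumes a uniform 4x speed-up for RAM-based code, which is a simplification, and the helper names `loop_cycles` and `rescaled_count` are hypothetical. Treat the results as starting points for tweaking rather than exact values.

```python
def loop_cycles(count):
    """Cycles for LDX #count / DEX / BNE loop on a 6502 (no page crossing).

    2 (LDX #) + 2*count (DEX) + 3*(count-1) (BNE taken) + 2 (BNE not taken)
    simplifies to 5*count + 1.
    """
    return 5 * count + 1


def rescaled_count(stock_count, speedup=4):
    """Loop count that takes roughly `speedup` times as many CPU cycles."""
    target = speedup * loop_cycles(stock_count)
    count = round((target - 1) / 5)
    if not 1 <= count <= 255:
        raise ValueError("delay no longer fits in a single 8-bit loop counter")
    return count


if __name__ == "__main__":
    for stock in (8, 4):
        fast = rescaled_count(stock)
        print(stock, fast, loop_cycles(stock), loop_cycles(fast) / 4)
```

The last column is the rescaled loop expressed in slow-bus cycles, for comparison with the stock figure next to it.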

potatohead · September 7, 2008

IMHO, this is a worthy discussion.

In this community, there exists a small set of users capable of just doing this. Of this set, there are a significant fraction who value their time and would be highly likely to purchase a kit, instead of just rolling their own.

The rest of the community then are targets for a product, kit or service.

My only worry about the code being out there is too many variations.

If we are to see some development on this, and a user base, perhaps it's wise to establish said kit, product and or service before the code sees greater distribution. In the end, having it out there is probably good, but maybe not so good right at the start.

mos6507 · September 7, 2008

This is a cool project. I agree with the other poster that not using the full capabilities of the 65816 is a waste, though. But it's cool that as much backwards compatibility can be accomplished and still net some speed increases. I thought that would have been impossible. The RAM separation and dual clocking reminds me of the "fast ram" vs. "chip ram" solution with Amigas. I wonder how this compares to John Harris' approach to acceleration.

dmlloyd · September 7, 2008

I don't know if it's really a waste not to utilize the '816 fully... after all it is a fairly goofy chip overall. I'd even go so far as to suggest that perhaps a current-generation 65C02 would be a better choice: for a 64K machine, the extra instructions that the 65C02 has (some of which are missing from the '816 by the way) are just as useful, if not more so in some cases. And WDC's 65C02 goes up to 14MHz, just like the '816.

mos6507 · September 7, 2008

I don't know if it's really a waste not to utilize the '816 fully...

That depends on whether you just want to accelerate existing software or write new stuff. The Chimera (if it ever gets finished) will have 256K minimum SRAM, and more likely 512K. If the 2600 can get that much RAM in 2008, seems reasonable to at least have that much linear RAM on something like this, if it can be done reliably. Actually, just having a RISC share the hardware with the base 6502 like in Chimera might be better as far as new programs go.

Edited September 7, 2008 by mos6507

Rybags · September 8, 2008

Bob: I calculated the delay constants in my head quickly, so they're probably a bit out.

Maybe I should have just done a colour change instead.

If you edit the data and change any "0" before "208" to a "26", then it'll just change the background colour instead.

The H1.. H5 values will then simply be the colour values it stores.

drac030 · September 8, 2008

SD3.2D, SDX 4.20 and 4.21 all work in high speed/low speed SIO on the XL7. As long as you are using the serial clock generator and not bit-banging (like they seem to do in the 810 and 1050), you're OK. If you're trying to count cycles, you're probably going to croak.

It is probably pure luck that your serial devices are fast enough. Every SIO driver in the world contains a busy loop that makes some delay between asserting the COMMAND line on the SIO bus and sending the actual command to the device. When you clock the CPU 4 times faster, this loop is 4 times shorter, of course. When you clock the CPU 5 or 8 times faster, or even more (on the accelerator I have), the loop becomes shorter still. So no worry, loop calibration is necessary anyway to compensate for this.
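
The post does not say how the calibration is done, so the following is only one plausible approach, sketched in Python: count how many passes of a reference loop fit into one video frame (whose length does not change with the CPU clock, since the video chips stay on the original bus), compare that with the stock figure, and scale the delay count to match. All names and sample numbers here are hypothetical.

```python
def calibrated_delay(stock_delay_passes, passes_per_frame, stock_passes_per_frame):
    """Scale a busy-loop count so that it still lasts as long as on a stock CPU.

    passes_per_frame:       reference-loop passes counted in one frame on this machine
    stock_passes_per_frame: the same count on an unaccelerated machine
    The result is rounded up so the delay never ends up shorter than stock.
    """
    if stock_passes_per_frame <= 0:
        raise ValueError("stock reference count must be positive")
    scaled = stock_delay_passes * passes_per_frame
    return (scaled + stock_passes_per_frame - 1) // stock_passes_per_frame


if __name__ == "__main__":
    # Hypothetical measurements: 1000 passes per frame stock, 4000 when accelerated.
    print(calibrated_delay(50, 1000, 1000))
    print(calibrated_delay(50, 4000, 1000))
    print(calibrated_delay(50, 3300, 1000))
```

Measuring the ratio instead of hard-coding 4x matters because RAM and ROM code are reported to speed up by different amounts on this board.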

Edited September 8, 2008 by drac030

Rybags · September 8, 2008

The OS (and others, I'd guess) uses a pessimistically long delay to cater for a worse than worst case device lag.

Given that Bob's upgrade isn't strictly 4 times faster and that ROM code doesn't run much quicker, it wouldn't make a lot of difference.

But, RAM-based code is a different story, although if it's running in normal Gr. 0 then it's a bit closer to a normal machine.

Maybe trying the various SIO turbo modes with DMA turned off might unearth any potential incompatibilities.
