Posts posted by vol
-
7 minutes ago, Faicuai said: I will prepare an .ATR for you, with what you need (give some time). We will guide you. Altirra will allow you to enter Indus-console buttons, to bootstrap CP/M, exactly as you do it with my real drives. After that, everything is relatively easy. As far as I have seen, there is nothing out-there like Altirra and you should stick to it, as much as you can. Very, very powerful and complete.
Thank you very much. However, without information on how to get timings it will be of no use.
Can we somehow reach the host memory?
-
On 2/11/2021 at 7:05 PM, Faicuai said:In a nutshell, it is possible because of the abstraction provided by SIO and its bus, which is how the Atari (and its floppy drive, The Indus/GT) communicate.
As far as I have seen, used and tested, The Indus/GT is gram-by-gram, byte-for-byte, out-of-the-box, the best drive ever made for the Atari. I got it with my 800XL in 1984 (when jumping late from my 400).
Thank you very much for your information. The Indus/GT was a really super thing! I thought that such intelligent disk drives were available only from Commodore. The Indus/GT's capability to run CP/M is astonishing! I am going to make a special pi-spigot variant for it. However, I need help. Is it possible to get timings in a program running under CP/M? Maybe it is possible if a CP/M program asks the Atari host? Anyway, I need details on how to get a timer value in Atari CP/M program code. If I understood correctly, I can use the Altirra emu to run CP/M, can't I? I have downloaded this emu and run it under my old Microsoft Windows XP under VirtualBox.
It is really a very good emu. However, its help system doesn't work under my old Windows.
I have been able to find Indus/GT ROM and CP/M disk images, but how do I combine all these things and start CP/M? I booted the terminal disk, ran E for 80-column mode and got an empty screen.
The instruction says "After the terminal program has been booted, insert the CPM disk into the Indus and while holding down the "drive type" button, press the "error" button; this will boot up CPM, which will ask you to hit the return key to continue. At this point your are booted up and running CPM". It is too cryptic for me. Where can I find those two buttons? Are they on the Atari keyboard or are they special buttons on the drive?
I am also curious about the 80-column mode. Does it use a 4x8 font matrix?
-
On 2/8/2021 at 4:38 PM, drac030 said:Well.
I can also ask you to read this material which has a cite from a very good hardware engineer
QuoteReaders are always dazzled by the speed. Motorola talks about 12 megahertz. The Apple [Macintosh] computer has about 7 megahertz with the 68000. The equivalent speed on the 65816 would be in the neighborhood of 2 to 3 megahertz.
In other words, a 2- to 3-megahertz Apple [II-series] has the same kind of performance as an 8-megahertz Macintosh.
-
On 2/9/2021 at 8:40 AM, 777ismyname said:It is very easy to find many sort algorithms for the Atari. Compute published several machine language versions with program listings. There were quite a few other listings, too. A quick search at atariarchives dot org, atarimagazines dot com, atarimania, et al will yield quite a few. I haven’t looked at them in ages, but succinctly recall one of the Compute compendiums having the machine language code listed for some sorting algorithm.
Thank you. I have done some searches and found almost nothing. I have been able to find Ultrasort for various 8-bit Commodores and its improvement Lightning sort. However the latter is not available in texts, only in scans. There are several more quicksorts for the 6502 published after 2005 but all of them have some drawbacks which reduce their usability.
My quicksort implementation is safe so it never crashes your stack and it is never quadratic.
-
16 hours ago, Faicuai said:Getting 508 ticks on Atari/NTSC.
The "canonical" version of SIEVE's Byte can run in 444 ticks, NTSC, as I attached on this thread, before.
Nevertheless, pretty good time, and interesting difference between +4 and Atari, though...
The difference between the timings shows that the compiler counts timings wrongly for one system. We have a 1.2 times excess, which means that 60 and 50 Hz are mixed up somewhere (60/50 = 1.2). Do you know that the Plus4 (like the C64) system software timers run at 60 Hz on both PAL and NTSC models?
-
16 hours ago, Faicuai said:FYI, just got results for Atari 800 / Z80 (IndusGT+64KB RamCharger, CP/M v2.2):
100 digits: 1.65s
1000 digits: 145.83s
3000 digits: 1301.59s
MAX digits: 8292
Results are verifiable on ALTIRRA, as it very nicely supports and emulates Indus/GT roms and CP/M operation.
You may want to update your database with these (interesting) results.
What interesting results! They match a 4 MHz system. How is it possible?! The Commodore 64 people had to use a 1 MHz Z80 cartridge, the Commodore 128 people were allowed to use the Z80 at an effective 1.7 MHz. What is your system exactly? Would you like to send me a link to its description? Thank you.
-
On 2/8/2021 at 4:38 PM, drac030 said: In my view it is like comparing a car with a steam engine vs a car with internal combustion engine. A state-of-the-art steam car may at places, in special circumstances, outperform a slightly older internal combustion one (65C816 is 6 years later than m68k), but it is the latter one that is superior technology. This is, among other things, why Amiga, Atari, Apple went for m68k in middle '80s, and it was not accident.
Your ideas about the 68000's advantages over the 65816 sound quite plausible, but they are too abstract and theoretical. We have one example (Sieve of Eratosthenes) where the 65816 requires half as many cycles as the 68000. Let me show you one more. I have another project which might be used for benchmarking - you can find out there that the NMOS 6502 requires even less than half the cycles of the 68000. Indeed, you may think that my code for the 68k is less optimized, but Xlife-8's performance matches the best Amiga cellular automaton programs, and this shows that the Xlife-8 code is quite well optimized for the Amiga.
I am sure that for some data processing the 68000 can be faster than the 65816, but I assume that such processing is rather untypical.
The byte transfer is a very common and important operation because it is one of the base operations for text strings. The 65816 is about two times faster for this op than the 68000.
However, there is something weird about the 6502. Computer companies wanted to be rid of it in the 80s. IMHO it was caused by still unknown reasons. I can only propose a hypothesis that the 6502 was too fast and cheap and could crash several respectable companies. The 68000 was a good choice, and the lack of development of the 6502 for 7 years also had a strong effect.
-
On 2/8/2021 at 11:21 PM, zbyti said:Back to the C+4 vs. A8 with blanked screen.
Do not forget to turn the plus4 screen off by POKE65286,11. It can be turned on later by POKE65286,27.
It is also possible to get an additional 25% speed boost for the PAL Plus4.
-
11 hours ago, dmsc said:Hi!
What?????
There are hundred of quicksort implementations for the 8-bit computers.... some old ones:
Here: 1986, in BASIC for the Atari: https://archive.org/details/analog-computing-magazine-42/page/n43/mode/2up
Here: 1983, for the Commodore: https://archive.org/details/1983-09-compute-magazine/page/n195/mode/2up
Have Fun!
Thank you, but your first link gives us a quicksort written in Basic.
The second link connects us with a kind of real ML quicksort, but there are no sources, only a lot of Basic DATA items. We would need to decipher it. So you are rather wrong about hundreds of implementations for the 6502 which we can directly use, for instance, in the Atari 800. I doubt that you can find even one.
-
On 2/5/2021 at 11:31 PM, drac030 said:65808 I have not heard of. Maybe you mean the 65802? So this is just a crippled down 65C816, i.e. this exact CPU in a package which deprives it of the extra bus signals, e.g. there is no 24-bit address bus. I am also curious what is the ultimate point of looking for options. Are we looking for a way to avoid 65C816 at all costs? Even if so, the options are as such:
a) WDC65SC02 - in production, good code base, but: a poorer version of WDC65SC816.
b) 65C802 - out of production, and a poorer version of 65C816.
c) CSG65EC02 - out of production, code base probably around zero.
It seems to me that there are generally only two options and it is 65C02 and 65C816.
One of my colleagues has a Commodore 65. I am not sure if that computer is in a working condition, but there are chances. So there is perhaps also a chance to run your benchmark on it.
I had been there, I wrote some programs for m68k in assembler, and I am also writing code for 65C816, and my impression is that this claim is not plausible. Sure, if we treat m68k as if it were an 8-bit processor, i.e. do mostly 8-bit operations and do everything on memory yet using absolute addressing modes, the CPU will of course perform poorly. But man, add.l d0,d1 takes only 8 clock cycles, there is no way you can to a 32-bit addition so fast on 65C816, the best what you can have is 26 on zp.
Besides, your own benchmark proves that one needs a 20 MHz 65C816 to beat a 7 MHz m68k. I wonder how an Atari ST would perform, its CPU is faster (8 MHz) and the DMA load is probably smaller.
Indeed I meant the 65802. Sorry for this typo. It would be great to test the C65. It is very sad that this fantastic computer appeared so late and eventually was not released at all.
Indeed the 68000 beats the 65816 when it performs 32-bit arithmetic, but code needs branches, memory transfers, byte and word ops. The 65816 works directly with memory while the 68k needs to load/store registers. The Sieve of Eratosthenes quite clearly shows that the 65816:68000 cycle-count ratio is close to 1:2 for this task.
Indeed I have plans to make a version for the Atari ST. I know that its architecture is easier to understand than the Amiga's, but it also requires some time.
You mentioned combsort. I have checked information about it and found out that it is a close relative of shell sort (gapped passes, but based on bubble sort). My tests show that the best shell sort implementations are about 50% faster than combsort. And combsort needs floating point...
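On the floating-point point: the usual comb sort description shrinks the gap by a factor of about 1.3, which is where the floating point comes from. A minimal sketch, assuming an integer approximation of that factor (`gap * 10 // 13`, my choice for illustration, not something taken from the thread or from any of the implementations discussed here), shows that the algorithm itself can be written with integer arithmetic only. The name `comb_sort` is hypothetical.

```python
def comb_sort(values):
    """Sort a list in place with comb sort, using integer-only gap updates."""
    gap = len(values)
    swapped = True
    while gap > 1 or swapped:
        # Shrink the gap by roughly 1.3 without floating point.
        # Assumption: 10/13 is used as the integer stand-in for 1/1.3.
        gap = max(1, gap * 10 // 13)
        swapped = False
        for i in range(len(values) - gap):
            if values[i] > values[i + gap]:
                values[i], values[i + gap] = values[i + gap], values[i]
                swapped = True
    return values


print(comb_sort([5, 3, 8, 1, 9, 2]))
```

On a 6502 the multiply and divide by constants would still have to be done in software, so this only removes the floating-point dependency, not the cost of the gap update.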
I have just started a new topic about it -
The 68000 beats the 65816 on pi-spigot because the 68000 has hardware division, and division is the longest op in pi-spigot. So the results for the ARM, which also doesn't have hardware division, are really fantastic.
-
I have implemented several known sort algorithms for the 6502. It is an open github project. I dare to think that the quicksort implementation there may be among the first known ones. It is odd that quicksort was not implemented for 8-bit processors until the second decade of the 21st century.
Does anybody know about other implementations of fast sorting methods for the 6502?
-
22 hours ago, drac030 said:65C02 is not a problem anyway, I am not aware about a tendency to put 65C02 in Ataris, if someone wants to replace the processor, the most logical decision would be to go for the 65C816.
And yup, legacy 6502 code. So, even without any trickery, a 32-bit addition is 50 clock cycles (abs addressing mode) on 6502, whereas the same operation is 49 clock cycles on 65EC02. But on 65C816 you can do the same in 32 clock cycles best case, 33 average case, 36 bad case, 42 very bad case. So the statement that optimizing the prefetches gives 25% speed gain in comparison is, well, optimistic
I am not sure if a 4 MHz 65C816 is as fast as 8 MHz m68k. The above 32-bit addition m68k does in 48 clock cycles if I am not mistaken, so certainly not twice slower, only slightly slower, and yet with the somewhat unrealistic assumption that we do not do the computation in the registers.
Besides, there is the entire matter of the bus (synchronous 65xx vs asynchronous m68k), the variety, price and supply of the peripheral chips (ACIA etc.) for m68k is not the CPU alone, and all this being made by a big corporation which cannot disappear overnight (which probably is not the case of WDC, and besides, as you noticed yourself, this producer can sit still for 30 years doing nothing, so its reliability regarding the supply of future better product can be doubted and it probably could have been doubted already in 1985).
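The 50 / 49 / 32 cycle figures quoted above can be reproduced with simple bookkeeping. The sketch below assumes a plain CLC followed by LDA/ADC/STA with absolute addressing for each chunk of the 32-bit value, no page crossings and no decimal mode. The 6502 and 65C816 timings are the documented ones; for the 65CE02 I only assume the 1-cycle CLC mentioned in this thread and leave the memory accesses unchanged. The dictionary and function names are made up for illustration.

```python
# Cycle counts for absolute addressing (no page crossing, binary mode).
NMOS_6502 = {"CLC": 2, "LDA abs": 4, "ADC abs": 4, "STA abs": 4}

# Assumption: the 65CE02 differs here only by its 1-cycle CLC.
CSG_65CE02 = {**NMOS_6502, "CLC": 1}

# 65C816 in native mode with a 16-bit accumulator:
# each memory access costs one extra cycle for the second byte.
WDC_65C816_16BIT = {"CLC": 2, "LDA abs": 5, "ADC abs": 5, "STA abs": 5}


def add32_cycles(timings, chunks):
    """Cycles for CLC plus `chunks` rounds of LDA/ADC/STA."""
    per_chunk = timings["LDA abs"] + timings["ADC abs"] + timings["STA abs"]
    return timings["CLC"] + chunks * per_chunk


print(add32_cycles(NMOS_6502, 4))         # four 8-bit chunks
print(add32_cycles(CSG_65CE02, 4))        # four 8-bit chunks
print(add32_cycles(WDC_65C816_16BIT, 2))  # two 16-bit chunks
```

This covers only the best case for the 65C816; the "average" and "bad" cases mentioned in the quote are not modelled here.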
There is another option - to use the 65808 which is pin-compatible with the 6502.
The 65CE02 has faster INC and DEC which are used very often. So the 25% boost is quite plausible.
There was a discussion about the 68000 and 65816 - http://forum.6502.org/viewtopic.php?f=4&t=4310 - it shows that the claim that a 4 MHz 65816 matched an 8 MHz 68000 is plausible too. Commodore could have asked for a more advanced processor with a 16-bit data bus and the Apple limits lifted. The 65816 memory access cycle is FOUR times faster than that of the 68000. A lot of details about the 68k are gathered here - in particular there are details about the drawbacks of the 68k which pushed companies that used the 68000 to seek an alternative. You know, in the 90s no company wanted the 68k. An upgraded 6502 could have had much better prospects.
BTW the 65CE02 RTN instruction is quite useful. I remember some programs for the 8086 which used RETN with a parameter quite often.
18 hours ago, stepho said: The 8-bit machines died only because technology moved on.
Indeed in the 90s, 8-bit computers looked poor. But that was because they weren't upgraded. MOS Technology people said that they had plans for a 16-bit 6502 in 1976. Commodore stopped all R&D there, then they produced the C64 for 12 years without modifications! Try to imagine IBM trying to sell the original IBM PC in 1993! But ppl bought the C64 or Amstrad CPC/PCW even in 1994. You know that they also stopped the Atari 800, Apple II, ... just arbitrarily, without reasons we can know.
5 hours ago, zbyti said:This topic was mentioned on linked wiki:
Endresult is 1.212 MHz for a normal screen and 1.633 MHz for a blanked screen (PAL).
The Plus4 has about 1.11 MHz effective CPU frequency in normal screen-on mode and about 1.7 MHz in screen-off mode. So the pi-spigot benchmark perfectly matches this information. Thank you very much.
-
1 hour ago, Faicuai said:No wonder why they all died?
Modern pc video cards are often more powerful beasts than their CPU's.
IMHO there was something weird about the fate of home computers in the 80's. The Commodore C64 could have been improved a lot, the C65 could have been made in 1985 or even 1984... GEOS still has some features which modern systems lack. A user who switched on an Amiga today could use it because it has a modern-style, user-friendly GUI. Apple successfully attacked IBM - what if Commodore had worked along with Apple? We can also say that home PCs were transformed into game consoles.
1 hour ago, drac030 said:He must have really meant a different game title, because I can see under emulation that Asteroids works without problems on 65C02. Also, its VBL routine is very short, it is shorter than 4 rasterlines, there is no way it could overlap due to extended BCD timings, and besides, that interrupt handler does not even use BCD. Or I am missing something here.
Not so serious if it can be worked around: say we have a 1.773 MHz clock, CLC is 2 cycles in 65C816 and 1 cycle in 65EC02. So the latter is faster. But if we deadly want to reproduce this on 65C816, we can do it, as the CPU tells us when it is about to do the superfluous cycle. We then can cut it off the bus and switch the clock to e.g. 14.18 MHz (8x1.773) for just this one cycle. Then our CLC is effectively taking 1 and 1/8 clock cycle (not 16, but 9 cycles of the 14 MHz clock). The instruction's execution is still longer than on 65EC02, but already not 2 times and the overall performance is very close.
As much as I like the 65xx family of processors, I cannot deny that (despite its few oddities and nuisances) 68k is simply a better processor, besides, it is a 16-bit processor designed to become 32-bit from the very beginning, so Apple just switched to something with better perspectives. Also 68k was produced by Motorola, so they probably thought this producer to be in longer time more reliable than the small WDC. That Motorola has ultimately stupidly destroyed the 68k line is another matter.
So wikipedia is lying that it was discontinued in 1988. No surprise.
Even I can think some methods of avoiding extending the interrupt latency, one can e.g. implement MUL as one-instruction loop doing the task in relatively short iteration steps. Or you can make the multiplication-division unit in a fashion of a built-in coprocessor, which provides own registers that become unavailable when the unit is busy with the computation, and so the integer unit may accept interrupts in meantime. Or you provide an actual coprocessor. Or if you do nothing of that, the board's designer can still implement the mul/div unit in FPGA
There is something strange about this case with the Asteroids game. I remember there were sites with a list of Atari software which was incompatible with the CMOS 6502. I cannot find those sites now.
I have been talking about the 65CE02 and 65816 at the same frequency.
They wrote that the former is about 25% faster than the latter on 6502 code. The 65C02, on the contrary, is even slightly slower than the NMOS 6502 because of slower BCD ops and JMP (abs).
I don't know the exact timeline of the 65CE02, but it was used in the Amigas which were produced until 1994.
You know Commodore initially wanted an upgraded 6502 for the Amiga, but they thought that the 68000 was cheaper and chose it. It is known that a 4 MHz 65816 matches an 8 MHz 68000. Try to imagine an Amiga 500 which could have been two times faster!
The 68k was not bad but it was expensive and not fast.
IMHO the easiest way to reduce the interrupt latency is to make long-running instructions interruptible.
-
43 minutes ago, Faicuai said:In my humble view of things, the main issue itself with Atari's architecture is that it was designed with the "ray-trace/chase-the-beam" concept at its very core. It is not the 6502 who actually calls the shots on the system. It is ANTIC who's running the show here. Even the 6502 was specifically modified to include HALT-logic on-board. The idea was always to stop the 6502, and have ANTIC do its magic (in concert with GTIA). In other words, an entire computer system designed around the notion of an electron-beam flying over the screen, at close to the speed of light. Everything dances to this tune.
IMHO almost all home computers from the 80s have the same architecture: a video chip is more important than a CPU.
-
9 hours ago, stepho said:Not in significant numbers and not backed by the government like they were in the UK.
Mostly it was the Apple II or the Microbee (Z80 CPM machine).
It is sad that such superior computers couldn't reach schools.
1 hour ago, drac030 said:I did not know that Asteroids crashes on 65C02. Due to the BCD timings? Man, I must try it to see the crash with own eyes (I do not have a 65C02 Atari, but I can try it under emulation). Regardless, the 65C816 executing 6502 code is not slower than 6502, these instructions which were slower in 65C02 have the same timings as in 6502 (or at least the WDC manual says so).
By improving the prefetching pipeline, I see. This however only applies to instructions which do superfluous prefetches (like CLC for example), say JMP is not improved (nor can be on this architecture). On 65C816 the pipeline was not improved, but you can still simulate this (or similar) effect by clocking the processor faster on internal operation cycles - because, among other, it has legs to signal "valid" instruction fetches and data access to external circuitry. Nice feature of 65CE02, though.
Besides, as I can see, 65CE02 was introduced in 1988 and discontinued the same year (whereas 65C816's brand new units are available to this day). What was the reason?
I do realize that it is a cross-asm, for "platform" I meant C-64 mainly, as each 8-bit platform has own specific programming tools, also the cross-ones. I will try to use it to prepare the Rapidus specific version of the benchmark, although it may be difficult at places (as TMPX, do I understand correctly, is missing direct 65C816 support? That would be weird considering the existence of SuperCPU).
I do not know, I programmed sorting only once, and the algorithm I used was the combsort - it is fast enough when you want to sort a directory (even a long one, like 1000-something entries, it takes few seconds).
Besides, some things are just hardly possible without the flat memory. The notorious problem on 65xx is the lack of multiply and divide instructions. Bill Mensch of course was once promising a CPU with 32-bit fixed point multiply and divide, but also of course did not do that (if I was the Rapidus designer I would implement that into FPGA, but this also has not been done, unfortunately). Still, having a ton of flat memory you can precompute a 128k lookup table and using that perform 8x8=16 bit multiply in less than 20 clock cycles. On a 20 MHz processor this is like about 2 cycles on 6502, i.e. no time.
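The 128k figure in the quote follows from the table size: 256 x 256 products of 16 bits each is 131072 bytes. The sketch below is only an illustration of that idea in Python; the actual memory layout one would use on a 65C816 is not given in the thread, and `TABLE` and `mul8x8` are hypothetical names.

```python
import array

# Precompute every 8-bit x 8-bit product as a 16-bit entry.
# 256 * 256 entries * 2 bytes = 131072 bytes (128 KiB).
TABLE = array.array("H", (a * b for a in range(256) for b in range(256)))


def mul8x8(a, b):
    """Return a * b for 0 <= a, b <= 255 with a single table lookup."""
    # The two operands together form a 16-bit index into the table.
    return TABLE[(a << 8) | b]


print(len(TABLE) * TABLE.itemsize)  # table size in bytes
print(mul8x8(200, 123))
```

The lookup replaces the shift-and-add loop entirely, which is why the cost is a handful of cycles for address formation plus the memory reads.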
Bill Mensch talked about a problem with the Asteroids game -
And, indeed, the 65816 is generally better than the 65CE02, but the 65816 is slower, and that is its serious flaw. Bill Mensch also said that Apple persuaded him to make the 65816 their way, and we all know that Apple's strategy was to end the whole Apple II family. IMHO they didn't want too good a processor for the Apple IIgs; for some reason they were stuck on the 68k.
Commodore had crazy management; they killed so many good things that it is all impossible to understand. Jack Tramiel "sold his soul" and acquired MOS Technology, stopping all innovative development there, and Irving Gould made things much worse. They didn't want to improve the C64, so some ppl suggest that they only developed the 65CE02 as a controller for the Amiga. IMHO 65CE02 production only stopped when Commodore itself shut down in 1994.
TMPX is the wrong tool for 65816 code development. I used it because I had only a small amount of code.
Multiplication and, even more so, division can slow down IRQ response time, and that is not good for a controller. So Bill Mensch is a controller king, he didn't want the fate of MOS Technology.
He refused to make the ARM for this reason.
Do you have the sources of the combsort?
-
22 hours ago, drac030 said:Even if it did not, within few seconds (without powering the machine down, just reboot) I can always switch the board into legacy 6502 mode. I rarely need to do that, however, mainly for testing purposes.
From the brief description I found here: http://www.zimmers.net/anonftp/pub/cbm/documents/chipdata/65ce02.txt I can see that it is just a form of a cross-breed between 65C02 and 65C816, which provides some features the latter has, but to a limited extent (save the bit manipulation instructions which however are also present in some 65C02 varieties). So as long as at least the Atari platform is concerned, I see no reason why 65CE02 should be preferred over 65C816.
Ah, the platform-specific tools. Is it this: https://style64.org/release/tmpx-v1.0-style ?
I did not know that. I only knew that on C-64 most interrupts of daily use are IRQs. I guess that the reason is the famous 6502 bug regarding the NMI handling you certainly heard of.
It is a program for 65C816-expanded Ataris. It is the "kind of expanded BASIC", as you have formulated it. I used it to load your binary into the address $021907 instead of the originally intended $001907. As this was only an one-time test to satisfy my curiosity, I did not bother writing a proper loader.
We know that WDC made the 65C02 not completely compatible with the NMOS 6502. Some instructions have different timings on the two. Asteroids crashed because of different timings for BCD instructions. It seems rather crazy to me that the CMOS 6502 (or even the 65816 executing 6502 code) is a bit slower than the NMOS 6502.
The 65CE02 was greatly accelerated: it is generally 25% faster than the 6502! It has an additional index register that gives it a better addressing mode than the 65C02/65816 (zp) addressing, a base register, a 16-bit SP, ... It also seems crazy to me that WDC didn't try to make their processors as fast as this old one.
Indeed the 65816 is generally more powerful because it has a 24-bit address space and several other good features, but it would have been better if it had been based on the 65CE02 rather than on the 65C02. BTW if you are interested in 6502 history I dare to recommend this blog.
TMPX is not platform-specific, you can run it on any Linux or Microsoft Windows as I do. IMHO it is not very good but I am used to it; I am trying to use vasm now. I have a project where I use vasm - BTW is there some library of good sorting algos for the Atari 800? My implementations are only about 2 times faster than the Z80 ones...
Thank you. I didn't know about this 6502 bug around NMI. However, it is rather a bug in the 6502 documentation, so we can rather blame the Atari engineers who didn't test the Antic thoroughly and missed that NMI requires one more cycle in some cases.
So I can't use MBI.EXE unless I have the Rapidus board?
15 hours ago, Faicuai said:Global-performance BBC:ATARI ratio is 1:1.1173 which suggests a target exec. time of 33.5195s on Atari... Yet it is clocking slower at 35.74s which implies a global ratio of 1:1.19 instead... That is a SIGNIFICANT difference. More than what the clock-speed suggests.
As far as I know the 6502 seems to *require* RAM clocked at twice its speed to perform at full potential. The BBC looks like it is juicing that 6502 all the way. But it also begs the question as to what is Atari's real ram speed is? Is it running at Antic's freq.? less?
All in-all-all, Atari's 6502 is not delivering everything it can (there's a bit lost somewhere). Not even with DMA=OFF and suppressing interrupts. That is for sure.
Do you mean ATARI:BBC = 1:1.1173? The BBC Micro must be faster. However, it could have been much faster, because its fast RAM isn't fully utilized. BTW were there any Acorn Archimedes machines in Australian schools?
-
22 hours ago, drac030 said:I would actually prefer a newer version of the 65C816, not necessarily 20 GHz (50 MHz would be enough, although they are claiming IIRC that their softcores can run up to 200 MHz) with few its "features" rectified.
The 40 MHz prototype board runs a softcore 65C816.
Initially I thought that the benchmark was compiled from C on all platforms. Later I saw that this is a hand-optimized assembler.
I will take a look at the scpu sources, but I am under the impression that you already have all the tools necessary to prepare such a version in binary form. Could you do it? If not, no problem.
I am curious about your accelerators. How about compatibility? We know that Atari refused to replace the NMOS 6502 with the CMOS one because some programs didn't work on the CMOS 6502, for example the game Asteroids.
WDC has been doing nothing for more than 25 years.
So it is very unlikely that it can produce anything new.
IMHO it would be good if somebody made a high-frequency 65CE02 - it is much better than the 65C02. I know little about softcores, but a man ran the pi-spigot on his Acorn Atom with such a core at 100 MHz a year or two ago.
You can also read that this implementation of π-spigot is claimed as the fastest but everybody is invited to make it faster.
Some ppl tried to make it better for the 6502, x86, PDP-11, 68k, ...
No problem. IMHO you can find all the information about my tools in the sources. I use the tmpx assembler, awk, sed, and maybe several other standard Unix utilities. The most interesting part is maybe the branch optimizer - you can find it in the bbc folder. It helps to keep all branches within the same pages - you know, when a taken branch crosses a page boundary we have a timing penalty.
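To make the page-boundary penalty concrete, here is a small sketch of the rule for NMOS 6502 relative branches: 2 cycles when not taken, 3 when taken, and 4 when taken and the target lies on a different 256-byte page than the instruction following the branch. This is not the branch optimizer from the project, only an illustration of the check such a tool would need; `branch_cycles` is a hypothetical name.

```python
def branch_cycles(branch_addr, target_addr, taken=True):
    """Cycles for a 6502 relative branch located at branch_addr."""
    if not taken:
        return 2
    # Relative branches are two bytes long; the page comparison uses
    # the address of the next instruction, not of the branch itself.
    next_instr = (branch_addr + 2) & 0xFFFF
    crosses_page = (next_instr & 0xFF00) != (target_addr & 0xFF00)
    return 4 if crosses_page else 3


# Same page: next instruction at $10FF, target at $1080.
print(branch_cycles(0x10FD, 0x1080))
# Page crossing: next instruction at $1100, target at $10F0.
print(branch_cycles(0x10FE, 0x10F0))
```

In a tight inner loop that extra cycle is paid on every iteration, which is why it can be worth moving code by a few bytes to avoid it.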
16 hours ago, drac030 said:Just for the records, I was wondering if the program could be run outside the first 64k and if that makes a difference. It has the advantage of code running entirely in FastRAM without interference from Antic. Also, entire 64k is available for the program, minus its size, which effectively makes a bit more than 58k. The disadvantage is that the system calls (like OUTCHAR) impose bigger overhead.
I used still the same 8-bit module from pack-45, and a BASIC interpreter capable of accessing more than 64k address space (MultiBASIC). The interpreter had to be fixed first, because one of the necessary keywords did not work correctly:
The results are not much better, though: 136.62 for 3000 digits (vs 137.14 before) at 20 MHz, and 92.75 (vs 94.12 before) at 40 MHz.
The maximum amount of digits to be printed is 8484.
Thank you for these interesting results. It seems your systems have a sophisticated MMU - they allow such tricks! I have just uploaded pipack-46 which uses a proper way to set $222-vector - thank you. This work with NMI gives the Atari a very interesting flavor. Do you know that the Commodore +4 doesn't have NMI at all? They just cannibalized this processor pin!
Sorry, I missed it: what is the purpose of the file MBI.EXE? Is it a program for Microsoft Windows?
Indeed the version for the SuperCPU also uses JMP (abs,X) - it is quite a useful instruction and especially for ROM-coding.
However you use a kind of expanded Basic. In my project I have to rely only on stock ROM variants.
16 hours ago, Mazzspeed said:Furthermore, as another member stated, the ram in the BBC runs an effective 4Mhz. I knew there was a reason why Acorn didn't cheap out on the memory used in their machine at the time.
What is the reason? IMHO the BBC Micro was too expensive, and they could have utilized its fast RAM much better. If they had used it like the Plus4 uses it, they could have got more than 3 effective MHz!
13 hours ago, tebe said:MadPascal (FreePascal), Pi Bench
Do you know that the authors of the pi-spigot algo published it in Pascal?
https://www.maa.org/sites/default/files/pdf/pubs/amm_supplements/Monthly_Reference_12.pdf
-
22 hours ago, drac030 said:Okay, so some technical information now:
1) the BASIC interpreter I used is U-BASIC. It is not very much more than the ordinary Atari BASIC recompiled so that it runs in the memory under the OS instead of hogging the cartridge area. You can run it on vanilla 800XL or 65XE (not on 800 - it requires a 64k machine).
The maximum memory (out of the first 64k as the benchmark rules stipulate) I can get here in this BASIC without falling back to the cassette recorder (which I do not have) as the only storage is:
It would however be better if the benchmark could avoid involving a BASIC interpreter and be run directly from the CP.
2) the hardware: PAL Atari 65XE with 320k RAM and Rapidus Accelerator. The results number 1 is the regular board, 65C816/20 MHz. The results number 2 is a prototype where the CPU is clocked at 40 MHz. So yes, this is "a kind of SuperCPU for Atari" as you correctly guessed.
This however raises questions about the SuperCPU results listed in your table. Namely, what is the difference between the entries 31/34 on one side, and 44/45 on the other side? Both list 65C816/20 MHz, but as you see, 31/34 are actually faster (by about 10%) than a 65C816/40 MHz. So how this result has been achieved on the SuperCPU board or maybe it is a mistake and 31/34 are really faster than the 20 MHz listed?
If you can compile the main module for the 65C816 target, we could see.
Basic is not a necessity you can use USR($1907) that gives you 200 digits. Basic provides the use of a friendly interface to set number of digits and optionally to set screen off mode. It also helps with floating point division: 49.86 or 59.92 are not easy divisors to handle in ML-code. For the Atari, Basic is used to load ML code too.
Many other ports (the Commodore 64/Plus4/128/SuperCPU, BBC Micro, Amstrad CPC, ZX Spectrum, Dragon 32/64, ...) also use Basic. The Unix ports use C instead of Basic.
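To illustrate the division mentioned above, here is a minimal sketch in Python (not the Basic or ML code of the benchmark itself). It assumes the elapsed time is read as a three-byte jiffy counter, most significant byte first, and that 49.86 and 59.92 are the PAL and NTSC frame rates. The function name and the sample counter value are hypothetical.

```python
# Frame rates quoted in the post, in jiffies (frames) per second.
FRAME_RATE_HZ = {"PAL": 49.86, "NTSC": 59.92}


def jiffies_to_seconds(msb, mid, lsb, system="PAL"):
    """Convert a 3-byte jiffy counter (assumed big-endian) to seconds."""
    jiffies = (msb << 16) | (mid << 8) | lsb
    return jiffies / FRAME_RATE_HZ[system]


if __name__ == "__main__":
    # Hypothetical counter reading: $0002F9 = 761 jiffies.
    print(f"{jiffies_to_seconds(0x00, 0x02, 0xF9, 'PAL'):.2f} s")
```

Doing this one division in Basic keeps the awkward non-integer divisor out of the ML code, which only has to hand over the raw counter.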
Could you tell me when production of the Rapidus Accelerator started?
What is this prototype? WDC stopped developing the 6502 long ago.
Bill Mensch just says something strange from time to time but does nothing.
Three years ago he said that he could make a 6502 run at 10 GHz; this winter he even declared 20 GHz!
The higher results are caused not by the increased frequency but by the driver. You can easily see that the higher results are based on the scpu64-5 driver, which means that this code uses all the advantages of the 65816. The lower results are based on the c64-9 driver, which is just the C64 program that was used to test the C64 itself. It is easy to adapt the current Atari program to use the 65816 advantages: just replace the code between start and finish with the code from the c64scpu folder.
12 hours ago, Faicuai said:It works!
Now getting 1.85s for first 100 digits, and 184.9465s for 1000 digits... This is in "console mode", output via XEP80 Ultra-drivers (80-cols) and NO blank-screen (XEP80 drivers service (and keep) E: driver at full 6502 tilt, all the time).
All the above with Atari 800/Incognito (RAM/ROM/HD board) and SDX dos.
Thank you. Your results confirm that Atari800 is a very accurate emulator. Your numbers are a bit lower than those in the table, which is a result of the faster character output of the XEP80. For 3000 digits you should get a number almost equal to the one in the table.
4 hours ago, Rybags said:I'm fairly sure that doesn't happen - the screen blanking just prevents the cycles lost for screen generation in the same fashion as setting DMACTL to 00 on the Atari - the PAL/NTSC setting would just toggle between 312/262 scanlines at ~ 50 & ~ 60 FPS. There's no valid reason for the CPU to overclock to that speed (though it does seem somewhat close to half the PAL colour clocking rate)
The Plus4 can toggle the base dot clock divider, which is 8 for NTSC and 10 for PAL. This gives a 25% speed boost for PAL systems and a 20% slowdown (25% longer run times) for NTSC systems.
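The arithmetic behind the divider toggle can be checked with a short Python sketch. It assumes the master clock stays the same, so the CPU clock is inversely proportional to the divider; the function name is made up for illustration.

```python
def speed_change_percent(old_divider, new_divider):
    """Percent change in CPU clock when the dot-clock divider changes.

    Assumes an unchanged master clock, so speed scales as 1 / divider.
    """
    return (old_divider / new_divider - 1.0) * 100.0


if __name__ == "__main__":
    # PAL machine switched from divider 10 to the NTSC divider 8.
    print(f"PAL 10 -> 8:  {speed_change_percent(10, 8):+.0f}%")
    # NTSC machine switched from divider 8 to the PAL divider 10.
    print(f"NTSC 8 -> 10: {speed_change_percent(8, 10):+.0f}%")
```

Going from 10 to 8 raises the clock by a quarter, while going from 8 to 10 lowers it by a fifth, which is the same as run times getting 25% longer.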
2 hours ago, Mazzspeed said:I can tell you right now, it definitely runs at 2.2 MHz. This has been known for quite some time now.
All it needs is the 2 MHz RAM of the BBC and it would be a rip snorter as long as the screen is off. All our 8-bit machines run 1 MHz RAM, as Atari and Commodore wanted to make things as affordable as they could without the backing of the British Broadcasting Corporation.
In fact the BBC is a really good 8bit machine. It lacks somewhat in graphics capabilities, but as a development machine it's a great device - Especially with that Tube port and an ARM second processor.
The Plus4 uses 2 MHz RAM and a 2 MHz CPU; the BBC Micro uses a 2 MHz CPU and 4 MHz RAM. The Beeb never has memory contention, and the price for this is its costly DRAM. The Plus4 only sometimes gives its CPU 2 MHz access to DRAM. In screen-off mode the CPU always accesses DRAM at 2 MHz (exactly 1.76); the only exception is 5 cycles on each raster line, which the CPU runs at 1 MHz, giving 5 cycles to DRAM refresh. Sometimes, when the screen is on, the Plus4 CPU is simply stopped and all cycles go to the video: this is the famous badline effect. In the standard screen-on modes the Plus4 has twice as many such lines as the C64, but it can be programmed to use even more badlines to show more colors and resolution.
-
1 hour ago, drac030 said:Okay, my results number 1:
100 digits - 0.16
1000 digits - 15.26
3000 digits - 137.14
5228 digits - 416.80
My results number 2:
100 digits - 0.10
1000 digits - 10.46
3000 digits - 94.12
5228 digits - 285.98
It seems you have a kind of SuperCPU for your Atari; the timings are almost identical. So if you use the special version of the code for the 65816, your system's results will become approx. 50% better still.
Thank you very much for your help, the Atari is a very interesting system.
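As a quick cross-check of the two result sets quoted above, this Python sketch computes the speed ratio between them. The labels assume that results number 1 are the 20 MHz board and results number 2 the 40 MHz prototype, as drac030 describes elsewhere in the thread; the variable and function names are made up.

```python
# Timings in seconds, copied from the quoted post.
RESULTS_20MHZ = {100: 0.16, 1000: 15.26, 3000: 137.14, 5228: 416.80}
RESULTS_40MHZ = {100: 0.10, 1000: 10.46, 3000: 94.12, 5228: 285.98}


def speedups(slow, fast):
    """Return slow/fast time ratios keyed by the number of digits."""
    return {digits: slow[digits] / fast[digits] for digits in slow}


if __name__ == "__main__":
    for digits, ratio in speedups(RESULTS_20MHZ, RESULTS_40MHZ).items():
        print(f"{digits:5d} digits: x{ratio:.2f}")
```

For the longer runs the ratio stays well below 2, so doubling the clock does not halve the run time on this setup.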
-
30 minutes ago, Rybags said:Refresh on C64 AFAIK never steals cycles. Many accesses on the C64 are transparent since the CPU runs at half the relative speed of the video generation compared to the Atari.
You can alter the refresh cycle generation on the Atari but at the cost of losing most of the cycles to the character and map fetches. With programming tricks you can make 240 badlines, which disables the bulk of the refresh cycles in a frame.
This explains why the C64:Atari800 performance ratio is about 1:1.65 and not 1:1.75
-
I've just uploaded version 3 of the program. It now uses a portable OUTCHAR and sets MEMTOP at $1907. So after loading, BASTOP will be $1902 on drac030's system, which leaves only 5 bytes free. That may cause problems. However, it should now work nicely on Faicuai's system.
Moving MEMTOP up leaves less memory for π digits: the program can now print only 3916 digits, whereas the previous version could print 4356.
-
4 minutes ago, dmsc said:Hi!
Because the IOCB#0 could be redirected (for example, to a file). The value of the X register then is the IOCB you are calling - #0 in this case.
About BASTOP, the whole reason of the existence of BASTOP is that it is not constant - it depends on the specific DOS used (many to choose from), the BASIC interpreter and the current program. You should not depend on that location having a fixed value.
Have Fun!
Thank you! So is it the lowest safe address from which ML code can be loaded? The program currently assumes BASTOP at $8xx. Of course, I mean its value after NEW or a reset.
-
BTW should I use LDX #0 before RTS in the new OUTCHAR routine?
-
9 minutes ago, drac030 said:Correct:
+ICBLL = $0348
into
+ICPTL = $0346
and it will be fine.
I hope you realize that this value and also the amount of memory available over it will be different depending on the particular Atari setup and even the BASIC interpreter used...
Thank you, it works now.
I was asking for the BASTOP value of one particular system. The program changes MEMTOP and uses its old value to calculate the total amount of memory that the program can use. So I only need the value of BASTOP.

Benchmarking the Atari 800XL
in Atari 5200 / 8-bit Programming
Posted · Edited by vol
Maybe I should start a new topic about how to access the host resources from the Indus GT... The BBC Micro CP/M systems have detailed docs about how to access the host resources. So maybe later we can discover how to do the same things with the Indus.
Anyway, I am trying to run CP/M and have not been successful. I have installed the Indus GT device and attached the ROM for it. It can boot from the Indus Master disk but it refuses to boot from the Terminal disk.
The system boots from this disk when I disable the Indus. When the Indus is enabled, it requires something extra to boot from the Terminal disk. Once I managed to boot it by accident and then started booting CP/M, but after some progress everything stopped.
As you know, every program in my project must satisfy 4 requirements, and one of them is the capability to measure time.