Everything posted by JamesD
-
No sort change, unrolled DATA statements, names and numbers, 561 names, MC-10 New ROM:
Scan data, dimension, load data, sort, check sort, print names: 171 seconds
Scan data, dimension, load data, sort: 149 seconds
Dimension, load data, sort, check sort, print names: 159 seconds
*edit* Non-unrolled DATA statements are a second or so faster.
-
With comb sort as is, 561 names, no numbers... the MC-10 New ROM drops to 167 seconds. No point in timing more since that's the fastest of the lot till I look at the "banana sort" or FastBasic's sort. FastBasic supports INTs, so that may not be feasible; those would all be single precision in MS BASIC. *edit* That's dimension, load, sort, print, btw.
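For anyone following along who hasn't seen it, comb sort is basically bubble sort with a gap that starts large and shrinks each pass. The thread's code is in BASIC; this is just a minimal Python sketch of the algorithm itself, not the benchmark code:

```python
def comb_sort(a, shrink=1.3):
    """Comb sort: bubble sort with a shrinking gap.
    Once the gap reaches 1, the remaining passes are
    plain bubble-sort passes until no swaps occur."""
    gap = len(a)
    swapped = True
    while gap > 1 or swapped:
        # Shrink the gap; 1.3 is the commonly cited shrink factor.
        gap = max(1, int(gap / shrink))
        swapped = False
        for i in range(len(a) - gap):
            if a[i] > a[i + gap]:
                a[i], a[i + gap] = a[i + gap], a[i]
                swapped = True
    return a
```

The same structure maps straight onto a BASIC FOR loop with the gap held in a variable, which is why it's attractive here: far fewer passes than bubble sort, but no recursion or stack like quicksort needs.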
-
I went to do a new build of the MC-10 ROM with the 16-bit string compare disabled, and found the define to enable it was commented out. Using the 16-bit string compare drops the last test I did from 270 to 268 seconds, beating the standard MC-10 ROM by a couple more seconds. Most of the speed improvement probably comes from the last 4 or 5 passes, where the data is almost in order and there are a lot more compares. The more similar the strings, the more of a speedup it offers, especially with large data sets. I'm guessing the numbers will converge if the amount of data is reduced and the similarity drops. While it is faster here, it's not by much. A faster sort makes fewer compares, so the improvement would be even smaller. Dropping the test to see if the data is in order will also reduce the speedup; the only reason for that in the first place was to verify the sort was working, and that has been fixed. Verdict... meh.

Okay, so we have a lot of numbers and several approaches to solving this sort of task. But where does that leave a common benchmark? How many names do we need to limit it to so we can time the same data on every version? What if we sort on every data field in the original data once; what would the impact be? That shouldn't be difficult in any version, but will sorting on the numeric fields have a significant impact on the Atari code that uses a line for each item? Memory will be the biggest impact in MS BASIC.
-
Microsoft BASIC doesn't tokenize constants, so it's best left that way for MS BASIC. Your last suggestion should be faster, but given the number of times it's executed, I won't be able to tell the difference.
-
I start with the same gap, but calculate it differently in the loop.
-
BASIC XL didn't include a fast math ROM, but if Atari had given a shnizzle, they could have included BASIC XL in the XL series. They probably could have included a fast math ROM as well; that's the main difference between the XL and XE versions. Then they could have offered upgrades for the 400/800. Sucks that they didn't. One of the biggest talking points against the Atari could have been turned into one of its strengths. The quirk with comb sort is probably the calculation of the gap. Try my modification for calculating the gap; it stays with larger gaps longer. It takes more passes with 891 names, but the time was less.
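The exact gap formula isn't quoted in this post, so as an illustration only, here are two common ways to shrink the gap each pass: the usual divide-by-1.3, and a multiply-only variant (a multiply by a constant can beat a divide in MS BASIC's float routines). The 0.77 factor is my assumption of what "only used a multiply" might look like, not necessarily the actual modification:

```python
def gaps_divide(n, k=1.3):
    """Gap sequence from repeated integer division by k."""
    seq, gap = [], n
    while gap > 1:
        gap = max(1, int(gap / k))
        seq.append(gap)
    return seq

def gaps_multiply(n, k=0.77):
    """Gap sequence from a multiply instead of a divide.
    0.77 is roughly 1/1.3, but the rounding differs, so the
    sequence (and therefore the pass count) is not identical."""
    seq, gap = [], n
    while gap > 1:
        gap = max(1, int(gap * k))
        seq.append(gap)
    return seq
```

Printing both for n=891 shows the multiply variant holding slightly larger gaps early on, which matches the "stays with larger gaps longer, more passes, less time" description, at least qualitatively.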
-
Reminder: they support DEEK and DOKE, so it looks like the code could be faster.
-
The real advantage of Turbo BASIC is definitely going to be the compiler, but you are basically in a compiler/accelerator arms race at that point. The Turbo BASIC version also lets the Atari skip the data reading loop entirely, which might be considered cheating on the benchmark. My question here is: what if the data is in a file? Does that specific optimization go out the window? You can always switch the data read loop to a file read, but this? It works well for constant data that fits in DATA statements, but not for data like a regularly updated contact list. I only used DATA statements because the MC-10 only supports tape out of the box, and benchmarking tape/drives doesn't make much sense.

If you use a different sort, every version would be faster by doing the same, and all my timings were on the larger list of names. If using a smaller list is required, I'm okay with that, and I'm okay with a different sort as long as every machine uses the same one. If you benchmark an Apple II with one sort, and again with another sort, the results are going to be different, so it's not going to make for a meaningful machine-vs-machine comparison either.
-
Check CPU timing settings?
-
Converting from ASCII to BCD might be faster than ASCII to single-precision floating point. It makes around 20 seconds' difference, give or take a little by machine, in MS BASIC. Clearly, array indexing is a bit of a roadblock for Atari BASIC. I wonder about the OSS BASIC XL and BASIC XE carts. The XL manual is copyright 1983, so that also pushes up when a faster version might have been possible, and the XL manual mentions string arrays. BASIC XE has SORTUP and SORTDOWN commands, btw, but that's a bit of a cheat here. *edit* DEEK and DOKE are also mentioned in the XL manual.
-
A funny thing happens when you drop the numeric data: it can then be dropped from the other versions too, which eliminates 3564 string reads and ASCII-to-float conversions. Is this where your speedup is actually coming from?

Results with 891 names, no numbers in data:
Plus/4: 417 seconds
Plus/4 with screen blanked for sort: 292 seconds
MC-10 New BASIC: 270 seconds (the 16-bit string compare may be slowing this down)
MC-10: 286 seconds (can be sped up just under 3% with a software patch installed at runtime, no ROM change)
Apple II Plus: 337 seconds
Apple II (1982 or later with accelerator): ~120 seconds

I don't have one of those boards to test with, so I made a conservative division using the MHz listed on the internet. Later boards were faster, and a modern 20MHz accelerator would drop that below 20 seconds. This is why I suggested trying to stay in the early years; higher CPU speeds became more common after 1985. The Apple Language System would be somewhat the equivalent of FastBasic back in the day, but you'd be using Pascal. It also uses a bytecode interpreter.
-
It is definitely very different from the regular BASIC interpreters you are used to. Bytecode interpreters work more like an emulator of a virtual CPU, and the compiler generates code for that virtual CPU.
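To make the distinction concrete, here's a toy stack-machine interpreter in Python. It's purely illustrative (not how UCSD Pascal or any of these BASICs actually encode programs): the point is that parsing and operator precedence were resolved at compile time, so the runtime loop only dispatches on opcodes:

```python
# Opcodes for a toy stack-based virtual CPU (hypothetical encoding).
PUSH, ADD, MUL, HALT = range(4)

def run(bytecode):
    """Dispatch loop of a minimal stack machine. No text is parsed
    here; the 'compiler' already reduced the source to opcodes."""
    stack, pc = [], 0
    while True:
        op = bytecode[pc]; pc += 1
        if op == PUSH:
            stack.append(bytecode[pc]); pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()

# The expression (2 + 3) * 4, already "compiled" to bytecode:
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
# run(program) -> 20
```

Contrast that with a line-by-line interpreter, which re-tokenizes or re-parses the expression text every time the line executes.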
-
If you want to move the DATA in front of the code and put more on a line, go for it. I think the amount of DATA on a line was due to a line length limit, probably on the MC-10. The DATA follows the code because under MS BASIC any branch to a line number <= the current line number causes the interpreter to search for it starting from the first line; if Atari does it differently, that doesn't change the code that should stay the same, so I have no problem with it.

If you want to reverse the loops so they go from the bottom up, using the same core algorithm in reverse with the same gaps, you are doing the same number of compares, swaps, and passes, so I see no problem with that. I have no problem with changing the graphics mode to make the most cycles available, or even turning off the display. I tried that on the Plus/4 for the sort, and I think I already said I didn't care about that; if you are running a long job or an overnight job, you might turn off the monitor anyway.

What I have repeatedly said I have a problem with is switching sort algorithms for one machine, not timing the same things (skipping steps), switching to assembly language... stuff that makes it an apples-and-oranges comparison rather than an Apples-and-Ataris comparison. What I did on the last timings was read the data, sort the index, check the results, and print the results. If you want to skip any of that, I'd have to time everything over. I also used the gap calculation in the loop that only used a multiply. It makes more passes, but it seemed to be a little faster, and I posted both versions of the sort, with and without it. *edit* The code might not be doing the same number of swaps, but I think it won't matter much here.
-
As I said, I never designed it to benchmark anything but my own code, and I never designed it to pick on the Atari. This is pretty standard code for an MS BASIC machine. I sorted an index because it's faster than sorting strings, and the index can be sorted on any field in the DATA statements. If you want to swap strings in Atari BASIC instead, give it a try; I'm not sure it will be any more "fair". If you want to use assembly... then you aren't benchmarking the BASIC anymore. If it doesn't work well on the Atari, it doesn't work well on the Atari. So what?

Guess what? MS BASIC uses floats for everything on the Apple II, C64, CoCo, MC-10, Plus/4, etc. To use integers, it has to convert from floats to INTs for every index. All addition, subtraction, multiplication, division... floats. Adding a constant to a variable? The constant has to be converted from ASCII a byte at a time every time the code is executed, and it converts it to... a FLOAT! There are no integer data types. Not only that, but there are two types of floats, packed and unpacked, which the interpreter has to convert between even as it processes floats. Run-time status of the interpreter... you are dealing with 16-bit ints for line numbers, a few pointers, and some bytes for are-we-running, the number of nested FOR loops, etc., but everything else is pretty much... FLOAT. I could bitch about a lot of crap in MS BASIC. Every one of these BASICs has a huge amount of overhead that shouldn't be there. The MC-10 ROM footprint... 8K for everything. Standard Color BASIC on the CoCo? 8K for everything. But the Atari is soooo picked on in its little 8K ROM space.

Part of what benchmarks do is show how one program (in this case an interpreter) is faster than another. The goal of a benchmark isn't to optimize the piss out of it until you win. I know that's your goal, but that's not benchmarking. You try to optimize the benchmark, change the benchmark, and then wonder why people say you missed the point.
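The index-sort idea mentioned above, sketched in Python for clarity (the actual benchmark is in BASIC): the compares still read the strings, but the swaps only move small index values, which is the whole point on machines where moving variable-length strings is expensive. The comb-sort loop here is my framing of it, using the common 1.3 shrink factor:

```python
def sort_index(names):
    """Comb-sort an array of indices instead of the strings.
    Each compare dereferences the index into the name array;
    each swap exchanges two small numbers, never the strings."""
    idx = list(range(len(names)))
    gap, swapped = len(idx), True
    while gap > 1 or swapped:
        gap = max(1, int(gap / 1.3))
        swapped = False
        for i in range(len(idx) - gap):
            if names[idx[i]] > names[idx[i + gap]]:
                idx[i], idx[i + gap] = idx[i + gap], idx[i]
                swapped = True
    return idx

names = ["Jones", "Adams", "Smith"]
order = sort_index(names)   # indices of names in alphabetical order
```

Sorting on a different field just means pointing the compare at a different array; the index handling stays the same, which is why the same index can order any column of the data.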
You did the same thing with Ahl's benchmark, and I wasn't the only one to point out that's not how a benchmark works. Your response to Stephen (his post is 128 in that thread)... you said he was wrong because, and posted a picture of a sports car being chased by a cop. Well, there's a great argument. Last time? I cried too? What thread was this? I would have said of course it's faster: same code, higher clock speed. But then a IIgs, Laser 128EX, and IIc Plus are faster than an Atari because they have a faster clock speed than the Atari. Same response: of course it's faster, same code, higher clock speed. But this has nothing to do with benchmarking BASIC.
-
And it's not reading data, it's just timing the sort, it's not the same sort, compiler vs interpreter...
-
I've complained about you tuning benchmarks before. You tune the benchmark, say "look how fast," and the other machines don't get the same optimizations. #2 is definitely an issue when you don't make the same optimizations for all machines, and I've NEVER seen you make the same optimizations for other platforms. You've used the CoCo 3 and a 6309 how many times? *edit* At least I think it was you tuning benchmarks. I'd have to look.
-
By all means, show how the code order or sort order is somehow biased against the Atari. Alphabetical order is alphabetical order; how on earth would that be biased? If you have some logical explanation of how this isn't brutal on Microsoft BASIC but intentionally picks on Atari, I'd like to hear it. Favoritism, fairness? I didn't even write the benchmark to compare different machines. I wrote it as a torture test to see if my new string compare worked, and if it was any faster. I'm pretty sure I didn't try to discriminate against the Atari.

Even with graphics cards, you have standard benchmarks that are absolute torture tests, and they run on DirectX no matter which card you have. So it's not like there's a completely different benchmark for each machine. You have different implementations of DirectX targeting different hardware, just like you have different versions of BASIC, but the benchmark is the same. I also think you'll find a lot of criticism of "tuning" drivers for modern benchmarks: manufacturers make changes just to get better numbers on a benchmark, with no performance increase in real games. FWIW, I won't say you can't turn in better numbers, but don't expect to compare a BASIC compiler from 1985 against uncompiled BASIC from 1979 without getting a "not so fast" in response.
-
Yes, but it's also compiled, so it doesn't have to perform any parsing or evaluate order of operations for math on the fly. Most time-consuming tasks like that are handled at compile time. While it's not native code, the overhead is significantly lower than with a regular interpreter.
-
Given the differences between Atari BASIC and MS BASIC, some leeway is required to even get the benchmark to run, but the code should be as close to the same as possible if we want a meaningful speed comparison. That concept always seems to get thrown out the window in favor of "look how fast we can do it if we completely change the code!" So far the fastest way is a modern compiler where the code looks nothing like the original, or a version of BASIC that wasn't available until December 1985. I was hoping for a faster solution that could work on original Atari BASIC so it could compete better with the Apple II series in the early years. I say early years because the first Applesoft BASIC compiler was released in 1981, and the first accelerator board in 1982, so if you want to pull out all the stops, keep in mind that becomes an option on other machines as well. If you want to "win" a battle, you might want to at least try to limit the time frame to 1978-1980. If it's not possible, then it's just not possible, but getting close would certainly go a long way toward disproving criticism of Atari BASIC string handling.
-
FWIW, an inefficient sort is more of a torture test for string functions than an efficient one. I just don't want to wait on a bubble sort.
-
Just be aware that when I made the original read through the data twice, check the results of the sort, and print the results, it was intentional. It wasn't just about the sort. People missed that point from the start. I wanted a benchmark that wasn't limited to one thing; Ahl's benchmark, prime number generation, etc. mostly deal with math. I just wasn't in the mood to argue. But the "benchmark" tuning is getting out of hand at this point.
-
Worst case performance is the same as quicksort. Best case and average depend on the number of passes; it might be the same as quicksort... or not. And then you'd have to do quicksort on every version. "Why can't we use a machine language sort?" is the next question. It's not about getting to the end the fastest; it's about how long it takes to get to the end the same way.
-
I hate to be nit picky, but that isn't doing even close to the same amount of work, so it's an apples and oranges comparison. But the sort is fast. *edit* FWIW, I think it's awesome to be able to define data like that.
-
The MC-10 with an HD6303 should complete this somewhere around 4 minutes (240 seconds), ±10 seconds, depending on the speedup it offers here. I haven't modded a machine to verify the percentage yet, but the HD6309 speeds up the CoCo by around 20% in native mode with no 6309-specific code, so I guessed between 15% and 20% faster.
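The arithmetic behind that estimate, assuming the 286-second stock MC-10 time from the earlier results post as the baseline (that pairing is my assumption; the post doesn't name its baseline):

```python
# Rough HD6303 estimate: stock MC-10 time (assumed 286 s) reduced
# by the guessed 15-20% speedup from the post above.
base = 286
fast_20 = base * (1 - 0.20)   # 20% faster -> about 229 s
fast_15 = base * (1 - 0.15)   # 15% faster -> about 243 s
# Both land in the "around 240 seconds, give or take" range.
```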
-
The timer does appear to work while the Plus/4 is in high speed mode. The sort alone takes over 4 minutes.
