phaeron #51 Posted June 30, 2019 2 hours ago, DrVenkman said: In a FB discussion with some friends recently we were discussing mathematical implementations for numerical modeling (*) and happened to run across a support article from Apple briefly discussing how they have moved the calculation engine for their Numbers spreadsheet to a 128-bit base. That got me wondering how well my various calculator apps, programs and old computers would be able to calculate the simple expression "2^128", and how many decimal places out the scientific notation the errors would creep in. The current version of Excel expresses the result (to 11 decimal places) as 3.40282366921E+38, which is the same result provided by my HP-48G and my iOS HP emulators. (**) Altirra Extended BASIC has a markedly more accurate result. Still not objectively "correct", but a couple decimal places closer. I don't expect anyone is coding astronomical, cosmological or subatomic problems via numerical modeling on our 8-bits, but it's always nice to see increased accuracy in the math when possible. The reason for this is rounding: all results in ATXBasic's math pack are rounded. The AltirraOS math pack does this as well, so you will get similar results with Atari BASIC under that OS. I think there was one corner case of double-rounding in the AltirraOS math pack that I couldn't avoid for code size reasons but was able to fix in ATXBasic. Power operations in Atari BASIC compatible interpreters are implemented as 10^(log(x) * y). The intermediate log(2) result computed by ATXBasic is 0.3010299956, which is 1 ulp off from correct -- should be 0.3010299957. Still, not a bad end result compared to IEEE 754 double arithmetic, which has ~15 digit precision. Atari decimal FP has 9-10 digits of precision which limits the best possible result from this formula to 3.40282371E+38. Note that Turbo-Basic XL and ATXBasic have an optimization where integer powers are converted to multiplication trees, which is both faster and more accurate. It doesn't kick in this case because the exponent is larger than two digits, and Turbo-Basic XL actually gives an overflow error (!), but you can see the result by manually squaring 2^64. This does manage to give a fully correct result within representable precision. 1 hour ago, billkendrick said: Does AEB support bare "NEXT"s (no avar), like I recall TBXL supporting? (The docs don't show the avar being optional) No, just tried it in Turbo-Basic XL 1.5 and it isn't accepted there either. 2 Quote Share this post Link to post Share on other sites
Steve Mynott #52 Posted June 30, 2019 7 hours ago, DrVenkman said: I haven't spent much time coding in the last, oh, 3 decades or so, but it's wonderful to see this version of BASIC being maintained and improved by the one singular living person who probably knows best how the Atari hardware and software works and cooperates together. In a FB discussion with some friends recently we were discussing mathematical implementations for numerical modeling (*) and happened to run across a support article from Apple briefly discussing how they have moved the calculation engine for their Numbers spreadsheet to a 128-bit base. That got me wondering how well my various calculator apps, programs and old computers would be able to calculate the simple expression "2^128", and how many decimal places out the scientific notation the errors would creep in. The current version of Excel expresses the result (to 11 decimal places) as 3.40282366921E+38, which is the same result provided by my HP-48G and my iOS HP emulators. (**) Atari BASIC, using the built in FP package gets bitten by rounding errors: Altirra BASIC version 1.5 and BASIC XE seems to also rely on the same built-in FP package, or at least one with the same inaccuracies with large numbers: Altirra Extended BASIC has a markedly more accurate result. Still not objectively "correct", but a couple decimal places closer. I don't expect anyone is coding astronomical, cosmological or subatomic problems via numerical modeling on our 8-bits, but it's always nice to see increased accuracy in the math when possible. ALSO: I probably should break down and start playing with Altirra Extended BASIC some more - looks like it's already great, on its way to pretty spectacular. (*) I last coded some orbital mechanics simulations as an undergrad studying aerospace engineering around 1988 or 1989. I coded in OSS Personal Pascal on my 1040ST because ST BASIC sucked ass and gave me terribly inaccurate results. (**) The actual value, expressed as a number, is big. I mean, really freaking big: 340,282,366,920,938,000,000,000,000,000,000,000,000 😮 I just tried Fast BASIC 4 FP and it looks like its the winner correct to 8 decimal places rather than 7! 2 Quote Share this post Link to post Share on other sites
billkendrick #53 Posted June 30, 2019 Discovering the Player/Missile Graphics commands, which are exciting. (Had never used BASIC XL before, but knew of it, thanks to Compute! books I had as a kid.) I'm realizing that PMG stuff gets turned off when the program ends (last line executed, or END issued). Makes it a bit hard to toy with things in immediate mode. Noticed that, e.g., "PMMOVE 0,100" results in an unexpected "Error- 30 ?Label at line NNN". So I guess "STOP" it is... Quote Share this post Link to post Share on other sites
billkendrick #54 Posted June 30, 2019 PS to get it to work on my The Ultimate SD Cart, I had to use the little "RomToCar" (html+JavaScript (!)) tool to convert the "atxbasic.bin" to a suitable "CAR" file. (Slightly different header, I guess?) Also, these PMG commands aren't in the online help, and also not noted under the list of non-Atari BASIC-compatible commands in the PDF docs. Oh, and listing the version # in the online help would be groovy. Quote Share this post Link to post Share on other sites
dmsc #55 Posted June 30, 2019 Hi! 8 hours ago, Steve Mynott said: I just tried Fast BASIC 4 FP and it looks like its the winner correct to 8 decimal places rather than 7! Well, FastBasic cheats a little, by parsing the expression as "integer power", it simply uses a multiplication tree to calculate the result, same as TuboBasicXL and Altirra Extended Basic for numbers less than 100. You can test this code to see the difference, as we can force a floating point exponent: ? 2^128 ' Integer exponent ? 2^128.0 ' FP exponent Of course, if you use a better math pack (for example, the excellent Altirra Mathpack), you get the best results: I don't like the power optimization on runtime, because it can cause discontinuities in the function, this leads to errors in algorithms and plots, see this: The power function is not monotonic! And in TurboBasicXL it is worse: But I agree, that because you don't have an integer data type in Altirra Extended Basic, this optimization is the best possible. Have Fun! 3 Quote Share this post Link to post Share on other sites
phaeron #56 Posted June 30, 2019 49 minutes ago, dmsc said: I don't like the power optimization on runtime, because it can cause discontinuities in the function, this leads to errors in algorithms and plots, see this: The power function is not monotonic! Yeah, this is basically the much faster replacement for the Atari BASIC rounding hack. It's so much faster than the full formula that it can't be ignored since TBXL is used for so many games. It might be possible to retune the LOG10/EXP10 constants to improve the accuracy and thus range of monotonicity -- think currently it's just using the same constants as the AltirraOS math pack, which has to worry about compatibility with the rounding hack, whereas ATXBasic's math pack doesn't. PLYEVL probably also evaluates in the wrong order for accuracy given the values passed into it. I'm not actually overly concerned about accuracy/monotonicity and am fine with a 1ulp error on non-identity values for complex functions, but it's always nice to have. The more pragmatic reason for the improved accuracy is that it's a way to get to "at least as good as Atari BASIC / TBXL" without copying the specific errors from the original math pack, which is in turn difficult to do without actually copying bits from the math pack, which is a no-no here. 3 Quote Share this post Link to post Share on other sites
dmsc #57 Posted July 1, 2019 Hi! 18 hours ago, phaeron said: Yeah, this is basically the much faster replacement for the Atari BASIC rounding hack. It's so much faster than the full formula that it can't be ignored since TBXL is used for so many games. It might be possible to retune the LOG10/EXP10 constants to improve the accuracy and thus range of monotonicity -- think currently it's just using the same constants as the AltirraOS math pack, which has to worry about compatibility with the rounding hack, whereas ATXBasic's math pack doesn't. PLYEVL probably also evaluates in the wrong order for accuracy given the values passed into it. I'm not actually overly concerned about accuracy/monotonicity and am fine with a 1ulp error on non-identity values for complex functions, but it's always nice to have. The more pragmatic reason for the improved accuracy is that it's a way to get to "at least as good as Atari BASIC / TBXL" without copying the specific errors from the original math pack, which is in turn difficult to do without actually copying bits from the math pack, which is a no-no here. I tried brute-forcing the constants around a min-max approximation, minimizing the relative error, constraining that EXP10(1)=10 and EXP10(0)=1 exactly and searching for the most zeroes in the coefficients (to minimize runtime). Problem is, my code does not emulate exactly the rounding done in your math-pack, so the result is not 100% accurate, but good enough. My current best coefficients are (first is always 1 with the constrain EXP10(0)=1): 2.65094503 2.03478568 1.17018241 0.5447325000 0.1921384000 0.0919451600 -0.002005300000 0.0146910000 With those, I got a maximum error of 9.99948e-09, this is one ULP at small values of X, this is the plot of relative error. I also tried searching for sets of 9 coefficients (one less than current), there best error is 7*10^-8. Quote Share this post Link to post Share on other sites
phaeron #58 Posted July 2, 2019 18 hours ago, dmsc said: I tried brute-forcing the constants around a min-max approximation, minimizing the relative error, constraining that EXP10(1)=10 and EXP10(0)=1 exactly and searching for the most zeroes in the coefficients (to minimize runtime). Problem is, my code does not emulate exactly the rounding done in your math-pack, so the result is not 100% accurate, but good enough. Do you know the case that differs? The trickiest case is destructive cancellation during addition/subtraction, I seem to recall some icky cases involving a one digit pair exponent difference with a borrow where it was easy for the result to be over-rounded, because the rounding bit is injected into the carry but during the subtraction but the borrow then requires one normalization shift, which shifts the rounding pair back into range. Don't remember how/if I managed to solve this. Quote Share this post Link to post Share on other sites
dmsc #59 Posted July 2, 2019 Hi! 4 hours ago, phaeron said: Do you know the case that differs? The trickiest case is destructive cancellation during addition/subtraction, I seem to recall some icky cases involving a one digit pair exponent difference with a borrow where it was easy for the result to be over-rounded, because the rounding bit is injected into the carry but during the subtraction but the borrow then requires one normalization shift, which shifts the rounding pair back into range. Don't remember how/if I managed to solve this. I tried about 24 million adds/subs and 50 mullion multiplications, the only differing case was on normalization after add: Here I was getting 101 as result, but your code does not round up. Also, I discovered a bug in my SUB implementation, I was rounding on scaling down, but that does not work when the end part is exactly 0.5000... (on sub, you must round down, on add, round up), so your code was correct. Have Fun! Quote Share this post Link to post Share on other sites
thorfdbg #60 Posted July 2, 2019 On 6/30/2019 at 7:40 PM, dmsc said: Hi! Well, FastBasic cheats a little, by parsing the expression as "integer power", it simply uses a multiplication tree to calculate the result, same as TuboBasicXL and Altirra Extended Basic for numbers less than 100. I wouldn't call this cheating. Actually, what you want to do is to split off the integer part, use a multiplication tree for it, and then use the log/exp formula for the fractional part to get consistent results. Quote Share this post Link to post Share on other sites
+DrVenkman #61 Posted July 4, 2019 All this discussion of mathematical precision has gotten me motivated - or at least as motivated as I can be at my age. I’m going to try to convert my undergraduate aerospace engineering orbital mechanics assignment into Altirra Extended BASIC from the original OSS Personal Pascal, originally programmed on my ST in September 1988. Yes, at one point my brain understood this math. 😛 (These pics are snips of the handwritten notes I made about the Runge-Kutte method to solve the time-step equations and a couple pages of the Pascal code. This will be fun - assuming my motivation survives my short attention span and multiple hobbies). 3 Quote Share this post Link to post Share on other sites
dmsc #62 Posted July 4, 2019 Hi again! On 7/2/2019 at 4:21 AM, phaeron said: Do you know the case that differs? The trickiest case is destructive cancellation during addition/subtraction, I seem to recall some icky cases involving a one digit pair exponent difference with a borrow where it was easy for the result to be over-rounded, because the rounding bit is injected into the carry but during the subtraction but the borrow then requires one normalization shift, which shifts the rounding pair back into range. Don't remember how/if I managed to solve this. I fixed my code so that it produces the same results that the Altirra math-pack and re-run the optimization for the EXP10 coefficients. This time, I minimized the maximum relative error and the mean relative error over small intervals, this is slower but produces stabler results. Also, I tried to get the most zeroes in the coefficients possible. The best turned out this: .byte $3F, $01, $47, $00, $00, $00 ; 0.0147 .byte $7E, $20, $30, $00, $00, $00 ; -0.002030 .byte $3F, $09, $19, $68, $00, $00 ; 0.091968 .byte $3F, $19, $21, $32, $00, $00 ; 0.192132 .byte $3F, $54, $47, $30, $44, $00 ; 0.54473044 .byte $40, $01, $17, $01, $83, $62 ; 1.17018362 .byte $40, $02, $03, $47, $86, $04 ; 2.03478604 .byte $40, $02, $65, $09, $44, $76 ; 2.65094476 .byte $40, $02, $30, $25, $85, $14 ; 2.30258514 .byte $40, $01, $00, $00, $00, $00 ; 1 This is a plot of the mean interval error (this is the mean error over 1M numbers) with respect to the correctly rounded result, about 10% better than the original: Also, zooming in at the end of the interval you see that this gives a lot better results near 1.0, this makes EXP10 more stable around integer values. I tested the coefficients in your Altirra Extended Basic implementation, average runtime for EXP10 went from 21347 to 20351 cycles, about 5% faster, and testing over 870000 values from 0.01 to 1.0 mean relative error went from 2.6789621e-09 to 2.4933891e-09. Over the weekend I will try to optimize LOG10, for this I will need to implement division on my code Have Fun! 3 Quote Share this post Link to post Share on other sites
phaeron #63 Posted July 4, 2019 That's some really great work, but I should fix that rounding issue first in case it influences the results. That particular case doesn't look too bad, as it's in the addition renormalization code and I should be able to re-round the result with a slightly altered rounding constant. It'll be off by a tiny amount, but the ATXBasic math pack doesn't do all out round-to-nearest-even-with-sticky-bits kind of full rounding anyway. Quote Share this post Link to post Share on other sites
dmsc #64 Posted July 8, 2019 Hi! On 7/4/2019 at 12:59 AM, phaeron said: That's some really great work, but I should fix that rounding issue first in case it influences the results. That particular case doesn't look too bad, as it's in the addition renormalization code and I should be able to re-round the result with a slightly altered rounding constant. It'll be off by a tiny amount, but the ATXBasic math pack doesn't do all out round-to-nearest-even-with-sticky-bits kind of full rounding anyway. Don't worry, I can re-run the optimizations after you fix the rounding, I suspect the result will be about the same. Have Fun! Quote Share this post Link to post Share on other sites