dmsc

Members
  • Content Count

    553
  • Joined

  • Last visited

  • Days Won

    1

dmsc last won the day on July 14 2017

dmsc had the most liked content!

Community Reputation

951 Excellent

1 Follower

About dmsc

  • Rank
    Dragonstomper

Profile Information

  • Gender
    Male
  • Location
    Viña del Mar, Chile

Recent Profile Visitors

7,789 profile views
  1. Hi! But you can contribute in other ways:
     • You could help write documentation, tutorials and simple code examples to aid beginners.
     • Write more tests for the test-suite at https://github.com/dmsc/fastbasic/tree/master/testsuite/tests. Currently there are tests for almost all statements, but no specific tests for functions and operators. For this, you write a "BAS" file and a "CHK" file that provides the input and the expected output of the test (see the sketch below).
     • If you use Notepad++, perhaps write a syntax highlighting file and BAT scripts to automate cross-compiling and testing.
     • And lastly, you can simply provide new ideas and bug reports!
     Have fun!
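     As a rough illustration (hypothetical file names; the exact "CHK" layout should be copied from an existing test in the repository), a test for the NOT operator could be a tiny program whose output is fully predictable:

        ' not.bas - hypothetical test program
        ? NOT 0
        ? NOT 5

     The matching "not.chk" would then carry any input to feed the program plus the expected output lines, here "1" and "0".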
  2. Hi! In FastBasic, you can do A$ =+ B$ to copy B$ to the end of A$. I did not implement the easier to read "A$ = A$ + B$" because it would need a temporary string (so it would be slower), and normally you just need to concatenate, not to build an arbitrary string expression (see the small example below). Have fun!
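     For example, this appends one string to another in place, a minimal sketch of the operator described above:

        A$ = "FAST"
        B$ = "BASIC"
        A$ =+ B$
        ? A$ ' prints FASTBASIC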
  3. Hi again! This is bigger than needed, using two sectors instead of one and making boot a little slower. Attached is the same but shortened to 119 bytes, MADS source and XEX. Have fun! runbas.asm runbas.xex
  4. Hi Mr-Atari! I went to your page and downloaded v2.07, but the changelog text does not list any other change... Is it a typo in the post title, or really a new version? Have Fun!
  5. Hi Preppie! First, remember that there are currently two FastBasic compilers: the native one (the Atari IDE) and the cross compiler. Both share the same parser and interpreter, but the cross compiler adds a peephole optimization pass that reorders and transforms the code, so the same code can run at different speeds.

     About byte variables, the problem is that currently all arithmetic is performed on words, and the stack is also word sized, so it would only be faster for reading / writing the variable, and I don't think the added size in the IDE would be worth it. Also, currently "GET", "GET #", and "INPUT" accept byte references, but "FOR", "INC" and "DEC" only work with words. In the cross compiler, the expressions "A = 0" and "A = 1" are special-cased so they are faster than other values (as the VM has tokens to load 0 and 1), but that optimization is not done in the IDE.

     Also, FastBasic uses byte expressions for the "boolean" operators, as they are assumed to be always "1" or "0", so "A AND B OR C" converts A, B and C to booleans before doing the AND/OR:

        VAR_LOAD "A"
        COMP_0
        PUSH
        VAR_LOAD "B"
        COMP_0
        L_AND
        PUSH
        VAR_LOAD "C"
        COMP_0
        L_OR
        CJUMP jump_lbl_1

     In the cross compiler, the code is a little shorter, because the "load variable" tokens are merged with the "push" tokens:

        VAR_LOAD "A"
        COMP_0
        PUSH_VAR_LOAD "B"
        COMP_0
        L_AND
        PUSH_VAR_LOAD "C"
        COMP_0
        L_OR
        CJUMP jump_lbl_1

     Don't worry, I'm happy with anyone finding FastBasic useful. And instead of money, you could try to contribute to the code; it is open source, after all.

     Well, you are right about the times. I just measured about 81 cycles for each "INC VAR" (this is in the cross-compiler) and 291 cycles for "VAR = VAR + ##", so it is a valid optimization. Currently the cross compiler transforms "A = A + 1" and "A = A - 1" into "INC A" and "DEC A", so I could teach it to also transform "A = A + 2", "A = A + 3", "A = A - 2", "A = A - 3" and "A = A - 4"; this would bring the gain to all cases, and it is simple (only a new peephole rule). I don't like adding something like "INC A, B", because I think it is not clear enough and the speed gain is not that big. Also, in my proof-of-concept compiler to 6502 code (available in a branch on GitHub) the addition would be faster than multiple INCs.

     Currently the VM has tokens for shift-left by one and shift-left by eight, and in the cross compiler if you write "A = A * 2" or "A = A * 4" it is transformed into one or two shifts. Using the cross-compiler you can discover how the code is transformed by reading the resulting ASM file, and if you pass the "-n" option the code produced is exactly the same as the IDE, so you can compare both (see the small example below). Have Fun!
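     For instance (a hypothetical snippet, just to illustrate the transformations described above), compiling these lines with the cross compiler and reading the generated ASM should show the first one rewritten to INC A and the second to two shift-left-by-one tokens, while the third presumably stays a normal multiplication; compiling again with "-n" shows the untransformed code the IDE would produce:

        A = A + 1 ' peephole rule rewrites this to INC A
        A = A * 4 ' rewritten to two shift-left-by-one tokens
        A = A * 3 ' no matching rule, stays a multiplication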
  6. Hi Preppie! Sorry for not replying earlier, I was very busy last week. Great that you solved it! The PMGRAPHICS command is modeled after the Altirra BASIC one: it places the P/M area *below* the current MEMTOP value. This means that there is free memory between the last player and the display list, as this code shows:

        GR. 0
        PMGR. 1
        ? DPEEK($2E5) - ( PMADR(3) + 256 )

     This shows 1055 bytes free in graphics 0, and 1953 in graphics 7, but only 53 bytes in graphics 8+16, 9, 10, 11 and 15+16. The memory below the missiles (so, below PMADR(-1)) is the new top of RAM, and will be used for your program's arrays and strings. A simple memory map of FastBasic RAM usage is:

        $2000:          Start of runtime - jump table
        bytecode_start: End of runtime, start of the program bytecode
        bytecode_end:   End of bytecode. The next bytes are free up to the next page; variables are always aligned to a page.
        var_page:       Page where variable data starts: 2 bytes for each integer, string and array variable, 6 bytes for each floating-point variable.
        array_ptr:      Start of array and string data - this area grows each time an array is DIMensioned or a new string is assigned.
        array_ptr:      End of array data (array_ptr holds the current end of the array data, which starts at the top and grows).
        ----------:     Free memory
        PGMBASE:        Base of player/missile memory
        ----------:     Free memory up to the display-list
        MEMTOP:         Top of application memory (set by the OS, depends on available RAM)

     Note that the runtime size depends on the interpreter (integer only or floating-point), and in the cross-compiled version only the used functionality is included, so the start of the bytecode is not predictable. Have Fun!
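     To see the effect of this on the memory left for your own program, a quick check could be the following (a minimal sketch, assuming the standard FRE() function, which reports the memory still available for arrays and strings):

        GR. 0
        ? FRE() ' free memory before reserving the P/M area
        PMGR. 1
        ? FRE() ' smaller now: the top of RAM moved down below PMADR(-1)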
  7. Hi Preppie! Thank you! And please, do tell if you have any suggestions for the documentation (or for FastBasic itself) to make it easier to port programs from TurboBasicXL.

     I would use the cross compiler plus a small batch script to run Altirra with the generated .XEX as an argument. I don't use Windows, but I understand that the NppExec plugin allows running arbitrary commands by just pressing F6 (or Ctrl-F6). So, you could simply run:

        C:\path\to\fb myprog.bas || exit /b %errorlevel%
        C:\path\to\Altirra myprog.xex

     Have fun!
  8. Hi! Don't worry, I can re-run the optimizations after you fix the rounding, I suspect the result will be about the same. Have Fun!
  9. Hi again! I fixed my code so that it produces the same results as the Altirra math-pack and re-ran the optimization for the EXP10 coefficients. This time, I minimized the maximum relative error and the mean relative error over small intervals; this is slower but produces stabler results. Also, I tried to get as many zeroes in the coefficients as possible. The best turned out to be this:

        .byte $3F, $01, $47, $00, $00, $00 ; 0.0147
        .byte $7E, $20, $30, $00, $00, $00 ; -0.002030
        .byte $3F, $09, $19, $68, $00, $00 ; 0.091968
        .byte $3F, $19, $21, $32, $00, $00 ; 0.192132
        .byte $3F, $54, $47, $30, $44, $00 ; 0.54473044
        .byte $40, $01, $17, $01, $83, $62 ; 1.17018362
        .byte $40, $02, $03, $47, $86, $04 ; 2.03478604
        .byte $40, $02, $65, $09, $44, $76 ; 2.65094476
        .byte $40, $02, $30, $25, $85, $14 ; 2.30258514
        .byte $40, $01, $00, $00, $00, $00 ; 1

     This is a plot of the mean interval error (the mean error over 1M numbers) with respect to the correctly rounded result, about 10% better than the original. Also, zooming in at the end of the interval, you can see that this gives much better results near 1.0, which makes EXP10 more stable around integer values.

     I tested the coefficients in your Altirra Extended Basic implementation: the average runtime for EXP10 went from 21347 to 20351 cycles, about 5% faster, and testing over 870000 values from 0.01 to 1.0, the mean relative error went from 2.6789621e-09 to 2.4933891e-09.

     Over the weekend I will try to optimize LOG10; for this I will need to implement division in my code. Have Fun!
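     For reference, reading the table from top to bottom as the coefficients of the highest-degree term down to the constant term (an assumption on my part about the evaluation order), the values above describe this degree-9 approximation over the [0, 1] interval:

        10^x ≈ 1 + 2.30258514·x + 2.65094476·x^2 + 2.03478604·x^3 + 1.17018362·x^4
                 + 0.54473044·x^5 + 0.192132·x^6 + 0.091968·x^7 - 0.002030·x^8 + 0.0147·x^9

     Note that the coefficient of x is essentially ln(10) = 2.302585..., and the coefficients sum to 10 within the displayed rounding, which is what the EXP10(1) = 10 constraint enforces.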
  10. Hi! I tried about 24 million adds/subs and 50 million multiplications; the only differing case was on normalization after add: here I was getting 101 as the result, but your code does not round up. Also, I discovered a bug in my SUB implementation: I was rounding on scaling down, but that does not work when the discarded part is exactly 0.5000... (on sub, you must round down; on add, round up), so your code was correct. Have Fun!
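     A made-up three-digit example of that halfway case (just to illustrate the rule, not the exact operands from my tests):

        add: 2.00E3 + 5.00E0 = 2005 -> dropped digit is exactly half an ULP, round up   -> 2.01E3
        sub: 2.00E3 - 5.00E0 = 1995 -> dropped digit is exactly half an ULP, round down -> 1.99E3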
  11. Hi! I tried brute-forcing the constants around a min-max approximation, minimizing the relative error, constraining EXP10(1)=10 and EXP10(0)=1 exactly, and searching for the most zeroes in the coefficients (to minimize runtime). Problem is, my code does not emulate exactly the rounding done in your math-pack, so the result is not 100% accurate, but good enough. My current best coefficients are (the first is always 1 because of the constraint EXP10(0)=1):

        2.65094503
        2.03478568
        1.17018241
        0.5447325000
        0.1921384000
        0.0919451600
       -0.002005300000
        0.0146910000

     With those, I got a maximum error of 9.99948e-09, which is one ULP at small values of X; this is the plot of the relative error. I also tried searching for sets of 9 coefficients (one less than current); their best error is 7*10^-8.
  12. Hi! Well, FastBasic cheats a little: by parsing the expression as an "integer power", it simply uses a multiplication tree to calculate the result, same as TurboBasicXL and Altirra Extended Basic for numbers less than 100 (see the sketch at the end of this post). You can test this code to see the difference, as we can force a floating point exponent:

        ? 2^128   ' Integer exponent
        ? 2^128.0 ' FP exponent

     Of course, if you use a better math pack (for example, the excellent Altirra Mathpack), you get the best results. I don't like the power optimization at runtime, because it can cause discontinuities in the function, and this leads to errors in algorithms and plots: the power function is not monotonic! And in TurboBasicXL it is worse. But I agree that, because you don't have an integer data type in Altirra Extended Basic, this optimization is the best possible. Have Fun!
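     As a rough sketch of what a multiplication tree for an integer power looks like (hypothetical FastBasic code, not the actual parser implementation), the base is squared while the exponent is halved at each step:

        ' Hypothetical: compute B^E by repeated squaring, integer arithmetic only
        PROC IPOW
          R = 1
          WHILE E > 0
            IF E & 1
              R = R * B
            ENDIF
            B = B * B
            E = E / 2
          WEND
        ENDPROC

        B = 2 : E = 10 : EXEC IPOW
        ? R ' prints 1024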
  13. Hi! That is not correct. A proper implementation of binary floating point routines will always print a representation that is equal to or shorter than the original. But yes, the implementation could have bugs. The problems in floating point are more subtle, like this case on the C64:
  14. Hi! Well, if you need floating point arrays in FastBasic, just ask! Attached is a new beta (testing) version, compiled from the current sources, adding floating point arrays (a small usage sketch is below). This version has many changes from the currently released 4.0; I should do a proper release soon. With this version, and using the "AtariOS-800XE-Rev03-FastMath.rom" ROM that you posted in another thread, the timings are:

     The benchmark source is in the ATR image; the changes are like this (line 24):

     Have Fun! fastbasic-4.1-beta2.atr
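     As a minimal usage sketch of the new floating point arrays (hypothetical code; in FastBasic, names ending in "%" are the floating point variables, and I'm assuming the new arrays are DIMensioned just like the integer ones, with integers promoted to FP inside the expressions):

        DIM A%(10)        ' floating point array: 11 elements, indices 0 to 10
        FOR I = 0 TO 10
          A%(I) = 0.5 * I ' stores 0, 0.5, 1, ... 5
        NEXT
        ? A%(10)          ' prints 5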
  15. Hi! Thanks. I'm not near the PC now, but you must use the following command line:

        lzss -6 input.rsap test.lz12

     This is the same as:

        lzss -b 16 -o 8 -m 1 input.rsap test.lz12