tebe (Author) Posted June 20, 2020
spiral2 (additional optimization)
three possible ways to optimise (slow, medium, fast)

SLOW

    perlin_l : array [0..0] of byte;
    perlin_h : array [0..0] of byte;
    vram: PByte;
    zp: PByte;

    ; p := perlin_h[zp[k] + time];
        lda ZP
        add K
        tay
        lda ZP+1
        adc #$00
        sta :bp+1
        lda (:bp),y
        add TIME
        sta :STACKORIGIN+9
        lda #$00
        adc #$00
        sta :STACKORIGIN+STACKWIDTH+9
        lda PERLIN_H
        add :STACKORIGIN+9
        tay
        lda PERLIN_H+1
        adc :STACKORIGIN+STACKWIDTH+9
        sta :bp+1
        lda (:bp),y
        sta P

    ; vram[0] := perlin_l[zp[k] + time] or p;
        lda ZP
        add K
        tay
        lda ZP+1
        adc #$00
        sta :bp+1
        lda (:bp),y
        add TIME
        sta :STACKORIGIN+10
        lda #$00
        adc #$00
        sta :STACKORIGIN+STACKWIDTH+10
        lda PERLIN_L
        add :STACKORIGIN+10
        tay
        lda PERLIN_L+1
        adc :STACKORIGIN+STACKWIDTH+10
        sta :bp+1
        lda (:bp),y
        ora P
        mvy VRAM+1 :bp+1
        ldy VRAM
        sta (:bp),y

MEDIUM

    perlin_l : array [0..255] of byte;
    perlin_h : array [0..255] of byte;
    vram: PByte;
    zp: PByte;

    ; p := perlin_h[zp[k] + time];
        lda ZP
        add K
        tay
        lda ZP+1
        adc #$00
        sta :bp+1
        lda (:bp),y
        add TIME
        tay
        lda adr.PERLIN_H,y
        sta P

    ; vram[0] := perlin_l[zp[k] + time] or p;
        lda ZP
        add K
        tay
        lda ZP+1
        adc #$00
        sta :bp+1
        lda (:bp),y
        add TIME
        tay
        lda adr.PERLIN_L,y
        ora P
        mvy VRAM+1 :bp+1
        ldy VRAM
        sta (:bp),y

FAST

    perlin_l : array [0..255] of byte;
    perlin_h : array [0..255] of byte;
    vram: PByte absolute $e0;
    zp: PByte absolute $e2;

    ; p := perlin_h[zp[k] + time];
        ldy K
        lda (ZP),y
        add TIME
        tay
        lda adr.PERLIN_H,y
        sta P

    ; vram[0] := perlin_l[zp[k] + time] or p;
        ldy K
        lda (ZP),y
        add TIME
        tay
        lda adr.PERLIN_L,y
        ora P
        ldy #$00
        sta (VRAM),y

Attachments: spiral2.obx, spiral2.pas
Edited June 20, 2020 by tebe

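One detail visible in the listings: with array [0..0] the sum zp[k] + time is kept as a 16-bit value (the carry is stored in a high byte), while with array [0..255] the sum goes straight into the 8-bit Y register and wraps. A minimal C sketch of only that index arithmetic follows; the function names are hypothetical and this is a model of the generated code, not Mad Pascal itself.

```c
#include <stdint.h>

/* Index as computed in the SLOW listing: zp[k] + time is kept as a
   16-bit value (low byte plus the carry in a high byte). */
static uint16_t index_16bit(uint8_t zp_k, uint8_t time)
{
    return (uint16_t)(zp_k + time);
}

/* Index as computed in the MEDIUM and FAST listings: the sum is moved
   into the 8-bit Y register (tay), so it wraps modulo 256. */
static uint8_t index_8bit(uint8_t zp_k, uint8_t time)
{
    return (uint8_t)(zp_k + time);
}
```

For sums up to 255 both forms select the same element; above that, the 8-bit form wraps around inside the 256-byte table, which is presumably what a noise table lookup wants anyway.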
tebe (Author) Posted June 20, 2020
other tips, STRINGS

    writeln('john has a cat');
    writeln('john has a dog');

these strings are stored in memory as

    $0e,'john has a cat',$00
    $0e,'john has a dog',$00

SHORTER

    writeln('john ','has ','a ','cat');
    writeln('john ','has ','a ','dog');

these strings are stored in memory as

    $05,'john ',$00, $04,'has ',$00, $02,'a ',$00, $03,'cat',$00
    $03,'dog',$00

Edited June 20, 2020 by tebe

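The saving can be checked by counting bytes. The sketch below assumes exactly the layout shown in the post (one length byte, the characters, one terminating $00) and that each distinct constant is stored once; the helper names are hypothetical. It counts string data only; any extra code for the additional print calls is not covered by the post and is not modelled here.

```c
#include <stddef.h>
#include <string.h>

/* Bytes used by one stored string constant, assuming the layout from
   the post: length byte + characters + terminating $00. */
static size_t stored_size(const char *s)
{
    return 1 + strlen(s) + 1;
}

/* Total bytes for a list of distinct string constants. */
static size_t total_size(const char *const *strings, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; ++i)
        total += stored_size(strings[i]);
    return total;
}
```

Under these assumptions the two whole sentences take 32 bytes, while the five shared fragments ('john ', 'has ', 'a ', 'cat', 'dog') take 27.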
tebe (Author) Posted June 20, 2020
other tips, BOOLEAN = TRUE

    if SKIP = true then writeln('ok');

        lda SKIP
        cmp #$01
        jne l_0077
        @printSTRING #CODEORIGIN+$0007
        @printEOL
    l_0077

SHORTER

    if SKIP then writeln('ok');

        lda SKIP
        jeq l_0071
        @printSTRING #CODEORIGIN+$0007
        @printEOL
    l_0071

Edited June 20, 2020 by tebe

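Going only by the two listings, the forms differ in what they test: the first compares the byte against 1 (cmp #$01), the second only checks for zero (jeq). A small C model of the two branch conditions, with hypothetical function names, makes the difference explicit. It assumes nothing about how Mad Pascal normally stores booleans beyond what the listings show.

```c
#include <stdint.h>

/* Model of "if SKIP = true": the branch is taken only when the byte
   equals 1 (cmp #$01 / jne). */
static int taken_compare_true(uint8_t skip)
{
    return skip == 1;
}

/* Model of "if SKIP": the branch is taken for any non-zero byte (jeq
   skips the body only when the value is 0). */
static int taken_plain(uint8_t skip)
{
    return skip != 0;
}
```

For the values 0 and 1 both forms agree, so the shorter one is a drop-in replacement for a properly set boolean; they would only disagree if the byte held some other value.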
zbyti Posted June 20, 2020
@tebe It's nice that you show the generated ASM for the code; it helps us choose the right structure depending on our needs. Please keep doing so in the future :]
Edited June 20, 2020 by zbyti

zbyti Posted July 18, 2020
Any C expert here? Less is better :]

Mad Pascal 942 frames
CC65 1728 frames

    cl65 -t atari -Osir -Cl --add-source -o millionare.xex millionare.c

Attachments: 2ml-for-downto.pas, millionare.c, millionare.xex, 2ml-for-downto.xex

zbyti Posted July 18, 2020

    register signed char a,b,c,d,e,f,g;
    for (g = 9; g >= 0; --g){}

1539 frames

Attachments: millionare.c

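For reference, the full seven-level countdown that this fragment belongs to can be sketched in plain C as below. This is an assumption about the shape of millionare.c based on the Pascal version in the thread, not the attached source; the iteration counter is added here only so the loop count can be checked, and the function name is hypothetical.

```c
/* Nested decimal countdown from 1 999 999 to 0, one loop per digit.
   The counter is not part of the benchmark; it only verifies the
   number of innermost iterations. */
static unsigned long countdown_iterations(void)
{
    signed char a, b, c, d, e, f, g;
    unsigned long count = 0;

    for (a = 1; a >= 0; --a)
        for (b = 9; b >= 0; --b)
            for (c = 9; c >= 0; --c)
                for (d = 9; d >= 0; --d)
                    for (e = 9; e >= 0; --e)
                        for (f = 9; f >= 0; --f)
                            for (g = 9; g >= 0; --g)
                                ++count;
    return count;
}
```

The signed type matters here: the test g >= 0 only becomes false because g can reach -1.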
devwebcl Posted July 18, 2020
--my bad--
Edited July 18, 2020 by devwebcl

ivop Posted July 18, 2020
If you change it to unsigned and change the test to <=9, it will probably be faster. But a proper C compiler would turn this into one big NOP, like gcc-6502 -O3, unless you declare the variables volatile. CC65 does not seem to distinguish between volatile and non-volatile.

But I agree with @devwebcl, this belongs in another thread.

On topic, it's great that Mad Pascal compiles a useless loop way more efficiently. If I'm not mistaken, the new incarnation of Effectus (Action compiler) uses Mad Pascal as a backend (i.e. it translates Action to Pascal) and it surpasses the original Action compiler.

Does Pascal have a way to indicate whether a variable is volatile or not?

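The unsigned variant suggested above relies on wrap-around: after 0 the counter becomes 255, so the test <= 9 fails and the loop ends. A minimal sketch of one such loop, with a hypothetical function name and an iteration counter added only for checking:

```c
#include <stdint.h>

/* Unsigned countdown from 9 to 0. Decrementing 0 wraps to 255, which
   is no longer <= 9, so the loop stops after ten iterations. */
static unsigned count_unsigned_loop(void)
{
    volatile uint8_t g;   /* volatile forces real loads and stores of g */
    unsigned iterations = 0;

    for (g = 9; g <= 9; --g)
        ++iterations;
    return iterations;
}
```

Whether this is actually faster under cc65 is a separate question; the sketch only shows that the loop is equivalent in iteration count.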
zbyti Posted July 18, 2020
@ivop Thanks for the answer. Because the algorithm counts down from 1 999 999 to 0, the for loop has to stay the way it is.

Quote: "On topic, it's great that Mad Pascal compiles a useless loop way more efficient."
Not really; I run simple performance tests of various languages on the 6502. Check the Mad Pascal xex.

----------------------
I noticed that the stack is working intensively while the CC65 build is running.
----------------------

Quote: "If you change it to unsigned and change the test to <=9, it will probably faster."
Nope... It's 1728 frames.

Edited July 18, 2020 by zbyti (useless loop)

+Stephen Posted July 18, 2020
1 hour ago, zbyti said: "I noticed that the stack is working intensively while the CC65 is running."

C is inherently a stack-heavy language, which is why I've always questioned its use on a register-starved 6502 with a fixed 256-byte stack. To be fair though, once I started coding C or C++ it was in the 486 days, and it's very hard to go back after starting that way. To this day, I prototype on the Atari in BASIC and when I need speed I just go right to 6502. No compiler to get in my way, no behind-the-scenes magic to worry about. Probably also why I never finish anything.

tebe (Author) Posted July 18, 2020

    for a:=1 downto 0 do
     for b:=9 downto 0 do
      for c:=9 downto 0 do
       for d:=9 downto 0 do
        for e:=9 downto 0 do
         for f:=9 downto 0 do
          for g:=9 downto 0 do ;

without a redundant "begin" / "end" in this case

Edited July 18, 2020 by tebe

zbyti Posted July 18, 2020
To be fair, I haven't yet written a single-row DL (display list) for the CC65 version like I did for the other languages, so as far as I understand ANTIC steals more cycles there. Once I determine the best version of the FOR loop for CC65, I will create the DL.
Edited July 18, 2020 by zbyti (typo)

zbyti Posted July 18, 2020
2 hours ago, Stephen said: "C is inherently a stack heavy language which is why I've always questioned its use on a register starved 6502 with fixed 256 byte stack."

I was wrong, the stack on page one seems to work as usual.

Instead of writing a DL, I'll turn off ANTIC and repeat the tests.

-----------------------
Countdown from 1 999 999 to 0, FOR loop, ANTIC OFF

MadPascal: 946 frames - 18,92 sec.
CC65: 1131 frames - 22,62 sec.

Edited July 18, 2020 by zbyti (ANTIC OFF)

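The frame counts and the times above are related by the PAL frame rate of 50 frames per second (946 / 50 = 18,92). A small integer-only C helper for that conversion, with hypothetical names:

```c
/* PAL frame rate implied by the figures in the post
   (946 frames = 18,92 s). */
#define PAL_FRAMES_PER_SECOND 50UL

/* Convert a frame count to hundredths of a second using integer
   arithmetic only. */
static unsigned long frames_to_centiseconds(unsigned long frames)
{
    return frames * 100UL / PAL_FRAMES_PER_SECOND;
}
```

With it, 946 frames gives 1892 hundredths (18,92 s) and 1131 frames gives 2262 (22,62 s), matching the posted results.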
zbyti Posted July 19, 2020
Mad Pascal Compiler version 1.6.4 [2020/07/19] for 6502

NEW! Mad Pascal: 900 PAL frames = 18 seconds

Attachments: 2ml-for-downto.pas, 2ml-for-downto.xex
Edited July 19, 2020 by zbyti (animated gif)

ivop Posted July 19, 2020
22 hours ago, zbyti said: "CC65: 1131 frames - 22,62 sec."

Do you print with printf() or puts()?

edit: perhaps it would be better not to print the countdown at all. Just disable ANTIC, record the time, turn the screen back on, and print the result. This eliminates any inefficiencies that might be present in the library functions of either language. It looks like you want to test the speed of nested loops and not that of the standard libraries.

Edited July 19, 2020 by ivop

zbyti Posted July 19, 2020
@ivop You can find the C source in post #406. But showing the countdown was part of the task, and it is displayed via the DL directly from memory; no print function was used. I did it like this:
Edited July 19, 2020 by zbyti (like that)

zbyti Posted July 19, 2020
The same benchmark, but this time it is the IF version.

844 PAL frames = 16,88 seconds

Attachments: 2m-if.pas, 2m-if.xex

zbyti Posted July 19, 2020
The same benchmark, but this time it is the WHILE version.

865 PAL frames = 17,3 seconds

Attachments: 2ml-while.pas, 2ml-while.xex
Edited July 19, 2020 by zbyti

zbyti Posted July 19, 2020
The same benchmark, but this time it is the WHILE version with the "G" loop done in ML.

583 PAL frames = 11,66 seconds

Attachments: 2ml-plus-ML.pas, 2ml-plus-ML.xex
Edited July 19, 2020 by zbyti

zbyti Posted July 19, 2020
The same benchmark, but this time it is the FOR version with the "G" loop done in ML.

571 PAL frames = 11,42 seconds

Attachments: 2ml-for-ml.xex, 2ml-for-ml.pas
Edited July 20, 2020 by zbyti

zbyti Posted July 20, 2020
Time for a little clarification, because @mono pointed it out.

The program with the inline ML doesn't do 2 000 000 iterations. My goal was only to see the correct countdown on the screen, which lets us use the digits 0 to 9 directly and end the loop on reaching 0.

The ASM code built on this principle by @mono executes in 6,22 seconds!

But unrolling the "G" loop from a BNE loop into 9x DEC lets us get down to 8,14 seconds in Mad Pascal.

Attachments: 2ml-for-ml.pas, 2ml-for-ml.xex, millionzch9.asx, millionzch9.obx

------------------
EDIT: I confused the threads, sorry, but let this post stay.

Edited July 20, 2020 by zbyti (confused the threads)

zbyti Posted July 24, 2020
Action! defeated by Mad Pascal in the table sort competition.

Action! 151 frames
Mad Pascal 112 frames

Attachments: bsort.xex, bsort.pas
Edited July 24, 2020 by zbyti (refactored bsort.pas)

zbyti Posted July 24, 2020
Mad Pascal Benchmark Suite on GitHub
Edited July 25, 2020 by zbyti (attachments removed)

zbyti Posted July 25, 2020
Final version committed on GitHub.

Dedication for CC65 developers:

Attachments: suite.xex
Edited July 26, 2020 by zbyti (Dedication ;))

zbyti Posted July 26, 2020
If anyone is tempted to write a performance test for Mad Pascal, it's enough to use this unit as a template and implement a new benchmark procedure:

    unit countdown_for;

    {$i '../inc/header.inc'}

    //---------------------- IMPLEMENTATION ----------------------------------------

    procedure benchmark;
    var
      za : byte absolute counter.lms + $21;
      zb : byte absolute counter.lms + $22;
      zc : byte absolute counter.lms + $23;
      zd : byte absolute counter.lms + $24;
      ze : byte absolute counter.lms + $25;
      zf : byte absolute counter.lms + $26;
      zg : byte absolute counter.lms + $27;
    begin
      for za := 1 downto 0 do
        for zb := 9 downto 0 do
          for zc := 9 downto 0 do
            for zd := 9 downto 0 do
              for ze := 9 downto 0 do
                for zf := 9 downto 0 do
                  for zg := 9 downto 0 do;
    end;

    {$i '../inc/footer.inc'}

    //---------------------- INITIALIZATION ----------------------------------------

    initialization
      name := 'Countdown 2ML: FOR'~;
      isRewritable := false;
    end.

    // add to suite.pas uses
    new_unit_benchmark;

    // add to startRunners procedure
    new_unit_benchmark.run;

Edited July 27, 2020 by zbyti (update snippets)