intvnut (Author) Posted November 1, 2018 (edited)

Yikes, OK. I hadn't realized level 8 had that property. *d'oh* If I had once known that, I'd forgotten it.

I re-ran at level 6 and profiled around 2 hours of play. It looks like the EXEC consumes about 2.7% of the total cycles.

I wrote a quick-and-dirty Perl script to total up the time spent in the EXEC (cycles spent in $1000 - $1FFF) vs. time spent in the game (cycles spent in addresses above $1FFF). It reported:

    Exec cyc: 187689134
    Game cyc: 6663889944

Dividing the EXEC cycles by the combined total gives 2.7%.

Attachment: chess-6-dump-hst.txt

Edited November 1, 2018 by intvnut
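The tally described above can be sketched in a few lines of Python. This is not the original Perl script, and the histogram layout is an assumption: the format of the .hst dump is not shown in the thread, so the sketch takes a plain mapping from address to cycle count. The names `split_cycles` and `exec_share` are hypothetical.

```python
# Address range attributed to the EXEC in the post ($1000 - $1FFF).
EXEC_START, EXEC_END = 0x1000, 0x1FFF


def split_cycles(histogram):
    """Total cycles per region from a {address: cycles} mapping.

    The mapping is a stand-in for the profiler dump, whose real
    format is not shown in the thread (assumption).
    """
    exec_cyc = game_cyc = 0
    for addr, cycles in histogram.items():
        if EXEC_START <= addr <= EXEC_END:
            exec_cyc += cycles
        elif addr > EXEC_END:
            game_cyc += cycles
        # Addresses below $1000 are ignored, as in the post.
    return exec_cyc, game_cyc


def exec_share(exec_cyc, game_cyc):
    """Fraction of the combined cycles that were spent in the EXEC."""
    return exec_cyc / (exec_cyc + game_cyc)


# Totals reported in the post.
print(f"{exec_share(187689134, 6663889944):.1%}")
```

With the two totals from the post, the share rounds to 2.7%, matching the quoted figure.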
mr_me Posted November 2, 2018

I'm guessing that with typical EXEC-based cartridges, the EXEC uses a lot more than 3% of the processing time. With Chess, does that happen naturally because there are no sounds and no moving objects, or did the Chess programmers have to work around the EXEC? What would the EXEC processing time be for a simple ECS BASIC test program?
intvnut (Author) Posted November 3, 2018

Chess installs its own ISR to handle display blanking, the chess clock, and "FORCE MOVE". It chains back to the EXEC ISR, I believe, to handle animation and sound. There is some animation and sound, basically at the start and end of a move.
mr_me Posted November 3, 2018

Yeah, but with 12 moves in over two hours, there are no sounds and no animation other than the clock 99% of the time. So even though it hooks back to the EXEC sprite routines, they aren't taking much CPU time. If it's similar, an ECS BASIC program should have low EXEC usage as well.
intvnut (Author) Posted November 3, 2018 (edited)

It's harder to tell with ECS BASIC than it is with Chess. Because ECS BASIC primitives synchronize with the EXEC, I believe there are delays in the ECS BASIC interpreter where it waits to sync with the EXEC, whether or not it needs to. This isn't time spent in the EXEC, but it is time spent as a consequence of the EXEC. The lines are a bit blurrier between ECS BASIC and the EXEC.

EDIT: FWIW, I did run my silly Perl script on the dump.hst and dump2.hst I posted earlier from my 162 and 203 second runs. For dump.hst, 23% of the cycles were spent in the EXEC. For my optimized version, only 19% of cycles were spent in the EXEC. Clearly there's a bit of non-linear behavior here, as I did not optimize the interaction between the code and the EXEC.

Edited November 3, 2018 by intvnut
+DZ-Jay Posted November 4, 2018

Is the ECS BASIC really just blocked waiting on the EXEC? I thought that it was synchronized with the EXEC in that the EXEC is the one which "ticks" the interpreter engine. In that way, the ECS BASIC is just another "game" running off the EXEC in EXEC time. Consequently, keys are read at 20 Hz, and each statement is executed on an EXEC "tick" boundary (every 1/20th of a second).

Perhaps this is what you meant, but I guess I make a distinction between something like an IntyBASIC program, which runs in its own game loop and then has to "WAIT" for the ISR to synchronize, and an EXEC program, which is just a bunch of subroutines triggered by the EXEC itself as it chugs along in its game engine loop. I always thought the ECS BASIC was the latter.

-dZ.
intvnut (Author) Posted November 4, 2018 (edited)

ECS BASIC is not implemented as an EXEC "process", in terms of its timer-driven process table. The ECS BASIC interpreter loop does, however, synchronize with the EXEC's 20Hz phase counter, blocking the "RUN" loop from progressing. The top of the keyword interpretation loop has this:

    ; main interpreter outer loop during 'RUN'
    L_2E25: MVI  G_0102, R0     ; 2E25  Get current EXEC phase
            CMPI #$0002, R0     ; 2E27  Is it phase 2 or higher?
            BGE  L_2E2C         ; 2E29  Proceed with execution
            PULR R7             ; 2E2B  Otherwise, don't.

When you launch BASIC in the fastest execution mode (it is actually sensitive to the "slow down" mode selected by pressing 1, 2, or 3 instead of DISC, if memory serves), it watches the EXEC variable at $102 to determine which part of the 20Hz cycle it's in. It ordinarily counts down 2, 1, 0, 2, 1, 0. (In slower modes, it counts down from a higher number, which is how it achieves its slow-down.)

Here's a trace of $102 from an EXEC-based game (Astrosmash). You'll note it briefly takes on the value 3, but that's in anticipation of it getting decremented. It's a 20Hz cadence if you do the math on the cycle counts. ECS BASIC would only see the 2, 1, and 0, if I'm not mistaken.

    WR a=$0102 d=0003 CP-1610 (PC = $1097) t=5247326
    WR a=$0102 d=0002 CP-1610 (PC = $1130) t=5259726
    WR a=$0102 d=0001 CP-1610 (PC = $1130) t=5274666
    WR a=$0102 d=0000 CP-1610 (PC = $1130) t=5289600
    WR a=$0102 d=0003 CP-1610 (PC = $1097) t=5291985
    WR a=$0102 d=0002 CP-1610 (PC = $1130) t=5304533
    WR a=$0102 d=0001 CP-1610 (PC = $1130) t=5319468
    WR a=$0102 d=0000 CP-1610 (PC = $1130) t=5334402
    WR a=$0102 d=0003 CP-1610 (PC = $1097) t=5336693

So, the run loop is literally synchronizing with the EXEC by watching an EXEC state variable and busy-waiting. If an ECS BASIC statement takes less than 1/20th of a second to execute, it gets rounded up to 1/20th of a second. If an ECS BASIC statement takes longer than 1/20th of a second to execute, its execution time effectively gets rounded up to the next 20Hz boundary. Every statement ultimately seems to take some multiple of 1/20th of a second thanks to this busy-wait.

That's why I say that the execution penalty incurred due to the EXEC doesn't reside entirely inside the EXEC. Much of it is in the busy-wait outside the EXEC in the main interpreter loop, in this synchronization point that waits for the EXEC to get into one of the 3 phases of its 20Hz cycle.

Ironically, it appears you can speed up ECS BASIC by selecting a slower EXEC speed (pressing 1, 2, or 3 at the menu). I'll have to experiment with that later.

Edited November 4, 2018 by intvnut
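"Doing the math on the cycle counts" can be sketched as follows. The sketch assumes that the t= field counts CPU cycles and that the CP-1610 in an NTSC Intellivision runs at about 894,886 Hz (3.579545 MHz / 4); neither is stated in the post. The helper names are hypothetical. It measures the spacing between successive writes of the value 3, which mark the start of each EXEC cycle in the trace.

```python
import re

# Trace copied from the post above.
TRACE = """\
WR a=$0102 d=0003 CP-1610 (PC = $1097) t=5247326
WR a=$0102 d=0002 CP-1610 (PC = $1130) t=5259726
WR a=$0102 d=0001 CP-1610 (PC = $1130) t=5274666
WR a=$0102 d=0000 CP-1610 (PC = $1130) t=5289600
WR a=$0102 d=0003 CP-1610 (PC = $1097) t=5291985
WR a=$0102 d=0002 CP-1610 (PC = $1130) t=5304533
WR a=$0102 d=0001 CP-1610 (PC = $1130) t=5319468
WR a=$0102 d=0000 CP-1610 (PC = $1130) t=5334402
WR a=$0102 d=0003 CP-1610 (PC = $1097) t=5336693
"""

CPU_HZ = 894_886  # assumed NTSC CP-1610 clock, not taken from the post

WRITE_RE = re.compile(r"WR a=\$([0-9A-Fa-f]{4}) d=([0-9A-Fa-f]{4}) .*t=(\d+)")


def reset_times(trace, addr=0x0102, reset_value=3):
    """Timestamps of the writes that reload the phase counter."""
    times = []
    for line in trace.splitlines():
        m = WRITE_RE.match(line.strip())
        if m and int(m.group(1), 16) == addr and int(m.group(2), 16) == reset_value:
            times.append(int(m.group(3)))
    return times


def mean_rate_hz(times, cpu_hz=CPU_HZ):
    """Average repetition rate implied by consecutive timestamps."""
    periods = [b - a for a, b in zip(times, times[1:])]
    return cpu_hz / (sum(periods) / len(periods))


times = reset_times(TRACE)
print(times, round(mean_rate_hz(times), 2))
```

Under those assumptions the reloads are a little under 45,000 cycles apart, which comes out at roughly 20 Hz.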
+DZ-Jay Posted November 4, 2018 (edited)

Gotcha! So essentially, all statements are executed on an EXEC "tick" boundary (20 Hz), but that's accomplished by synchronizing the ECS BASIC "RUN" loop with a busy-wait. That's interesting.

I guess it is one way to avoid putting the onus on users to synchronize themselves (as IntyBASIC does) when using ISR-sensitive features (like sprite updates or color modes). It's the sort of thing I would have done with P-Machinery in order to simplify the programming model for users.

The problem is that the EXEC is already running at 20 Hz and the BASIC interpreter is a bit slow, so that is one heck of a penalty.

Thanks for the details.

-dZ.

Edited November 4, 2018 by DZ-Jay
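The gate at $2E25 can be modelled roughly as below. This is a simplified sketch based only on the description in this exchange: the phase counter at $102 counts down once per display frame from N-1 to 0 (N = 3 in the fastest mode, higher in the slow-down modes), and the RUN loop proceeds only while the phase is 2 or higher. The transient value 3 seen in the trace is ignored, and the function names are hypothetical.

```python
from fractions import Fraction


def runnable_frames(n_phases):
    """Frames per EXEC cycle in which the gate (phase >= 2) is open.

    Assumes the counter steps n_phases-1, ..., 1, 0, one step per frame.
    """
    phases = range(n_phases - 1, -1, -1)  # e.g. 2, 1, 0 for n_phases = 3
    return sum(1 for phase in phases if phase >= 2)


def runnable_fraction(n_phases):
    """Share of display frames left for the BASIC interpreter."""
    return Fraction(runnable_frames(n_phases), n_phases)


for n in (3, 4, 6):
    print(n, runnable_fraction(n))
```

In this model the interpreter gets 1 frame in 3 at the default setting and a larger share as N grows, which is in line with the observation that a "slower" EXEC setting might actually speed BASIC up. Whether real hardware behaves this way is something the post leaves for a later experiment.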
carlsson Posted February 15, 2021

On 10/30/2018 at 11:08 AM, carlsson said:
> It concludes that ECS BASIC at least:
> * Offers one of the lowest possible text resolutions on the market at the time, possibly adequate 2-3 years earlier but not when it was released
> * Runs 5 times slower than the slowest competitor on the market, and we shouldn't even bother comparing it to a proper home/business computer

By the way, I believe I have found a new candidate for the slowest BASIC. One that never was manufactured but still somehow exists!

Many of you are familiar with the RCA Studio II, one of the first video games to use ROM cartridges, in low res and B&W. For instance, @decle previously simulated it running natively on the Intellivision.

Now, contrary to what most people think, RCA didn't stop with the Studio II. Instead they developed the colourized Studio III and a few new games to go with it (including an infamous, mostly unlicensed cash-in on Star Wars), but RCA never manufactured those themselves. Instead they licensed the new console and the games to Far East firms like Conic/Sheen as well as Hanimex/Soundic Victory MPT-02. These systems were previously known as clones of the Studio II, but all evidence hints that RCA was licensing its tech to them.

Just like every other manufacturer (Mattel included), RCA was planning another 2 generations ahead, with the Studio IV that would verge more towards the home computer side, and even ideas for a Studio V system far on the horizon! Alas, none of these ever got produced, but they remain as documents, pseudo code etc. The 1802 CPU instead lived on through the other arm, the Cosmac VIP and Elf, eventually Telmac, Comx, Pecom and so on (and the space shuttle).

Now, thanks to combined efforts, it has been possible to reverse engineer how the Studio IV would have worked, and run it in the Emma02 emulator. Needless to say, very little or no software exists for this never-manufactured console. However, it was planned to have Tiny BASIC, and thanks to combining efforts from the Cosmac VIP with documents about the Studio IV, the emulator is now capable of running an emulated Tiny BASIC environment.

It is indeed a tiny environment. No FOR loops (??), LET is required for every variable assignment, and it has a colon operator but doesn't support more than one instruction per line anyway. Thus, in order to run @intvnut's benchmark program, I had to rewrite it slightly:

    10 LET I=1
    20 LET A=A+I
    30 LET A=A*I
    35 LET A=A/I
    40 LET I=I+1
    45 IF I<1000 THEN 20
    50 PRINT A

We must remember that this program took 322 seconds on original ECS BASIC and 253 seconds on the optimized one. The Comx BASIC took 72 seconds and the CreatiVision about 75-80 seconds.

So what does this emulated RCA Studio IV clock in at?

Roughly 16 minutes and 35 seconds = 995 seconds! That is just over 3 times as much time as the ECS BASIC and almost 14 times longer than the Comx BASIC, which among micros is considered to be slow. Actually, at first I ran the benchmark without the division on line 35, which then took about 760 seconds, or 3.5 times as long as ECS BASIC.

Also, the emulator outputs a final value of A = -29, which I suppose is the result of multiple overflows and other funky business, but we didn't look for accuracy here, we looked for speed!

Now I'll dig into which other features this BASIC has, and whether it is possible to use it for anything meaningful. I'm still considering checking the VideoBrain (using a Fairchild F8) and possibly the Interact / Victor Lambda to see if either of these two is also slower than ECS BASIC. But yay! We found a system (although never manufactured, and in theory intended to be on the market at least 3-4 years before the ECS if RCA had not dropped it all) that at least through emulation is much slower.
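For reference, here is a line-by-line Python port of the rewritten benchmark, using unbounded integers so that nothing overflows. It is only a sketch: it assumes A starts at zero (the listing never initializes it) and that the division on line 35 is an integer division, which makes no difference here because A*I is always divisible by I. The timing constants are the figures quoted in the post.

```python
def run_benchmark():
    """Port of the Tiny BASIC listing with unbounded integers."""
    a = 0              # assumption: an unset variable reads as 0
    i = 1              # 10 LET I=1
    while True:
        a = a + i      # 20 LET A=A+I
        a = a * i      # 30 LET A=A*I
        a = a // i     # 35 LET A=A/I
        i = i + 1      # 40 LET I=I+1
        if not i < 1000:   # 45 IF I<1000 THEN 20
            break
    return a           # 50 PRINT A


# Run times in seconds as reported in the post.
TIMES = {"Studio IV Tiny BASIC": 995, "ECS BASIC": 322, "Comx BASIC": 72}

print(run_benchmark())
for name in ("ECS BASIC", "Comx BASIC"):
    ratio = TIMES["Studio IV Tiny BASIC"] / TIMES[name]
    print(f"Studio IV vs {name}: {ratio:.1f}x")
```

With exact arithmetic the loop body runs for I = 1 to 999 and A ends up as the sum of those values, 499500. That does not fit in a 16-bit signed integer, so some overflow on a small Tiny BASIC is plausible, though this sketch does not try to reproduce the emulator's -29.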
+DZ-Jay Posted February 15, 2021

Yikes! That is slow! Still, if it was never manufactured, I guess the ECS still holds the crown for slowest BASIC ever. Go Mattel!
carlsson Posted February 15, 2021 (edited)

Still inside the Emma02 emulator, I also tested the Cosmac VIP II, which should be a fairly similar animal except that its 1802 is clocked at half the frequency (1.79 MHz vs 3.58 MHz). Imagine my surprise when the built-in BASIC not only supports FOR and works without LET, but runs the above program in roughly 190 seconds. That is 60% of the execution time of ECS BASIC, or just below 20% of the execution time on the Studio IV I just benchmarked!

Probably the two BASIC implementations are quite different under the hood, in particular if the VIP II runs on half the clock frequency but is five times faster...

Edit: Also, the F8-powered VideoBrain is out of the competition because it doesn't have BASIC; it came with APL/S as its programming language!

Edit 2: The Hector 2HR+ running BASIC III executes this program in about 6 seconds?!? That is twice as fast as the Apple II. Now, this is a far newer machine than the Interact, so I'm trying to get that one to run as well.

Edit 3: I got Edu-BASIC running on the Interact. It is another Tiny BASIC that limits integers to the range -32768 to +32767, meaning that once we get to iteration 40 and try to multiply 820 by 40, we get an overflow since 820*40 = 32800. This is the first case where I have got stuck on this, meaning I have to look up Microsoft BASIC Level II instead.

Edit 4: I found Microsoft BASIC 4.7, which loads, but as soon as I try to type in a program line, the system resets. Direct mode is OK.

Edited February 15, 2021 by carlsson
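The Edu-BASIC overflow in Edit 3 is easy to check with a small sketch. It replays the benchmark loop with ordinary Python integers and reports the first iteration whose multiplication would exceed the 16-bit signed maximum. It assumes A starts at zero and that only the product on line 30 needs checking; `first_overflow` is a hypothetical helper name.

```python
INT16_MAX = 32767  # upper limit quoted for Edu-BASIC integers


def first_overflow(limit=INT16_MAX, last_i=999):
    """Return (iteration, A before multiply, product) at the first overflow.

    Returns None if no product exceeds the limit.
    """
    a = 0  # assumption: A starts at zero
    for i in range(1, last_i + 1):
        a += i                 # 20 LET A=A+I
        product = a * i        # 30 LET A=A*I
        if product > limit:
            return i, a, product
        a = product // i       # 35 LET A=A/I
    return None


print(first_overflow())
```

The loop reaches A = 820 at iteration 40, and 820 * 40 = 32800 is the first product above 32767, which agrees with the figures in the post.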
intvnut (Author) Posted February 16, 2021

14 hours ago, carlsson said:
> Probably the two BASIC implementations are quite different under the hood, in particular if the VIP II runs on half the clock frequency but is five times faster...

Indeed. To get that degree of slowdown, it's either reparsing every time, or it has unnecessary VBlank sync (like ECS BASIC) or some other fundamental deep wrongness.
carlsson Posted February 16, 2021 (edited)

The emulator has a few different versions of BASIC. I tried a much shorter loop of only 10 iterations.

* 1978 version: about 6 seconds (in both PAL and NTSC modes)
* 2020 version NTSC: about 6 seconds (with emulator set to NTSC mode)
* 2020 version PAL: about 9 seconds (with emulator set to PAL mode)
* 2020 version 32K PAL: about 9 seconds (this was the version I ran yesterday)

I'm not sure if these results can be extrapolated by a factor of 100, but it would seem to be pretty reasonable. The notable difference in speed between NTSC and PAL, only on the reimplemented 2020 versions, throws me off. I probably need to run it for far longer than 10 iterations to determine if it is such a huge factor. But even going back to the 1978 version, it would surely run at about half the speed of ECS BASIC.

Edited February 16, 2021 by carlsson
Zendocon Posted February 16, 2021

On 11/1/2018 at 11:45 AM, intvnut said:
> Yikes, OK. I hadn't realized level 8 had that property. *d'oh* If I had once known that, I'd forgotten it. I re-ran at level 6, and profiled around 2hrs of play. It looks like the EXEC consumes about 2.7% of the total cycles.

I remember running jzIntv in my environment without speed throttling, and trying each difficulty just to have an impressive screenshot for my documentation. At Level 7, the CPU will make one move before getting stuck in a rabbit hole, and at Level 8, forget it.

On Level 6, the game lasts a total of 31 hours and ends in Stalemate, because both players keep repeating the same moves back and forth. Each player has a chance to capture the opponent's one remaining Rook, but is unwilling to sacrifice its own Rook in the process. Same goes for the Queens.
intvnut (Author) Posted February 16, 2021

On 2/15/2021 at 8:43 AM, carlsson said:
> So what does this emulated RCA Studio IV clock in at? Roughly 16 minutes and 35 seconds = 995 seconds! That is just over 3 times as much time as the ECS BASIC and almost 14 times longer than the Comx BASIC that among micros is considered to be slow. Actually at first I ran the benchmark without the division on line 35 which then took about 760 seconds, or 3.5 times as long as ECS BASIC.

Putting this into perspective: The loop iterates about 1000 times. It takes almost 1000 seconds. That's one iteration per second. The loop is only 5 lines long. So, somehow, it's taking an average of ~700K cycles (at ~3.5MHz) per line of BASIC code.

What the heck would you spend 700K cycles doing?

At 60Hz, you get a vertical retrace every ~60K cycles, so that's about 12 display frames per line of code, on average. That's around the average line length of the loop.

    20 LET A=A+I           ; 12 characters, 7 tokens
    30 LET A=A*I           ; 12 characters, 7 tokens
    35 LET A=A/I           ; 12 characters, 7 tokens
    40 LET I=I+1           ; 12 characters, 7 tokens
    45 IF I<1000 THEN 20   ; 20 characters, 7 to 11 tokens, depending on number representation

That works out to an average of 13 to 14 characters per line. It drops to somewhere between 11 and 13 if you treat the entire line number as one "character." The BASIC interpreter executes at roughly the rate of one source character per display frame.

Something is badly broken in that interpreter.

I'm going to guess there's a SYNC opcode (or DO SYNC), perhaps as a debug statement, in the core interpreter loop. SYNC and DO SYNC are the pseudo-code opcodes for vertical retrace synchronization, according to this.

ECS BASIC is slow for similar reasons, but not to this level of insanity. ECS BASIC doesn't do a full Vsync; rather, it just blocks execution during 2/N frames while the EXEC does its thing, and runs "full speed" during the remaining (N-2)/N frames. (N is usually 3, but can be made higher.)

Is there a pseudo-code disassembly of TinyBASIC visible somewhere on the web? It should be easy to look for such a SYNC/DO SYNC statement.

EDIT: Never mind, I missed it when I'd looked for it from my phone previously. It's right here.

EDIT 2: I fail at reading comprehension when I'm hungry and should eat some lunch. That's the pseudo-code interpreter, not the BASIC interpreter. My request stands.
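The back-of-envelope estimate in this post can be reproduced with a short sketch. The inputs are the figures quoted in the thread: a 3.58 MHz clock for the emulated Studio IV, 995 seconds of run time, about 1000 iterations of a 5-line loop, and a 60 Hz display. All function names are hypothetical, and treating the iteration count as exactly 1000 is a rounding assumption.

```python
CLOCK_HZ = 3_580_000      # Studio IV clock quoted earlier in the thread
FRAME_HZ = 60             # NTSC display rate assumed in the post
TOTAL_SECONDS = 995       # measured run time of the benchmark
ITERATIONS = 1000         # rounded; the loop body runs for I = 1..999

# The five lines executed on every pass through the loop.
LOOP_LINES = [
    "20 LET A=A+I",
    "30 LET A=A*I",
    "35 LET A=A/I",
    "40 LET I=I+1",
    "45 IF I<1000 THEN 20",
]


def cycles_per_line():
    """Average CPU cycles spent per executed line of BASIC."""
    return CLOCK_HZ * TOTAL_SECONDS / (ITERATIONS * len(LOOP_LINES))


def frames_per_line():
    """Average display frames that elapse per executed line."""
    return cycles_per_line() / (CLOCK_HZ / FRAME_HZ)


def average_chars():
    """Average source length of a loop line, spaces included."""
    return sum(len(line) for line in LOOP_LINES) / len(LOOP_LINES)


print(round(cycles_per_line()), round(frames_per_line(), 2), average_chars())
```

With these inputs the estimate is a little over 700K cycles and about 12 frames per line, against an average line length of 13.6 characters, so the "one source character per display frame" observation holds to within rounding.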
carlsson Posted February 16, 2021 (edited)

Yeah. I found this page which I believe describes the pseudo code instructions, but that is as much info as the site has.

To be honest, 128x64 pixels in 8 colours would have been somewhat weak by 1979 or whenever the IV was forecast. Some comparisons:

* Bally Astrocade (April 1978): 160x102 in 8 colours (BASIC 160x88) or with expanded RAM 320x204
* APF MP-1000 (October 1978): 256x192 in 4 colours or 128x192 in 8 colours
* Odyssey^2 (December 1978): 160x200 from a palette of 16 colours
* 1292 APVS & Interton VC-4000 (around 1978/79): 40x320 (??) from a palette of 8 colours with "multiple brightness levels"

We don't have to mention the Atari 2600 or the Intellivision here. Of course, if RCA had remained in the business, specs may have improved and they would not have stuck with the CDP1861/1864, though both the ELF and the Telmac 1800 computer (1977) used that one.

Edited February 16, 2021 by carlsson
intvnut (Author) Posted February 16, 2021

14 minutes ago, carlsson said:
> Yeah. I found this page which I believe describes the pseudo code instructions, but it is as much info as the site has.

It looks like members of the COSMAC ELF group may be able to see these files.
carlsson Posted February 16, 2021

The binary files are included in the emulator package; the assembler source is not. By digging through the binaries and comparing to the online documentation for VP-701 Floating Point BASIC v2.2 (for the Cosmac VIP II, which was also never made but has been recreated from sources), I got most of the commands to work, with a few still missing.
intvnut (Author) Posted February 16, 2021

2 hours ago, carlsson said:
> The binary files are included in the emulator package, the assembler source is not.

I wasn't sure if you were a member of the COSMAC ELF group or interested in joining. I already have too much email to go through.
carlsson Posted February 17, 2021

Thanks for the link in any case. This is probably the wrong thread for this discussion, but I have already kind of continued it in its rightful place. Feel free to join the other side if you have the time and are curious about it.