JamesD Posted March 4, 2015
I made the following changes to get it to run on an Apple IIe emulator. The scale probably needs further adjustment to get it exact, but it runs. It takes over 32 minutes to complete the image at standard speed. The Coleco ADAM would require similar changes but with different CX and SX values.
100 SX=120:SY=56:SZ=64:CX=280:CY=192
130 HGR2:COLOR 3
240 IF RR(X1)>Y1 THEN RR(X1)=Y1:HPLOT X1,Y1
260 IF RR(X1)>Y1 THEN RR(X1)=Y1:HPLOT X1,Y1
With TV emulation enabled on the emulator.
JamesD Posted March 5, 2015
MSX. The value of SX may need to be reduced. I tried testing it on an emulator, but the keyboard emulation kept screwing up my typing, so I said the heck with it.
100 SX=115:SY=56:SZ=64:CX=256:CY=192
130 SCREEN 2
240 IF RR(X1)>Y1 THEN RR(X1)=Y1:PSET(X1,Y1)
260 IF RR(X1)>Y1 THEN RR(X1)=Y1:PSET(X1,Y1)
Britishcar Posted March 5, 2015
These mods are fun. It's interesting to see the increased speed via optimization, etc.
JamesD, I think there's a typo in your A2E code:
Line: 130 HGR2:COLOR 3
Should be: 130 HGR2:HCOLOR=3
Does anyone have a TI-99/4a Extended BASIC version worked out?
JamesD Posted March 5, 2015
You are correct. I couldn't cut and paste from the emulator, so I had to retype the lines and messed up.
TI-99/4a BASIC doesn't have support for bitmap-oriented commands, and neither does TI Extended BASIC. It would take a lengthy subroutine to implement SET(X1,Y1). It would have to decide which screen character needs to be modified, then read that character, modify the proper byte and write it back. Expect run times to be many hours *if* you can even modify a character like that.
You end up dividing X1 by the character width and Y1 by the character height, getting the character, figuring out which row of the character the bit is on, and then ORing the specific bit with that row of the character. Keep in mind that the bit number produced by this math is the opposite of the bit order in the character, so you need an IF THEN or ON GOTO setup with 8 possibilities. Now write the character back. I don't even know if modifying the character that way is possible.
The code isn't so much complicated as it is lengthy. SET(X1,Y1) is probably as long as or longer than the code that would call it. That's why people used The Missing Link. I found the manual here: ftp://ftp.whtech.com/programming/The%20Missing%20Link%20software%20manual.pdf
Super Extended BASIC appears to have commands that would do the trick as well.
I'm not sure the code I posted in the TI area needs the call to "PD" before "PIXEL".
Edited March 5, 2015 by JamesD
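The steps described in this post (divide by the cell size, pick the row inside the cell, OR in one bit with reversed bit order) can be sketched in Python. This is only an illustration, not TI BASIC: the 8x8 cell size and the 16-hex-digit pattern string (the form CALL CHAR accepts) are assumptions, and `locate` and `set_pixel_in_pattern` are hypothetical helper names.

```python
CELL = 8  # assumed character cell size (8x8 pixels)

def locate(x, y):
    """Return (char_col, char_row, row_in_cell, bit_mask) for pixel (x, y)."""
    char_col, bit = divmod(x, CELL)
    char_row, row_in_cell = divmod(y, CELL)
    # The leftmost pixel of a cell is the most significant bit, so the
    # mask runs in the opposite direction to the pixel's column number.
    mask = 0x80 >> bit
    return char_col, char_row, row_in_cell, mask

def set_pixel_in_pattern(pattern, x, y):
    """Return a copy of a 16-hex-digit pattern with pixel (x, y) turned on."""
    _, _, row, mask = locate(x, y)
    # Two hex digits per pixel row, eight rows per character.
    rows = [int(pattern[i:i + 2], 16) for i in range(0, 16, 2)]
    rows[row] |= mask
    return "".join("%02X" % r for r in rows)
```

The shift `0x80 >> bit` stands in for the eight-way IF THEN or ON GOTO that a BASIC without shift operators would need.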
JamesD Posted March 5, 2015
Actually, for the required graphics mode, rather than getting the character number you calculate the address of that character, add the offset to the byte, and then PEEK it from video RAM (VPEEK?). Then you can OR the byte with the bit and write it back with a POKE. Whatever BASIC you use will need direct support for the graphics mode or the ability to manually send commands to the VDP. Given decent documentation I could figure it out, but I'm guessing someone has previously done it.
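A minimal sketch of that read-modify-write sequence, written in Python with a `bytearray` standing in for video RAM. The address formula assumes a 256x192 bitmap whose pattern table starts at address 0 and is laid out as 8-byte character cells, 32 cells per row; `VRAM`, `PATTERN_BASE`, `pixel_address` and `set_pixel` are hypothetical names for illustration only.

```python
VRAM = bytearray(16 * 1024)   # stand-in for 16 KB of video RAM
PATTERN_BASE = 0x0000         # assumed start of the pattern table

def pixel_address(x, y):
    """Return (byte address, bit mask) for pixel (x, y)."""
    # 32 cells * 8 bytes = 256 bytes per character row,
    # 8 bytes per cell, one byte per pixel row inside the cell.
    offset = (y // 8) * 256 + (x // 8) * 8 + (y % 8)
    mask = 0x80 >> (x % 8)
    return PATTERN_BASE + offset, mask

def set_pixel(x, y):
    addr, mask = pixel_address(x, y)
    value = VRAM[addr]          # the "VPEEK" step
    VRAM[addr] = value | mask   # OR in the bit, then the "VPOKE" step
```

On real hardware each of those two accesses also has to set up the VDP address first, which is where most of the time goes.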
JamesD Posted March 6, 2015
This is the TI-99/4a BASIC version. It requires Extended BASIC and a third-party BASIC extension called The Missing Link that adds bitmapped graphics support. I didn't bother completely figuring out the proper scale. The reduced screen width (240) is needed because of how the TI VDP RAM is used. DIM does not appear to support a variable as a parameter. The CALL LINK commands are making calls to The Missing Link.
100 SX=100::SY=56::SZ=64::CX=240::CY=192
110 DIM RR(256)
120 FOR I=0 TO CX::RR(I)=CY::NEXT I
130 CALL LINK("CLEAR")
140 CX=CX*0.5::CY=CY*0.46875::FX=SX/64::FZ=SZ/64
150 XF=4.71238905/SX
160 FOR ZI=64 TO -64 STEP -1
170 ZT=ZI*FX::ZS=ZT*ZT
180 XL=INT(SQR(SX*SX-ZS)+0.5)
190 ZX=ZI*FZ+CX::ZY=CY+ZI*FZ
200 FOR XI=0 TO XL
210 XT=SQR(XI*XI+ZS)*XF
220 YY=(SIN(XT)+SIN(XT*3)*0.4)*SY
230 X1=XI+ZX::Y1=ZY-YY
240 IF RR(X1)>Y1 THEN RR(X1)=Y1:: CALL LINK("PIXEL",Y1,X1)
250 X1=ZX-XI
260 IF RR(X1)>Y1 THEN RR(X1)=Y1:: CALL LINK("PIXEL",Y1,X1)
270 NEXT XI::NEXT ZI
280 GOTO 280
JamesD Posted March 6, 2015
Commodore Plus/4 changes
130 COLOR 0,1:COLOR 1,2:GRAPHIC 1,1
240 IF RR(X1)>Y1 THEN RR(X1)=Y1:DRAW 1,X1,Y1
260 IF RR(X1)>Y1 THEN RR(X1)=Y1:DRAW 1,X1,Y1
Edited March 6, 2015 by JamesD
JamesD Posted March 6, 2015
Keep in mind that I've been doing just enough to get things running; I'm not worried about the code being perfect.
devwebcl Posted March 6, 2015
Britishcar wrote: "These mods are fun. It's interesting to see the increased speed via optimization, etc."
I agree 100%.
JamesD Posted March 7, 2015
The Coleco ADAM requires changes similar to the Apple II, since its BASIC is based on Applesoft II, but with a screen size the same as the CoCo 1/2. Output looks almost identical to the CoCo 1/2 with this color choice, but the 9918 has more colors to choose from. Any difference with this color choice would be due to differences in the floating point library.
101 SX=.8*SX:SY=.8*SY:SZ=.8*SZ
130 HGR2:COLOR=3
240 IF RR(X1)>Y1 THEN RR(X1)=Y1:HPLOT X1,Y1
260 IF RR(X1)>Y1 THEN RR(X1)=Y1:HPLOT X1,Y1
*edit* It took a little over 13 minutes to render in MESS, but I don't know how accurate the ADAM MESS driver is. Other benchmarks have shown the ADAM BASIC to be fast, so the results might be accurate, but this machine driver supposedly has problems, so it might be a little off.
Edited March 7, 2015 by JamesD
Tursi Posted March 8, 2015
Sorry for peeking in, I was curious how fast you guys had the Atari 8-bit version running. Hope you don't mind if I cross-post too, then. I think I have the record in the TI forum for both slowest and fastest port.
I have a version that runs in pure TI BASIC (redefining the characters similar to how JamesD describes above, though the routine isn't all that long). I didn't let it run to completion, and it had to be scaled down to draw most of it, as only 127 characters are available to be redefined, but based on how far it got I had a rough estimate of around 45 hours to complete. A later optimized version that has the draw enhancements and also lets you scale down both the size of the output and the step of the loops was able to draw the whole hat at 0.4 scale, but that's not a fair test since it also did only 40% of the loops.
9 DIM COLS(255)
10 DIM CC$(126),HPAT$(3),UPAT$(3)
11 FOR A=0 TO 126
12 CC$(A)="0000000000000000"
13 NEXT A
15 OLDMR=-1
17 OLDDR=-1
19 OLDCH=33
20 CURCHAR=33
25 STARTCHAR=33
30 HEX$="0123456789ABCDEF"
32 HPAT$(0)="89ABCDEF89ABCDEF"
34 HPAT$(1)="45674567CDEFCDEF"
36 HPAT$(2)="23236767ABABEFEF"
38 HPAT$(3)="1133557799BBDDFF"
50 GOTO 5000
99 REM PLOT A DOT AT DOTCOL,DOTROW (0-BASED)
100 MR=INT(DOTROW/8)+1
130 MC=INT(DOTCOL/8)+1
135 IF (MR=OLDMR)*(MC=OLDMC)THEN 150
140 CALL GCHAR(MR,MC,CH)
145 OLDMR=MR
146 OLDMC=MC
150 IF CH>=STARTCHAR THEN 160
152 IF CURCHAR>159 THEN 310
155 CH=CURCHAR
156 CALL HCHAR(MR,MC,CH)
157 CURCHAR=CURCHAR+1
160 AROF=CH-STARTCHAR
170 TC$=CC$(AROF)
180 XC=DOTCOL-(MC-1)*8
190 P=(DOTROW-(MR-1)*8)*2+1
210 IF XC<4 THEN 220
212 P=P+1
214 XC=XC-4
220 X$=SEG$(TC$,P,1)
260 TT$=SEG$(HPAT$(XC),POS(HEX$,X$,1),1)
265 IF TT$=X$ THEN 300
270 TC$=SEG$(TC$,1,P-1)&TT$&SEG$(TC$,P+1,16-P)
280 IF CH=OLDCH THEN 290
285 CALL CHAR(OLDCH,CC$(OLDCH-STARTCHAR))
287 OLDCH=CH
290 CC$(AROF)=TC$
300 RETURN
305 REM OUT OF INK!
310 CALL SCREEN(7)
320 RETURN
4998 REM MAIN CODE START - INIT THE DISPLAY
5000 FOR A=1 TO 16
5010 CALL COLOR(A,16,2)
5020 NEXT A
5022 FOR A=0 TO 255
5024 COLS(A)=193
5026 NEXT A
5030 CALL CLEAR
5040 CALL SCREEN(16)
5050 INPUT "SCALE? (0.5 RECOMMENDED):":SCALE
5053 PRINT "STEP? (";INT((1/SCALE)*10)/10;"RECOMMENDED):";
5055 INPUT ST
5060 CALL CLEAR
5069 REM TIMING - REQUIRES A CLOCK DEVICE
5070 REM OPEN #1:"CLOCK"
5075 REM INPUT #1:A$,B$,SS$
5089 REM DRAW A HAT!
5090 XP=144
5091 XR=4.71238905
5092 XF=XR/XP
5100 FOR ZI=64 TO -64 STEP -ST
5110 ZT=ZI*2.25
5111 ZS=ZT*ZT
5120 XL=INT(SQR(20736-ZS)+0.5)
5130 FOR XI=0-XL TO XL STEP ST
5140 XT=SQR(XI*XI+ZS)*XF
5150 YY=(SIN(XT)+SIN(XT*3)*0.4)*56
5160 DOTCOL=INT((XI+ZI)*SCALE+128+.5)
5161 IF (DOTCOL>255)+(DOTCOL<0)THEN 5190
5163 DOTROW=INT((96-YY+ZI)*SCALE+.5)
5164 IF (DOTROW>191)+(DOTROW<0)THEN 5190
5165 IF COLS(DOTCOL)<=DOTROW THEN 5190
5169 COLS(DOTCOL)=DOTROW
5170 GOSUB 100
5190 NEXT XI
5192 NEXT ZI
5199 REM FINISHED - SIT FOREVER
5200 REM INPUT #1:A$,B$,SE$
5201 CALL SCREEN(2)
5205 CALL CHAR(OLDCH,CC$(OLDCH-STARTCHAR))
5210 CALL KEY(0,K,S)
5220 IF S=0 THEN 5210
5230 PRINT SS$,SE$
5240 END
The assembly version manages the code in 26 seconds (emulated, should be close to right). It's using a 256 entry lookup table for sine and 9.7 fixed point numbers for most of the values. You can see that the limited accuracy impacts the image in places, but it's pretty close. A 24-bit number with more fractional bits would probably be enough, but 16-bit is the native word size on the TI.
       DEF  START
* THIS VERSION USES ALL THE OPTIMIZATIONS TO DATE.
* PLUS SCRATCHPAD UTILITIES AND INLINE SINE LOOKUP
* THANKS TO SOMETIMES99ER FOR WORKING OUT THE DATA!
* relocated to scratchpad - addresses worked
* out by hand! Use caution when modifying them!
SQRT   EQU  >8324
PLOT   EQU  >8350
SMULT  EQU  >838E
DRAWPX EQU  >83A8
*FREE  EQU  >83F8 - only 8 bytes of scratchpad free!
* LABELS FOR SAVE UTILITY SLOAD SFIRST B @START * array for highest pixel ROWS BSS 256 * backup for scratchpad, we're going to just * blindly decimate it. So we need to restore * it before we let the console interrupt run * at the end of execution. I could be picky, * selective, or careful, but this works too. SCRATCH BSS 224 * bits for pixel BITS DATA >8040,>2010,>0804,>0201 * SINE TABLE - 9.7 fixed point entries, 256 total SINTAB DATA 0,3,6,9,13,16,19,22 DATA 25,28,31,34,37,40,43,46 DATA 49,52,55,58,60,63,66,68 DATA 71,74,76,79,81,84,86,88 DATA 91,93,95,97,99,101,103,105 DATA 106,108,110,111,113,114,116,117 DATA 118,119,121,122,122,123,124,125 DATA 126,126,127,127,127,127,127,127 DATA 127,127,127,127,127,127,127,126 DATA 126,125,124,123,122,122,121,119 DATA 118,117,116,114,113,111,110,108 DATA 106,105,103,101,99,97,95,93 DATA 91,88,86,84,81,79,76,74 DATA 71,68,66,63,60,58,55,52 DATA 49,46,43,40,37,34,31,28 DATA 25,22,19,16,13,9,6,3 DATA 0,-3,-6,-9,-13,-16,-19,-22 DATA -25,-28,-31,-34,-37,-40,-43,-46 DATA -49,-52,-55,-58,-60,-63,-66,-68 DATA -71,-74,-76,-79,-81,-84,-86,-88 DATA -91,-93,-95,-97,-99,-101,-103,-105 DATA -106,-108,-110,-111,-113,-114,-116,-117 DATA -118,-119,-121,-122,-122,-123,-124,-125 DATA -126,-126,-127,-127,-127,-128,-128,-128 DATA -128,-128,-128,-128,-127,-127,-127,-126 DATA -126,-125,-124,-123,-122,-122,-121,-119 DATA -118,-117,-116,-114,-113,-111,-110,-108 DATA -106,-105,-103,-101,-99,-97,-95,-93 DATA -91,-88,-86,-84,-81,-79,-76,-74 DATA -71,-68,-66,-63,-60,-58,-55,-52 DATA -49,-46,-43,-40,-37,-34,-31,-28 DATA -25,-22,-19,-16,-13,-9,-6,-3 * note: NOT in memory, so don't use @XF * 9.7 signed fixed point variables in registers XF EQU 15 XT EQU 14 YY EQU 13 * INTEGER VALUES ZS EQU 12 * RET EQU 11 - for BL ZI EQU 10 XL EQU 9 XI EQU 8 * 32-bit temp, uses 6 and 7 T32B EQU 7 T32 EQU 6 * Temp vars T16 EQU 5 T1 EQU 4 T2 EQU 3 NEGFL EQU 2 * PIXEL VARIABLES X1 EQU 1 Y1 EQU 0 * out of registers, use RAM (these ARE @ZY) ZX EQU >8320 ZY EQU >8322 * return 
save SAVE BSS 2 * registers for bitmap (and 5A00 is the address of the sprite table) * background is transparent (the only color never redefined) * PDT - >0000 * SIT - >1800 * SDT - >1800 * CT - >2000 * SAL - >1B00 BMREGS DATA >81E0,>8002,>8206,>83ff,>8403,>8536,>8603,>8700,>5B00,>0000 START LWPI >8300 * LOAD THE ROWS ARRAY WITH 192 ENTRIES LI R0,ROWS LI R1,192*256 LI R2,256 LP1 MOVB R1,*R0+ DEC R2 JNE LP1 * backup scratchpad LI R0,>8320 * skip our WP LI R1,SCRATCH LI R2,56 * 4 bytes at a time LS1 MOV *R0+,*R1+ MOV *R0+,*R1+ DEC R2 JNE LS1 * now copy utilities in LI R0,SQRTX * first function LI R1,>8324 * first free word LC1 MOV *R0+,*R1+ * copy one word CI R0,SLAST * check for done (thus no unroll) JL LC1 * 140 GRAPHICS 8+16:SETCOLOR 2,0,0 BL @BITMAP * erase the pattern table CLR R0 CLR R1 LI R2,>1800 BL @VDPFILL * set the color table to white on black LI R0,>2000 LI R1,>F100 LI R2,>1800 BL @VDPFILL * 130 XP=144:XR=4.71238905:XF=XR/XP * I'm not sure why they spelled it this way... * goal of the above math is to covert the Y axis * of 192 pixels into one circle in Radians (2PI). * It would have been more clear if XP was 192 * and XR was 6.2831854, these values seem * obfuscated. Anyway, that's what it is. * To avoid conversion to radians then back to * my sine table units, we can just adjust the * scale factor. For me, 192 needs to equal * 256, so my ratio is 256/192=1.333333 * which is >00A9 in fixed point (169, losing the .3333) * As an added bonus, we can clip to the right * range by simply masking now. LI XF,>00A9 * 140 FOR ZI=64 TO -64 STEP -1 * Making this an integer! LI ZI,64 L160 * 150 ZT=ZI*2.25:ZS=ZT*ZT * We have to do two multiplies here, so we're going * to end up in a 32-bit value temporarily anyway. That * actually makes life a little easier. * 2.25 * 128 = 288, WHICH IS >120 * note: ZT not used LI T32,>0120 MOV ZI,T1 ABS T1 * this is okay, because we are going to square it anyway MPY T1,T32 * now T32 is 32-bits wide, and contains an 25.7 bit number. 
* ZI(16.0) times T32 (9.7) yields 25.7 bits. * So since we want a 9.7, we just have to take the least * significant word, no shifting needed! Of course we ignore * the possibility of overflow, but the largest value should * be 64*2.25 = 144, which fits in 9 bits. * now just put them into place, and multiply again * we know from analysis that the 'sign bit' shouldn't be set here MOV T32B,T32 MOV T32B,T1 MPY T1,T32 * So, T32 now contains a 32-bit 18.14 number, but for simplicity we * are going to move that down into ZS as a 16-bit unsigned integer * so we just need to extract 16 bits of integer, as we don't expect overflow * and don't want fraction. Of course, those 16 bits are split across the * two words... MOV T32B,ZS * least significant - we want two bits from this SRL ZS,14 * toss the rest SLA T32,2 * prepare the most significant SOC T32,ZS * and merge it in * 160 XL=INT(SQR(20736-ZS)+0.5) * ZS is a normal int, so this shouldn't be too bad to start * the result is also an int, and the +0.5 is just for rounding * our sqrt will return one of our fractional values, as noted, * to be consistent. LI T1,20736 S ZS,T1 BL @SQRT * T1 IN AS positive INT, T1 OUT AS 9.7 SRL T1,7 * make an integer for counting MOV T1,XL * and store it * 170 ZX=ZI+160:ZY=90+ZI MOV ZI,T1 AI T1,127 * smaller screen MOV T1,@ZX MOV ZI,T1 AI T1,90 MOV T1,@ZY * 180 FOR XI=0 TO XL * even this loop always executes once (0 to 0), so * I can put the condition at the bottom. CLR XI L190 * 190 XT=SQR(XI*XI+ZS)*XF * pretty similar to above, again we are squaring to get positive * so that makes the unsigned MPY easier to deal with * XT needs to be integer now, not 9.7 MOV XI,T32 * Integer (always positive now) MPY XI,T32 * XI*XI - 16.0 * 16.0 = 32.0, so just take the LSW MOV T32B,T1 * least significant - still 16.0 A ZS,T1 * add ZS (we're an integer so can just add - max is 41472, so unsigned!) 
BL @SQRT * T1 in as positive int, T1 OUT as 9.7 MOV XF,T32 * prepare to mult - we know these values are positive MPY T1,T32 * do it - 9.7*9.7 = 18.14 * it matters to keep the fraction for the XT*3 below, so, keep it SRL T32B,7 * make room, throwing away 7 fractional bits SLA T32,9 * get the more significant bits into the right place SOC T32,T32B * merge the two 16-bit words MOV T32B,XT * 200 YY=(SIN(XT)+SIN(XT*3)*0.4)*55 -- was 55, needed to adjust for rounding errors * order of op, we do SIN(XT*3)*0.4 first... MOV XT,T1 * prepare for second sine A XT,T1 * simpler than MPY by 3, no need to shift result A XT,T1 SRL T1,6 * shift out fraction, but multiply by 2 (we'll trim the extra bit below) INC T1 * rounding ANDI T1,>01FE * mask for lookup MOV @SINTAB(T1),T2 LI T1,>0033 * roughly 0.4 (actually 0.398) BL @SMULT * Signed multiply, result in T32B MOV T32B,T16 SRL XT,6 * shift out fraction, but multiply by 2 (we'll trim the extra bit below) INC XT * rounding ANDI XT,>01FE * mask for lookup (We don't use XT again) MOV @SINTAB(XT),T2 A T16,T2 LI T1,>1B80 * 55 x less than 1 will be less than 55, so it fits BL @SMULT * Signed multiply, result in T32B * We can just make YY an integer right here SRA T32B,7 * discard fraction (sign extend!) 
MOV T32B,YY * now go plot the two pixels BL @DRAWPX * 250 NEXT XI INC XI C XI,XL * I know it's always positive now, JLE L190 * so I can use an unsigned test * 255 NEXT ZI L255 DEC ZI CI ZI,-65 JGT L160 * 260 GOTO 260 * restore scratchpad before enabling interrupts LI R0,SCRATCH LI R1,>8320 * skip our WP LI R2,56 * 4 bytes at a time LS2 MOV *R0+,*R1+ MOV *R0+,*R1+ DEC R2 JNE LS2 WAIT LIMI 2 LIMI 0 JMP WAIT * VDP access * Write single byte to R0 from MSB R1 * Destroys R0 (actually just oRs it) VSBW ORI R0,>4000 SWPB R0 MOVB R0,@>8C02 SWPB R0 MOVB R0,@>8C02 MOVB R1,@>8C00 B *R11 * Write R2 bytes from R1 to VDP R0 * Destroys R0,R1,R2 VDPFILL ORI R0,>4000 SWPB R0 MOVB R0,@>8C02 SWPB R0 MOVB R0,@>8C02 VMBWLP MOVB R1,@>8C00 DEC R2 JNE VMBWLP B *R11 * Write address or register VDPWA SWPB R0 MOVB R0,@>8C02 SWPB R0 MOVB R0,@>8C02 B *R11 * load regs list to VDP address, end on >0000 and write >D0 (for sprites) * address of table in R1 (destroyed) LOADRG LOADLP MOV *R1+,R0 JEQ LDRDN SWPB R0 MOVB R0,@>8C02 SWPB R0 MOVB R0,@>8C02 JMP LOADLP LDRDN LI R1,>D000 MOVB R1,@>8C00 B *R11 * Setup for normal bitmap mode BITMAP MOV R11,@SAVE * set display and disable sprites LI R1,BMREGS BL @LOADRG * set up SIT - We load the standard 0-255, 3 times LI R0,>5800 BL @VDPWA CLR R2 NQ# CLR R1 LP# MOVB R1,@>8C00 AI R1,>0100 CI R1,>0000 JNE LP# INC R2 CI R2,3 JNE NQ# MOV @SAVE,R11 B *R11 * use this and a listing to get scratchpad addresses for the fctns * AORG >8324 * IN AND OUT IN T1 * T1 in = integer * T1 out = 9.7 signed fixed point * Uses T2,X1,Y1,T32 * http://samples.sains...mple_809121.pdf * modified a bit - we pretend the input is a 16.8 value (the * entire fractional part will be 0), that let's us get out a * 8.8 value, because the algorithm needs an even number of fractional * bits. 
Then we just shift once to get .7 SQRTX CLR X1 root CLR T2 remHi (t1 is remLo) LI Y1,16 count = ((WORD/2-1)+(FRACBITS>>1)) -> 11+4, +1 for loop SQRT1 SLA T2,2 remHi = (remHi << 2) | (remLo >> 14); MOV T1,T32 SRL T32,14 SOC T32,T2 SLA T1,2 remLo <<= 2; SLA X1,1 root <<= 1; MOV X1,T32 testDiv = (root << 1) + 1; SLA T32,1 INC T32 C T2,T32 if (remHi >= testDiv) { JL SQRT2 S T32,T2 remHi -= testDiv; INC X1 root += 1; SQRT2 DEC Y1 while (--count != 0); JNE SQRT1 MOV X1,T1 return( root); SRL T1,1 Get it down to x.7 fixed point B *R11 * INPUT X1,Y1 - kills T1,T2 as well PLOTX * use the E/A routine for address MOV Y1,T1 R1 is the Y value. SLA T1,5 SOC Y1,T1 ANDI T1,>FF07 MOV X1,T2 R0 is the X value. ANDI T2,7 A X1,T1 T1 is the byte offset. S T2,T1 T2 is the bit offset. * inline VDP! SWPB T1 set up read address MOVB T1,@>8C02 SWPB T1 MOVB T1,@>8C02 ORI T1,>4000 we need this later, and provides a VDP delay MOVB @>8800,R1 read the byte from VDP SWPB T1 set up write address MOVB T1,@>8C02 SWPB T1 MOVB T1,@>8C02 SOCB @BITS(T2),R1 or the bit and provide VDP delay MOVB R1,@>8C00 write the byte back B *R11 * signed fixed point multiply - T1 * T2 = T32B * ONLY T2 is allowed to be negative!! Result * will be negative if T2 was. * Uses T1,T2,NEGFL,T32,T32B SMULTX CLR NEGFL * temp flag for negative MOV T2,T32 * prepare for mult and test JGT NOTNEG1 SETO NEGFL * it is negative, so remember and make positive ABS T32 NOTNEG1 MPY T1,T32 * does the multiply - you know the drill, fix up number SRL T32B,7 * make room, throwing away 7 fractional bits SLA T32,9 * get the more significant bits into the right place SOC T32,T32B * merge the two 16-bit words MOV NEGFL,NEGFL * check if it should be negative JEQ NOTNEG2 NEG T32B * yes, it should NOTNEG2 B *R11 DRAWXX MOV R11,@SAVE * need this to get back! * 210 X1=XI*0.75+ZX:Y1=ZY-YY * XI can never be negative now, so we can remove all that code MOV XI,X1 * integer LI T32,>0060 * 0.75 MPY X1,T32 * now 25.7, so just take the LSW (unsigned mult!) 
       AI   T32B,>40       * 0.5 in x.7, for rounding
       SRA  T32B,7         * make integer for the plot function (sign extend!)
       MOV  T32B,X1        * get the integer
       A    @ZX,X1         * add (integer) ZX
       MOV  @ZY,Y1         * get ZY (integer)
       S    YY,Y1          * subtract YY (integer)
* 220 IF RR(X1)>Y1 THEN RR(X1)=Y1:PLOT X1,Y1
       SWPB Y1             * stupid Big Endian....
       MOV  Y1,T16         * plot kills X1,Y1, and we need Y1 again
       CB   @ROWS(X1),Y1
       JLE  L230
       MOVB Y1,@ROWS(X1)
       SWPB Y1
* NOTE: PLOT EXPECTS THE PIXEL IN REGISTERS X1,Y1
       BL   @PLOT
* 230 X1=ZX-XI*0.75
L230   MOV  @ZX,X1
       S    T32B,X1        * use the scaled X1 on both sides of the origin
* 240 IF RR(X1)>Y1 THEN RR(X1)=Y1:PLOT X1,Y1
       MOV  T16,Y1         * get it back, still swapped
       CB   @ROWS(X1),Y1
       JLE  L250
       MOVB Y1,@ROWS(X1)
       SWPB Y1
* NOTE: PLOT EXPECTS THE PIXEL IN REGISTERS X1,Y1
       BL   @PLOT
* Return to caller
L250   MOV  @SAVE,R11
       B    *R11
SLAST
       END
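For readers who don't follow TMS9900 assembly, the two ideas this post relies on, 9.7 fixed-point arithmetic and a 256-entry sine table, can be sketched in Python. This is an illustration under assumptions, not a transcription of the listing: `to_fixed`, `fixed_mul` and `fixed_sin` are hypothetical helpers, and the table is generated by rounding 128*sin and clamping, which happens to reproduce the leading entries of the posted table but may not be how that table was actually built.

```python
import math

FRAC_BITS = 7
ONE = 1 << FRAC_BITS   # 1.0 in 9.7 fixed point is 128

def to_fixed(value):
    """Convert a float to 9.7 fixed point (rounded)."""
    return int(round(value * ONE))

def fixed_mul(a, b):
    """Multiply two 9.7 values; the raw product has 14 fraction bits."""
    # Note: for negative products this floors, whereas the posted
    # routine multiplies magnitudes and negates afterwards.
    return (a * b) >> FRAC_BITS

# One full circle in 256 steps, clamped to the range -128..127.
SINE = [max(-128, min(127, int(round(ONE * math.sin(2 * math.pi * i / 256)))))
        for i in range(256)]

def fixed_sin(angle_units):
    """Sine lookup; masking with 0xFF wraps the angle for free."""
    return SINE[angle_units & 0xFF]
```

With 7 fraction bits, 0.75 becomes 0x60, 2.25 becomes 0x120 and 0.4 becomes 0x33, which are the constants that appear in the listing.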
devwebcl Posted March 8, 2015
A faster version, but only for the complete wireframe. It took 22 minutes 17 seconds in Altirra w/ TBXL (at full speed): http://manillismo.blogspot.com/2015/03/fedora-fast-wireframe.html
140 GRAPHICS 8+16:SETCOLOR 2,0,0
150 XP=144:XR=4.71238905:XF=XR/XP
151 REM FULL FAST VERSION (IN LINE 240)
155 COLOR 1
159 REM HALF? -64
160 FOR ZI=0 TO 64
170 ZT=ZI*2.25:ZS=ZT*ZT
180 XL=INT(SQR(20736-ZS)+0.5)
181 ZX=ZI+160:ZY=90+ZI
182 ZX2=-ZI+160:ZY2=90-ZI
185 REM HALF? 0-XL
190 FOR XI=0 TO XL
200 XT=SQR(XI*XI+ZS)*XF
210 YY=(SIN(XT)+SIN(XT*3)*0.4)*56
219 REM TODO: REMOVE 90-
220 X1 = XI+ZX:Y1 =ZY-YY
221 X12= XI+ZX2:Y12=ZY2-YY
222 X13=-XI+ZX:Y13=ZY-YY
223 X14=-XI+ZX2:Y14=ZY2-YY
230 trap 250: PLOT X1,Y1:PLOT X12,Y12:PLOT X13,Y13:PLOT X14,Y14
250 NEXT XI: NEXT ZI
260 GOTO 260
dmsc Posted March 8, 2015
Hi!
You can do better (and still with hidden lines removed) by using a little trigonometry; the runtime in TBXL (PAL) is now 16min 16sec:
100 SX=144:SY=56:SZ=64:CX=320:CY=192
110 C1=2.2*SY:C2=1.6*SY
120 DIM RR(CX)
130 FOR I=0 TO CX:RR(I)=CY:NEXT I
140 GRAPHICS 8+16:SETCOLOR 2,0,0:COLOR 1
150 CX=CX*0.5:CY=CY*0.46875:FX=SX/64:FZ=SZ/64
160 XF=4.71238905/SX
170 FOR ZI=64 TO -64 STEP -1
180 ZT=ZI*FX:ZS=ZT*ZT
190 XL=INT(SQR(SX*SX-ZS)+0.5)
200 ZX=ZI*FZ+CX:ZY=CY+ZI*FZ
210 FOR XI=0 TO XL
220 A=SIN(SQR(XI*XI+ZS)*XF)
230 Y1=ZY-A*(C1-C2*A*A)
240 X1=XI+ZX
250 IF RR(X1)>Y1 THEN RR(X1)=Y1:PLOT X1,Y1
260 X1=ZX-XI
270 IF RR(X1)>Y1 THEN RR(X1)=Y1:PLOT X1,Y1
280 NEXT XI
290 NEXT ZI
In NTSC, the runtime is slower, 17min 20sec. Note that if you turn off DMA during the drawing, the runtime is only 12min 15sec.
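The trigonometry in question appears to be the triple-angle identity sin(3x) = 3 sin(x) - 4 sin(x)^3. Substituting it gives sin(x) + 0.4 sin(3x) = 2.2 sin(x) - 1.6 sin(x)^3, which matches C1=2.2*SY and C2=1.6*SY and replaces two SIN calls with one. A small Python check of that reading follows; `height_two_sines`, `height_one_sine` and `max_difference` are hypothetical names, and the sampled range 0 to 4.71238905 is taken from the XF constant in the listing.

```python
import math

SY = 56
C1 = 2.2 * SY
C2 = 1.6 * SY

def height_two_sines(xt):
    """Original form: two SIN evaluations per point."""
    return (math.sin(xt) + math.sin(xt * 3) * 0.4) * SY

def height_one_sine(xt):
    """Rewritten form: one SIN evaluation and a cubic."""
    a = math.sin(xt)
    return a * (C1 - C2 * a * a)

def max_difference(steps=1000):
    """Largest gap between the two forms over the angle range used."""
    worst = 0.0
    for i in range(steps + 1):
        xt = 4.71238905 * i / steps
        worst = max(worst, abs(height_two_sines(xt) - height_one_sine(xt)))
    return worst
```

Any difference between the two forms should be at the level of floating point rounding, far below one pixel.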
devwebcl Posted March 8, 2015
But try showing the complete image, not hiding anything at all.
dmsc Posted March 8, 2015
Hi!
Well, you can replace lines 250 and 270 with a simple PLOT X1,Y1 and remove lines 120 and 130; this will draw the complete wireframe. I can't run it now, but I suspect it will be about the same speed.
Edited March 8, 2015 by dmsc
devwebcl Posted March 8, 2015
I doubt it will be at the same speed. I am saving several iterations at:
160 FOR ZI=0 TO 64
That's why I am using four PLOTs.
JamesD Posted March 8, 2015
Stupid comment deleted
Edited March 8, 2015 by JamesD
JamesD Posted March 8, 2015
Microsoft BASIC isn't known for blinding speed, but it's still competitive. The last change cut the CoCo 3 NTSC time to 19:15. Combining lines resulted in 19:14. Over a lot more lines it would add up, but not here because there are so few lines. Renumbering only helps with GOTOs and GOSUBs, so no change there. Defining the most used variables first will speed up Microsoft BASIC a bit, and the first test of this resulted in 18:44. If I put a little more effort into it I could probably cut that a little more, but finding what is optimal would require many trials.
Enabling the 6309 native mode on a CoCo 3 equipped with one results in about a 21% speed increase on the same code. That should let this complete in 14:48. BASIC-09 on the CoCo 3 should complete this in under 5 minutes, possibly in under 2 if the ratio on Ahl's benchmark holds, but it's tough to tell without testing. The BBC, Apple IIgs and IIc+ should all do very well running this due to their higher clock speed.
Assembly language vs BASIC is obviously not a fair comparison. For that matter, neither is comparing a machine rendering 256x192 vs other machines rendering 320x192. It's also not fair comparing a machine with a 2 color screen vs 16 color. If this were a benchmark you would want everyone rendering the same size image and same number of colors if possible. Also remember that a small code sample may run faster on the TI because you can stick code in scratchpad RAM, but you won't always get the same speedup with a larger program.
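The 14:48 estimate can be reproduced with a couple of lines of Python, assuming the 21% figure is applied as a 21% reduction in run time from 18:44. The second function shows the other possible reading (21% more work per second), which gives a somewhat longer time. Both function names are hypothetical and neither result has been measured.

```python
def with_less_time(minutes, seconds, fraction):
    """Run time if `fraction` of the original time is saved."""
    total = minutes * 60 + seconds
    return divmod(int(round(total * (1 - fraction))), 60)

def with_more_throughput(minutes, seconds, fraction):
    """Run time if the machine does `fraction` more work per second."""
    total = minutes * 60 + seconds
    return divmod(int(round(total / (1 + fraction))), 60)

estimate = with_less_time(18, 44, 0.21)           # (minutes, seconds)
alternative = with_more_throughput(18, 44, 0.21)  # (minutes, seconds)
```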
dmsc Posted March 8, 2015
Hi!
Yes, I know. But your timings don't look right. If I remove the IFs and the array initialization from my program, I get 15min 23sec in PAL, 16min 25sec in NTSC. This is on an 800XL with TBXL 1.5 (emulated).
If I combine my program with yours, I get a runtime of 8min 47sec on PAL, or 9min 21sec on NTSC:
100 SX=144:SY=56:SZ=64:CX=320:CY=192
110 C1=2.2*SY:C2=1.6*SY
120 GRAPHICS 8+16:SETCOLOR 2,0,0:COLOR 1
130 CX=CX*0.5:CY=CY*0.46875:FX=SX/64:FZ=SZ/64
140 XF=4.71238905/SX
150 FOR ZI=0 TO 64
160 ZT=ZI*FX:ZS=ZT*ZT:ZT=ZI*FZ
170 XL=INT(SQR(SX*SX-ZS)+0.5)
180 ZX1=CX+ZT:ZY1=CY+ZT
190 ZX2=CX-ZT:ZY2=CY-ZT
200 FOR XI=0 TO XL
210 A=SIN(SQR(XI*XI+ZS)*XF)
220 A=A*(C1-C2*A*A)
230 Y1=ZY1-A:Y2=ZY2-A
240 X1=ZX1+XI:X3=ZX2+XI
250 X2=ZX1-XI:X4=ZX2-XI
260 IF Y1<191.5 THEN PLOT X1,Y1:PLOT X2,Y1
270 PLOT X3,Y2:PLOT X4,Y2
280 NEXT XI
290 NEXT ZI
300 GOTO 300
Note that your program had a bug: you were TRAP-ing on points below the screen and then missing the plotting of the corresponding points above. This is why I put the "IF Y1<191.5" over the first two points only.
JamesD Posted March 8, 2015
VZ200 completed it in 15:27, but it only supports 128x64 from BASIC.
100 SX=50:SY=18:SZ=18:CX=128:CY=64
140 MODE(1):COLOR 4,0
250 IF RR(X1)>Y1 THEN RR(X1)=Y1:SET(X1,Y1)
270 IF RR(X1)>Y1 THEN RR(X1)=Y1:SET(X1,Y1)
JamesD Posted March 8, 2015
dmsc, your combined program only drew the back half of the image for me. Changing line 150 to start at -64 caused it to draw the entire image.
devwebcl Posted March 9, 2015
Actually, that's the optimization... I only draw half of the image along the Y axis, and the rest is calculated at runtime in the same position; that's the reason it should be faster (I am only considering math optimization, not VBLANK, NTSC/PAL or other hacks).
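A rough Python sketch of that idea, for the plain wireframe only (no hidden line removal): the height depends only on XI squared and ZI squared, so one SQR/SIN evaluation can serve four mirrored pixels. The constants are taken from the original listing; `height`, `span`, `full_wireframe` and `mirrored_wireframe` are hypothetical names, off-screen points are not clipped, and the points are collected into a set instead of being plotted so the two approaches can be compared.

```python
import math

XF = 4.71238905 / 144

def height(xi, zs):
    """Surface height for one (XI, ZS) pair."""
    xt = math.sqrt(xi * xi + zs) * XF
    return (math.sin(xt) + math.sin(xt * 3) * 0.4) * 56

def span(zi):
    """Return ZS and the X limit XL for one row."""
    zs = (zi * 2.25) ** 2
    return zs, int(math.sqrt(20736 - zs) + 0.5)

def full_wireframe():
    """Straightforward loops over the whole ZI and XI range."""
    points = set()
    for zi in range(-64, 65):
        zs, xl = span(zi)
        for xi in range(-xl, xl + 1):
            points.add((xi + zi + 160, int(round(90 + zi - height(xi, zs)))))
    return points

def mirrored_wireframe():
    """One quadrant of the loops, four pixels per height evaluation."""
    points = set()
    for zi in range(0, 65):
        zs, xl = span(zi)
        for xi in range(0, xl + 1):
            yy = height(xi, zs)          # computed once...
            for z in (zi, -zi):          # ...reused for both signs of ZI
                for x in (xi, -xi):      # ...and both signs of XI
                    points.add((x + z + 160, int(round(90 + z - yy))))
    return points
```

Both functions produce the same set of pixels; the mirrored one simply evaluates `height` about a quarter as often.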
fujidude Posted March 9, 2015
I translated the original program as found in Analog magazine into a Python program. I chose Tkinter as the GUI library, since it is part of the standard Python package. This is my 1st GUI program in Python, thus it is pretty simplistic as far as GUI elements are concerned. This program took approximately a second to run on my Core i7, and most of that was just the GUI app setting up to open, not the actual calculations. I know that because as I was making the program and testing, it took almost as long to run a program with just one instruction to draw a single line.
Anyway, I hope no one gets upset that I introduced something modern here that isn't retro. I would like to explain myself in advance in that regard: I no longer have any retro hardware. I use emulation on modern machines for my retro fixes. But the look and feel of the software and programming environments of the retro equipment isn't the only aspect of retro I enjoy. I really loved the exploration and learning back in those early days, especially as it concerned making programs. And that same magic is captured for me again on modern systems, with the Python programming language. It's pretty close to a universal language these days, kind of like BASIC was in the past. It is interpreted also, so it is quick to develop with; again, just like BASIC. It comes preinstalled on Linux and Mac OS X, and is freely available for Windows also.
So without further delay, for those who are interested, here is the Python version listing:
#-------------------------------------------------------------------------------
# Name:        archimedes.py
# Purpose:     Implement the Archimedes' spiral program in Python 2.x
#
# Author:      fujidude for just the Python version, original code
#              Charles Bachand, pub. Antic magazine, issue 7, pp.60-61.
# # Created: 08-03-2015 #------------------------------------------------------------------------------- # import necessary modules from Tkinter import * import math def end(): rootwin.destroy() rootwin=Tk() # main (root) application window based on Tkinter rootwin.wm_title("Archimedes' Spiral") quitBtn=Button(rootwin,text="Exit",command=end) quitBtn.pack(side="bottom") graphzone=Canvas(rootwin, width = 320, height = 192, bg = "black") graphzone.pack() XP = 144 XR = 4.71238905 XF = XR / XP for ZI in range(-64, 65): ZT = ZI * 2.25 ZS = ZT * ZT XL = int(math.sqrt(20736 - ZS) + 0.5) for XI in range(0 - XL, XL+1): XT = math.sqrt(XI * XI + ZS) * XF YY = (math.sin(XT) + math.sin(XT * 3) * 0.4) * 56 X1 = XI + ZI + 160 Y1 = 90 - YY + ZI graphzone.create_oval(X1, Y1, X1, Y1, fill = "white") graphzone.create_line(X1, Y1+1, X1, 191, fill="black") # remove this line for transparent version rootwin.mainloop() Again, that is more or less a direct style translation of the original Analog listing. I might try to make some of the optimization changes suggested here in another version. Depending on if there is even a shred of interest. 2 Quote Link to comment Share on other sites More sharing options...
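To check how little of the run time is actual calculation, the loops can be run with no GUI at all and timed. This is a rough sketch in Python 3, not part of fujidude's listing; the function names `spiral_points` and `time_spiral` are invented for illustration. Note also that under Python 3 the GUI module is named `tkinter` rather than `Tkinter`.

```python
import math
import time


def spiral_points():
    """Return the (x, y) screen points computed by the listing's loops, without drawing."""
    xf = 4.71238905 / 144  # XR / XP from the listing
    points = []
    for zi in range(-64, 65):
        zs = (zi * 2.25) ** 2
        xl = int(math.sqrt(20736 - zs) + 0.5)
        for xi in range(-xl, xl + 1):
            xt = math.sqrt(xi * xi + zs) * xf
            yy = (math.sin(xt) + math.sin(xt * 3) * 0.4) * 56
            points.append((xi + zi + 160, 90 - yy + zi))
    return points


def time_spiral():
    """Return (number of points, seconds spent computing them)."""
    start = time.perf_counter()
    points = spiral_points()
    elapsed = time.perf_counter() - start
    return len(points), elapsed
```

Calling `time_spiral()` and printing the result gives the calculation time alone, which can then be compared against the wall-clock time of the full Tkinter program.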
JamesD Posted March 9, 2015

"Actually that's the optimization... I only draw a half of the image in axis-y and the rest is calculated at runtime in the same position, that's the reason it should be faster (I am only considering math optimization, not VBLANK, NTSC/PAL or other hacks)."

I see what you are doing, and after some tracing, the subtractions aren't working in several places. I'm guessing an emulation issue or a bad ROM image for the emulator.

*edit* Or variable names are limited to 2 characters. Du-Oh!

Edited March 9, 2015 by JamesD
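Many Microsoft-derived 8-bit BASICs (Applesoft is one) treat only the first two characters of a variable name as significant, so longer names that share a prefix silently refer to the same variable. Which BASIC was in use here is not stated, so take this as a sketch of the general pitfall. The helpers `basic_key` and `find_collisions` are hypothetical, and the names in the usage note are illustrative rather than taken from devwebcl's program.

```python
from collections import defaultdict


def basic_key(name, significant=2):
    """Reduce a variable name to the part such a BASIC would distinguish."""
    suffix = name[-1] if name.endswith(("$", "%")) else ""  # type suffix stays significant
    stem = name[:-1] if suffix else name
    return stem[:significant].upper() + suffix


def find_collisions(names, significant=2):
    """Group distinct names that would alias to the same variable."""
    groups = defaultdict(list)
    for name in names:
        groups[basic_key(name, significant)].append(name)
    # Keep only keys shared by more than one distinct name.
    return {key: group for key, group in groups.items() if len(set(group)) > 1}
```

For example, `find_collisions(["XL1", "XL2", "ZS"])` reports that `XL1` and `XL2` both collapse to `XL`, which is the kind of aliasing that makes a subtraction appear to "not work".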
dmsc Posted March 9, 2015

Hi!

"I see what you are doing and after some tracing the subtractions aren't working in several places. I'm guessing an emulation issue or a bad ROM image for the emulator. *edit* Or variable names are limited to 2 characters. Du-Oh!"

Yes, it is strange. Attached is a disk image with both programs, the wireframe-only version and the visible-faces version. It is a bootable image that autoloads a menu to select which program to run. I also included the timing calculations for both PAL and NTSC, to verify runtimes.

fedora.atr
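How the disk image does its timing is not shown in the post. A common approach on the Atari 8-bit is to read the OS frame counter (RTCLOK, locations 18 to 20, most significant byte first) and divide by the frame rate. The sketch below assumes that method and uses nominal frame rates; the function names are illustrative.

```python
# Nominal frame rates (an assumption; real hardware rates differ slightly).
FRAME_RATE_HZ = {"NTSC": 60, "PAL": 50}


def jiffies_from_rtclok(b18, b19, b20):
    """Combine the three RTCLOK bytes (as read with PEEK) into one frame count."""
    return (b18 * 256 + b19) * 256 + b20


def jiffies_to_seconds(jiffies, system):
    """Convert a frame count to seconds for the given TV system."""
    return jiffies / FRAME_RATE_HZ[system]
```

The same frame count means a different elapsed time on each system: 3000 frames is 50 seconds on NTSC but 60 seconds on PAL, which is why the calculation has to be done separately for each.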