bfollett

Bitmap mode.


Quick observation. You left the variable CX on line 100 as 320 instead of changing it to 256.

 

Bob

Remember that Missing Link gives 240 pixel columns, not 256.


Remember that Missing Link gives 240 pixel columns, not 256.

Then you also have to adjust line 101 so it uses .75 instead of .8


 

I will try to do an all-integer version after I first try the reverse-indexed-loops version (described somewhere above by @senior_falcon and @Tursi). I wanted to first get a baseline. It is gratifying that fbForth 2.0 is faster than TI Basic. Assuming that Classic99 and the real iron run at the same speed, TI Basic took 2.5X longer to run the program than fbForth 2.0, per Kevan's run reported in post #127.

 

...lee

 

That's the optimized, Missing Link version. You're way more than 2.5 times ahead. :)


After reading some of The Missing Link docs, I think the "CALL LINK("PD")" is only required once and can be done outside the loop.
Dump line 101 from the previous code I posted and use these lines.
I can't guarantee there are no typos and you may have to reduce SX further but I think it should work.
Since I don't know enough about the TI to really test it I'll leave that to someone else.
About the only thing I can think of that you'd need is setting colors with CALL LINK("COLOR",FG,BG) or CALL LINK("PENHUE",FG,BG) on line 130.

100 SX=100::SY=56::SZ=64::CX=240::CY=192

130 CALL LINK("CLEAR")::CALL LINK("PD")

240     IF RR(X1)>Y1 THEN RR(X1)=Y1:: CALL LINK("PIXEL",Y1,X1)

260     IF RR(X1)>Y1 THEN RR(X1)=Y1:: CALL LINK("PIXEL",Y1,X1)


There are certain default conditions set when a program runs under The Missing Link:

The window is set to be the full screen

The pen is set to PD

Penhue is set to black on transparent (2,1)

The screen is set to Cyan 8

The screen is cleared

All sprites are stopped

Character size is set to 6x8 (same size as text mode in the TI)

 

So there is no need for PD or setting the colors unless you want to change them. This stuff should be in the manual but somehow got overlooked.

Edited by senior_falcon


...

 

So there is no need for PD or setting the colors unless you want to change them. This stuff should be in the manual but somehow got overlooked.

Overlooked? I guess that's one way to put it. I just didn't read it all.



So there is no need for PD or setting the colors unless you want to change them. This stuff should be in the manual but somehow got overlooked.

BTW, since PD would only be executed once where I moved it, you probably couldn't even tell the difference in run times.

This is probably going to be running over 20 minutes.


So there is no need for PD or setting the colors unless you want to change them. This stuff should be in the manual but somehow got overlooked.

Overlooked? I guess that's one way to put it. I just didn't read it all.

Nothing you overlooked. What I meant to say was that I should have included this information in the manual, but somehow it did not get put in there.


The DIM statement doesn't seem to accept a variable as a parameter so make the following change and it runs

110 DIM RR(256)


Edited by JamesD


This is the working code. It requires Extended BASIC and The Missing Link.
The scale really needs to be adjusted to make the image look right.

100 SX=100::SY=56::SZ=64::CX=240::CY=192
110 DIM RR(256)
120 FOR I=0 TO CX::RR(I)=CY::NEXT I
130 CALL LINK("CLEAR")
140 CX=CX*0.5::CY=CY*0.46875::FX=SX/64::FZ=SZ/64
150 XF=4.71238905/SX
160 FOR ZI=64 TO -64 STEP -1
170   ZT=ZI*FX::ZS=ZT*ZT
180   XL=INT(SQR(SX*SX-ZS)+0.5)
190   ZX=ZI*FZ+CX::ZY=CY+ZI*FZ
200   FOR XI=0 TO XL
210     XT=SQR(XI*XI+ZS)*XF
220     YY=(SIN(XT)+SIN(XT*3)*0.4)*SY
230     X1=XI+ZX::Y1=ZY-YY
240     IF RR(X1)>Y1 THEN RR(X1)=Y1:: CALL LINK("PIXEL",Y1,X1)
250     X1=ZX-XI
260     IF RR(X1)>Y1 THEN RR(X1)=Y1:: CALL LINK("PIXEL",Y1,X1)
270   NEXT XI::NEXT ZI
280 GOTO 280 
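The hidden-line test on lines 240 and 260 of the listing above can be sketched outside BASIC. This is only an illustrative Python sketch, not the posted program: the helper names `make_horizon` and `plot_if_visible` are hypothetical, and rounding the float coordinates to pick a column is an assumption about how the array index behaves.

```python
# Hypothetical sketch of the per-column "horizon" test from lines 240 and 260.
# rr[x] holds the smallest row (highest point on screen) plotted so far in
# column x, so anything at or below it is hidden by earlier pixels.

def make_horizon(columns, bottom_row):
    """Start every column at the bottom of the screen, like line 120."""
    return [bottom_row] * (columns + 1)


def plot_if_visible(rr, x, y, pixels):
    """Plot (x, y) only if it lies above everything already drawn in column x.

    Returns True when the pixel was plotted.
    """
    col = round(x)  # assumption: the float coordinate is rounded to a column
    if rr[col] > y:
        rr[col] = y
        pixels.append((col, round(y)))
        return True
    return False


pixels = []
rr = make_horizon(240, 192)
print(plot_if_visible(rr, 10, 100, pixels))  # first point in the column
print(plot_if_visible(rr, 10, 120, pixels))  # lower on screen than row 100
print(plot_if_visible(rr, 10, 80, pixels))   # higher on screen than row 100
```

Because the rows are drawn front to back (ZI runs from 64 down to -64), a single array of per-column minimums is enough to hide the far side of the hat.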

Without The Missing Link it would have been really ugly and even slower... and trust me it's slow enough.
*edit*
I restarted it at normal speed and it's not even half way done at the 20 minute mark.
I'm guessing it would take at least 50 minutes to complete but I'm not waiting around to find out.

Edited by JamesD

Not that it really matters but you Dimmed RR(256). It only needs to be 240. I'd like the total run time if you ever get it.

 

Bob


Not that it really matters but you Dimmed RR(256). It only needs to be 240. I'd like the total run time if you ever get it.

 

Bob

I know. The array was the first thing that broke when I was working on the Apple II version and I only wanted to change it once. It doesn't do anything else so who cares? If anyone figures out how to get it to work with a full width screen it will still work just by changing SX and CX.

 

If you want the total run time you could always run it yourself.

Load Classic99, enable Extended BASIC, put The Missing Link disk image in drive 0, type RUN "DSK0.TML", copy the final code from above, paste it into Extended BASIC and type RUN.

I think it will take around an hour to run but it has to set fewer pixels after the half way point than before it so it might be less.

To be honest, if Classic99 hadn't supported pasting code I probably wouldn't have run it.

 



So there is no need for PD or setting the colors unless you want to change them. This stuff should be in the manual but somehow got overlooked.

Can I use CALL LOAD to load The Missing Link from within a program?


Can I use CALL LOAD to load The Missing Link from within a program?

No, that would not work; the memory is configured differently from XB. Even if you could, it would take several minutes to load. You can have the Missing Link loader start up TML and then run an XB program that uses the graphics routines. See the manual for information on how to do this.


No, that would not work; the memory is configured differently from XB. Even if you could, it would take several minutes to load. You can have the Missing Link loader start up TML and then run an XB program that uses the graphics routines. See the manual for information on how to do this.

I figured as much but I thought I'd ask.


Spending too much time on this... but! I tried the Missing Link version and was pretty impressed by the optimizations. I ran it in overdrive, but if the earlier ratios I was using were right, the run time would have been about 38 minutes.

 

That made me revisit my assembly language version again. Just implementing the new half-draw loop allowed me to remove one signed multiply, not to mention run only half the loops, and it took about 33s to run.

 

I decided to go for broke. I had the plot pixel function inline the VDP access, and adjusted the X factor (XF) to scale for my sine table instead of for radians. This allowed me to remove the SINE function and an extra multiply, and do the sine table lookup inline. That brought the runtime to 31 seconds. I also added some rounding, which slightly improved some of the curves and did not substantially impact runtime.
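The inline sine lookup described here can be illustrated with a short Python sketch. It is not the posted assembly: the table below is generated with an assumed rule (scale by 128, clamp to a signed byte) that resembles the posted SINTAB, and `sine_lookup` is a hypothetical name. The index arithmetic mirrors the listing's SRL 6 / INC / ANDI >01FE sequence, which drops the fraction, rounds, and wraps to the 256-entry table in one mask.

```python
import math

FRAC_BITS = 7  # 9.7 fixed point: 7 fractional bits

# 256 entries covering one full circle. Scaling by 128 and clamping to the
# signed byte range is an assumption chosen to resemble the posted table.
SINTAB = [max(-128, min(127, round(128 * math.sin(2 * math.pi * i / 256))))
          for i in range(256)]


def sine_lookup(angle_fixed):
    """Return the table sine for an angle in table units (9.7 fixed point).

    Shifting right by 6 instead of 7 leaves one extra fraction bit; adding 1
    rounds to nearest, and masking with 0x01FE both drops that bit and wraps
    the angle to 0..255, giving the byte offset of a 16-bit table entry.
    """
    byte_offset = ((angle_fixed >> (FRAC_BITS - 1)) + 1) & 0x01FE
    return SINTAB[byte_offset >> 1]


quarter_turn = 64 << FRAC_BITS  # 64 table units = 90 degrees
print(sine_lookup(0), sine_lookup(quarter_turn))
```

With the angle already expressed in table units, wrapping past a full circle costs nothing: the mask takes care of it.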

 

Finally, I moved some of the innermost functions into scratchpad RAM. The square root function loops and is called for every pixel, drawn or not, so it was a good candidate. I moved the plot function and the code immediately around it. There were still two signed multiplies, so I moved that down into scratchpad as well. This brought the runtime to 26 seconds. Of course, the whole loop wouldn't fit, but I think the time spent shifting and multiplying exceeds the time spent fetching instructions anyway.
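The square root routine that runs for every pixel is a bit-by-bit integer method, which is why it loops. The sketch below ports the SQRTX loop from the listing to Python as an illustration only: `sqrt_9_7` is a hypothetical name, and Python's unbounded integers mean it does not model any 16-bit register overflow.

```python
import math


def sqrt_9_7(n):
    """Square root of a 16-bit unsigned integer, returned as 9.7 fixed point.

    The input is treated as having 8 zero fraction bits, so 16 iterations
    produce an x.8 root, which is then shifted down to x.7.
    """
    root = 0
    rem_hi = 0
    rem_lo = n & 0xFFFF
    for _ in range(16):
        # Bring the next two bits of the input into the remainder.
        rem_hi = (rem_hi << 2) | (rem_lo >> 14)
        rem_lo = (rem_lo << 2) & 0xFFFF
        root <<= 1
        test_div = (root << 1) + 1
        if rem_hi >= test_div:
            rem_hi -= test_div
            root += 1
    return root >> 1  # x.8 down to x.7


# 20736 is 144 squared, the largest radicand used for XL in the listing.
print(sqrt_9_7(20736) / 128)
# Cross-check against the standard library over the range the program uses.
print(all(sqrt_9_7(n) == math.isqrt(n << 16) >> 1 for n in range(0, 41473, 97)))
```

Sixteen passes through that loop per call is a fixed cost, which fits the observation that this routine was a good candidate for the fastest memory.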

 

I also fixed the "dent" in the hat. The new code dented worse, so I looked at it, and it was pretty simple in the end. By setting the maximum row position to 193 at the start, I was allowing a Y coordinate of 192 to be plotted. The screen only goes from 0-191, so the first few pixels at 192 were scribbling into the character table and changing some of the characters (before they had any pattern!) Capping the rows to 192 instead of 193 fixed that problem. ;)
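The arithmetic behind the dent can be checked with a small Python sketch of the address calculation used by the plot routine in the listing. The function names are hypothetical, and the table addresses are taken from the comments in the listing (pattern table at >0000, screen image table at >1800).

```python
PATTERN_TABLE_SIZE = 0x1800  # 192 rows x 32 bytes; the image table follows it

# One mask per pixel within a byte, leftmost pixel first (the BITS table).
BITS = [0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01]


def pattern_offset(x, y):
    """Byte offset of pixel (x, y) in the bitmap pattern table.

    Same steps as the listing: (y << 5 | y) & 0xFF07 gives
    (y // 8) * 256 + (y % 8), then x rounded down to a multiple of 8
    selects the character cell within the row.
    """
    t1 = ((y << 5) | y) & 0xFF07
    return t1 + x - (x & 7)


def pixel_mask(x):
    """Bit to OR into the pattern byte for column x."""
    return BITS[x & 7]


print(hex(pattern_offset(255, 191)))  # last visible pixel
print(hex(pattern_offset(0, 192)))    # one row too far
```

Row 191 ends at the last byte of the pattern table, while row 192 starts at the first byte after it, so a cap of 193 lets writes spill into the next table.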

 

So, maybe final build, assembly runtime of 26 seconds?

 

 

 


    DEF START

* THIS VERSION USES ALL THE OPTIMIZATIONS TO DATE.
* PLUS SCRATCHPAD UTILITIES AND INLINE SINE LOOKUP
* THANKS TO SOMETIMES99ER FOR WORKING OUT THE DATA!

* relocated to scratchpad - addresses worked
* out by hand! Use caution when modifying them!
SQRT EQU >8324
PLOT EQU >8350
SMULT EQU >838E
DRAWPX EQU >83A8
*FREE EQU >83F8 - only 8 bytes of scratchpad free!

* LABELS FOR SAVE UTILITY
SLOAD
SFIRST
    B @START

* array for highest pixel
ROWS
    BSS 256

* backup for scratchpad, we're going to just
* blindly decimate it. So we need to restore
* it before we let the console interrupt run
* at the end of execution. I could be picky,
* selective, or careful, but this works too. ;)
SCRATCH
    BSS 224

* bits for pixel
BITS
    DATA >8040,>2010,>0804,>0201

* SINE TABLE - 9.7 fixed point entries, 256 total
SINTAB
    DATA 0,3,6,9,13,16,19,22
    DATA 25,28,31,34,37,40,43,46
    DATA 49,52,55,58,60,63,66,68
    DATA 71,74,76,79,81,84,86,88
    DATA 91,93,95,97,99,101,103,105
    DATA 106,108,110,111,113,114,116,117
    DATA 118,119,121,122,122,123,124,125
    DATA 126,126,127,127,127,127,127,127
    DATA 127,127,127,127,127,127,127,126
    DATA 126,125,124,123,122,122,121,119
    DATA 118,117,116,114,113,111,110,108
    DATA 106,105,103,101,99,97,95,93
    DATA 91,88,86,84,81,79,76,74
    DATA 71,68,66,63,60,58,55,52
    DATA 49,46,43,40,37,34,31,28
    DATA 25,22,19,16,13,9,6,3
    DATA 0,-3,-6,-9,-13,-16,-19,-22
    DATA -25,-28,-31,-34,-37,-40,-43,-46
    DATA -49,-52,-55,-58,-60,-63,-66,-68
    DATA -71,-74,-76,-79,-81,-84,-86,-88
    DATA -91,-93,-95,-97,-99,-101,-103,-105
    DATA -106,-108,-110,-111,-113,-114,-116,-117
    DATA -118,-119,-121,-122,-122,-123,-124,-125
    DATA -126,-126,-127,-127,-127,-128,-128,-128
    DATA -128,-128,-128,-128,-127,-127,-127,-126
    DATA -126,-125,-124,-123,-122,-122,-121,-119
    DATA -118,-117,-116,-114,-113,-111,-110,-108
    DATA -106,-105,-103,-101,-99,-97,-95,-93
    DATA -91,-88,-86,-84,-81,-79,-76,-74
    DATA -71,-68,-66,-63,-60,-58,-55,-52
    DATA -49,-46,-43,-40,-37,-34,-31,-28
    DATA -25,-22,-19,-16,-13,-9,-6,-3

* note: NOT in memory, so don't use @XF
* 9.7 signed fixed point variables in registers
XF EQU 15
XT EQU 14
YY EQU 13

* INTEGER VALUES
ZS EQU 12
* RET EQU 11 - for BL
ZI EQU 10
XL EQU 9
XI EQU 8

* 32-bit temp, uses 6 and 7
T32B EQU 7
T32 EQU 6

* Temp vars
T16 EQU 5
T1 EQU 4
T2 EQU 3
NEGFL EQU 2

* PIXEL VARIABLES
X1 EQU 1
Y1 EQU 0

* out of registers, use RAM (these ARE @ZY)
ZX EQU >8320
ZY EQU >8322

* return save
SAVE
    BSS 2

* registers for bitmap (and 5B00 is the write address of the sprite table)
* background is transparent (the only color never redefined)
* PDT - >0000
* SIT - >1800
* SDT - >1800
* CT - >2000
* SAL - >1B00
BMREGS DATA >81E0,>8002,>8206,>83ff,>8403,>8536,>8603,>8700,>5B00,>0000

START
    LWPI >8300

* FILL THE ROWS ARRAY (256 ENTRIES) WITH 192
    LI R0,ROWS
    LI R1,192*256
    LI R2,256
LP1
    MOVB R1,*R0+
    DEC R2
    JNE LP1

* backup scratchpad
    LI R0,>8320 * skip our WP
    LI R1,SCRATCH
    LI R2,56 * 4 bytes at a time
LS1
    MOV *R0+,*R1+
    MOV *R0+,*R1+
    DEC R2
    JNE LS1

* now copy utilities in
    LI R0,SQRTX * first function
    LI R1,>8324 * first free word
LC1
    MOV *R0+,*R1+ * copy one word
    CI R0,SLAST * check for done (thus no unroll)
    JL LC1

* 140 GRAPHICS 8+16:SETCOLOR 2,0,0
    BL @BITMAP
* erase the pattern table
    CLR R0
    CLR R1
    LI R2,>1800
    BL @VDPFILL
* set the color table to white on black
    LI R0,>2000
    LI R1,>F100
    LI R2,>1800
    BL @VDPFILL

* 130 XP=144:XR=4.71238905:XF=XR/XP
* I'm not sure why they spelled it this way...
* goal of the above math is to convert the Y axis
* of 192 pixels into one circle in Radians (2PI).
* It would have been more clear if XP was 192
* and XR was 6.2831854, these values seem
* obfuscated. Anyway, that's what it is.
* To avoid conversion to radians then back to
* my sine table units, we can just adjust the
* scale factor. For me, 192 needs to equal
* 256, so my ratio is 256/192=1.333333
* which is >00A9 in fixed point (169, losing the .3333)
* As an added bonus, we can clip to the right
* range by simply masking now.
    LI XF,>00A9

* 140 FOR ZI=64 TO -64 STEP -1
* Making this an integer!
    LI ZI,64
L160

* 150 ZT=ZI*2.25:ZS=ZT*ZT
* We have to do two multiplies here, so we're going
* to end up in a 32-bit value temporarily anyway. That
* actually makes life a little easier.
* 2.25 * 128 = 288, WHICH IS >120
* note: ZT not used :)
    LI T32,>0120
    MOV ZI,T1
    ABS T1 * this is okay, because we are going to square it anyway
    MPY T1,T32

* now T32 is 32-bits wide, and contains a 25.7 bit number.
* ZI(16.0) times T32 (9.7) yields 25.7 bits.
* So since we want a 9.7, we just have to take the least
* significant word, no shifting needed! Of course we ignore
* the possibility of overflow, but the largest value should
* be 64*2.25 = 144, which fits in 9 bits.
* now just put them into place, and multiply again
* we know from analysis that the 'sign bit' shouldn't be set here
    MOV T32B,T32
    MOV T32B,T1
    MPY T1,T32

* So, T32 now contains a 32-bit 18.14 number, but for simplicity we
* are going to move that down into ZS as a 16-bit unsigned integer
* so we just need to extract 16 bits of integer, as we don't expect overflow
* and don't want fraction. Of course, those 16 bits are split across the
* two words...
    MOV T32B,ZS * least significant - we want two bits from this
    SRL ZS,14 * toss the rest
    SLA T32,2 * prepare the most significant
    SOC T32,ZS * and merge it in

* 160 XL=INT(SQR(20736-ZS)+0.5)
* ZS is a normal int, so this shouldn't be too bad to start
* the result is also an int, and the +0.5 is just for rounding
* our sqrt will return one of our fractional values, as noted,
* to be consistent.
    LI T1,20736
    S ZS,T1
    BL @SQRT * T1 IN AS positive INT, T1 OUT AS 9.7
    SRL T1,7 * make an integer for counting
    MOV T1,XL * and store it

* 170 ZX=ZI+160:ZY=90+ZI
    MOV ZI,T1
    AI T1,127 * smaller screen
    MOV T1,@ZX
    MOV ZI,T1
    AI T1,90
    MOV T1,@ZY

* 180 FOR XI=0 TO XL
* even this loop always executes once (0 to 0), so
* I can put the condition at the bottom.
    CLR XI
L190

* 190 XT=SQR(XI*XI+ZS)*XF
* pretty similar to above, again we are squaring to get positive
* so that makes the unsigned MPY easier to deal with
* XT needs to be integer now, not 9.7
    MOV XI,T32 * Integer (always positive now)
    MPY XI,T32 * XI*XI - 16.0 * 16.0 = 32.0, so just take the LSW

    MOV T32B,T1 * least significant - still 16.0
    A ZS,T1 * add ZS (we're an integer so can just add - max is 41472, so unsigned!)
    BL @SQRT * T1 in as positive int, T1 OUT as 9.7
    MOV XF,T32 * prepare to mult - we know these values are positive
    MPY T1,T32 * do it - 9.7*9.7 = 18.14

* it matters to keep the fraction for the XT*3 below, so, keep it
    SRL T32B,7 * make room, throwing away 7 fractional bits
    SLA T32,9 * get the more significant bits into the right place
    SOC T32,T32B * merge the two 16-bit words

    MOV T32B,XT

* 200 YY=(SIN(XT)+SIN(XT*3)*0.4)*55 -- was 55, needed to adjust for rounding errors
* order of op, we do SIN(XT*3)*0.4 first...
    MOV XT,T1 * prepare for second sine
    A XT,T1 * simpler than MPY by 3, no need to shift result
    A XT,T1
    SRL T1,6 * shift out fraction, but multiply by 2 (we'll trim the extra bit below)
    INC T1 * rounding
    ANDI T1,>01FE * mask for lookup
    MOV @SINTAB(T1),T2
    LI T1,>0033 * roughly 0.4 (actually 0.398)
    BL @SMULT * Signed multiply, result in T32B
    MOV T32B,T16

    SRL XT,6 * shift out fraction, but multiply by 2 (we'll trim the extra bit below)
    INC XT * rounding
    ANDI XT,>01FE * mask for lookup (We don't use XT again)
    MOV @SINTAB(XT),T2
    A T16,T2
    LI T1,>1B80 * 55 x less than 1 will be less than 55, so it fits
    BL @SMULT * Signed multiply, result in T32B

* We can just make YY an integer right here
    SRA T32B,7 * discard fraction (sign extend!)
    MOV T32B,YY

* now go plot the two pixels
    BL @DRAWPX

* 250 NEXT XI
    INC XI
    C XI,XL * I know it's always positive now,
    JLE L190 * so I can use an unsigned test

* 255 NEXT ZI
L255
    DEC ZI
    CI ZI,-65
    JGT L160

* 260 GOTO 260
* restore scratchpad before enabling interrupts
    LI R0,SCRATCH
    LI R1,>8320 * skip our WP
    LI R2,56 * 4 bytes at a time
LS2
    MOV *R0+,*R1+
    MOV *R0+,*R1+
    DEC R2
    JNE LS2

WAIT
    LIMI 2
    LIMI 0
    JMP WAIT

* VDP access

* Write single byte to R0 from MSB R1
* Destroys R0 (actually just ORs it)
VSBW
    ORI R0,>4000
    SWPB R0
    MOVB R0,@>8C02
    SWPB R0
    MOVB R0,@>8C02
    MOVB R1,@>8C00
    B *R11

* Write the MSB of R1 to VDP R0, R2 times (fill)
* Destroys R0,R1,R2
VDPFILL
    ORI R0,>4000
    SWPB R0
    MOVB R0,@>8C02
    SWPB R0
    MOVB R0,@>8C02
VMBWLP
    MOVB R1,@>8C00
    DEC R2
    JNE VMBWLP
    B *R11

* Write address or register
VDPWA
    SWPB R0
    MOVB R0,@>8C02
    SWPB R0
    MOVB R0,@>8C02
    B *R11

* load regs list to VDP address, end on >0000 and write >D0 (for sprites)
* address of table in R1 (destroyed)
LOADRG
LOADLP
    MOV *R1+,R0
    JEQ LDRDN
    SWPB R0
    MOVB R0,@>8C02
    SWPB R0
    MOVB R0,@>8C02
    JMP LOADLP
LDRDN
    LI R1,>D000
    MOVB R1,@>8C00
    B *R11

* Setup for normal bitmap mode
BITMAP
    MOV R11,@SAVE

* set display and disable sprites
    LI R1,BMREGS
    BL @LOADRG

* set up SIT - We load the standard 0-255, 3 times
    LI R0,>5800
    BL @VDPWA
    CLR R2
NQ#
    CLR R1
LP#
    MOVB R1,@>8C00
    AI R1,>0100
    CI R1,>0000
    JNE LP#
    INC R2
    CI R2,3
    JNE NQ#

    MOV @SAVE,R11
    B *R11

* use this and a listing to get scratchpad addresses for the fctns
* AORG >8324

* IN AND OUT IN T1
* T1 in = integer
* T1 out = 9.7 signed fixed point
* Uses T2,X1,Y1,T32
* http://samples.sainsburysebooks.co.uk/9781483296692_sample_809121.pdf
* modified a bit - we pretend the input is a 16.8 value (the
* entire fractional part will be 0), that lets us get out an
* 8.8 value, because the algorithm needs an even number of fractional
* bits. Then we just shift once to get .7
SQRTX
    CLR X1 root
    CLR T2 remHi (t1 is remLo)
    LI Y1,16 count = ((WORD/2-1)+(FRACBITS>>1)) -> 11+4, +1 for loop

SQRT1
    SLA T2,2 remHi = (remHi << 2) | (remLo >> 14);
    MOV T1,T32
    SRL T32,14
    SOC T32,T2
    SLA T1,2 remLo <<= 2;
    SLA X1,1 root <<= 1;
    MOV X1,T32 testDiv = (root << 1) + 1;
    SLA T32,1
    INC T32
    C T2,T32 if (remHi >= testDiv) {
    JL SQRT2
    S T32,T2 remHi -= testDiv;
    INC X1 root += 1;
SQRT2
    DEC Y1 while (--count != 0);
    JNE SQRT1

    MOV X1,T1 return( root);
    SRL T1,1 Get it down to x.7 fixed point
    B *R11

* INPUT X1,Y1 - kills T1,T2 as well
PLOTX
* use the E/A routine for address
    MOV Y1,T1 R1 is the Y value.
    SLA T1,5
    SOC Y1,T1
    ANDI T1,>FF07
    MOV X1,T2 R0 is the X value.
    ANDI T2,7
    A X1,T1 T1 is the byte offset.
    S T2,T1 T2 is the bit offset.

* inline VDP!
    SWPB T1 set up read address
    MOVB T1,@>8C02
    SWPB T1
    MOVB T1,@>8C02
    ORI T1,>4000 we need this later, and provides a VDP delay
    MOVB @>8800,R1 read the byte from VDP
    SWPB T1 set up write address
    MOVB T1,@>8C02
    SWPB T1
    MOVB T1,@>8C02
    SOCB @BITS(T2),R1 or the bit and provide VDP delay
    MOVB R1,@>8C00 write the byte back

    B *R11

* signed fixed point multiply - T1 * T2 = T32B
* ONLY T2 is allowed to be negative!! Result
* will be negative if T2 was.
* Uses T1,T2,NEGFL,T32,T32B
SMULTX
    CLR NEGFL * temp flag for negative
    MOV T2,T32 * prepare for mult and test
    JGT NOTNEG1
    SETO NEGFL * it is negative, so remember and make positive
    ABS T32
NOTNEG1
    MPY T1,T32 * does the multiply - you know the drill, fix up number

    SRL T32B,7 * make room, throwing away 7 fractional bits
    SLA T32,9 * get the more significant bits into the right place
    SOC T32,T32B * merge the two 16-bit words

    MOV NEGFL,NEGFL * check if it should be negative
    JEQ NOTNEG2
    NEG T32B * yes, it should
NOTNEG2
    B *R11

DRAWXX
    MOV R11,@SAVE * need this to get back!

* 210 X1=XI*0.75+ZX:Y1=ZY-YY
* XI can never be negative now, so we can remove all that code
    MOV XI,X1 * integer
    LI T32,>0060 * 0.75
    MPY X1,T32 * now 25.7, so just take the LSW (unsigned mult!)
    AI T32B,>40 * 0.5 in x.7, for rounding
    SRA T32B,7 * make integer for the plot function (sign extend!)
    MOV T32B,X1 * get the integer
    A @ZX,X1 * add (integer) ZX

    MOV @ZY,Y1 * get ZY (integer)
    S YY,Y1 * subtract YY (integer)

* 220 IF RR(X1)>Y1 THEN RR(X1)=Y1:PLOT X1,Y1
    SWPB Y1 * stupid Big Endian....
    MOV Y1,T16 * plot kills X1,Y1, and we need Y1 again
    CB @ROWS(X1),Y1
    JLE L230
    MOVB Y1,@ROWS(X1)
    SWPB Y1
* NOTE: PLOT EXPECTS THE PIXEL IN REGISTERS X1,Y1
    BL @PLOT

* 230 X1=ZX-XI*0.75
L230
    MOV @ZX,X1
    S T32B,X1 * use the scaled X1 on both sides of the origin

* 240 IF RR(X1)>Y1 THEN RR(X1)=Y1:PLOT X1,Y1
    MOV T16,Y1 * get it back, still swapped
    CB @ROWS(X1),Y1
    JLE L250
    MOVB Y1,@ROWS(X1)
    SWPB Y1
* NOTE: PLOT EXPECTS THE PIXEL IN REGISTERS X1,Y1
    BL @PLOT

* Return to caller
L250
    MOV @SAVE,R11
    B *R11

SLAST
    END
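The signed 9.7 multiply (SMULTX in the listing above) can be sketched in Python to show what the shift-and-merge steps compute. This is an illustration under stated assumptions, not the posted routine: `smult` is a hypothetical name, and the sketch assumes, as the listing's comments do, that only the second operand may be negative.

```python
FRAC_BITS = 7  # 9.7 fixed point


def smult(t1, t2):
    """Multiply two 9.7 fixed-point values; only t2 may be negative.

    The listing multiplies the magnitudes, keeps 16 bits of the 18.14
    product starting at bit 7, and negates the result afterwards, so the
    fraction is truncated toward zero for negative values.
    """
    negative = t2 <= 0  # the listing tests with JGT, so zero also lands here
    product = t1 * abs(t2)
    result = (product >> FRAC_BITS) & 0xFFFF
    return -result if negative else result


POINT_FOUR = 0x0033   # roughly 0.4, as loaded in the listing
FIFTY_FIVE = 0x1B80   # 55.0 in 9.7

print(smult(POINT_FOUR, 127))        # 0.4 * (table value near 1.0)
print(smult(FIFTY_FIVE, -64) / 128)  # 55 * -0.5
```

Working on the magnitude and fixing the sign afterwards sidesteps the fact that MPY is an unsigned multiply.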

 

 

 

 


Edited by Tursi

Spending too much time on this... but! I tried the Missing Link version and was pretty impressed by the optimizations. I ran it in overdrive, but if the earlier ratios I was using were right, the run time would have been about 38 minutes.

Unless something slowed down the emulator that's significantly less than my results.

It took just over 55 minutes for the image to be complete and the main loop wasn't done yet. I let it run while I watched a movie.


Spending too much time on this... but! I tried the Missing Link version and was pretty impressed by the optimizations. I ran it in overdrive, but if the earlier ratios I was using were right, the run time would have been about 38 minutes.

 

The disk has four files, HATASM, NEWHATASM, and then _S of each (source code.) Are these not EA3 files? How do I get them to run?


Unless something slowed down the emulator that's significantly less than my results.

It took just over 55 minutes for the image to be complete and the main loop wasn't done yet. I let it run while I watched a movie.

 

 

That's the difference between an estimate and reality. ;) Bob asked for a time and you said you didn't want to, so I did that. (Obviously, your time is authoritative and mine little more than a guess ;) ).

 

 

The disk has four files, HATASM, NEWHATASM, and then _S of each (source code.) Are these not EA3 files? How do I get them to run?

 

 

Yes: HATASM is the EA#3 file for the original version, NEWHATASM is the EA#3 file for the new version, and the _S are source. Program name is START.

It looks like TI99DIR didn't copy the files across correctly, they don't load here either. Grr. I probably set a bad option (they load in the editor okay, though...)

 

Fixed (and tested) version of the disk attached, I let Classic99 do the text-to-TI conversion this time. EA#3 "HATASM" or "NEWHATASM", then program name START.

 

HATASM.zip

Edited by Tursi

Spending too much time on this... but! I tried the Missing Link version and was pretty impressed by the optimizations. I ran it in overdrive, but if the earlier ratios I was using were right, the run time would have been about 38 minutes.

 

That made me revisit my assembly language version again. Just implementing the new half-draw loop allowed me to remove one signed multiply, not to mention run only half the loops, and it took about 33s to run.

...

 

I think I will jump right to porting this version to fbForth 2.0 when I get home. I probably won't use scratchpad RAM for code, however—too much store/restore time, I'm afraid.

 

...lee


Scratchpad wasn't really worth the time it took to code anyway, 5 seconds off of 30 isn't a huge win. ;) Removing the EQUs at the beginning, removing the scratchpad copy loops, and renaming the functions at the end to match what the EQUs were (ie: "PLOTX" becomes "PLOT") will make the code run without scratchpad (and the overhead of the branching is barely measurable). It occurred to me after the fact I should have saved a non-scratchpad version, but it was too late by then. ;)

