Reciprocating Bill

Community Reputation

17 Good

About Reciprocating Bill

  • Rank
    Space Invader

  1. The Myarc disk controller does essentially that. Call Dir(1) displays a catalog of dsk1 in TI-BASIC (and all the Extended BASICs). No need to OLD a program in - it is built into the DSR.
  2. Here's an obscure data point: FIB2 running on Wycove Forth 3.0 (an extended FigForth) on a 16-bit console. This version of Forth benefits significantly from the memory upgrade. Benchmark exactly as published: 1' 17". For me the relevant comparisons (of the feverish retrocomputing variety) are with contemporaries of the TI:
     - C64 Forth64: 3' 50"
     - C64 Durex Forth: 1' 57"
     - Apple II v 3.2: 3' 56"
     - Apple II GraForth: 2' 19"
     - Z80 4 MHz FigForth: 1' 19"
  3. There are workarounds, to be sure. I enjoy coding stuff in assembly like the Snowflake in my avatar, an instance that requires a lot of floating point, including sines and cosines. The avatar is four superimposed levels of the snowflake, ~2800 line segments. Calculating and drawing those coordinates using the FP routines in ROM takes 1' 43" on my 16-bit console. I also save the coordinates to an 11K array (I originally stored byte-sized coordinates into word-sized memory locations due to laziness and having ample RAM, hence the double-size array). "Playback" of the snowflake then takes 3.6 seconds. The last step was to SAVE a memory image of the array to disk (two files). Having done this once, a cut-down version of the program (all the calculations removed) LOADs the memory image back into the array, which takes about 2.5 seconds from the TIPI, then begins fast playback. Saving memory images of lookup tables for SIN and COS would be the more generalized next step.
  4. Could be. That's why I pose it as a question: "Would shipping a couple of BCD arguments and an opcode to the PI through the registers on the TIPI, then retrieving results, be equally time consuming?" We'd be transferring a handful of bytes each way. I think that would be worth testing. I doubt the TI would ever have to twiddle bits waiting for a response. My 2012 MacBook Pro running Chipmunk Basic calculates ~60 million sines and cosines per second. While the Broadcom in the PI is probably not quite that fast, whatever its speed (I'm assuming floating point instructions in hardware), it's going to finish a transcendental calculation and have time for a nap before the TI executes a single assembly instruction. The fastest software transcendentals I've seen on the TI are found in Cortex Basic, which is often 3-4x faster than Extended Basic/fbForth with this sort of maths.
  5. This would be worth testing. Check my reasoning: Extended Basic performs 100 sines and cosines in a For-Next loop in about 15-16 seconds. fbForth (to my surprise) isn't significantly faster performing the same calculations, using the Geneve-derived functions. That's around 13-14 FP transcendentals per second - a LOT of assembly instructions per calculation. Would shipping a couple of BCD arguments and an opcode to the PI through the registers on the TIPI, then retrieving results, be equally time consuming? From the perspective of the TI, the PI and Python calculations, exclusive of this overhead, would be next to instantaneous.
  6. Hello all - Just wondering: has anyone utilized the TIPI as a FP math coprocessor for the TI? Seems like a natural fit, passing arguments and results back and forth by means of the messaging interface.
  7. The letter doesn't say. But given the apparent expense of CPU time, I think that assumption is safe. Letter vis queens.pdf
  8. Putting this in perspective, I found online a letter written in 1973 (by Edward Reingold at the University of Illinois at Urbana) giving a first report of the number of solutions for the 14x14 and 15x15 N-queens puzzles. The 15x15 puzzle was solved on an IBM 360/75 in 160 minutes (he remarks, "I have no idea where the student got the money"). NASA used four of these during Apollo. I wouldn't attempt that on the TI, but running an interpreted BASIC on my 2012 MacBook Pro (Chipmunk BASIC, on which all variables are floating point) I get all 2,279,184 15x15 solutions in 35 minutes. The point being: sometimes we (or at least I) don't appreciate what we have, computationally, these days.
  9. I eventually opted to upgrade a second known good board (tested for several days before upgrading) and find it 100% stable and reliable. Looking for another pure assembler project as a benchmark, I landed on the N-Queens puzzle (how many ways are there to place N queens on an NxN chessboard with no queens attacking one another?). It displays a board, uses sprites to represent queens arrayed in successful positions and displays a count of the solutions found, so VDP access is very moderate. It finds and displays 92 correct 8x8 positions in 1-2 seconds, 724 10x10 positions in 26.6 seconds and 14,200 12x12 positions in 11 minutes, 16 seconds (combinatorial explosion and all). A stock console runs the 10x10 routine (registers in scratchpad) in 37.5 seconds. So the stock console runs at 71% of the speed of the upgraded console - or, put another way, the upgraded console is ~1.4 times faster than stock in this instance.
  10. Here are more accurate numbers for the BYTE Sieve in 9900 assembly. The previous stock numbers did not reflect my most efficient 9900 code, omitting one tweak. The following reflect identical code on a 16-bit console, on a stock console using scratchpad for registers (expansion RAM for code and the ~8190 byte array), and on a stock console using expansion RAM for everything.
      - 16-bit console: 6.41"
      - Stock, registers in PAD: 9.19" (70% of 16-bit)
      - Stock, nothing in PAD: 12.00" (53% of 16-bit)
      Just for context, the original BYTE article cites assembly speeds of 6.8" for a Z80 at 4 MHz (presumably), 13.9" for a 6502 at ~1 MHz (Ohio Superboard), 4" for an 8088 at 5 MHz, 1.9" for an 8086 at 8 MHz, and 0.49" for an 8 MHz 68000. Time (and memory address space) was already marching on.
  11. I get very similar results given those conditions with assembly versions of the sieve. Illustrates the invaluable contribution of the scratch pad to performance on the stock design.
  12. I do plan to reheat suspicious joints (I think that's legal now). Prior to surgery I powered up the board briefly to assure myself that it fired up but didn't run it for any period of time - there may have been problems lurking that emerge over time. Also, as I have seen on an unmodified console, the screen brightness sometimes flickers slightly with this logic board with a period of about one second, a problem that fluctuates slightly over time (annoying but not fatal). That occurs when the computer is stable and does not appear to be related to the upgrade. I've run the upgraded console for hours with either a 32K RAM burn-in or some of my own assembly programs without problems, so the RAM itself seems solid. In fact, that is when the console is most stable. The haywire behavior comes on suddenly and then progressively, sometimes on the startup screen or when running out of a module (e.g. MunchMan from FinalGrom). It appears that VDP ram is being written to as the pattern table becomes progressively more corrupted, while the underlying program keeps running. Sometimes switching contexts (e.g. going into TI Basic and back) restores the system to normalcy. A couple times the video has lost its sync altogether. At any rate, I'm thinking I'll find a beat up console with a known solid logic board and re-do the operation, as I am feeling more confident in my ability to pull it off, and probably more quickly. I don't want to modify my main console, as it is too nice to risk.
  13. I'm new here. What a great forum. Coincidentally, I recently (I mean literally yesterday) successfully completed the 16-bit upgrade described here: http://www.mainbyte.com/ti99/16bit32k/32kconsole.html Took about four hours. The shield is not a factor. I've found that the speed increase varies depending upon what my code is doing. And the upgraded computer occasionally does some flakey things. Here are the results of some tests I ran. First number is 16-bit console, second is stock, and third is stock performance as a percentage of 16-bit performance. All these programs were already optimized as much as I was able, e.g. using registers in scratchpad, along with small snippets of code in PAD. These are stopwatch timings.
      - Assembly program calculates and displays 3500 points of the Lorenz attractor with heavy use of console FP routines: 139" 147" 94% - stock runs at 94% of the speed of the upgrade. Not a big gain because of heavy use of ROM routines.
      - Same assembly program replays above 3500 points 15 times from RAM storage: 12.9" 14.6" 83% - better, the bottleneck now being VDP reads and writes.
      - Assembly program calculates and displays 4 levels of the Mandelbrot Snowflake. Again many calls to console ROM for FP calculations: 105" 111" 94%
      - Assembly program replays above four levels of Snowflake from RAM storage (five times): 18.1" 22.5" 80% - I rolled my own Bresenham.
      - Assembly program displays first 6 levels of Hilbert Curve on bitmap (three times): 31.9" 39.8" 80% - same Bresenham.
      Some fbForth and TurboForth results - no large gains expected, given that these Forths are running from cartridge ROM:
      - fbForth loads four screens from cold start (on real iron with floppy storage): 22.7" 23.15" 96%
      - fbForth loads same four screens a second time, so no disk access: 14.2" 14.6" 97%
      - fbForth runs Sieve of Eratosthenes benchmark as presented in January 1983 Byte Magazine (a very dog-eared copy of which I still own), 1 iteration rather than 10: 15.8" 17.6" 89%
      - TurboForth Sieve of Eratosthenes (1 iteration): 10.5" 11.9" 88% - again, this makes sense.
      - Wycove Forth Sieve of Eratosthenes (1 iteration): 11.2" 15" 75% - the biggest Forth beneficiary. Makes sense - Wycove Forth runs entirely out of RAM.
      - My assembly version of the Sieve of Eratosthenes, registers in scratchpad (10 iterations): 6.37" 10" 67%
      - Sieve on stock console with registers and central loop of code squeezed into scratchpad: 7.1"
      This is not overwhelming, but about what I expected. Due to the occasional flakeyness of the upgraded board (occasional garbage written to the home screen, even with no module in place), as well as my not having a satisfactory second keyboard for a second console, I've reverted for now to my nearly pristine and very stable stock console. There is something to be said for stable versus modestly faster but flakey.
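For readers who don't have the magazine to hand, the Sieve benchmark timed in posts 10 to 13 can be sketched roughly as follows. This is a minimal Python rendering of the loop as I understand it from the BYTE listing, using the ~8190-byte flag array mentioned in post 10; the function name `byte_sieve` is my own, and this is not the 9900 assembly or Forth code that was actually timed.

```python
SIZE = 8190  # flag-array size mentioned in post 10


def byte_sieve(size: int = SIZE) -> int:
    """Run one iteration of a BYTE-style sieve and return the prime count.

    Index i of the flag array stands for the odd number 2*i + 3, so the
    even numbers (and the prime 2) are never represented at all.
    """
    flags = bytearray([1]) * (size + 1)  # 1 = "still possibly prime"
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3
            # Strike every later index that stands for a multiple of prime.
            for k in range(i + prime, size + 1, prime):
                flags[k] = 0
            count += 1
    return count


if __name__ == "__main__":
    print(byte_sieve(), "primes")
```

With the default size the flags cover the odd numbers 3 through 16383, so a single iteration should report 1899 primes. The inner striking loop accounts for most of the run time, which may be why squeezing that loop and the registers into scratchpad pays off so well on the stock console.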
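Posts 4 to 6 above float the idea of shipping "a couple of BCD arguments and an opcode" to the PI and reading back a result. Below is a rough sketch of what the PI-side arithmetic might look like in Python. It assumes the arguments arrive in the console's 8-byte radix-100 floating-point layout as I understand it (excess-64 exponent byte, seven base-100 digits, first word negated for negative values). The opcode values and the `handle_request` name are invented for illustration, and nothing here touches the real TIPI messaging interface, which the posts leave unspecified.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical opcodes: the posts do not define a wire protocol.
OP_SIN = 0x01
OP_COS = 0x02
_FUNCS = {OP_SIN: math.sin, OP_COS: math.cos}


def decode_radix100(raw: bytes) -> float:
    """Convert 8 bytes of (assumed) TI radix-100 format to a Python float."""
    if len(raw) != 8:
        raise ValueError("expected 8 bytes")
    word = int.from_bytes(raw[:2], "big")
    if word == 0:
        return 0.0
    negative = bool(word & 0x8000)
    if negative:
        word = (-word) & 0xFFFF  # undo the negated first word
    exponent = (word >> 8) - 0x40  # power of 100
    mantissa = 0
    for digit in (word & 0xFF, *raw[2:]):
        mantissa = mantissa * 100 + digit
    # The first base-100 digit is the units place; six more follow it.
    value = float(Decimal(mantissa).scaleb(2 * (exponent - 6)))
    return -value if negative else value


def encode_radix100(x: float) -> bytes:
    """Convert a finite Python float to 8 bytes of radix-100 format."""
    if x == 0:
        return bytes(8)
    d = Decimal(repr(abs(x)))
    e = d.adjusted() // 2  # power of 100 of the leading digit
    scaled = d.scaleb(12 - 2 * e)
    mantissa = int(scaled.to_integral_value(rounding=ROUND_HALF_UP))
    if mantissa >= 100 ** 7:  # rounding carried into a new digit
        mantissa //= 100
        e += 1
    if not 0 <= e + 0x40 <= 0x7F:
        raise OverflowError("exponent out of range for radix-100 format")
    digits = [(mantissa // 100 ** i) % 100 for i in range(6, -1, -1)]
    word = ((e + 0x40) << 8) | digits[0]
    if x < 0:
        word = (-word) & 0xFFFF
    return word.to_bytes(2, "big") + bytes(digits[1:])


def handle_request(opcode: int, arg: bytes) -> bytes:
    """Apply the function selected by opcode to one radix-100 argument."""
    return encode_radix100(_FUNCS[opcode](decode_radix100(arg)))
```

Under these assumptions a request is one opcode byte plus eight argument bytes, and the reply is eight bytes, which matches the "handful of bytes each way" estimate in post 4. Whether the round trip beats the console ROM routines would still have to be measured on real hardware.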