
Patch the stock OS with faster FP routines


morelenmir


While looking at the various alternative versions of BASIC that are available, I became quite interested in the floating point routines that are housed in the OS ROM. From what I have read, these are notoriously badly written and very slow. Apparently just using better routines will speed up execution of even the native version of BASIC. I found out through a little further research that a fellow called 'Charles Marslett' wrote greatly improved floating point code way back in the early days of the 400 and 800. He very kindly offers the source code for these routines on his website:

 

http://www.wordmark.org/mydos.html

 

I was wondering if it is possible to build these routines and directly patch the stock OS ROM code with them?

Link to comment
Share on other sites

While looking at the various alternative versions of BASIC that are available, I became quite interested in the floating point routines that are housed in the OS ROM. From what I have read, these are notoriously badly written and very slow. Apparently just using better routines will speed up execution of even the native version of BASIC. I found out through a little further research that a fellow called 'Charles Marslett' wrote greatly improved floating point code way back in the early days of the 400 and 800. He very kindly offers the source code for these routines on his website:

 

http://www.wordmark.org/mydos.html

 

I was wondering if it is possible to build these routines and directly patch the stock OS ROM code with them?

 

Look Here: https://atariage.com/forums/topic/124761-fast-math-rom/

Link to comment
Share on other sites

Cool, I hadn't come across this thread before, didn't realize Claus had written his own routines faster than the Newell or Charles Marslett ones.

 

So, which FP routine was used for the patched OS in this thread? http://atariage.com/forums/topic/206880-130xe-reverse-option-key-for-basic/page-2

Or does that not actually have a patched FP routine?

 

Edit: yeah, later in the thread: http://atariage.com/forums/topic/206880-130xe-reverse-option-key-for-basic/?p=3944284

Link to comment
Share on other sites

I had forgotten about that linked math pack thread from 2008-09. There have been a lot of threads about patching the OS in various ways, but it made me think about the patching procedure.

 

Isn't there a patching program that patches and then recalculates the OS checksum? Actually I was thinking there were at least two -- one ML and one in BAS? Looked in my stuff, but so far haven't found them.

 

Anyone remember these?

 

-Larry
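
For illustration, here is a minimal Python sketch of what such a checksum fixer might do. It is not one of the programs Larry remembers. The layout is an assumption to verify against the OS listing for your ROM revision: a 16K XL/XE image (file offset 0 = $C000) with one 16-bit additive checksum for the first 8K stored at $C000/$C001 and one for the second 8K stored at $FFF8/$FFF9, each sum skipping the stored checksum bytes. The names `sum16`, `CHECKSUMS` and `fix_checksums` are made up for this example.

```python
def sum16(image, ranges):
    """16-bit additive checksum over half-open (start, end) file-offset ranges."""
    total = 0
    for start, end in ranges:
        total += sum(image[start:end])
    return total & 0xFFFF


# ASSUMED layout for a 16K XL/XE OS image; check it against your OS listing.
# Each entry: (file offset of the stored low byte, ranges the sum covers).
CHECKSUMS = [
    (0x0000, [(0x0002, 0x2000)]),                    # $C002-$DFFF
    (0x3FF8, [(0x2000, 0x3FF8), (0x3FFA, 0x4000)]),  # $E000-$FFF7, $FFFA-$FFFF
]


def fix_checksums(image):
    """Return a copy of a 16K OS image with both stored checksums recomputed."""
    if len(image) != 0x4000:
        raise ValueError("expected a 16K XL/XE OS image")
    out = bytearray(image)
    for where, ranges in CHECKSUMS:
        value = sum16(out, ranges)
        out[where] = value & 0xFF   # low byte first
        out[where + 1] = value >> 8  # then high byte
    return bytes(out)
```

The ranges are kept in a table so that they can be corrected in one place if the real layout turns out to differ.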

Link to comment
Share on other sites

I don't suppose anyone has ever thought of creating, if possible, OSes larger than 16K, so that we can have one patched OS that does everything we want from many different custom OSes? Maybe some kind of bank-switched OS, or, for expanded-memory systems, just a larger 32K/48K/64K OS and footprint in memory, with routines that redirect MEMLO or move the OS to extended memory (as SDX can optionally do), so that legacy programs still work with an expanded OS and expanded memory? I'd love to take OSes like MyBIOS/MyIDE, Omnimon, Omniview 80, the 57/126K high-speed SIO OS, Warp+, fast math routines, etc. and combine their unique features into one super-OS.

Edited by Gunstar
Link to comment
Share on other sites

The PBI mechanism already allows for OS extensions such as hard disk drivers, high speed SIO, and additional CIO handlers ("Z:" for instance), so if you couple that with an OS with a fast FP package, you're already more than half way there. The rest (80 column display, etc) can be accomplished via soft-loaded drivers (see the extensive capabilities of SpartaDOS X in that respect).

 

A modular approach beats the monolithic approach every time, IMO.

Link to comment
Share on other sites

The PBI mechanism already allows for OS extensions such as hard disk drivers, high speed SIO, and additional CIO handlers ("Z:" for instance), so if you couple that with an OS with a fast FP package, you're already more than half way there. The rest (80 column display, etc) can be accomplished via soft-loaded drivers (see the extensive capabilities of SpartaDOS X in that respect).

 

A modular approach beats the monolithic approach every time, IMO.

Yep, just a nebulous idea man here, with no knowledge of the best way to implement the idea, just throwing out possibilities with the buzz words and terms I read all the time... :-D

 

Hopefully in the not-so-distant future I will have an understanding, as I've now got my Mapping The Atari, De Re Atari, Machine Language For Beginners, and a BUNCH of 6502 coding, BASIC and ML Atari and standard books to learn from...

Edited by Gunstar
Link to comment
Share on other sites

While looking at the various alternative versions of BASIC that are available, I became quite interested in the floating point routines that are housed in the OS ROM. From what I have read, these are notoriously badly written and very slow...

http://www.wordmark.org/mydos.html

 

I was wondering if it is possible to build these routines and directly patch the stock OS ROM code with them?

 

Well, the problem is a bit more complicated than this. First, it is true that the implementation of the MathPack isn't fabulous, so it is slower than necessary. There are various attempts to fix that, including the functions above, but also those in Os++ (see there). However, the underlying problem is much more that the floating point representation upon which these functions build is not very wisely designed (to put it mildly).

 

If you have a BCD representation, there is only very little you can do to speed up multiplication and division, beyond loop unrolling and (as far as TurboBasic is concerned) pre-computing multiples of one of the factors. It essentially boils down to repeated addition or repeated subtraction. The sources you quote up there are no different, of course.
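
To make the repeated-addition point concrete, here is a small Python sketch. It works on plain decimal digits rather than the packed base-100 BCD the math pack really uses, and it only counts additions; it is not taken from the OS, TurboBasic or any of the replacement packs. The function names are invented for the example.

```python
def mul_repeated_add(a, b):
    """Multiply by adding a shifted copy of `a` once per unit of each digit of `b`.

    Returns (product, number_of_additions).
    """
    result, adds = 0, 0
    shifted = a
    while b:
        digit = b % 10
        for _ in range(digit):
            result += shifted
            adds += 1
        b //= 10
        shifted *= 10  # in BCD this is a digit shift, not a real multiply
    return result, adds


def mul_with_table(a, b):
    """Same idea, but precompute 1*a .. 9*a so each digit costs one addition."""
    table = [0]
    for _ in range(9):
        table.append(table[-1] + a)  # nine additions to build the table
    adds = 9
    result, scale = 0, 1
    while b:
        digit = b % 10
        if digit:
            result += table[digit] * scale  # scale is again just a digit shift
            adds += 1
        b //= 10
        scale *= 10
    return result, adds
```

For 1234 × 5678 the first version needs 26 additions and the second 13 (nine for the table, four for the digits). The table only pays off when the multiplier has many large digits, and either way the work stays proportional to the number of digits.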

 

If you want to patch in something faster - the Os++ functions of the math pack are exactly the same size, so they fit into the original Rom space, and the call-in functions are also exactly identical. Thus, no extra ROM space is needed for them.

 

Link to comment
Share on other sites

Many thanks indeed guys!!! I can see I have tapped in to a rich seam of ideas here!

 

At first I liked the idea of a built-in or at least cartridge'd version of TBXL, however I think Phaeron's 'Altirra BASIC' offers many of the same advantages--even direct control of P/M G--and that slots perfectly into the 8K allotted for BASIC without any shoehorning at all. I do know he deliberately left the FP routines as they were in a sort of pass-through arrangement to Altirra since the emulator itself offers the option of patching the OS code on the fly. However when you use Altirra BASIC on a real computer then you are falling through to the stock OS. Hence my interest here.

 

 

Well, the problem is a bit more complicated than this. First, it is true that the implementation of the MathPack isn't fabulous, so it is slower than necessary. There are various attempts to fix that, including the functions above, but also those in Os++ (see there). However, the underlying problem is much more that the floating point representation upon which these functions build is not very wisely designed (to put it mildly).

 

If you have a BCD representation, there is only very little you can do to speed up multiplication and division, beyond loop unrolling and (as far as TurboBasic is concerned) pre-computing multiples of one of the factors. It essentially boils down to repeated addition or repeated subtraction. The sources you quote up there are no different, of course.

 

If you want to patch in something faster - the Os++ functions of the math pack are exactly the same size, so they fit into the original Rom space, and the call-in functions are also exactly identical. Thus, no extra ROM space is needed for them.

 

 

It was pretty much a process like that which I envisioned. Is there a utility to do the patching, or is it just a case of manually finding the right place in the OS image file and then copying and pasting the new routines over it with a hex editor? Larry also mentions the possibility of re-checksumming the finished item.
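
As a rough idea of what the "find the right place and paste" step amounts to, here is a minimal Python sketch. It relies on the math pack occupying the 2K at $D800-$DFFF, which puts it at file offset $1800 in a 16K XL/XE image (starting at $C000) and at offset 0 in a 10K 400/800 image (starting at $D800). The function name and the idea of passing the new pack as a ready-assembled 2K binary are assumptions for the example; checksums are not touched here.

```python
MATHPACK_SIZE = 0x0800  # $D800-$DFFF


def splice_mathpack(os_image, mathpack):
    """Return a copy of `os_image` with its 2K math pack replaced by `mathpack`."""
    if len(mathpack) != MATHPACK_SIZE:
        raise ValueError("math pack binary must be exactly 2K")
    if len(os_image) == 0x4000:    # 16K XL/XE image, file starts at $C000
        offset = 0x1800
    elif len(os_image) == 0x2800:  # 10K 400/800 image, file starts at $D800
        offset = 0x0000
    else:
        raise ValueError("unrecognised OS image size")
    return os_image[:offset] + mathpack + os_image[offset + MATHPACK_SIZE:]
```

In practice you would read both files as bytes, call `splice_mathpack`, and write the result to a new file rather than overwriting the original dump.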

 

Regarding the idea Gunstar mentions of banking in and out to extend the OS, I have wondered about exactly that process as a way of extending the potential space for the PBI BIOS. Obviously this is all hand-wavy/blue-sky talk, but in general would it be possible to do something like that? Obviously I am not requesting a circuit plan and a BOM for the finished project!!!

Link to comment
Share on other sites

If you have a BCD representation, there is only very little you can do to speed up multiplication and division, beyond loop unrolling and (as far as TurboBasic is concerned) pre-computing multiples of one of the factors. It essentially boils down to repeated addition or repeated subtraction. The sources you quote up there are no different, of course.

 

I have never done any soft-fp programming, but would it help if the BCD representations were first converted to 32-bit float, do the math on that and convert back to BCD?

 

Link to comment
Share on other sites

The PBI mechanism already allows for OS extensions such as hard disk drivers, high speed SIO, and additional CIO handlers ("Z:" for instance), so if you couple that with an OS with a fast FP package, you're already more than half way there. The rest (80 column display, etc) can be accomplished via soft-loaded drivers (see the extensive capabilities of SpartaDOS X in that respect).

 

A modular approach beats the monolithic approach every time, IMO.

Errr.... Lots of confusion. Banking in a ROM at the math pack area does not require a PBI handler. It only requires an external ROM and pulling one line of the MMU. However, why make your life so complicated? If all you want is to replace the math pack, you can equally well replace the whole OS ROM in the first place.

Link to comment
Share on other sites

Hi!

 

I have never done any soft-fp programming, but would it help if the BCD representations were first converted to 32-bit float, do the math on that and convert back to BCD?

No, that would be slower still; converting between bases is slower than multiplication.

Link to comment
Share on other sites

Hi!

 

Many thanks indeed guys!!! I can see I have tapped in to a rich seam of ideas here!

 

At first I liked the idea of a built-in or at least cartridge'd version of TBXL, however I think Phaeron's 'Altirra BASIC' offers many of the same advantages--even direct control of P/M G--and that slots perfectly into the 8K allotted for BASIC without any shoehorning at all. I do know he deliberately left the FP routines as they were in a sort of pass-through arrangement to Altirra since the emulator itself offers the option of patching the OS code on the fly. However when you use Altirra BASIC on a real computer then you are falling through to the stock OS. Hence my interest here.

Then, you can easily use the Altirra mathpack, it is 100% compatible with the original and a lot faster. Also, source is included in Altirra sources, search for "mathpack.s", or in this old post with sources and binaries: http://atariage.com/forums/topic/240811-u-basic/page-2?do=findComment&comment=3647208

Link to comment
Share on other sites

No, that would be slower still; converting between bases is slower than multiplication.

 

I see. What would happen if all FP related routines in the OS were replaced by IEEE 754, i.e. just change the representation (and perhaps ignore the next two bytes if it's necessary to still occupy six bytes). Are there any programs that specifically rely on the 6-byte BCD floats? Just asking and thinking out loud ;)

Edited by ivop
Link to comment
Share on other sites

Hi!

 

I see. What would happen if all FP related routines in the OS were replaced by IEEE 754, i.e. just change the representation (and perhaps ignore the next two bytes if it's necessary to still occupy six bytes). Are there any programs that specifically rely on the 6-byte BCD floats? Just asking and thinking out loud ;)

All of Atari BASIC, Altirra BASIC, OSS BASIC, and even my own FastBasic depend on the BCD floating point, and I suspect it is the same with all other software.

 

The reason is that you need to store constants in your code in the BCD format, implement the missing SIN/COS, store constants in the tokenized formats, etc.

 

Also, remember that IEEE 754 floats (32 bits) have lower precision and range than Atari BCD, which has an equivalent precision of about 30 bits of mantissa and 10 bits of exponent.
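
For anyone who wants to poke at the format, here is a minimal Python decoder for the 6-byte Atari BCD float: one byte holding the sign bit and an excess-64 exponent in powers of 100, followed by five bytes of packed BCD mantissa with the decimal point after the first mantissa byte. The function name is made up, and exact fractions are used so the decoder itself adds no rounding.

```python
from fractions import Fraction


def decode_atari_float(raw):
    """Decode a 6-byte Atari BCD float into an exact Fraction."""
    if len(raw) != 6:
        raise ValueError("Atari floats are exactly 6 bytes")
    sign = -1 if raw[0] & 0x80 else 1
    exponent = (raw[0] & 0x7F) - 64  # power of 100, excess 64
    mantissa = 0
    for byte in raw[1:]:
        # each byte holds two decimal digits, high nibble first
        mantissa = mantissa * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    # mantissa is now a 10-digit integer; the point sits after the first
    # byte, i.e. four base-100 places from the right
    return sign * mantissa * Fraction(100) ** (exponent - 4)
```

With this layout 1.0 is stored as 40 01 00 00 00 00. A normalised mantissa carries nine or ten significant decimal digits depending on whether its first byte is below 10; nine digits correspond to roughly 29.9 bits, against the 24-bit significand of an IEEE 754 single.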

Edited by dmsc
Link to comment
Share on other sites

Errr.... Lots of confusions. Banking in a ROM at the math pack area does not require a PBI handler. It only requires an external ROM and pulling one line of the MMU. However, why do you make your life so complicated. If all you want is to replace the math pack, you can equally well replace the whole Os rom in first place.

I know exactly how the MPD function works, thanks. I'm simply trying to point out that most of the requested functionality can be or is already provided by PBI devices in a manner entirely compatible with a stock Atari OS.

Link to comment
Share on other sites

I know exactly how the MPD function works, thanks. I'm simply trying to point out that most of the requested functionality can be or is already provided by PBI devices in a manner entirely compatible with a stock Atari OS.

Sometimes I forget about all the functionality SDX adds to the OS. I still haven't had time to really learn all it can do and set it up the way I want. Same with MyIDE II, THE!CART, my new EPROM burner, etc., and learning to reflash things with uflash and the Atarimax flash workbook and programmer. All I've used so far is the Atarimax programming cartridge to switch between Space Harrier and Atari Blast on my Atarimax 8Mb flashcart... I need to find quality time to really learn the ins and outs of it all. You are probably right that all the things I want in an "extended OS" I can probably get working with one good hacked OS ROM and SDX. But that doesn't change my mind in thinking it would be really cool to have an extended OS one way or the other, just for the novelty and to see what could be done with, say, a bank-switched OS larger than 16K, or using extended memory for extra OS options, etc. I want to make my cake, eat it, and have it again and eat it too. ;) How about your GUI on a really large OS chip? j/k, a super cartridge is fine...

Edited by Gunstar
Link to comment
Share on other sites

One of my 800xls has a custom OS I threw together years ago, with fast FP and fast E:.

 

It would be an interesting ABBUC Software Competition entry to write a program to build custom OS images - add a variety of patches, change some options (screen colour on boot, OPTION function on boot, key repeat etc)... heck, even replace Self Test with a small game or something.

Link to comment
Share on other sites

The problem is that the math pack and the basic are not well separated. Actually, the math pack is part of Basic. It does not even have a jump table, and some functions are in the math pack, others are in Basic.

 

Also, remember that IEEE 754 floats (32 bits) have lower precision and range than Atari BCD, which has an equivalent precision of about 30 bits of mantissa and 10 bits of exponent.

Errr.... Technically, yes, but this is because IEEE 754 single precision uses only four bytes, while the Atari BCD format uses six. With a six-byte binary representation, you could reach a higher precision than the math pack. This is because the BCD representation (to the base 100) has a much higher average round-off error than a binary representation (to the base 2).

Link to comment
Share on other sites

I vote for a better self test.

  • A keyboard test that sees ALL the possible key combinations, including SHIFT by itself, and all the illegal keys (anything not part of the stock key matrix), so that the PS/2 aftermarket adapter function keys as well as the 1200XL function keys would appear.
  • A memory test that tests all the expanded memory upgrades past and present. And instead of simply cycling, it specifically reports where an error occurred.
  • An audio test that encompasses all the possibilities, Dual Pokeys, Covox, MIDI.
  • A Joystick/Paddle/Trackball/ST Mouse test.
  • And a system info screen, having detection for all possible 'detectable' upgrades, peripherals, and revision numbers where applicable.
Link to comment
Share on other sites

Hello Michael

 

F1-F4 are in the selftest AFAIK, even in the XE's.

 

Sincerely

 

Mathy

 

Oops, I forgot about that. So let's see that extended from the 4 to the full 12 ;) .

 

LOL scratch that, since most are just mapped to existing functions anyway (Start, Select, Option, Reset, Help). However, there are a few keys that have unique codes when pressed (F9, Print Screen, and the Windows keys). And let's not forget the CTRL+SHIFT+key codes that don't show on the existing self test.

 

It would also be nice to have the actual KBCODE appear in both hex and decimal.

Link to comment
Share on other sites
