Another assembler

Alfred · October 21, 2017

Hi Alfred!

As you said, FastBasic currently does not support many string functions or long strings, but you can use byte arrays as longer strings and work directly with them. For example, the integrated editor loads the complete file into a byte array, prints segments using "BPUT", and manipulates parts using MOVE.

I want FastBasic to grow into a useful language for many types of programs, and to extend it to address common needs (always keeping it as fast as possible). Perhaps some additions could be made:

- Possibility to treat byte arrays directly as (null-terminated) strings, with a syntax like:
 dim big(2000) byte    : ' A big "string" array
 big() = "Hello World" : ' Compiled as:  tmp$="Hello World": move adr(tmp$), adr(big), len(tmp$) : poke adr(big)+len(tmp$), 0
 big(6,) = "Alfred"    : ' Compiled as:  tmp$="Alfred": move adr(tmp$), adr(big)+6, len(tmp$) : poke adr(big)+6+len(tmp$), 0
 ? big()               : ' Compiled as:  i=0: while big(i) : put big(i) : inc i : wend
- Adding functions to search memory/strings, something like " index = memsearch( ADR(haystack$), LEN(haystack$), ADR(needle$), LEN(needle$) ) "

What do you think, would that be useful? The main problem with null-terminated strings is that you can't store $0, and on the Atari this is the common "heart" character.
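To make the two ideas above concrete, here is a minimal Python sketch that models memory as a flat byte array. The function names, the choice of returning an offset, and the -1 result for "not found" are assumptions for illustration only, not FastBasic's behaviour. It also shows why a $0 byte inside the text cuts a null-terminated string short.

```python
def memsearch(memory, hay_addr, hay_len, needle_addr, needle_len):
    """Return the offset of the needle inside the haystack, or -1 if absent.

    Hypothetical semantics: the proposal above only names the arguments.
    """
    haystack = bytes(memory[hay_addr:hay_addr + hay_len])
    needle = bytes(memory[needle_addr:needle_addr + needle_len])
    return haystack.find(needle)


def read_zero_terminated(memory, addr):
    """Collect bytes from addr up to (not including) the first 0 byte."""
    end = addr
    while memory[end] != 0:
        end += 1
    return bytes(memory[addr:end])


memory = bytearray(64)
memory[0:11] = b"Hello World"      # haystack at address 0
memory[32:37] = b"World"           # needle at address 32
found_at = memsearch(memory, 0, 11, 32, 5)

# A 0 byte (the heart character on the Atari) ends the string early.
memory[40:44] = b"A\x00B\x00"
truncated = read_zero_terminated(memory, 40)
```

Here `found_at` is 6, and `truncated` holds only the single byte before the embedded zero.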

Thanks,

Daniel.

I would avoid zero terminated strings. Instead I would up the length field to two bytes to allow for 64K strings. Since that is the largest data segment, it should suffice.

dmsc · October 22, 2017

Hi!

I would avoid zero terminated strings. Instead I would up the length field to two bytes to allow for 64K strings. Since that is the largest data segment, it should suffice.

Yes, but that makes string manipulations a lot more time consuming, because you can't simply point to a part of a string.

Currently FastBasic stores strings and arrays simply as pointers to the data, with the length as the first byte of the string (like Pascal short strings). String variables are allocated at the full 256 bytes on assignment, so there is no memory fragmentation and no garbage collection is needed. Also, as the length is limited to 255 bytes, string ops are done with small and fast loops.
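As a rough illustration of that layout, the Python sketch below models one string variable as a fixed 256-byte slot whose first byte is the length. The helper names are hypothetical; only the layout itself comes from the description above.

```python
SLOT_SIZE = 256  # one length byte plus up to 255 data bytes


def new_string_slot():
    """Allocate the full slot once, so later assignments never reallocate."""
    return bytearray(SLOT_SIZE)


def assign(slot, text):
    """Store text in the slot: length in byte 0, data from byte 1 onwards."""
    if len(text) > SLOT_SIZE - 1:
        raise ValueError("string longer than 255 bytes")
    slot[0] = len(text)
    slot[1:1 + len(text)] = text


def value(slot):
    """Read the string back using the length byte."""
    return bytes(slot[1:1 + slot[0]])


name = new_string_slot()
assign(name, b"Hello World")
assign(name, b"Alfred")   # a shorter string reuses the same slot in place
```

Because every slot has the same size, copying one string over another never needs a new allocation, which is the property the paragraph above relies on.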

To use dynamic long strings, I would not only need to store the length as two bytes, but also to reallocate the string on each assignment (as you can't simply copy the contents of strings with different lengths), so you would need to compact the string area on memory exhaustion (as old Microsoft BASICs did) or to implement a memory allocator supporting alloc/free.

If on the other hand I implement static strings (like Atari BASIC), each string variable would need 3 integers (6 bytes): a pointer to the data, the allocated length and the current length. This could work in the floating-point version by storing the string descriptors in the floating-point stack, with the TOS at FR0. I'm not fond of AtariBASIC string handling, so I'm not sure that I would like implementing this.
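A minimal Python sketch of such a static descriptor follows. The field order, the little-endian encoding, and the truncate-on-overflow behaviour are all assumptions for illustration; the post only says that three integers (6 bytes) would be needed.

```python
import struct

# Assumed layout: data pointer, allocated length, current length,
# each a 16-bit little-endian integer, 6 bytes in total.
DESCRIPTOR = struct.Struct("<HHH")


def make_descriptor(data_ptr, allocated_len):
    """Build the descriptor for a freshly DIMed, empty string."""
    return DESCRIPTOR.pack(data_ptr, allocated_len, 0)


def assign(descriptor, memory, text):
    """Copy text into the pre-allocated area and return the new descriptor."""
    data_ptr, allocated_len, _ = DESCRIPTOR.unpack(descriptor)
    stored = text[:allocated_len]          # assumed: silently truncate
    memory[data_ptr:data_ptr + len(stored)] = stored
    return DESCRIPTOR.pack(data_ptr, allocated_len, len(stored))


memory = bytearray(100)
desc = make_descriptor(10, 5)              # a string DIMed to 5 bytes at address 10
desc = assign(desc, memory, b"HELLO WORLD")
```

Every assignment has to consult the allocated length as well as the current one, which is the extra bookkeeping the post is wary of.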

Alfred · October 22, 2017

So split the difference. Require that strings be pre-allocated via DIM. So that gives you a max size, and removes the need to reallocate on a string operation. Zero-terminated strings are an abomination to be avoided where possible.

dmsc · October 22, 2017

Hi!

So split the difference. Require that strings be pre-allocated via DIM. So that gives you a max size, and removes the need to reallocate on a string operation. Zero-terminated strings are an abomination to be avoided where possible.

I agree about zero-terminated strings in this context, but as said before, pre-allocating strings does not solve my main worry: all string operations become slower and more complicated with long strings, as you now need to keep the allocation length, and loops are no longer as simple. I really think that most uses of strings are for small (less than 40 bytes even) strings. Even today's languages optimize for short strings.

That's why I wrote above that perhaps allowing byte arrays to be used as (limited) long strings made sense: that way you have fast operations for the common short strings and slower operations on the uncommon byte arrays.

So, another proposed syntax, storing the length as a "hidden" integer variable:

 dim big(2000) string  : ' A big "string" array, compiled to: dim big_data(2000) byte : big_len = 0
 big() = "Hello World" : ' Compiled to:  tmp$="Hello World": move adr(tmp$), adr(big_data), len(tmp$) : big_len = len(tmp$)
 big(6,) = "Alfred"    : ' Compiled as:  tmp$="Alfred": move adr(tmp$), adr(big_data)+6, len(tmp$) : big_len = 6 + len(tmp$)
 ? big()               : ' Compiled as:  bput #0, adr(big_data), big_len

Note that as I don't store allocated length, there is no bounds checking.
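The proposed expansion can be modelled in a few lines of Python. This is only a sketch: the class and method names are hypothetical, and it assumes that an assignment at an offset sets the length to the offset plus the length of the assigned text.

```python
class BigString:
    """A byte array plus a hidden length, standing in for big_data and big_len."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.length = 0

    def assign(self, text, offset=0):
        # No bounds check, mirroring the note above. Python would grow the
        # buffer here; on the real machine the copy would run into whatever
        # follows the array in memory.
        self.data[offset:offset + len(text)] = text
        self.length = offset + len(text)

    def contents(self):
        # Equivalent of: bput #0, adr(big_data), big_len
        return bytes(self.data[:self.length])


big = BigString(2000)
big.assign(b"Hello World")        # big() = "Hello World"
big.assign(b"Alfred", offset=6)   # big(6,) = "Alfred"
```

After the two assignments `big.contents()` is `b"Hello Alfred"` with a length of 12, and a 0 byte inside the text is no longer a problem.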

This is actually more difficult to implement in my parser than before, as currently the parser emits one reference for each parsed element, but I could add stack manipulation tokens to allow the repetition.

Alfred · October 23, 2017

Well this is Basic after all, it's not going to be super fast. Big strings are not that big a deal, but I think you have to have one of the two: either allow for an array of strings, like Dim X$(100,40) for 100 strings of 40 characters, or allow for big strings. You also need to lose that 256 bytes for any string size; the Atari doesn't have enough memory to waste it like that. You could do that in a 65816 version where memory isn't so constrained.

drac030 · October 23, 2017

Alfasm is a pretty basic assembler I wrote

So your name is Jeff Williams.

As far as I remember from the documentation, Alfasm supports the 65C816 high RAM, produces binaries with 24-bit addresses in headers ($FBFB is the signature) and uses the regular extended banked RAM to store symbols during assembling.

I wonder why you rather would not develop that program further.

Alfred · October 23, 2017

Been a while, but as I recall it's not all that fast, and it doesn't do macros. I prefer the assembler/linker method of producing binaries for larger projects now, so I'd rather write a new assembler/linker than try to refit Six Forks or fix Alfasm. I usually find it faster to do something new and just cut and paste in useful parts from other projects than to try to retrofit big patches.

Alfred · February 26, 2018

Well I'm throwing in the towel on using Basic XE. The Procedure/Exit statement is broken with LOCAL variables such that it's pretty much impossible to use if you're nesting the calls. Back to assembler it is.

Alfred · February 26, 2018

Looking at the Basic XE source code, it looks like it's a bug in the implementation of the EXIT statement. To save space there is a lot of joining in with code for other statements, so for example, Exit joins with Call and Procedure. If I've understood the code correctly, what is happening is that it stores the result in a buffer and restores the calling value, but then copies the buffer to the variable and then copies it to the target of the "TO" part of the Call stmt. This overwrites the variable that was used.

 Procedure "something" Using K
 -- do code --
 Exit K

 Procedure "Caller"
 Local i,k
 for k= 1 to 10
 call "something" using i to i
 next K
 Exit

The overwrite of K in "something" kills it in the "caller" procedure. They needed to do it in reverse order, copy the parm to the "TO" variable and then pop the stack. I don't know that I understand the code well enough to actually fix it though. With Wilkinson gone, I'm not sure there is anybody who can fix it.
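The ordering problem described above can be shown with a toy Python model. It is not taken from the Basic XE source: variables are just a dictionary of globals, the function names are hypothetical, and the two call sequences follow only the reading of the code given in this post.

```python
def call_as_described(variables, param, arg, target, body):
    """Order as read from the source: restore, then copy the result back."""
    saved = variables[param]              # caller's value of the parameter name
    variables[param] = variables[arg]     # pass the argument in
    result = body(variables)              # value named by EXIT, held in a buffer
    variables[param] = saved              # restore the caller's value...
    variables[param] = result             # ...then overwrite it with the buffer
    variables[target] = variables[param]  # and copy that to the TO variable


def call_reordered(variables, param, arg, target, body):
    """Suggested order: copy to the TO variable first, then pop the stack."""
    saved = variables[param]
    variables[param] = variables[arg]
    result = body(variables)
    variables[target] = result
    variables[param] = saved


def run_caller(call):
    """FOR K=1 TO 10 : CALL "something" USING I TO I : NEXT K"""
    variables = {"I": 0, "K": 1}
    iterations = 0
    while variables["K"] <= 10 and iterations < 100:
        call(variables, "K", "I", "I", lambda v: v["K"] + 100)
        variables["K"] += 1
        iterations += 1
    return iterations


broken_iterations = run_caller(call_as_described)
fixed_iterations = run_caller(call_reordered)
```

In this model the first ordering leaves the loop after a single pass because K has been clobbered by the return value, while the reordered version completes all ten iterations.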

VladR · February 27, 2018

If you have the actual source code, a fix for this should be simple: just store the variable in a new location prior to calling the offending function, and then restore it (trumps doing this in binary). It's only a question of how much time you wanna spend on this. Looks like you did 80% of the work already.

The bigger problem is that this might not be an actual bug, merely a design "feature" / limitation.

Fixing it may break some other functionality that relies on this "behavior". And that could be quite a problem to test. Meaning, even if you get this to work, some other, more subtle feature may break.

Alfred · February 27, 2018

I'd agree, I don't think it's a bug per se; it's working as intended. It's just that the design has a problem, which they probably thought was no big deal, even if it occurred to them. They probably just thought "Well, the coder can always just use different variable names" until someone like me comes along who always uses the same few letters for loops, etc. and then the whole scheme falls apart. It's interesting that what the manual on page 116 says EXIT does is not actually what happens, and if it did work that way, then in my example the return value would always be overlaid by the original value of the passed-in variable. Sort of the opposite of what is happening now.

I can't see how fixing it would break existing code. If you're doing what I did, it just doesn't work. It's hard to imagine taking advantage of that fact. If there was a PUSH command that did the reverse of what POP does, I could see how fixing this would make that code irrelevant but it wouldn't make it stop working.

Anyway, I think I'll have a go at it. After Action! I've always liked Basic XE and it would be nice to give it the ability to address more than four banks of XE memory.

Edited February 27, 2018 by Alfred

VladR · February 27, 2018

Given how much work it is to keep the manual always up to date with the latest codebase, especially with random hotfixes, I wouldn't be surprised that it doesn't correspond 100% to reality.

As for the broken dependencies - well it's always a surprise when some such thing happens, though eventually, after examining the code, it's behaving as the code says - but if you believe there should be no other feature that will get broken, that's awesome.

I had no idea that you could access additional banks from Basic. Are you suggesting you wanna make Basic XE access up to 1 MB of RAM?

Alfred · February 27, 2018

I've not looked too deeply into how Basic XE handles extended memory. I know that it bit twiddles rather than using a table lookup. It does not scan the hardware, it simply presumes that it's an XE and flips the appropriate bits. There are BIT instructions all over the place that test a flag which says whether we are in Extended mode or not, and branch to different code if such is the case. It is my hope that statement location is indicated by a bank # and address/offset within the bank, because such a scheme would in theory allow for up to 255 banks. If it's something else, well who knows then. If, for example, it's a 16 bit offset from zero, then going past 64K is problematic because then all the code involving statement creation and lookup would have to be changed to support a larger offset.

Yes, I would like Basic XE to be able to address banked memory up to the limit of what it can handle, 128 banks, 255 banks, whatever it can handle beyond the current 4 banks without having to rewrite the whole program. There is no way, for example, to allow for data to be accessed in banked memory because virtually every data reference statement would need to be updated.

My ulterior motive overall is to enable increased 65816 use. So my plan for 65816 world domination goes something like this:

- good 65816 assembler/linker, which leads to

- good 65816 DOS which then allows for binaries to run above the 64k line which leads to

- Action! and Basic XE to run (in 6502 mode) above the 64K line. So you could have separate 64K code and data banks with little modification of the existing Action! or Basic XE binaries

- finally, convert Action! and/or Basic XE to actually generate/use 65816 operands.

I doubt I will live long enough to see it through, but it gives me something to do when I'm tired of fixing rail fences that my wife's horses are always breaking.

Alfred · February 27, 2018

Been perusing the XE code. So it both bit twiddles and has a small table of values. I'll have to go look at my copy of the Basic Sourcecode book to understand this, but it seems that each bank has its own Statement table, and if the msb of an entry is $FF that means end of bank. So it looks like it should be possible to extend that to more banks. The problem is that as more banks are added, the time to search for the line # will increase, and would probably be too much in the case of a 1Meg machine. I have yet to see to what extent the FAST statement fixes this, as so far all the code just uses linear search to locate a given line number.

It might be that if you use say 128 banks for an XE program, FAST will be a requirement. It might also be possible to scan the current bank first, so that if a program has decent locality of reference it might run at an acceptable speed.
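The search-order idea can be sketched in Python as follows. The data layout here (one list of offset and line-number pairs per bank) and the function name are assumptions for illustration; the real statement tables are only described loosely above.

```python
def find_line(banks, line_number, current_bank=0):
    """Linear search over per-bank statement tables, current bank first.

    banks is a list of tables; each table is a list of (offset, line_number).
    Returns (bank, offset, probes) or None if the line does not exist.
    """
    order = [current_bank] + [b for b in range(len(banks)) if b != current_bank]
    probes = 0
    for bank in order:
        for offset, number in banks[bank]:
            probes += 1                     # count comparisons as a cost measure
            if number == line_number:
                return bank, offset, probes
    return None


banks = [
    [(0, 10), (8, 20)],    # bank 0 holds lines 10 and 20
    [(0, 30), (8, 40)],    # bank 1 holds lines 30 and 40
]
from_bank_0 = find_line(banks, 40, current_bank=0)
from_bank_1 = find_line(banks, 40, current_bank=1)
```

Starting in the bank that already holds the target takes 2 comparisons here instead of 4, which is the locality effect hoped for above; with 128 banks the gap would be correspondingly larger.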

Edited February 27, 2018 by Alfred

Alfred · February 27, 2018

It occurred to me that I was looking at the Basic XE 4.2 source code, but the cartridge I am using is 4.1. So I tried to build the 4.2 extensions from the source. Too many files, and not enough memory under DOS 2.0. Ok, Sparta 3.2. Nope, MEM Full. Ok, let's go with Sparta 2.3, lowest memlo in existence. Nope, MEM Full. The cartridge binary can be assembled just fine, only the extensions fail. I guess this is why 4.2 never saw the light of day. It's simply not possible to assemble the extensions disk on an Atari using Mac/65.

Oh well, so much for updating Basic XE.

_The Doctor__ · February 27, 2018

WOW, that was quick to give up...

Alfred · February 27, 2018

Well I did pursue it a little further. Using Sparta 2.3 and assembling from disk, which I know will cause an error in Pass 2, the assemble fails with a DIV10 error which means that somewhere there's a mistake. So that means having to list out the whole damn thing to find the error.

I wouldn't say quick, I've spent hours on this. I might tinker with it a bit more, but I don't see a solution to the memory issue. If you think you can get it to assemble -on the Atari using MAC/65- well you go right ahead then and let me know how you did it.

Alfred · February 28, 2018

Ok, we're back in business. After taking a nap, and some narcotic painkillers I went back to the source. Turns out disk3 which is the floating point disk also has a MASTER2 for assembling just the extensions. Using that, the extensions assemble without error, which is interesting. It means the floating point code on disk2 is not only wrong, it isn't actually used in building the extension file. The file BASICXE2.OSS seems to be valid, although I haven't parsed out the differences between it and the BASICXE.OSS on that disk.

This is still a bit of a mess. Using Sparta 2.3 there are 487 bytes left. Not much room for making changes there.

The error in EXIT processing also appears to be fixed. My example runs just fine. Disk3 also has Dos 2.5 on it, so I'm going to see if it can assemble the floating point code.

Unfortunately my area of expertise is in writing operating systems and code for them, for IBM mainframes. I lack the math skills to tinker with the floating point code on the Atari and am not really interested in learning. So if there's anybody here that does have some knowledge of how to write floating point routines, please speak up. I might not need the help, but it would be nice to know we have it available if I or anyone else needs a hand.

+Stephen · February 28, 2018

Just a suggestion here, as I see you are targeting 65816. Have you considered Veronica BASIC?

Alfred · February 28, 2018

I had to go look Veronica up because I had no idea what it is. As a co-processor it's a much different environment than a native 65816 base processor. At this point I have enough things to keep me busy, I doubt I would get around to writing code for Veronica.

Alfred · February 28, 2018

Looking at FAST, what it does is scan through your program looking for statements like goto, gosub, for, next etc. If the target is a fp number only, not an expression, then it locates the bank/address of the target, replaces the fp number with that bank/address and changes the token type from FP to FAST. So that's good, meaning it can support more banks.
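A rough Python sketch of such a pass is below. The token representation (a kind and a value per branch target), the function name, and the lookup dictionary are hypothetical; only the idea of replacing a constant line number with a resolved bank/address comes from the description above.

```python
def fast_pass(program, locations):
    """Replace constant line-number targets with resolved (bank, address) pairs.

    program: list of statements, each a dict with a "target" of (kind, value).
    locations: maps line number -> (bank, address).
    Targets that are expressions, or that name a missing line, are left alone.
    """
    for statement in program:
        kind, value = statement["target"]
        if kind == "FP" and value in locations:
            statement["target"] = ("FAST", locations[value])
    return program


locations = {100: (0, 0x4000), 200: (1, 0x4120)}
program = [
    {"op": "GOTO", "target": ("FP", 200)},        # constant target: resolved
    {"op": "GOSUB", "target": ("EXPR", "A*10")},  # expression: left as is
]
fast_pass(program, locations)
```

After the pass the GOTO no longer needs any line-number search at run time, while the GOSUB with a computed target still does.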

The bad news is line numbers having to be less than 32768 seems to be hardcoded all over the place with BIT tests to see if the line number is 32768 or greater. Changing that would be a breaking change to any existing Basic XE code.

Anyway, that's enough of Basic XE for now, back to what I was doing before.

Kyle22 · March 4, 2018

I would be very interested in BXE that has been tweaked to take full advantage of the 802/816.

Alfred · March 4, 2018

I would be very interested in BXE that has been tweaked to take full advantage of the 802/816.

I was thinking about this and I don't think it would make sense to port BXE to the 65816. I think Basic XL would be a better choice. The reason is that the two are virtually identical in capability, and under the 65816 being able to use banked memory is less useful. I would instead upgrade the Move command in BXL to be able to use banks like BXE and perhaps give it the faster floating point routines from BXE as well as include most of the XL toolkit to give it the Call, Procedure etc. statements therein.

It would be less work, if only from the standpoint of not having to take out all those flag tests that BXE does to look for Extended mode. In addition, it would be very labour intensive to move BXE to another assembler because of how it has used MAC/65 local variables. In order to conserve symbol table space it will use something like :BR1 dozens or hundreds of times as the target of a few local branches, redefining :BR1 every dozen lines of code. So to switch to a different assembler, you would need to go through every line to figure out the logic of the branches in order to relabel them correctly. That wouldn't be fun.

luckybuck · March 4, 2018

Hi Alfred,

First, we are soooo glad, you are doing fine. :-)

What you are working on is on my wish list for an Ultimate Basic for the Atari. Here, I would like to have Basic XE access up to 4 MB of RAM! Not kidding.

If it can be done with ACTION!, too, even better.

But Alfred, what about a new topic? How would anyone find BXE under 'Another assembler' at all? The topic must always be an eye catcher, otherwise it is difficult to get 'big brain' on that...

I suggest: 'OSS Basic XE with 4 MB RAM - who can help to finish?'

Same with Action!

Further, did you try some cross assemblers? WUDSN is really the best choice here. Makes the job much smarter.

Anyway, I wish you all the health in the world to continue in this.

Stay with us Alfred, we really need you!

Kyle22 · March 5, 2018

BXL would also be great, but I was thinking about BXE for those of us with 802’s or 816’s without linear RAM.

I would think that optimal 816 code would really increase the speed of either one.

So as to avoid confusion, this is all assuming that the user will have an 816 OS in there to deal with interrupts so we can stay in native mode.
