thorfdbg

Atari Basic Bug List


Hi folks,

 

just trying to collect a list of interesting "features" of the various Atari Basic revisions we have - mostly to avoid them. This is currently the list I have. As you see, some of them are pretty much known, while others are probably quite exotic. Thus, if you are aware of anything that's not in my list, please let me know:

 

- The parser allows INPUT without any parameters, i.e. a line like "10 INPUT" was parsed as "correct".

- A Ctrl-U as last component of a string argument to PRINT works as if the PRINT statement included a semicolon. (Rev.A only, fixed in Rev.B and C).

 

- It is possible to DIM two-dimensional arrays that overrun the available memory because overflow checks are missing (all revisions). The interpreter makes only some very basic checks and does not test whether the size arithmetic overflows.

 

- A downwards block move of an exact multiple of 256 bytes moved the wrong memory (Rev A only). This can crash or hang the interpreter when deleting lines.

- An upwards block move of an exact multiple of 256 bytes moved the wrong memory (Rev B only). This can crash or hang the interpreter when inserting lines.

 

- Basic Rev. A crashed on cascaded unary operators; in particular, PRINT NOT NOT A, PRINT ++2 and PRINT --X crashed. Atari "resolved" this issue in Rev.B by not parsing such expressions; Basic++ allows them and implements them correctly.

- LOCATE apparently did not restore the input buffer pointer correctly and could cause errors if followed by a VAL() that required the pointer to be set correctly.

 

- NOTE and STATUS were also parsed correctly if their arguments were arrays. That is, code like "10 NOTE #1,SECTOR(I),BYTE(I)" was accepted as correct, but does not execute correctly (all revisions).

 

- POINT does not accept arbitrary expressions, i.e. "10 POINT #1,SECTOR(I),BYTE(I)" works correctly, but "10 POINT #1,SECTOR(I),BYTE(I)+1" is rejected as a syntax error, though it should be valid. (all revisions)

 

- CHR$(a)=CHR$(b) is always true, regardless of a and b, the same bug holds true for STR$(a)=STR$(b) if "a" and "b" have the same number of digits (all revisions)

 

- The ^ (POW) function returns incorrectly rounded results (Rev.A), 2^3 is not eight, but 7.999...

- The ^ (POW) function returns weird results for some inputs (Rev.B and Rev.C), e.g. 1^44 = 2 (huh?)

- ATN(1) is not equal to 45 degrees. CLOG(1) and LOG(1) are not zero.

 

- ON x GOSUB a,b,c still pushes the return address on the stack, even if "x" is out of range. So for example, if "x" equals 4 in the above expression and a "RETURN" is executed, this RETURN returns to the "ON x GOSUB" statement even though it was never executed. Or rather, "because it was partially executed, even though it should not be executed at all".

 

- If a LOAD is interrupted by an error on the disk, the interpreter is left with a partially loaded program and does not clear the program area, which could result in crashes and hangs. It DOES clear the program area on the next RESET then.

 

- If SYSTEM RESET is pressed while the computer is in the middle of shuffling program lines around, for example because it is inserting a line into a longish program, the interpreter might be left with an unusable program and may crash or hang. It does not clear out the program area as it should, and it does not detect that the program area is inconsistent.

 

- If a program runs "ENTER" to interactively add lines to its code, and the ENTER'd data includes a direct instruction (without a line number), this instruction is executed directly (intentional!). However, if that instruction includes an INPUT, the data for the INPUT is read from the file, not from the screen, and ENTER terminates. Worse, if the program to ENTER includes a GOTO statement (without line number) or some other way of continuing regular program execution, the input channel #7 is never closed, and the "standard input" of the interpreter remains at this channel. Specifically, the next INPUT reads from this channel rather than from the editor.

 

- Range checks for many functions or statements are missing. SOUND accepts volumes larger than 15 or frequencies larger than 255. SETCOLOR ditto. The IOCB channel is not checked for PRINT and INPUT, but is checked for all other operations (why, oh just tell me why!).

POKE can take "byte values" larger than 255.
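
To illustrate the DIM entry in the list above, here is a minimal Python sketch of how an unchecked 16-bit size calculation can wrap around. It assumes six bytes per numeric element (the size of a math pack floating point number) and a result reduced modulo 65536. The exact order of operations inside the interpreter is an assumption, and the function names are made up for illustration.

```python
BYTES_PER_FLOAT = 6  # Atari floating point numbers occupy six bytes


def dim_bytes_exact(rows: int, cols: int) -> int:
    """Bytes really needed by DIM A(rows, cols); indices start at 0."""
    return (rows + 1) * (cols + 1) * BYTES_PER_FLOAT


def dim_bytes_16bit(rows: int, cols: int) -> int:
    """Same computation truncated to 16 bits, as if no overflow check existed."""
    return dim_bytes_exact(rows, cols) & 0xFFFF


if __name__ == "__main__":
    for rows, cols in [(10, 10), (200, 100), (255, 255)]:
        print(rows, cols, dim_bytes_exact(rows, cols), dim_bytes_16bit(rows, cols))
```

Under these assumptions, DIM A(255,255) would appear to need no memory at all, although it really needs 393216 bytes.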

 

I'd also like to collect programs that depend on such bugs. For example, I've seen programs that depend on SOUND not trapping if the arguments are larger than 255. I've also been told that INPUT #16 is used at times. There is also an interesting application of the "Load Flag" at address 202: the game KAISER pokes the value 4 there as a "list protection", because the interpreter checks this address (though not always consistently) to see whether a LOAD was incomplete. Hence, it triggers "self destruction" of the program. KAISER also depends on "ENTER" executing direct statements, on channel #7 staying open to receive INPUT, and on TRAP firing when it finds the EOF of the ENTER'd data.

 

 


Wow. This was an interesting read. Thanks for putting all of that together.

BASIC B will add 16 bytes at the end of each SAVE, eventually trashing the program. A LIST "D:PROG.LST" and ENTER "D:PROG.LST", followed by a SAVE "D:PROG.BAS", cleans up the end garbage. It is probably a good idea to precede this with a LIST to screen, even if you BREAK before LIST is complete.

As an aside, I prefer BASIC B to A and C. I've done a lot of BASIC programming and B is the most stable IMHO, i.e. A and C will crash or hang more often than B.

B is a recompile over A, whereas C is a twelve byte patch of B, I guess just to get rid of the 16 byte tail ending.

Jindroush wrote a PC program to recover a B trashed program, I wrote one also in Atari BASIC.

The SALVAGE program makes a LISTed program from any SAVEd program.

Here are three versions of my SALVAGE program; I think SALVAGEX.BAS is the best, but I don't remember.

SALVAGE.zip

Edited by russg


Hi!,

 

I tested some of the bugs; these are my findings:

 

 

- The parser allows INPUT without any parameters, i.e. a line like "10 INPUT" was parsed as "correct".

Present in rev-A, fixed in rev-B, rev-C, TBXL and altirra-basic.

- A Ctrl-U as last component of a string argument to PRINT works as if the PRINT statement included a semicolon. (Rev.A only, fixed in Rev.B and C).

Present in rev-A and TBXL, fixed in rev-B, rev-C and altirra-basic

- It is possible to DIM two-dimensional arrays overrunning the available memory because any type of overflow check is missing. (All revisions). The interpreter only made some very basic checks, but does not test whether the arithmetics overflow.

Present in rev-A, rev-B and rev-C, apparently fixed in TBXL and altirra-basic.

- A downwards block move of an exact multiple of 256 bytes moved the wrong memory (Rev A only). This can crash or hang the interpreter when deleting lines.

Don't know how to test, but I think this is fixed also in TBXL (and altirra-basic).

- An upwards block move of an exact multiple of 256 bytes moved the wrong memory (Rev B only). This can crash or hang the interpreter when inserting lines.

Don't know how to test, but I think this is fixed also in TBXL (and altirra-basic).

- Basic Rev. A crashed on cascaded unary operators; in particular, PRINT NOT NOT A, PRINT ++2 and PRINT --X crashed. Atari "resolved" this issue in Rev.B by not parsing such expressions; Basic++ allows them and implements them correctly.

Correctly fixed in TBXL and altirra-basic, a .BAS saved in rev-A executes correctly only on TBXL and altirra-basic.

- LOCATE apparently did not restore the input buffer pointer correctly and could have caused errors if followed by a VAL() that required the pointer to be seated correctly.

Can you produce a sample program to test this?

- NOTE and STATUS were also parsed correctly if their arguments were arrays. That is, code like "10 NOTE #1,SECTOR(I),BYTE(I)" was accepted as correct, but does not execute correctly (all revisions).

Present in rev-A, rev-B and rev-C, parser fixed in TBXL and altirra-basic.

- POINT does not accept arbitrary expressions, i.e. "10 POINT #1,SECTOR(I),BYTE(I)" works correctly, but "10 POINT #1,SECTOR(I),BYTE(I)+1" is rejected as a syntax error, though it should be valid. (all revisions)

Parser fixed in TBXL, saved .BAS from TBXL works correctly on rev-A, rev-B, rev-C and altirra-basic.

- CHR$(a)=CHR$(b) is always true, regardless of a and b, the same bug holds true for STR$(a)=STR$(b) if "a" and "b" have the same number of digits (all revisions)

Present in rev-A, rev-B, rev-C, TBXL and altirra-basic.

- The ^ (POW) function returns incorrectly rounded results (Rev.A), 2^3 is not eight, but 7.999...

Bug in mathpack, fixed in TBXL as it replaces math pack OS routine.

- The ^ (POW) function returns weird results for some inputs (Rev.B and Rev.C), e.g. 1^44 = 2 (huh?)

Only in rev-B and rev-C, rev-A and altirra-basic gives 1.00000002 and TBXL gives 1.

- ATN(1) is not equal to 45 degrees. CLOG(1) and LOG(1) are not zero.

Only a little loss of precision, "DEG:?ATN(1)" gives 45.00000033 in rev-A and TBXL, 45.0000003 in rev-B and rev-C, 45.00000037 in altirra-basic.

CLOG(1) gives 2e-10 in rev-A, TBXL and altirra-basic, gives 0 in rev-B and rev-C.

- ON x GOSUB a,b,c still pushes the return address on the stack, even if "x" is out of range. So for example, if "x" equals to 4 in the above expression, and a "RETURN" is executed, this RETURN returns to the "ON x GOSUB" expression even though it was never executed. Or rather, "because it was partially executed, even though it should not be executed at all".

Present in rev-A, fixed in rev-B, rev-C, TBXL and altirra-basic.

- If a LOAD is interrupted by an error on the disk, the interpreter is left with a partially loaded program and does not clear the program area, which could result in crashes and hangs. It DOES clear the program area on the next RESET then.

I don't know how to test this.

- If SYSTEM RESET is pressed while the computer is in the middle of shuffling program lines around, for example because it is inserting a line into a longish program, the interpreter might be left with an unusable program and may crash or hang. It does not clear out the program area as it should, and it does not detect that the program area is inconsistent.

Also don't know how to test this.

- If a program runs "ENTER" to interactively add lines to its code, and the ENTER'd data includes a direct instruction (without a line number), this instruction is executed directly (intentional!). However, if that instruction includes an INPUT, the data for the INPUT is read from the file, not from the screen, and ENTER terminates. Worse, if the program to ENTER includes a GOTO statement (without line number) or some other form to continue regular program execution, the input channel #7 is never closed, and the "standard input" of the interpreter remains at this channel. In specific, the next INPUT reads from this channel rather than from the editor.

Present in rev-A, rev-B, rev-C and TBXL. Fixed in altirra-basic.

- Range checks for many functions or statements are missing. SOUND accepts volumes larger than 15 or frequencies larger than 255. SETCOLOR ditto. The IOCB channel is not checked for PRINT and INPUT, but is checked for all other operations (why, oh just tell me why!).

POKE can take "byte values" larger than 255.

Cannot reproduce in the standard BASICs; the following program prints "OK" in rev-A, rev-B, rev-C and TBXL but indeed fails in altirra-basic:

 

10 X=256
20 TRAP 25:POKE 40000,X
21 ? "BAD",X:END 
25 X=X+1:IF X<100000 THEN 20
30 ? "OK"
Two more bugs:

 

- Parser accepts $1B characters as equivalent to ':'. This is because $1B is $80 eor $9B, so it matches the EOL. This is present in rev-A, rev-B, rev-C and TBXL, fixed in altirra-basic. A saved .BAS with this does not list OK in altirra-basic; in the other BASICs it lists as newlines between statements.

 

- "NOT -0" is parsed as "NOT" "-0" in rev-A and TBXL, and parsed as "NOT" "0" in rev-B, rev-C and altirra-basic. Also, the rev-A parsed ".BAS" gives "0" (should be 1) in rev-A, error 11 in rev-B and rev-C and correct result in TBXL and altirra-basic.
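
The arithmetic behind the $1B item can be checked with a few lines of Python. How the parser actually performs the comparison is not stated in the thread, so the matcher below is purely hypothetical; only the exclusive-or itself is a fact.

```python
EOL = 0x9B       # ATASCII end-of-line character
HIGH_BIT = 0x80  # bit 7


def matches_eol_with_flipped_high_bit(ch: int) -> bool:
    """Hypothetical check: treat ch as EOL if flipping bit 7 yields $9B."""
    return (ch ^ HIGH_BIT) == EOL


if __name__ == "__main__":
    print(hex(EOL ^ HIGH_BIT))
    print(matches_eol_with_flipped_high_bit(0x1B))
```

Any comparison of this shape would accept $1B where an end of line or statement separator is expected, which is consistent with the behavior described above.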

Edited by dmsc

Altirra BASIC didn't fix bugs so much as it just never had them in the first place, because it isn't based on Atari BASIC. Many of the weird behaviors it has are intentional for compatibility. Some of the cases that were fixed are probably things I should unfix, actually.

Also, some of these issues aren't bugs so much as quality of implementation issues. ^, in particular, was never guaranteed to return exact results for integers even though it is nice to do so.

 

- A Ctrl-U as last component of a string argument to PRINT works as if the PRINT statement included a semicolon. (Rev.A only, fixed in Rev.B and C).

 

Actually, it's any token sequence which ends in $12 or $15, which means that a string ending in Ctrl-R and a numeric constant with 12 or 15 can also trigger it, i.e. PRINT 1.00000015.
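
Why PRINT 1.00000015 qualifies becomes visible once the constant is written out in the six-byte math pack format (an excess-64 exponent of 100 followed by five BCD bytes). The following Python sketch is an illustrative encoder, not code from any BASIC; it truncates instead of rounding, and it assumes that the tokenized line ends with these six bytes.

```python
from decimal import Decimal


def atari_bcd(value: str) -> bytes:
    """Encode a decimal string as a 6-byte Atari floating point number."""
    d = Decimal(value)
    if d == 0:
        return bytes(6)
    sign = 0x80 if d < 0 else 0x00
    d = abs(d)
    exponent = 0
    while d >= 100:          # normalize so that 1 <= d < 100
        d /= 100
        exponent += 1
    while d < 1:
        d *= 100
        exponent -= 1
    mantissa = []
    for _ in range(5):       # five base-100 digits, each stored as one BCD byte
        pair = int(d)
        mantissa.append(((pair // 10) << 4) | (pair % 10))
        d = (d - pair) * 100
    return bytes([sign | (0x40 + exponent)] + mantissa)


if __name__ == "__main__":
    print(atari_bcd("1.00000015").hex())
```

With this encoding, the last mantissa byte of 1.00000015 is $15 and that of 1.00000012 is $12, the two values named above.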

 

POINT does not accept arbitrary expressions, i.e. "10 POINT #1,SECTOR(I),BYTE(I)" works correct, but "10 POINT #1,SECTOR(I),BYTE(I)+1" is parsed as syntactically wrong, though it is not. (all revisions)

 

POINT is documented in the manual as taking avars, so the bug here is that BASIC should not be accepting either case. However, design-wise it should have taken aexps in both cases to be consistent with all other statements that take in parameters. That was probably an attempt to fix-by-documentation when problems in the code were discovered.

Some other cases that I know of, from my ATBasic notes:

  • READ caches a direct pointer to the current line that is not invalidated by program modifications. Editing the program and then doing a CONT or immediate READ can cause incorrect DATA to be read if the edits cause a different DATA statement to fall in the same place as the old one. FOR probably has similar issues.
  • If the stop line is deleted, CONT continues two lines after where the stop occurred, not one.
  • LOCATE/POINT/PLOT/DRAWTO can be used on the text screen, but only after executing GR.0.
  • GRAPHICS closes the IOCB before evaluating the argument, so GR.3:GR.1/0 will leave you with a graphics screen where all drawing commands fail. This is unusual since for most other commands BASIC evaluates all arguments before doing any irreversible actions.
  • SOUND writes to SKCTL but not SSKCTL.
  • FRE(0) is off by one byte. Possibly intentional, though, due to a rumored similar off-by-one in the Screen Editor memory check.
  • Pressing the Break key while LIST is executing in deferred mode continues execution with the next statement instead of stopping.
  • AND, OR, THEN, and STEP are not fully treated as keywords and can be inadvertently created as variables, which causes havoc. THEN=1 works, PRINT THEN works, PRINT 1 AND THEN works, PRINT THEN AND 1 produces a parsing error.
  • String literals can omit the ending quote. This is legal according to the manual, but notably Floyd of the Jungle (1982) breaks if this is not allowed.
  • XIO can break IOCBs because it overwrites AUX1 without restoring it and AUX1 is used by CIO for permission checks. However, fixing this is inadvisable because it could in turn break devices like R: that expect this behavior and restore AUX1.
  • OPEN allows already opened IOCBs to be used and relies on CIO to throw an error. This can in turn trash the IOCB and cause a crash due to a bug on OS-B.
  • Parsing errors do not remove variables added during the parse from the program.
  • The parser can invoke TRAP and resume program execution if you give it a bad line number (-1) or run out of memory.
  • Many statements that take integer arguments and check for >256 or >32767 will accept 65535.9 as 0. This is a math pack FPI routine bug.
  • IF takes only a literal constant for the line number instead of an aexp. This is documented but inconsistent with other statements that take line numbers.
Edited by phaeron

 

- Parser accepts $1B characters as equivalent to ':'. This is because $1B is $80 eor $9B, so it matches the EOL. This is present in rev-A, rev-B, rev-C and TBXL, fixed in altirra-basic. A saved .BAS with this does not list OK in altirra-basic; in the other BASICs it lists as newlines between statements.

 

- "NOT -0" is parsed as "NOT" "-0" in rev-A and TBXL, and parsed as "NOT" "0" in rev-B, rev-C and altirra-basic. Also, the rev-A parsed ".BAS" gives "0" (should be 1) in rev-A, error 11 in rev-B and rev-C and correct result in TBXL and altirra-basic.

 

The former bug has a couple of additional implications, too. ATASCII-Zero can also be misunderstood as a token, and so can ATASCII-2. I don't remember the details, but the latter two are fixed and no longer present in Basic++.

 

The parsing of NOT NOT, NOT -0 and the like are all consequences of the change in how unary operators are parsed between Rev.A and Rev.B. That Rev.A generates the wrong output is the consequence of invalid priorities of the unary operators (they "bind" with the wrong associativity). Rev.A actually parses "NOT -0" as three tokens: "unary-not", "unary-minus" and "float-const zero". Due to the parser changes, Rev.B and C parse it as "NOT" followed by "minus zero", which the floating point package degenerates into "zero". Rev.B does not allow "NOT NOT 0", though it should be valid. Basic++ fixes this, but also parses "? -2" as two tokens, "PRINT" "-2", whereas Rev.A, B and C parse this as "PRINT" "unary minus" "float-const 2". The net effect is of course the same.


 


 

Huh? READ does not cache pointers. It caches a line-number/offset pair, which is part of the problem why it is so slow. Actually, it caches two line numbers: The line number of the current code, and the line number of the DATA line it wants to read from. It first goes once completely over the program to find the line to read from, and then goes over the program once again to restore the program line pointer. At least this problem got fixed.

 

The only bad thing that can happen is that it may read the wrong data if you replace the DATA line it currently tries to read from.

 

That PLOT and DRAWTO draw on a GR.0 screen is actually just due to the orthogonality of the Os, so there is not really a problem with it. The CONT/STOP bug is a nice find. I believe this is because the line search returns the first line whose number is equal to or larger than the line searched for, so if the STOP line is deleted, it finds the next line. And since it then tries to skip to the next line, assuming the found line is where it stopped, it skips one line too many.
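
The suspected mechanism behind the CONT bug can be modelled in a few lines of Python. This is a sketch of the explanation given above, not of the interpreter's code: the program is reduced to a sorted list of line numbers, and the helper names are invented.

```python
from bisect import bisect_left


def find_line(line_numbers, target):
    """Index of the first line whose number is >= target (may equal len)."""
    return bisect_left(line_numbers, target)


def cont_target(line_numbers, stop_line):
    """Line at which CONT resumes: search for the stop line, then skip one line."""
    index = find_line(line_numbers, stop_line) + 1
    return line_numbers[index] if index < len(line_numbers) else None


if __name__ == "__main__":
    print(cont_target([10, 20, 30, 40], 20))  # stop line still present
    print(cont_target([10, 30, 40], 20))      # stop line deleted before CONT
```

With the stop line present, execution resumes at line 30. With line 20 deleted, the search already lands on line 30 and the extra skip moves on to line 40.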

 

Not sure whether SOUND should write into the shadow register. That's actually used for the serial port control, not so much audio control. Anyhow, it's probably a good idea to update the shadow register?

 

I believe the reason why FRE(0) is off is simply by the definition of the variables. MemTop is the pointer to the highest free byte of the operating system, and HiMem (0x90) is the first byte *not* used by basic. So their difference is not really the number of free bytes, but the number of free bytes - 1. The same bug is actually in the basic memory allocator: It already fails if HiMem becomes equal to MemTop, so in a sense, this is consistent. FRE(0) is the number of bytes Basic "wants" to allocate from the Os. It could allocate one additional byte. It's probably a better idea not to attempt this, though.
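
In numbers, using the two definitions from the paragraph above (MemTop as the highest free byte, HiMem as the first byte not used by Basic), the difference of the pointers is one less than the size of the inclusive range. A tiny Python sketch with made-up example addresses:

```python
def bytes_actually_free(himem: int, memtop: int) -> int:
    """Size of the inclusive address range himem..memtop."""
    return memtop - himem + 1


def fre0_as_described(himem: int, memtop: int) -> int:
    """Plain pointer difference, which is what FRE(0) is said to report."""
    return memtop - himem


if __name__ == "__main__":
    himem, memtop = 0x2000, 0x9FFF  # hypothetical example values
    print(bytes_actually_free(himem, memtop), fre0_as_described(himem, memtop))
```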

 

I cannot quite reproduce the problem with LIST here, though. Probably that has been fixed "by accident".

 

The "token fun" with THEN and so on is again a nice find. I believe there is a place where Basic tests for operator names not to be used as variable names, so I'm not quite sure how this works.

 

More later...


Huh? READ does not cache pointers. It caches a line-number/offset pair, which is part of the problem why it is so slow. Actually, it caches two line numbers: The line number of the current code, and the line number of the DATA line it wants to read from. It first goes once completely over the program to find the line to read from, and then goes over the program once again to restore the program line pointer. At least this problem got fixed.

Well, my guess as to the cause was wrong, but it was based on this behavior:

 

10 DATA 1,2,3
20 DATA 4,5,6
30 READ A
40 STOP
RUN

STOPPED  AT LINE 40
10
READ B

READY
PRINT B
5

READY

The BASIC interpreter switches from midway in the first line to midway in the second. Either an error or restarting at the beginning of the second line would make more sense.

 

Not sure whether SOUND should write into the shadow register. That's actually used for the serial port control, not so much audio control. Anyhow, it's probably a good idea to update the shadow register?

 

BASIC has to at least write into SKCTL for sound channels 3 and 4 to work after boot or a disk load. Asynchronous receive mode (SKCTL bit 4) is left on by SIO after a receive operation and must be cleared for those channels to operate. Many emulators do not emulate this, but it's necessary on real hardware. BASIC writes $03, which is a safe default. However, it also means that if you've enabled POT scan mode or temporarily disabled the keyboard, the SOUND command has the unexpected side effect of resetting those modes.
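
The side effect described above follows from SKCTL being overwritten as a whole. The Python sketch below uses the commonly documented POKEY bit assignments (keyboard debounce and scan in bits 0 and 1, fast pot scan in bit 2, serial mode in bits 4-6); treat these names as assumptions, since the thread only mentions bit 4 and the value $03.

```python
KEYBOARD_DEBOUNCE = 0x01  # bit 0
KEYBOARD_SCAN = 0x02      # bit 1
FAST_POT_SCAN = 0x04      # bit 2
ASYNC_RECEIVE = 0x10      # bit 4, part of the serial mode field (bits 4-6)
SOUND_SKCTL_VALUE = 0x03  # value the thread says BASIC writes


def bits_cleared_by_sound(previous: int) -> int:
    """Bits that were set before SOUND and are clear afterwards."""
    return previous & ~SOUND_SKCTL_VALUE & 0xFF


def bits_set_by_sound(previous: int) -> int:
    """Bits that were clear before SOUND and are set afterwards."""
    return SOUND_SKCTL_VALUE & ~previous & 0xFF


if __name__ == "__main__":
    print(hex(bits_cleared_by_sound(0x13)))  # SIO left asynchronous receive on
    print(hex(bits_cleared_by_sound(0x07)))  # program enabled fast pot scan
    print(hex(bits_set_by_sound(0x00)))      # program disabled the keyboard
```

The first case is the intended cleanup of bit 4; the other two are the unexpected resets of pot scan mode and keyboard state.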

 

Reading the OS manual documentation again, though, SSKCTL is documented as an SIO internal variable. Hmm. That probably means there isn't actually an officially correct way to maintain this shadow externally. SIO itself won't be affected as it always updates bits 4-6.

 



 

That the basic interpreter switches is actually a side-effect of the line number search. The line number search function returns the smallest line number larger than or equal to the indicated line number, but READ does not adjust the offset to the start of the line in case the line number search does not find the target line. So it remains at the same offset, but in a different line. Hmmm....
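
This explanation reproduces the observed result. The Python sketch below is a simplified model: it keeps only the DATA lines, stores the READ position as a line number plus an item index (the real interpreter would use a byte offset), and uses invented names.

```python
from bisect import bisect_left


def read_item(data_lines, wanted_line, offset):
    """Return (value, line found, next offset) for one READ.

    data_lines maps line numbers to lists of DATA items. The flaw being
    modelled: the offset is kept even if the search lands on another line.
    """
    numbers = sorted(data_lines)
    found = numbers[bisect_left(numbers, wanted_line)]
    return data_lines[found][offset], found, offset + 1


if __name__ == "__main__":
    program = {10: [1, 2, 3], 20: [4, 5, 6]}
    a, line, offset = read_item(program, 10, 0)     # READ A
    del program[10]                                 # typing "10" deletes the line
    b, line, offset = read_item(program, 10, offset)  # READ B
    print(a, b, line)
```

READ B lands on line 20 but keeps the offset earned in line 10, so it returns the second item, 5, as in the session quoted earlier.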

 

SSKCTL is pretty much a SIO shadow, but SIO doesn't really make much use of it in the first place. It would work equally well if SIO just poked the value it needs into the hardware register.


 

  • String literals can omit the ending quote. This is legal according to the manual, but notably Floyd of the Jungle (1982) breaks if this is not allowed.
  • XIO can break IOCBs because it overwrites AUX1 without restoring it and AUX1 is used by CIO for permission checks. However, fixing this is inadvisable because it could in turn break devices like R: that expect this behavior and restore AUX1.
  • OPEN allows already opened IOCBs to be used and relies on CIO to throw an error. This can in turn trash the IOCB and cause a crash due to a bug on OS-B.
  • Parsing errors do not remove variables added during the parse from the program.
  • The parser can invoke TRAP and resume program execution if you give it a bad line number (-1) or run out of memory.
  • Many statements that take integer arguments and check for >256 or >32767 will accept 65535.9 as 0. This is a math pack FPI routine bug.
  • IF takes only a literal constant for the line number instead of an aexp. This is documented but inconsistent with other statements that take line numbers.

 

That the ending quote can be omitted is a feature. (-: Actually, it's often quite helpful to avoid unnecessary typing, so it's probably best left alone.

 

For XIO, the problem really lies a bit in the underlying CIO protocol, since XIO can work both on opened (then Aux1,2 should be left alone) and closed (then Aux1,2 are part of the implicit open protocol) channels.

 

Actually, I do not see a bug that leaves implicitly defined variables in the variable table. Rather, Basic has some logic to remove them after a failed parse: it saves the variable table pointers before starting the parsing phase, and if parsing fails and Basic generates an ERROR line, it removes those entries again. There is a counter at $b1 (SVVVTE in Atari Basic) that keeps track of the number of variables created by the parser.
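
The rollback logic described here can be sketched as follows in Python. This models the description in this post only (whether every revision behaves this way is disputed in the thread), and the table layout and names are invented for illustration.

```python
class ParseError(Exception):
    """Raised when the modelled parser rejects a line."""


def parse_line(variable_table, names_in_line, syntax_ok):
    """Add new variable names while parsing; undo the additions on failure.

    Returns the number of variables the line created.
    """
    start_count = len(variable_table)       # remembered before parsing starts
    for name in names_in_line:
        if name not in variable_table:
            variable_table.append(name)
    if not syntax_ok:
        del variable_table[start_count:]    # drop entries added by this line
        raise ParseError("line stored as an ERROR line")
    return len(variable_table) - start_count


if __name__ == "__main__":
    table = ["A"]
    print(parse_line(table, ["A", "B"], syntax_ok=True), table)
    try:
        parse_line(table, ["C", "D"], syntax_ok=False)
    except ParseError:
        print(table)
```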

 

That the parser can invoke TRAP is somehow implicit in the protocol. It somehow has to because otherwise ENTER could not be trapped either, and that is because ENTER does nothing more but redirect the "stdin" of the parser to the given file.

 

The 65535.9 bug does not seem to appear here. That's Os++, so apparently I already fixed that in the math pack. Good to know anyhow. It is probably a matter of the rounding protocol. Os++ has a "round to nearest" policy regarding the float -> int conversion, so 65535.4 works as expected, 65535.5 already throws an error. So apparently, I got something right.
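
The difference between the two behaviors can be illustrated in Python. The thread does not describe the actual mechanism of the math pack bug, so the wrapping variant below is only one hypothetical way for 65535.9 to end up as 0; the checked variant follows the round-to-nearest policy described for Os++. Both handle non-negative inputs only.

```python
def fpi_wrapping(x: float) -> int:
    """Hypothetical buggy conversion: round to nearest, then keep 16 bits."""
    return int(x + 0.5) & 0xFFFF


def fpi_checked(x: float) -> int:
    """Round to nearest and reject results that do not fit in 16 bits."""
    rounded = int(x + 0.5)
    if not 0 <= rounded <= 0xFFFF:
        raise ValueError("value out of range")
    return rounded


if __name__ == "__main__":
    print(fpi_wrapping(65535.9))
    print(fpi_checked(65535.4))
    try:
        fpi_checked(65535.5)
    except ValueError as error:
        print("error:", error)
```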

 

As for the "if", the actual implementation could take anything, i.e. it runs into the full expression evaluation. It pretty much runs into GOTO directly if no second statement is following. Only the parser disallows this. But then again, I'm not quite sure whether this is actually helping:

 

IF X THEN A+10

 

reads somehow funny to me, whereas

 

IF X THEN GOTO A+10

 

does the same, but is more explicit and hence quite clear on its intention.

 

Anyhow, thanks for updating the list.


 

 

- The parser allows INPUT without any parameters, i.e. a line like "10 INPUT" was parsed as "correct".

Present in rev-A, fixed in rev-B, rev-C, TBXL and altirra-basic.

 

Yeah, listing bugs which have been already fixed in rev. B/C (30 years ago) seems barren.

 

It is interesting, though, how apparently no one noticed (not having clicked on the link Sikor quoted) that in Atari BASIC rev. C the DIM instruction can be typed in without parameters, and it is accepted by the parser. Just try 10 DIM{RETURN}


The "token fun" with THEN and so on is again a nice find. I believe there is a place where Basic tests for operator names not to be used as variable names, so I'm not quite sure how this works.

 

More later...

It's actually worse. Basic did (sort of) check whether a variable name is a reserved word when creating variables, but not correctly so. Thus operator functions like ASC could not be used as variable names, though everything else worked, pretty much dependent on context.

 

So for example,

 

PRINT GOSUB

 

creates a new variable named "GOSUB" on the variable table, because GOSUB is also an operator (as in ON x GOSUB). However, you cannot assign to it like this:

 

GOSUB=1

 

because this is parsed as a GOSUB command. However, this

 

LET GOSUB=1

 

worked.

 

GOSUB GOSUB

 

or

 

GOTO GOSUB

 

also worked. Fun, fun, fun...

 

There is even a race condition in the parser that could, in some cases, insert junk tokens into the program, if - by pure chance - the address of the "parse variable" ABML tokens matched the token of an operator just parsed as a potential variable name. I do not know whether that ever happened, but it is a possibility.


>LET GOSUB=1

For me that always was the reason why LET was there in the first place, also in many other, esp. early, BASIC dialects. The option to omit LET came with the "late" BASICs - at least that was my perception. In Sinclair BASIC it was mandatory, and all keywords were fully allowed as variable names. That is esp. important if you consider that new versions of the BASIC can then introduce new keywords without breaking existing code!

 

From the Atari BASIC Manual: "It is advisable not to use a keyword as a variable name or as the first part of a variable name as it may not be interpreted correctly."

 

advisable, not forbidden.

 

(Fun fact. C64 BASIC even refuses LET WELLRUN=1 because it contains "RUN" :-)


Hi!,

 

Altirra BASIC didn't fix bugs so much as it just never had them in the first place, because it isn't based on Atari BASIC. Many of the weird behaviors it has are intentional for compatibility. Some of the cases that were fixed are probably things I should unfix, actually.

I know, I tested the bugs only to document the different behavior.

Also, some of these issues aren't bugs so much as quality of implementation issues. ^, in particular, was never guaranteed to return exact results for integers even though it is nice to do so.

I fully agree on this, those are limitations on the FP implementation.

Actually, it's any token sequence which ends in $12 or $15, which means that a string ending in Ctrl-R and a numeric constant with 12 or 15 can also trigger it, i.e. PRINT 1.00000015.

I remember this bug from when I used TBXL, it caused very hard to explain differences in formatting on my code when writing certain FP results. I never understood the bug fully until recently.
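
To see why a constant such as 1.00000015 qualifies, here is a minimal Python sketch that packs a decimal literal into the six-byte BCD floating-point layout of the Atari math pack (one sign/excess-64 base-100 exponent byte followed by five BCD mantissa bytes). The function name is made up for illustration, digits beyond the tenth are truncated rather than rounded, and it only models the bytes of the constant, not the tokenizer around it.

```python
from decimal import Decimal

def atari_fp_bytes(text):
    """Pack a decimal literal into 6 bytes: exponent byte + 5 BCD bytes.

    Illustrative helper (hypothetical name); truncates extra digits.
    """
    value = Decimal(text)
    if value == 0:
        return bytes(6)
    sign = 0x80 if value < 0 else 0x00
    value = abs(value)
    exponent = 0
    # Normalise so that 1 <= value < 100 (one base-100 digit before the point)
    while value >= 100:
        value /= 100
        exponent += 1
    while value < 1:
        value *= 100
        exponent -= 1
    digits = []
    for _ in range(5):
        digit = int(value)            # one base-100 digit, 0..99
        digits.append(digit)
        value = (value - digit) * 100
    mantissa = bytes(((d // 10) << 4) | (d % 10) for d in digits)
    return bytes([sign | (64 + exponent)]) + mantissa

# The last mantissa byte is the one that can be mistaken for a token.
print(atari_fp_bytes("1.00000015").hex())
```

With this layout the final byte of 1.00000015 is $15 and the final byte of 1.00000012 is $12, which are the two values named in the quote.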

POINT is documented in the manual as taking avars, so the bug here is that BASIC should not be accepting either case. However, design-wise it should have taken aexps in both cases to be consistent with all other statements that take in parameters. That was probably an attempt to fix-by-documentation when problems in the code were discovered.

 

Some other cases that I know of, from my ATBasic notes:

  • READ caches a direct pointer to the current line that is not invalidated by program modifications. Editing the program and then doing a CONT or immediate READ can cause incorrect DATA to be read if the edits cause a different DATA statement to fall in the same place as the old one. FOR probably has similar issues.
  • If the stop line is deleted, CONT continues two lines after where the stop occurred, not one.
This is present in all BASICs that I tested. What I also found "buggy" is that after STOP, the rest of the line is always skipped; I always thought that execution should continue just after the STOP.

  • LOCATE/POINT/PLOT/DRAWTO can be used on the text screen, but only after executing GR.0.

 

I used this in my programs, so I would say that this should be the standard.

 

  • GRAPHICS closes the IOCB before evaluating the argument, so GR.3:GR.1/0 will leave you with a graphics screen where all drawing commands fail. This is unusual since for most other commands BASIC evaluates all arguments before doing any irreversible actions.
  • SOUND writes to SKCTL but not SSKCTL.
  • FRE(0) is off by one byte. Possibly intentional, though, due to a rumored similar off-by-one in the Screen Editor memory check.
  • Pressing the Break key while LIST is executing in deferred mode continues execution with the next statement instead of stopping.
  • AND, OR, THEN, and STEP are not fully treated as keywords and can be inadvertently created as variables, which causes havoc. THEN=1 works, PRINT THEN works, PRINT 1 AND THEN works, PRINT THEN AND 1 produces a parsing error.

 

Yea, the last one is really weird. I always assumed that if you use "LET" before variable names that contained statement names at the start, you could use any word as variable name, and that "GO", "GOTO", "GOSUB", "TO" and "THEN" behaved like statements. Note that my parser really allows any of the above, even allows "AND", "OR" as variable names.

 

  • String literals can omit the ending quote. This is legal according to the manual, but notably Floyd of the Jungle (1982) breaks if this is not allowed.

 

Yea, this is very handy, I always typed SAVE "D:PROG without the closing quote :)

 

  • XIO can break IOCBs because it overwrites AUX1 without restoring it and AUX1 is used by CIO for permission checks. However, fixing this is inadvisable because it could in turn break devices like R: that expect this behavior and restore AUX1.
  • OPEN allows already opened IOCBs to be used and relies on CIO to throw an error. This can in turn trash the IOCB and cause a crash due to a bug on OS-B.
  • Parsing errors do not remove variables added during the parse from the program.

 

I tried parsing "10 TEST = TEST +", but this does not add "TEST" to the variable list in revA, revB, revC or TBXL, only in altirra-basic. Is there any other way to test it?

 

  • The parser can invoke TRAP and resume program execution if you give it a bad line number (-1) or run out of memory.

 

Yes, that was really funny, as sometimes the program you were editing started executing by itself... By the way, this is present in rev-A, rev-B, rev-C and altirra-basic, but fixed in TBXL.

 

  • Many statements that take integer arguments and check for >256 or >32767 will accept 65535.9 as 0. This is a math pack FPI routine bug.
  • IF takes only a literal constant for the line number instead of an aexp. This is documented but inconsistent with other statements that take line numbers.

 

Well, the last one is understandable, as you need to parse expressions like "IF B=0 THEN A=3" as "IF B=0 THEN LET A=3" instead of "IF B=0 THEN GOTO A=3". Best way to avoid this ambiguity is only allow literal constants.
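
One way to picture the 65535.9 case from the list above: assuming FPI checks the range first, then rounds to the nearest integer and keeps only 16 bits (this is a guess at the mechanism, not something verified against the ROM here), the rounded value 65536 would wrap to 0 and slip past any later check. A Python sketch with hypothetical names:

```python
def fpi_model(x):
    """Hypothetical model of a float-to-16-bit-integer conversion.

    Assumption: the range is tested before rounding, so 65535.9 passes
    the test, rounds up to 65536 and then wraps to 0 in 16 bits.
    """
    if x < 0 or x >= 65536:
        raise ValueError("value out of range")
    rounded = int(x + 0.5)       # round half up
    return rounded & 0xFFFF      # keep only 16 bits

def accepts_byte_argument(x):
    # Caller-side check as a statement might do it after the conversion
    return fpi_model(x) < 256

print(fpi_model(65535.9))
```

In this model a statement that rejects anything above 255 would still accept 65535.9, because the check only ever sees the wrapped result.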


Sorry, forgot another condition on the variables-during-parse case -- it happens when the parser runs out of memory trying to insert the statement. You need to fill memory almost completely first, either with arrays or the runtime stack.


For me that was always the reason why LET was there in the first place, also in many other, especially early, BASIC dialects. The option to omit LET came with the "late" BASICs, at least that was my perception. In Sinclair BASIC it was mandatory and fully allowed all keywords as variable names. This is especially important if you consider that new versions of the BASIC can then introduce new keywords without breaking existing code!

 

From the Atari BASIC Manual: "It is advisable not to use a keyword as a variable name or as the first part of a variable name as it may not be interpreted correctly."

 

advisable, not forbidden.

Actually, I doubt that this was intentional. It is really a bug. So for example, while

 

LET GOSUB=1

 

works,

 

LET GOSUB = 1

 

(note the spaces around "="!) does not.

 

For "regular" variable names, LET A = 1 and LET A=1 (with and without spaces) are equally good. The reason is that the parser checks whether the variable name is an operator token, and if it is, it is parsed as an operator and not as a variable if the character following the operator token is below ATASCII '0'. In other words, continuing with a blank space makes it an operator, continuing with an equals sign does not. I believe what the authors had in mind here is a test whether the following character would also qualify as part of a valid variable name, so as to allow variable names like "GOSUB0", which are indeed valid and cause no problem. Unfortunately, they forgot that there are codes between '9' and 'A' that also have a meaning as operators....
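
The rule described above can be sketched in a few lines of Python. This is only a model of the behaviour as explained in this post; the function names and the small operator set are made up for illustration, and the "intended" variant is a guess at what the authors may have meant.

```python
# Illustrative subset of names that are also operator tokens
OPERATOR_NAMES = {"GOSUB", "GOTO", "THEN", "TO", "STEP", "AND", "OR"}

def classify_described(name, next_char):
    """Rule as described: an operator name is treated as an operator
    only if the following character code is below '0'."""
    if name in OPERATOR_NAMES and ord(next_char) < ord("0"):
        return "operator"
    return "variable"

def classify_intended(name, next_char):
    """Guessed intent: keep it a variable only if the next character
    could continue a variable name (letter or digit)."""
    if name in OPERATOR_NAMES and not next_char.isalnum():
        return "operator"
    return "variable"

for follower in (" ", "=", "0"):
    print(repr(follower),
          classify_described("GOSUB", follower),
          classify_intended("GOSUB", follower))
```

A blank (code 32) is below '0' (code 48), so "GOSUB " becomes the operator, while '=' (code 61) lies between '9' and 'A' and lets "GOSUB=" through as a variable.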

Edited by thorfdbg


Sorry, forgot another condition on the variables-during-parse case -- it happens when the parser runs out of memory trying to insert the statement. You need to fill memory almost completely first, either with arrays or the runtime stack.

If the parser runs out of memory during parsing, it does not create a parsing error, it rather traps and then runs into the basic error handler, leaving the line un-parsed. The variable name is allocated first, with the memory for the entry in the variable table second. The problem is that the code never runs into the variable cleanup code if allocation traps.

 

There is, btw., a related bug. If the variable table runs full (i.e. more than 128 entries), the parser only notices at the very end of the variable creation, and then traps. It does not restore the variable name table and the variable pointer table. It should really check in the beginning, and not in the end...

 

Anyhow, exception processing is exceptionally weak here.
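
To illustrate the difference between checking at the end and checking at the beginning, here is a small Python model of the variable-table overflow described above. The two lists stand in for the variable name table and the variable pointer table; all names are hypothetical, and only the limit of 128 entries is taken from this post.

```python
MAX_VARIABLES = 128  # limit quoted in the post

class TableFull(Exception):
    """Stands in for the trap raised when the table overflows."""

def add_variable_late_check(names, pointers, name):
    # Tables are modified first ...
    names.append(name)
    pointers.append(len(pointers))
    # ... and the limit is only noticed afterwards, with no rollback.
    if len(names) > MAX_VARIABLES:
        raise TableFull(name)

def add_variable_early_check(names, pointers, name):
    # Checking up front leaves both tables untouched on failure.
    if len(names) >= MAX_VARIABLES:
        raise TableFull(name)
    names.append(name)
    pointers.append(len(pointers))

names = ["V%d" % i for i in range(MAX_VARIABLES)]
pointers = list(range(MAX_VARIABLES))
try:
    add_variable_late_check(names, pointers, "EXTRA")
except TableFull:
    print("trapped, but table now holds", len(names), "names")
```

With the late check the stray entry stays behind after the trap; with the early check the tables are unchanged.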


That reminds me, parsing rules for statements and function tokens differ as well. Statement tokens are detected by prefix, so GOSUBX becomes GOSUB X. Function tokens have their full name checked instead, so SINE(0) is parsed as an array variable reference instead of SIN followed by a syntax error. That's fine, but unfortunately $ isn't quite handled as expected and BASIC doesn't allow SIN$(0). Yet, it does allow STR(0), because the $ is part of the name of the function STR$.
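
The two matching rules can be modelled roughly like this in Python. It is a sketch with tiny made-up keyword tables, not the real token tables, and it does not try to reproduce the STR/STR$ quirk.

```python
# Illustrative subsets only; the real tables are much longer
STATEMENTS = ["GOSUB", "GOTO", "PRINT"]
FUNCTIONS = {"SIN", "COS", "ASC"}

def match_statement(text):
    """Statements are recognised by prefix: the rest is parsed separately."""
    for keyword in STATEMENTS:
        if text.startswith(keyword):
            return keyword, text[len(keyword):]
    return None, text

def classify_call(name):
    """Functions need the complete name; anything else followed by '('
    is taken as an array variable reference."""
    return "function" if name in FUNCTIONS else "array variable"

print(match_statement("GOSUBX"))   # prefix match splits off the X
print(classify_call("SINE"))       # full-name match fails, so it is a variable
```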

 

By the way, Basic XL/XE adds in more awkward special cases. Normally statement tokens are matched in ascending order by token value, but ENDIF is after END, so it must be special cased in a parser. Even more annoying is BUMP(), which unlike all standard Atari BASIC function tokens, contains the ( as part of the function token.


This is fascinating, and it makes me feel like I was not _quite_ as dumb as a kid as I thought. I'm sure I hit a number of BASIC (Rev B) bugs and, at the time, assumed I was just "doing it wrong" :) A little frustrating to learn about all these little landmines, though. Ergh :)


And more is coming...

 

* ENTER "..." cannot be aborted by pressing break. It will just skip the current line being parsed and continue with the next line.

 

* GOSUB with a missing line number will still push the RETURN address on the run time stack. If TRAP then RETURNs, it returns to the faulty GOSUB. Maybe this is intentional?

 

* An IO error will reset the "direct flag" of the screen driver, i.e. any control characters printed are again executed and not escaped. This would make sense on a BREAK only or if it relates to the editor handler.

* GOSUB with a missing line number will still push the RETURN address on the run time stack. If TRAP then RETURNs, it returns to the faulty GOSUB. Maybe this is intentional?

 

Doubt it, probably just an implementation side effect of chaining to the GOTO handler.

 

* An IO error will reset the "direct flag" of the screen driver, i.e. any control characters printed are again executed and not escaped. This would make sense on a BREAK only or if it relates to the editor handler.

 

This doesn't actually seem like a bad idea to me, given all the ways that LIST can fail, but it's a bit strange since the main error handler already resets DSPFLG. What code path could trigger an I/O error that wouldn't go through the main error path too?


 

Doubt it, probably just an implementation side effect of chaining to the GOTO handler.

 

 

This doesn't actually seem like a bad idea to me, given all the ways that LIST can fail, but it's a bit strange since the main error handler already resets DSPFLG. What code path could trigger an I/O error that wouldn't go through the main error path too?

GOSUB is chained with GOTO, indeed, but whether anything depends on this I do not know. It's at least a bit weird.

 

Concerning the DirectFlag: Yes, that's reset in the main error handler, but why? So for example, why should a failed "POINT" reset a flag of the editor handler? This doesn't make much sense to me. It does make sense to reset the direct flag in case the program is aborted (i.e. the error is not trapped) because then Basic enters the command line mode and this should be "least surprising" to the user. It also makes "kind of" sense in case the program is supposed to be aborted, though it would make more sense to reset the flag in the STOP implementation then rather than in the error handler.


Hmm. This chaining has a couple of implications as well. For example, a GOSUB 1/0 also pushes the return address on the run time stack, but then traps. A RETURN then returns to the position behind the GOSUB.
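
The ordering issue can be shown with a toy Python model of the run time stack. The names are hypothetical and the model only captures the order of operations described here: the return address is pushed first, the target expression is evaluated afterwards.

```python
class BasicError(Exception):
    """Stands in for a trapped BASIC error."""

def gosub(stack, return_line, evaluate_target):
    stack.append(return_line)    # return address is pushed first
    return evaluate_target()     # target is evaluated afterwards, may fail

def failing_target():
    # Models an expression like 1/0 as the GOSUB target
    raise BasicError("division by zero")

stack = []
try:
    gosub(stack, 100, failing_target)
except BasicError:
    # A TRAP handler now sees a stack entry for a GOSUB that never ran
    print("trapped with stack", stack)
```

A RETURN executed in the trap handler would pop that leftover entry, which matches the behaviour described above.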
