Developing a new language - ACUSOL


Pab

After years of programming in Borland Pascal and Lazarus (a Delphi-like RAD environment) under Windows, I recently went back to playing around with programming for the 8-bits, and realized a number of things that I missed when having to step "down" to the older platform.

 

Being the type of person who tends to reinvent the wheel whenever I find something that doesn't do exactly what I want, the way I want it, I came to the conclusion that a new language was called for that would fix these shortcomings. I dragged out an acronym I first came up with back in the very early 1980s, which I had promised myself I would use if I ever developed a language, and retrofitted it onto this need. I started work on a parser in Lazarus under Windows, which will eventually lead to a compiler. I have the parser to the point where it can handle type definitions, procedure and function calls, variable definitions, and the major structure of a program. The next hurdle will be getting the parser to create pseudo-code, which can then be compiled into a functioning Atari program.

 

A major goal of the project will be, once the Windows-based compiler is finished, to use it to bootstrap a compiler native to the 8-bits. That native compiler can then be modified as the language grows and evolves and bugs are identified.

 

Since a language is a major undertaking, I don't want to develop it completely in a vacuum. The objective is to be useful, and the more ideas and input that can be brought into it the better. Plus, once I have a functioning Windows compiler capable of making at least functional Atari code, the task of writing the native compiler can be farmed out to a series of volunteers. That is why I am starting this thread, to share what I've already got worked out, solicit comment and suggestions, and hopefully involve others in the process.

 

I will be releasing all language definitions and code I write into the public domain.

 

With that out of the way, let me outline the basics of ACUSOL - Atari Computers Unified Symbolic Object-Oriented Language

 

  • Based upon Action! - I am using the Action! language definitions and structure as a starting point. Action! is a fairly well-structured and easy-to-understand language, and the extensions I want to make to it are easily grafted on.
  • Object-oriented - It will be an object-oriented language, with features similar to Object Pascal and Delphi.
  • Easy use of extended memory - It will be easy to use banked RAM in expanded 8-bits with this language. Extended memory can be addressed directly, the compiler will allow variables to be allocated in banked RAM, and portions of the actual program code can be stored in banked RAM and called from anywhere in the program.

The following posts will outline major portions of the language.

 


This is just my opinion.

 

It looks like the Atari 8-bit computers already have many efficient, useful languages. What about targeting the Atari 5200? The last accessible language was 5200BAS and that has been long abandoned. Why not encourage new 5200 homebrew AND have a new, active user base? There would be no other competitor in your niche.


MEMORY MANAGEMENT AND BANKED RAM SUPPORT

The extended RAM in the 130XE and the equivalent upgrades for earlier machines was a powerful feature when it first came out, but was woefully underutilized. With few exceptions (such as BASIC XE) there weren't programming languages or environments that supported this extended memory easily, and almost nothing other than huge RAMdisks used it to its full potential.

A compiled ACUSOL program will include at the beginning of its code some runtime routines (less than 256 bytes worth) to handle banked RAM, which will make the process of accessing this RAM all but transparent to the programmer. This will have some drawbacks (mainly that the first module of code cannot be located between $4000 and $7FFF), but they will be minor when weighed against the convenience this will provide.

Memory addressing convention

To reference a location in extended memory, ACUSOL uses a convention similar to that originally used by MS-DOS:

 

Address:Bank

 

"Address" is any 16-bit address anywhere in RAM. "Bank" refers to the number of the bank the location is in. Bank 0 is the main (unextended) bank. Other banks would be numbered 1-64, depending upon the size of the expansion. The original 130XE banks would be addressed as 1-4. A 320K upgrade would add banks 5-20, and so on.

 

An address in this definition refers to the location where the machine sees the referenced byte. For example, the "first" byte of expanded 130XE memory will be referenced as $4000:01. The last byte of expanded 130XE memory would be $7FFF:04. Locations $0000 through $3FFF and $8000 through $FFFF will always be drawn from the main bank.
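
As a rough illustration of the Address:Bank convention, here is a small Python sketch of how a tool might parse such a reference. It assumes the bank number is written in decimal and that a bank given for an address outside $4000-$7FFF simply falls back to the main bank; the function names are hypothetical and not part of ACUSOL.

```python
import re

BANK_WINDOW = range(0x4000, 0x8000)  # the 16K window where banked RAM appears
BANK_SIZE = 0x4000

def parse_ref(text):
    """Parse '$4000:01' or '$2000' into (address, bank); bank defaults to 0."""
    m = re.fullmatch(r"\$([0-9A-Fa-f]{1,4})(?::(\d+))?", text.strip())
    if m is None:
        raise ValueError(f"bad memory reference: {text!r}")
    address = int(m.group(1), 16)
    bank = int(m.group(2)) if m.group(2) is not None else 0
    if bank != 0 and address not in BANK_WINDOW:
        # Outside $4000-$7FFF the machine always sees the main bank.
        bank = 0
    return address, bank

def extended_offset(address, bank):
    """Byte offset into extended RAM (bank 1 starts at 0), or None for main RAM."""
    if bank == 0:
        return None
    return (bank - 1) * BANK_SIZE + (address - 0x4000)
```

With these definitions, `parse_ref("$7FFF:04")` gives `(0x7FFF, 4)`, and `extended_offset(0x7FFF, 4)` is the last byte of a 64K extended area.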

 

The MODULE statement

 

MODULE was a fairly useless statement in Action!, just serving as a way to tell the compiler we were going to define a few new globals (only really useful at the beginning of INCLUDEd code). ACUSOL will extend the statement to include instructions on where to place the generated code and the allocated variables. This is done through the addition of two new keywords, ORG and VARIABLES, each with a memory reference.

 

Examples:

MODULE

This simply tells the compiler that the code following should be in the main bank, at the last address available in the main bank, and variables allocated within the code the way Action! does.

MODULE ORG=$1F00

The code following this statement will be generated starting at location $1F00 in the main bank, with variables allocated within the code.

MODULE VARIABLES=$4000:1

Code will be generated at the current location in the main bank. Variables will be allocated beginning at $4000 in bank #1.

MODULE ORG=$4000:2 VARIABLES=$2000

Code will be generated at the beginning of Bank #2. Variables will be allocated beginning at location $2000 in the main bank.

 

A MODULE statement will also tell the compiler to terminate the current segment of the binary-load file it is writing to disk, and start another. If the code that follows is in an extended bank, segments will be written to switch in the bank in question using the runtime code generated at the beginning of the compile.
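
For anyone unfamiliar with the binary-load format mentioned above, here is a rough Python sketch of how a compiler back end might emit segments. The layout ($FFFF marker, start address, end address, data) and the INITAD vector at $02E2 are standard Atari DOS conventions. The helper names and the stub address $0600 are made up for illustration; this is not taken from the actual ACUSOL compiler.

```python
import struct

INITAD = 0x02E2  # DOS calls the address stored here as soon as the segment loads

def segment(start, data, first=False):
    """Build one binary-load segment: [$FFFF] start end data."""
    if not data:
        raise ValueError("a segment needs at least one byte")
    end = start + len(data) - 1
    header = struct.pack("<HH", start, end)   # little-endian start and end
    if first:
        header = b"\xff\xff" + header         # marker required on the first segment
    return header + bytes(data)

def init_segment(routine):
    """Segment that makes the loader call `routine` before loading continues."""
    return segment(INITAD, struct.pack("<H", routine))

# Hypothetical load file: main-bank code, then a call to a bank-switch stub
# (assumed to live at $0600), then code destined for the banked window.
image = (
    segment(0x1F00, b"\x60", first=True)   # a lone RTS as placeholder code
    + init_segment(0x0600)
    + segment(0x4000, b"\x60")
)
```

The point of the init segment is that the loader pauses, runs the stub, and only then continues loading, so the bytes of the next segment land in whichever bank the stub switched in.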


ADDED VARIABLE TYPES

 

In addition to the BYTE, CHAR, INT, and CARD types used by Action!, the following types will be supported natively:

 

  • FLOAT - A 6-byte floating point variable, using the FP routines in the OS.
  • STRING - A string type similar to Action!'s CHAR ARRAY.

 

OBJECTS

 

Objects, for those not familiar with the concept, can be seen as "Super records," mixing the variables of a record with procedures and functions that manipulate that object's variables.

 

Objects are defined through the CLASS statement.

CLASS <ident> {(ancestor)} <var declarations> {<procedure declarations>} {<property declarations>} END

Alternately, classes may be defined in the syntax Action! records are defined, for backwards compatibility:

CLASS <ident>{(ancestor)}=[<var declarations> {<procedure declarations>} {<property declarations>}]

"Properties" are pseudo-variables that are reported to the calling code as if they were variables, but can either reference an actual variable or a function/procedure call. For example, a class for handling player-missile graphics might include some of the following:

CLASS Player
   BYTE ypos
   BYTE PlayerNumber
   PROC SetXPosition          ; Set the X-position of the player
   BYTE FUNC GetXPosition     ; Find the X position
   PROC DrawPlayer            ; Draw the player starting at line y
   BYTE PROP x READ GetXPosition WRITE SetXPosition
   BYTE PROP y READ ypos WRITE DrawPlayer
END

Procedures and functions of a class are written later in the code.

PROC Player.SetXPosition(BYTE x)
BYTE POINTER bp
   bp = $D000 + PlayerNumber
   bp^ = x
RETURN

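BYTE FUNC Player.GetXPosition
BYTE POINTER bp
   ; Note: on real hardware the player position registers at $D000-$D003
   ; are write-only, so a working version would return a saved copy of x.
   bp = $D000 + PlayerNumber
RETURN(bp^)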

Let's say you define a player object named "Ship" for a space game using this example. You want to set its position to (80,30). Normally you would store 80 in the memory location for the player's horizontal position, blank the player graphic RAM (or at least the portion already drawn to), and redraw the player starting at horizontal line 30. If you have written the "DrawPlayer" procedure to handle the drawing, then you could simply have the statements

   Ship.X = 80
   Ship.Y = 30

For backward compatibility, Action! TYPE records are supported as objects with no procedures, functions, or properties. In fact, the TYPE keyword aliases to CLASS, and the Action! syntax is supported. The example above could have easily been written as:

CLASS Player=[BYTE ypos
              BYTE PlayerNumber
              PROC SetXPosition          ; Set the X-position of the player
              BYTE FUNC GetXPosition     ; Find the X position
              PROC DrawPlayer            ; Draw the player starting at line y
              BYTE PROP x READ GetXPosition WRITE SetXPosition
              BYTE PROP y READ ypos WRITE DrawPlayer ]

or even

CLASS Player=[BYTE ypos BYTE PlayerNumber PROC SetXPosition BYTE FUNC GetXPosition PROC DrawPlayer BYTE PROP x READ GetXPosition WRITE SetXPosition BYTE PROP y READ ypos WRITE DrawPlayer]
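
For readers who know Python better than Delphi, the way a PROP routes reads and writes can be mimicked with Python's built-in `property`. This is only an analogy, not ACUSOL: the `registers` dictionary stands in for hardware memory, and the assumption that the WRITE routine for y also stores the new value in ypos is mine.

```python
class Player:
    def __init__(self, number, registers):
        self.number = number
        self.registers = registers    # stand-in for memory: {address: byte}
        self.ypos = 0
        self.drawn_at = None          # records where the player was last "drawn"

    @property
    def x(self):                      # plays the role of READ GetXPosition
        return self.registers.get(0xD000 + self.number, 0)

    @x.setter
    def x(self, value):               # plays the role of WRITE SetXPosition
        self.registers[0xD000 + self.number] = value & 0xFF

    @property
    def y(self):                      # plays the role of READ ypos
        return self.ypos

    @y.setter
    def y(self, value):               # plays the role of WRITE DrawPlayer
        self.ypos = value & 0xFF
        self.drawn_at = self.ypos     # a real DrawPlayer would redraw here

registers = {}
ship = Player(0, registers)
ship.x = 80
ship.y = 30
```

The calling code only ever sees plain assignments, while the class decides whether an assignment is a simple store or a procedure call.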
Edited by Pab

This is just my opinion.

 

It looks like the Atari 8-bit computers already have many efficient, useful languages. What about targeting the Atari 5200? The last accessible language was 5200BAS and that has been long abandoned. Why not encourage new 5200 homebrew AND have a new, active user base? There would be no other competitor in your niche.

 

That's part of the beauty of this. Once the basics are laid down, a 5200 compiler could be easily adapted from the normal compiler. Player-Missiles are a natural for object oriented programming because of the kludgy way they were designed for the 8-bits/5200. Of course, the 5200 version of the compiler wouldn't need or have the banked RAM support.


Will this run on real a8 or is this another cross-platform system?

 

I am completely satisfied by my two Assembler Cartridges. One of them is Synassembler (which I use 95% of the time) and the other one is Mac/65.

Will this system (if it runs on a8) also run on a cartridge? I prefer a program language that runs from a cart.


CLASS ANCESTORS AND DESCENDANTS

 

A class can "inherit" members (variables and procedures) from another class, called its ancestor. Any procedure or property names that are the same in the new class as in the ancestor "override" the original in the new class.

 

 

For example, let's say you want to have classes that refer to an object on the screen. These objects are going to have some variables in common, and perhaps some code. Instead of repeating the code several times, descendant objects could be used.

CLASS ScreenItem
   BYTE xpos
   BYTE ypos
END

CLASS TextItem(ScreenItem)
   STRING text
   PROC WriteToScreen
   PROC WriteToX
   PROC WriteToY
   BYTE PROP x READ xpos WRITE WriteToX
   BYTE PROP y READ ypos WRITE WriteToY
END

CLASS PlayerItem(ScreenItem)
   BYTE PlayerNumber
   PROC SetXPosition ; Set the X-position of the player
   BYTE FUNC GetXPosition ; Find the X position
   PROC DrawPlayer ; Draw the player starting at line y
   BYTE PROP x READ GetXPosition WRITE SetXPosition
   BYTE PROP y READ ypos WRITE DrawPlayer
END

; Imagine I wrote the code for player-missiles here. I'm too tired to actually do it.

PROC TextItem.WriteToScreen
   Position(xpos,ypos)
   Print(text)
RETURN

PROC TextItem.WriteToX(BYTE x)
   xpos = x
   WriteToScreen
RETURN

PROC TextItem.WriteToY(BYTE y)
   ypos = y
   WriteToScreen
RETURN

Of course, this is a little more verbose than would actually be needed. It's just meant as an example.

 

Another good use would be for I/O routines. Sometimes you don't care where or how you get your data since you're going to use it the same way no matter where it came from. Likewise, you don't always care where you write your data to, as long as it ends up where it's needed. A "stream" class would allow you to treat a disk, a modem, the keyboard, or even an area in RAM the same way as any other.

CLASS Stream
  PROC Put
  BYTE FUNC Get
END

; pretend I actually wrote code for all these.

CLASS FileStream(Stream)
CLASS KeyboardStream(Stream)
CLASS PrinterStream(Stream)
CLASS ModemStream(Stream)
CLASS MemoryStream(Stream)

KeyboardStream K
PrinterStream P
FileStream File
MemoryStream Buffer

So let's say that we want to copy a text file to the printer. It might work this way:

PROC Main
BYTE b
  File.Open("D1:TEXT.TXT")
  WHILE NOT File.EOF DO
      b = File.Get
      P.Put(b)
  OD
  File.Close
RETURN

In fact, this is so handy that I plan to make a "Streams" unit one of the first things I write for the language's RTL.

Edited by Pab

Will this run on real a8 or is this another cross-platform system?

 

I am completely satisfied by my two Assembler Cartridges. One of them is Synassembler (which I use 95% of the time) and the other one is Mac/65.

Will this system (if it runs on a8) also run on a cartridge? I prefer a program language that runs from a cart.

 

I'm developing the PC end in Lazarus, which is cross-platform. I'm mainly doing that just to bootstrap the Atari compiler, but it could also be expanded and modified as the language grows if people want a PC-side compiler.

 

I pictured the Atari compiler as being disk-based like KASM, but of course a cartridge version could be developed. Depends on who wants to join in the fun.


CHANGES/ADDITIONS TO THE ACTION! LANGUAGE

 

  • C-style and multi-line comments. Anything between /* and */ will be seen as a comment and ignored by the compiler.
    /* Sample ACUSOL program
       Written by some idiot */
    
    MODULE ORG=$1F00
    
  • Also, // can be used the same way ; is in Action! to start a comment that runs to the end of the line, for those used to writing // comments in C.
       PrintE("Hello, World!")  // What a stupid program
  • There is no need to include a definition for the main procedure. Any code at the end of a program will be considered the main procedure.
    MODULE
    
       BYTE x
    
    PROC DoSomething
       PrintE("Hello, World!")
    RETURN
    
       FOR x = 1 TO Rand(10) DO DoSomething OD
    
    END
    
  • The need for "()" for functions and procedures with no arguments is eliminated.
  • "END" may be substituted for "RETURN" and used interchangeably.
  • A pseudo-variable called "RESULT" is used in functions to provide the returned value. This was mainly done to allow for string functions, by letting the routine write directly to the variable the result is being stored in. The old Action! method of "RETURN(X)" is supported for backward compatibility, and is interpreted by the compiler as "RESULT = X RETURN".
    MODULE
    
    CHAR ARRAY HexDigits(16) = ['0 '1 '2 '3 '4 '5 '6 '7 '8 '9 'A 'B 'C 'D 'E 'F]
    
    STRING FUNC HexStr(CARD c)
    BYTE x, y
      RESULT = ""
      FOR x = 1 TO 4 DO          ; a CARD holds four hex digits
          y = c MOD 16           ; low nibble
          RESULT = HexDigits(y) + RESULT
          c = c RSH 4            ; move on to the next nibble
      OD
    RETURN
    
  • String character and array references may be enclosed in either () or []
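
To make the RESULT-style string building concrete, here is the digit-by-digit conversion that a HexStr routine for a 16-bit value performs, written as a short Python sketch. It makes four passes, each taking the low nibble and then shifting right by four bits. The Python names are mine and only mirror the ACUSOL example loosely.

```python
HEX_DIGITS = "0123456789ABCDEF"

def hex_str(c):
    """Return the four-digit hex form of a 16-bit value."""
    result = ""                                # plays the role of RESULT
    for _ in range(4):                         # four nibbles in a CARD
        result = HEX_DIGITS[c % 16] + result   # prepend the low nibble
        c >>= 4                                # move on to the next nibble
    return result
```

For instance, `hex_str(0x04A0)` builds up "0", "A0", "4A0", and finally "04A0".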

     


This is a good idea, I think, but I do have a couple of observations :

 

1. I don't think the OO stuff is all that useful on the 8 bits. You need a comparatively large codebase for that to really kick in its benefits. I do think some abstraction and encapsulation features are good, but I don't think a full OO treatment is warranted. Often these impose some run-time overhead too, which is not absorbed very well on 8 bit machines. Of course, some things, like advanced structures would be useful, and the ability to easily create and use separate libraries that have their own namespaces and library-local variables for instance, would be good. Just my opinion, of course.

 

2. The output code quality has to be really excellent, or people won't use it for significant efforts. Action was so popular not because of the syntactic features ( although they were good ) but rather because it produced FAST code. Speed of produced binary will trump elegance of syntax every time.

Edited by danwinslow

1. I don't think the OO stuff is all that useful on the 8 bits. You need a comparatively large codebase for that to really kick in its benefits. I do think some abstraction and encapsulation features are good, but I don't think a full OO treatment is warranted. Often these impose some run-time overhead too, which is not absorbed very well on 8 bit machines. Of course, some things, like advanced structures would be useful, and the ability to easily create and use separate libraries that have their own namespaces and library-local variables for instance, would be good. Just my opinion, of course.

 

As I have it right now, the objects are not created dynamically at runtime as in C++ and Delphi, but really are a sort of "super TYPE." Most of the overhead, at the moment, is in the compilation process.

 

For example, if I define this object:

CLASS SampleClass
   BYTE b
   CARD c
   FLOAT f
   STRING s[8]
   PROC SampleProc
   FUNC SampleFunc
END

And create two objects of it:

MODULE VARIABLES = $2000   ; For this example

SampleClass SC1
SampleClass SC2

Then SC1 would be allocated at $2000 and SC2 18 bytes later at $2012. The PROC and FUNC for the class would be compiled just once.

 

When calling a method of a class, the first three bytes of the arguments passed to it are the address and bank of the object being worked upon. If we pass arguments the way that Action! did (for backward compatibility) then...

SC1.SampleProc($04A0,$30,0)

Would see $00 in the accumulator, $20 in the X register, $00 (main bank) in the Y register, the value $A0 in location $A3, $04 in $A4, $30 in $A5, and 0 in $A6.

 

References to member variables would also be calculated at compilation. So if in our sample procedure we have the code

    C = b & $0E

then for SC2 the compiler would generate:

     LDA $2012    ; b
     AND #$0E     ; & $0E
     STA $2013    ; C low end
     LDA #0
     STA $2014

A reference to SC1 would compile the same code, but with $2000, $2001, and $2002 as the addresses in question.
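
The address arithmetic above can be checked with a short Python sketch of compile-time layout. It assumes STRING s[8] takes nine bytes (eight characters plus a length byte, as with Action! strings), which is what makes the object come out at 18 bytes. The table and function names are illustrative only.

```python
SIZES = {"BYTE": 1, "CHAR": 1, "INT": 2, "CARD": 2, "FLOAT": 6}

def field_size(type_name, length=None):
    if type_name == "STRING":
        return length + 1          # assumption: one length byte plus the characters
    return SIZES[type_name]

def layout(fields):
    """Return ({field: offset}, total size) for a list of (name, type, length)."""
    offsets, offset = {}, 0
    for name, type_name, length in fields:
        offsets[name] = offset
        offset += field_size(type_name, length)
    return offsets, offset

def allocate(base, names, fields):
    """Place objects back to back from `base`; return each field's address."""
    offsets, size = layout(fields)
    return {
        obj: {field: base + i * size + off for field, off in offsets.items()}
        for i, obj in enumerate(names)
    }

SAMPLE = [("b", "BYTE", None), ("c", "CARD", None),
          ("f", "FLOAT", None), ("s", "STRING", 8)]
addresses = allocate(0x2000, ["SC1", "SC2"], SAMPLE)
```

Here `addresses["SC2"]["b"]` is $2012 and `addresses["SC2"]["c"]` is $2013, matching the LDA/STA operands in the listing, while SC1's b and c sit at $2000 and $2001.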

I aim to have the final compiled code be as fast as possible, personally.


Yes, anything that can be done at compile time is probably reasonable, given that it's a cross compiler. Run-time polymorphism and so forth isn't really necessary.

 

Sounds good, but I'll reiterate the advice on object speed. A good extended memory system would offset some of that, but that's going to run into its own challenges with various DOSes and different extension mechanisms.

 

I'd be very interested in the plans for the extended memory stuff by the way, when you get to that stage. I might be able to help, I have done some work towards something like that.


Funny you should mention that. I just got to the point where the compiler is generating actual code. The first thing dumped, obviously, is the extended memory runtime code.

 

This is what I'm using to start with. 117 bytes long as listed here. Originally assembled in Page 6, but relocated on the fly as the compiler writes it to the output object file.

The logic I used in this was: Banks 0-3 (the XE banks) are addressed normally. A 320K upgrade is addressed through bit 5 (the old ANTIC bit), a 576K upgrade through bit 1 (the old BASIC bit), and a 1088K upgrade through bit 6. This is how I remember it being done, but it's been ages since I've actually played around on a physical Atari, and that was a stock XE!

Anyone wanting to rewrite this code is more than welcome to!

; Banked RAM access code
;
BANKADDR = $CB ; Pointer: low byte here, high byte in $CC
MAINADDR = $CC
BANK = $CD
SCRATCH = $D4
PORTB = $D301
OLDPORTB = $CF
;
 *=$600
BANKSELECT
 LDA PORTB
 STA OLDPORTB ; Remember the current setting for UNBANK
 LDA BANK
 AND #$03 ; Bank bits 0-1 (the four XE banks)
 ASL A
 ASL A ; ...go to PORTB bits 2-3
 STA SCRATCH
 LDA BANK
 AND #$04 ; Bank bit 2 (320K upgrades)
 ASL A
 ASL A
 ASL A ; ...goes to PORTB bit 5
 ORA SCRATCH
 STA SCRATCH
 LDA BANK
 AND #$08 ; Bank bit 3 (576K upgrades)
 LSR A
 LSR A ; ...goes to PORTB bit 1, the BASIC bit
 ORA SCRATCH
 STA SCRATCH
 LDA BANK
 AND #$10 ; Bank bit 4 (a megabyte upgrade?)
 ASL A
 ASL A ; ...goes to PORTB bit 6
 ORA SCRATCH
 STA SCRATCH ; We have set all the bank bits
 LDA OLDPORTB
 AND #$01 ; Keep the OS ROM bit, clear everything else
 ORA SCRATCH ; Merge in the bank bits (X is left untouched)
 STA PORTB
 RTS
UNBANK PHA ; Restore PORTB without disturbing A
 LDA OLDPORTB
 STA PORTB
 PLA
 RTS
BANKPEEK JSR BANKSELECT ; Returns the byte in A
 LDY #0
 LDA (BANKADDR),Y
 JSR UNBANK
 RTS
BANKPOKE PHA ; Byte to store is in A
 JSR BANKSELECT
 PLA
 LDY #0
 STA (BANKADDR),Y
 JSR UNBANK
 RTS
CARDPOKE PHA ; Low byte in A, high byte in X
 JSR BANKSELECT
 PLA ; Get the low byte back
 LDY #0
 STA (BANKADDR),Y
 INY
 TXA
 STA (BANKADDR),Y
 JSR UNBANK
 RTS
CARDPEEK JSR BANKSELECT ; Returns low byte in A, high byte in X
 LDY #1
 LDA (BANKADDR),Y
 TAX
 DEY
 LDA (BANKADDR),Y
 JSR UNBANK
 RTS
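
The bank-to-PORTB bit mapping described above is easy to get wrong by one shift, so here is the same mapping as a small Python function that can be checked on a PC. It follows the bit assignments given in the post (bank bits 0-1 to PORTB bits 2-3, bit 2 to bit 5, bit 3 to bit 1, bit 4 to bit 6) and keeps only the OS ROM bit of the old PORTB value. Whether those assignments match any particular memory upgrade is not verified here, and the function name is made up.

```python
def portb_for_bank(bank, old_portb):
    """PORTB value that would select `bank` under the mapping in the post."""
    bits = (bank & 0x03) << 2      # bank bits 0-1 -> PORTB bits 2-3
    bits |= (bank & 0x04) << 3     # bank bit 2    -> PORTB bit 5
    bits |= (bank & 0x08) >> 2     # bank bit 3    -> PORTB bit 1
    bits |= (bank & 0x10) << 2     # bank bit 4    -> PORTB bit 6
    return (old_portb & 0x01) | bits

# Every one of the 32 bank numbers should produce a different PORTB value.
values = {portb_for_bank(bank, 0xFF) for bank in range(32)}
```

Running the loop over all 32 bank numbers is a quick sanity check that no two banks collide on the same PORTB value.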
 

What I think would be nice in a language is case-insensitive keywords/reserved words.

 

I really like BASIC XL because DRAWTO, drawto, DrawTo, and every permutation of that including inverse characters are all handled equally and recognized by the tokenizer.


This looks like a great project. If I may add one request. I would like to see the compiler have a switch to optimize code to take advantage of the 65816/802 extended instruction set. For those who have a 6502, don't use the switch, and it optimizes for 6502. Possibly a separate switch for using 816 flat memory above $FFFF.

 

That should handle all possibilities of 6502, 65c802, and 65c816.

 

Is this plausible? I hope so. I want some software to take advantage of my 802 :)


Hello Pab

 

Nice to see that you are back.

 

 

The logic I used in this was: Banks 0-3 (the XE banks) are addressed normally. A 320K upgrade addressed through bit 5 (the old ANTIC bit). 576K upgrade addressed through bit 1 (the old BASIC bit). 1088K upgrade through bit 6. This is how I remember it being done, but it's ages since I've actually played around on a physical Atari, and that was a stock XE!

 

Why don't you use a routine to test which banks are available/which bits are used? That way, your programming language will be compatible with more upgrades.

 

Sincerely

 

Mathy

Edited by Mathy

 

And create two objects of it:

MODULE VARIABLES = $2000   ; For this example

SampleClass SC1
SampleClass SC2

Then SC1 would be allocated at $2000 and SC2 18 bytes later at $2012.

 

This way you will duplicate some of the slowness of CC65 :-/

In a typical situation where you need an object, you have an array of objects. Take a look at this example:

http://www.cc65.org/mailarchive/2010-09/8593.html

At least there should be an option for how structures are placed.

 

My other proposals are:

  • Do not use the stack for local variables (therefore no recursion; recursion can be easily programmed with a user-defined stack).
  • Do not use the stack for function parameters - make them global variables.
  • Perform call tree analysis to find out which local variables can be reused.
  • Allow zero-page variables to be defined for speed.
  • Allow data (& code) to be placed at defined addresses. This is important with all the Atari alignment issues (fonts, sprites, LMS etc.).
  • Avoid pointers in the language if possible. They (and recursion) rule out most optimizations.
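
To illustrate the call tree analysis idea from the list above, here is a minimal Python sketch. Each procedure's locals are placed just past the locals of every procedure that can be active when it is called, so procedures that are never active at the same time share storage. It assumes no recursion, and the procedure names and sizes are invented for the example.

```python
def overlay_offsets(calls, sizes, roots):
    """Give each procedure a start offset in one shared block of locals.

    calls maps a procedure to the procedures it calls; sizes gives the
    number of bytes of locals each one needs. The call graph must have
    no cycles (no recursion).
    """
    offsets = {}

    def place(proc, base):
        if offsets.get(proc, -1) >= base:
            return                      # already placed at least this deep
        offsets[proc] = base
        for callee in calls.get(proc, ()):
            place(callee, base + sizes[proc])

    for root in roots:
        place(root, 0)
    total = max(offsets[p] + sizes[p] for p in offsets)
    return offsets, total

calls = {"Main": ["A", "B"], "A": ["C"], "B": ["C"]}
sizes = {"Main": 2, "A": 3, "B": 4, "C": 1}
offsets, total = overlay_offsets(calls, sizes, ["Main"])
```

In this example A and B both start at offset 2 because they are never active together, and the whole program needs 7 bytes of locals instead of the 10 that separate allocation would take.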

Btw, take a look at http://www.dwheeler.com/6502/

Edited by ilmenit

IMO, Action! was a terrific starting point. The big weakness in Action! is math. Not only is it weak, but it is buggy and doesn't work as it is supposed to much of the time. Hopefully you can improve on this (even if it is not lightning-quick).

 

-Larry


I think ilmenit's and Mathy's suggestions above are very good.

 

About the extended bank routines... if you are planning to provide access only on a byte-by-byte basis, i.e., BANKPEEK, you will introduce a lot of slowness. I would suggest you make provisions to map a page or a 1K block of extended RAM at a time into a special reserved area in real memory. Most of the time, memory access is sequential, i.e., a program that has just read from byte 0002 is very likely to read 0003 next.
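
The difference between the two approaches can be estimated with a toy Python model that counts how often the bank would have to be switched. It is read-only and ignores write-back, the 256-byte block size is just one of the options mentioned, and the class names are invented.

```python
WINDOW = 0x4000  # banked RAM appears at $4000-$7FFF

class BankedRAM:
    """Toy model of extended RAM that counts bank switches."""
    def __init__(self, banks=4):
        self.banks = {n: bytearray(0x4000) for n in range(1, banks + 1)}
        self.switches = 0               # how many times PORTB would be written

    def peek(self, address, bank):
        self.switches += 2              # select the bank, then restore
        return self.banks[bank][address - WINDOW]

    def read_block(self, address, bank, size):
        self.switches += 2              # one select/restore pair per block
        start = address - WINDOW
        return bytes(self.banks[bank][start:start + size])

class BlockCache:
    """Keeps one block of extended RAM copied into main memory."""
    def __init__(self, ram, size=256):
        self.ram, self.size = ram, size
        self.key, self.block = None, b""

    def peek(self, address, bank):
        base = address - (address - WINDOW) % self.size
        if (bank, base) != self.key:    # miss: copy the whole block across
            self.block = self.ram.read_block(base, bank, self.size)
            self.key = (bank, base)
        return self.block[address - base]
```

Reading 256 consecutive bytes through `BankedRAM.peek` costs 512 switches in this model, while the same reads through `BlockCache.peek` cost 2, at the price of the copy itself and of 256 bytes of main memory.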


 

Why don't you use a routine to test which banks are available/which bits are used? That way, your programming language will be compatible to more upgrades.

 

 

I will probably include one in the RTL, but at this point it's not that vital. I have a sneaking suspicion that most people who develop with this are going to be sticking to banks 1-4 (the standard XE banks) since most people will develop for the lowest common denominator.

 

I also plan to have banked versions of Peek(), Poke(), and Move() in the RTL, so the program could use banks at runtime that weren't utilized in the writing/compilation process.

