Developing a new language - ACUSOL

Pab · March 18, 2014

Question: Does anyone ever really put code between $0500 and $0580? I don't think I've ever seen a routine or a handler that lives there.

If that's a safe enough spot to use, then I'd have a full 256 bytes to play with for buffers when dealing with banked RAM. Instead of the cassette buffer I could just stash everything on Page 5 during a swap.

Obviously there won't be a BASIC line buffer to worry about from $580-$5FF, and I wouldn't be calling any floating point routines during the swap. I could even load the size byte of a full 256-character string into $4FF and the entire string into Page 5 without having to deal with the indirect addressing across pages bug.

flashjazzcat · March 18, 2014

I've used $400/500 plenty in programs, but surely your runtime can just put the banking code at $2000 or some other address inclusive of the application?

BTW: I'm having to fall back onto the MADS assembler's capability of producing SpartaDOS X relocatable code for my GUI. MADS has its own (proprietary) relocatable format, but that's not (yet) suitable for my envisaged needs (namely multiple absolute and relocatable segments in the same executable, with relocatable blocks explicitly targeting conventional or extended RAM). Obviously a linking loader (which I'll have to write) will handle the application loading and fixups, but I'm pretty open-minded when it comes to which relocatable format to implement if someone comes up with startling ideas. External symbols may also be useful.

Taking this train of thought a bit further, 2-3 years down the line I'll probably be looking for some kind of PC-based IDE for the GUI (think: SymStudio for the SymbOS GUI). Obviously these are long term plans, but I'd be very interested in discussing some kind of language implementation targeted at creating relocatable GUI applications further down the line. Just putting it out there...

Pab · March 18, 2014

I don't plan to put code there; the banked RAM code will be at the beginning of the first module. I mean using it for temporary data. I'm more worried about overwriting someone else's code with my data.

Pab · March 19, 2014

Minor milestone achieved. I got the compiler to produce an Atari object code file. Granted, other than the micro-runtime it's useless code (the only code generation routines I've written so far are those to call a procedure) but it had the proper header information and was the correct length. SpartaDOS recognized it and loaded it into the memory it was supposed to be in.

Tickled_Pink · March 19, 2014

The use of INTERRUPT instead of PROC is exactly how interrupt functions are defined in PL65. That compiler's biggest problem is that it produces excessively fragmented compound files.

I'm pretty sure it also stores its runtime library either at $2000 or $2100. The manual was uploaded on these forums some time ago. I think it includes a list of memory locations used by the compiler. Might give you some ideas.

flashjazzcat · March 19, 2014

Pab wrote: "I don't plan to put code there; the banked RAM code will be at the beginning of the first module. I mean using it for temporary data. I'm more worried about overwriting someone else's code with my data."

I see. In that case you could just keep the buffer inside the "bounds" of the executable for safety's sake (after all, that's where your stack, etc, will be).

Regarding stacks: I'd love to see someone use an interleaved implementation with a hard limit of 256 entries using ABS,X addressing. With a 1K stack you could then pass 32-bit ints and it would be pretty fast.

thorfdbg · March 19, 2014

Pab wrote: "Question: Does anyone ever really put code between $0500 and $0580? I don't think I've ever seen a routine or a handler that lives there."

Yes. The bootstrap for the R: handler goes there. Once it is resident, the RAM area will become available again. A couple of the Dos++ runtimes go there, but that is again only during the DOS command line. I also believe that TurboBasic places temps there to speed up the floating point math. The screen buffer of DDT (the debugger in the Mac/65 cart) uses this RAM area as well.

Edited March 19, 2014 by thorfdbg

JoSch · March 19, 2014

Pab wrote: "Math is where I'm going to need the most help on this. Pascal/Delphi code to parse mathematical statements and some fast 6502 math routines would be greatly appreciated."

Here you go: http://www.6502.org/source/floats/wozfp1.txt

thorfdbg · March 19, 2014

Pab wrote: "Math is where I'm going to need the most help on this. Pascal/Delphi code to parse mathematical statements and some fast 6502 math routines would be greatly appreciated."

To be frank, this is why I believe this project will never see the light of day. Honestly, expression parsing is the 101 of building compilers, so this is the easy part. Code generation(!) is the hard part. Really good code generation is the really hard part.

Ok, you asked for it. If you want math routines, and your minimum requirement is to be compatible with anything in the Atari world, you would need to stick to the math pack. Yes, it's slow.

For parsing expressions: The typical approach is that of recursive descent parsing. This can be used to build pseudo-code or intermediate code for the code generator. Wikipedia has a very short introduction to it:

http://en.wikipedia.org/wiki/Recursive_descent_parser
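A minimal sketch of the recursive descent approach described above, written in Python rather than Pascal/Delphi. The grammar, the tuple-based intermediate code, and every name in it are assumptions made for illustration; none of it comes from ACUSOL or from any existing compiler.

```python
import re

# Assumed grammar for this sketch:
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | IDENT | '(' expr ')'
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/()])")


def tokenize(text):
    """Split an expression into number, identifier, and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"bad character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each call emits stack-machine style steps."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.code = []  # intermediate code: a list of (operation, argument)

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            self.term()
            self.code.append(("ADD" if op == "+" else "SUB", None))

    def term(self):
        self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            self.factor()
            self.code.append(("MUL" if op == "*" else "DIV", None))

    def factor(self):
        tok = self.take()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
        elif tok.isdigit():
            self.code.append(("PUSH_CONST", int(tok)))
        elif tok[0].isalpha() or tok[0] == "_":
            self.code.append(("PUSH_VAR", tok))
        else:
            raise SyntaxError(f"unexpected token {tok!r}")


def compile_expr(text):
    """Parse one expression and return its intermediate code."""
    parser = Parser(tokenize(text))
    parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return parser.code
```

Calling `compile_expr("a + 2 * (b - 3)")` yields the operands and operators in postfix order, which is the kind of intermediate form a code generator can then walk. Operator precedence falls out of the call structure: `term` binds tighter than `expr` simply because `expr` calls it.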

The recommended literature for this is the "dragon book" (thus named by the picture on the cover):

Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools.

HTHH.

gozar · March 19, 2014

Cassette buff is a good idea I think, if 128 is enough... I think OSRAM is a valuable resource (10k iirc) but there's some overhead involved that isn't present in just banking; dealing with interrupts is one issue I think.

Keep in mind that for banked code, any code that is running in the bank area is going to be swapped out itself when it requests a byte from another bank, so having a protected location like cassette buf that is not in the banked space is handy.

One thing I would ask is to stay away from OSRAM. :-)

Alfred · March 19, 2014

thorfdbg wrote: "The recommended literature for this is the 'dragon book' (thus named by the picture on the cover): Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. HTHH."

I have nearly twenty different books on compiler design and construction and I have to say that book is the worst for trying to learn from; it's so dense with math. Compiler Design in C by Allen Holub is his attempt to explain the dragon book for non-math majors and it's pretty good.

I've been following this thread and while I like to be optimistic, I don't see a happy outcome here. The goals are simply too ambitious for a 48k or even 128k 6502, for one guy to implement in any reasonable timeframe. Maybe it works, but it's terribly slow, or more likely it just won't ever get finished. I think it would be much more realistic to start small, say implement a subset cleanly and then expand it. Doing everything all at once I think is just going to make it nearly impossible to complete. Unless Pab writes compilers in his day job.

Pab · March 19, 2014

As far as math, I am using the math pack for floating point. After all, it's there and there's no point in wasting program memory reproducing what's there. Size is a concern as well as speed. People for whom speed is of the utmost importance aren't going to be using floating point, anyhow.

For 8/16 bit math, addition and subtraction are pretty easy. For multiplication and division I already have code, but if anyone wanted to volunteer anything that was particularly fast I would be happy to take it.
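For readers following along, the usual way to do integer multiplication and division on a CPU with no multiply instruction is shift-and-add and shift-and-subtract. The sketch below models both in Python with explicit 8-bit and 16-bit masking. It is not Pab's code, and the function names are made up for this example.

```python
def mul8(a, b):
    """Unsigned 8-bit x 8-bit multiply by shift-and-add; returns a 16-bit product."""
    if not (0 <= a <= 0xFF and 0 <= b <= 0xFF):
        raise ValueError("operands must fit in one byte")
    product = 0
    multiplicand = a
    for _ in range(8):                 # one pass per bit of the multiplier
        if b & 1:                      # low bit set: add the shifted multiplicand
            product = (product + multiplicand) & 0xFFFF
        multiplicand = (multiplicand << 1) & 0xFFFF
        b >>= 1
    return product


def divmod8(dividend, divisor):
    """Unsigned 8-bit restoring division; returns (quotient, remainder)."""
    if not (0 <= dividend <= 0xFF and 0 <= divisor <= 0xFF):
        raise ValueError("operands must fit in one byte")
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(7, -1, -1):       # bring dividend bits down from the top
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        if remainder >= divisor:       # the subtract fits, so this quotient bit is 1
            remainder -= divisor
            quotient |= 1
    return quotient, remainder
```

Each loop body maps onto a handful of shift, rotate, and add or subtract instructions on the 6502, which is why the cost is fixed by the bit width rather than by the operand values.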

Pab · March 19, 2014

As for the ambitiousness of the project, I will admit that I cheated just a little here. I had the parser 80% written and debugged before I even came here to discuss the topic. I'd been toying around with the concept off and on for about 8 years, and had written some of the code long before. Before I came here the parser was already able to identify definitions for procedures, functions, classes, and variables, and allocate RAM for the latter two. Last night I got it to write an object file. This morning I have it calling procedures and passing arguments. I'm currently writing the code to return values from a function (took a break to check the web and my mail and bake a loaf of bread). Next on the drawing board after that is assigning values to variables, incrementing and decrementing variables, and by that point I should be just about ready for "Hello, world."

I acknowledge that the big challenges are coming up: loops, conditionals, and math, but I already understand the basics of handling them. I asked for anyone who had Pascal code to parse a mathematical equation to send it my way solely to try and save myself a little time since that is the code that is going to take the longest to write.

As for doing it alone, I hope not to once I have a rough command-line cross compiler for PC finished. When the time comes to write the native Atari compiler I hope to have a few hardy volunteers from here to make the job a little lighter.

Pab · March 19, 2014

Ah, and for those of you who don't know my day job, these links should fill you in:

http://sungenis.com

http://gocomics.com/thenewadventuresofqueenvictoria

flashjazzcat · March 20, 2014

I found this interesting and apposite reading: http://www.dwheeler.com/6502/a-lang.txt

danwinslow · March 20, 2014

Wow, that's very interesting stuff, nice find.

Bryan · March 20, 2014

I read the first few pages... I was thinking that a C parameter soft-stack wouldn't be hard to implement alongside the normal stack. It doesn't really matter that they're separate as long as you push and pull in the order you normally would. If you alternate between inc/dec'ing a ZP pointer LSB and its index, you can quickly make a 512-byte stack.

danwinslow · March 20, 2014

I think that's the approach that CC65 took, additionally allowing the specification of stack size via linker variables.

flashjazzcat · March 20, 2014

Main thing I took from that article regarding stacks was that it would be much more efficient to implement the parameter stack thus (when pushing, for example):

	ldx sp
	lda #< arg1
	sta stack,x
	lda #> arg1
	sta stack+256,x
	inc sp

This, as opposed to:

	ldy #0
	lda #< arg1
	sta (sp),y
	iny
	lda #> arg1
	sta (sp),y
	lda sp
	clc
	adc #2
	sta sp

If the stack grows downwards, of course, simply substitute SBC and DEC.
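To make the comparison concrete, here is a small Python model of the split (interleaved) stack from the first listing, together with a cycle tally for both push sequences. The class and constant names are invented for this sketch. The cycle figures are the standard NMOS 6502 timings and assume `sp` lives in zero page.

```python
class SplitStack:
    """16-bit entries stored as two 256-byte pages sharing one 8-bit index."""

    def __init__(self):
        self.lo = bytearray(256)   # models the page at `stack`
        self.hi = bytearray(256)   # models the page at `stack+256`
        self.sp = 0                # 8-bit index, so at most 256 entries

    def push(self, value):
        x = self.sp                          # ldx sp
        self.lo[x] = value & 0xFF            # sta stack,x
        self.hi[x] = (value >> 8) & 0xFF     # sta stack+256,x
        self.sp = (self.sp + 1) & 0xFF       # inc sp (wraps like a byte)

    def pop(self):
        self.sp = (self.sp - 1) & 0xFF
        return self.lo[self.sp] | (self.hi[self.sp] << 8)


# Cycle counts per instruction for the two push sequences in the post.
SPLIT_PUSH = [
    ("ldx sp", 3), ("lda #<arg1", 2), ("sta stack,x", 5),
    ("lda #>arg1", 2), ("sta stack+256,x", 5), ("inc sp", 5),
]
INDIRECT_PUSH = [
    ("ldy #0", 2), ("lda #<arg1", 2), ("sta (sp),y", 6), ("iny", 2),
    ("lda #>arg1", 2), ("sta (sp),y", 6), ("lda sp", 3), ("clc", 2),
    ("adc #2", 2), ("sta sp", 3),
]


def total_cycles(sequence):
    return sum(cycles for _, cycles in sequence)
```

Under those assumptions the split layout pushes a 16-bit value in 22 cycles against 30 for the indirect version, and the indirect version as posted only updates the low byte of `sp`, so a stack larger than one page would need a carry into the high byte on top of that.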

Alfred · March 20, 2014

Yes, that was an interesting paper but it all comes back to speed. Language features cost cycles and there really aren't many to spare on the Atari. Action! perhaps uses one of them, the fixed location for passing parameters ($A0-$AF) which Clinton handles by using a block copy rather than actually storing the variables one at a time. Non-system subroutine calls generate a call to SArgs to copy the arguments to the $Ax registers. It's not completely apparent, but he seems to define system procedure/functions as those defined with a fixed address:

Proc Cio=$E456 ; system proc

Implementing a stack based calling convention would likely reduce computation speed to that of BASIC or perhaps even slower. No, I think Action! got it right by forgoing that in favour of direct invocation by JSR. Despite all his efforts, I have found that a typical Action! program is at best twice the size codewise of an assembly language version. In particular, Action!'s handling of pointers is atrocious with incredible overhead. Given the handicap of a 16K ROM though, it produces some very good code via a lot of very clever coding.

Speed is everything and anything that impedes that needs to be discarded unless it is of vital necessity.

flashjazzcat · March 20, 2014

Alfred wrote: "Speed is everything and anything that impedes that needs to be discarded unless it is of vital necessity."

Sounds like a compelling argument for pure assembly language.

TXG/MNX · March 21, 2014

Since you are using a cross compiler, maybe you can add more functions to it as well.

For example, an option to include pictures in .jpg/.gif format: during the first pass of compilation the data would be converted to Atari format and included in the output file.

Also an option to add compression to the language as a function.

That way you could compress data in a string/array/memory location, or decompress data to a string/array/memory location.

Last thing: could there be a special function that would output 100% relocatable code?

Pab · March 21, 2014

Milestone achieved.

Not the most elegant way of doing it, or even the most efficient, but it works for now.

MODULE ORG=$1F00

BYTE iccom0 = $342
CARD icbuf0 = $344
CARD icblen0 = $348

PROC CIO=$E456

PROC Put(BYTE b)
   iccom0=$0B
   icblen0=1
   icbuf0=@b
   6502_X=0
   CIO
RETURN

PROC PrintE(STRING s[40])
CARD c
   c=@s
   c==+1
   iccom0=$0B
   icblen0=s[0]
   icbuf0=c
   6502_X=0
   CIO
   Put(155)
RETURN

PROC Main
   PrintE("Hello, world.")
RETURN

Disassembly by "6502 Disassembler for Atari."

;
; Code equates
;
L00A0       = $00A0
L00A1       = $00A1
L00A2       = $00A2
L00D4       = $00D4
L00D5       = $00D5
L2E64       = $2E64
L6548       = $6548
L6F6C       = $6F6C
L7720       = $7720
;
; Start of code
;
            *= $1F00
;
L1F00:      .byte $00
L1F01:      lda L00A0
            sta L1F00
            lda #$0B
            sta IOCB0+ICCOM
            lda #$01
            sta IOCB0+ICBLL
            lda #$00
            sta IOCB0+ICBLH
            lda #$00
            sta IOCB0+ICBAL
            lda #$1F
            sta IOCB0+ICBAH
            ldx #$00
            jsr CIOV
            rts
L1F25:      .byte $00,$00,$00,$00,$00,$00,$00,$00
            .byte $00,$00,$00,$00,$00,$00,$00,$00
            .byte $00,$00,$00,$00,$00,$00,$00,$00
            .byte $00,$00,$00,$00,$00,$00,$00,$00
            .byte $00,$00,$00,$00,$00,$00,$00,$00
L1F4D:      .byte $00
L1F4E:      .byte $00
L1F4F:      lda L00A0
            sta L00D4
            lda L00A1
            sta L00D5
            ldy #$00
            lda (L00D4),Y
            sta L1F25
            tay
            tax
L1F60:      lda (L00D4),Y
            sta L1F25,X
            dey
            dex
            bpl L1F60
            lda #$25
            sta L1F4D
            lda #$1F
            sta L1F4E
            inc L1F4D
            bne L1F7B
            inc L1F4E
L1F7B:      lda #$0B
            sta IOCB0+ICCOM
            lda L1F25
            sta IOCB0+ICBLL
            lda #$00
            sta IOCB0+ICBLH
            lda L1F4D
            sta IOCB0+ICBAL
            lda L1F4E
            sta IOCB0+ICBAH
            ldx #$00
            jsr CIOV
            lda #$9B
            sta L00A0
            jsr L1F01
            rts
L1FA4:      jmp L1FB5
            ora L6548
            jmp (L6F6C)
            bit L7720
            .byte $6F,$72
            jmp (L2E64)
L1FB5:      lda #$A7
            sta L00A0
            lda #$1F
            sta L00A1
            lda #$00
            sta L00A2
            jsr L1F4F
            rts
;
            *= $02E0
;
            .word L1FA4

HELLO.bin

Edited March 21, 2014 by Pab

Pab · March 21, 2014

Some notes on the code created by the such-as-it-currently-is compiler above:

Arguments for procedures are created as local variables at compilation time instead of allocated from a heap or stack at runtime. The block at $1F25 is the 41 bytes needed for a STRING[40].
Local variables are allocated in code, the same way as under Action. Thus the single byte at $1F00 is the "BYTE b" in the Put procedure.
At this moment, parameters are passed through the zero page locations $A0-$AF as under Action. I intend to allow a compiler switch (as I stated before) to directly copy the values in the call into the arguments, but that is for later. I wrote the Action-compatible version first because to support class methods and to handle return values from Functions the way I plan to, I needed to push information to the procedures in question as "hidden" arguments. Under the final version, the default behavior will be to directly copy arguments and only pass a pointer to the current instance of a class and for the return variable through $A0-$AF.
It looks like gibberish in the disassembly, but the data from $1FA7 through $1FB4 is the static string "Hello, world." preceded by a byte for its length (#13).
String arguments (as is the case here) are passed as three-byte string pointers, then copied into the argument variable from that pointer.
@ is used here to designate the address (with no bank reference) of a variable. It acts like a CARD with the variable's address. I used this instead of ^var because ^var would return a three-byte pointer under the language's addressing scheme.
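The string layout in these notes can be checked against the listing. The disassembler decoded the data at $1FA7 as instructions, so the raw bytes below are recovered from the opcodes and operands it printed (`ora` absolute is $0D, `jmp` indirect is $6C, `bit` absolute is $2C, operands low byte first). The Python helper names are made up for this sketch, and decoding as ASCII is an assumption that holds here because these characters have the same codes in ATASCII.

```python
# Bytes at $1FA7-$1FB4, recovered from the misdecoded instructions:
#   ora L6548 -> 0D 48 65      jmp (L6F6C) -> 6C 6C 6F
#   bit L7720 -> 2C 20 77      .byte $6F,$72
#   jmp (L2E64) -> 6C 64 2E
RAW = bytes([
    0x0D, 0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x2C,
    0x20, 0x77, 0x6F, 0x72, 0x6C, 0x64, 0x2E,
])


def decode_string(memory, offset=0):
    """Read a length-prefixed string: one size byte, then that many characters."""
    length = memory[offset]
    body = memory[offset + 1: offset + 1 + length]
    if len(body) != length:
        raise ValueError("string runs past the end of the buffer")
    return body.decode("ascii")


def encode_string(text, capacity=40):
    """Build the fixed-size storage for a STRING[capacity]: capacity + 1 bytes."""
    data = text.encode("ascii")
    if len(data) > capacity:
        raise ValueError("text longer than declared capacity")
    return bytes([len(data)]) + data + bytes(capacity - len(data))
```

`decode_string(RAW)` gives back the 13-character greeting, and `encode_string` with the default capacity produces 41 bytes, matching the size of the block at $1F25.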

And for those who may have missed the reference to three-byte pointers earlier in the thread, pointers in this language will be three bytes long: [AddrLo, AddrHi, Bank]
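A short Python sketch of that three-byte layout, for illustration only. The function names are hypothetical, and what a given bank number selects is not specified in the thread, so the bank byte is treated as an opaque value here.

```python
def pack_pointer(address, bank):
    """Lay out a pointer as [AddrLo, AddrHi, Bank]."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if not 0 <= bank <= 0xFF:
        raise ValueError("bank must fit in one byte")
    return bytes([address & 0xFF, address >> 8, bank])


def unpack_pointer(raw):
    """Return (address, bank) from a three-byte pointer."""
    if len(raw) != 3:
        raise ValueError("pointer must be exactly three bytes")
    lo, hi, bank = raw
    return (hi << 8) | lo, bank
```

This matches what Main does in the disassembly: it stores $A7, $1F, $00 into $A0-$A2 before calling PrintE, which is `pack_pointer(0x1FA7, 0)`.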

Edited March 21, 2014 by Pab

Pab · March 21, 2014

And another note while I'm thinking about it.

Since the "escape" character isn't available on the PC, I'm using the ~ as a substitute. If the compiler comes across ~ it treats it as ATASCII 27. So if you see that in any of my source as I post it along the way, you'll understand.

The main reason I did this was for static strings. Instead of adopting C's or Pascal's method of handling them, it made more sense to me to use escape-quote on an Atari. So to include a quote sign in a static string and not have the compiler think it's the end of the string, press ESC, ESC, quote.

"This is a static string with ~"quotes~" inside it."

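A rough Python sketch of how a scanner might handle the ~ convention when reading a static string. Two details are assumptions, since the post does not spell them out: that ~ followed by a quote stores only the quote character, and that any other ~ is stored as ATASCII 27. The function name is invented for this example.

```python
ESC = 27  # ATASCII escape, which the cross compiler reads from '~'


def scan_string_literal(source, start=0):
    """Scan a quoted literal beginning at source[start].

    Returns (string bytes, index just past the closing quote).
    """
    if start >= len(source) or source[start] != '"':
        raise SyntaxError("string literal must start with a quote")
    out = bytearray()
    i = start + 1
    while i < len(source):
        ch = source[i]
        if ch == "~":
            if i + 1 < len(source) and source[i + 1] == '"':
                out.append(ord('"'))   # escape-quote: keep the quote (assumed)
                i += 2
            else:
                out.append(ESC)        # a lone ~ becomes ATASCII 27 (assumed)
                i += 1
            continue
        if ch == '"':
            return bytes(out), i + 1   # unescaped quote ends the literal
        out.append(ord(ch))
        i += 1
    raise SyntaxError("unterminated string literal")
```

Run on the example line above, the scanner returns the text with both inner quotes intact and consumes the whole literal.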