EXE file segments

morelenmir · June 14, 2013

I have read quite a bit lately about file segments. Some of the chaps pointed out a couple of excellent uses for this programming technique and I would like to take advantage of it. However there is surprisingly little documentation on how to do so. This is what I have gathered so far from scouring quite a few tangential posts:

A single programme can have many segments.
Each segment is loaded sequentially.
Each segment can have a piece of INIT code and RUN code. The former is a routine that is automatically run when the segment is first loaded. The routine specified by INIT must finish with 'RTS' so the segment continues loading. The main routine in the segment is specified by RUN, which is executed when the segment has finished loading. When the routine specified by RUN is completed the next segment is loaded.
Subsequent segments can occupy and overwrite the same memory addresses. This means the first segment could occupy $0600->$067F and the second $0600->$0700.
There are many practical uses for programming with the segment model - routines to clear memory which are then overwritten by the main programme, loading screens and so on.

Now. All these facts are language neutral - although it would be difficult to implement them in the most basic assemblers. MADS seems to be the favoured (cross-)assembler for the Atari at the moment and so I will refer to it. The basic skeleton of a programme using two segments might be:

org $3000 ;segment 1 starts

clear_mem ;routine to erase a chunk of memory
rts

main ;main routine
rts

ini clear_mem
run main


org $3500 ;segment 2 starts

enable_expanded_memory ;routine to set $D301
rts

another_main ;routine that makes use of expanded memory
rts

ini enable_expanded_memory
run another_main

Am I thinking in the right direction? If so my next question would be how do you specify which segment goes first, second, third and so on. Is the order simply based on which comes first in the source code?

...and clearly I STILL cannot get the hang of posting neat, indented code snippets!!!

Edited June 14, 2013 by morelenmir

MaPa · June 14, 2013

I'm not so familiar with EXE file structure but AFAIK it works like this:

- each segment begins with two addresses describing the block of data that follows (start address and end address), so the DOS knows where to load it and how much to load.

- INIT and RUN are segments too. They load a 16-bit address into $2e2-$2e3 and $2e0-$2e1 respectively.

- when DOS loads an INIT block it immediately jumps to the specified address, and when that routine returns via RTS, DOS continues loading with the next block (if there are any more).

- after DOS reaches the end of the file, it jumps to the address specified by the RUN block (so there is only one RUN address; if you specify more than one, the last one is used). If no RUN address is specified, some DOSes jump to the address of the first loaded segment and some can't handle it.
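The layout described in the list above can be sketched in C as a few helpers that build such a file in memory. This is only an illustration: the function names are made up for this example, and the leading $FFFF word is the signature that Atari binary load files start with, which the list does not mention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append a 16-bit value, low byte first; returns the new write position. */
size_t put_word(uint8_t *out, size_t pos, uint16_t value)
{
    out[pos++] = (uint8_t)(value & 0xFF);
    out[pos++] = (uint8_t)(value >> 8);
    return pos;
}

/* Append one segment: start address, inclusive end address, then the data.
 * The caller must make sure `out` has room for 4 + len more bytes. */
size_t put_segment(uint8_t *out, size_t pos, uint16_t start,
                   const uint8_t *data, size_t len)
{
    assert(len >= 1 && (size_t)start + len - 1 <= 0xFFFF);
    pos = put_word(out, pos, start);
    pos = put_word(out, pos, (uint16_t)(start + len - 1));
    memcpy(out + pos, data, len);
    return pos + len;
}

/* Hypothetical example file: signature, one code segment at $2000 holding a
 * single RTS, and a RUN segment that stores $2000 into $02E0-$02E1. */
size_t build_example(uint8_t *out)
{
    static const uint8_t code[]  = { 0x60 };         /* RTS */
    static const uint8_t runad[] = { 0x00, 0x20 };   /* $2000, low byte first */
    size_t pos = 0;

    pos = put_word(out, pos, 0xFFFF);                /* file signature */
    pos = put_segment(out, pos, 0x2000, code, sizeof code);
    pos = put_segment(out, pos, 0x02E0, runad, sizeof runad);
    return pos;
}
```

Called with a buffer of at least 13 bytes, build_example() produces FF FF 00 20 00 20 60 E0 02 E1 02 00 20. The RUN block is nothing more than an ordinary two-byte segment that happens to land on $02E0.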

Edited June 14, 2013 by MaPa

morelenmir · June 14, 2013

I guess my next question then is - how do you specify these INIT and RUN addresses?

In MADS you can use the (logically named!) RUN and INI pseudo-ops I think. However I do not like using special language features and would rather do it by hand. This would require some way of putting the appropriate memory address into $2E2-$2E3 and $2E0-$2E1, for INITAD and RUNAD respectively. Is there some way to do this without specifically using a MADS pseudo-op - or even MADS at all? Could you do this with the bog-standard 'Assembler Editor' cartridge?

MaPa · June 14, 2013

Simply do a segment at address $2e0 or $2e2 with ORG directive or *= or whatever the assembler uses followed by two bytes of the address.

 org $2e0  ;RUN address
 dta $00,$20  ; $2000 is our RUN address

Edited June 14, 2013 by MaPa

morelenmir · June 14, 2013

Am I right in thinking each subsequent segment can have its own INITAD and RUNAD?

Edited June 14, 2013 by morelenmir

Rybags · June 14, 2013

Init addresses are only called once so you need to do a segment which specifies it each time you want an Init call even if the address is the same.

Run address is only called at end of file load so it's sort of pointless specifying it multiple times.

Xuel · June 14, 2013

Each segment does not have its own INIT and RUN. INIT and RUN *are* segments themselves. The only distinguishing feature of an INIT or RUN segment is that they write to the INITAD or RUNAD vectors, i.e. $2e2 and $2e0.

You could have several normal segments with no interleaving INIT address, or several INIT segments with no interleaving normal segments. There is no requirement that they be grouped in any particular way. Though as soon as a RUN segment is encountered, no more segments will be loaded even if the end of the file hasn't been reached. A typical sequence might look like this:

NORMAL
INIT
NORMAL
NORMAL
INIT
NORMAL
NORMAL
RUN

As MaPa showed in post #4, "run <addr>" and "ini <addr>" in MADS are just shorthand for "org $2e0 / dta a(<addr>)" and "org $2e2 / dta a(<addr>)".
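To make that shorthand concrete, here is a small C sketch of the six bytes such a vector segment puts into the file. The function and macro names are invented for this example; only the addresses $02E0 (RUNAD) and $02E2 (INITAD) come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RUNAD  0x02E0u   /* run vector, $02E0-$02E1 */
#define INITAD 0x02E2u   /* init vector, $02E2-$02E3 */

/* Write the 6-byte segment that stores `target` into the two-byte vector at
 * `vector`: start address, end address, then the target, all low byte first.
 * Returns the number of bytes written (always 6). */
size_t emit_vector_segment(uint8_t out[6], uint16_t vector, uint16_t target)
{
    assert(vector == RUNAD || vector == INITAD);
    uint16_t end = (uint16_t)(vector + 1);

    out[0] = (uint8_t)(vector & 0xFF);
    out[1] = (uint8_t)(vector >> 8);
    out[2] = (uint8_t)(end & 0xFF);
    out[3] = (uint8_t)(end >> 8);
    out[4] = (uint8_t)(target & 0xFF);
    out[5] = (uint8_t)(target >> 8);
    return 6;
}
```

For example, emit_vector_segment(buf, INITAD, 0x3000) yields E2 02 E3 02 00 30, which is what an "ini $3000" line would be expected to add to the output file.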

Edited June 14, 2013 by Xuel

Creature XL · June 14, 2013

Though as soon as a RUN segment is encountered, no more segments will be loaded even if the end of the file hasn't been reached.

This is not correct.

After the program has loaded completely, the DOS/file loader jumps through RUNAD, if a RUN segment was loaded. At least that is my experience.

BTW, if there is someone else except me who uses CA65, I can later post how I add INIT/RUN segments with "ld65"

Xuel · June 14, 2013

Doh! Thanks Creature XL. I keep forgetting that. You can actually have RUN as the very first segment and it won't be processed until all segments are loaded. Sorry about that.

morelenmir · June 14, 2013

Damn. I thought I had finally got a handle on segments! I'll have to keep processing this stuff.

MANY thanks for the answers guys!!!

+JAC! · June 15, 2013

Since you're using WUDSN: It comes with an integrated hex editor that recognizes EXE segments. That's useful for analyzing existing files and MADS' output. Here is an "extreme" example :-)

Edited June 15, 2013 by JAC!

danwinslow · June 15, 2013

This is not correct.
After the program has loaded completely, the DOS/file loader jumps through RUNAD, if a RUN segment was loaded. At least that is my experience.

BTW, if there is someone else except me who uses CA65, I can later post how I add INIT/RUN segments with "ld65"

I use ca65, I would appreciate some examples of how to handle this.

Edited June 15, 2013 by danwinslow

Creature XL · June 15, 2013

In the following "CC65" refers to the complete cc65-suite (compiler, assembler, linker, ...)

As the CC65 is not Atari specific, there is a bit of extra work to do.

The assembler (ca65) generates code for address $0000. The linker handles all address specific stuff later.

That has the benefit that you can rearrange the memory later thru modifying a simple text file. The linker config.

In the assembler you divide the code into segments, and in the linker config you specify which segment goes in which memory location. These memory locations are then written to the output file in the order they are defined.

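That is important for Atari XEX files (or COM files, to be precise). Each memory area is written to the file named in its "file" attribute. As far as I can tell from the ld65 documentation, an area without a "file" attribute goes to the default output file, and only an empty name (file = "") keeps its data from being written out.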

Here is the linker config from "HAR'em". The obx files get merged together with a simple shell-script which essentially only uses some lines like

cat blah.obx >> harem.xex

This is not needed but it makes it easier to pack parts if needed.

In the following, I will focus on the "PREP" segment.

MEMORY {
    ZP:       start=$0000 size=$00ed type=rw, define=yes;

    M_FFFF:   start=$0000 size=$0006 file="build/prep.obx";
    H_PREP:   start=$0000 size=$0004 file="build/prep.obx";
    M_PREP:   start=$9c00 size=$0400 file="build/prep.obx";
    I_PREP:   start=$0000 size=$0006 file="build/prep.obx";

    H_DATAOS: start=$0000 size=$0004 file="build/dataos.obx";
    M_DATAOS: start=$2000 size=$2400 file="build/dataos.obx";
    I_DATAOS: start=$0000 size=$0006 file="build/dataos.obx";

    H_GAME:   start=$0000 size=$0004 file="build/game.obx";
    M_GAME:   start=$3100 size=$7200 file="build/game.obx" define=yes;
    R_GAME:   start=$0000 size=$0006 file="build/prep.obx";

    M_BSS0:   start=$0100 size=$00c0 file="/dev/null";
}



SEGMENTS {
    S_FFFF:   load=M_FFFF;

    C_DATAOS: load=H_DATAOS type=ro;
    S_DATAOS: load=M_DATAOS type=ro define=yes;

    C_PREP:   load=H_PREP type=ro;
    S_PREP:   load=M_PREP type=rw define=yes;
    A_PREP:   load=I_PREP type=ro define=yes;
    A_DATAOS: load=I_DATAOS type=ro define=yes;

    C_GAME:   load=H_GAME type=ro;
    S_DLIST:  load=M_GAME type=ro define=yes;
    CODE:     load=M_GAME type=ro define=yes;
    DATA:     load=M_GAME type=rw define=yes;
    A_GAME:   load=R_GAME type=ro;

    BSS:      load=M_GAME type=rw define=yes;

    BSS0:     load = M_BSS0, type = rw, define=yes;

    ZEROPAGE: load = ZP, type = zp;
    EXTZP:    load = ZP, type = zp, optional = yes;
}

To use the COM-file feature of the Atari file loaders, you basically add segments for INITAD/RUNAD which only contain the address to jump to.

Memory areas H_xxx are for file headers. The address doesn't really matter. So I just use $0000 in the memory-section above.

Memory areas M_xxx is the location of the memory for the real data/code

Segments C_xxx are "control" segments.

Segments S_xxx are the real code/data segments.

In the code there is a segment declaration. Remember, '.CODE' is kinda syntactic sugar for '.segment "CODE"'.

        .segment "S_PREP"

; Cool init routine
prep_init:
        nop
        nop
        nop
        rts


; The "XEX control" segment
        .segment "C_PREP"
        SegHead S_PREP          ; the definition of the macro follows


; INITAD segment: a hand-written header ($2e2-$2e3) plus the address to call.
; For RUNAD the header would be $2e0/$2e1.
        .segment "A_PREP"
        .word   $2e2            ; segment start address
        .word   $2e3            ; segment end address
        .word   prep_init       ; the address the file loader should jump to

Helpful macros:

.macro        SegHead    name
       .import    .ident(.concat("__", .string(name), "_LOAD__"))
       .import    .ident(.concat("__", .string(name), "_SIZE__"))
       .word    .ident(.concat("__", .string(name), "_LOAD__"))
       .word    .ident(.concat("__", .string(name), "_LOAD__"))+.ident(.concat("__", .string(name), "_SIZE__"))-1
.endmacro


.macro        SegLen    name
       .import    .ident(.concat("__", .string(name), "_LOAD__"))
       .import    .ident(.concat("__", .string(name), "_SIZE__"))
       .word    .ident(.concat("__", .string(name), "_SIZE__"))-1
.endmacro

The magic here is the "define" in the linker's segment definitions. This defines the labels used in the macros.
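The arithmetic that SegHead performs with those linker-defined labels can be written out in C. This is just a sketch of the calculation; the struct and function names are invented for the example, and `load` and `size` stand for the values of the `__<name>_LOAD__` and `__<name>_SIZE__` symbols.

```c
#include <assert.h>
#include <stdint.h>

/* The two words that precede a segment's data in the file. */
typedef struct {
    uint16_t start;   /* first address the data is loaded to */
    uint16_t end;     /* last address, inclusive */
} SegHeader;

/* Same calculation as the SegHead macro: start = LOAD, end = LOAD + SIZE - 1. */
SegHeader seg_head(uint16_t load, uint16_t size)
{
    assert(size >= 1 && (uint32_t)load + size - 1 <= 0xFFFFu);
    SegHeader h = { load, (uint16_t)(load + size - 1) };
    return h;
}
```

If S_PREP happened to fill all of M_PREP ($9c00, size $0400), seg_head(0x9C00, 0x0400) would give start $9C00 and end $9FFF. The end address is inclusive, which is why the macro subtracts 1.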

More or less that's it. This is a rather quick write-up and no detailed "blog post". So, if that makes no sense to anybody, it's my fault. Feel free to ask. I might sit down in a few weeks and make a thought-out post. When I finalize the game for the ABBUC compo I might have to refresh the stuff from above and then I will document it for everybody.

The linking of "Mighty Jill Off" might be even more interesting, because all segments are packed with "inflate 6502" and therefore the linking is more complex

Again, if something is unclear please ask, or wait till September

Creature XL · June 15, 2013

To check for errors here's the code for my little tool to list segments. It's written in C and should compile without modification on every Linux machine. I guess when you rewrite the file handling and include "windows.h" it should even be portable to Windows. A feature I added is the option "-x" which extracts the segments. I think I added it to get the parts of RMT and/or G2F-XEX files. Can't remember right now.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>   // S_IRUSR and friends used by open()
#include <string.h>

int    saddr, eaddr, seglen;
int    bytes;
char    fnameXEX[1024];
unsigned char    header[6];
unsigned char    buffer[65536];    // a segment can span $0000-$FFFF
int                fd;

int has0xffff( void)
{
   return ((header[0] == 0xff) && (header[1] == 0xff));
}


void calcAddrs( int index)
{
   saddr = header[0+index] + header[1+index]*256;
   eaddr = header[2+index] + header[3+index]*256;
   seglen= eaddr - saddr + 1;
}



int nextSegment( void)
{

   // Segment header has at least 4 bytes.
   if ( read( fd, header, 4) != 4)
   {
       return 0;
   }


   // Starts with 255,255?
   if ( has0xffff() )
   {
       // Relocate byte 2-3
       header[0] = header[2];
       header[1] = header[3];

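       // read two bytes more
       if ( read( fd, header+2, 2) != 2)
       {
           return 0;
       }
   }

   // Determine start, end and length of segment.
   calcAddrs( 0);
   if ( seglen < 1)
   {
       printf( "Bad segment: $%04x - $%04x\n", saddr, eaddr);
       exit( -1);
   }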

   // Read bytes
   bytes = read( fd, buffer, seglen);
   if (bytes != seglen)
   {
       printf( "Read %d bytes. Expected %d\n", bytes, seglen);
       exit( -1);
   }

   // Flag: consistent segment read.
   return 1;
}


void writeSegment( void)
{
   static    int    i = 0;
   int            f;
   char        fname[128];

   // Compose file name
   sprintf( fname, "seg_%03d_%04x_%04x.bin", i, saddr, eaddr);


   // Write segment to file.
   f    = open( fname, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR| S_IWUSR | S_IRGRP | S_IROTH);
   if ( f < 0)
   {
       perror( fname);
       exit( -1);
   }
   write( f, buffer, seglen);
   close( f);

   ++i;
}


int main( int argc, char** argv)
{
   char    options[] = "lxo:";
   int        c;
   int        mode = 0;
   char*    outbase = NULL;


   // Parse CLI arguments
   while ( (c=getopt( argc, argv, options)) != -1)
   {
       switch (c)
       {
           // Option: "List segments" mode
           case 'l':
               mode = 0;
               break;

           // Option: "eXtract segments" mode
           case 'x':
               mode = 1;
               break;

           // Option: set Output base name (parsed, but not used yet)
           case 'o':
               outbase = optarg;
               break;

           // Default:
           default:
               abort();
       }
   }

   // Did we get extra arguments?
   if (optind < argc)
   {
       // Use the first as XEX file name
       strcpy( fnameXEX, argv[ optind]);
   }
   else
   {
       // Exit with error.
       exit( -1);
   }


   // Open XEX for reading
   fd    = open( fnameXEX, O_RDONLY);
   if ( fd < 0)
   {
       perror( fnameXEX);
       exit( -1);
   }


   // Extract mode active?
   if ( mode == 1)
   {
       //
   }


   while ( nextSegment())
   {

       // Check for RUNAD
       if ( (saddr == 0x02e0) && (eaddr == 0x02e1))
       {
           printf( "RUN at:  %4x\n", buffer[0] + 256*buffer[1]);
       }
       else

       // Check for INITAD
       if ( (saddr == 0x02e2) && (eaddr == 0x02e3))
       {
           printf( "INIT at: %4x\n", buffer[0] + 256*buffer[1]);
       }
       else
       {
           printf( "Segment: $%04x - $%04x. Length: %4x ( %5d)\n",
                               saddr, eaddr, seglen, seglen);

           // eXtract mode?
           if ( mode == 1)
           {
               // write segment to a file
               writeSegment();
           }
       }
   }




   exit( 0);
}

snicklin · June 15, 2013

I can confirm that this works on my Linux Mint 15 system.

morelenmir · June 16, 2013

THAT is a lot to digest over Sunday breakfast chaps!!!

I like very much the potential technique of having what you might call a 'temporary routine' - say to blank or set a specific section of memory. By using INITAD and segments you can have it called when your program is loaded. It does its job and is then overwritten by a new segment, so no memory is wasted on a routine that is only going to be used once at execution. For some reason that utility really appeals to me.

FASCINATING stuff and many thanks!

HiassofT · June 17, 2013

To check for errors here's the code for my little tool, to list segments.

I guess quite a lot of us must have written such (or similar) tools :-)

Of course, I also did :-) I called it "ataricom" and it's available both for Windows (in the "atari tools for Win32" package) and for Linux (in the atarisio sources) from my website

http://www.horus.com/~hias/atari/

Ataricom can list, extract, merge and split segments, add run/init blocks and create COM files/blocks from raw data. Docs and examples are in the README.tools file.

so long,

Hias

morelenmir · July 11, 2013

I'm still slogging away at this off and on and might finally have got there.

The segment pointed to by RUNAD is run when every other segment has been loaded. The segment pointed at by INITAD is run as soon as the flow of execution reaches it?

+JAC! · July 11, 2013

Yes, exactly.

Rybags · July 11, 2013

Think of INIT as:

Start of each segment load, INIT is reset to point to an RTS.

End of each segment load (including EOF) INIT vector is called.

The small segments which actually load INIT and/or RUN address are essentially the same as a program/data segment.

Difference between INIT/RUN - RUN only called once (at EOF). INIT calls unlimited, but INIT vector points to RTS again once loader regains control.
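The model above can be sketched as a small C routine that walks a file image and records what a loader would call. It is a simplification, not how any particular DOS is implemented: it only shadows the four vector bytes instead of a full 64 KB of memory, it records an INIT call only when a segment actually wrote to $02E2-$02E3 (which has the same effect as resetting the vector to an RTS before each segment), and all names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RUNAD     0x02E0u
#define INITAD    0x02E2u
#define MAX_CALLS 16

typedef struct {
    uint16_t init_calls[MAX_CALLS]; /* addresses called through INITAD, in order */
    size_t   init_count;
    uint16_t run_addr;              /* final RUNAD value, valid if has_run != 0 */
    int      has_run;               /* nonzero if any segment wrote RUNAD */
} LoadTrace;

uint16_t word_at(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the file is malformed or has too many INITs. */
int trace_load(const uint8_t *file, size_t len, LoadTrace *t)
{
    uint8_t vec[4] = { 0 };         /* shadow of $02E0-$02E3 */
    size_t pos = 0;

    assert(file != NULL && t != NULL);
    t->init_count = 0;
    t->run_addr = 0;
    t->has_run = 0;

    while (pos < len) {
        if (len - pos >= 2 && word_at(file + pos) == 0xFFFF)
            pos += 2;                           /* skip $FFFF signature */
        if (len - pos < 4)
            return -1;
        uint16_t start = word_at(file + pos);
        uint16_t end = word_at(file + pos + 2);
        pos += 4;
        if (end < start)
            return -1;
        size_t n = (size_t)(end - start) + 1;
        if (len - pos < n)
            return -1;

        int init_set = 0;                       /* "INIT points to an RTS" */
        for (size_t i = 0; i < n; i++) {
            uint32_t addr = (uint32_t)start + i;
            if (addr >= RUNAD && addr <= INITAD + 1) {
                vec[addr - RUNAD] = file[pos + i];
                if (addr >= INITAD)
                    init_set = 1;
                else
                    t->has_run = 1;
            }
        }
        pos += n;

        if (init_set) {                         /* end of segment: call INIT */
            if (t->init_count == MAX_CALLS)
                return -1;
            t->init_calls[t->init_count++] = word_at(vec + 2);
        }
    }
    t->run_addr = word_at(vec);                 /* RUN is used only at EOF */
    return 0;
}
```

Feeding it a file whose first segment sets RUNAD and which later sets INITAD twice gives two entries in init_calls and a single run_addr, matching the "INIT unlimited, RUN once at EOF" rule.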

morelenmir · July 12, 2013

FINALLY the penny drops!!! MANY thanks for bearing with me chaps, for some reason this has been a very tough concept for me to grasp.

+Stephen · July 12, 2013

Thanks for posting though, and thanks to everyone who responded. This has been a fun thread to read and it has motivated me to try my first multi-segment file :-)

morelenmir · July 15, 2013

I suppose you can only meaningfully set a segment start address as INITAD after that segment has been loaded? Otherwise you are directing the machine to run a section of memory which at best is all zeroes and worse full of junk data.

For instance you might write a segment that starts at $3000 whose code fills a set area of memory with zeros. After that has been loaded you would set INITAD to the word value $3000, through either assembler-specific directives or a short segment beginning at $2E2. When that micro-segment has finished loading, the loader calls the current value of INITAD, which in turn runs the required routine. When the end of this routine is reached INITAD is automatically reset to point to an RTS. In order to save memory space the NEXT segment loaded can ALSO 'org' at $3000, overwriting the no longer needed memory blanking routine with new code.
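The reuse of $3000 can be pictured with a toy memory image in C. This is only a sketch under simple assumptions: the two byte sequences are stand-ins invented for the example (not a real clearing routine), and the INIT call between the two loads is left as a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy one segment's data into a 64 KB memory image at its load address. */
void load_segment(uint8_t mem[65536], uint16_t start,
                  const uint8_t *data, size_t len)
{
    assert(len >= 1 && (size_t)start + len <= 65536);
    memcpy(mem + start, data, len);
}

/* Two segments that both load at $3000; the second replaces the first. */
void overlay_demo(uint8_t mem[65536])
{
    static const uint8_t once[]  = { 0xA9, 0x00, 0x60 }; /* LDA #$00 / RTS */
    static const uint8_t later[] = { 0xEA, 0x60 };       /* NOP / RTS */

    load_segment(mem, 0x3000, once, sizeof once);
    /* ...the loader would call $3000 through INITAD at this point... */
    load_segment(mem, 0x3000, later, sizeof later);
}
```

After overlay_demo() the bytes at $3000-$3001 belong to the second segment, while $3002 still holds the last byte of the first one. A later segment only replaces the addresses it actually covers, so a shorter replacement leaves the tail of the old routine in memory.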

Edited July 15, 2013 by morelenmir

Rybags · July 15, 2013

Run address you can set at any time - start, middle or end of file doesn't really matter.

Init address will be called as soon as its segment finishes loading. In theory you could have a segment that includes both the INIT address and a short embedded routine that runs e.g. around $2E4 or before $2E0 in RAM, but you'd be overwriting OS RAM and could have stability issues.

I've in the past had a static routine that is repeatedly called with different parameters, e.g. you could have an Init routine that lives @ $4000 and repeatedly call it with parameters elsewhere being changed each time during the load process.

morelenmir · July 16, 2013

I think part of the trouble I had with this was one of terminology. In my mind I was mixing up what might be more descriptively called 'INIT-setting Segments' and 'INIT Segments'. That is, the segment which sets up INITAD - for example:

.ORG $02E2
.WORD $3000

and the actual segment that as a result of this code is called:

.ORG $3000

LDA #$AB
LDX #$00

LOOPSTART STA $4000, X
INX
CPX #$FF
BNE LOOPSTART
RTS

BOTH might be referred to as 'INIT segments', but previously when I was waffling on about them I actually meant the latter. RUNAD is MUCH easier to understand!!!

On the whole I like the functionality that segments bring to Atari assembler and think pretty much all programmes should use them - if only to bring a degree of organization to code without dipping too far in to assembler-specific directives.

Edited July 16, 2013 by morelenmir
