Question about how GPU code is defined/copied in Atari 3D Renderer Sample

Sam Bushman · December 30, 2014

Hi guys,

I am currently looking through a test program Atari wrote in 1995 for a 3D renderer for the Atari Jaguar. I am trying to learn a little about how one can generate 3D graphics on the Jag.

The program is made up of C code that defines structures for storing different renderers and input handling code for switching the active renderer. Each of these structures stores the starting address and length of the stored renderer and a function pointer to the GPU drawing code. Each iteration through the main loop, the C code copies the active renderer program into GPU local memory, updates the 3D scene variables, and calls the renderer's draw function.

My question deals with how this GPU code is assembled and stored in memory. Each renderer has the following lines in its source file:

_gourcode:
	.dc.l	startblit, endblit-startblit

	.gpu
	.include "globlreg.inc"
	.include "polyregs.inc"

	.org	G_RAM
startblit:
	...
endblit:

For each renderer structure, the *code label (gourcode in the above example) is referenced for storing the starting address and length of the GPU code for copying purposes. A label between the startblit and endblit labels is referenced for storing the GPU drawing function pointer. My interpretation of this code is that the .gpu directive tells the assembler to treat the following code as GPU assembly (properly handling GPU-specific instructions) and the .org directive causes the address of startblit to be the first address of GPU internal memory (therefore causing the assembled code that follows to be loaded in this memory range).

My question is this: since all 6 renderers in this demo define their GPU programs in the same way (including the use of the .org directive), how does the assembler handle the address that each of these programs is loaded at when the final program is loaded into Jaguar memory? It appears to me that each GPU program would overwrite whatever was previously written to G_RAM, so the C code would have nothing left to copy when it wants to put a different renderer's GPU code into GPU internal memory.

If more code from this test program is needed to answer my question, please let me know.

Thanks for any help and have a happy new year guys!

Chilly Willy · December 31, 2014

The assembler DOESN'T handle it at all. All the code indeed assembles with the same gpu ram address and will overwrite each other. The main program (on the 68000) handles loading each bit of gpu code as needed. Look at this in tri68k.c:

TRIrender(Object *obj, Camera *cam, Lightmodel *lmodel)
{
	Triangle *tri;
	int	tricount;
	Polygon	*pgon, *altpgon;
	unsigned andclips, orclips, curclip;
	int	camx, camy, camz;
	int 	i;
	Texmap *tmap;
	unsigned color, basei;

	/* load up GPU code we will need */
	GPUload(gpufixdivcode);

See how the function loads the gpu code it will use shortly? You'll find other GPUload() calls elsewhere in the code as needed. When it needs to run the code, it uses the GPUexec() function to start the gpu running.
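As a rough illustration of that load step, here is a host-side sketch of what a loader in the spirit of GPUload() might do with the two longwords emitted by `.dc.l startblit, endblit-startblit`. This is not the demo's actual code: `gpu_blob_t`, `sim_gpu_ram`, and `sim_gpu_load` are hypothetical names, the GPU local RAM is simulated with an array so the sketch runs on a PC, and the values of `G_RAM` and `G_RAM_SIZE` are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define G_RAM      0xF03000u  /* assumed start of GPU local RAM */
#define G_RAM_SIZE 4096u      /* assumed size of GPU local RAM (4 KB) */

/* Hypothetical stand-in for GPU local RAM so the sketch runs on a host PC. */
static uint8_t sim_gpu_ram[G_RAM_SIZE];

/* Hypothetical view of one renderer's GPU code as stored in main RAM:
 * the two header longwords, followed by the assembled bytes. */
typedef struct {
    uint32_t run_addr;    /* startblit: address the code was assembled to run at */
    uint32_t length;      /* endblit - startblit, in bytes */
    const uint8_t *code;  /* the bytes, sitting wherever the linker put them */
} gpu_blob_t;

/* Copy one blob into the simulated GPU RAM; returns the offset it landed at. */
static uint32_t sim_gpu_load(const gpu_blob_t *blob)
{
    assert(blob->run_addr >= G_RAM);
    uint32_t offset = blob->run_addr - G_RAM;
    assert(offset + blob->length <= G_RAM_SIZE);
    memcpy(&sim_gpu_ram[offset], blob->code, blob->length);
    return offset;
}
```

Loading two blobs that share the same run address leaves only the second one in the simulated GPU RAM, while both source copies stay intact in main RAM. That is why the renderers can all be assembled for the same address.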

Sam Bushman · January 1, 2015

Thanks Chilly, that makes sense. I still have some confusion about where these programs reside in the executable's address space. I currently have the following assumptions (some of which must be incorrect):
1. Since no OS runs on the Jaguar, the address space of the executable is the same as the address space of the memory mappings for the Jaguar console (there is no indirection between the address the program is poking and the actual hardware memory address).

2. The .org directive tells the assembler that the following code should begin at the expressed address in the executable's address space (G_RAM in the initial example).

3. When the executable is loaded onto the Jaguar, code assembled after the .org directive will be loaded at the system address equal to the definition of G_RAM (meaning the beginning of the GPU local memory).

4. I would think the various GPU programs would clobber each other when the assembled object files are linked together (as they all use the .org directive with the address G_RAM).

Clearly this cannot all be true, as all the GPU programs being used by the various renderers must exist in the executable in order to be copied to GPU local memory when GPUload is called in the first place. What I am not understanding is how the .org directive affects the memory layout of the final executable and the addresses the linker assigns to the various labels of the GPU programs.

I'm a bit of an assembly noob, so thank you for your patience and time. Cheers.

Chilly Willy · January 1, 2015

The code/data is in the 68000 code/data right where it is in the source. All org does is tell the assembler to compute the addresses for that code as if it had already been moved to the right place. So the gpu code may end up at say 0x18000 on the 68000 side, but all the labels and branches inside that block point to the gpu local ram, and once the 68000/blitter moves that chunk of code from 0x18000 to the gpu local ram, it will work properly. It won't work if you try to run it where it is in the system ram... unless you left off the org command so that it assembles to run right there in system ram (which then means following the rules for running in system ram instead of in local ram).
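To make the two kinds of address concrete, here is a small model of the idea, not of any real assembler's internals. `block_t`, `label_addr`, and `stored_at` are hypothetical names; 0x18000 is the example storage address from the post, and any other addresses plugged in are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one assembled GPU block. */
typedef struct {
    uint32_t store_addr;  /* where the bytes sit in the 68000 image, e.g. 0x18000 */
    uint32_t org_addr;    /* the address given to .org, e.g. G_RAM */
} block_t;

/* Address assigned to a label that is 'offset' bytes into the block.
 * Only the .org address matters here, not where the bytes are stored. */
static uint32_t label_addr(const block_t *b, uint32_t offset)
{
    assert(offset <= UINT32_MAX - b->org_addr);
    return b->org_addr + offset;
}

/* Where that same byte physically lives before it is copied to GPU RAM.
 * Only the storage address matters here, not .org. */
static uint32_t stored_at(const block_t *b, uint32_t offset)
{
    assert(offset <= UINT32_MAX - b->store_addr);
    return b->store_addr + offset;
}
```

Two blocks with the same `org_addr` get identical label addresses but different `stored_at` addresses, so they can sit side by side in the executable without clobbering each other.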

Sam Bushman · January 1, 2015

I see. So the directive affects what address is assigned to the labels by the assembler, not the actual layout of the code in the executable. Thanks for the clarification, Chilly.

Otto1980 · January 1, 2015

I really liked this engine/demo.

Another question: is there any known Jaguar game or demo that uses some kind of mipmapping technique?

Sam Bushman · January 11, 2015

I'm currently trying to compile this demo to run on my Skunkboard. I have Michael Hill's SDK for OS X installed, and I'm working to convert the included makefile over to using VBCC and SMAC instead of GCC and MAC. Currently, when SMAC attempts to assemble the first object file listed in the makefile (wfrend.o) I get the following output (running SMAC with the verbose flag):

smac -fb -v -I./include wfrend.s
[including: wfrend.s]
[including: jaguar.inc]
[Leaving: jaguar.inc]
[including: globlreg.inc]
[Leaving: globlreg.inc]
[including: trapregs.inc]
[Leaving: trapregs.inc]
[including: load.inc]
[including: loadxpt.inc]
[Leaving: loadxpt.inc]
[Leaving: load.inc]
[including: clip.inc]
[Leaving: clip.inc]
[including: wfdraw.inc]
[Leaving: wfdraw.inc]
[including: clipsubs.inc]
[Leaving: clipsubs.inc]
[including: init.inc]
[Leaving: init.inc]
GPU RAM USE: 2524 Bytes used
1572 Bytes free
2032 Bytes available for buffer
[Leaving: wfrend.s]
(*top*)[0]: Internal Error Number 7
make: *** [wfrend.o] Error 1

This error doesn't seem to give me much to work with at first glance. Is there some sort of error output file, documentation, or common knowledge about the assembler's error codes I can work with to determine what is happening here?

Thanks guys!

Chilly Willy · January 11, 2015

You probably compiled smac in 64-bit mode. Neither smac nor rmac is 64-bit clean. Compile them in 32-bit mode. If you add -m32 to the gnumakefile as a CC flag, it will compile as 32-bit. It should then work fine.

Sam Bushman · January 12, 2015

I just built smac 1.0.18 (linked to me here by JagChris: http://3do.cdinteractive.co.uk/viewtopic.php?f=35&t=3591 )

Even with the new version and compiling smac with the -m32 flag from source I get the same error on the same file when assembling the 3D example. Is there anything else I can look at?

Edited January 12, 2015 by Sam Bushman

Chilly Willy · January 12, 2015

If you're familiar with it, you could run smac from gdb. That's what I did to find the last bug in it. If I can find some time, I'll try doing that on my system as well.

Sam Bushman · January 12, 2015

Looking into the smac source a bit (keep in mind, this is my first time looking at it), I see that the only place where an error 7 is generated is in sect.c, line 390. It looks like this is the function that resolves fixups for code sections. It looks like the code tries to cache a "chunk" but does not find it. I'll keep looking at it and see if I can provide more info. Thanks.

Edited January 12, 2015 by Sam Bushman

Chilly Willy · January 12, 2015

I saw that too. An extra print shows that the fixup that fails has a location of 010400f0 and a flag word of 3042. The location could be a word-swapped gpu immediate, but that fixup word is odd - it means FU_ISBRA|FU_SUB32|FU_EXPR|FU_WORD.

Chilly Willy · January 12, 2015

The current file number and line number give *TOP* and 0, which combined with the weird fixup word leads me to think this is an extraneous fix up that somehow got onto the fixup list.

EDIT: Putting a continue in place of the interror call caused a segmentation fault, as the fixup list at that point is wonked. Putting a break at that point caused the assembly to finish with no apparent problems. Everything including the symbol table looks fine.

Edited January 12, 2015 by Chilly Willy

Chilly Willy · January 12, 2015

Spoke too soon about everything looking okay. If you look at the listing, you see

  129  00000046  CC03                   	move	PC,return
  130  00000048  0C83                   	addqt	#4,return
  131                                   
  132                                   ;***********************************************************************
  133                                   ; main loop goes here
  134                                   ; it is assumed that "return" points to "triloop"
  135                                   ;***********************************************************************
  136                                   
  137                                   triloop:
  138  0000004A  xxxx                   	addqt	#(skipface-triloop),return
  139  0000004C  9064                   	moveta	return,altskip

And in the .o file you find "CC 03 0C 83 90 64", so the addqt at triloop is being skipped in the output. And now that I think about it, the above is a quick add that needs a sub32 expr to calculate the offset for the add.
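For illustration, here is a sketch of what resolving that instruction involves. The bit layout is inferred from the listing itself (`addqt #4,return` assembles to 0C83, which fits a 6-bit opcode of 3, a 5-bit immediate of 4, and register 3 for `return`); treat it as an assumption rather than a reference. It also assumes that a quick value of 32 is stored as 0. The function names and the address used for `skipface` in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define OP_ADDQT 3u  /* inferred from the listing: 0x0C83 >> 10 == 3 */

/* Build the 16-bit word for "addqt #n,reg", assuming a 6/5/5 bit layout
 * and assuming the quick value 32 is encoded as 0. */
static uint16_t encode_addqt(uint32_t n, uint32_t reg)
{
    assert(n >= 1 && n <= 32);
    assert(reg < 32);
    return (uint16_t)((OP_ADDQT << 10) | ((n & 31u) << 5) | reg);
}

/* "addqt #(target-base),reg" with a forward reference: the immediate is a
 * difference of two label addresses, so it can only be filled in once both
 * labels have been assigned addresses. */
static uint16_t fixup_addqt(uint32_t target_label, uint32_t base_label, uint32_t reg)
{
    return encode_addqt(target_label - base_label, reg);
}
```

With `triloop` at 0x4A as in the listing and a made-up `skipface` at 0x5A, `fixup_addqt(0x5A, 0x4A, 3)` yields one 16-bit word that belongs between `0C83` and `9064` in the object file. That word is what is missing from the bytes quoted above.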

Shamus · January 12, 2015

RMAC is 64-bit clean.

BTW, RMAC is being actively developed and bugfixed, maybe you should give it a try.

Edited January 12, 2015 by Shamus

Chilly Willy · January 12, 2015

I did. While smac (32-bit only) runs through the assembly for tube_se just fine, rmac kicks out tons of errors. You might grab the source and figure out what's causing them. As to 64-bit, smac is SOMEWHAT 64-bit clean, but has issues with the tokens (and maybe other stuff, but certainly with the tokens). Looking at rmac, it looks like you cleaned up the issues with the tokens. So I'm willing to admit I was wrong on that... I probably assumed that the errors rmac had with the tube_se assembly were a 64-bit issue like they were with smac.

Shamus · January 12, 2015

You must have been using an old version. RMAC has been 64-bit clean for well over a year now. If SMAC works on a 64-bit platform, it's more a case of pure luck than anything else, as it stuffs pointers into the token stream.
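A minimal sketch of why stuffing pointers into a 32-bit token stream only works by luck on a 64-bit host. This is not SMAC's actual code: `token32_t`, `roundtrip_through_token`, and `survives` are hypothetical names, and plain 64-bit integers stand in for pointers so the result does not depend on where the host happens to allocate memory.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t token32_t;  /* hypothetical 32-bit token slot */

/* Store an address in a 32-bit token slot and read it back. */
static uint64_t roundtrip_through_token(uint64_t addr)
{
    assert(sizeof(token32_t) == 4);
    token32_t tok = (token32_t)addr;  /* upper 32 bits are dropped here */
    return (uint64_t)tok;
}

/* Nonzero if the address comes back unchanged. */
static int survives(uint64_t addr)
{
    return roundtrip_through_token(addr) == addr;
}
```

An address below 4 GB, such as 0x18000, survives the trip. An address such as 0x00007fff12345678 comes back as 0x12345678, which points somewhere else entirely. Whether a given run works therefore depends on where the allocator places things.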

Shamus · January 13, 2015

I should also add that I was able to successfully build Tube SE with RMAC/RLN without any problems; not sure why you can't.

Chilly Willy · January 13, 2015

Okay, this is really weird. I checked that I'm using the latest rmac, and I am. Then I tried using it for tube_se... and now it works fine. Why wasn't it working earlier? It gave a list of errors a page long, so I switched back to smac. Now, not an error. My brain is melting.

Does rmac support more than a.out format for the output object files?

+CyranoJ · January 13, 2015

Chilly Willy said: "Okay, this is really weird. I checked that I'm using the latest rmac, and I am. Then I tried using it for tube_se... and now it works fine. Why wasn't it working earlier? It gave a list of errors a page long, so I switched back to smac. Now, not an error. My brain is melting."

You must be editing Blackout by mistake. Rum does that.

Shamus · January 13, 2015

AFAIK RMAC outputs BSD style .o objects. But it can probably be easily modified to support others as well.

Also found a bug in RMAC when using the -l switch: If you assemble GPU code with that switch it causes RMAC to barf on "OR" opcodes (I suspect it may cause similar/worse effects on other platforms; YMMV).

Edited January 13, 2015 by Shamus

Chilly Willy · January 13, 2015

The code is pretty hairy (smac or rmac). Have you looked at vasm? I wonder how hard it would be to make the jagrisc target of vasm handle "standard" Jaguar directives and such.

Sam Bushman · January 13, 2015

I just built and tried rmac with the 3D example project (thanks for all the work on VJ and this, Shamus). Sadly, it appears that rmac has issues handling a forward reference in drawpoly.inc:

rmac -fb -u -I./include gourrend.s
drawpoly.inc 198: Error: undefined register equate 'endloop'
drawpoly.inc 520: Error: multiply-defined label 'endloop'
GPU RAM USE: 2992 Bytes used
1104 Bytes free
1560 Bytes available for buffer
make: *** [gourrend.o] Error 2

Do I need to adjust the rmac makefile at all for building on 64-bit OS X?

Edited January 13, 2015 by Sam Bushman

TXG/MNX · January 13, 2015

If you run DOSBox, you can use the original mac tool.

I had the same compile problems in the past. Run through rmac, the source gave some strange GPU errors; unaltered, it assembled fine with the original mac.

Now I use DOSBox with the original mac for those sources...

TXG/MNX · January 13, 2015

@Chilly Willy What is the current version of vasm?

About jagrisc: does one vasm executable assemble both 68K and RISC code? Also, I think VBCC doesn't support vasm (jagrisc) at the moment, or does it?
