Showing results for tags 'gcc'.

Found 8 results

  1. Hi All, I've found some more time to work on my Alex Kidd port. I manually converted one of the original game's maps to a format that is compatible with my scrolling routines. Naturally, I lost quite a bit of detail, but the end result isn't too bad.

This is what the original looks like: And this is what the same map looks like on the TI:

I had to move some blocks around, or remove some blocks, to avoid color clash (you can't have two differently colored blocks directly next to each other), but all in all, most of the details are there. I also originally did a very nice conversion of the cave the old man is sitting in at the end, but I ran out of patterns and had to ditch both that and the clouds. I'm thinking of adding it in a static screen that gets tagged onto the end of the map, but those are details for later. Either way, I'm only using some of the original game maps to test playability and will probably design my own at some stage (so it'd be a sequel/prequel instead of a direct port), so I'm not sure how important those issues are.

Next, I started working on real collision detection and fine-tuning the physics to be a bit more in line with the original game. It seems to be working well, although I still need to add a collision check for the top of the sprite (you could jump into one of the blocks now and get stuck if you really try).

Next up: finish the collision detection routines, find a way to make Alex face the other direction and implement punching. Also, add the jumping and punching sound effects. After that, I'll need to figure out how to make the blue blocks and the gold boxes destructible in the scrolling data... punching and destroying blocks is kinda the main mechanic of the game. It should be doable, but it will be quite convoluted as I have to decide which patterns to put in place depending on the object on the other side of the block.

Finally, here's a video recorded in MESS. Normal caveats apply: YouTube makes it a bit choppier than the real thing, and of course the colors are a bit more washed out here.

*edit* The latest work-in-progress version is attached to this post for those who are interested in trying it out. Latest version is from August 24th, 11pm CET. alexkidd.dsk
  2. I wasn't sure whether I should post this under the existing GCC thread or start a new thread. I'm putzing around with GCC and I think I may have found another bug when doing multiplies. From what I can tell there have been fixes for other multiply bugs. This may or may not be related.

My environment:

    Fedora 25 Linux
    GCC 4.4.0 with the tms9900 1.14 patches

I recall having an issue when running Insomniac's installer: there were some directories that the install was expecting to be there but weren't. All I did was create those directories and run the installer again. Everything seemed to work fine. I don't know if this is relevant to my issue, but I'm just putting it out there.

So on to the code. I have two versions of a function that does a simple calculation using some input parameters. The first takes all required params and uses them to perform the calculation. This version works just fine:

    unsigned int get_vdp_addr_inline(int width, int row, int col)
    {
      return row * width + col;
    }

Code that calls the above:

    unsigned int addr_inline = get_vdp_addr_inline(32, 1, 0);

And now onto the buggy version. In this version the width is stored in a struct. The pointer to the struct is passed into the function instead of the width:

    typedef struct
    {
      int width;
      int height;
      int foo;
      int bar;
    } test_rect;

    unsigned int get_vdp_addr_struct(test_rect* rect, int row, int col)
    {
      return row * rect->width + col;
    }

Code that calls the above:

    test_rect rect;
    rect.width = 32;
    rect.height = 24;
    rect.foo = 0;
    rect.bar = 0;
    unsigned int addr_struct = get_vdp_addr_struct(&rect, 1, 0);

Like I said, this version doesn't give me the correct result. In debugging it with Tursi's Classic99, I see the following assembly generated for the second version of the function above. I added some comments about what it is doing.

    ; Some memory locations:
    ; *0000: 83 E0 00 24 83 C0 09 00 ...$....
    ; *3FF2: 00 20 00 18 00 00 00 00 . ......

    ; Initial conditions when the function is called:
    ; R1 (3FF2) is the pointer to "rect"
    ; R2 is "row" param
    ; R3 is "col" param
                          ; R1 == 3FF2, R2 == 0001, R3 == 0000
    61FA C042 mov R2,R1   ; R1 == 0001, R2 == 0001, R3 == 0000
                          ; The above move instruction just blasted the "rect" pointer... this will not work.
    61FC 3851 mpy *R1,R1  ; R1 == 0000, R2 == 83E0, R3 == 0000
                          ; This multiply does not make sense.
                          ; Seems to be using the wrong registers.
    61FE C042 mov R2,R1   ; R1 == 83E0, R2 == 83E0, R3 == 0000
    6200 A043 a R3,R1     ; R1 == 83E0, R2 == 83E0, R3 == 0000
    6202 045B b *R11      ; return to caller, INCORRECT result in R1

In case it's relevant, these are the GCC / Linker flags I am using:

    C_FLAGS=\
      -O2 -std=c99 -s --save-temp -fno-builtin -Wall -Wstack-protector
    LDFLAGS=\
      --section-start .text=6000 --section-start .data=2000

To me, it looks like a bug in the compiler output; however, it is possible it's some kind of user error. Hopefully Insomniac will see this at some point. I think my workaround for now is to hand code some assembly in place of the buggy function.
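One way to rule out user error in a report like this is to compile both versions with a known-good host compiler and check that they agree. The following is a minimal sketch along those lines, assuming an ordinary desktop C compiler; `vdp_addr_versions_agree` is a hypothetical helper that does not appear in the original post, and the 32x24 screen size is taken from the values the post assigns to `rect`.

```c
#include <stdio.h>

typedef struct {
    int width;
    int height;
    int foo;
    int bar;
} test_rect;

/* Version that takes the width directly (reported as working). */
unsigned int get_vdp_addr_inline(int width, int row, int col)
{
    return row * width + col;
}

/* Version that reads the width through a struct pointer (reported as miscompiled). */
unsigned int get_vdp_addr_struct(const test_rect *rect, int row, int col)
{
    return row * rect->width + col;
}

/* Hypothetical helper: returns 1 if both versions agree for every cell
   of a 32x24 screen, 0 (after printing the first mismatch) otherwise. */
int vdp_addr_versions_agree(void)
{
    test_rect rect = { 32, 24, 0, 0 };
    int row, col;

    for (row = 0; row < rect.height; row++) {
        for (col = 0; col < rect.width; col++) {
            unsigned int a = get_vdp_addr_inline(rect.width, row, col);
            unsigned int b = get_vdp_addr_struct(&rect, row, col);
            if (a != b) {
                printf("mismatch at row %d, col %d: %u vs %u\n", row, col, a, b);
                return 0;
            }
        }
    }
    return 1;
}
```

If the two versions agree on the host but disagree when built with the tms9900 toolchain, that points at the code generator rather than at the C source.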
  3. After seeing Rasmus's great work, I decided I wanted to start working on my own smooth scrolling games for our beloved TI. Initially, I was set on using the F18A for the scrolling functionality, but alas, since I don't have a hardware setup right now and no current emulator supports its scrolling registers, I'm stuck with the good ol' TMS9918A (for now).

Since I'm lazy by nature and I didn't feel like programming a whole game in assembly, I couldn't use Rasmus's excellent scrolling example code and had to re-implement it in C. I also didn't want to spend too much time transforming the assembly output from Magellan into something directly usable from C. I looked at adding a C exporter to Magellan, but the export .java source file alone was so daunting that I decided to write my own tool to generate the scrolling patterns.

Since I prefer to work in GIMP to create the level, I wrote a simple command line program that takes a 16-color bitmap file that represents the scrolling map and generates a C header file with the pattern, color and nametable data (graphics mode only, for now). Maybe I'll look at turning this into a GIMP export filter at some point.

For those that are interested, the simplest horizontal scrolling C application that uses the exported header file is only 100 lines of code and looks something like this:

    // Includes
    #include "libti99/vdp.h"   // Tursi's libti99, VDP functions
    #include "tistdio.h"       // Quick set of functions for keyboard scanning
    #include "level.h"         // Generated header file containing map data

    #define SIT1 0x01
    #define SIT2 0x03
    #define CT   0xFF

    // Copy 8 pattern tables into VDP RAM
    void init_patterntables()
    {
      int frame = 0;

      // Write 8 pattern tables to VDP memory
      vdpmemcpy(0x800 * frame, patt_frame0, 768); frame++;
      vdpmemcpy(0x800 * frame, patt_frame1, 768); frame++;
      vdpmemcpy(0x800 * frame, patt_frame2, 768); frame++;
      vdpmemcpy(0x800 * frame, patt_frame3, 768); frame++;
      vdpmemcpy(0x800 * frame, patt_frame4, 768); frame++;
      vdpmemcpy(0x800 * frame, patt_frame5, 768); frame++;
      vdpmemcpy(0x800 * frame, patt_frame6, 768); frame++;
      vdpmemcpy(0x800 * frame, patt_frame7, 768);
    }

    // Copy colortable to VDP RAM
    void init_colortable()
    {
      // Init the first two black, the third one gray
      vdpmemcpy((CT * 0x40), colortable, 13);
    }

    // Copy the first screen of the map to the front buffer, one cell at a time
    void init_nametable()
    {
      int x, y;

      for (x = 0; x < 32; x++)
        for (y = 0; y < 24; y++)
          vdpmemcpy((SIT1 * 0x400) + (x + (y * 32)), &(map[y][x]), 1);
    }

    // Copy 3 rows of the next full frame to the back buffer
    void copy_pattern_block(int col, int frame, int backbuffer_sit)
    {
      int row = frame * 3;

      col++;
      vdpmemcpy((backbuffer_sit * 0x400) + (row * 32), &(map[row][col]), 32); row++;
      vdpmemcpy((backbuffer_sit * 0x400) + (row * 32), &(map[row][col]), 32); row++;
      vdpmemcpy((backbuffer_sit * 0x400) + (row * 32), &(map[row][col]), 32);
    }

    int main(int argc, char *argv[])
    {
      int x, prev_x;
      int frame, backbuffer_sit;

      // Init graphics system
      x = set_graphics(1);
      VDP_SET_REGISTER(VDP_REG_MODE1, x);
      VDP_SET_REGISTER(VDP_REG_PDT, 0);
      VDP_SET_REGISTER(VDP_REG_CT, CT);
      VDP_SET_REGISTER(VDP_REG_SIT, SIT1);
      VDP_SET_REGISTER(VDP_REG_COL, 0xF1);

      init_patterntables();
      init_colortable();
      init_nametable();

      prev_x = x = 0;
      backbuffer_sit = SIT2;

      while(1)
      {
        // Scan keys and do movement
        // scan_keys();

        // UP/'E' pressed, move forward
        if (check_key(2,0x4000))
          x++;

        frame = (x) % 8;

        if (x != prev_x)
        {
          // Move backbuffer to front and vice-versa
          if (frame == 0)
          {
            VDP_SET_REGISTER(VDP_REG_SIT, backbuffer_sit);
            backbuffer_sit = (backbuffer_sit == SIT1) ? SIT2 : SIT1;
          }

          // Advance frame 1 pixel (aka move pattern descriptor table pointer one position up)
          VDP_SET_REGISTER(VDP_REG_PDT, frame);

          // Write 3 rows of the next full frame to the backbuffer
          // Depending on frame this is either 0-2 (frame 0), 3-5 (frame 1), 6-8 (frame 2), ...
          copy_pattern_block(x >> 3, frame, backbuffer_sit);

          if (x > 1016)
            x = 0;

          prev_x = x;
        }
      }

      return 0;
    }

Project files, FIAD file and disk image are attached for those who want to see it in action (EA#5, ALEXKIDD). The generated code is in level.h and is untouched; what you see is what the tool generated. I tried making a video, but the result looked anything but smooth. If you want to run it in an emulator, I suggest MESS, as Classic99's timing is a bit off and makes it look a bit jittery. As you'll see, it runs quite fast as-is, but I'm sure there's room for improvement as this is completely unoptimized code.

I'll make the tool available soon as well, but I need to clean it up a bit before it's fit for public consumption. I also need to add a binary file export function, because the C header files are actually way too memory hungry for any practical use. In the future I hope to also add up-down and bitmap mode export functionality. Currently it just does what I needed it to do, and that's it. csmoothscrolling.zip
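The per-step bookkeeping in the main loop above (which pattern table to show, which three rows to copy, when to swap buffers) can be pulled out into a pure function that runs on any host compiler. This is only a sketch of the arithmetic already present in the post; the `scroll_step` type and both function names are hypothetical and not part of the original code or of libti99.

```c
#include <stdio.h>

/* Hypothetical helper type: what the main loop has to do for scroll position x. */
typedef struct {
    int frame;        /* pattern table to display (x % 8) */
    int map_col;      /* first map column copied into the back buffer */
    int first_row;    /* first of the 3 rows copied on this step */
    int swap_buffers; /* non-zero when front and back buffer trade places */
} scroll_step;

scroll_step compute_scroll_step(int x)
{
    scroll_step s;

    s.frame = x % 8;
    s.map_col = (x >> 3) + 1;        /* copy_pattern_block() increments col */
    s.first_row = s.frame * 3;       /* rows 0-2, 3-5, ..., 21-23 */
    s.swap_buffers = (s.frame == 0); /* a full character has been scrolled */
    return s;
}

/* Prints the plan for one scroll position, for eyeballing the sequence. */
void print_scroll_step(int x)
{
    scroll_step s = compute_scroll_step(x);

    printf("x=%d frame=%d col=%d rows=%d-%d swap=%d\n",
           x, s.frame, s.map_col, s.first_row, s.first_row + 2, s.swap_buffers);
}
```

Spreading the 24 rows over 8 frames, 3 rows at a time, is what keeps each step cheap: by the time `frame` wraps to 0, the back buffer holds a complete screen shifted by one character.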
  4. Hello, For people using the Windows environment, I've found a Perl script that fixes an issue when working with GCC and Visual Studio (2013 Express in my case). The problem comes from the error reporting formats of GCC and VS: they are not compatible, so we can see the error messages in the VS Output window but we cannot interact with them. You have to look up the line(s) and file(s) manually to see the errors. This script takes the output produced by a batch file and converts the error messages. The following line has to be added in the NMake section of your project configuration properties, for the Build, Rebuild All, and Clean command lines:

    build.bat 2>&1 | c:\Perl\bin\perl.exe gccfilt.pl

The Zip file contains the script and a screen capture from VS. I'm not the author of the script; the original link is here: http://www.codeproject.com/Articles/370890/GCCFilter-A-script-for-compiling-with-GCC-in-Visua Thanks, gccfilt.zip
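To illustrate the kind of conversion such a filter performs, here is a rough sketch in C. It is not the gccfilt.pl script and makes no claim about how that script works; it assumes GCC diagnostics of the form `file:line:col: text` (or `file:line: text`) and rewrites them as `file(line): text`, which is the shape Visual Studio's Output window can jump to. The function name `gcc_to_vs_line` is hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Rewrites one GCC diagnostic line into "file(line): text".
   Returns 1 if the line was converted, 0 if it was copied unchanged. */
int gcc_to_vs_line(const char *in, char *out, size_t out_size)
{
    const char *p = in;
    const char *colon;
    const char *rest;
    char *end;
    long line;

    /* Skip a Windows drive prefix such as "c:" so its colon is not
       mistaken for the file/line separator. */
    if (isalpha((unsigned char)p[0]) && p[1] == ':')
        p += 2;

    colon = strchr(p, ':');
    if (colon == NULL || !isdigit((unsigned char)colon[1])) {
        snprintf(out, out_size, "%s", in);
        return 0;
    }

    line = strtol(colon + 1, &end, 10);
    if (*end != ':') {
        snprintf(out, out_size, "%s", in);
        return 0;
    }

    /* Drop the optional column number that follows the line number. */
    rest = end + 1;
    if (isdigit((unsigned char)*rest)) {
        char *end2;
        strtol(rest, &end2, 10);
        if (*end2 == ':')
            rest = end2 + 1;
    }

    snprintf(out, out_size, "%.*s(%ld):%s", (int)(colon - in), in, line, rest);
    return 1;
}
```

A real filter would read the build output line by line from standard input and print each converted line; lines that are not diagnostics pass through untouched.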
  5. I've scoured the web and the AA archives and I can't seem to find any schematics for the 7800 Maria. I'm referring to logic diagrams for the chip itself, not the 7800 motherboard. These do seem to exist for TIA but I can't find any for the Maria. Does anyone know if they even exist? Surely GCC must have had to conceptualize the internal logic of the chip at some point during development, right? The reason I ask is because I'm working on a hardware project for the 7800 and I've hit a point where I fear the only way to proceed forward will likely be to attempt to replace the Maria chip with an equivalent. This has already been done with TIA but I have yet to see this done with Maria. If anyone has any information at all, it would be greatly appreciated.
  6. Hello, Since PLATOTERM is licensed under the GNU General Public License, I have written a piece of accessory software that opens and displays a copy of the GNU General Public License 3.0 from a file. The TIPI version loads this from the web, courtesy of the PI.HTTP level 3 interface, which is very nifty. I do think that the dsr_xxxx functions here should be folded into libti99, but this is something that @tursillion may want to think about... Thanks to jedimatt42 for these functions, which I pulled from TIPICFG's source code.

I have uploaded it to github, if you want to use it: https://github.com/tschak909/platoterm99-gpl

You can run the program if you have a TIPI:

    CALL TIPI("PI.HTTP://TI99.IRATA.ONLINE/COPYING")

And for the purposes of discussion, I am posting the code here, as well:

    #include <files.h>
    #include <string.h>
    #include <system.h>
    #include <conio.h>

    #define VPAB 0x3000
    #define FBUF 0x3200

    int force_quit=0;

    unsigned char dsr_openDV(struct PAB* pab, char* fname, int vdpbuffer, unsigned char flags);
    unsigned char dsr_close(struct PAB* pab);
    unsigned char dsr_read(struct PAB* pab, int recordNumber);

    void initPab(struct PAB* pab)
    {
      pab->OpCode = DSR_OPEN;
      pab->Status = DSR_TYPE_DISPLAY | DSR_TYPE_VARIABLE | DSR_TYPE_SEQUENTIAL | DSR_TYPE_INPUT;
      pab->RecordLength = 80;
      pab->RecordNumber = 0;
      pab->ScreenOffset = 0;
      pab->NameLength = 0;
      pab->CharCount = 0;
    }

    // Configures a PAB for filename and DV80, and opens the file
    unsigned char dsr_openDV(struct PAB* pab, char* fname, int vdpbuffer, unsigned char flags)
    {
      initPab(pab);
      pab->OpCode = DSR_OPEN;
      pab->Status = DSR_TYPE_DISPLAY | DSR_TYPE_VARIABLE | DSR_TYPE_SEQUENTIAL | flags;
      pab->RecordLength = 80;
      pab->pName = fname;
      pab->VDPBuffer = vdpbuffer;
      return dsrlnk(pab, VPAB);
    }

    unsigned char dsr_close(struct PAB* pab)
    {
      pab->OpCode = DSR_CLOSE;
      return dsrlnk(pab, VPAB);
    }

    // the data read is in FBUF, the length read in pab->CharCount
    // typically passing 0 in for record number will let the controller
    // auto-increment it.
    unsigned char dsr_read(struct PAB* pab, int recordNumber)
    {
      pab->OpCode = DSR_READ;
      pab->RecordNumber = recordNumber;
      pab->CharCount = 0;
      unsigned char result = dsrlnk(pab, VPAB);
      vdpmemread(VPAB + 5, (&pab->CharCount), 1);
      return result;
    }

    void main(void)
    {
      struct PAB pab;

      set_text();
      charsetlc();
      clrscr();
      bgcolor(COLOR_CYAN);
      textcolor(COLOR_BLACK);
      gotoxy(0,0);

      unsigned char ferr = dsr_openDV(&pab,"PI.HTTP://TI99.IRATA.ONLINE/COPYING.TXT",FBUF,DSR_TYPE_INPUT);

      if (ferr)
      {
        cprintf("Could not open License from web.");
        for (;;) {}
      }

      int i=0;
      unsigned char ch;

      while (ferr == DSR_ERR_NONE)
      {
        unsigned char cbuf[81];
        ferr = dsr_read(&pab,0);
        if (ferr == DSR_ERR_NONE)
        {
          vdpmemread(FBUF,cbuf,pab.CharCount);
          cbuf[pab.CharCount]=0;
          cprintf("%s",cbuf);
        }
        if (i>4)
        {
          i=0;
          cprintf(" \r\n");
          cprintf(" -- PRESS ANY KEY TO CONTINUE -- ");
          ch=cgetc();
          clrscr();
        }
        else
        {
          i++; // Get next record.
        }
      }

      cprintf(" \r\n");
      cprintf(" END OF LICENSE. PRESS ANY KEY TO QUIT. ");
      ch=cgetc();
      ferr = dsr_close(&pab);
    }

Hope it is useful. -Thom
  7. And this time so does everybody else!
  8. In trying to come up with a strategy to win back memory for Alex Kidd, I was thinking about stuffing some code in a cartridge, so I can win back some of that 32kb expansion memory. Given that I'm currently already at nearly 16k of executable code (including constants), and that I still need to add a good number of features, I need to find a way to create bank switching software with gcc. What follows is a write-up of my ideas, not everything has been tested, and I'm looking for a sanity check: will this work, am I missing something that could simplify things?

1. Multiple pieces of code at the same location

The first thing we need to do when hacking support for banked memory (such as bank switched cartridges) in gcc, is to tell the compiler that specific pieces of code will run from the same physical address space. In the case of a program designed to run from cartridge, this would be 0x6000. By default, gcc will put all executable code into a section called .text, and you can tell the linker to position this code at any location in memory by using command line options (--section-start .text=0x6000), or by creating a bespoke linker script and adding a properly configured SECTIONS section:

    SECTIONS
    {
      . = 0x6000;
      .text : { *( .text ) }
      . = 0xa000;
      .data : { *( .data ) }
      .bss : { *( .bss ) }
    }

(Note: the above example requires a system with 32k memory expansion installed, since it puts all variables in expanded memory. It also requires a crt0 implementation that copies the initialization values for variables in the .data segment from somewhere in ROM or from disk to 0xa000)

Since all code is in the .text segment by default, the linker will just start filling up memory with code from 0x6000 onwards, blasting past 0x7fff if the code segment happens to be larger than 8k and in the process creating a useless image for our purposes. At the very least, we can define our memory layout in the linker script to get a warning when one of our blocks exceeds the maximum size.
We can do this by adding a MEMORY section to the linker script (there's no command line equivalent of this), and changing the SECTIONS section accordingly:

    MEMORY
    {
      cart_rom (rx) : origin=0x6000, length=0x2000;   /* cartridge ROM, read-only */
      lower_exp (wx) : origin=0x2080, length=0x1F80;  /* 8k - 128 bytes */
      higher_exp (wx) : origin=0xa000, length=0x6000;
      scratchpad (wx) : origin=0x8320, length=0x00e0; /* 32b is for workspace */
    }

    SECTIONS
    {
      . = >cart_rom;
      .text : { *( .text ) }
      . = >higher_exp;
      .data : { *( .data ) }
      .bss : { *( .bss ) }
    }

Now, whenever the .text section exceeds 8k, the linker will throw an error and abort. At least we'll know our program is too big to fit in the 8k, but it would be even better if we could stuff more code in other parts of memory. Unfortunately, ld will not do this for us, and we'll need to explicitly assign code to different sections in our source files by adding attributes to the function definitions.

Supposing we already have filled our 8k of cartridge ROM, we could for instance decide to put additional functions in the lower 8k of the 32k memory expansion. First we add the section attribute to each function we want to put in the lower memory expansion area:

    void somefunction(int somearg) __attribute__ ((section (".moretext")));
    void somefunction(int somearg)
    {
      // some code
    }

We now have code that will get put in the .moretext section, so we need to tell the linker where to put this code (assuming the same MEMORY section as in the example above):

    SECTIONS
    {
      . = >cart_rom;
      .text : { *( .text ) }
      . = >lower_exp;
      .moretext : { *( .moretext ) }
      . = >higher_exp;
      .data : { *( .data ) }
      .bss : { *( .bss ) }
    }

(Note: again we need to remember that the cart will need to load the contents of section .moretext from somewhere in ROM or from disk and copy it to the lower memory expansion at 0x2080)

In theory, we could automate the annotation of functions by doing two compilation passes: one with all code in the standard .text segment to discover the size of each compiled symbol, and one that uses that info to assign individual functions to the two available sections. In practice, I imagine this is doable enough by hand for most programs. Also, on our platform gcc doesn't seem to support calculating the size of individual compiled symbols, so by hand it is.

So now we are able to put code into two different physical locations in the TI's memory, but that still doesn't allow for bank switching. As we said at the very beginning, for that we need to tell the linker that two or more sections of code need to target the same memory area. Turns out that we can do this with the OVERLAY command:

    SECTIONS
    {
      OVERLAY >cart_rom : AT 0x0000
      {
        .text { *( .text ) }
        .moretext { *( .moretext ) AT ALIGN(0x2000)}
      }
      OVERLAY >higher_exp : AT ALIGN(0x2000)
      {
        .data : { *( .data ) }
      }
      .bss : { *( .bss ) }
    }

Running the linker with a script with the above SECTIONS section will give us a binary that contains three 8k banks: .text, .moretext and .data (we ignore .bss, because those are just zero-initialized variables and are taken care of by our crt0 implementation). The code in the first two banks will expect to run at 0x6000, and expects to find the initialized data from the .data section at 0xa000. Given all this, we should be able to generate binaries in the right format to support bank switching.

2. Actually switching banks in code

That was the easy part; after all, it didn't require any coding.
However, the trickiest part of bank switching is writing code that can cope with switching from one bank to another (and have that new code return). There are a couple of ways to do this (some more cumbersome than others), but they all share a common requirement: you need to keep a "bank switching stack" (for lack of a better term). That is to say, when code in bank 1 calls a function in bank 2, we need to save the return bank "location" (i.e. what enables "bank 1") somewhere. If that function in bank 2 then in turn calls a function in bank 3, we need to do the same thing without overwriting the first return bank location. This is a recursive problem, so we need a stack.

The ideal location for the bank switching stack seems to be in scratchpad, since it will be relatively small and that part of memory is always available. By putting the pointers to this stack in a separate section, we can use the linker script to put it there (or wherever else is convenient).

The management of the stack needs to be done right before calling a function in another bank, and right before returning to the calling bank at the end of a function. On a select number of platforms, GCC supports so-called 'far' and 'near' pointers and/or function attributes, which could be used to implement two different function prologues/epilogues depending on the type of function call that needs to be done. Unfortunately, the tms9900 platform implementation does not support these attributes.

GCC also has support for instrumenting each function call and return via the -finstrument-functions command line option.
You need to implement your prologue and epilogue code in the following two functions somewhere in your code:

    void __cyg_profile_func_enter (void *, void *) __attribute__((no_instrument_function));
    void __cyg_profile_func_exit (void *, void *) __attribute__((no_instrument_function));

However, the call to and return from __cyg_profile_func_enter happens /before/ the call to the actual function, so it would take some serious wrestling with the C call stack to transparently implement bank switching in these functions.

Our last option is to instrument individual functions and function calls. This is certainly the most cumbersome implementation of all, but it is the only one which does not need embedded support in the compiler implementation itself. Instrumentation of the function call is relatively easy, keeping in mind that all manipulation of the bank switching stack needs to be done from within the calling bank and the absolute last command needs to be the one that triggers the switch to the next bank. The following process could be a usable implementation:

The caller (code runs in bank 1):
  • Writes the address and bank location of the intended callee in two registers (e.g. r0 and r1)
  • Invokes the trampoline

The trampoline (code runs in scratchpad/expmem):
  • Saves the current bank on the bank switching stack
  • Loads the new bank
  • Makes the call using the info in (e.g.) r0 and r1

The callee (code runs in bank 2):
  • Does stuff
  • Returns to the trampoline

The trampoline (code runs in scratchpad/expmem):
  • Loads the original bank (which is popped from the bank switching stack)
  • Returns to the caller

Or, in other words, every function call should be structured as follows: caller calls trampoline(), trampoline calls callee, callee returns to trampoline, trampoline returns to caller. Using this type of construct, the trampoline function needs to transparently pass on all arguments to the callee.
The easiest way to accomplish this is to have a bespoke trampoline function for each "far" function we're looking to call (with a "far" function being any function that runs from a bank switchable piece of memory). Something like the following example:

    // Our "far" function, in bank 2
    int far_somefunction(int someint) __attribute__ ((section (".bank2")));
    int far_somefunction(int someint)
    {
      // do something
      return somevalue;
    }

    // Our trampoline function, in non bankable memory (e.g. scratchpad)
    int somefunction(int someint) __attribute__ ((section (".nonbankable")));
    int somefunction(int someint)
    {
      int retval;

      // Set to bank 2, and push caller's bank on the stack
      push_bank(2);
      // Call far function
      retval = far_somefunction(someint);
      // Set caller's bank
      pop_bank();
      return retval;
    }

Using this, we can safely call somefunction() (our trampoline function for far_somefunction()) from anywhere in our code, no matter which bank we're currently in and no matter where the calling code resides in memory. Furthermore, we can also still call far_somefunction() directly from within the same bank if we want to avoid the overhead of the bank switching and the trampoline function.

The big downside of course is that we now have one trampoline function for every "far" function we want to call, all with nearly identical function bodies, eating at our available non-bankable memory. Not a big deal if you plan on banking code in big chunks, but problematic if you have lots of little functions that you need to call from everywhere in your program. We could opt to create one generic trampoline function, using variable argument lists and function pointers, if we're really strapped for memory. The downside is that it would create even more overhead for every "far" function call you're looking to make.
Even with bespoke trampoline functions for each far function, it's a good idea to limit the number of bank switching calls you need to do, especially if you're writing an action game that needs to retain a high frame rate, given the fairly high overhead the bank switching introduces. If the compiler had support for naked functions (functions without prologue and epilogue), we could probably reduce the overhead to an absolute minimum, similar to what you'd get with pure assembly code, but unfortunately gcc doesn't support that attribute on our target. I think the above is a sound strategy?
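The post uses push_bank() and pop_bank() without showing them. A minimal sketch of what such a bank switching stack could look like follows, written so it can be tried on a host compiler. Everything here is an assumption rather than the author's implementation: the stack depth, the `select_bank` placeholder (on real hardware it would perform whatever write the cartridge's bank-select scheme requires), the `get_current_bank` accessor, and the `far_double`/`call_far_double` pair that stands in for a far function and its trampoline.

```c
#include <stdio.h>

#define BANK_STACK_DEPTH 8   /* assumed depth; size it to the deepest far-call chain */

static unsigned char bank_stack[BANK_STACK_DEPTH];
static int bank_sp = 0;       /* index of the next free stack slot */
static int current_bank = 0;  /* bank assumed to be active at startup */

/* Placeholder for the hardware-specific bank select. Here it only
   records the bank so the logic can be exercised off-target. */
static void select_bank(int bank)
{
    current_bank = bank;
}

int get_current_bank(void)
{
    return current_bank;
}

/* Saves the caller's bank, then switches to new_bank.
   Returns 0 on success, -1 if the stack is full (no switch happens). */
int push_bank(int new_bank)
{
    if (bank_sp >= BANK_STACK_DEPTH)
        return -1;
    bank_stack[bank_sp++] = (unsigned char)current_bank;
    select_bank(new_bank);
    return 0;
}

/* Restores the most recently saved bank; does nothing if the stack is empty. */
void pop_bank(void)
{
    if (bank_sp > 0)
        select_bank(bank_stack[--bank_sp]);
}

/* Stand-in for a function that would live in bank 2. */
static int far_double(int value)
{
    return value * 2;
}

/* Trampoline in the style described in the post. */
int call_far_double(int value)
{
    int retval;

    if (push_bank(2) != 0)
        return 0;  /* stack full: a real program needs a proper error path */
    retval = far_double(value);
    pop_bank();
    return retval;
}
```

Because the switch is the last thing push_bank() does and the restore is the only thing pop_bank() does, both have to live in memory that is never banked out, which matches the post's point about keeping trampolines in scratchpad or expansion memory.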