
insomnia

Members
  • Posts

    91
  • Joined

  • Last visited

  • Days Won

    1

Everything posted by insomnia

  1. I don't know if this is helpful, but I've run into a problem like this before where the root cause was a non-word-aligned .data section. The memory initialization code in crt0 copies words, so a misalignment like this results in a mess of strangely-behaving code. I've tried to address this in my newer projects by making sure the linker does the alignment for me. This is just a guess, I haven't had a chance to look at any code lately. I really should either update the hello example or post an improved crt0 sometime soon.
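To illustrate the failure mode described in this post, here is a small host-side model in C. It is only a sketch: `mem`, `read_word`, `write_word` and `copy_words` are hypothetical names, not the actual crt0 code, and it assumes (as on the TMS9900) that word accesses ignore the low address bit.

```c
#include <stdint.h>

/* Hypothetical byte-addressed memory for the model. */
static uint8_t mem[32];

/* Word accesses ignore the low address bit (assumption stated above). */
static uint16_t read_word(unsigned addr)
{
    addr &= ~1u;
    return (uint16_t)((mem[addr] << 8) | mem[addr + 1]);
}

static void write_word(unsigned addr, uint16_t value)
{
    addr &= ~1u;
    mem[addr] = (uint8_t)(value >> 8);
    mem[addr + 1] = (uint8_t)(value & 0xFF);
}

/* Word-at-a-time copy in the style of a crt0 .data initializer. */
void copy_words(unsigned dst, unsigned src, unsigned len)
{
    for (unsigned i = 0; i < len; i += 2)
        write_word(dst + i, read_word(src + i));
}

/* Test helpers so the model can be driven from outside. */
void poke(unsigned addr, uint8_t value) { mem[addr] = value; }
uint8_t peek(unsigned addr) { return mem[addr]; }
```

If the source starts on an odd address, every word read silently starts one byte early, so the destination receives data shifted by a byte. Having the linker align .data on a word boundary avoids this.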
  2. Hey everyone, I've got some more patches. At this point, the compiler is getting pretty mature, and finding a bug or a missing feature has become a rare event. I wanted to get these changes out since they work, and I'm not sure what else to work on. Anyway, here's what's new in this patch:

     Changes to GCC, patch 1.14:
       • Tail call optimization
       • Confirmed support for the C++ language

     Tail call optimization can remove the need for a stack in some cases, which is helpful for recursion or for any function whose last operation is a call. This is a common optimization strategy, and a quick Google search can be more informative than what I could write here.

     The other feature that has now been tested is support for the C++ language. This can sometimes be more useful than C for certain designs. This also implies that the other languages GCC supports (Java, Fortran and Ada) should work too. Unfortunately, I don't know Java or Ada well enough to test properly, and I doubt that many people are waiting desperately to use Fortran. I've included a "hello world" example in C++ in the first post of this thread, as well as the patch and a new version of the GCC installer that uses the latest patches.

     I've looked into far pointers, and unfortunately, I can't make them work in the compiler. GCC only uses one size for pointers. If we used 32-bit far pointers, we would have to use 32 bits for ALL pointers, resulting in much less efficient code in the general case. We would lose the use of at least one register. The stack could be up to twice as large. Code size would be larger, and memory requirements would increase. In short, there's no real advantage. If far pointers are really needed, they will need to be emulated by function calls. I'm not happy about this, but that's where we stand.

     I've also noticed that when using -O2 optimization, GCC likes to inline function calls as much as possible. If a function is not marked as static, the output binary ends up with the inlined copies plus a standalone copy that nothing calls. GCC keeps the standalone copy on the assumption that a non-inlined version will be needed by other code. So use static functions whenever possible. If anyone has any problems, or finds a bug or improvement, please let me know.
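As a rough illustration of both points (tail calls and static helpers), here is a minimal sketch. The names `gcd_step` and `gcd` are made up for this example, and whether the call actually becomes a jump depends on the optimization level and the target.

```c
/* Hypothetical helper. Marking it static tells GCC that no other file
 * needs it, so no unused standalone copy has to be kept if it is inlined. */
static unsigned gcd_step(unsigned a, unsigned b)
{
    if (b == 0)
        return a;
    /* The recursive call is the last operation in the function, so an
     * optimizing build may replace it with a jump and reuse the frame. */
    return gcd_step(b, a % b);
}

/* Public entry point; only this symbol needs to be visible to other code. */
unsigned gcd(unsigned a, unsigned b)
{
    return gcd_step(a, b);
}
```

Calling `gcd(48, 18)` walks through (48, 18), (18, 12), (12, 6), (6, 0) and returns 6, without needing a deeper stack when the tail call is optimized.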
  3. Once again, it's patch time. They have been added to the first post of this thread. Life has been particularly distracting lately, so I haven't been able to spend as much time doing TI stuff as I would like. I have finished a floating point library for IEEE single-precision floats, and have been slowly building up the parts of libc which are OS-independent. I would still like to work with a later GCC version, but just haven't been able to get to it. The long-term plan is to write a new OS for the TI hardware, but that seems to be far in the future.

     Anyway, here are the fixes I made while working on the other stuff:
       • Added compilation pass to better use post-increment addressing
       • Ensured word alignment for symbols
       • Removed optimization of tests against zero, they emitted unnecessary opcodes
       • Fixed 32-bit shift instructions
       • Fixed shift instructions to handle shift by zero bits
       • Fixed the AND instruction to use ANDI when appropriate
       • Added optimizations for shift of 32-bit value by 16 bits
       • Fixed multiply to prevent using MPY with an immediate operand

     The test-against-zero code I removed was never called in a helpful manner. The idea was to take advantage of notes on register deaths to do some slightly faster, but destructive, testing. Unfortunately, those notes are always removed before the optimization could be used. The end result was to always insert a MOV to do testing, even if an earlier instruction already set the condition flags properly. We're better off without that junk.

     The new compilation pass searches for places where pointer increments can be merged with an earlier use, resulting in more post-increment addressing. This can save a few instructions in certain use cases and is a fairly neat bit of code. The multiply fix corrects a problem originally seen by TheMole this spring (Yikes! So long ago!), and was pretty straightforward. The other fixes are pretty self-explanatory, and don't need much discussion.

     One thing that didn't get added to this patch was a failed attempt to reorder the register usage to eliminate MOV instructions. If it worked, it could reduce most code by 7%, but in the end it was too difficult to avoid breaking 32-bit values or scrambling function arguments. Maybe I will get back to this at some point and try again.

     Several questions have been asked about GCC natively handling SAMS or paged memory, or far pointers in general. This is not really practical. There are too many different ways that vendors have extended the memory space to be able to support them all in the compiler. The right way to handle this is to make a library of functions to handle these extensions. I'm looking into finding some way to make this easier, but haven't put much time into that effort. As always, if anyone finds any issues or sees a way to improve the generated code, please let me know.
  4. Hey everyone, life has been eating up all my time, so I haven't done much TI stuff lately. At any rate, this is a bug in the compiler. What this error is saying is that GCC tried to output a conditional statement using an instruction like "jeq $+X". The assembler detected that X was too large to fit in the instruction and reported an error. In all likelihood, the problem is only loosely related to the animation array, which may explain why you are still seeing the problem after moving it around. This kind of problem is caused by GCC incorrectly calculating the length of a code branch. I thought I fixed all these types of errors, but apparently there is more work to be done. If you could post your code, or some smaller piece which reproduces the error, I should be able to fix this relatively quickly. Mole, your bug is way overdue for review, and I will look at that one too. If anyone else has seen compiler bugs or other annoyances, let me know.
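For anyone wondering what "too large to fit" means here, a rough sketch of the range check follows. It assumes the usual description of TMS9900 jump instructions: a signed 8-bit displacement counted in words from the address following the jump. The function name `jump_in_range` is made up for this example and is not taken from the assembler or the compiler.

```c
#include <stdbool.h>

/* Hypothetical check: can a short jump at from_addr reach to_addr?
 * Assumes both addresses are even and that the displacement is a signed
 * 8-bit word count relative to the instruction after the jump. */
bool jump_in_range(long from_addr, long to_addr)
{
    long disp_words = (to_addr - (from_addr + 2)) / 2;
    return disp_words >= -128 && disp_words <= 127;
}
```

When the compiler underestimates the length of the code between a jump and its target, it picks the short form even though a check like this would fail, and the assembler is the first tool to notice.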
  5. After much procrastination, I've got patches. This is a pretty slim patch, with only a few changes. I suppose this is a good thing, since it implies the bug count is really low (hopefully zero, but how often does that happen?). Part of the reason for the delay was that I've been trying to move the TI changes into GCC version 5.2. Version 4.4.0 was the most recent when I started, but there have been many improvements since then. So far I haven't been able to get a good build, but I'll keep trying.

     Anyway, these are the changes in this version:
       • Fixed bug when dividing by constant value
       • Improved type testing for instruction arguments
       • Added text to "--version" flag output to show patch version
       • Added changes to improve builds in an OSX environment

     Yep, not a very impressive list, but each bug fixed is progress. Here's the output showing the patch version:

         eric@lenovo:~/dev/tios/src/temp/$ tms9900-gcc --version
         tms9900-gcc (GCC) 4.4.0 20090421 (TMS9900 patch 1.12)
         Copyright (C) 2009 Free Software Foundation, Inc.
         This is free software; see the source for copying conditions. There is NO
         warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

     One of the problems that bugged me was that the installer script used to be hard coded to use a specific patch set. That's been changed to use the most recent patch in the current directory when the installer script is run. This should be handy for more easily using future patches. I've also added a check to use either wget or curl, whichever is found first, to get the source packages.

     The first post in this thread has been edited to include the new GCC patch. I made an archive for use under Cygwin in a Windows environment. It's too big for an attachment, so here's a link: tms9900_cygwin.tgz As always, if anyone finds a bug or an improvement, let me know.
  6. I found and fixed the compiler crash for wolfie3. It turned out to be a mistake in one of the four descriptions for division. That description was allowing the use of a constant as the numerator, rather than requiring that the value be stored in a register. At a later time, code which assumes the use of a register chokes when it finds a constant instead, causing a crash. The error only showed up in this code because it contains lots of calculations, increasing the pressure on the compiler to make maximum use of the registers. While doing instruction selection, it saw that this division form did not require a register and acted accordingly. The normal behavior is to put all constant values in registers before use. For less demanding code, that register usage would be left in place, and no error would be seen. Once I tracked it down, this was actually a pretty easy fix. Before I put together any new patches, I'd like to figure out why libgcc is failing to build on Mac. Mole, could you attach the build errors you are seeing? That would be really helpful. Also, if anyone else has feature suggestions or bug reports, I'd love to hear them so I can beef up the patch a little bit.
  7. TheMole, some of the problems you are seeing are due to a missing libgcc (__divsi3, test.c output). Since you are having problems building libgcc, that makes sense. The "internal compiler error" problem is kind of worrying. That looks like the kind of type conversion problem I thought was fixed. I did some research, and the toplev.h problem seems to be very common when compiling GCC with clang. I'll include a fix for that in the next patch. There's no reason we should need patches for my patches. Since I don't have a Mac to test on, fixing the other errors might be tricky. Unfortunately, I'll need some help from you to do this. It would be helpful if you could edit wolfie3.c to find the smallest subset of code that causes compilation errors. Based on the filename, I'm guessing there are a lot of dense calculations there which probably stress the compiler beyond what I've seen so far. If you could make a list of the errors seen when building libgcc, that would also be helpful. The libgcc errors are the most surprising to me. Libgcc is built by the copy of GCC that has just been built. I've built GCC from scratch using the installer script several times, and I haven't seen any errors when using my Linux box. I'm not sure how using a different OS would make a difference here.
  8. This is a really good point I hadn't thought about before. In a future patch I'll need to have a better way to do this. For now, the easiest way to check if patch 1.11 is applied is to compile a file using the -da option. This will create a bunch of debug files describing each step of the compilation process. The content of these files is not important for this test.

         eric@compaq:~/dev/tios/src/temp$ cat dummy.c
         int test() {return 1;}
         eric@compaq:~/dev/tios/src/temp$ tms9900-gcc -da dummy.c
         /home/eric/dev/tios/toolchain/lib/gcc/tms9900/4.4.0/../../../../tms9900/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000a074
         eric@compaq:~/dev/tios/src/temp$ ls *tms9900_subreg
         dummy.c.171r.tms9900_subreg

     If patch 1.11 has been applied, there will be a file ending with tms9900_subreg; otherwise you are likely using patch 1.10. From the problems you are seeing, I'm pretty sure you are using an older patch. As you said, the newer patch should fix everything you are seeing now. Would you be okay with uploading your modified install script somewhere? I don't have access to a Mac, so I would like to incorporate your changes into a future version if you are OK with that. You've given me a lot of ideas on how things can be improved here.

     If I'm not mistaken, MinGW provides a unix-like environment within Windows, and is similar to Cygwin. If that's the case, you can install the TMS9900 version of GCC alongside the PC version provided by MinGW. The new compiler will be named tms9900-gcc and shouldn't conflict with anything already installed. You could try to run the gcc_installer script from within the MinGW environment; I think that should work. If not, Tursi made a video, linked earlier in this discussion, which documents the process using an older patch. This is a question that has come up a lot, and I really should figure out a better answer for you.

     By the way, thank you to everyone who has been using GCC and providing feedback. There is no way the compiler would be as functional as it is now without all your help.
  9. As promised, here's the new patch. The files are attached to post #1 in this thread, and the gcc_installer package there has been updated with the latest patches. I have to admit, this one was harder than I expected. The biggest problem was finding a way to handle the ever-annoying word to byte conversions. From the start, this has been the sticking point for this port. Every attempt at an elegant fix up to now has failed in some edge case. I think the method used here should be simple enough to work without any mysterious compiler bugs in the future.

     The compiler consists of about 200 different passes, each one doing some distinct operation (parse C code, remove dead code, register allocation, etc.). In this patch I've added a new pass to specifically handle these type conversions. The basic idea is to find every place where a data type conversion is done inside an instruction, and put that conversion into its own instruction. Temporary registers are used for the converted values, and the following passes will emit correct and optimized code. Here's an example to try to show what I'm talking about:

         char example(int A, char B) {return (char)A + B;}

     By default, GCC assumes that for both the byte and word representations, the least significant bits are stored in the least significant byte of the register or memory location holding the value. However, this is not true for values stored in registers. In this case, we need to convert the A value to byte representation. Doing this within the GCC framework turned out to be tricky and error-prone. The new subreg pass converts this code to a more compiler-friendly form:

         char modified(int A, char B) {char C = (char)A; return C + B;}

     In this new form, it's easy to see what the output code should be:

         mov  r1, r3   * Copy A to C
         swpb r3       * Convert to byte representation
         ab   r2, r3   * C += B (C holds (char)A)
         mov  r3, r1   * Copy to return position

     After some optimization passes:

         swpb r1       * Convert A to byte representation
         ab   r2, r1   * Return (char)A + B

     This method turns out to be a lot easier to implement than I thought. With any luck I'll never need to mess with type conversions again. (Fingers crossed.) The other changes in this patch are to fix problems mentioned earlier in the thread, and are pretty straightforward. As always, more details are on my blog, and I'd love to hear of any problems anyone sees.

     Changes in this patch:
       • Fixed compilation error due to missing include in tms9900.h
       • Fixed problem declaring global variables, not always exported
       • Some instruction sizes were defined incorrectly, causing assembly errors
       • Fixed conditional jump displacement limits, they were too small
       • Added compilation pass to add needed SWPB instructions
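To make the byte/word issue a little more concrete, here is a host-side C model of the optimized sequence above. It is a sketch only: `swpb`, `ab` and `example_model` are hypothetical helpers written for this illustration, and they assume that byte instructions operate on the high byte of a 16-bit register.

```c
#include <stdint.h>

/* Model of SWPB: swap the two bytes of a 16-bit register. */
static uint16_t swpb(uint16_t reg)
{
    return (uint16_t)((reg << 8) | (reg >> 8));
}

/* Model of AB src,dst: add the high bytes, leave dst's low byte alone. */
static uint16_t ab(uint16_t src, uint16_t dst)
{
    uint8_t sum = (uint8_t)((src >> 8) + (dst >> 8));
    return (uint16_t)((sum << 8) | (dst & 0xFF));
}

/* Model of "char example(int A, char B)": A arrives as a full word in r1,
 * B arrives in byte representation (value in the high byte of r2). */
uint16_t example_model(uint16_t r1, uint16_t r2)
{
    r1 = swpb(r1);      /* convert A to byte representation */
    return ab(r2, r1);  /* (char)A + B, result in the high byte of r1 */
}
```

With A = 0x1234 and B = 5 (held as 0x0500), the high byte of the result is 0x34 + 5 = 0x39. Without the swap, the add would have used 0x12, the wrong half of A.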
  10. Oops, I forgot about that problem... There's a bug which causes the build of libgcc (which provides support for some 32-bit operations and other rarely-used functions) to fail. The compiler itself builds and works, but it contains a few known bugs which the next patch will fix. If you start reading from post #232 you can see the problems other people have seen with this version. It's nothing huge, but there are some rough edges to look out for. To get past the immediate problem though, change the line in install.sh from "make all-gcc all-target-libgcc" to "make all-gcc". Everything (except libgcc) will then be built and installed as expected.
  11. There's an installer script which handles all this for you: gcc-installer.tar.gz This was added in post #202, but I've shamelessly copied the brief directions below: This uses the latest set of patches (binutils patch 1.7, gcc patch 1.10) and should do what you need. Let me know if you have any problems. By the way, I'm in the process of putting together a new GCC patch, so that should be ready in the next day or two. This will address all the problems seen since December. (Wow, has it really been that long? Sorry about that...)
  12. Nope, you're right, no changes to your code should have been necessary. Behold my shame:

     Input code:

         int no_init[10];
         int init_zero[10] = {0};

     Output assembly:

         cseg
         no_init bss 20
         def init_zero
         def init_zero
         init_zero bss 20

     The output SHOULD be:

         cseg
         def no_init
         no_init bss 20
         def init_zero
         init_zero bss 20

     This is a side-effect of the earlier change made to keep variables from mistakenly being placed in the .bss section. Fortunately, this is an easy fix, and will get rolled into the next patch. In the meantime, what you've done here is the best workaround. Thanks for the report.
  13. I didn't respond earlier since I thought I could knock out a quick fix, but that doesn't seem likely. I was able to replicate your error compiling on a Linux box with GCC, so using clang should be just fine. The error is occurring when trying to compile libgcc. The compiler itself is fine. If you change the line in install.sh from "make all-gcc all-target-libgcc" to "make all-gcc", everything (except libgcc) will be built and installed correctly. This should be good enough to at least use the compiler. Libgcc consists of a lot of rarely used functions, so you should be OK. I'm still in the investigation stage, but I suspect that the changes I made to prevent word to byte conversions are somehow interfering with the code that assigns registers and memory locations. The error occurs when GCC tries to find an instruction which will handle R27, which is used as a placeholder until a real register or memory location can be assigned. For obvious reasons, this will fail. I'm trying to get a fix out ASAP, but it seems like it won't be as easy as I first thought. Sorry.
  14. I just had another thought: you could scrap the second pass altogether, and just not include variable blocks when calculating the RLE. Another bad example:

     Expanded map:       aaaabcccAAcC
     RLE compressed map: 4a,1b,3c,1A,1A,1c,1C

     Change the first block A to block D:

     Expanded map:       aaaabcccDAcC
                                 ^--- changed block
     RLE compressed map: 4a,1b,3c,1D,1A,1c,1C
                                   ^--- changed block

     This has the same effect as what I mentioned earlier, but is a lot simpler, so it should be faster too.
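A minimal C sketch of this single-map idea follows. The names are hypothetical, and it borrows the convention from the example above that uppercase symbols mark variable blocks; a real level format would need its own way to flag them.

```c
#include <ctype.h>
#include <stddef.h>

/* One RLE run: a repeat count and the block symbol it repeats. */
struct run { unsigned count; char symbol; };

/* Encode a map, never merging variable (uppercase) blocks into runs.
 * Returns the number of runs written, at most max_runs. */
size_t rle_encode(const char *map, struct run *runs, size_t max_runs)
{
    size_t n = 0;
    for (size_t i = 0; map[i] != '\0'; i++) {
        int variable = isupper((unsigned char)map[i]);
        if (n > 0 && !variable && runs[n - 1].symbol == map[i]) {
            runs[n - 1].count++;          /* extend the current static run */
        } else {
            if (n == max_runs)
                return n;                 /* out of room, stop early */
            runs[n].count = 1;
            runs[n].symbol = map[i];
            n++;
        }
    }
    return n;
}

/* Expand runs back into a NUL-terminated string, truncating if needed. */
size_t rle_expand(const struct run *runs, size_t n, char *out, size_t out_size)
{
    size_t pos = 0;
    if (out_size == 0)
        return 0;
    for (size_t i = 0; i < n; i++)
        for (unsigned k = 0; k < runs[i].count && pos + 1 < out_size; k++)
            out[pos++] = runs[i].symbol;
    out[pos] = '\0';
    return pos;
}
```

Because every variable block keeps its own run of length one, changing a block is a single store such as `runs[3].symbol = 'D'`, and the rest of the structure is untouched.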
  15. It seems that the problem is that you have destructible blocks, but they are at fixed, known locations. All you need to know is which block to place at that spot. You could RLE compress the static portions, but do not include any block which could be changed. Fill those blocks with sky or something. Keep another RLE map of all the variable sections. Render one pass with the static map, then run another pass to overlay the variable map. The variable map should be very sparse, so it should compress well and render quickly. When a block change is needed, it should be easy to convert screen coordinates back to indices in the RLE map and make the update there. Updates will happen rarely, so this part can be a little slow. Here's a badly described example in 1D. For the RLE representation, I'm using (repeat count)(symbol).

     Expanded static map:          aaaabcccXXcX
     Expanded variable map:        ________AB_C
     RLE compressed static map:    4a,1b,3c,2X,1c,1X
     RLE compressed variable map:  8_,1A,1B,1_,1C
     Expanded composite map:       aaaabcccABcC

     When block A needs to change to D (or blocks for money bags), only that value needs to change. The rest of the RLE structure is unaffected. After changing A to D:

     Expanded static map:          aaaabcccXXcX
     Expanded variable map:        ________DB_C
                                           ^--- changed block
     RLE compressed static map:    4a,1b,3c,2X,1c,1X
     RLE compressed variable map:  8_,1D,1B,1_,1C
                                       ^--- changed block
     Expanded composite map:       aaaabcccDBcC
                                           ^--- changed block

     I'm not familiar with your rendering algorithm, so this may not be feasible, but I think this should help reduce the memory needed for a level.
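The two-pass idea can be sketched in C roughly as follows. This is only an illustration under assumptions: `struct run`, `render` and `render_level` are made-up names, the output is a flat character buffer rather than a real screen, and '_' stands in for "nothing here" in the variable map as in the example.

```c
#include <stddef.h>

/* One RLE run: a repeat count and the block symbol it repeats. */
struct run { unsigned count; char symbol; };

/* Draw one RLE map into out. If transparent is nonzero, runs of that
 * symbol are skipped so whatever is already in out shows through. */
static size_t render(const struct run *runs, size_t n, char transparent,
                     char *out, size_t out_len)
{
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        for (unsigned k = 0; k < runs[i].count && pos < out_len; k++, pos++) {
            if (transparent == 0 || runs[i].symbol != transparent)
                out[pos] = runs[i].symbol;
        }
    }
    return pos;
}

/* Pass 1 draws the static map, pass 2 overlays the variable map.
 * Writes a NUL-terminated string and returns its length. */
size_t render_level(const struct run *static_map, size_t static_runs,
                    const struct run *variable_map, size_t variable_runs,
                    char *out, size_t out_size)
{
    if (out_size == 0)
        return 0;
    size_t len = render(static_map, static_runs, 0, out, out_size - 1);
    render(variable_map, variable_runs, '_', out, len);
    out[len] = '\0';
    return len;
}
```

Feeding it the two compressed maps from the example produces the composite map, and changing one symbol in the variable map changes only that block on the next render.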
  16. OK, who's up for new patches? First, a huge thank you to Lucien2 for tracking down these compiler bugs and making test cases for them. That was super helpful. Here are the latest changes:

     Binutils
     ---------
       • Restored ability to have label and code on same line
       • Minor code cleanup

     GCC
     -------
       • Prevented use of R0 as an address base
       • Moved jump tables into text segment to free up space for variables
       • Fixed bug which put initialized data in bss section
       • Fixed negation of byte quantities
       • Minor code cleanup

     Basically, all the bugs here are the result of me making dumb mistakes. Sorry about that. But the good news is that if these were the most visible bugs still rattling around in the code, the compiler must be getting pretty darn stable. I'm really happy that as more projects are being worked on, the list of problems seems to be drying up. The number of untested edge cases (each potentially hiding a bug or two) must be pretty small at this point.

     I feel kind of bad that I haven't been able to help anyone having problems with Cygwin, or Windows stuff in general. I do all my development in Linux, and don't really have any useful advice for those people. Another big thank you to everyone who has been able to provide that help. At any rate, the patches and gcc installer have been updated in the first post. If anyone is interested, there are more development details on my blog. As always, please let me know if you have any problems or see anything that could be improved.
  17. Well, it's been a really long time, but here's another set of patches. I went through the comments posted since the last release and I think I fixed every issue mentioned here. To everyone who reported their problems, thank you. It's been incredibly helpful. So here's what's new:

     Binutils
     ----------
       • Added support for numeric registers
       • Correct handling of comments
       • Added support for dwarf debugging information

     GCC
     -------
       • Changed order of jumps for less-than-or-equal tests to improve performance
       • Fixed several integer type conversion bugs
       • Corrected handling of variable shift by zero bits
       • Fixed signed division
       • Added support for dwarf debugging information

     For the assembler, I wanted to make sure I was being as compatible as possible with Editor/Assembler. It should be able to assemble anything that E/A can handle and produce identical output. The obvious defect here was with GAS being unable to handle the numeric registers typically seen in TI assembly (EX: mov 1,2). The other problem was in properly handling assembly comments. There were problems with ambiguous parsing when the "*" comment character was used. I was initially using a simple character search for this, but it turns out that a grammar parser was needed to intelligently determine what was code and what was comment. (For example: "ai r1, value * comment") Tricky, but that seems to be working great.

     There was more work needed in the compiler. The biggest problems here were yet more int-char type conversion failures (with any luck, we shouldn't see any more of these) and signed division giving nonsense results. Several comments were made about division, and much frustration was had. Sorry about that.

     The other big thing here is the new support for Dwarf2 debugging. For the longest time I wanted to be able to see mixed source output, with both the original source code and the resulting assembly on a line by line basis. Prior to this, knowing where an assembly instruction came from was a tedious and error-prone task for moderately complicated source code. Now it's much easier to identify inefficient algorithms, or debug tricky logic. (I suppose it would also help to be able to find compiler bugs. If there were any. Which there aren't.) The debug information can be used to view assembly at compile time or to view the source code of compiled programs. The other binutils programs (most importantly readelf and objdump) know how to work with these sections. The elf2cart and elf2ea5 programs I posted earlier will ignore these sections, so there's no problem doing development with debug on.

     I should mention that a lot of additional non-TI-compliant text will be added to the assembly file if debugging is used. This clutter could make it harder to read, and the file will only assemble with GAS. Just something to be aware of if you are using an exotic build method. I've updated the first post with new attachments, and the gcc installer has been updated to build this latest version. If anyone has problems or finds any bugs, please post them here. I love bug reports, keep 'em coming. As always, there are more details on my blog.
  18. Wow, thanks! I'm having fun working on this, but it's been amazing seeing all the projects that everyone else has made with these tools. I'm always so excited to see people be able to express their talent and creativity, and I'm proud to think I had some small part in helping to make that easier.
  19. Hey everyone, I've been out of the loop for a while, but I'm making it a point to spend more time doing compiler stuff. First off, I've copied all the build directions and latest patches to the first post in this thread. That should make it easier to find any changes or improvements. Also, I've added a script to automate the patch and build process in gcc-installer.tar.gz. To use this, run the install.sh script and pass the install directory as an argument. The script will download the unmodified files from gnu.org, patch, build, and copy the output files to the location specified. For example, after decompressing the archive and running this script, tms9900-gcc and friends will be copied to /home/eric/tms9900/bin:

         $ tar -xzf gcc-installer.tar.gz
         $ install.sh /home/eric/tms9900

     Let me know if anyone sees any problems.
  20. I was looking at integer square roots, and found (on Wikipedia no less) this algorithm. It appears to run in constant time and only uses fast operations (shifts, adds and subtracts). This also only uses 16-bit intermediate values, so no libgcc required. I haven't analyzed this for performance or correctness, but it looks interesting.

     Link: http://en.wikipedia....em_.28base_2.29

     Code:

         short isqrt(short num) {
             short res = 0;
             short bit = 1 << 14; // The second-to-top bit is set: 1L<<30 for long

             // "bit" starts at the highest power of four <= the argument.
             while (bit > num)
                 bit >>= 2;

             while (bit != 0) {
                 if (num >= res + bit) {
                     num -= res + bit;
                     res = (res >> 1) + bit;
                 }
                 else
                     res >>= 1;
                 bit >>= 2;
             }
             return res;
         }
  21. I took a quick look at the output of the fixed point code. It looks like the problem is actually in fp_div. The left shifting of value A promotes the value to a 32-bit quantity, and the compiler then uses __divsi3 to get the result, which is then demoted back to a 16-bit value. This could be avoided by the use of inline assembly (div uses a 32-bit numerator and 16-bit divisor). Alternatively, you could try algebraic manipulation to keep all intermediate values in 16-bit representation. Personally, I'd go for the inline assembly since that would be faster to execute and simpler to understand. That being said, the __divsi3 function is included in libgcc, if you want to use it.

     On to new patches. Since I've been busy lately, this is going to be a pretty sparse update. These are mostly fixes for stuff Tursi has found so far.

     GCC changes:
       • Fixed R11 restoration in epilogue being dropped by DCE
       • Added support for named sections
       • Removed support for directly zeroing byte memory, was buggy in some memories

     binutils changes:
       • Added more informative syntax error messages
       • Fixed values like ">6000" in strings being mangled
       • Confirmed support for named sections

     As always, more details are on my blog.

     binutils-2.19.1-tms9900-1.5-patch.tar.gz
     gcc-4.4.0-tms9900-1.8-patch.tar.gz
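Since the original fp_div isn't quoted here, the following is only a guess at its general shape, assuming an 8.8 fixed-point format; `fp_t`, `FP_SHIFT` and the body are assumptions made for illustration. It shows where the 32-bit intermediate comes from.

```c
#include <stdint.h>

#define FP_SHIFT 8          /* assumed 8.8 fixed-point format */
typedef int16_t fp_t;       /* assumed 16-bit fixed-point type */

/* Fixed-point divide: scale the numerator up before dividing so the
 * quotient keeps its fractional bits. The scaled numerator no longer
 * fits in 16 bits, so the division is done at 32-bit width; on a
 * 16-bit target that typically means a library call such as __divsi3. */
fp_t fp_div(fp_t a, fp_t b)
{
    int32_t scaled = (int32_t)a * (1 << FP_SHIFT);  /* widen, then scale */
    return (fp_t)(scaled / b);
}
```

For example, 3.0 / 2.0 is `fp_div(768, 512)`, which gives 384, or 1.5 in 8.8 format. The scaled numerator there is 196608, well past the 16-bit range, which is why the wide divide shows up.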
  22. Using assembly language is probably the way to go here. The GCC compiler would be handy to write the non-critical code more quickly. A few months back I looked into writing a Wolfenstein-style raycaster. I did a quick mockup and made an estimate of the work required, and it seemed pretty do-able. Unfortunately, I never found the time to actually do anything on this. There would be severe limitations on the graphics due to the VDP capabilities, but I don't see any reason why this couldn't be done.
  23. Always happy to take requests. I didn't plan for the use of multiple sections, so the assembler, compiler and possibly the linker will need to be changed. The ELF conversion tools should be fine as they are. The biggest impact will be in the assembler. It will need to have a new keyword added to specify an arbitrarily named section. None of these should be too bad, though. Maybe a hundred lines or so total. The work required to get these options to work will also make it easier to make programs using overlays. It seems pretty common to use special sections to keep all the pieces straight. For now, you can get a similar result by creating a library with each function in its own file. The linker will then only include the parts that are actually used. I have a library of common functions that is maintained this way and it works pretty well. Most libc implementations do this too.
  24. Dave Pitts is responsible for the TMS990 GCC support. He had functional code working before I ever started my stuff. The compiler output should be OK for TMS9900, but there are differences in approach. The biggest one is probably in the handling of in-register byte operations. I solved that by modifying the mid-level GCC code to propagate type conversions. Dave did it by converting all byte operations to memory operations. The location of the workspace is stored in R10 and offsets to registers are calculated from there. Personally, I feel my approach is better, but there's room for argument here. I've also implemented a lot of optimizations that are lacking in the 990 compiler.

     As for libc, I've actually done some work on that. I've got pretty much everything that doesn't involve formatted output or console or file IO working. I could release what I have, but I'm not sure how useful that would be on its own. Unfortunately, there's not a lot I can pull from the 990 code. The math routines for libm could be stolen directly, but according to the comments, those are themselves stolen from Minix. All other libc functions are implemented by performing a syscall to the 990 operating system, which obviously won't work here. Although having said that, people have had good results by bridging into the GPL routines.

     I'm actually working on IEEE floating point support right now (32-bit only, no doubles). It seemed like it would be fun to do, and there's really not too much to it. My blog hasn't been updated for a while, but I've currently got type conversion working. Addition and subtraction are in progress, but almost done. So far it's about 300 bytes for the float routines; I guess that full support will probably take about 1K when complete.
  25. Compiling each overlay independently would work, but there is the added work of making sure that any RAM used is located at the same address for each independent overlay section. This is probably not too bad if all overlays use the same memory structures. If all global and static variables are defined in a common header file shared by each of the overlay files, you should be fine. If the overlays are written in a more usual style, with independent needs for global variables, it will be difficult to manually locate each of those variables in RAM.

     As for the default load location, I'd have to check later in the day. I'm pretty sure it's something useless like >0000 or some other invalid location. Since I knew the loading would be different between cart and EA5 images, I wasn't too concerned with specifying a default address, since I knew the linker would have to be supplied with one anyway.

     Position independent code is neat, but has its drawbacks. By using -fpic, the compiler must use a Global Offset Table and additional code to calculate the addresses of code and data based on the current load address. This makes for larger and slower code. Here's an example using a very simple function:

         void example1() { example2(); }

     Its assembly would look something like this:

         example1
           b @example2

     With position independent code, we don't know where example2 is anymore and must rely on the size and relative positions of the functions to determine where to jump. We would have to do something like this:

         getpc
           mov r11, r1    * Return address of calling instruction
           b *r11
         example1
           li r1, {address of getpc}    * This could also be provided as an argument
           bl *r1
           ai r1, {offset to this instruction} - {offset to example2}
           b *r1

     This probably won't work right, but you get the idea. Position independent code is bigger, slower and more awkward to use. Check out http://eli.thegreenp...ared-libraries/ for a discussion of how to use position independent code for shared libraries in x86 systems. It includes a more in-depth discussion of how position independent code can be implemented, but the example above gives some idea of the differences and the extra effort involved.