insomnia

Posts posted by insomnia

  1. I saw there was a lot of impact to the problem of using a "movb" instruction to test for zero, so I rushed out a patch to fix this. Jedimatt42 has already written a good description of the problem, so I won't repeat him here. I also realized that there was no changelog entry for 1.18, so I fixed that too. I also changed the installer code to remove the need for a "tree" command, and we now automatically build libgcc.

     

    Tschack909, I tried to help, but I don't have enough information to go on. I looked at the assembly, and I couldn't find anything that looked like a problem.

     

    So here's the changes for the latest patch:

    Use the "ci" instruction when testing for 16-bit zero values

    Use the combination of "jeq" and "cb" to test for 8-bit zero values

     

    The "ci" test is pretty self explanatory so there's not really much to say about that.

     

    However, I wanted to talk a little more about the 8-bit test since I thought it was kinda neat.

     

    Assume we have a value in r1 we want to test for zero. This is the code that the compiler will now emit:

    >1300       jeq 0
    >9801 fffe  cb @$-1, r1
    

    The "jeq" instruction is effectively a no-op. Since it's more likely for some random instruction to result in a non-zero value, we save two clocks by not taking the jump. In either case, we will proceed to the "cb" instruction.

     

    The "cb" instruction is using the zero of the "jeq" instruction as one of its arguments, and we test against the value in r1 as expected.

     

    So by using this combo, we have a bytewise test for zero that's only two bytes and 8 clocks longer than a "ci" instruction. Neat!
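
As a rough illustration of why the two tests differ (a host-side C model, not the compiler's actual code): byte values live in the upper byte of a 16-bit register, so a 16-bit compare against zero can fail even when the byte of interest is zero. The type and function names below are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 16-bit TMS9900 register. Byte-sized values live in the
   upper byte; the lower byte may hold leftover data. */
typedef uint16_t reg16;

/* 16-bit test, the effect of "ci r1, 0": every bit must be clear. */
static int is_zero16(reg16 r)
{
  return r == 0;
}

/* 8-bit test, the effect of comparing the register's byte against a
   zero byte with "cb": only the upper byte takes part. */
static int is_zero8(reg16 r)
{
  return (uint8_t)(r >> 8) == 0;
}
```

A register holding 0x00FF counts as zero for the 8-bit test but not for the 16-bit one, which is why the compiler needs a separate sequence for each width.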

     

    At any rate, thanks for finding a bug, and let me know if you find any problems.

  2. A new patch is now available in the first post, and the installer has been updated as well.

     

    So in this patch, we only have two fixes:

     

    Fixed 16-bit signed right shift
    Fixed 32-bit unsigned right shift

    The 16-bit shift was shifting in the wrong direction, and the 32-bit shift was performing signed shifts.
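
For anyone unsure of the difference between the two kinds of right shift, here is a minimal sketch in portable C. The function names are invented for this example; it only models the expected results, not what the compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Logical (unsigned) right shift: vacated high bits are filled with zero. */
static uint32_t shr_logical32(uint32_t value, unsigned count)
{
  assert(count < 32);
  return value >> count;
}

/* Arithmetic (signed) right shift: vacated high bits copy the sign bit.
   Written without relying on how the host compiler shifts negative values. */
static int16_t shr_arith16(int16_t value, unsigned count)
{
  assert(count < 16);
  uint16_t bits = (uint16_t)value;
  uint16_t shifted = (uint16_t)(bits >> count);
  if (value < 0 && count > 0)
  {
    /* Set the top 'count' bits to extend the sign */
    shifted |= (uint16_t)(0xFFFFu << (16 - count));
  }
  /* Convert back to signed without implementation-defined behaviour */
  return (int16_t)((int32_t)shifted - ((shifted & 0x8000u) ? 65536 : 0));
}
```

An unsigned 32-bit shift that behaves like the signed version would turn 0x80000000 >> 4 into 0xF8000000 instead of 0x08000000, which is the kind of wrong result the second fix addresses.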

     

    TheMole found the 16-bit shift bug, and I found the other one after doing some automated testing. Hopefully that's the last of these kinds of errors.

     

     

     

    Sorry for drip feeding these bug reports, but I'm reporting them as I run into them. If you would prefer to send me a patch for me to test prior to you officially releasing it (and assigning a version number), I'd be happy to help, just let me know.

     

    Making new releases takes no effort at all, so don't worry about that. Also, I appreciate bug reports no matter if they are drip fed or not. Thanks for doing these tests, clearly my own testing has some gaps.

  3. OK, I'm really embarrassed about this bug. It turns out that instead of emitting a right shift, we are emitting a left shift. As you discovered, this only affects signed integers.

     

    I can only assume this was due to a cut-and-paste error somewhere that escaped testing.

     

    This is a one-line fix, but I'll confirm that the rest of the shift operations work as intended before making another patch.

  4. OK, it's patch time!

     

    The first post has been edited to include the new patches and an updated GCC installer.

     

    The compiler is fairly mature at this point, so there's not many changes here:

     

    More strict checks for address in BL commands

    Optimization for 32-bit left shift by 8 bits
    Optimization for 32-bit logical right shift by 8 bits
    Fixed 32-bit right shift by more than 16 bits
    Fixed 8-bit multiplies

    The big news here is fixing 8-bit multiplies. This was first seen by TheMole, so thank you for the find.

     

    There's also some edge cases that got worked on in 32-bit shifts. These are nothing major, and were not likely to have been seen by too many people. I found these problems and the BL addressing mode bug while working on my OS project. I don't want to derail this thread, so look here for details on that:

     

    http://atariage.com/forums/topic/282474-linux-like-os-for-the-ti

     

    Anyway, thanks to everyone for continuing to find bugs. I love to swat 'em so don't hold back if you find one.

     

    PS: My blog is majorly out of date, I should do something about that...

  5. Is there any progress on:

    Linker that can handle non contiguous ram

    SAMS support

     

    The linker can already handle non-contiguous memory. The problem is that the program loader needs to copy that code from its storage location to the active location. The crt0 code in the example programs can handle this for basic cartridges and EA5 files, but more complex designs need to have additional code written.
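
The copy step itself is simple. A minimal sketch, assuming a hypothetical table of sections (the struct and function names are illustrative and are not taken from the example crt0 code):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one section: where the loader stored it and
   where the program expects it at run time. Names are illustrative only. */
struct section
{
  const unsigned char *load_addr;  /* storage location (ROM or file image) */
  unsigned char *run_addr;         /* active location (RAM) */
  size_t size;                     /* number of bytes to copy */
};

/* Copy every section from its storage location to its active location. */
static void copy_sections(const struct section *table, size_t count)
{
  size_t i, j;
  for (i = 0; i < count; i++)
  {
    for (j = 0; j < table[i].size; j++)
    {
      table[i].run_addr[j] = table[i].load_addr[j];
    }
  }
}
```

With non-contiguous memory the table simply has more than one entry; the hard part in a real design is building that table and deciding when each piece must be resident.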

     

    Unfortunately, the compiler will never natively support SAMS. We really don't want that anyway. SAMS support is something that needs to be in something like an OS layer. We would never be able to make the compiler smart enough to handle all the use cases, and we don't want to have two compilation outputs (one for a system with a SAMS card, one without).

     

    I'm slowly working on a replacement OS for the TI which automatically handles SAMS usage. It isn't ready yet, and might not be useful for your project anyway.

  6. I'm currently working on a fix for this.

     

    The problem is that char values stored in registers occupy the upper byte. When multiplying two register-stored values, the result is in the upper word of the 32-bit result.

    When multiplying memory-stored values, the char-to-int promotion occurs as expected, and the result is in the lower word of the 32-bit result. For extra fun, when multiplying register-stored and memory-stored values the result is stored in the middle two bytes of the 32-bit result.

     

    It's difficult to control how values are stored, so currently char multiplication can give unexpected results (as has been shown already).

     

    I'm working on a fix to force all char multiplications to be first converted to register-storage format, and to extract the correct portion of the result.

     

    Here's the math behind this: Result = (A * 2^8) * (B * 2^8) = (A * B) * (2^16)

     

    Before the operation completes, the resulting value needs to be converted to register-storage format, with the value stored in the upper byte of the result register.

     

     

     

    Here's an example of what I'm trying to achieve:

    mpy r1, r2   * Value A stored in register R1, value B in R2, result in [r2,r3]
                 * The resulting char value is stored in the low byte  of R2
    swpb r2      * Move result to high byte of R2 to convert to register-stored format
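
The arithmetic can be checked on a PC with a small C model. This is only a sketch: the helper names are invented, and it models the register contents rather than the compiler's output.

```c
#include <assert.h>
#include <stdint.h>

/* Register-stored char: the value sits in the upper byte of a 16-bit register. */
static uint16_t to_reg(uint8_t value)
{
  return (uint16_t)(value << 8);
}

/* Byte swap, the effect of "swpb" */
static uint16_t swpb(uint16_t r)
{
  return (uint16_t)((r << 8) | (r >> 8));
}

/* Multiply two register-stored chars the way "mpy" would (16 x 16 -> 32 bits),
   then swap bytes so the 8-bit product ends up in the upper byte.
   The lower byte of the return value is leftover data and must be ignored. */
static uint16_t mul_char_regs(uint16_t a_reg, uint16_t b_reg)
{
  uint32_t product = (uint32_t)a_reg * b_reg;       /* (A * B) * 2^16 */
  uint16_t high_word = (uint16_t)(product >> 16);   /* A * B as 16 bits */
  return swpb(high_word);                           /* low byte -> upper byte */
}
```

For example, 200 * 3 = 600 = 0x0258, so the 8-bit product is 0x58, and the model returns a register with 0x58 in its upper byte.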
    
    

    The problem I'm having is that I can't get the compiler to recognize that the low word of the 32-bit result is clobbered by the multiply. This means that any value stored in R3 in the example above will be destroyed by the multiplication, but the compiler will assume it's still valid. That error value will propagate through the code and have unpredictable results.

     

    I can't just promote all char values to int values either. The value we want will then be stored in the low byte of the 32-bit result, and we would still need to convert to register-stored format.

     

    Bear with me, a fix should be coming soon...

  7.  

    Just trying to understand things a bit here. Are your drivers for floppy drive access based upon the DSR code of the device, or have you created something new from scratch and are pretty much just using the ports on the devices?

     

    Just curious is all.

     

    Beery

     

    The floppy driver is written to directly access the FDC registers, specifically the TI FDC. As you suspected, this means a new driver will need to be written for each hardware variant (Myarc, Corcomp, etc.)

     

    This may sound like an unnecessary hassle, but the driver is only 600 lines of well-commented code. About 100 of those lines are just symbol definitions. Additionally, it looks like most FDC and HDC manufacturers took inspiration from the TI FDC, so there are a lot of similarities to exploit. Heck, there may even be common code that can be made device-independent.

     

    The reason for ignoring the DSR was that it didn't provide the features I needed, was slow, and needlessly stepped on the scratchpad and VDP memory. Additionally, the DSR code only works with the TI disk format. I wanted to use a different filesystem that allowed for deep directory trees, support for zero-length files, and softlinks (not yet available, I'll come back to it though.). I also wanted a sector cache to speed up common operations.

     

    The only way to do that would be to embed the CFS volume in the TI volume. The DSR would treat CFS as a big file, and then another layer would do the actual filesystem stuff. I didn't time it, but I suspect that doing things this way would increase access time by about 200 to 300 percent. No thank you.

     

    I'm using sector-by-sector access as that's how the hardware wants to operate. File operations ride on top of this and function in the normal Unix-y way. I'm not using inodes though. Standard Unix filesystems use a fixed number of inodes that subtract from storage space. Instead, I'm dynamically allocating file metadata records as needed to maximize space.

     

    Sorry, I've gone off topic a bit. Let me try to reel it back.

     

    Yes, each piece of hardware will need its own driver, but there is a lot of similarity between devices of the same type that can be exploited. Plus it's more fun this way.

     

    Other topics:

     

    I read the AMS overlay document, and that is an approach to paging I have not seen before. I'm not sure it's any better than what people have been using so far, but it's worth more time to think about it.

     

    So far, I'm liking plan 99, and I think the parallels with plan 9 are a good thing.

     

    I would report some progress here, but I got bit by what looks like a compiler bug in my bank-switch code, so I'm working to fix that first.

  8. First off, I should answer some of these questions.

     

    @ retroclouds RE: scheduler preemption

    I'm using the VDP interrupt to periodically check the process queue to see if there is a higher-priority process that is ready to run. If so, the currently running process is packed up and the new one is started. Additionally, each time a syscall is invoked (basically any IPC or file IO operation), the system pauses the calling process and starts the next ready, highest-priority process. In some pathological cases, it is possible for a process to starve same or lower priority processes, but that's a risk with other operating systems as well. In all cases, higher-priority processes are started when expected.
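
A minimal sketch of that selection rule, assuming a hypothetical process record (the kernel's real data structures are not shown in this thread, so every name below is made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical process record; the real kernel's layout is not shown in
   the post. A larger 'priority' number means more urgent here. */
struct process
{
  int priority;
  int ready;     /* nonzero when the process can run */
};

/* Return the index of the highest-priority ready process, or -1 if none.
   Ties go to the lowest index, which is one way same-priority starvation
   can arise in a simple scheme like this. */
static int pick_next(const struct process *queue, size_t count)
{
  int best = -1;
  size_t i;
  for (i = 0; i < count; i++)
  {
    if (queue[i].ready &&
        (best < 0 || queue[i].priority > queue[best].priority))
    {
      best = (int)i;
    }
  }
  return best;
}

/* Called from the timer tick: switch only when something more urgent waits. */
static int should_preempt(const struct process *queue, size_t count, int current)
{
  int next = pick_next(queue, count);
  return next >= 0 && queue[next].priority > queue[current].priority;
}
```

The timer path only switches for a strictly higher priority, while a syscall path would call pick_next directly and run whatever it returns.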

     

    @TheMole RE: additional developers

    Once I get the major system design decisions made and get more of a framework put together it would definitely be helpful to bring on additional help. I'm typically used to being a solo developer, but in the interest of time and feature coverage, it's probably a good idea to get additional hands on the project at some point.

     

    @TeBF RE: Tinix

    I thought about the same name, but unfortunately someone already has a minix-derived system by that name. Oh well.

     

    @gemintronic RE: contiki

    Don't worry about it. I looked into porting Contiki a while ago, and it just wasn't a good fit for the TI. A LOT of compromises were taken to make that OS work. It's a microkernel with stackless processes. The whole project has the feel of a demo program. Impressive, but highly tuned for a single purpose. We have enough resources to make something with a more familiar feel and a lot more capability.

     

    @Vorticon RE: user programming

    Good question. For now I expect cross development on a PC to be the only way to go. I would eventually like to get self-hosted development working (C, assembly, pascal, basic, etc), but that's a ways off.

     

    So last night I looked at the AMS card, and started thinking about how best to use that resource. Specifically, how do we maximize memory for user processes? Even though we have up to 1MB of RAM available, we don't have a Memory Management unit (MMU) or facilities in the processor to detect reads or writes to unmapped space. We would need to add wrapper code to every read or write operation to guarantee that the intended memory is available for use before the operation takes place. This looked even worse as I tried to figure out a way to implement that.

     

    I looked at the source for uClinux, which is a variant of Linux that supports processors without MMUs. They resolve the problem by not allowing virtual memory or similar systems like what I was originally looking at. This wasn't too surprising, since I wasn't very far from making the same decision.

     

    So here's the memory model for user processes:

     

    0x2000 - 0x3fff : Application code, 8KB paged

    0xa000 - 0xfdff : Application memory, 23.5 KB

    0xfe00 - 0xffff : Process stack, 512 bytes

     

    If more memory is needed for data, we can use a floppy or a temp file residing in AMS space.

     

    This should be OK. I've already got infrastructure set up for handling paged kernel code. This should be extendable for user code. In my experience, there's usually about an 80/20 split between code and data. So for 24 KB data, we would typically have a 120 KB executable file, and with a total of 1 MB RAM available, we can simultaneously run about 8 of these executables. Alternatively, we should be able to run up to 128 lightweight processes (4 KB code, 4 KB data). Not too shabby.
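
The sizes and process counts above can be double-checked with a few lines of C. The macro and function names are invented for this sketch, and the estimate ignores whatever memory the kernel itself needs.

```c
#include <assert.h>

/* Memory map from the post (addresses are inclusive ranges) */
#define APP_CODE_START  0x2000u
#define APP_CODE_END    0x3fffu
#define APP_DATA_START  0xa000u
#define APP_DATA_END    0xfdffu
#define APP_STACK_START 0xfe00u
#define APP_STACK_END   0xffffu

#define AMS_TOTAL_BYTES (1024ul * 1024ul)   /* 1 MB AMS card */

/* Size in bytes of an inclusive address range */
static unsigned long range_size(unsigned long start, unsigned long end)
{
  assert(end >= start);
  return end - start + 1;
}

/* How many processes of a given footprint fit in AMS memory,
   ignoring any memory the kernel itself needs. */
static unsigned long max_processes(unsigned long code_bytes, unsigned long data_bytes)
{
  return AMS_TOTAL_BYTES / (code_bytes + data_bytes);
}
```

With 96 KB of code and 24 KB of data per process this gives 8, and with 4 KB of each it gives 128, matching the figures above.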

     

    So the next step for me is to get AMS mapping working, and to find a good way to do block management to allow kernel and applications to share memory resources. Temp file management will probably come after that. Finally, I think application loaders will be good to work on. At that point, an alpha release doesn't seem so far away.

     

  9. So I've mentioned on here a few times that I've been working on a new operating system for the TI. Development has gotten to the point where I think it's time to do some kind of announcement. Additionally, I've read the suggestion that something like Contiki should be ported one too many times. We can do better, folks.

     

    I'm taking inspiration from Linux for several reasons. It's clearly a successful design. If it's similar enough, I might be able to port some code instead of writing everything from scratch. Writing user programs will be easier since there is no shortage of good references out there.

     

    Keep in mind, I'm nowhere close to an alpha release, but I did want to make sure I got some kind of input that what I'm working on might be of interest to someone besides me.

     

    The design philosophy I've been following is that while TI wrote a bunch of neat code for the time, modern developers should only be limited by the constraints of hardware, not software decisions of the past. So that means backwards-compatibility is out. Developing against raw hardware is in.

     

    I'm assuming this as a base platform:

    Console

    1MB AMS card

    Two 96KB floppy drives

    RS232 card

     

    Here's a current feature list:

    Kernel runs from a cart image

    Device driver API

    Drivers for:

    text screen

    keyboard

    floppy drive

    /dev/null

    /dev/zero

    /dev/random

    POSIX-compatible filesystem API

    CFS filesystem (a simple filesystem similar to ext2)

    Preemptive multitasking

    Priority scheduling

    Mutexes

    Semaphores

    Shared memory

    Syscall API

    Timers

     

    Conspicuously missing features:

    User shell

    Use AMS for process memory separation

    loading executables (EA5 and ELF)

    Shell scripting

    Some kind of catchy name

     

    As of today, the kernel size is about 12KB and has about 10,000 lines of code (not including libc or libgcc). The code will eventually be put up on github after I clean it up a bit. I need to impose a coding standard and make sure all dead code has been removed properly.

     

    So, what does everyone think of this project? Is this something that sounds interesting or possibly useful? Is there something I seem to have overlooked? Deep-seated opposition to the very concept? Let me know.

     

    I'm trying to work on documentation, but I suspect it will always lag the current code state. If there's some aspect of this thing someone would like more information on, I'll be happy to share.

     

    Thanks for listening.

  10. Oh yes, the TMS9900 is definitely "special". And at times really annoying.

     

    Anyway, there is a libgcc included with the gcc patches. To build it run "make all-target-libgcc; make install".

     

    This will build the library and install it in a location GCC can find it. It may be necessary to include "-lgcc" in your linker options.

     

    It's been a while since I've made a project that needed this, so I might be forgetting something. If you have any problems let me know and I will try to help.

  11. So I'm trying to write a tiny operating system for the TI, and I'd like to avoid as many of TI's original design decisions as possible. This is mostly for the challenge of it and to see if whatever I can come up with is better than what the original engineers made. This would also let me make maximum use of the limited hardware resources.

    My most basic problem is that I need to do my development on an emulator since my actual hardware is packed away at the moment. So my first question is whether there are any TI99/4a emulators which permit direct access to the FD1771 floppy controller. I'm pretty sure Classic99 does not do this, but I think MESS does.

    I tried writing a really basic driver for the FD1771 and ran it in MESS. The problem I had there was that any attempt to read from the disk failed. I was able to get a lot of functions working properly, but when I tried to read a sector, I only got the first byte. After that I got "disk not ready" errors.

     

    I've been using Thierry Nouspikel's code as a starting point, along with dumps of the floppy controller firmware and the FD1771 datasheet.

     

    That's a bunch of text. Here's the short version:

    1) Does anyone know of an emulator that emulates the floppy controller hardware?

    2) Has anyone come across code which directly interfaces with the floppy controller, bypassing the DSR?

     

     

    I've included the code I'm using under the spoiler tag.

     

     

     

    // This is a driver for the FD1771 single-density floppy drive controller used
    // in the TI floppy drive
    // Double-density CorComp drives use a FD179x controller
    
    
    #include "kernel/cru.h"
    #include "kernel/printk.h"
    #include "errno.h"
    
    
    #define FDC_REG_R_STATUS *((volatile unsigned char*)0x5ff0)
    #define FDC_REG_R_TRACK *((volatile unsigned char*)0x5ff2)
    #define FDC_REG_R_SECTOR *((volatile unsigned char*)0x5ff4)
    #define FDC_REG_R_DATA *((volatile unsigned char*)0x5ff6)
    #define FDC_REG_W_COMMAND *((volatile unsigned char*)0x5ff8)
    #define FDC_REG_W_TRACK *((volatile unsigned char*)0x5ffa)
    #define FDC_REG_W_SECTOR *((volatile unsigned char*)0x5ffc)
    #define FDC_REG_W_DATA *((volatile unsigned char*)0x5ffe)
    
    
    #define FDC_CRU_R_LOAD_HEAD 0x1100
    #define FDC_CRU_R_DRIVE_SEL 0x1102
    #define FDC_CRU_R_DRIVE_SEL1 0x1102
    #define FDC_CRU_R_DRIVE_SEL2 0x1104
    #define FDC_CRU_R_DRIVE_SEL3 0x1106
    #define FDC_CRU_R_MOTOR_STROBE 0x1108
    // constant zero 0x110a
    // constant one 0x110c
    #define FDC_CRU_R_SIDE_SEL 0x110e
    
    
    #define FDC_CRU_W_DRIVE_ENABLE 0x1100
    #define FDC_CRU_W_MOTOR_STROBE 0x1102
    #define FDC_CRU_W_WAIT_STATE 0x1104
    #define FDC_CRU_W_LOAD_HEAD 0x1106
    #define FDC_CRU_W_DRIVE_SEL 0x1108
    #define FDC_CRU_W_DRIVE_SEL1 0x1108
    #define FDC_CRU_W_DRIVE_SEL2 0x110a
    #define FDC_CRU_W_DRIVE_SEL3 0x110c
    #define FDC_CRU_W_SIDE_SEL 0x110e
    
    
    #define FDC_CMD_FORCE_IRQ 0xd0
    #define FDC_CMD_SEEK_ZERO 0x0a
    #define FDC_CMD_SEEK_PREV 0x7a
    #define FDC_CMD_SEEK_NEXT 0x5a
    #define FDC_CMD_SEEK_TRACK 0x1e
    #define FDC_CMD_READ_SECTOR 0x88
    #define FDC_CMD_WRITE_SECTOR 0xa8
    #define FDC_CMD_READ_ID  0xc0
    
    
    #define FDC_STAT_BUSY 0x01
    #define FDC_STAT_DRQ 0x02
    #define FDC_STAT_DATA_LOST 0x04
    #define FDC_STAT_TRACK_0 0x04
    #define FDC_STAT_CRC_ERROR 0x08
    #define FDC_STAT_NOT_FOUND 0x10
    #define FDC_STAT_W_FAULT 0x20
    #define FDC_STAT_W_PROTECT 0x40
    #define FDC_STAT_NOT_READY 0x80
    
    
    static int sector_size = 256;
    static int curr_track = 0;
    static int curr_drive = 0;
    
    
    void command_wait(void);
    
    
    void strobe_motor(void)
    {
      SBZ(FDC_CRU_W_MOTOR_STROBE);
      SBO(FDC_CRU_W_MOTOR_STROBE);
    }
    
    
    void send_command(int command)
    {
      char stat = ~FDC_REG_R_STATUS;
      if(stat & FDC_STAT_NOT_READY)
      {
        strobe_motor();
      }
    
      // Wait for drive to be ready
      while(stat & FDC_STAT_NOT_READY)
      {
        // Note: inverted data bus
        stat = ~FDC_REG_R_STATUS;
      }
    
      // Send command
      FDC_REG_W_COMMAND = ~command;
    
      // Signal HLT pin to allow read or write operations
      SBO(FDC_CRU_W_LOAD_HEAD);
    }
    
    
    int select_drive(int drive)
    {
      volatile unsigned int select = 0x0080 << drive;
      volatile unsigned int val;
    
      // Deselect all drives, set SEL1, SEL2, SEL3 to zero
      LDCR(FDC_CRU_W_DRIVE_SEL, 0, 3);
    
      // Select our drive
      LDCR(FDC_CRU_W_DRIVE_SEL, select, 3);
    
      // Confirm drive selection
      STCR(FDC_CRU_R_DRIVE_SEL, val, 3);
      if(val != select)
      {
        return(ENXIO);
      }
    
      // Turn on controller
      SBO(FDC_CRU_W_DRIVE_ENABLE);
      send_command(FDC_CMD_FORCE_IRQ);
      command_wait();
    
      // Drive successfully selected
      curr_drive = drive;
      return(0);
    }
    
    
    void select_side(int side)
    {
      LDCR(FDC_CRU_W_SIDE_SEL, side << 8, 1);
    }
    
    
    // Wait for command completion
    void command_wait(void)
    {
      unsigned char stat = FDC_STAT_BUSY;
      while((stat & (FDC_STAT_NOT_READY | FDC_STAT_BUSY)) == FDC_STAT_BUSY)
      {
        // Note: inverted data bus
        stat = ~(FDC_REG_R_STATUS);
      }
      if(stat & FDC_STAT_NOT_READY)
      {
        // Drive no longer ready, command failed
        send_command(FDC_CMD_FORCE_IRQ);
        command_wait();
      }
    }
    
    
    void initialize(void)
    {
      // Turn on FDC
      SBO(FDC_CRU_W_DRIVE_ENABLE);
    
      // Clear SEL1, SEL2, SEL3 and SIDE lines
      LDCR(FDC_CRU_W_DRIVE_SEL, 0, 4);
    
      // Strobe motor
      strobe_motor();
    
      // Send "force interrupt" command
      send_command(FDC_CMD_FORCE_IRQ);
    
      command_wait();
    }
    
    
    void seek_track_0(void)
    {
      send_command(FDC_CMD_SEEK_ZERO);
      command_wait();
      char stat = ~FDC_REG_R_STATUS;
      if((stat & FDC_STAT_TRACK_0) == 0)
      {
        printk("failed to seek\n");
        // Failed to seek
      }
      curr_track = 0;
    }
    
    
    void seek_track_prev(void)
    {
      send_command(FDC_CMD_SEEK_PREV);
      command_wait();
      curr_track--;
    }
    
    
    void seek_track_next(void)
    {
      send_command(FDC_CMD_SEEK_NEXT);
      command_wait();
      curr_track++;
    }
    
    
    void seek_track(char track)
    {
      FDC_REG_W_DATA = ~track;
      // Only needed if we changed drives
      // FDC_REG_W_TRACK = ~curr_track;
      send_command(FDC_CMD_SEEK_TRACK);
      command_wait();
      curr_track = track;
    }
    
    
    // Hard coded to read sector 0 for testing
    void read_sector(int sector, char *buffer)
    {
      SBO(FDC_CRU_W_LOAD_HEAD);
      command_wait();
    
      FDC_REG_W_TRACK = ~0;
      FDC_REG_W_SECTOR = ~0;
      send_command(FDC_CMD_READ_SECTOR);
    
      SBO(FDC_CRU_W_WAIT_STATE); // Enable wait states
    
      int size = 256; //sector_size;
      do
      {
        // Unroll read loop
        *buffer++ = ~FDC_REG_R_DATA;
        *buffer++ = ~FDC_REG_R_DATA;
        size-=2;
      } while(size>0);
    
      SBZ(FDC_CRU_W_WAIT_STATE); // Disable wait states
      command_wait();
    }
    
    
    void write_sector(int sector, char *buffer)
    {
      FDC_REG_W_TRACK = ~curr_track;
      FDC_REG_W_SECTOR = ~sector;
      send_command(FDC_CMD_WRITE_SECTOR);
    
      SBO(FDC_CRU_W_WAIT_STATE); // Enable wait states
      int size = sector_size;
      do
      {
        // Unroll write loop
        FDC_REG_W_DATA = ~*buffer++;
        FDC_REG_W_DATA = ~*buffer++;
        size -= 2;
      } while(size);
    
      SBZ(FDC_CRU_W_WAIT_STATE); // Disable wait states
      command_wait();
      // test "not found" "data lost" and "write protect" bits
    }
    
    
    int tifdc_init(void)
    {
      // Turn on fdc rom (not yet implemented)
      return(0);
    }
    
    
    struct device_operations ti_fdc_ops =
    {
      .read_block = &tifdc_read_block,
      .write_block = &tifdc_write_block,
      .init = &tifdc_init
    };
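
Since the inverted data bus is easy to trip over, here is a small host-testable sketch of the status decoding on its own. The bit masks are copied from the driver above; the two helper functions are hypothetical and are not part of the posted code.

```c
#include <assert.h>

/* Status bits, as defined in the driver above */
#define FDC_STAT_BUSY      0x01
#define FDC_STAT_DRQ       0x02
#define FDC_STAT_NOT_READY 0x80

/* The controller's data bus is inverted, so every byte read from or
   written to an FDC register must be complemented. */
static unsigned char fdc_invert(unsigned char raw)
{
  return (unsigned char)(~raw & 0xff);
}

/* Hypothetical helper: decide from one raw status read whether a
   command is still running on a drive that is still ready. */
static int fdc_still_busy(unsigned char raw_status)
{
  unsigned char stat = fdc_invert(raw_status);
  return (stat & (FDC_STAT_NOT_READY | FDC_STAT_BUSY)) == FDC_STAT_BUSY;
}
```

Keeping the decode in a pure function like this makes it possible to test the wait condition on a PC without an emulator in the loop.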
    

     

     

     

  12. There's a working implementation in the tms9900 binutils package in binutils-2.19.1/opcodes/tms9900-dis.c. Look for the print_insn_tms9900 function. This function does basically the same thing mizapf describes.

     

    Here's some pseudocode:

    index = (opcode >> 12) & 0x0F 
    switch(index)
    {
      case 0:
        index = (opcode >> 8) & 0x0F
        switch(index)
        {
          case 0, 1, 12, 13, 14, 15:
            format[] = {"","","","","","","","","","","","","","","",}
            break
    
          case 2, 3:
            index = (opcode >> 4) & 0x1F
            format[] = {"li","", "ai","", "andi","", "ori","", "ci","", "stwp","", "stst","", "lwpi","", "limi","", "","", "idle","", "rset","", "rtwp","", "ckon","", "ckof","", "lrex",""}
            break
    
          case 4, 5, 6, 7:
            index = (opcode >> 6) & 0x0F
            format[] = {"blwp", "b", "x", "clr", "neg", "inv", "inc", "inct", "dec", "dect", "bl", "swpb", "seto", "abs", "", ""}
            break
    
          case 8, 9, 10, 11:
            format[] = {"", "", "", "", "", "", "", "", "sra", "srl", "sla", "src"}
            break
        }
        break
        
      case 1:
        index = (opcode >> 8) & 0x0F
        format[] = {"jmp", "jlt", "jle", "jeq", "jhe", "jgt", "jne", "jnc", "joc", "jno", "jl", "jh", "jop", "sbo", "sbz", "tb"}
        break
    
      case 2, 3:
        index = (opcode >> 10) & 0x07
        format[] = {"coc", "czc", "xor", "xop", "ldcr", "stcr", "mpy", "div"}
        break
          
      default:
        format[] = {"", "", "", "", "szc", "szcb", "s", "sb", "c", "cb", "a", "ab", "mov", "movb", "soc", "socb"}
        break
    }
    
    if(format[index] != "")
      decode as format[index]
    else
      invalid instruction

    There's more to it than this, but this should give you somewhere to start.
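
For anyone who wants something compilable, the same lookup can be written as a short C function. This is a sketch, not the binutils code: the function name and table names are invented here, the tables follow the TMS9900 opcode map, and it returns only the mnemonic without decoding operands.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the mnemonic for a TMS9900 opcode word, or "" if unassigned. */
static const char *tms9900_mnemonic(uint16_t opcode)
{
  static const char *const imm[16] = {
    "li", "ai", "andi", "ori", "ci", "stwp", "stst", "lwpi",
    "limi", "", "idle", "rset", "rtwp", "ckon", "ckof", "lrex" };
  static const char *const single[16] = {
    "blwp", "b", "x", "clr", "neg", "inv", "inc", "inct",
    "dec", "dect", "bl", "swpb", "seto", "abs", "", "" };
  static const char *const shift[4] = { "sra", "srl", "sla", "src" };
  static const char *const jump[16] = {
    "jmp", "jlt", "jle", "jeq", "jhe", "jgt", "jne", "jnc",
    "joc", "jno", "jl", "jh", "jop", "sbo", "sbz", "tb" };
  static const char *const cru[8] = {
    "coc", "czc", "xor", "xop", "ldcr", "stcr", "mpy", "div" };
  static const char *const dual[16] = {
    "", "", "", "", "szc", "szcb", "s", "sb",
    "c", "cb", "a", "ab", "mov", "movb", "soc", "socb" };

  unsigned top = (opcode >> 12) & 0x0F;
  unsigned sub = (opcode >> 8) & 0x0F;

  switch (top)
  {
    case 0:
      if (sub == 2 || sub == 3)
      {
        /* Immediate and control group; low bit of the 5-bit field must be 0 */
        unsigned index = (opcode >> 4) & 0x1F;
        return (index & 1) ? "" : imm[index >> 1];
      }
      if (sub >= 4 && sub <= 7)
        return single[(opcode >> 6) & 0x0F];
      if (sub >= 8 && sub <= 11)
        return shift[sub - 8];
      return "";
    case 1:
      return jump[sub];
    case 2:
    case 3:
      return cru[(opcode >> 10) & 0x07];
    default:
      return dual[top];
  }
}
```

Calling it with 0x1300 gives "jeq" and with 0x9801 gives "cb"; an empty string means the word is not a valid instruction.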

     

    Good Luck!

  13. OK, actual content time.

     

    I was originally going to make this long-winded explanation about how console_read and console_write should be implemented, but thought a bit and decided to just implement demonstration code and be done with it.

     

    So I've attached a demo that shows a working example with printf and getchar. The user code is pretty simple and the back end screen and keyboard code is relatively quick. This should be a decent base for something more interesting, or a port of a text-mode program.

     

    The only thing I failed to mention is that file IO is not working yet. That's where I got stuck. Again, Tursi's libti99 would be a great starting point. I should probably look into that.

     

    Anyway, let me know if there are any problems or missing features.

     

    BTW, the blog has been reactivated and I'll make sure it gets updated.

    libc_demo.tar.gz

  14. Yup, that's pretty much it. However, those functions are only needed if you want to use the functions in stdio. I would recommend string-oriented functions like sprintf and sscanf instead. That way you could decouple file IO, screen manipulation and keyboard handling from text operations.
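
A minimal sketch of that decoupling, using only standard C89 calls. The two wrapper functions and the "score=" format are made up for this example; how the finished buffer reaches the screen is left to whatever display code you have.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a status line into a caller-supplied buffer. The buffer must be
   large enough for the text; 32 bytes is plenty for this format. */
static void format_score(char *buffer, int score)
{
  sprintf(buffer, "score=%d", score);
}

/* Parse the same format back out of a string. Returns 1 on success. */
static int parse_score(const char *text, int *score)
{
  return sscanf(text, "score=%d", score) == 1;
}
```

Neither function touches the console, so the text handling can be tested on a PC and then paired with any screen or keyboard routine on the TI.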

     

    The OS I'm working on is intended to be written from scratch, support multitasking, be Posix-compliant, and not use anything from the existing TI console. It should run on unmodified hardware, and only need a cart for the core OS and an expansion box for the disk drives and 32K expansion memory.

     

    Right now, I've got about 40% of the code written. A lot of the core functionality is done, but I got stuck on writing the floppy driver and finalizing a disk format. I want to be able to support directories, links, and long filenames. Unfortunately, it's easy to get stuck in a cycle of feature-creep and early optimization.

     

    After hitting that roadblock, I started working on a decompiler that takes a binary image as input and outputs high-level pseudocode. That's pretty far along too, but still needs a lot of work.

     

    I haven't kept my blog up to date since my time to work on TI stuff has been limited lately. I should probably pick that back up and do more documentation for the stuff I've written so far.

  15. This is a library which adds single-precision IEEE 754 floating point operations. It provides the implementation for internal GCC calls. This means that the "float" type can be used in C programs without any additional effort.

     

    I've tried to make this library as small as possible, but at the same time tried to avoid writing a cryptic mess.

     

    All code has been tested and should be error-free, but let me know if anyone finds a bug somewhere.

     

    libfloat.tar.gz

  16. This is a libc implementation which is mostly compliant with the c89 standard. I've been working on this for a while as part of a disk-based OS for the TI, but this can still be used as an independent library.

     

    I chose this standard since more recent ones mainly add support for wide characters, complex floating-point data types and other stuff which is probably not useful for a TI99/4a system.

     

    All of the functions commonly used in stdio.h have been implemented, but there are two unresolved symbols: console_write and console_read. These are intended to provide text-based screen output and keyboard input respectively. In my OS project, these are provided by device drivers, but Tursi's libti99 would be better for anyone intending to use the stdio functions.

     

    Hopefully this library is useful for someone. Feel free to do whatever you want with this.

    libc99.tar.gz

  17. Well, it's patch time again.

     

    The first post of this thread has been updated with the newest patch file and an updated installer script.

     

    Here's the changes in this latest set:

    Fixed 32-bit right constant shift, failed with some constants
    Fixed all 32-bit variable shifts, was using r0 as temp register
    Fixed carry bit in 32-bit add
    Fixed invalid instruction in some 32-bit add forms

    The first one is a result of the bug TheMole reported earlier. Not much more to say about that one.

     

    While testing shift instructions, I found that in some cases the compiler decided to use R0 as a temp register before the final calculation was complete. I'm still not sure why that was happening, since R0 was supposed to be the last register selected for temp registers. This was fixed by disallowing input registers as temps.

     

    Another unexpected bug was that the carry between the low and high words of a 32-bit add was inverted, producing incorrect sums.
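
    To make the role of that carry concrete, here is a minimal C sketch of a 32-bit add built from two 16-bit words. The type and function names are hypothetical and this is not the compiler's actual code; it only models the arithmetic that the generated instructions have to perform.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit value held in two 16-bit registers. */
typedef struct {
    uint16_t hi;  /* upper 16 bits */
    uint16_t lo;  /* lower 16 bits */
} word_pair;

/* Add two 32-bit values one 16-bit word at a time. */
static word_pair add32(word_pair a, word_pair b)
{
    word_pair r;
    uint16_t carry;

    r.lo = (uint16_t)(a.lo + b.lo);
    /* The low-word sum wrapped around exactly when it is smaller
       than one of its inputs; that is the carry into the high word. */
    carry = (uint16_t)((r.lo < a.lo) ? 1 : 0);
    r.hi = (uint16_t)(a.hi + b.hi + carry);
    return r;
}
```

    For example, adding 0x0000FFFF and 0x00000001 should give 0x00010000. If the carry is inverted, the high word stays at zero for that input and gains a spurious 1 for inputs that do not overflow the low word.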

     

    There was another edge case where the compiler tried to use the AI instruction with memory addresses. This was fixed by the use of a temp register.

     

    It looks like the only bugs left are in unusual edge cases, which is great. I'm still looking for improvements, but at this point clear wins are getting hard to find. So if anyone has a good idea, I'd be happy to hear it.

     

     

    • Like 4
  20. OK, I found the problem in the compiler. The issue is that there was a bug with right shifts of 32-bit values.

     

    Here's how I found the problem:

     

    A diff of the two assembly listings showed only one mismatched block:

    patch 12  (works)
      e6:	c2 46       	mov r6, r9
      e8:	0a 39       	sla r9, 3
      ea:	c0 47       	mov r7, r1      
      ec:	09 d1       	srl r1, 13
      ee:	e2 41       	soc r1, r9
    
    patch 15 (fails)
      e8:	c0 87       	mov r7, r2     
      ea:	08 d6       	sra r6, 13
      ec:	09 d7       	srl r7, 13
      ee:	0a 32       	sla r2, 3
      f0:	e1 c2       	soc r2, r7

    From looking at the source code and surrounding assembly, I was able to determine this block of code was part of the statement:

    unsigned int srcbank    = _binary_start_bank + ((start - srcoffset) / BANKSIZE);

    Specifically, this code implemented ((start - srcoffset) / BANKSIZE). The (start - srcoffset) calculation was done by the instructions between offsets 0xde and 0xe4. This leaves us with (N / BANKSIZE). Since BANKSIZE is a power of two, the compiler uses a bit shift to do the division, transforming the code into (N >> 13).
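
    The division-to-shift transformation can be sketched in C as follows. This assumes BANKSIZE is 8192 (2^13), which is inferred from the shift count of 13 rather than taken from the original source, and that the operand is an unsigned 32-bit value. The function names are made up for the illustration.

```c
#include <stdint.h>

/* Assumed value: 2^13, inferred from the shift count of 13. */
#define BANKSIZE 8192UL

/* Division as written in the source. */
static uint32_t bank_by_division(uint32_t n)
{
    return n / BANKSIZE;
}

/* What the compiler emits instead: for an unsigned value, dividing
   by 2^13 is the same as a logical right shift by 13. */
static uint32_t bank_by_shift(uint32_t n)
{
    return n >> 13;
}
```

    Both functions return the same result for every unsigned input, which is why the compiler is free to pick the cheaper shift.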

     

    Knowing all this, let's walk through the code to see what's going on.

    1) Assume a bit pattern stored in R6 and R7
    
       .----r6--------. .----r7--------.
       abcdefghijklmnop qrstuvwxyzABCDEF
    
    2) By manual calculation, we should have this result:
    
       0000000000000abc defghijklmnopqrs
    
    
    3) Trace the code
    
                            Assembly        Pseudo        bitfield value
    patch 12  (works)       ----------      ----------    ----------------
      e6:	c2 46       	mov r6, r9
      e8:	0a 39       	sla r9, 3       r9 = r6<<3  = defghijklmnop000
      ea:	c0 47       	mov r7, r1      
      ec:	09 d1       	srl r1, 13      r1 = r7>>13 = 0000000000000qrs
      ee:	e2 41       	soc r1, r9      r9 |= r1    = defghijklmnopqrs  <-- Correct!
    
    
                            Assembly        Pseudo        bitfield value
    patch 15 (fails)        ----------      ----------    ----------------
      e8:	c0 87       	mov r7, r2      r2 = r7     = qrstuvwxyzABCDEF  <-- Should use r6, not r7
      ea:	08 d6       	sra r6, 13      r6 = r6>>13 = 0000000000000abc
      ec:	09 d7       	srl r7, 13      r7 = r7>>13 = 0000000000000qrs
      ee:	0a 32       	sla r2, 3       r2 = r2<<3  = tuvwxyzABCDEF000
      f0:	e1 c2       	soc r2, r7      r7 |= r2    = tuvwxyzABCDEFqrs
      f2:	c2 47       	mov r7, r9      r9 = r7     = tuvwxyzABCDEFqrs  <-- Wrong!
    

    Ultimately, this was a typo in the format used for this instruction, and has been fixed. I'll look for similar problems in the other shift instructions and get a new patch sent out in a day or two.
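
    The two traces above can also be modelled in C, treating the register pair as two 16-bit words. This is only a sketch of the arithmetic with hypothetical function names, not the compiler's code: "hi" stands for r6 and "lo" for r7.

```c
#include <stdint.h>

/* High word of (hi:lo) >> 13: only the top three bits of hi remain. */
static uint16_t shr13_high(uint16_t hi)
{
    return (uint16_t)(hi >> 13);
}

/* Low word as computed by the patch 12 sequence: the low 13 bits of
   the high word move up by 3, and the top 3 bits of the low word
   fill the bottom. */
static uint16_t shr13_low_patch12(uint16_t hi, uint16_t lo)
{
    return (uint16_t)((uint16_t)(hi << 3) | (uint16_t)(lo >> 13));
}

/* Low word as computed by the patch 15 sequence: the copy was taken
   from the low word (r7) instead of the high word (r6), so the
   high word's bits never reach the result. */
static uint16_t shr13_low_patch15(uint16_t hi, uint16_t lo)
{
    (void)hi;  /* unused: this is the bug */
    return (uint16_t)((uint16_t)(lo << 3) | (uint16_t)(lo >> 13));
}
```

    With hi = 0x1234 and lo = 0x5678, the patch 12 model yields 0x91A2, matching 0x12345678 >> 13, while the patch 15 model yields 0xB3C2.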

     

    Thanks for the bug report, Mole! If anyone finds anything else, those fixes will be included too.

     

     

    • Like 7
  21. Hey everyone,

     

    I've got a new set of patches for the compiler to send out. It's pretty thin, but that's a good thing.

     

    Here's what's new:

     

    Fixed a multiplication bug reported by Chue. In some cases, the input arguments were clobbered, leading to a wrong result.

    Fixed incorrect instruction sizes. This will result in better optimized code.

    Added .size directive to calculate function sizes. This is helpful during development.

    Reduced size of the patch file. This makes it easier to understand which changes were made to the baseline.

     

    I've updated the GCC installer to use this latest patch, and added an updated "hello world" program. This program no longer needs the elf2cart tool, improves the crt0, and fixes some bugs in the vdp_copy_from_sys function.

     

    Honestly, there's not much else to say here. The compiler is pretty mature and stable at this point, but I'm always interested to hear of any problems or opportunities for improvement.

    • Like 5