
SAMS usage in Assembly


gregallenwarner

You're welcome!

 

I have thought about making fbForth 2.0 SAMS-aware; but, I would need to severely limit where the paging occurs. There is no room in low RAM and high RAM is complicated by virtue of the fact that the stack grows from high RAM down towards the dictionary, which grows up towards the stack. I suppose I could leave that up to the user's discretion after appropriate instruction and caveats.

 

...lee

I ran into a similar problem. When Chris originally gave me a 128K AMS, he wanted an XB that would work only with the AMS and swap pages in the upper 24K to allow insanely large XB programs.

 

After about a week it looked like a year-long project, so I asked Chris whether it would be better to make it work with Assembly, as an insanely large buffer, or as a RAMDISK type of device.

 

That is how RXB got BSAVE, BLOAD, and AMSBANK. It allowed RXB to load up to 960K of Assembly or have a buffer space for use in XB programs.

 

Maybe you could use the SAMS as a buffer device to store and quickly load screens at 10 times the speed of disk or hard drive access; even RAMDISK speeds are slow in comparison.


Impossible, as the SAMS support in RXB only switches the lower 8K used for Assembly. It can be used for many more functions, though, as you can see in the demos of loading graphics, games, or Assembly.

Ok, I gotcha. RXB uses the low memory address space as dedicated switching banks, and the program lives in high memory, protecting it from being accidentally switched out. Where does the RXB interpreter live? Does it require itself being run from cartridge space, or can RXB itself be loaded into memory? (I assume your portion of code that actually controls the SAMS hardware is part of the RXB interpreter, and separate from user program space.)

 

First off, why would you load code to swap SAMS banks when RXB has that feature built in and handles the management for you with CALL AMSBANK(low page,high page)?

I'm simply trying to learn how it's been done by others in the past so I can use those techniques in my ASM programming. Thanks for your explanation of how RXB does it, that's very helpful to someone like me who doesn't know a whole lot about TI-99 programming.

Just curious, are you interested in games programming?


I'm honestly not sure how memory works under the hood on the Geneve. What I do know is I was able to do what I wanted with Myarc BASIC without moving stuff around in memory like I would have had to do with SAMS and RXB.

 

BTW, just remembered: I once wrote some text (with illustrations) about the Geneve memory organisation in our Wiki, so if you are interested, have a look here: http://www.ninerpedia.org/index.php/MDOS_Memory_Management_Functions

Yes, RXB resides in GROM (or GRAM) from >6000 to >DFFF (REA resides from >E000 to >FFFF), and of course the XB ROMs are needed, as RXB's ROMs are mostly duplicates of the XB ROMs.

 

If you play around with RXB using the SAMS you can write some extremely powerful XB programs that cannot be created anywhere near as easily in normal XB with the SAMS.

As a matter of fact, doing what RXB does in some of the demos is impossible without a massive number of disk or hard drive accesses.


I’ve been thinking about this for the last day or so. To answer the original poster’s question, I think all that is needed is a trampoline routine that lets one select a bank and an address in that bank, and branch to it. A sister routine lets one return to the original bank and resume. So, very much like a BL (branch and link) instruction, but across banks. In terms of executing code in SAMS memory, this is probably all that is needed. With this approach only one 4K address range is needed; in other words, only one SAMS page needs swapping in/out at a time. A software-implemented subroutine stack would allow nesting of banked code jumps. So code in “normal” memory could call a routine in SAMS (which pages in the appropriate bank), which calls a routine in a different bank of the SAMS, and so on, and eventually it all unwinds as one would expect.

 

My approach would be:

 

* Treat the entire 24K high memory address range as standard, un-banked (non-SAMS) memory;

* Treat the first 4K of low memory as SAMS banked memory;

* Treat the second 4K of low memory as standard, un-banked memory.

 

The second 4K of low memory would hold:

* The trampoline routine

* The return routine

* The subroutine stack

 

The trampoline and return routines are likely to be only 150-200 bytes. They would be located at the start of the second 4K block. Then you want some subroutine stack space. This would be placed at the *end* of the second 4K block of low memory, and “grow” towards lower memory addresses. The space in between the routines and the stack is free memory for general use. Just leave enough space for your subroutine stack to grow into!

 

The subroutine stack would occupy 4 bytes per entry:

* 2 bytes for the return address;

* 2 bytes for the return bank (somewhat wasteful, but faster).

 

This is somewhat wasteful, but in reality, the subroutine stack would never get very large. Even if you had nested subroutine calls some 10 levels deep, you’d only consume 40 bytes of stack space.

 

I’m fairly sure there’s code in the SAMS docs that does pretty much what I’ve described above. The disk that accompanies the SAMS card has quite a lot of DV80 source code. There are no documents, unfortunately; you have to read the source code (which is commented).


Curiosity always kills the cat!

 

Here's something I just put together. Untested. I won't be able to test this for a while. If someone wants to try it/debug it please go ahead!

 

It implements a subroutine stack, as per my post above. The bank is also stored on the stack, so you can jump between banks and nest as much as you like. Upon return it restores the previous bank and resumes execution from the instruction following the BL instruction, just as one would expect.

 

Example:

 

        BL @GOBANK
        DATA 7
        DATA >20FE
next  ... ... ...

Jumps to address 20FE in bank 7.

 

The subroutine at >20FE in bank 7 would exit with a:

 

        B @RETBNK

 

Which will restore the *previous* bank and resume at address next just like a normal BL instruction does.

 

Note that because banks can be endlessly nested, it's only necessary to have one 4K page reserved as a SAMS bank-switched area. The rest of the memory is normal un-paged memory.

 

Before using, the AMS memory should be initialised with a call to AMSINI.

 

 

        aorg >3000          ; second 4K block of low memory
       
substk  data >4000          ; subroutine stack pointer
curbnk  data 2              ; >2000 is SAMS bank 2 as initialised by AMSINI (>4004 receives >0202)
        ; standard AMS Initialisation routine
amsini  li   r12,>1e00      ; ams cru base
        sbo  0              ; turn on ams
        li   r1,>feff       ; (this is ->0101)
        li   r0,>4000       ; start of memory
amslp   ai   r1,>0101       ; add 1 page
        mov  r1,*r0+        ; move 2 bytes to mem-mapper
        ci   r0,>4020       ; all done?
        jlt  amslp          ; no, init more
        sbz  0              ; hide the mapper registers again
        rt                  ; return
        ; call to an address addr in bank bank
        ; example:  BL @GOBANK
        ;           DATA bank  ; the SAMS bank to map into >2000 area
        ;           DATA addr  ; the address to call
        ;
        ; Uses R0, R1, R2, R11 and R12 (R12 via DOBANK)
gobank  mov *r11+,r0        ; get bank number
        mov *r11+,r1        ; get address to call
        mov @substk,r2      ; get subroutine stack address
        ai r2,-4            ; make space for another entry on stack
        mov r2,@substk      ; store new stack pointer address
        mov @curbnk,*r2     ; place current bank on stack
        mov r11,@2(r2)      ; place return address on stack
        mov r0,@curbnk      ; make new bank the current bank
        bl @dobank          ; perform the bank switch
        b *r1               ; branch to the target address
       
        ; return from a subroutine in a bank
        ; the previous bank is restored to the >2000 area and the program
        ; resumes from the address after the call. Subroutines should call
        ; this routine with a simple B @RETBNK
        ; uses R0, R1, R11 and R12 (R11 and R12 via DOBANK)
retbnk  mov @substk,r1      ; get subroutine stack pointer
        mov *r1,@curbnk     ; load current bank
        mov *r1,r0          ; get bank in r0 for DOBANK routine
        bl @dobank          ; perform the bank switch
        mov @2(r1),r0       ; get address to return to
        ai r1,4             ; pop bank and address from stack
        mov r1,@substk      ; update stack pointer
        b *r0               ; return
       
       
        ; subroutine to perform bank switch
        ; maps the bank in R0 to >2000
        ; uses R0 and R12
dobank  li r12,>1e00        ; cru address of SAMS memory mapper
        sbo 0               ; enable access to mapper registers
        swpb r0             ; get bank in high-byte
        mov r0,@>4004       ; page bank in at >2000
        sbz 0               ; hide the mapper registers again
        sbo 1               ; make sure mapping mode is on
        rt

 

:)


Thanks for the routines, Willsy! I'm definitely learning a lot from this. Namely, to decide on one 4k page to be designated the sole bank switcher. I was trying to wrap my mind around writing a memory manager that would keep track of many banking pages all over the TI's memory space.

 

Unfortunately, I don't have RXB, and I don't have a GRAM device either, so I'd have to build a GROM emulator if I were to try and download RXB and give it a try. However, I'm very comfortable in ASM, so as long as I'm following the proper techniques, I feel I shouldn't have any trouble.

 

RasmusM, I have a few games ideas, but this isn't directly related to them. I probably won't attempt writing any games for quite some time.

 

Thanks again to everyone who's posted some suggestions. They're all starting to paint a better picture in my head of how things should be done in the TI.


Another possibility that I’m thinking about is logical addressing to physical address conversion. This would work for up to 64K (16 banks) of SAMS memory at a time.

 

What I’m thinking is this: Think of the SAMS memory as 64K of memory, with a linear address space (0-FFFF). You supply a logical memory address (0-FFFF) to a subroutine and it maps the appropriate bank into memory for you, and gives you the physical address, that is, the address where you can find it in TMS9900-accessible memory.

 

Something like:

 

 

LI R0,>FACE ; we want this address in SAMS memory
BL @GET_IT

 

At this point, the appropriate bank has been mapped into memory at >2000 and R0 will hold the address of >FACE within that 4K bank (which will obviously be an address between >2000 and >2FFF). So, we *think* of the SAMS as simply linear blocks of 64K, and our code can effectively be written like that too; the address translation is abstracted away from our application code by the address translation subroutines.

 

Of course, it would be expensive to use the address translation subroutines all the time, but in reality you’d seldom need to. For example, if you want to read a string at address >1234 (in virtual memory) and it’s 40 bytes long, you only need one call to get the physical address and map the appropriate bank into memory. Subsequent accesses would be normal memory MOVes.

 

How cool would that be? Thoughts?

 

Of course, there’s 1024K in a SAMS card, so it would be trivial to expand it to conceptualise 16 pages of 64K each (16x64=1024). All mapped into a small 4K window at >2000. I have a feeling that this is exactly what expanded memory was on early PC platforms, though I could be wrong.

 

Anyone fancy having a go at coding it? If it’s more than 30 lines of assembly I’d be surprised!
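For what it's worth, here is a rough, untested sketch of how such a GET_IT routine might look. It assumes the SAMS has already been initialised and mapping mode switched on, that the 64K region starts at SAMS page >10 (the BASEPG word is my own invention, chosen so the region stays clear of the pages that normally back the 32K), and that the calling code does not itself live in the >2000 window. It uses R0, R1 and R12.

```asm
        ; sketch only: logical address in r0 -> physical address in r0
        ; basepg is a hypothetical name, >0010 an assumed starting page
basepg  data >0010          ; first SAMS page of the 64K region
get_it  mov  r0,r1          ; copy the logical address
        srl  r1,12          ; top 4 bits -> 4K page index 0..15
        a    @basepg,r1     ; add the first page of the region
        swpb r1             ; mapper wants the page number in the high byte
        li   r12,>1e00      ; SAMS CRU base
        sbo  0              ; expose the mapper registers at >4000
        mov  r1,@>4004      ; map the page into the >2000 window
        sbz  0              ; hide the mapper registers again
        andi r0,>0fff       ; keep the 12-bit offset within the page
        ai   r0,>2000       ; r0 = address inside the window
        rt
```

With that assumed base page, calling it with R0 = >FACE would map page >1F at >2000 and hand back >2ACE. An access that runs past the end of a 4K page (say a 40-byte string starting at >1FF0) would still need a second call for the part that falls in the next page.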

 


The Geneve does some similar operations (during XOP routine calls) to determine where the routine was called from prior to banking in buffers and other memory pages. It works pretty slick. There are also routines for page mapping, allocating pages, etc. If someone does pursue your coding challenge, the Geneve's memory XOP source code may offer some potential tricks and ideas to take things further.


On 9/3/2014 at 1:33 PM, Willsy said:

If I recall correctly we can't XOP on the 4A, yes? Something to do with address bus cycle decoding?

 

Not so! Here're a couple of excerpts from the E/A Manual explaining that XOP 2 is available and that XOP 1 may also be available:

 

7.19 EXTENDED OPERATION--XOP
 
... This instruction is on all TI-99/4A Home Computers. However, some only support XOP 2 while others support both XOP 1 and XOP 2. To find out if your TI-99/4A computer supports the XOP 1 instruction, run CALL PEEK in TI BASIC and read one word at address >44. If the word is >FFD8, then XOP 1 is available. If it contains other data (most likely >FFE8), then XOP 1 is not available. ...
 
... XOP 1 is at address >44, with vectors >FFD8 and >FFF8. XOP 2 is at address >48 with vectors >83A0 and >8300. The first entry in the vector is the new workspace address. The second entry is the new Program Counter address. ...
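
The same check can be made from assembly instead of CALL PEEK. A minimal sketch, assuming nothing beyond the vector values quoted above; the names CHKXOP, CHKX1 and XOP1OK are made up for the example:

```asm
        ; sketch: does this console's ROM carry the XOP 1 vector?
xop1ok  data 0              ; hypothetical flag, >ffff if XOP 1 is usable
chkxop  clr  @xop1ok        ; assume XOP 1 is not available
        mov  @>0044,r0      ; first word of the XOP 1 vector (new workspace)
        ci   r0,>ffd8       ; the value the manual says to look for
        jne  chkx1          ; anything else: leave the flag at 0
        seto @xop1ok        ; vector is >ffd8/>fff8, so XOP 1 can be used
chkx1   rt
```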

 

...lee

Can you post one of the simpler Geneve examples so that I can get a flavour of how it works and how a similar system might work?

 

Thanks

Just think of XOP (a software trap) as a glorified BLWP that performs context switching based on a WS and PC vector. If you look at the 9900/9995 manuals, you'll see the XOP vectors start at 0x0040, which is ROM on the TI and RAM on the Geneve.

 

When I converted some of the MDOS OS library routines for my own use, I just had to modify the calls with my own WS/PC. I suppose calling them "XOP routines" may be misleading, as the only XOP "code" is the context switch to get to the routines.

The XOP instruction works, but because the workspace and address vectors it uses are hard-coded in the ROM on the 4A, it is of limited practical use. XOP 1 and XOP 2 could be used if you put the code and workspace where the hard-coded values say, but the other XOP vector addresses are used for code in the ROM.


XOP is the TMS equivalent of a syscall. I did not see the parallels until I understood the mechanisms of system calls. The TMS99000 processor enters privileged mode when an interrupt occurs, an XOP is executed, or a Macrostore operation is performed. The TMS9900 and 9995 processors do not have a privileged mode, but they can, in principle, find out whether an XOP has been used to call some code by checking the XOP bit (ST6) in the status register.

 

BTW, the GeneveOS only uses XOP 0. The different functions are selected by argument passing. (I should reformulate some paragraphs in my text on ninerpedia, since it seems as if higher XOPs are used, although only the parameter is set to the specific value.)

 

Thus the XOP on the TI is not useless, since you can install a handler at >FFF8 (the XOP 1 entry point) which determines the desired function.
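
As a rough illustration of that idea (untested, and only applicable on consoles where the word at >44 is >FFD8): the XOP 1 vector gives a workspace at >FFD8 and an entry point at >FFF8, so a branch placed at >FFF8 can hand control to a handler elsewhere. The handler address >A000, the names XOPHDL, PAGE and DEMO, and the choice of "map this SAMS page at >2000" as the function are all assumptions for the example. It also assumes the SAMS is initialised with mapping on, and that the memory at >A000 and >F000 is never switched out.

```asm
        aorg >fff8          ; entry point named by the XOP 1 vector
        b    @xophdl        ; 4 bytes, stops short of the LOAD vector at >fffc

        aorg >a000          ; assumed home for the handler (un-banked memory)
xophdl  mov  *r11,r0        ; r11 = address of the caller's operand; fetch page
        swpb r0             ; page number into the high byte
        li   r12,>1e00      ; SAMS CRU base (handler's own r12)
        sbo  0              ; expose the mapper registers
        mov  r0,@>4004      ; map the requested page at >2000
        sbz  0              ; hide the mapper registers again
        rtwp                ; back to the caller's workspace and PC

        ; caller side, entered with BL @DEMO
page    data 7              ; SAMS page wanted at >2000
demo    xop  @page,1        ; handler sees the address of PAGE in its r11
        rt
```

Because XOP switches workspace like BLWP does, the caller's registers (including its R11 and R12) are left alone, which is why DEMO can still return with a plain RT.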

Some people in the TI community also make an RXB cartridge for real hardware, as well as other versions.

The TI99/4A is so slow that I doubt the Geneve way of doing this would work effectively.

I also looked into doing this, as I originally stated, and the problem was that the more often I moved pages around, the slower XB would get.

Yes, I would have been able to have monstrous XB programs, but the cost was going to be complexity and slowness.

 

So I opted for the USER to load unlimited Assembly or use the SAMS as a gigantic buffer or graphics library as some of my RXB demos do.

I would give the TI a bit more credit -- the memory paging operations are fairly simple. Even many of the disk controllers have memory paging of their own to flip between ROM and RAM "pages". The TI would be more than fast enough to handle simple memory management. I just offer the Geneve routines as fodder for anyone interested...

This makes perfect sense to me! I have been bemoaning the TI's lack of a privileged mode and system calls, but now that I understand how XOP works, it looks like it could serve my purposes perfectly!

 

I think I'm going to put the code that handles bank switching in an XOP routine, so effectively the XOP instruction will serve as an "extended fetch", reaching into the SAMS to fetch the desired data. Banking will be handled automatically by the XOP implementation routine.

 

Thanks again, everybody. I'm learning quite a lot.


How would you assemble the bank switched code and load it into SAMS during your test cycles? I don't think we have any tools to assist with that.

I've also been thinking about that. Untested theory: If a user first enabled the memory mapper registers and left them enabled (by using a tiny little assembler routine) then the standard #3 loader could be used.

 

By using AORG and DATA statements in your source you could actually change memory banks:

 

 

AORG >4004
DATA >0303
...code...
...code...
...code...
 
AORG >4004
DATA >0404
...code...
...code...
...code...

Etc...
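
The "tiny little assembler routine" could be as small as the sketch below. It is just as untested as the theory itself. It assumes the mapper registers already hold sensible values (for example from an AMSINI-style initialisation), and SAMSON is a made-up name. I have not checked how the exposed registers interact with a disk controller's DSR ROM, which also appears at >4000 while the loader is reading the file.

```asm
        ; sketch: run once before loading banked code with the loader
samson  li   r12,>1e00      ; SAMS CRU base
        sbo  0              ; expose the mapper registers at >4000 and leave them on
        sbo  1              ; switch mapping mode on
        rt
```

When loading has finished, an SBZ 0 with the same CRU base would hide the registers again.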

Ah! Thanks for the clarification, Lee. Yes... Hmmm.... Interesting... Might serve nicely as a cheaper alternative to BLWP - potentially a little more compact, as it's a 2-byte instruction when the operand is a register, whereas BLWP @addr is a 4-byte instruction plus a 4-byte vector?

 

Does anyone know what the distinction is with respect to XOP1 working/not working? Is it 99/4's, or perhaps the V2.2 99/4A's? Anyone know?
