This is getting way deeper than I ever imagined when I started!
So let me see what we've got so far. Are we still saying that XOP is slower than BLWP? If that's the case, I'd rather go with BLWP, since passing arguments to a BLWP routine isn't inherently more difficult than passing arguments to an XOP routine. Different, but not harder. Speed is a big concern of mine.
Basically, here's what I'm attempting to do: I want to implement a Heap memory structure. I've read the C stdlib source code and found out how malloc() and free() work in the context of a single contiguous block of memory. Of course, malloc() and free() normally run in a virtual memory space provided by the operating system, which we don't have on the TI. But working with hard-coded addresses, I believe I could implement a Heap structure on the TI using standard memory.
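To make the idea concrete, here's a rough sketch in C of a length-prefixed heap living in one fixed block. This is only an illustration under my own assumptions, not how any particular stdlib does it: the names heap_init, heap_alloc and heap_free, the 8K region size, the 2-byte header, and the first-fit search are all made up for the example, and the static array just stands in for a hard-coded address range.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE 8192u   /* assumed size of the fixed region */
#define USED_BIT  0x8000u /* top bit of the 16-bit header marks "in use" */

static uint8_t heap[HEAP_SIZE]; /* stands in for a hard-coded address range */

/* Every block starts with a 2-byte big-endian header holding the payload
   length (always even), with USED_BIT set while the block is allocated. */
static uint16_t hdr_get(size_t off)
{
    return (uint16_t)((heap[off] << 8) | heap[off + 1]);
}

static void hdr_set(size_t off, uint16_t v)
{
    heap[off] = (uint8_t)(v >> 8);
    heap[off + 1] = (uint8_t)v;
}

/* Start with one big free block covering the whole region. */
void heap_init(void)
{
    hdr_set(0, (uint16_t)(HEAP_SIZE - 2u));
}

/* First-fit allocation. Returns the address just past the length prefix,
   or NULL if no free block is large enough. */
void *heap_alloc(uint16_t n)
{
    size_t off = 0;

    if (n == 0 || n > HEAP_SIZE - 2u)
        return NULL;
    n = (uint16_t)((n + 1u) & ~1u); /* round up to a whole word */

    while (off < HEAP_SIZE) {
        uint16_t h = hdr_get(off);
        uint16_t len = (uint16_t)(h & ~USED_BIT);

        if (!(h & USED_BIT)) {
            /* Merge any free blocks that directly follow this one. */
            size_t next = off + 2u + len;
            while (next < HEAP_SIZE && !(hdr_get(next) & USED_BIT)) {
                len = (uint16_t)(len + 2u + hdr_get(next));
                next = off + 2u + len;
            }
            hdr_set(off, len);

            if (len >= n) {
                if (len - n >= 4) {
                    /* Split: the tail becomes a new free block. */
                    hdr_set(off + 2u + n, (uint16_t)(len - n - 2u));
                    hdr_set(off, (uint16_t)(n | USED_BIT));
                } else {
                    /* Too small to split: hand out the whole block. */
                    hdr_set(off, (uint16_t)(len | USED_BIT));
                }
                return &heap[off + 2u];
            }
        }
        off += 2u + len;
    }
    return NULL;
}

/* Freeing just clears the "in use" bit; merging happens lazily in heap_alloc. */
void heap_free(void *p)
{
    size_t off;

    if (p == NULL)
        return;
    off = (size_t)((uint8_t *)p - heap) - 2u;
    hdr_set(off, (uint16_t)(hdr_get(off) & ~USED_BIT));
}
```

Call heap_init() once, then heap_alloc() and heap_free() as you would malloc() and free(). I picked lazy merging inside heap_alloc only to keep heap_free trivial; a real version might well coalesce at free time instead.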
Now I want to consider using the SAMS as well. malloc() returns an address pointing to the beginning of the allocated chunk of memory, immediately following the length prefix, so in the SAMS case, I believe I can replace this return value with an address/bank pair. I will need to write a fetch routine to operate on this address/bank pair, so that the application software can be agnostic about where in memory its data lies. The memory management routine will take care of the mapping/banking for it.
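Roughly what I picture for the address/bank pair and the fetch routine, again sketched in C. Everything here is hypothetical: far_ptr, map_page, far_read and far_write are names I invented, the arrays only simulate the expansion memory and the mapped window, and on real hardware map_page would have to program the SAMS mapper instead of swapping a C pointer.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u /* one 4K page */
#define NUM_PAGES 8u    /* assumed small page count, just for the sketch */

/* What the allocator would hand back instead of a plain address. */
typedef struct {
    uint16_t page;   /* which page (bank) the block lives in */
    uint16_t offset; /* offset of the payload within that page */
} far_ptr;

static uint8_t backing[NUM_PAGES][PAGE_SIZE]; /* simulated expansion memory */
static uint8_t *window = backing[0];          /* simulated mapped 4K window */
static uint16_t mapped = 0;                   /* page currently in the window */

/* Hypothetical mapping step: only remap when the page actually changes. */
static void map_page(uint16_t page)
{
    if (page != mapped) {
        window = backing[page];
        mapped = page;
    }
}

/* Fetch byte i of the block. The caller must keep offset + i inside the page. */
uint8_t far_read(far_ptr p, uint16_t i)
{
    map_page(p.page);
    return window[p.offset + i];
}

/* Store byte i of the block, with the same in-page restriction as far_read. */
void far_write(far_ptr p, uint16_t i, uint8_t v)
{
    map_page(p.page);
    window[p.offset + i] = v;
}
```

The application only ever passes a far_ptr and an index, so it never needs to know which page is mapped at any moment.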
Of course, this means I'll have a maximum allocation size of 4K when requesting a block from malloc(): if the memory management routine is doing all the mapping, the application software won't be able to tell when it's reached the end of a page. I'll have to think on this some more.
I'm sort of a computer science geek, so I like data structures, structured programming and things like that. I like the notion of having a managed Heap of memory that I can allocate and deallocate at will, which helps me avoid memory leaks. Just in case you guys wanted to know what I'm learning all this for!