Methods for speeding up CALL LINK

senior_falcon · July 25, 2017

In doing some speed tests I found some surprising behavior when comparing CALL LINKs to equivalent CALLs in XB and TI BASIC. CALL HCHAR(10,10,42) takes about .0452 seconds in TI BASIC (with MiniMemory or EA) and .0178 seconds in XB. No surprise there; any CALL in BASIC is much slower than in XB. The surprising thing is that the equivalent CALL LINK("HCHAR",10,10,42) is much slower in either language. In BASIC it takes about .067 seconds (vs .0452); in XB it takes about .0412 seconds (vs .0178). I have been doing some investigation to find out why this is so and to see if there are ways to speed up CALL LINK.

Let's start by describing how I did these benchmarks. (BASIC will be on the left, Extended BASIC will be on the right)

Run this program for 60 seconds:

10 x=x+1

20 GOTO 10

11058 loops in BASIC, 8838 loops in XB (yes, BASIC is faster here)

60/11058 = .00543 sec/loop in BASIC, 60/8838 = .0068 sec/loop in XB

Now add an HCHAR to the program and run for 60 seconds:

10 X=X+1

15 CALL HCHAR(10,10,42)

20 GOTO 10

1186 loops in BASIC, 2442 loops in XB

60/1186 = .0506 sec/loop in BASIC, 60/2442 = .0246 sec/loop in XB

Subtract the time needed to increment the counter (lines 10 and 20) and you get the time to do one HCHAR:

.0506 - .00543 = .0452 sec in BASIC, .0246 - .0068 = .0178 sec in XB

Taking this further:

CALL SCREEN(7) takes .033 sec in BASIC, .0086 sec in XB

CALL HCHAR(10,10,42,2) takes .0499 sec in BASIC, .0223 sec in XB

It seems that more numbers make it run slower, and from the above we can estimate that to perform just the CALL takes about:

.0274 seconds in BASIC, .004 seconds in XB (!)

Passing each number and acting on it takes about:

.0056 sec/number in BASIC, .0046 sec/number in XB

Let's see if we can use this to get an estimate of how fast a different CALL will be (XB only here):

15 CALL SPRITE(#1,65,16,100,150,10,10,#2,66,15,110,160,10,10)

There are 14 numbers if you count #1 and #2 as numbers, so:

.004 (CALL)+ 14*.0046 (14 numbers) + .0068 (counter) = .0752 seconds/loop

In one minute it should loop 60/.0752 or 798 times

Actually running the program gave 799 loops for a very close agreement. So it looks as if you can quickly get a reasonable estimate of how long a given CALL will take.
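
The estimate above can be written as a small model. This is only a sketch in Python (not something that runs on the TI); the constant and function names are made up for illustration, and the figures are the XB measurements quoted above.

```python
# Timing model built from the XB measurements quoted in the post.
# All names here are illustrative, not part of any TI tool.
CALL_OVERHEAD = 0.004   # seconds for the bare CALL
PER_NUMBER = 0.0046     # seconds for each number passed
COUNTER_LOOP = 0.0068   # seconds for lines 10 and 20 (counter and GOTO)


def estimate_loop_time(n_numbers, overhead=CALL_OVERHEAD,
                       per_number=PER_NUMBER, counter=COUNTER_LOOP):
    """Estimated seconds per loop for one CALL taking n_numbers arguments."""
    return overhead + n_numbers * per_number + counter


def loops_per_minute(loop_time):
    """Number of loops expected in a 60 second run, rounded."""
    return round(60 / loop_time)


sprite_time = estimate_loop_time(14)   # the CALL SPRITE line with 14 numbers
print(f"{sprite_time:.4f} sec/loop, about {loops_per_minute(sprite_time)} loops")
```

Swapping in the TI BASIC figures (.0274 and .0056, with .00543 for the counter) should give the same kind of estimate for BASIC.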

Now let's take a look at CALL LINK using the same methods:

To perform a CALL LINK and immediately return to BASIC takes about:

.0421 seconds in BASIC, .0154 seconds in XB

To pass a number using NUMREF takes about:

.00836 seconds in BASIC, .0086 seconds in XB

To pass a string using STRREF takes about:

.0192 seconds in BASIC, .00895 seconds in XB

You can see the problem. To do just the CALL LINK takes virtually the same amount of time as CALL HCHAR(10,10,42). Worse, passing numbers to CALL LINK using NUMREF is much slower than in a BASIC or XB CALL. What to do...

**********************************************************************************************

Since each number passed takes .00836 or .0086 seconds, is it possible to do the same thing using fewer numbers? How about combining two numbers into one:

Row=2 Col=4 could become 204 (not 24)

Character 112 with 99 repeats could become 11299 (You couldn't get more than 99 repeats, but by then the speed of assembly should have outpaced BASIC.)

You should provide the option of passing the numbers separately if that is necessary. Here is a snippet of code that will get 1 or 2 numbers for HCHAR and combine them into the screen address:

------------------------------------------------------------------------------

HCHAR1 BLWP @ST2NUM       R15 points to numberstring in arg list; get 1st number into R6
*GETARG has put 100 into R3
       C    R6,R3         are row & column combined in 1 number?
       JLT  NOTCMB        no, separate rows and columns
       CLR  R5
       DIV  R3,R5         divide by 100; now ROW is in R5, COL in R6
       JMP  ADDRC
NOTCMB MOV  R6,R5         ROW is in R5
       BLWP @ST2NUM       get column into R6
ADDRC  SLA  R5,5          x32
       A    R6,R5
       AI   R5,-33        now ROW and COL are combined and point to the VDP address

-------------------------------------------------------------------------------
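
For readers who don't follow the assembly, here is the same decoding logic as a Python sketch. It is only a model of the snippet above: the function name is made up, and the argument list is stood in for by a plain iterator.

```python
def next_screen_offset(args):
    """Take one or two numbers from the argument iterator and return the
    screen offset for a 1-based row and column.

    A first number of 100 or more is treated as row*100+col, mirroring
    the C R6,R3 / DIV R3,R5 path; otherwise a second number is fetched.
    """
    first = next(args)
    if first >= 100:
        row, col = divmod(first, 100)   # quotient = row, remainder = column
    else:
        row, col = first, next(args)    # separate row and column
    return (row << 5) + col - 33        # row*32 + col - 33


print(next_screen_offset(iter([204])))    # combined form, row 2 col 4
print(next_screen_offset(iter([2, 4])))   # separate form, same cell
print(divmod(11299, 100))                 # character 112 with 99 repeats
```

The -33 comes from both row and column being 1-based: (row-1)*32 + (col-1) is the same as row*32 + col - 33.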

*************************************************************************************

CALL LINK is pretty slow at .0421 or .0154 seconds. So instead of this:

15 CALL LINK("HCHAR",7,7,65)::CALL LINK("HCHAR",8,8,66) (.132 or .0824 seconds)

how about:

15 CALL LINK("HCHAR",7,7,65,8,8,66) (.092 or .067 seconds)

or better yet:

15 CALL LINK("HCHAR",707,65,808,66) (.076 or .050 seconds)

You can pass up to 16 arguments with a CALL LINK, enough for 8 HCHARs if you combine the numbers. The assembly subroutine will keep track of the position in the argument list and return to BASIC/XB when at the end of the list; otherwise keep looping till done.
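
The times quoted for the three variants can be roughly reproduced from the measured CALL LINK and NUMREF costs. This is a Python sketch of that arithmetic only; the names are invented for illustration, and the model lands within a couple of milliseconds of the figures above rather than matching them exactly.

```python
# Measured costs quoted in the post, in seconds.
LINK_OVERHEAD = {"basic": 0.0421, "xb": 0.0154}   # bare CALL LINK and return
NUMREF_COST = {"basic": 0.00836, "xb": 0.0086}    # one number via NUMREF


def link_time(calls, numbers, lang):
    """Estimated time for `calls` CALL LINKs passing `numbers` numbers in total."""
    return calls * LINK_OVERHEAD[lang] + numbers * NUMREF_COST[lang]


VARIANTS = {
    "two CALL LINKs, 3 numbers each": (2, 6),
    "one CALL LINK, 6 numbers": (1, 6),
    "one CALL LINK, 4 combined numbers": (1, 4),
}

for name, (calls, numbers) in VARIANTS.items():
    basic = link_time(calls, numbers, "basic")
    xb = link_time(calls, numbers, "xb")
    print(f"{name}: BASIC {basic:.4f} s, XB {xb:.4f} s")
```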

************************************************************************************************

It takes much less time to pass a string with STRREF than it does to perform a CALL LINK. So instead of:

15 CALL LINK("HCHAR",1,1,42,32)::CALL LINK("VCHAR",1,32,42,24) (.151 or .0996 seconds)

how about:

15 CALL LINK("HCHAR",101,4232,"VCHAR",132,4224) (.095 or .0588 seconds)

The assembly subroutine knows how many numbers to retrieve for each loop. If it sees a string where there should be a number (VCHAR in the above example) then it knows not to loop because a subroutine name is being passed. It would get the string, look it up in the DEF table and get the address, same as BASIC or XB would. But then it has to add 8 to the address in XB or 6 to the address in TI BASIC so it doesn't perform GETARG again, which would reset the place in the argument list. Here is a snippet of code:

----------------------------------------------------------------------------------

*Each A/L sub must start as follows:
HCHAR  LWPI WKSP          Need to load workspace in XB
       BL   @GETARG       get the number of arguments into R8
*Or in TI BASIC:
HCHAR  MOV  R11,R12       store return address
       BL   @GETARG       get the number of arguments into R8
*GET THE NUMBER OF ARGUMENTS IN LIST AND PUT IT IN R8
GETARG MOV  @>8312,R8
       SRL  R8,8          R8 contains the # arguments.
       CLR  R1            R1 is the position in the argument list. When R1>R8 then all done, return to BASIC
       LI   R3,100        for use later on
       B    *R11

-----------------------------------------------------------------------------------
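
The walk through the argument list described above can be modelled in Python. This is a hedged sketch, not the actual routine: the function name is invented, the list includes the routine name that BASIC itself consumes, and it assumes every routine takes two combined numbers per call.

```python
def split_link_args(args, per_call=2):
    """Group a CALL LINK style argument list into (routine, numbers) items.

    args[0] is the routine named in the CALL LINK. A later string switches
    to another routine without restarting the walk, which is the role of
    skipping GETARG in the assembly version.
    """
    routine = args[0]
    work = []
    position = 1                      # like R1: place in the argument list
    while position < len(args):
        item = args[position]
        if isinstance(item, str):     # a string where a number was expected
            routine = item            # look up the new routine, keep position
            position += 1
            continue
        work.append((routine, tuple(args[position:position + per_call])))
        position += per_call
    return work


print(split_link_args(["HCHAR", 101, 4232, "VCHAR", 132, 4224]))
print(split_link_args(["HCHAR", 707, 65, 808, 66]))
```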

**************************************************************************************

With the above techniques it should be possible to considerably speed up CALL LINK, and maybe get speeds approaching twice as fast. But by thinking outside of the box we can get speed increases of up to 10x or more!

What is the biggest bottleneck when using CALL LINK? The time spent passing numbers and strings to the assembly subroutines. Could we do something like this to get 4 HCHARs using combined rows and columns:

15 CALL LINK("HCHAR")

16 REM 707,65,808,66,909,67,1010,68

If HCHAR can figure out where the REM line is then maybe the numbers could be read and converted directly.

(The code and addresses I give will be for TI BASIC with MiniMemory. This should be possible with XB as well, but I will leave it to others to figure out the details. In XB the program line is even neater:)

15 CALL LINK("HCHAR") !707,65,808,66,909,67,1010,68

When CALL LINK is performed and the assembly subroutine starts, >832E points to an entry in the line number table that points to the line being executed. But we don't want that line; we want the next line with the REM statement. To find it, subtract 4 from the address in >832E. Read the 2 bytes from that address in VDP ram. That points to the start of the REM line. Subtract 1 and it points to the length byte of the REM line. Read the length byte and now you know how many bytes to read from VDP and where to read them from. Use VMBR or equivalent to read the REM line. My routine strips any trailing spaces from the line (BASIC likes to tack on extra ones) and sets a pointer to the 1st non space character in the REM statement. Then you can read the characters, using commas as separators, and convert them from strings into numbers.
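
The pointer chasing described above can be sketched in Python against a fake block of VDP RAM. Everything here is a model under stated assumptions: the function names and the addresses in the demo are invented, and the fake line holds only the text, whereas a real tokenized line presumably also carries the REM token and a terminator that this sketch ignores.

```python
def read_word(vdp, addr):
    """Read a big-endian 16-bit word from the fake VDP RAM."""
    return (vdp[addr] << 8) | vdp[addr + 1]


def fetch_next_line(vdp, current_entry_ptr):
    """Return the raw bytes of the line after the one being executed.

    current_entry_ptr models the address held at >832E. Four bytes below
    it is the pointer to the next line; one byte below that line's start
    is its length byte.
    """
    line_start = read_word(vdp, current_entry_ptr - 4)
    length = vdp[line_start - 1]
    return bytes(vdp[line_start:line_start + length])


def parse_numbers(raw):
    """Strip leading/trailing spaces, split on commas, convert to integers."""
    text = raw.decode("ascii").strip(" ")
    return [int(field) for field in text.split(",")]


# Demo with made-up addresses: line text at >3000, table entry at >3700.
vdp = bytearray(0x4000)
text = b" 707,65,808,66  "
vdp[0x2FFF] = len(text)                  # length byte just before the line
vdp[0x3000:0x3000 + len(text)] = text
vdp[0x36FC], vdp[0x36FD] = 0x30, 0x00    # pointer to the next line (>3000)
print(parse_numbers(fetch_next_line(vdp, 0x3700)))
```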

A test program that uses this technique was written that does HCHAR 18 times. It looped 765 times in 60 seconds which is .0785 seconds per loop. Subtracting the counter gives .073 seconds which is .004 seconds per HCHAR compared to XB's .0178 seconds or BASIC's .0452 seconds. (With the faster CALL LINK in XB this would be .00257 seconds per HCHAR!)
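
The per-HCHAR figures above follow from a few lines of arithmetic, shown here as a Python sketch. The variable names are made up; the constants are the measurements quoted earlier in the post, and the XB figure is a projection obtained by swapping the BASIC CALL LINK overhead for the XB one.

```python
LOOPS, SECONDS, HCHARS = 765, 60, 18
BASIC_COUNTER = 0.00543      # lines 10 and 20 in TI BASIC
BASIC_LINK = 0.0421          # bare CALL LINK in TI BASIC
XB_LINK = 0.0154             # bare CALL LINK in XB

per_loop = SECONDS / LOOPS                    # time for one full loop
per_test = per_loop - BASIC_COUNTER           # time for the CALL LINK line alone
per_hchar = per_test / HCHARS                 # average per HCHAR in BASIC
xb_projection = (per_test - BASIC_LINK + XB_LINK) / HCHARS

print(f"{per_loop:.4f} {per_test:.3f} {per_hchar:.4f} {xb_projection:.5f}")
```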

I have attached a zipped folder with HCHARTEST.TXT, HCHARTEST.OBJ, and HCHARTEST which are the source code, object code, and TI BASIC program to run the test. Please note that this program is neither complete nor polished. It is just a "proof of concept" demonstration. This runs out of TI BASIC using the MiniMemory. In the BASIC program, line 11 has some CALL LOADs that force the TI to load the object code into the MiniMemory. It wants to load the object code from disk 4. Change that as necessary.

I believe that it is possible for the assembly routine to retrieve numeric variables or string variables using the REM line. You'd have to figure out how to find them and I have not investigated that as yet. If you can find a numeric variable you should be able to send it back to BASIC the same way. I don't think it would be practical to send a string variable back to BASIC without using STRASG.

HCHARTEST.zip

Edited July 25, 2017 by senior_falcon

+InsaneMultitasker · July 25, 2017

Interesting analysis. I often push values into known memory locations, starting at 0xA000, using a CALL LOAD followed by a CALL LINK to the routine. I have never assessed the speed differences, as my primary reason for this was to save assembly space by reducing and eliminating as many NUMREF/NUMASG/CIF/CFI calls as possible.

Your last paragraph made me wonder if there would be value in creating a routine to find and save the address of a particular variable name, then use the XB variable "globally" for all subsequent calls to the routine(s). The routine could theoretically change the assignments in-program, so you weren't locked into the first set of variables.

For example,

CALL LINK("FINDVR",R,C,CHAR,REP) - find the internal addresses of the variables and store them for the "HCHAR" routine.

R,C=1::CHAR=65::REP=100::CALL LINK("HCHAR")

I'm guessing this would work best (if it works at all) in Extended BASIC with 32K, where both program and numeric variable space are stored in CPU RAM, and where the addresses are static once prescan and variable allocation have completed. I agree that STRASG is probably most efficient and practical for strings, though a similar approach to bypass STRREF might save some time.

Edited July 25, 2017 by InsaneMultitasker

RXB · July 25, 2017

CALL LINK in Basic is 100% GPL with some Assembly rarely utilized.

CALL LINK in XB is 70% GPL with more Assembly utilized, the ROMs in XB are used to speed up the process.

Both CALL LINK in Basic or XB use VDP stack to fetch and store values as they arrive. (VDP Stack is just a very slow way to do this.)

In RXB I created CALL EXECUTE(ADDRESS) that does exactly what this post has concluded needs to be done....

SO WHY IS SOMETHING I DID IN 2001 NOT BEEN DISCUSSED? (Did I just vanish from Earth?)

sometimes99er · July 25, 2017

Nice investigation. I guess the REM solution is a bit static, but that does not rule out practical use in programs and games.

senior_falcon · July 25, 2017

CALL LINK in Basic is 100% GPL with some Assembly rarely utilized.

CALL LINK in XB is 70% GPL with more Assembly utilized, the ROMs in XB are used to speed up the process.

Both CALL LINK in Basic or XB use VDP stack to fetch and store values as they arrive. (VDP Stack is just a very slow way to do this.)

In RXB I created CALL EXECUTE(ADDRESS) that does exactly what this post has concluded needs to be done....

SO WHY IS SOMETHING I DID IN 2001 NOT BEEN DISCUSSED? (Did I just vanish from Earth?)

I have noticed that people have a tendency to not use things that they can't understand. Your documentation is not clear on how to use CALL EXECUTE. All I can get it to do is crash. If you answer these questions I'll give it a try:

Is the address decimal or hex?

Where is the workspace?

How do I return from my a/l program? I tried RTWP but to no avail.

RXB · July 25, 2017

I have noticed that people have a tendency to not use things that they can't understand. Your documentation is not clear on how to use CALL EXECUTE. All I can get it to do is crash. If you answer these questions I'll give it a try:

Is the address decimal or hex?

Where is the workspace?

How do I return from my a/l program? I tried RTWP but to no avail.

The EXECUTE subprogram goes directly to the cpu-address and expects to find 4 bytes present. Bytes 1 and 2 define the workspace register address. Bytes 3 and 4 define the address to start execution at in cpu memory. Programmers can see this is a BLWP at a cpu-address. The programmer is responsible for keeping track of the workspace and program space he is using, and also for any registers while doing a BL or another context switch. A RTWP will end either a BL or a BLWP as long as the registers set are not changed.

That is the first line in the RXB docs for EXECUTE.

EXECUTE PAGE E5
 -------------------------------------------------------------
 Programs
 Line 100 initializes lower 8k | >100 CALL INIT
 Line 110 loads the assembly   | >110 CALL LOAD(9838,47,0,38,1
 program shown below. VMBR     | 14,4,32,32,44,3,128)
 Line 120 loads registers with | >120 CALL LOAD(12032,0,0,48,0
 VDP address, Buffer, Length.  | ,2,255)
 Line 130 runs line 110 program| >130 CALL EXECUTE(9838)
 Line 140 loads the assembly   | >140 CALL LOAD(9838,47,0,38,1
 program shown below. VMBW     | 14,4,32,32,36,3,128)
 Line 150 loads registers with | >150 CALL LOAD(12032,0,0,48,0
 VDP address, Buffer, Length.  | ,2,255)
 Line 160 runs line 140 program| >160 CALL EXECUTE(9838)
 Line 170 put a command in here| >170 CALL VCHAR(1,1,32,768)
 Line 180 loops to line 160    | >180 GOTO 160
 HEX ADDRESS|HEX VALUE|ASSEMBLY COMMAND EQUIVALENT
 >266E >2F00 DATA >2F00 (workspace area address)
 >2670 >2672 DATA >2672 (start execution address)
 >2672 >0420 BLWP (first executed command)
 >2674 >202C @VMBR (or >2024 VMBW)
 >2676 >0380 RTWP
 -------------------------------------------------------------
 >2F00 >0000 REGISTER 0 (VDP address)
 >2F02 >3000 REGISTER 1 (RAM buffer address)
 >2F04 >02FF REGISTER 2 (length of text)
 Normal XB using LINK.
 Initialize for Assembly. | >100 CALL INIT
 Load support routine.    | >110 CALL LOAD("DSK1.TEST")
 LINK to program.         | >120 CALL LINK("GO")
 RXB EXECUTE EXAMPLE.     |
 Initialize for Assembly. | >100 CALL INIT
 Load support routine.    | >110 CALL LOAD("DSK1.TEST")
 EXECUTE program address. | >120 CALL EXECUTE(13842)

 EXECUTE does no checking so the address must be correct.
 CALL LINK method finds the name and uses the 2 byte address
 after the name to run the Assembly. EXECUTE just runs the
 address without looking for a name thus faster.

This is an example of using RXB CALL EXECUTE(9838) and using CALL LOAD to set up the Workspace Registers instead of using CALL LINK to pass values.
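
To check the CALL LOAD byte lists in the doc excerpt against its hex table, the bytes can be paired back into 16-bit words. This is a Python sketch for verification only; the function name is made up.

```python
def bytes_to_words(values):
    """Pair consecutive CALL LOAD byte values into big-endian 16-bit words."""
    return [(hi << 8) | lo for hi, lo in zip(values[0::2], values[1::2])]


# Bytes from line 110 (VMBR version), loaded starting at 9838.
program = [47, 0, 38, 114, 4, 32, 32, 44, 3, 128]
# Bytes from line 120, loaded starting at 12032.
registers = [0, 0, 48, 0, 2, 255]

for base, data in ((9838, program), (12032, registers)):
    for i, word in enumerate(bytes_to_words(data)):
        print(f">{base + 2 * i:04X} >{word:04X}")
```

The output lines up with the table: the workspace pointer >2F00 and start address >2672 at >266E, followed by BLWP @>202C and RTWP, and the three register values at >2F00.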

Exactly what you were talking about in the original post of this thread. Once Registers are loaded Assembly can be used to keep track of values.

Look at page E4 and E5 of RXB docs, I do not see why this is not simple to follow for Assembly Programmers as it is just BLWP @address

P.S. Look at my GAME source and docs of IN THE DARK as it uses CALL EXECUTE only.

Edited July 25, 2017 by RXB

senior_falcon · July 25, 2017

Look at page E4 and E5 of RXB docs, I do not see why this is not simple to follow for Assembly Programmers as it is just BLWP @address

P.S. Look at my GAME source and docs of IN THE DARK as it uses CALL EXECUTE only.

I got as far as page 5 where you describe CALL EXEC and thought that is all you had written. No mention of BLWP on page 5. Will give it a try tonight.

Edited July 25, 2017 by senior_falcon

senior_falcon · July 26, 2017

Tonight I ran some speed tests on CALL EXECUTE, and here are my findings:

For an assembly program that does nothing but return to RXB, CALL EXECUTE(9760) takes .0142 seconds. For the equivalent CALL LINK it takes .0154 seconds. CALL EXECUTE has a very slight speed edge here.

This time for CALL LINK is the best case scenario - i.e. only one entry in the DEF table. With 33 entries in the table and the test program dead last it took .0173 seconds. So in the real world CALL EXECUTE would have a bit more of a speed advantage, but still not by a huge amount.

Let's see about loading some numbers that the assembly program can use. I will give an estimate using the method described in my original post, then the actual time.

CALL LOAD(10000,11,11,65) - The estimated time is .0224 seconds (.004 +4x.0046) and the actual time was .024 seconds. (This gives more confirmation that the method described in my original post can give a reasonable estimate of how long a CALL will take.)

Each number takes .008 seconds to load; to do it via the usual NUMREF takes .0086 seconds

How about if there are 18 numbers (plus the address) in the CALL LOAD?

The estimated time is .0914 seconds (.004+19x.0046) and the actual time was .0947 seconds. As before, the estimated and actual times are quite close.

Here each number takes .0053 seconds vs .0086 seconds for NUMREF. So if the number can be one byte then there is potential for a decent amount of speed increase.

However, if you need to pass a 2 byte word then CALL LOAD will always lose the race, taking about .0092 seconds per number vs .0086 for NUMREF, and that doesn't count the overhead of the CALL LOAD and passing the address to RXB.

My impression of CALL EXECUTE is that it would mostly be useful for some specialized niche applications like poking in a short program and then executing it. There is potential to gain a slight speed advantage, and sometimes that could be important. But the speed advantage is not very great and you lose a lot of the versatility of CALL LINK. Looking at your example programs in the docs kind of reminds me of the Commodore programs back in the 1980's which were mostly peeks and pokes and undecipherable by normal people.
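
The CALL LOAD estimates in this post can be put side by side with NUMREF in a short Python sketch. The names are invented for illustration and the constants are the XB figures quoted in the thread; it is a model of the estimates, not of the measured times.

```python
CALL_OVERHEAD = 0.004    # bare CALL in XB
PER_NUMBER = 0.0046      # each number in a CALL argument list
NUMREF_COST = 0.0086     # each number fetched with NUMREF in a CALL LINK


def call_load_time(n_values, bytes_per_value=1):
    """Estimated time for one CALL LOAD passing n_values values.

    The extra 1 is the address argument; a 2-byte word needs two numbers.
    """
    return CALL_OVERHEAD + (1 + n_values * bytes_per_value) * PER_NUMBER


def numref_time(n_values):
    """Estimated time to fetch n_values numbers with NUMREF."""
    return n_values * NUMREF_COST


for n in (3, 18):
    print(n, f"{call_load_time(n):.4f}", f"{call_load_time(n, 2):.4f}",
          f"{numref_time(n):.4f}")
```

With one-byte values CALL LOAD pulls ahead as the list grows; with two-byte words it stays behind NUMREF at every list length in this model.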

+Lee Stewart · July 26, 2017

Right now, I cannot find the thread where we discussed passing long lists of numbers in various ways. I wrote a variation on the theme where I passed an array, which allowed pretty much as long a list as we desired. I think I even avoided NUMASG in the process. That would make CALL LINK faster, no?

...lee

RXB · July 26, 2017

I got as far as page 5 where you describe CALL EXEC and thought that is all you had written. No mention of BLWP on page 5. Will give it a try tonight.

Hmmm, that is page E5. My pages are numbered by the first letter of the command, followed by the page number within that letter.

I always hated page numbers alone, as that is not how they make phone books; those use the alphabet, like a dictionary.

RXB · July 26, 2017

You can use CALL MOVES("RR",length,from address,to address) in RXB.

I include with RXB a XB program named LINESHOW that you merge into a XB program and it shows the line numbers and actual address in memory.

This is how I wrote my RXB game IN THE DARK and the assembly does not run from Lower 8K but upper 24K next to the XB program.

I use the Lower 8K for screens only using 362K for screens, I did make a version using 864K of SAMS.

I had wanted to make a section of program that was just imbedded Assembly in XB but you could no longer list the program or renumber.

See the big problem with CALL LINK is it does not work from anything but Lower 8K and does not allow using unused upper 24K RAM memory.

P.S. Just as a note I should add to RXB a report of free unused address in Upper 24K and Lower 8K so from Edit mode or Program mode use CALL SIZE.

Edited July 26, 2017 by RXB

senior_falcon · July 26, 2017

I got this far and figured that was all there was. You can see how I might think it was not enough information. Glad to find out the real docs have more detail!

By the way, I like your way of numbering the docs!

PAGE 5

RXB TO ASSEMBLY DIRECT ACCESS BY ADDRESS:
----------------------------------------------------------------
EXECUTE is much faster than the traditional LINK routine built
into XB. The main problem with LINK is it checks everything and
pushes everything onto the VDP stack. After getting to Assembly
it pops everything off the stack for use or pushes what is to
be passed to XB onto the stack. EXECUTE on the other hand just
passes a address to a 12 byte Assembly program in Fast RAM and
RTWP ends the users program. A LINK will use up 6 bytes for the
name, 2 bytes for the address and wastes time checking things.
The advantage to EXECUTE is you use LOAD or MOVE or MOVES to
place the values needed directly into the registers then do it.
EXECUTE uses less space, is faster, and is easy to debug.

Edited July 26, 2017 by senior_falcon

senior_falcon · July 26, 2017

See the big problem with CALL LINK is it does not work from anything but Lower 8K and does not allow using unused upper 24K RAM memory.

Sure you can. Take a look at "High Memory Assembly Code" in The Missing Link Docs.

RXB · July 26, 2017

Sure you can. Take a look at "High Memory Assembly Code" in The Missing Link Docs.

Ok let me put it this way, CALL LINK ONLY WORKS WITH NAMES IN LOWER 8K, you can not put the link names in Upper 24K as it never looks there unless you force it.

RXB has CALL EXECUTE that only uses address, any address so even OS sections can be used.

I suppose I should make a second version of CALL EXECUTE that uses BL @ADDRESS instead of BLWP @ADDRESS.

That way the BLWP @ADDRESS version uses RTWP to end and the BL @ADDRESS version uses RT to end.

I should mention a problem with SAMS and CALL LINK: once you change pages, CALL LINK stops being useful because the link names are now in a different page.

Thus CALL LINK is very very SAMS unfriendly to the USER.

Edited July 26, 2017 by RXB

+InsaneMultitasker · July 26, 2017

SO WHY IS SOMETHING I DID IN 2001 NOT BEEN DISCUSSED? (Did I just vanish from Earth?)

I was responding to the topic: methods for speeding up CALL LINK. RXB never even crossed my mind, nor would I have even known about the other functions in 2001 or 2017, had you not written about them.

Edited July 26, 2017 by InsaneMultitasker

senior_falcon · July 26, 2017

Ok let me put it this way, CALL LINK ONLY WORKS WITH NAMES IN LOWER 8K, you can not put the link names in Upper 24K as it never looks there unless you force it.

I should mention a problem with SAMS and CALL LINK: once you change pages, CALL LINK stops being useful because the link names are now in a different page.

Thus CALL LINK is very very SAMS unfriendly to the USER.

HMLOADER can load code completely in high memory, DEF table and all. CALL LINKs work normally. If needed, you'd have to relocate the support routines that CALL INIT loads, but you're probably doing that anyway. There is one exception: the word at >2000 or 8192 has to point to the new lookup routine in high memory and so is off limits. One other problem is easy to fix; After looking through the DEF table in high memory the routine will look for the DEF table in low memory. If interested, I can change a line of code for you so that does not happen.

RXB · July 27, 2017

HMLOADER can load code completely in high memory, DEF table and all. CALL LINKs work normally. If needed, you'd have to relocate the support routines that CALL INIT loads, but you're probably doing that anyway. There is one exception: the word at >2000 or 8192 has to point to the new lookup routine in high memory and so is off limits. One other problem is easy to fix; After looking through the DEF table in high memory the routine will look for the DEF table in low memory. If interested, I can change a line of code for you so that does not happen.

Do you have any examples of this working with SAMS?

senior_falcon · July 27, 2017

I know little about AMS other than that it bank switches RAM in 4K blocks. Can you post a short example showing how to use RXB to:

1 - switch 4k of low memory from >3000 to >3FFF

2 - switch all 8k of low memory from >2000 to >3FFF

BTW, your docs on the AMS have a couple goofs going from hex to decimal. On page AMS2 you have:

>19=31

>18=30

(I tried to copy/paste from the manual but this stupid editor keeps resetting itself)

RXB · July 27, 2017

In RXB on Classic99 or real Iron TI99/4A with RXB cart type:

CALL AMSINIT ! This is not needed in RXB 2015 but older versions of RXB needed it.

CALL AMSBANK(0,1) ! This puts SAMS 4K page 0 into >2000 to >2FFF and puts SAMS 4K page 1 into >3000 to >3FFF

CALL AMSBANK(238,239) ! This puts SAMS 4K page 238 into >2000 to >2FFF and puts SAMS 4K page 239 into >3000 to >3FFF

AMSBANK has 4K pages numbered from 0 to 239 meaning 240 of 4K pages, thus 960K of RAM for lower 8K use.

Thanks for the errors you found in my docs.

senior_falcon · July 27, 2017

That seems simple enough. How do I put a bank at just >3000->3FFF?

RXB · July 28, 2017

That seems simple enough. How do I put a bank at just >3000->3FFF?

Well the default bank for SAMS for low 8K memory is >02 for >2000 and >03 for >3000 but that is in PASS MODE.

This line is from RXB Docs:

The odd ball numbering scheme of AMSBANK results from pages
0 to 15 not being used in MAP mode. AMSBANK creates it's
own numbers of pages 0 to 239 by starting actually at page
16 of the AMS. That would be page 0 of AMSBANK. This lay out
leaves open 8 4K pages for PASS mode, and 8 4K pages for
future use. See docs MANUAL-AMS for additional information.
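
The numbering described in that doc passage can be written as a tiny Python sketch. The function and constant names are made up; the offset of 16 and the 0 to 239 range are taken from the passage, and nothing here touches real hardware.

```python
PAGE_SIZE_K = 4            # each page is 4K
FIRST_MAPPED_PAGE = 16     # AMSBANK page 0 is AMS page 16, per the docs
AMSBANK_PAGES = 240        # AMSBANK pages are numbered 0 to 239


def ams_page(amsbank_page):
    """Translate an AMSBANK page number into the underlying AMS page number."""
    if not 0 <= amsbank_page < AMSBANK_PAGES:
        raise ValueError("AMSBANK pages run from 0 to 239")
    return amsbank_page + FIRST_MAPPED_PAGE


print(ams_page(0), ams_page(239), AMSBANK_PAGES * PAGE_SIZE_K)
```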

Basically you are asking to use PASS MODE and MAP MODE at same time.

It can be done but is a pain in butt to do as you are not using CALL AMSBANK much.

Why would you need the PASS MODE memory?

If you have something to load just do it in with CALL AMSBANK why do you need PASS MODE at all?

CALL AMSON ! needed to turn on CRU for mappers to be peeked or poked.

CALL PEEK(16388,A) ! is Lower 8K >2000 to >2FFF mapper.

CALL PEEK(16390,B) ! is Lower 8k >3000 to >3FFF mapper.

CALL AMSOFF ! needed to turn off CRU for mappers so no DSR conflicts result.

The above lines work in PASS MODE or MAP MODE and you can override RXB CALL AMSBANK with a CALL LOAD(16390,pick a page to use)

A much better way is just use RXB set up for example:

10 CALL AMSINIT

20 CALL AMSBANK(0,1) ! sets up known pages and uses mapped pages.

30 CALL INIT

40 CALL LOAD("DSK#.yourprogram")

50 CALL AMSBANK(0,2) ! RXB page 2 is not the same page as PASS MODE >02 (see above note in box)

60 CALL LOAD("DSK#.otherprogram")

70 CALL LINK("example")

80 CALL AMSBANK(0,1)

90 CALL LINK("whatever")

By the way RXB defines a page as 4K and a bank as 8K

Also I should mention that dealing with programs that use INTERRUPTS is a whole other subject; for that there is CALL ISRON(variable) and CALL ISROFF(variable)

But does not work with TML as it overrides XB totally, but does work on most other Lower 8K assembly that uses interrupts.

Edited July 28, 2017 by RXB

senior_falcon · July 28, 2017

Here's what I have found about getting SAMS to work with assembly routines:

As I suspected, there is an easy way to get SAMS to play nice with XB assembly routines. This is by partitioning the low memory so that the assembly routines are located from >2000 to >2FFF and only do the paging in SAMS from >3000 to >3FFF.

100 CALL AMSINIT::CALL AMSBANK(0,1)::CALL INIT::CALL LOAD(8196,48)

The call load tells the assembly loader that the memory ends at >3000 and not >4000 as is usual.

You can load assembly routines normally with CALL LOAD(...)

Then CALL AMSBANK(0,2) CALL AMSBANK(0,3) etc. will page in new ram from >3000 to >3FFF without affecting the assembly programs you loaded into >2000 to >2FFF.

Here's a simple little program to demo this (AMSTEST3K in the attached folder):

100 CALL AMSINIT

110 CALL AMSBANK(0,1)

120 CALL INIT

125 CALL LOAD(8196,48)

130 CALL LOAD("DSK4.AMSTEST.OBJ")

150 FOR I=1 TO 99

190 CALL AMSBANK(0,I):: CALL LOAD(16000,I):: CALL LINK("WRT3K")

200 NEXT I

This sequences through 99 pages. The WRT3K routine will fill each page (>3000 to >3FFF) with "MEMORY PAGE 01" up to "MEMORY PAGE 99". I didn't use NUMREF; it reads the number from 16000. You can watch this run in the Classic99 debugger.

It is a little more involved if you want to use the full 8K low memory. This requires embedding the assembly routines in the XB program in high memory. HMLOADAMS in the attached folder is one way to do this.

RUN HMLOADAMS and you are prompted for the names of assembly routines you would like to load. In this case I just used DSK4.AMSTEST.OBJ. The program does its thing and produces a 1 line XB program with the assembly routine(s) embedded. Then you can add XB lines as necessary. I made the following program (AMSTEST2K) that writes to both pages in low memory. (WRT2K fills the memory from >2000 to >2FFF)

3 CALL LOAD(8192,255,032,"",8198,170,85)

10 CALL AMSINIT

20 CALL AMSBANK(0,1)

30 CALL LOAD(10000,0)

40 CALL LINK("WRT2K")

50 CALL LOAD(8192,255,32,"",8198,170,85)

60 CALL LOAD(16000,1)

70 CALL LINK("WRT3K")

The CALL LOAD to 8192 tells CALL LINK where to go; 8198 tells XB that CALL INIT has happened. Without it you will get an error message.
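
Since these programs mix decimal addresses with the hex used in the text, a quick conversion helps when following along. This Python sketch uses invented helper names and only converts the numbers that appear in the listings above.

```python
def ti_hex(value):
    """Format a number the way the thread writes hex, e.g. >2004."""
    return f">{value:04X}"


def low_memory_page(address):
    """Name the 4K half of low memory an address falls in, or None."""
    if 0x2000 <= address <= 0x2FFF:
        return ">2000->2FFF"
    if 0x3000 <= address <= 0x3FFF:
        return ">3000->3FFF"
    return None


for decimal in (8192, 8196, 8198, 10000, 16000):
    print(decimal, ti_hex(decimal), low_memory_page(decimal))

# CALL LOAD(8198,170,85) writes these two bytes as one word:
print(ti_hex((170 << 8) | 85))
# CALL LOAD(8196,48) writes >30 into the high byte of the word at >2004:
print(ti_hex(48 << 8))
```

So 10000 sits in the >2000 page and 16000 in the >3000 page, which matches where WRT2K and WRT3K read their page numbers from.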

HMLOADAMS needs some work - if it can't find the a/l sub it will crash. Read the HMLOADER section of The Missing Link manual for more information about how to use this.

AMS+ASSEMBLY.zip

RXB · July 28, 2017

Very cool you can do this, but wow not for the novice nor is it easy to just write something simple to understand like for XB.

Basically you are just doing Assembly from the XB cart, so where is the XB? Because it looks like 90% Assembly with XB or BASIC as a secondary feature.

Again very cool and useful, but not the least user friendly for XB programmers which is my niche.

How does this speed up CALL LINK?

senior_falcon · July 28, 2017

Your posts led to this veering off topic.

From post #11:

"I had wanted to make a section of program that was just imbedded Assembly in XB but you could no longer list the program or renumber.

See the big problem with CALL LINK is it does not work from anything but Lower 8K and does not allow using unused upper 24K RAM memory."

From post #14:

"Ok let me put it this way, CALL LINK ONLY WORKS WITH NAMES IN LOWER 8K, you can not put the link names in Upper 24K as it never looks there unless you force it."

I have shown how to add assembly subprograms to SAMS in two different ways, one of which is no different than the normal CALL LOAD method, and still allow access to different pages in SAMS without overwriting the assembly subprograms. Whether you or anyone else finds this useful is not for me to say.

Edited July 28, 2017 by senior_falcon

RXB · July 28, 2017

Cool I see I veered you off topic. So sorry about that. So how does this speed up CALL LINK?

It is cool you can load anywhere in XB and still get CALL LINK to work, but how is CALL LINK now faster?

Edited July 28, 2017 by RXB
