Assembly on the 99/4A


554 replies to this topic

#551 Asmusr ONLINE  

Asmusr

    River Patroller

  • 2,399 posts
  • Location:Denmark

Posted Sat Aug 12, 2017 2:14 AM

Sure, but the ISR is the one thing to be aware of that might interfere with your program, by trashing your variables in scratch pad or changing the VDP address. I always turn off the ISR. 
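
A minimal sketch of one common way to do that, assuming "turning off the ISR" means keeping the interrupt mask at 0 (the post doesn't say how it is done, and the label START is made up):

```asm
START  LIMI 0             Mask interrupts: the console ISR no longer runs
*      ... main program ...
*      If the ISR is wanted now and then, open a brief window:
       LIMI 2             Let a pending VDP interrupt through
       LIMI 0             And mask again
```

The LIMI 2 / LIMI 0 pair is only needed if you still want the ISR to run at points of your own choosing.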



#552 TheBF OFFLINE  

TheBF

    Moonsweeper

  • 280 posts
  • Location:The Great White North

Posted Sat Aug 12, 2017 10:07 PM

 

Well, on the TI-99/4A, I've always used the stack growing in the other direction. To follow your example ...

 

A push is then done with

MOV @DATA,*SP+

 

and a pop with

DECT SP

MOV *SP,@DATA
   
;)

 

 

A convention in my experience is to have stacks grow downwards from high RAM and memory usage to grow upwards.

Not written in stone somewhere, just something I have seen.

 

BF



#553 sometimes99er OFFLINE  

sometimes99er

    River Patroller

  • 3,865 posts
  • Location:Denmark

Posted Sun Aug 13, 2017 12:47 AM

... my experience is to have stacks grow downwards ...

 

Yes.

 

My practical use of a stack is simply saving and retrieving return addresses. I originally looked only at saving a few bytes, not at CPU cycles used.

 

Growing downwards would need 4 bytes in-line to PUSH and 4 bytes to POP and return. 8 bytes in total. PUSH would consist of 2 lines of instructions, namely DECT and MOV. POP would have MOV and RT.
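
A sketch of the growing-downwards layout just described. SP as R10 and the labels SUB1 and STKTOP are assumptions for illustration, not taken from the post:

```asm
SP     EQU  10            Assumption: R10 serves as stack pointer
*      SP is assumed to be set up elsewhere, e.g. LI SP,STKTOP
SUB1   DECT SP            PUSH, 2 bytes: make room
       MOV  R11,*SP       PUSH, 2 bytes: save return address
*      ... body, free to BL to other routines ...
       MOV  *SP+,R11      POP, 2 bytes: restore return address
       RT                 2 bytes: return (B *R11)
```

That is 4 bytes on entry and 4 bytes on exit, all in-line.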

 

Growing upwards would need 2 bytes in-line to PUSH (1 line is nice and clean) and 6 bytes to POP and return, also 8 bytes in total. I replaced the in-line POP with a branch, which is 4 bytes *). PUSH then consists of 1 line, namely MOV, and POP of 1 line (nice and clean), namely B. The shared code to branch to is 6 bytes, or 3 instructions: DECT, MOV and RT.
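
A sketch of the growing-upwards layout with the shared pop-and-return. SP as R10 and the labels SUB2 and POPRT are assumptions for illustration:

```asm
SP     EQU  10            Assumption: R10 serves as stack pointer
SUB2   MOV  R11,*SP+      PUSH, 2 bytes: save return address, bump SP
*      ... body, free to BL to other routines ...
       B    @POPRT        4 bytes: go to the shared pop-and-return
*
* Shared by every routine, 6 bytes once
POPRT  DECT SP            2 bytes: step back to the saved address
       MOV  *SP,R11       2 bytes: restore return address
       RT                 2 bytes: return (B *R11)
```

Each routine pays 6 bytes in-line, and POPRT is paid for only once.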

 

Growing down would use 8 bytes per routine.

 

Growing up would use 6 bytes per routine and one general chunk of 6 bytes.

 

There's a break-even at 3 routines. 

 

8X = 6X + 6

 

2X = 6

 

X = 3

 

*) Could be optimized and cut to 2 bytes by using a jump in very few cases.

 

For fun I now took a look at CPU cycles used according to Classic99.

 

Growing down would use 78 cycles per routine. Growing up without branching uses the same cycle count.

 

Growing up with branching would use 130 cycles per routine.

 

Did I get it all right? Hope so.  :)



#554 apersson850 OFFLINE  

apersson850

    Moonsweeper

  • 372 posts

Posted Sun Aug 13, 2017 3:08 AM

If you use it for saving return addresses only, it doesn't matter much.

But if the stack is used for other things too, then it's sometimes handy to have the stack pointer actually pointing at the top of stack all the time, not pointing to the next free space above top of stack. With the TMS 9900 you can easily address top of stack in any case, but if you let the stack grow downwards, as I indicated, then top of stack is accessed by *SP. If you let it grow upwards, and want to use autoincrement, then you have to refer to top of stack as @-2(SP). Doable, but slower and consumes more space.
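
To make the difference concrete, a small sketch of reading the top of stack into R0 under each convention (SP as R10 is an assumption):

```asm
SP     EQU  10            Assumption: R10 serves as stack pointer
* Growing downwards: SP points at the top item
       MOV  *SP,R0        One word of code
* Growing upwards with *SP+ pushes: SP points just past the top item
       MOV  @-2(SP),R0    Two words of code: instruction plus the -2 index
```

The indexed form carries an extra word for the offset, which is where the added space and time come from.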

 

When allocating a frame on the stack, i.e. a larger piece of data, it's also easy to refer to the items in that record by indexing from the stack pointer with positive indexes. You can do the opposite, but I, at least, find it easier to think in positive terms.

You push by AI SP,-ITEMSIZE

Once you have transferred the data to the stack, you reach the top element with *SP and items further down with @OFFSET(SP).

You have use for such data on the stack when traversing graphs, for example.
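
A minimal sketch of such a frame on a stack growing downwards. SP as R10, the name FRMSIZ, the three-word frame and the use of R1 to R3 as the data are all assumptions for illustration:

```asm
SP     EQU  10            Assumption: R10 serves as stack pointer
FRMSIZ EQU  6             Hypothetical frame of three words
       AI   SP,-FRMSIZ    Allocate the frame (AI takes the register first)
       MOV  R1,*SP        First item, the new top of stack
       MOV  R2,@2(SP)     Second item, positive offset
       MOV  R3,@4(SP)     Third item, positive offset
*      ... work on the frame ...
       AI   SP,FRMSIZ     Release the frame
```

All offsets into the frame are positive, which is the convenience mentioned above.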



#555 TheBF OFFLINE  

TheBF

    Moonsweeper

  • 280 posts
  • Location:The Great White North

Posted Sun Aug 13, 2017 9:19 AM

This is the way most Forths are implemented for both stacks for all the reasons you state.

 

In CAMEL99 I tried a common optimization and CACHE the top of stack value in a register.

This changes where you have to push and pop but on balance it speeds up a Forth system by about 10%.

Simple Math operations become:

 

A  *SP+,R4   

versus

A  *SP+,*SP

 

Fetching the contents of a variable is:

 

MOV *R4,R4
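
A sketch of how a few Forth primitives might look with the top of stack cached in R4 and the data stack growing downwards. This is not CAMEL99's actual source; SP as R10, the name TOS and the choice of DUP and DROP as examples are assumptions:

```asm
SP     EQU  10            Assumption: R10 is the data stack pointer
TOS    EQU  4             R4 caches the top of stack, as in the post
* +    ( n1 n2 -- sum )
       A    *SP+,TOS      Add the second item to the cached top, pop it
* @    ( addr -- n )
       MOV  *TOS,TOS      Replace the address with its contents
* DUP  ( n -- n n )
       DECT SP            Make room in memory
       MOV  TOS,*SP       Spill a copy of the cached top
* DROP ( n -- )
       MOV  *SP+,TOS      Refill the cache from memory
```

DUP and DROP show the "changes where you have to push and pop" part: the cached value has to be spilled to or refilled from memory whenever the stack depth changes.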

 

I like it. :)





