Assembly on the 99/4A


matthew180

My old bitmap test app also lets you play with the duplication, seeing the effects of the various masks. :) Unfortunately, it's not terribly well documented. ;)

 

https://github.com/tursilion/bitmaptest

 

You can get into half-bitmap mode by pressing 1, 5, and F, plus 9 to turn on sprites if they are off. You'll get interesting sprite effects with other masks too; check the readme.txt in the zip file.

 

I seem to remember hoping to use this to work out the exact masking of the sprites, but found it wasn't completely intuitive... certain masks will turn on the duplication only for certain thirds. ;) (I don't seem to have my video anymore, though...)

 

bitmaptest.zip

 

(Edit: looks a lot like I remember on JS99er! ;) )

 

 

 

Edited by Tursi

18 hours ago, Lee Stewart said:

 

I guess this means that the microcode for MOV *R0+,R1, with >8300 in R0 does this:

  1. Retrieves the contents of R0, which is the address from which to retrieve the data to store in the destination address.
  2. Increments R0, which now contains >8302
  3. Reads the contents of the retrieved address (>8300), which just happens to be the address of R0, which has already been incremented to >8302!
  4. Copies >8302 to R1.

This would seem an undesirable side effect of the microcode—even, dare I say, a TMS9900 bug.

 

...lee

Dang it! That's an evil gotcha! Imagine trying to debug that back in the day without a single-step debugger. Shudder.
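
The sequence Lee describes can be modelled in a few lines. This is a minimal Python sketch, not the actual microcode: it assumes the workspace pointer is >8300 (so R0 lives at >8300 and R1 at >8302) and that the register is incremented before the source word is read, as the quoted steps say. The `mov_autoinc_to_reg` helper and the dictionary used as memory are made up for illustration.

```python
WP = 0x8300  # workspace pointer: R0 lives at >8300, R1 at >8302

def reg_addr(n):
    """Memory address of workspace register Rn."""
    return WP + 2 * n

def mov_autoinc_to_reg(mem, src_reg, dst_reg):
    """Model MOV *Rs+,Rd in the order given in the quoted steps."""
    addr = mem[reg_addr(src_reg)]                  # 1. fetch the source address from Rs
    mem[reg_addr(src_reg)] = (addr + 2) & 0xFFFF   # 2. post-increment Rs
    value = mem.get(addr, 0)                       # 3. read the word at the fetched address
    mem[reg_addr(dst_reg)] = value                 # 4. copy it to Rd

mem = {reg_addr(0): 0x8300, reg_addr(1): 0x0000}   # R0 points at itself
mov_autoinc_to_reg(mem, 0, 1)
print(hex(mem[reg_addr(0)]), hex(mem[reg_addr(1)]))  # 0x8302 0x8302
```

Because R0 is both the pointer and the word being pointed at, step 3 picks up the value that step 2 has just written.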


15 hours ago, Asmusr said:

 

 

Sorry, it looks like only the cartridge image, which is what I used on my real hardware (and in Classic99/JS99er), is returning DIV=0000, so there must be something wrong with it.

IMG_0726.JPG

 

<geek>

Do I spy a ZX Spectrum/Jupiter ACE character set?

</geek>


18 hours ago, Jeff White said:

What if you did the following?

 

LWPI >8300

STWP R0

MOV  *R0+,*R0+

 

Will the 9900 (99/4A) with its destination prefetch give a different result than the 9995 (in the 99/8, 99/2, Tutor, and Geneve)?

 

I believe so.  What are the values in R0, R1, and R2 after the MOV?

 

R0 = >8304

R1 = >8302

R2 is unchanged

 

On my (real) Geneve:

 

 

 

geneve_autoinc1.jpg

geneve_autoinc2.jpg
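
One way to reason about these values is with a small host-side model. The Python sketch below assumes that each post-increment happens as part of its operand's address calculation (source address fetched and incremented, source word read, destination address fetched and incremented, then the write). It is not a cycle-accurate description of the 9900 or the 9995, `mov_autoinc_both` is a hypothetical helper, and R1 and R2 are simply assumed to start at zero.

```python
WP = 0x8300  # as set by LWPI >8300

def reg_addr(n):
    """Memory address of workspace register Rn."""
    return WP + 2 * n

def mov_autoinc_both(mem, src_reg, dst_reg):
    """Model MOV *Rs+,*Rd+ with each increment done during address calculation."""
    src = mem[reg_addr(src_reg)]                  # source address
    mem[reg_addr(src_reg)] = (src + 2) & 0xFFFF   # increment the source register
    value = mem.get(src, 0)                       # read the source word
    dst = mem[reg_addr(dst_reg)]                  # destination address
    mem[reg_addr(dst_reg)] = (dst + 2) & 0xFFFF   # increment the destination register
    mem[dst] = value                              # write the result

mem = {reg_addr(0): WP, reg_addr(1): 0, reg_addr(2): 0}  # STWP R0; R1 and R2 assumed zero
mov_autoinc_both(mem, 0, 0)
regs = [mem[reg_addr(n)] for n in range(3)]
print([hex(r) for r in regs])  # ['0x8304', '0x8302', '0x0']
```

Under that assumption the model reproduces R0 = >8304 and R1 = >8302, with R2 untouched.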

Edited by mizapf

Here is the result on vanilla TI-99/4A.

 

By way of explaining the code:

PUSH, is a macro that pushes a register onto the Forth stack.

R4 is the top of stack cache register.

So what you are seeing left to right in the video is  R0,R1,R2,R4


HEX
CODE TEST
      8300 LWPI,
      R0 STWP,
      R1 CLR,
      R2 CLR,
      R0 *+ R0 *+ MOV,
      R0 PUSH,
      R1 PUSH,
      R2 PUSH,
      NEXT,
ENDCODE

 

 


HEX
CODE TEST
      8300 LWPI,
      R0 STWP,
      R1 CLR,
      R2 CLR,
      R0 *+ R0 *+ MOV,
      SP DECT,  
      R0  *SP  MOV,
      SP DECT, 
      R1  *SP  MOV,
      SP DECT, 
      R2 *SP  MOV,
      NEXT,
ENDCODE

This is the code expanded.  I think that makes it clearer. 

The ALC macro PUSH, does not push the TOS register first, i.e., it is not a "Forth" thing. It is an ALC thing for using the SP register as a stack pointer.

It literally is general purpose and pushes the register argument onto the memory stack.

 


On 1/2/2021 at 8:53 AM, Asmusr said:

What you write about the colors is correct. But the problem is that the pattern mask is dependent on the color mask except for the first two bits. If the color mask is 00XCCCCCCC111111, where the 7 Cs are taken from VR3, the pattern mask becomes 00XPPCCCCC111111, where only the 2 Ps are taken from VR4 and the 5 Cs are the least significant bits of VR3. It means that if you reduce the number of colors below 256 you also reduce the number of unique patterns. As you describe your 64-column terminal emulator you used a 2K color table, which is fine, but if you go below that you cannot use a full character set.

You are the Graphics King.  Maybe I knew that ~30 years ago by evidence of how I wrote my TE, but by the time I wrote my explanation I fumbled.  Or did I?  No matter.  It does not follow that the bit masking by VR3 should affect the bit masking of VR4, but if it does it does.  Maybe it is different between the 9918A and V9938.  I had both 99/4A and Geneve versions of my TE at that time, and I recall — likely incorrectly — that I dropped the CT to 64 bytes.

 

In any event, neither Thierry’s nor my explanation IMO covers this VR3 and VR4 masking sufficiently.  Maybe this is a clue to understanding the sprite duplication.
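
Taking Asmusr's description at face value, the two masks can be written out as below. This Python sketch only encodes the quoted bit layout; it has not been checked against a 9918A or V9938, the `bitmap_masks` name is hypothetical, and the X (table base) bit is left out of the returned values.

```python
def bitmap_masks(vr3, vr4):
    """Return (color_mask, pattern_mask) following the quoted layout.

    color mask:   00X CCCCCCC 111111   (7 C bits from VR3)
    pattern mask: 00X PP CCCCC 111111  (2 P bits from VR4, 5 low C bits from VR3)
    """
    color_mask = ((vr3 & 0x7F) << 6) | 0x3F
    pattern_mask = ((vr4 & 0x03) << 11) | ((vr3 & 0x1F) << 6) | 0x3F
    return color_mask, pattern_mask

for vr3, vr4 in [(0xFF, 0x03), (0x9F, 0x03), (0x80, 0x03)]:
    color, pattern = bitmap_masks(vr3, vr4)
    print(f"VR3=>{vr3:02X} VR4=>{vr4:02X}  color=>{color:04X}  pattern=>{pattern:04X}")
```

With VR3 = >9F (a 2K color table) the pattern mask is still >1FFF under this layout, but with VR3 = >80 (a 64-byte color table) it drops to >183F, which is the loss of unique patterns described in the quote.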

Edited by Jeff White

6 hours ago, mizapf said:

R0 = >8304

R1 = >8302

R2 is unchanged

 

On my (real) Geneve:

 

 

 

geneve_autoinc1.jpg

geneve_autoinc2.jpg

After thinking about it, I think the 9900 must suppress the auto-increment on the read destination before write destination after source read and auto-increment.  Otherwise, it would not work correctly though unexpectedly.

 

Thankfully, I do not think there is a good reason to code these edge cases. Is there a good reason?

 

 


5 hours ago, Jeff White said:

After thinking about it, I think the 9900 must suppress the auto-increment on the read destination before write destination after source read and auto-increment.  Otherwise, it would not work correctly though unexpectedly.

Right, there's no auto-increment on the read before write -- but read before write is a memory operation, not a logic operation. The post-increment is part of the address calculation. Once the address is calculated, then it's read and written. :)

 


On 1/2/2021 at 1:15 AM, Airshack said:

Admittedly, this being a learning thread I’d have to ask you something clever like, “What?”  

 

I really do want to understand what you’re suggesting yet it soars above my head.

 

Think in terms of trying to do a fine vertical scroll in bit-mapped mode.  Do you want your SIT characters sitting side-by-side or stacked on top of one another, making a long tall rectangle?  I was just pointing out that you are not locked into using the 0,1,2,3,...255 SIT layout when using bit-mapped mode.

 

Redefining the SIT layout shows off the flexibility of the TMS9918A bit-mapped mode, depending on your program/game needs.

 

TI-Artist uses the standard SIT layout of 0,1,2,3,...255.
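
To make the two layouts concrete, here is a small Python sketch that builds the SIT entries for one third of the screen (8 rows of 32 characters) both ways. The column-major ordering is just one possible arrangement picked for illustration, and `sit_third` is not from any TI library.

```python
ROWS, COLS = 8, 32  # one third of the bit-mapped screen

def sit_third(column_major=False):
    """Return an 8x32 grid of character numbers for one screen third."""
    if column_major:
        # Characters stacked vertically: moving down one row adds 1,
        # so a screen column maps to 8 consecutive patterns (64 bytes).
        return [[col * ROWS + row for col in range(COLS)] for row in range(ROWS)]
    # Standard layout: 0,1,2,...,255 running left to right, top to bottom.
    return [[row * COLS + col for col in range(COLS)] for row in range(ROWS)]

standard = sit_third()
stacked = sit_third(column_major=True)
print(standard[0][:4], standard[1][:4])  # [0, 1, 2, 3] [32, 33, 34, 35]
print(stacked[0][:4], stacked[1][:4])    # [0, 8, 16, 24] [1, 9, 17, 25]
```

With the stacked layout, the eight characters in a screen column occupy 64 contiguous bytes of the pattern table within that third, which is what makes a fine vertical scroll easier to code.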

 

 


17 hours ago, Torrax said:

Think in terms of trying to do a fine vertical scroll in bit-mapped mode.  Do you want your SIT characters sitting side-by-side or stacked on top of one another, making a long tall rectangle?  I was just pointing out that you are not locked into using the 0,1,2,3,...255 SIT layout when using bit-mapped mode.

Interesting point made. Thank you.


Is there a CRU base address similar to >0006 for the second joystick? How about a Test Bit series 0-4 for Joystick #2?
 

CRU bits 3-7 are apparently for “Joystick.” Is there some scheme here where this works for both joysticks?
 

Looked in EA manual and the 9901 data book, TI Tech Data, and snooped around on AtariAge. 
 

There’s something obvious here I’m missing for sure. Maybe another bit to test for the second controller?


11 hours ago, Airshack said:

Is there a CRU base address similar to >0006 for the second joystick? How about a Test Bit series 0-4 for Joystick #2?
 

CRU bits 3-7 are apparently for “Joystick.” Is there some scheme here where this works for both joysticks?
 

Looked in EA manual and the 9901 data book, TI Tech Data, and snooped around on AtariAge. 
 

There’s something obvious here I’m missing for sure. Maybe another bit to test for the second controller?

First you set the column to check by writing to CRU bits 18-20 (R12=36). The column is 6 for joystick 1 and 7 for joystick 2.

Then you read the values from CRU bits 3-7 (R12=6).

See www.unige.ch/medecine/nouspikel/ti99/keyboard.htm#quick%20scan

 

Edited: some numbers were hex values.

Edited by Asmusr

; JOYST ( joystick# -- value )
; Scans the joystick returning the direction value
        li r1,6                     ; use keyboard select 6 for #0, 7 for #1
        swpb r1
        li r12,36
        ldcr r1,3
        li r12,6
        stcr r1,5
        swpb r1
        inv r1
        andi r1,>001f

At the end of this routine, R1 contains a bit pattern as follows:
  • 1=Fire
  • 2=Left
  • 4=Right
  • 8=Down
  • 16=Up

So you can simply mask the bits to examine the joystick/button status that you're interested in.

 

At the start of the routine load R1 with 6 for joystick 0 or 7 for joystick 1.

 

I took this from TurboForth's joystick routine.
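
The bit pattern is easy to take apart once the value is in hand. As a host-side illustration only (Python, not 9900 code), this sketch mimics the last three instructions of the routine on the five raw CRU bits and names the result. The `decode_joystick` name and the returned dictionary are assumptions made for the example, and the low byte of R1 is assumed to be zero before the SWPB.

```python
FIRE, LEFT, RIGHT, DOWN, UP = 1, 2, 4, 8, 16  # bit values listed above

def decode_joystick(raw5):
    """Turn the 5 raw CRU bits (active low, 0 = pressed) into named flags."""
    r1 = (raw5 & 0x1F) << 8                  # STCR R1,5 leaves the bits in the high byte
    r1 = ((r1 << 8) | (r1 >> 8)) & 0xFFFF    # SWPB R1
    r1 = ~r1 & 0xFFFF                        # INV R1: pressed inputs become 1
    r1 &= 0x001F                             # ANDI R1,>001F
    names = {"fire": FIRE, "left": LEFT, "right": RIGHT, "down": DOWN, "up": UP}
    return {name: bool(r1 & bit) for name, bit in names.items()}

state = decode_joystick(0b01110)  # fire and up held, everything else released
print(state)
```

In assembly the same test is a single ANDI or COC against the bit value you care about.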

 

Cheers

 


       

11 hours ago, Asmusr said:

First you set the column to check by writing to CRU bits 18-20 (R12=36). The column is 6 for joystick 1 and 7 for joystick 2.

 

I found the following code in my One-Time Initialization routine (which I borrowed from Matthew earlier in the thread):

 

JOYSTICK-1:

*    Set the 9901 to always read joystick 1 (column 6)
       LI   R1,>0600          * Keyboard column 6 (joystick 1)
       LI   R12,36            * Set the CRU port
*                                      - the CRU address = bits 3-14 of R12; xxx0 0000 0010 010x = >012
       LDCR R1,3              * Load the CRU, setting the column latch; E/A p.151, "LOAD CRU"
*                                      - sends 3 least significant bits (0,1,2) from >06 to the CRU
*                                      -since bits sent < 9, the source address (@R1) is a byte address
*                                          - byte address = >06 = 0000 0110; so binary 110 sent to CRU

^^^^ This must be what you're describing as a "column latch" for keyboard column 6.

 

So...for Joystick-2, I simply need to do the following in order to set the column latch to seven for Joystick-2?

 

JOYSTICK-2:

*    Set the 9901 to always read joystick 2 (column 7)
       LI   R1,>0700          * Keyboard column 7 (joystick-2)
       LI   R12,36            * Set the CRU port
*                                      - the CRU address = bits 3-14 of R12; xxx0 0000 0010 010x = >0024 
       LDCR R1,3              * Load the CRU, setting the column latch; E/A p.151, "LOAD CRU"
*                                      - sends 3 least significant bits (0,1,2) from >07 to the CRU
*                                      -since bits sent < 9, the source address (@R1) is a byte address
*                                          - byte address = >07 = 0000 0111; so binary 111 sent to CRU

 

Sound right? 

 

So now to test the button on Joystick-2:

 

      TB    0        ; test Joystick-2 button
      JEQ   BTN2PR   ; jump if button-2 pressed

 

For two-player games we're constantly switching the column latch from 6 to 7 then?          


5 hours ago, Willsy said:

; JOYST ( joystick# -- value )
; Scans the joystick returning the direction value
        li r1,6                     ; use keyboard select 6 for #0, 7 for #1
        swpb r1
        li r12,36
        ldcr r1,3
        li r12,6
        stcr r1,5
        swpb r1
        inv r1
        andi r1,>001f

At the end of this routine, R1 contains a bit pattern as follows:
  • 1=Fire
  • 2=Left
  • 4=Right
  • 8=Down
  • 16=Up

So you can simply mask the bits to examine the joystick/button status that you're interested in.

 

At the start of the routine load R1 with 6 for joystick 0 or 7 for joystick 1.

 

I took this from TurboForth's joystick routine.

Handy! Thank you @Willsy. Borrowing is always easier than repeatedly bumping one's head into the wall.

 

Slight edit:

******************************************************
*  JOYST - TurboForth's QUICK Joystick Routine
******************************************************
        LI    R1,>0700      ; column 7 for joystick-2 (6 for joystick-1)
        LI    R12,36        ; >24 
        LDCR  R1,3          ; load CRU w/ column latch
        LI    R12,6
        STCR  R1,5          ; reads 5-bits from CRU and stores them in R1
                            ; 5-bits start from CRU address in R12(bits3-14)
                            ; R1 = [000/Up Down/Right/Left/Button 0000 0000] (1=inactive,0=active)
        MOVB  R1,@R1LB      ; fast-copy joystick data to least significant byte
        INV   R1            ; Inactive bits = 0, Active bits = 1
        ANDI  R1,>001F      ; clear most significant 11 (unused) bits

 


2 hours ago, Airshack said:

Handy! Thank you @Willsy. Borrowing is always easier than repeatedly bumping one's head into the wall.

 

Slight edit:


******************************************************
*  JOYST - TurboForth's QUICK Joystick Routine
******************************************************
        LI    R1,>0700      ; column 7 for joystick-2 (6 for joystick-1)
        LI    R12,36        ; >24 
        LDCR  R1,3          ; load CRU w/ column latch
        LI    R12,6
        STCR  R1,5          ; reads 5-bits from CRU and stores them in R1
                            ; 5-bits start from CRU address in R12(bits3-14)
                            ; R1 = [000/Up Down/Right/Left/Button 0000 0000] (1=inactive,0=active)
        MOVB  R1,@R1LB      ; fast-copy joystick data to least significant byte
        INV   R1            ; Inactive bits = 0, Active bits = 1
        ANDI  R1,>001F      ; clear most significant 11 (unused) bits

 

This might not apply to many apps, but if you have interrupts running or if you are polling them on and off regularly, you might want to add a line at the end to reset the screen timeout.

CLR @>83D6

I did it because I have interrupts running most of the time and it really sucks to have the screen go blank 9 minutes or so into a good game. :) 


If you are only scanning joysticks, you can leave R12 at CRU address 6 (CRU bit 3) and toggle between them with bit  18 with SBZ 15 and SBO 15.

 

Preset joystick 0:

 

LI R12,6

SBZ 15

SBO 16

SBO 17

 

Read joystick:

 

STCR R1,5

 

Swap to joystick 0:

 

SBZ 15

 

Swap to joystick 1:

 

SBO 15

 

This way, you can speed up joystick scans.  CRU bit 18 (CRU address 36) is 15 bits above CRU bit 3 (CRU address 6). It toggles between joystick 0 and joystick 1 (swapped on Geneve) as long as CRU bits 19 and 20 (CRU addresses 38 and 40, offsets 16 and 17 from CRU bit 3) are set to 1.

 

I think this is right, but I have beer goggles on.

 

In summary, once you have set up the joystick scan for either joystick, you can swap between joysticks with SBZ and SBO without using LDCR.
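
The offset arithmetic can be checked with a toy model. This Python sketch assumes only what is stated in this thread: R12 holds the CRU bit number times two, SBO/SBZ offsets are relative to that base, and CRU bits 18-20 form the column number with bit 18 as the least significant bit. The `Cru` class is invented for the example and models nothing else about the 9901.

```python
class Cru:
    """Toy model of CRU output bits addressed relative to R12."""

    def __init__(self, r12):
        self.base_bit = r12 // 2   # R12 holds the CRU bit number times two
        self.bits = {}

    def sbo(self, offset):
        """SBO: set the bit at base + offset to one."""
        self.bits[self.base_bit + offset] = 1

    def sbz(self, offset):
        """SBZ: set the bit at base + offset to zero."""
        self.bits[self.base_bit + offset] = 0

    def column(self):
        """Keyboard column selected by CRU bits 18-20 (bit 18 is the LSB)."""
        return sum(self.bits.get(18 + i, 0) << i for i in range(3))

cru = Cru(r12=6)   # LI R12,6 -> base is CRU bit 3
cru.sbz(15)        # CRU bit 18 = 0
cru.sbo(16)        # CRU bit 19 = 1
cru.sbo(17)        # CRU bit 20 = 1
first = cru.column()
cru.sbo(15)        # CRU bit 18 = 1
second = cru.column()
print(first, second)  # 6 7
```

So the preset selects column 6, and flipping offset 15 alone moves between columns 6 and 7 without reloading R12.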

 


How many clock cycles should "CLR @VDPSTA" take?  (VDPSTA EQU >8802)

 

According to http://www.unige.ch/medecine/nouspikel/ti99/tms9900.htm it should be 26 clock cycles for instructions in fast memory and address in slow memory.

 

Yet Classic99 tells me:

 

8390  04E0  clr  @>8802                 (22)
      8802

 

Does the VDPSTA register somehow only incur the 4-cycle penalty for reads, not writes?


9 minutes ago, Lee Stewart said:

 

I must be missing something. Why would you attempt to write to a VDP register that can only be read?

 

...lee

I don't care about the write, I only need the read to clear VDPSTA to clear the collision flag, etc.  I have "LWPI VDPWA" in effect, so I don't want to write to any of the R0 registers, and I want to use as few clock cycles as possible.  If you know of a better way...


2 hours ago, PeteE said:

How many clock cycles should "CLR @VDPSTA" take?  (VDPSTA EQU >8802)

 

According to http://www.unige.ch/medecine/nouspikel/ti99/tms9900.htm it should be 26 clock cycles for instructions in fast memory and address in slow memory.

 

Yet Classic99 tells me:

 

8390  04E0  clr  @>8802                 (22)
      8802

 

Does the VDPSTA register somehow only incur the 4-cycle penalty for reads, not writes?

No, it doesn't. 

I get CLR is 10, plus 3 memory cycles (instruction, read, write). You're in scratchpad, so the instruction read is covered, VDPST adds 4+4.

Symbolic adds 8 plus 1 memory cycle (read argument). You're in scratchpad, so the argument read is covered.

 

10+4+4+8 = 26 cycles by my math too.

 

It looks like at some point I broke the read-before-write timing - all the code is there, but it disagrees on who is responsible. The fix is simple, but I'll do a deeper audit to make sure I got it all.
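
The sum can be written out as a quick sanity check. This Python sketch uses only the figures from this post (10 base cycles for CLR, 8 for symbolic addressing, 4 wait cycles per access to slow memory); the `clr_symbolic_cycles` function and its access counts are an illustration of that arithmetic, not Classic99's actual timing code.

```python
WAIT = 4  # extra cycles per access to slow memory, per the figures above

def clr_symbolic_cycles(code_in_fast_ram, target_in_fast_ram):
    """Cycle count for CLR @addr under the assumptions stated above."""
    cycles = 10 + 8          # CLR base + symbolic addressing
    code_accesses = 2        # instruction word + address word
    target_accesses = 2      # read-before-write, then the write
    if not code_in_fast_ram:
        cycles += WAIT * code_accesses
    if not target_in_fast_ram:
        cycles += WAIT * target_accesses
    return cycles

print(clr_symbolic_cycles(code_in_fast_ram=True, target_in_fast_ram=False))  # 26
```

The 22 that Classic99 reported is this total minus one 4-cycle wait, which fits the read-before-write access not being charged.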

 

 


35 minutes ago, Tursi said:

I get CLR is 10, plus 3 memory cycles (instruction, read, write). You're in scratchpad, so the instruction read is covered, VDPST adds 4+4.

Symbolic adds 8 plus 1 memory cycle (read argument). You're in scratchpad, so the argument read is covered.

 

10+4+4+8 = 26 cycles by my math too.

 

It looks like at some point I broke the read-before-write timing - all the code is there, but it disagrees on who is responsible. The fix is simple, but I'll do a deeper audit to make sure I got it all.

Thanks, Tursi.  I've grown to trust Classic99 based on how much work you've done to make it cycle accurate.
