
Assembly on the 99/4A


matthew180


Do you have the issue numbers for the Micropendium articles? I'd like to have a read of those articles...

 

Page 27, "Practical EPROM Circuits", with diagrams and a list of parts. Crap, the file is too big to attach, but it is the Adobe file mp9008 on the CD-ROM, under Micropendium in magazines.

Edited by RXB
Link to comment
Share on other sites

Indeed. Sorry, I should have been more clear! I meant in terms of distributing cartridges with GROM data inside them. For example, TurboForth is a 16K ROM application. It would have been nice to have GROM in there too, but it's just not possible yet from a technical standpoint; the cartridge PCB with the GROM/GRAM emulation doesn't exist yet.

 

Yeah, except for my AVR-based GROM simulator which has existed for almost three years now. ;) As proof of concept, it was recently used by someone here on these forums to reproduce a GROM-only cartridge (first instance to my knowledge except for my demo carts).

 

http://www.harmlesslion.com/software/simulator

 

The reproduction -- I really thought it was posted here, but apparently not! This fellow managed to build a USB-updatable version, his page is here:

http://www.floka.com/grom_gnusb.html

 

It works. You DO have to do some of your own work to use it in its current form, specifically, you have to compile the code to include your ROM data, program the AVR, and you have to build your own PCB, but that shouldn't scare anyone here anymore. ;)

 

That said, when I get off my lazy butt (disclaimer: it's a very busy butt!) and finish the changes I want to make, the PCB in that picture will be able to use my code with a whole bunch of other features, and be programmable much more simply in the TI itself. But, given the way things look, even that's a little ways out for me yet.

Link to comment
Share on other sites

Emulation of GROM: thanks, Tursi, that is exactly what could be added to any cart someone makes, like TurboForth or a ROM cart, as this would allow more to be loaded into the system than from disk.

 

So my REA with a built-in Editor/Assembler/GPL Assembler and a SuperCart is possible thanks to your design.

Link to comment
Share on other sites

  • 1 month later...

I read through this thread. I just have a few comments.

In some places the BL instruction is called Branch and Load, in other places Branch and Link. Branch and Link is the correct name. Not very important, of course.

Regarding BLWP, you can in several places in this thread get the idea that registers 13, 14 and 15 in the calling workspace are used by BLWP. They are not. It's the workspace of the called routine which uses registers 13, 14 and 15 to know which workspace to reload, where it came from and what to load into the status register.

Then we have stacks. The examples above use R10 as a stack pointer and let the stack grow towards higher memory locations. If you want to use the 9900 instruction set as well as possible, and also have the benefit of being able to access the word currently at the top of the stack with a simple *R10, then it's better to let the stack grow towards lower memory locations.

Stack declaration, initialization, push and pop are then done like this.

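SP     EQU  10
STACK  BSS  50
STKBOT
START  LI   SP,STKBOT
PUSH   DECT SP
       MOV  R0,*SP
POP    MOV  *SP+,R0
* Subroutines
       BL   @SUB1
* More code
* To return, pop the saved address into R11 first. B branches to the
* operand's address, so B *SP+ would jump into the stack itself.
SUB1   DECT SP
       MOV  R11,*SP
       BL   @SUB2
       MOV  *SP+,R11
       B    *R11
SUB2   DECT SP
       MOV  R11,*SP
       BL   @SUB3
       MOV  *SP+,R11
       B    *R11
* and so on, with SUB3 and others defined below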

Especially if you want to use the stack for more data than just return addresses, it's very convenient to be able to access the top of stack with *SP, the next word down the stack with @2(SP) etc.
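A minimal sketch of that kind of access, assuming the SP EQU 10 definition and the downward-growing stack described above. The use of R1 through R4 is only for illustration:

```asm
SP     EQU  10
* Push two words; the stack grows towards lower addresses
       DECT SP
       MOV  R1,*SP          Push R1
       DECT SP
       MOV  R2,*SP          Push R2, which is now the top of stack
* Read both words without popping them
       MOV  *SP,R3          R3 = top of stack (the value pushed from R2)
       MOV  @2(SP),R4       R4 = next word down (the value pushed from R1)
* Operate on the stack in place: top = top + next word down
       A    @2(SP),*SP
* Drop both words again
       AI   SP,4
```

Nothing has to be popped to reach the second word, which is the convenience a downward-growing stack buys.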

 

I understand that you who have posted here seem to be mostly interested in speed optimized code for running games. My interest when I used my 99/4A wasn't primarily in that field. I wasn't so careful with memory use either, since one of the best things I did with my 99/4A was when I installed 64 KB of 16-bit wide RAM in the machine. Thus the ordinary memory expansion was replaced with memory where all accesses to words took two cycles, not six. It's like having the scratch pad RAM all over the place.

Due to the ability to shadow console ROM, this also made it possible to modify otherwise fixed interrupt vectors and such, so the machine was more flexible. My design also allowed paging the ordinary 8-bit memory expansion in and out, to give more storage space.

Edited by apersson850
Link to comment
Share on other sites


I, for one, would love to see details of your 64KB-RAM installation and how you used it.

 

...lee

Link to comment
Share on other sites

Ohh gee, I haven't really used my 99/4A for years. But it's actually all put together and ready to power on, in a room by itself up in the attic (the advantage of a large house...)

Basically, it's two 32 K static RAM circuits piggy-backed on console ROM. Then various LS-TTL circuits to create chip selects and 8 bits of CRU output at 0400H. These bits, when reset, page in the 16-bit RAM where the 32 K memory expansion normally would be. Then I can page in chunks of 8 K RAM at 0000H, 4000H, 6000H and 8000H if I want RAM to overlay the console ROM, DSR, cartridge and console RAM/memory mapped devices. I can also page out the fast memory expansion and page in the normal, 8-bit expansion in the PEB if I want that.

Finally, one CRU bit inserts a wait state on VDP access. Code that takes advantage of the fact that you can, in some cases, omit the recommended "spend time" instruction between certain VDP accesses no longer works when everything always runs in fast RAM. Code can be run in 8-bit RAM for compatibility, but with this I/O bit set, only the actual VDP accesses need to be slowed down.

 

Pure assembly programs which run with all code and workspaces in 32 K expansion memory run about 110% faster. Most software does of course use scratch pad RAM for workspace, and accesses VDP and stuff frequently. There the speed gain is smaller. Pascal programs run about 10% faster, BASIC speedup is barely noticeable (probably spends most of the time in the GPL interpreter, which is in 16-bit ROM anyway). Forth increases more than Pascal in speed. Both have their inner interpreters in 16-bit RAM from the beginning, but Forth always runs code from the memory expansion, while Pascal's primary code pool is in VDP RAM.

 

I used this extra memory together with the memory in my 56 K Maximem module, as well as some more RAM on cards I've designed for the PEB, to create a RAM-disk which worked with the UCSD p-system. Thus files like SYSTEM.EDITOR, SYSTEM.FILER and SYSTEM.COMPILER could reside in RAM. That allowed for faster turnaround when developing software for that system.

 

I may have some documentation lying around. I'll look for it. Probably best to make a thread by itself for that.

Edited by apersson850
Link to comment
Share on other sites

  • 1 year later...

You're welcome. It is nice to know people are still looking back at the older threads. There is a lot of really good info in this forum. Most people tend to shy away from assembly because they think it is hard or something, but I really enjoy it and it is really well suited to these older limited-resource computers. It does take a little more time to "get in to" than something like BASIC or XB, but assembly is also *very* rewarding. This forum has a good mix of assembly programmers too (I am a speed/performance freak, others are more balanced, and a few will use whatever works). For me helping people is a lot of fun, so I hope you have tons of questions! :-)

Link to comment
Share on other sites


Yeah, great stuff Matt. You and I have our differences on the validity of using BLWP, but I definitely respect and admire your knowledge and work in educating everyone about TMS9900 assembly. :)

 

Adamantyr

Link to comment
Share on other sites


Thank you for being so generous with your knowledge. Be sure I will bug you if I get stuck :)

 

Today I spent some time reading your tutorials here and other documents on the VDP. Then I copied your code from this thread (the randomize routines come in especially handy ;) I don't know if I am going to use them as they are, but they are a starting point) and adapted it to my needs. At the moment I have graphics mode 2 working, plus sprites. Now it's time to get dirty with 'real' TMS9900 assembly, as I need to start putting the game together (I am coding the same game on the 6502 and 6809, so I only need to translate what I already have and then add what else might be needed).

Link to comment
Share on other sites

I used this extra memory together with the memory in my 56 K Maximem module, as well as some more RAM on cards I've designed for the PEB, to create a RAM-disk which worked with the UCSD p-system. Thus files like SYSTEM.EDITOR, SYSTEM.FILER and SYSTEM.COMPILER could reside in RAM. That allowed for faster turnaround when developing software for that system.

 

I may have some documentation lying around. I'll look for it. Probably best to make a thread by itself for that.

 

YES, Please DO share what you find for documentation. I love dabbling in both hardware and software. Been meaning to learn A/L for about 20 years now but I find myself working on hardware because I have more experience there (which isn't a lot but enough to be dangerous). If you don't mind expanding, I'd be interested in hearing what all you've designed!

 

-Dano

Link to comment
Share on other sites

(I am coding the same game on the 6502 and 6809, so I only need to translate what I already have and then add what else might be needed).

 

What 6809-based system are you working on? The 6809 was popular in arcade machines, but I didn't think it was used in many (if any) home computers.

 

Link to comment
Share on other sites

If you're doing a version for the 6502, you might want to consider a version of your game for the Creativision console. It has a 6502 CPU, but shares the VDP and sound chip with the TI-99, so you could probably re-use all the graphics and much of the 6502 code you have.

 

Today I spent some time reading your tutorials here and other documents on the VDP. Then I copied your code from this thread (the randomize routines come in especially handy ;) I don't know if I am going to use them as they are, but they are a starting point) and adapted it to my needs. At the moment I have graphics mode 2 working, plus sprites. Now it's time to get dirty with 'real' TMS9900 assembly, as I need to start putting the game together (I am coding the same game on the 6502 and 6809, so I only need to translate what I already have and then add what else might be needed).

Link to comment
Share on other sites

If you're doing a version for the 6502, you might want to consider a version of your game for the Creativision console. It has a 6502 CPU, but shares the VDP and sound chip with the TI-99, so you could probably re-use all the graphics and much of the 6502 code you have.

 

It seems a good candidate for a port. ;)

Link to comment
Share on other sites

You're welcome. It is nice to know people are still looking back at the older threads. There is a lot of really good info in this forum. Most people tend to shy away from assembly because they think it is hard or something, but I really enjoy it and it is really well suited to these older limited-resource computers. It does take a little more time to "get in to" than something like BASIC or XB, but assembly is also *very* rewarding. This forum has a good mix of assembly programmers too (I am a speed/performance freak, others are more balanced, and a few will use whatever works). For me helping people is a lot of fun, so I hope you have tons of questions! :-)

Some of us even (try to) use what doesn't work! :-o hehehehe

 

The biggest challenge for me long ago was understanding how the various pieces fit together and what was needed to replicate something as simple as clearing the screen or gathering input from a device. I remember studying the EA manual trying to make heads or tails of what I was "reading". It wasn't until I had some existing code to work with and change that things started making sense. Now the challenge is finding time to code...

Link to comment
Share on other sites

  • 1 month later...

I'm new to TMS9900 assembler (bought my second TI-99/4A a few months ago after a 30-year break) and I have been reading this thread with great interest. I only had a console with XB and a tape recorder those many years ago - no E/A, 32K, PEB, disks, etc. - and it's really exciting to come back and explore all the things I missed (thanks to the nanoPEB). As my first project I have managed to write a pretty decent bitmap mode line drawing routine in assembler following the advice in this thread. However, I have a mixed bag of questions that I hope someone can answer:

 

1. The VDP read routine provided by Matthew in this thread does not have the delay prescribed in the E/A manual between sending low and high byte. Is this a mistake or is it not really necessary? I know the F18A does not need the delay, and since I have one of those great boards installed in my console it's difficult to test. How about emulators, do they care about the delay? Is there a window after vsync where the delay is not needed?

 

VSBR   MOVB @R0LB,@VDPWA        * Send low byte of VDP RAM read address
       MOVB R0,@VDPWA           * Send high byte of VDP RAM read address
       MOVB @VDPRD,R1           * Read byte from VDP RAM
       B    *R11

 

 

2. Is there a clever way to read a VDP byte, modify it (e.g. OR with a bit mask), and write it back without having to set the write address twice? (like temporarily disabling auto-increment)

 

3. There don't seem to be any bit instructions in the instruction set. If I want to set bit n (variable) in a register I can set it to >8000 and shift right (n-1) times. Is there a faster way?

 

4. Is immediate addressing, e.g. LI R1,>0001, faster than memory addressing, e.g. MOV @ONE,R1? What if ONE is in scratch pad?

 

5. Are byte instructions, e.g. MOVB, faster than word instructions, e.g. MOV?

 

6. Is it possible to construct a full screen monochrome bitmap mode with double buffering (by flipping between two character tables)? My research so far says no, but I thought I would ask the experts. This old post is the most promising I have found, but according to Thierry's tech page the described mode is not available: https://groups.google.com/forum/?hl=en&fromgroups=#!topic/comp.sys.ti/r1ZZPxkqZwI

 

7. Can anyone point me to a good alternative to the KSCAN routine? The one in the ROM is pretty lousy as far as I understand.

 

8. What is the easiest way to place a small routine in scratch pad? I don't suppose you can use AORG?

 

Thanks,

Rasmus

Link to comment
Share on other sites

...

3. There don't seem to be any bit instructions in the instruction set. If I want to set bit n (variable) in a register I can set it to >8000 and shift right (n-1) times. Is there a faster way?

 

ORI or SOC should work.

 

4. Is immediate addressing, e.g. LI R1,>0001, faster than memory addressing, e.g. MOV @ONE,R1? What if ONE is in scratch pad?

 

LI is a good bit faster on a bus of the same width. Scratch pad access is faster for the same instruction because it is on the sixteen-bit bus. I don't know how LI on an 8-bit bus compares to MOV on a 16-bit bus.

 

5. Are byte instructions, e.g. MOVB, faster than word instructions, e.g. MOV?

 

Only for indirect autoincrementing access of workspace registers, I believe.

 

...

8. What is the easiest way to place a small routine in scratch pad? I don't suppose you can use AORG?

 

You pretty much have to copy it there, I think.

 

...lee

Link to comment
Share on other sites

1. The VDP read routine provided by Matthew in this thread does not have the delay prescribed in the E/A manual between sending low and high byte. Is this a mistake or is it not really necessary?

 

Through experimentation, a lot of people have found that the delay isn't necessary. Code I've transferred to my CF7 reader on my actual TI has run just fine without it.

 

3. There don't seem to be any bit instructions in the instruction set. If I want to set bit n (variable) in a register I can set it to >8000 and shift right (n-1) times. Is there a faster way?

 

Yeah, there are no direct bit operations. What you can do if you want to affect a single bit in a word is use R0 as a rotation value and do SRC to move a bit to the correct location. Then you can do a SOC operation.
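A rough sketch of that SRC plus SOC idea, assuming the bit number n (0 to 15, counted the TI way with bit 0 as the leftmost bit) is in R0 and the target word is in R1. The label SETBIT and the use of R2 as scratch are my own choices for illustration:

```asm
* Set bit n of R1, where n is in R0. R2 is used as scratch.
SETBIT LI   R2,>8000        Mask with only bit 0 (leftmost) set
       SRC  R2,0            Count field 0: rotate by the low 4 bits of R0
       SOC  R2,R1           OR the mask into R1
       B    *R11
```

When the low four bits of R0 are 0 the rotate count becomes 16, and a circular shift by 16 leaves the mask at >8000, so n = 0 needs no special case. Swapping SOC for SZC clears the bit instead of setting it.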

 

Alternatively, you can just process each bit one after the other by using shift operations to move them to the left, so that when you get a carry bit you perform an action.
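As an illustration of the bit-by-bit approach, here is a small sketch that counts the 1 bits in R1. The "action" is simply INC R3; the label names and register choices are assumptions made for the example:

```asm
* Count the 1 bits in R1; result in R3. R1 and R2 are destroyed.
BITCNT CLR  R3              Number of 1 bits found so far
       LI   R2,16           Sixteen bits to examine
BITLP  SLA  R1,1            Leftmost bit moves into the carry flag
       JNC  BITNXT          Carry clear: the bit was 0
       INC  R3              Carry set: perform the action
BITNXT DEC  R2
       JNE  BITLP
       B    *R11
```

The JNC has to follow the shift directly, since DEC also changes the carry flag.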

 

4. Is immediate addressing, e.g. LI R1,>0001, faster than memory addressing, e.g. MOV @ONE,R1? What if ONE is in scratch pad?

 

Absolutely it's faster. The first instruction involves no memory fetching, the immediate value follows the opcode. The second requires it to fetch the value at address @ONE first, then perform the move action. Even if the value is in the scratchpad, you're still doing an extra memory operation.

 

5. Are byte instructions, e.g. MOVB, faster than word instructions, e.g. MOV?

 

No real difference between the two; both are using internal registers to deal with the even/odd addressing problems.

 

6. Is it possible to construct a full screen monochrome bitmap mode with double buffering (by flipping between two character tables)? My research so far says no, but I thought I would ask the experts. This old post is the most promising I have found, but according to Thierry's tech page the described mode is not available: https://groups.googl....ti/r1ZZPxkqZwI

 

Hm... for regular bitmap mode, it's not possible because you have to put the other pattern table SOMEWHERE, and there's really no room anywhere that won't overlap the color table.

 

However, you could probably do this with the bitmap text mode hybrid. Since it completely ignores the color table and gets the base colors from VDP register 7 instead, you could place a second set of character patterns in its place.

 

The only downside here is that you only have 240 pixels to work with, and have to deal with the last two bits of each character being ignored.

 

Adamantyr

Link to comment
Share on other sites

1. The VDP read routine provided by Matthew in this thread does not have the delay prescribed in the E/A manual between sending low and high byte. Is this a mistake or is it not really necessary? I know the F18A does not need the delay, and since I have one of those great boards installed in my console it's difficult to test. How about emulators, do they care about the delay? Is there a window after vsync where the delay is not needed?

 

It also depends on the location of the code. For instance, in 16-bit memory (ROM or scratch pad or internal 32K or TMS9995-internal RAM) the instructions require fewer cycles, so this may overrun the VDP. I know that in the Geneve you have to use delays.

 

2. Is there a clever way to read a VDP byte, modify it (e.g. OR with a bit mask), and write it back without having to set the write address twice? (like temporarily disabling auto-increment)

 

No, each read automatically increases the memory pointer of the VDP.
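So a read-modify-write has to send the address twice. A sketch of what that could look like, reusing the names from the VSBR routine quoted earlier. The port addresses are the standard console ones; R0LB assumes the workspace sits at >8300, and the label VORB and the use of R2 for the mask are made up for this example:

```asm
VDPRD  EQU  >8800           VDP read data port
VDPWD  EQU  >8C00           VDP write data port
VDPWA  EQU  >8C02           VDP address port
R0LB   EQU  >8301           Low byte of R0, assuming workspace at >8300
* OR the mask in the high byte of R2 into the VDP RAM byte at address R0
VORB   MOVB @R0LB,@VDPWA    Low byte of the read address
       MOVB R0,@VDPWA       High byte, top two bits 00 = read
       MOVB @VDPRD,R1       Read the byte (VDP pointer now at R0+1)
       SOCB R2,R1           OR in the mask
       ORI  R0,>4000        Set the write flag in the address
       MOVB @R0LB,@VDPWA    Send the same address again...
       MOVB R0,@VDPWA       ...this time for writing
       MOVB R1,@VDPWD       Write the modified byte back
       ANDI R0,>3FFF        Restore R0 to the plain address
       B    *R11
```

Like the routine it is modelled on, this sketch contains no delay instruction. Whether one is needed depends on the points raised under question 1.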

 

3. There don't seem to be any bit instructions in the instruction set. If I want to set bit n (variable) in a register I can set it to >8000 and shift right (n-1) times. Is there a faster way?

 

As already said you should use SOC for this purpose. BTW, SZC can be used to clear those bits that were set with SOC.
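A hedged sketch of SOC and SZC with a variable bit number. Note that it swaps the shift for a lookup table, which is my own substitution and not something proposed in this thread. The bit number n (0 to 15, bit 0 = leftmost) is assumed to be in R3, because R0 cannot serve as an index register on the 9900; the labels are hypothetical:

```asm
* Set or clear bit n of R1; n is in R3 and is turned into a table offset
SETBN  SLA  R3,1            n*2 = word offset into the table
       SOC  @MASKS(R3),R1   Set the bit
       B    *R11
CLRBN  SLA  R3,1            n*2 = word offset into the table
       SZC  @MASKS(R3),R1   Clear the bit
       B    *R11
MASKS  DATA >8000,>4000,>2000,>1000
       DATA >0800,>0400,>0200,>0100
       DATA >0080,>0040,>0020,>0010
       DATA >0008,>0004,>0002,>0001
```

The table costs 32 bytes. Whether it actually beats a shift depends on where code and table live, so it is worth timing on the real machine.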

 

4. Is immediate addressing, e.g. LI R1,>0001, faster than memory addressing, e.g. MOV @ONE,R1? What if ONE is in scratch pad?

 

Immediate values require a memory cycle of their own to be read. For the special values 0 and FFFF you should use CLR or SETO, which are faster. Being in scratch pad only means that the time-multiplexed data bus operation is suspended, so there are fewer cycles.

 

5. Are byte instructions, e.g. MOVB, faster than word instructions, e.g. MOV?

 

No. Byte operations always require the CPU to load the full word first. For instance, if you want to write 01 to address A000, you expect that address A001 remains the same. However, the CPU is 16-bit, it cannot move half a word out of the ALU. Accordingly, what it does is to load the complete word at A000 first, keep the value of A001 in an internal storage, modify the byte at A000, and write the complete word back.

 

The TMS9995 (in the Geneve) is more flexible here because it has only 8 databus lines. It can indeed address single bytes and therefore never needs to read before the write. (In this case, fewer data bus lines actually make the architecture faster!)

 

7. Can anyone point me to a good alternative to the KSCAN routine? The one in the ROM is pretty lousy as far as I understand.

 

Why? I'd recommend staying with the standard key scan routine as long as you don't have tight timing constraints. Using the standard routine ensures that systems with a different keyboard (like the Geneve) will run your program. If you start to work with bare CRU lines this compatibility is gone.

 

8. What is the easiest way to place a small routine in scratch pad? I don't suppose you can use AORG?

 

I never tried, but you are likely to interfere with the loader. As long as the Editor/Assembler cartridge is running (and GPL is executing), the scratch pad is heavily in use. AORG is only useful for tagged object code, and once you switch to memory image code you don't have it anymore. The standard is to copy a routine from some location in your program into the scratch pad.

Link to comment
Share on other sites

 

7. Can anyone point me to a good alternative to the KSCAN routine? The one in the ROM is pretty lousy as far as I understand.

 

8. What is the easiest way to place a small routine in scratch pad? I don't suppose you can use AORG?

 

Thanks,

Rasmus

 

 

You can enter the KSCAN routine directly (in case you want a stand-alone EA5 program), and it would be quite a bit of coding to reproduce what it does, probably at no advantage. If you want to work with the joysticks and not the keyboard, then that is another story. You can gather all your joystick data with a pretty small chunk of code. The TI Tech Pages has a good example.

 

Loading into the scratch pad needs to be done in your program somewhere. As Michael said if you try to load it directly from the disk then you will bonk the EA (or any other most likely) loader. You are on the right track by AORGing the code. A little trick you can do...

 

1) Write your routine with no references outside the code that is going into the scratch pad, unless they are absolute (i.e. >8400 etc.).

2) Assemble this routine separately from the main code and write an assembly listing to the disk drive. This will create a DV80 file that you can edit. The far right column of the listing will contain the machine code you need to poke into the scratch pad as hex data.

3) Edit this file, removing everything but the hex data. Add DATA statements and hex markers (>) in front of the data. Add a label at the start and stop of the code and re-save it as source.

4) Use the COPY directive to include it in your main listing.

5) At program init you will need to move the machine code from wherever it is located (start label) to the scratch pad area. The label after the data will be your stopping point, so you don't have to reinvent your poke machine when you make changes.

 

It can be kind of tedious and confusing but is worth the effort in the end.
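The steps above might end up looking roughly like this. It is only a sketch: the two-instruction routine, the destination >8320 (assumed free, with the workspace at >8300) and all label names are placeholders, and the DATA lines are written inline here rather than pulled in with COPY:

```asm
FAST   EQU  >8320           Assumed free scratch pad address
* At program init: copy the routine into scratch pad, then call it
INIT   LI   R0,FSTBEG       Source: start label
       LI   R1,FAST         Destination in scratch pad
CPYLP  MOV  *R0+,*R1+       Copy one word
       CI   R0,FSTEND       Reached the stop label?
       JL   CPYLP           No, keep copying
       BL   @FAST           Call the copy in scratch pad
* ... rest of the program ...
* Machine code taken from the listing of the separately assembled routine
FSTBEG DATA >0583           INC  R3
       DATA >045B           B    *R11
FSTEND
```

Because the copy loop stops at FSTEND, the routine can grow or shrink without touching the loop.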

Link to comment
Share on other sites
