Smooth scrolling


Asmusr

 

Another question is, would this work on an unexpanded TI or would it require a 32K upgrade?

 

Hi David,

 

Holiday greetings and happy new year to you too.

 

The limited number of characters is definitely a big issue when smooth scrolling on the TI, but in Graphics I mode you can switch between up to 8 different pattern tables in VDP RAM (by setting a VDP register), giving you up to 2048 available characters. However, since you also need VDP RAM for name tables etc., in practice you would settle for either 8x128 characters or 4x256 characters. I used this technique in TI Scramble, Road Hunter and Flappy Bird.
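To make the arithmetic above concrete, here is a minimal Python sketch of the VRAM budget for pattern-table switching. It assumes the standard TMS9918A layout (16K of VDP RAM, 8 bytes per character, VDP register 4 selecting the pattern table base in 2K steps); the function names are made up for illustration and are not taken from any of the games mentioned.

```python
# Sketch only: VRAM budget for pattern-table switching in Graphics I mode.
# Function names are hypothetical; the constants describe the TMS9918A.

VRAM_SIZE = 0x4000            # 16K of VDP RAM
PATTERN_TABLE_STEP = 0x0800   # register 4 selects the table base in 2K steps
PATTERN_TABLE_REGISTER = 4
BYTES_PER_CHAR = 8


def pattern_table_register(base_address):
    """Return the value for VDP register 4 given a 2K-aligned table base."""
    if not 0 <= base_address < VRAM_SIZE:
        raise ValueError("address outside VDP RAM")
    if base_address % PATTERN_TABLE_STEP:
        raise ValueError("pattern table base must be 2K aligned")
    return base_address // PATTERN_TABLE_STEP


def register_write_bytes(register, value):
    """Return the two bytes sent to the VDP control port to set a register:
    first the value, then >80 ORed with the register number."""
    return (value & 0xFF, 0x80 | (register & 0x07))


def vram_used(tables, chars_per_table):
    """Bytes of VDP RAM taken up by the pattern data alone."""
    return tables * chars_per_table * BYTES_PER_CHAR


# 8 tables of 256 characters would fill all 16K, leaving no room for name tables.
full = vram_used(8, 256)
# Both practical layouts use 8K, leaving the other 8K for everything else.
practical = (vram_used(8, 128), vram_used(4, 256))
```

Switching to another pattern table then costs only the two control-port bytes returned by `register_write_bytes(PATTERN_TABLE_REGISTER, n)`, which is why it is so much cheaper than uploading patterns.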

 

In Graphics II (bitmap) mode you don't have the option of switching pattern tables, but you do have 256 characters available for each third of the screen. I used this in my mini Sports game. Alternatively you can upload the characters to the VDP on the fly, but that's much slower, so if you want to scroll at full speed (60 FPS) you're limited to around 70 characters per frame. I used this in Titanium and Bouncy (neither of which scrolls at full speed).
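As a rough illustration of the bitmap-mode layout, the Python sketch below computes where a character's pattern lives for a given screen row, and how many pattern bytes an on-the-fly upload of 70 characters amounts to. It assumes the usual Graphics II arrangement of three 2K pattern blocks, one per group of 8 character rows; the default `pattern_base` of >0000 and the choice to count pattern bytes only (no color table bytes) are assumptions for the example.

```python
# Sketch only: pattern addressing in Graphics II (bitmap) mode.
# pattern_base and the function names are assumptions for illustration.

ROWS_PER_THIRD = 8
SCREEN_ROWS = 24
BYTES_PER_CHAR = 8
THIRD_TABLE_SIZE = 256 * BYTES_PER_CHAR   # 2K of patterns per screen third


def third_of_row(row):
    """Return 0, 1 or 2 for the top, middle or bottom third of the screen."""
    if not 0 <= row < SCREEN_ROWS:
        raise ValueError("row must be 0..23")
    return row // ROWS_PER_THIRD


def pattern_address(row, char, pattern_base=0x0000):
    """VDP address of the 8 pattern bytes used by `char` on screen row `row`."""
    if not 0 <= char <= 255:
        raise ValueError("char must be 0..255")
    return pattern_base + third_of_row(row) * THIRD_TABLE_SIZE + char * BYTES_PER_CHAR


def upload_bytes_per_frame(chars, bytes_per_char=BYTES_PER_CHAR):
    """Pattern bytes pushed to the VDP when `chars` characters change in a frame."""
    return chars * bytes_per_char


# About 70 characters per frame corresponds to 560 pattern bytes per frame.
per_frame = upload_bytes_per_frame(70)
```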

 

To get perfectly smooth scrolling you also need double buffering, because if you update characters (name table) and patterns at the same time you get visible artifacts on the screen. I usually achieve double buffering by splitting the character set in two and only showing one part on the screen at any time (while the other part is being updated in an off-screen buffer).
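The split-character-set idea can be sketched in a few lines of Python. This is a simplified model, not the actual game code: the pattern table and name table are plain Python lists standing in for VDP RAM, the class and method names are hypothetical, and the name table is swapped in one step, whereas real code has to write it to the VDP.

```python
# Sketch only: double buffering by splitting the 256 characters into two halves.
# Lists stand in for VDP RAM; all names are hypothetical.

SCREEN_CELLS = 32 * 24
HALF = 128


class SplitCharsetBuffer:
    def __init__(self):
        self.patterns = [bytes(8)] * 256      # stand-in for the pattern table
        self.name_table = [0] * SCREEN_CELLS  # stand-in for the name table
        self.visible_half = 0                 # 0: chars 0-127 shown, 1: chars 128-255

    def hidden_base(self):
        """First character number of the half that is not on screen."""
        return HALF if self.visible_half == 0 else 0

    def upload_hidden(self, new_patterns):
        """Write up to 128 patterns into the hidden half only."""
        base = self.hidden_base()
        for i, pattern in enumerate(new_patterns[:HALF]):
            self.patterns[base + i] = bytes(pattern)

    def flip(self, layout):
        """Point every screen cell at the freshly updated half and make it visible.

        `layout` holds one index in 0..127 per screen cell.
        """
        base = self.hidden_base()
        self.name_table = [base + (cell % HALF) for cell in layout]
        self.visible_half = 1 - self.visible_half


buf = SplitCharsetBuffer()
buf.upload_hidden([b"\xff" * 8])   # lands in character 128, off screen
buf.flip([0] * SCREEN_CELLS)       # screen now shows characters 128-255
```

Because the patterns on screen are never the ones being rewritten, the viewer only ever sees a complete frame.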

 

Magellan is a very nice tool for working with scrolling graphics. It has a tool to count the number of characters you need. There are also examples of assembly code included.

 

In theory you don't need 32K for smooth scrolling if you have enough ROM and the scrolling map is static, but for games where the map can change, e.g. because you can pick up items, you very quickly run out of scratchpad RAM for storing the changes.

 

Rasmus


Thanks for your prompt reply. Could you kindly send me any notes you have that explain your scrolling techniques, especially for the source code that you kindly made public about 2 years ago?


I posted some notes here:

http://atariage.com/forums/topic/210888-smooth-scrolling/page-1?do=findComment&comment=2754421


It uses >42B2 bytes.

 

I have attached my notes, I warned you they were long! Please let me know if anything is unclear or incorrect, or you have suggestions for how the algorithm could be improved.

 

These notes are simply great, well done Rasmus. I am amazed at how talented and resourceful you became in a short period of time. Did you get special training in assembly language prior to starting on the TI, or do you simply learn quickly by reading the material you find? I think your background extends beyond programming, as your deep understanding of math and logic is outstanding. You even go into the bandwidth of how many bytes the TI handles per frame, a detail which I never looked up before but which came naturally to you to research or calculate before you started coding. The assembly language both for this and for Sabrewulf, which I downloaded to study, is so clean and perfectly formatted that it would take a normal person months to come up with such a polished product. Well done, I am honoured that I have the chance of discussing code with you. I hope you are a top-paid Senior Developer working for Google/Microsoft/Apple; at least then I would understand why in your spare time you can afford to do such great work for nothing, effortlessly.

 

All the best for 2016.


  • 1 year later...

My other smooth scrolling demos have relied on pattern reuse in some form, but here's an example of a 'brute force' approach.

 

 

This routine is scrolling a generic bitmap of 128x512 pixels.

 

To support double buffering only half of the patterns are updated each time the screen is scrolled, and I'm alternating between showing a screen with characters 0-127 and another with characters 128-255.

 

Actually it's only almost half, because I need to reserve a few characters for the non-scrolling part of the screen. In the top and bottom thirds these characters can be reserved by not scrolling the first and the last character rows. In the middle I have to narrow the scrolling region. Alternatively I could have limited the whole scrolling region to 15 columns.
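The character budget described above can be checked with a little arithmetic. The Python sketch below assumes that the 128-pixel-wide bitmap occupies 16 character columns, that each screen third has 256 characters and 8 character rows, and that two pattern buffers are kept per third; these are inferences from the description rather than facts read from the demo's source, and the function names are made up.

```python
# Sketch only: characters left over per screen third once two pattern
# buffers for the scrolling region are allocated. Assumes a 16-column
# (128 pixel) scrolling region and 8 character rows per third.

CHARS_PER_THIRD = 256
ROWS_PER_THIRD = 8
BUFFERS = 2


def scrolling_chars(rows, columns):
    """Characters needed for one buffer of the scrolling region in a third."""
    return rows * columns


def reserved_chars(rows, columns):
    """Characters left in a third for the non-scrolling part of the screen."""
    return CHARS_PER_THIRD - BUFFERS * scrolling_chars(rows, columns)


# Full 8 rows x 16 columns uses all 256 characters: nothing left to reserve.
full_third = reserved_chars(ROWS_PER_THIRD, 16)
# Top and bottom thirds: dropping one character row frees 32 characters.
outer_third = reserved_chars(ROWS_PER_THIRD - 1, 16)
# Middle third: narrowing the region to 15 columns frees 16 characters.
middle_third = reserved_chars(ROWS_PER_THIRD, 15)
```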

 

I think the speed is OK for a slow-moving background in a shoot 'em up game, for instance, but are 15 or 16 character columns enough? Probably not, so I'm thinking of a way to buffer individual rows instead. This will not look as pretty because the whole screen will not update at the same time, and scrolling more columns will be slower, of course.

 

Bitmaps take up a lot of space, so in a game the scrolled graphics could be generated from a tile set and a map instead.

bmpscroll.dsk

bmpscroll8.bin


What else is being stored in VDP?

I mean 1 Meg of SAMS RAM can hold one hell of a lot more than 16K of VDP?

 

Even with how slow RXB is with GPL, I can switch 2079 bytes with the GPL MOVE command so fast that you cannot see what is going on.

So assembly should just be invisible to the naked eye.


What could you do using SAMS instead?

 

SAMS could be used for storing large bitmaps or lots of tiles, but so could a ROM cartridge. SAMS wouldn't speed up the scrolling because the data transfer is already unidirectional (from CPU memory to VDP memory) as Matthew explained.

 

With a few changes the demo could run on an unexpanded console from ROM, but if it were tile based instead of bitmap based it would slow things down if we didn't have CPU RAM for a screen buffer.


How long does it take for SAMS vs Cartridge to swap out 32K at a time?

 

I know for a fact that the SAMS can do it much faster than 8K at a time from a Cart.

 

This would also magnify when talking VDP as 2 pages of 8K from a Cart is much slower than 16K from SAMS.

 

I know you guys love Carts but they have limitations the SAMS does not have. i.e. The SAMS does not need to move from 8K cart to location.

 

And why all this fascination with CONSOLE ONLY? It is like Model T vs Jet car.


I think the console-only fascination is that it allows the maximum number of users to actually use the newly developed software, which is especially useful for folks that are new to (or just returning to) the TI. SAMS cards are a nice thing, which is why I made sure they were available again, but they aren't the answer to all problems when it comes to memory use. As long as you're using a 9918 VDP you have to deal with the limitations of the VDP memory. 32K, large cartridges, or SAMS can mitigate some of the issues, but not all. I love seeing the things Rasmus does here in these demos, as he's really pushed the TI to its limits, something that goes way beyond what folks were doing even 10 years ago.


If you can achieve some masterwork on a console-only or a "classic" configuration (+ 32K + floppy), it gives you a convincing proof that the TI was better than most people thought, including us, and that its problem was simply a lack of adept programmers.

 

It does make sense to consider both ways, classic and expanded, but the classic way should also receive its proper appreciation, although or because it abstains from the impressive expansions that we have.


As mentioned earlier, I also believe it gives people coming in to the hobby a decent path of new things to follow.
With a plain Jane console, they can start with an FR99 and enjoy the classics as well as new programs, then from there it's one more thing. Always, just one more thing... :)


How is SAMS faster at memory access than cartridge ROM?

 

Unless what you are spanning fits in 32K, you will be, on average, bank switching every 4K instead of every 8K. And to my understanding, a SAMS bank switch operation is a couple of CRU instructions and a memory write, whereas a cartridge bank switch is just a single memory write.

 


 

-M@


I love seeing the things Rasmus does here in these demos, as he's really pushed the TI to its limits, something that goes way beyond what folks were doing even 10 years ago.

 

Same here! I'm hoping for a continuation of the Castle Wolfenstein project. With more than half the users here already using an F18A, and a good portion of us having 1 meg SAMS cards and FR99s, a game such as this would have a wider audience (of active users) than in years past. In fact I'm willing to bet a game like that would DRIVE HARDWARE SALES for the TI. Heck, a game like that would be just PERFECT for the FinalGROM99 when it's finally released.


How is SAMS faster at memory access than cartridge ROM?

 

...

 

It is not. The 8-bit bus access on the 99/4A is consistent no matter what address range is being accessed. But even the 16-bit ROM and scratch-pad are the same in access time (going from memory, I might need to verify this). The memory cycle on the 9900, even without wait-states, is pretty slow anyway.


I should explain: I meant switching 8K at a time in a cart vs switching 32K at a time, as the SAMS can do.

 

Seems logical that having to repeat switching a smaller 8K more times vs 32K at a time would just take more time to do the same exact thing.


It might, depending on how the data was laid out and such. But switching a bank is literally one MOV instruction. Plus, you can only push so much data to the VDP per-frame, so you are going to run out of VDP bandwidth before you run out of ROM and need to bank-switch. The overhead of bank-switching on 8K blocks vs 32K blocks is going to be too small to measure.


I agree. I've never used the SAMS thing. Does it bank switch based on writing something to a specific memory address?

My own internal to the console 64 K RAM memory design bank switches by CRU bits. I wanted a CRU bit to control the hardware wait state for VDP access as well, and since I have eight output bits, I couldn't associate one bit with each 8 K RAM segment. What I did was associate the 8 K RAM segments between >0000 and >9FFF with one CRU bit each, and then the remaining 24 K RAM at >A000 to >FFFF with one single CRU bit.

When all CRU bits are reset to zero (the default state), their output is inverted for the memory segments that make up the normal memory expansion. Thus the memory at >2000 - >3FFF and >A000 - >FFFF is enabled, but the rest is disabled. This means the console starts normally, with a 32 K RAM expansion available that takes two instead of six clock cycles to access a word.

When you set the memory segment bits, the segments that are normally not RAM page in RAM in their address space, while the segments that are normally RAM expansion page out their RAM and let you access a standard 32 K RAM expansion, if there is one.

The advantage of CRU bit selection is that no special memory address needs to be reserved to control bank switching, and you can flip all the switches with one LDCR instruction. As long as you don't switch yourself out of the address space, it's fine to do that.
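The enable logic described above can be modelled roughly as follows. This Python sketch is only an illustration of the inversion rule: the bit numbering is hypothetical (the post does not say which CRU bit controls which segment), and the wait-state bit is left out.

```python
# Sketch only: which internal RAM segments are enabled for a given CRU bit
# pattern. Bit numbers are hypothetical; the wait-state bit is omitted.

SEGMENTS = {
    0: (0x0000, 0x1FFF),
    1: (0x2000, 0x3FFF),   # part of the normal 32K expansion
    2: (0x4000, 0x5FFF),
    3: (0x6000, 0x7FFF),
    4: (0x8000, 0x9FFF),
    5: (0xA000, 0xFFFF),   # part of the normal 32K expansion (24K, one bit)
}

# Segments whose control bit is inverted, so they are enabled when the bit is 0.
INVERTED_BITS = {1, 5}


def internal_ram_enabled(cru_bits):
    """Return the address ranges where the internal RAM is paged in."""
    enabled = []
    for bit, address_range in SEGMENTS.items():
        is_set = bool((cru_bits >> bit) & 1)
        if is_set != (bit in INVERTED_BITS):
            enabled.append(address_range)
    return enabled


# Default state: all bits zero gives the ordinary 32K expansion layout.
default_map = internal_ram_enabled(0b000000)
# Setting only the non-expansion bits gives 64K of contiguous RAM.
all_ram = internal_ram_enabled(0b011101)
```

In this model a single value passed to `internal_ram_enabled` plays the role of the operand of one LDCR instruction, which is what makes flipping every segment at once possible.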

 

Personally, as far as we limit ourselves to memory expansions built inside the console, I think my design is superior to everything else I've seen.

  1. It gives you a fast 32 K RAM expansion.
  2. It makes it possible to get a 64 K contiguous RAM in the machine, if you want to. But you can't access the VDP or any other such thing as long as you have all that memory enabled, of course. You can then use the 8 K RAM covering the memory mapped devices as a buffer, for example.
  3. While making RAM possible everywhere, it still allows you to access all standard devices in the machine. Even the standard 32 K RAM expansion can co-exist with this internal memory, providing another 32 K RAM for buffering, making a small RAM disk or whatever.
  4. Being able to copy console ROM to RAM makes it possible to redefine interrupt vectors and such stuff, if you want to make full use of the interrupt capability in the machine. For a terminal emulator, for example, using the TMS 9902 interrupt and servicing it in the fastest way possible.

Nowadays 64 K RAM extra isn't much to brag about, but remember that this design comes from around 1985. If the internet had been available at that time, then maybe enough people would have liked it for there to be an interest in writing software using it. But there are only three consoles I know of that have exactly this same modification.


SAMS is bank switching 4K at a time using 16 memory mapped registers starting at >4000. It's a shame you can only switch pages into the ordinary RAM areas. Being able to switch into >5000, >6000 and >7000 areas (at least) would have been nice.
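As a rough sketch of the register layout mentioned above, the Python function below maps a CPU address to the mapper register that controls its 4K page. It assumes one 16-bit register per 4K page, spaced two bytes apart from >4000 and indexed by the top four address bits; that spacing is an assumption for the example, and the function name is hypothetical. Only the ordinary RAM areas are accepted, matching the limitation described.

```python
# Sketch only: which SAMS mapper register controls a given CPU address.
# Assumes 16 word-sized registers from >4000, one per 4K page.

SAMS_REGISTER_BASE = 0x4000
PAGE_SIZE = 0x1000
MAPPABLE_RANGES = [(0x2000, 0x3FFF), (0xA000, 0xFFFF)]  # ordinary RAM areas


def sams_register_address(cpu_address):
    """Return the address of the mapper register for the page at cpu_address."""
    if not any(low <= cpu_address <= high for low, high in MAPPABLE_RANGES):
        raise ValueError("address is not in a SAMS-mappable RAM area")
    return SAMS_REGISTER_BASE + (cpu_address // PAGE_SIZE) * 2


low_ram_register = sams_register_address(0x2000)
high_ram_register = sams_register_address(0xA000)
```

Under these assumptions the 32K expansion spans eight 4K pages, so filling it from a larger data set means updating up to eight registers rather than one.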


It was a long time ago that I worked on modifying the 99/4A. I have to dig into the schematics to be sure, but it may be impossible to overlay areas other than >2000 - >5FFF and >A000 - >FFFF from the expansion box. My memory expansion is inside the console, so it can catch the memory decoding and reroute it before it reaches the chips it would normally activate.

 

But, again, I don't remember this for sure.

 

Checking a bit, I see that the SAMS card uses a memory mapper chip, a version of the old 74LS612. Still, it would at least be possible to design the card in such a way that it would also present a 4K page at >5000 - >5FFF. That would make the design of the card more complex, though: if you provide that capability, you need to have 32 K visible at all times (corresponding to the normal expansion) and another 4 K visible only when the card is enabled, since the memory at >5000 - >5FFF would overlap other DSR programs.

Edited by apersson850