
"Bus stuffing" like The Graduate.


kskunk

Hi guys,

 

Thanks to Curt's great archiving work, we have the designer's notes on The Graduate peripheral for the 2600:

http://www.atarimuseum.com/videogames/consoles/2600/a3000.html

 

They invented a 3-cycle "Bus Stuff" mode, to achieve an even faster TIA register update rate than the Harmony's 5-cycle/DPC+ "Fast Fetch" mode.

 

This works by loading Y with $FF at the beginning of the kernel, and then having the 6507 execute 3-cycle STY $REG instructions. At the critical moment when the $FF is being written, The Graduate hardware steps in and overdrives the bus with the desired value. This avoids the extra 2-cycle LDA $VALUE used by Harmony.
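To put the 3-cycle versus 5-cycle difference in perspective, here is a rough upper bound on register writes per scanline. It is only a sketch: it assumes the usual 76 CPU cycles per scanline and ignores all loop and WSYNC overhead, so real kernels will manage fewer.

```python
# Upper bound on TIA register writes per scanline for each technique.
# Assumption: 76 CPU cycles per scanline, no loop or WSYNC overhead.
CYCLES_PER_SCANLINE = 76


def max_writes(cycles_per_write):
    """Whole number of back-to-back writes that fit in one scanline."""
    return CYCLES_PER_SCANLINE // cycles_per_write


bus_stuff = max_writes(3)    # STY $REG only
fast_fetch = max_writes(5)   # LDA $VALUE (2 cycles) + STA $REG (3 cycles)

print(bus_stuff, fast_fetch)
```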

 

Even though it seems pretty evil to overdrive the 6507's bus, the designers knew it was fairly safe because the NMOS 6507 used pull-up resistors to drive 1s on the bus, which could be grounded to 0s without overheating the 6507.

 

I'm sure this technique has been discussed before, but I couldn't find any actual hardware that could do it.

 

Has anyone tried "bus stuffing"? Can the Harmony do it, or are there resistors in the way?

 

- KS

The Harmony can do this, and we have talked about it in the past. Some of the conversations are private so I can't provide links, but I will mention some of the things we talked about.

 

Anyway, the 6507 uses transistors rather than resistors to drive high. I understand they are fairly weak, but it's not clear how weak exactly. While it seems that the 6507 survives just fine with occasional bus contention, I'm not sure how well it would fare with bus contention on every third cycle. It wouldn't be too hard to test this - I could just create a simple routine that stuffed zeros every third cycle and had an occasional JMP to keep the PC from running off the edge of cart space, and run it for a while and see how hot things get. I didn't see any test results from the Graduate pdf that Curt posted, and while it's likely they did testing and were satisfied with the results, it's also possible that the Graduate prototypes fried some CPUs and that may have led to its failure.


The Harmony can do this, and we have talked about it in the past.

Has anyone tried it?

 

Anyway, the 6507 uses transistors rather than resistors to drive high. I understand they are fairly weak, but it's not clear how weak exactly.

Right, sorry for oversimplifying it. The usual circuit design in NMOS uses a single depletion mode transistor per net, to pull the net up to 5V if it is otherwise floating. Any of the enhancement mode transistors on the same net can safely pull the net down to ground without straining the depletion mode pull-up.

 

I'm curious how much The Graduate designers knew. Were they being reckless? Or did they know the structure of the 6507 output driver? If it is designed with a depletion mode pull-up, then their design should be perfectly safe. Just adding another (open-collector) driver off-chip would be no different than using the on-chip one.

 

Maybe I should ask the Visual 6502 guys what the output drivers look like. In NMOS, you can build a push-pull driver with enhancement mode transistors on the high side. This would not be as happy being switched off. You would be able to see that on a scope because the logic 1s would be closer to 4V than 5V.

 

It's also possible that the Graduate prototypes fried some CPUs and that may have led to its failure.

Yeah, that would put a quick end to the approach. :(

 

- KS

Yes, the Visual 6502 guys would probably know.

 

Going by the datasheet, I am guessing the output drivers are weak. I believe each data line is rated to source 100uA at 2.4V and can sink around 2mA at around 0.5V. This suggests the high-side drivers have fairly high impedance and might survive being overdriven.
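As a back-of-the-envelope check on those figures, the rated currents can be turned into equivalent output resistances. This is a sketch only: it assumes a 5 V supply, treats the drivers as linear resistors (which real NMOS pull-ups are not), and uses the recalled datasheet numbers quoted above rather than verified ones.

```python
VCC = 5.0  # assumed supply voltage in volts


def source_impedance(v_out, i_out, vcc=VCC):
    """Equivalent resistance of the high-side driver while sourcing i_out."""
    return (vcc - v_out) / i_out


def sink_impedance(v_out, i_out):
    """Equivalent resistance of the low-side driver while sinking i_out."""
    return v_out / i_out


r_high = source_impedance(2.4, 100e-6)  # 100 uA at 2.4 V
r_low = sink_impedance(0.5, 2e-3)       # 2 mA at 0.5 V

print(round(r_high), "ohms high side,", round(r_low), "ohms low side")
```

On these numbers the high side looks roughly a hundred times weaker than the low side, which is consistent with the idea that pulling a driven 1 down to 0 costs little current.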

 

I have never tried it but I will look into it later. I think all I will do is modify the basic 4k scheme to drive the bus if it encounters a write to $49 (a mirror of COLUBK at $09, so I can see the results on the screen) and force varying values on the bus (though mostly with bits cleared to maximize the stress). I will let it run for a while, measure the temperature of the 6507 with my infrared thermometer, and compare with a control running a regular game.

Sounds like a fascinating test. I assume it's not too hard to make Harmony drive the bus in open-collector/open-drain mode during the experiment. Accidentally driving a 1 against the 6507's 0 would definitely heat up something.

 

If it's enough to make the chip hot, it will show up on a bench top power supply as an increased current draw as well.

 

- KS

Well, bus stuffing does seem to work. I modified the 4k to stuff zeros every time COLUBK is written and I programmed a few games, and each time the background comes up black when it's supposed to be another color. Currently running the control game while I work up a binary that will stuff zeros every third cycle.

OK, here are the results after 15 minutes of running. The temperature on the 6507 wasn't constant across the chip, so I measured the hottest temp I found (close to the notch end of the chip.)

 

Control game (Asteroids):

6507: 25.1 C before, 48.9 C after

Harmony MCU: 23.3 C before, 26.3 C after

 

Bus stuffing:

6507: 25.5 C before, 51.1 C after

Harmony MCU: 23.0 C before, 30.8 C after

 

It's worth noting that bus stuffing seems to work, even every third cycle. The binary does a clean start, a LDA #$FF, then a loop that does 100 bus-stuffed STA COLUBK writes in a row, followed by an INY/STY COLUBK (which is not bus-stuffed) so you can see something on the screen to know the code is still running. The binary never failed, and properly shows a black background where expected.

 

I will keep it running until I go to bed and see if it eventually levels off (as I'm typing, 6507 is 53.8 C, Harmony is 34 C) then do a long-term test of Asteroids again.

 

So far, the difference seems pretty insignificant and should be well within the limits of these chips.


Looks like the temp leveled off after running each game for about a half hour. I also think the room has cooled down a bit as the "bus stuffing" temp went down.

 

So Asteroids vs. bus stuffing now stands at 51.5 C vs 52.4 C. The Harmony temperature is 32.8 C vs. 32.5 C. So, I don't think there is a significant difference, and perhaps bus stuffing is safe. I will let a console bus-stuff for a few days before I say for sure, though.
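For anyone who wants the deltas spelled out, the readings from the two posts above work out as follows. Nothing here is new data; the dictionary keys are just labels made up for this sketch.

```python
# (before, after) 6507 temperatures in C from the 15-minute runs.
readings = {
    "asteroids_15min": (25.1, 48.9),
    "bus_stuffing_15min": (25.5, 51.1),
}

# Temperature rise for each run, rounded to the thermometer's resolution.
rise = {name: round(after - before, 1)
        for name, (before, after) in readings.items()}

# Gap between the two settled 6507 readings (52.4 C vs 51.5 C).
settled_gap = round(52.4 - 51.5, 1)

print(rise, settled_gap)
```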


Hey, this idea is really cool. 3-cycle resolution ;-) I also think the NMOS chips will be able to cope with it. If the outside transistor is strong enough to drive the pin low without heating up, it will be stable. There was also Atari XL hardware like TurboFreezer which used the same method to implicitly drive signals in the machine via an "outbound" pin on the peripheral connector.


Right, sorry for oversimplifying it. The usual circuit design in NMOS uses a single depletion mode transistor per net, to pull the net up to 5V if it is otherwise floating. Any of the enhancement mode transistors on the same net can safely pull the net down to ground without straining the depletion mode pull-up.

 

The 6502 data bus floats pretty well when it's not doing a memory write; the data bus pull-ups are at least somewhat active. That being said, your experiments would suggest that overdriving lows onto the bus is probably pretty safe. It might not be a bad idea to output "overdrive" bytes by storing zero in the data register and setting the output-enable register to the desired pattern. That would avoid accidentally trying to drive 1s on the bus. The first 2600jr I ever owned bit the dust when I misprogrammed a PLD in a cartridge and back-drove a high onto the address bus.

If done right, there's no danger in driving a "1". When you want to stuff the bus, always place zeros on the data pins, but set some as input or output depending on the value you want. The output pins should force zeros while the input pins stay at whatever they were before, so you will never drive a 1.
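The "zeros in the data register, value in the direction register" trick can be modelled in a few lines. This is a sketch, not Harmony firmware: the 8-bit port, the register names and the convention that a direction bit of 1 means output are all assumptions made for illustration.

```python
def stuff_pins(value):
    """Return (data_register, direction_mask) for a hypothetical 8-bit port.

    Direction bit 1 = output, 0 = input (high impedance).
    The data register is always zero, so an output pin can only pull low.
    """
    data_register = 0x00
    direction_mask = ~value & 0xFF  # drive only the bits that must be 0
    return data_register, direction_mask


def bus_result(cpu_value, data_register, direction_mask):
    """Output pins win with their data bit; input pins leave the CPU's bit."""
    kept = cpu_value & ~direction_mask & 0xFF
    forced = data_register & direction_mask
    return kept | forced


def drives_a_one(data_register, direction_mask):
    """True if any output pin would push a 1 onto the bus."""
    return (data_register & direction_mask) != 0


data, direction = stuff_pins(0x2A)
print(hex(bus_result(0xFF, data, direction)))
```

With the CPU writing $FF, every possible value comes out intact, and no combination ever drives a 1.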


  • 1 year later...

I want a demo program I can run to knock my socks off, like SpiceWare did in the DPC+ demo.

I know... it's good to want...

Would you have to flash the Harmony to a single rom for Bus Stuffing to work?

 

I love seeing the progression of hardware techniques. The Harmony's 32K and all those bank switching schemes, then Space Rocks using the ARM for game logic, and soon Frantic's use of digitised speech with the display on and ARM for game logic.


If you use the ARM to generate a bunch of nop, sty $reg, sty.w $reg instructions (so you have full control of the timing when a register is written), together with bus-stuffing, the VCS becomes an audio/video signal generator...
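A rough sketch of how such an instruction stream could be built: nop takes 2 cycles, sty $reg (zero page) 3 and sty.w $reg (absolute) 4, so any gap of 3 or more cycles between writes can be hit exactly. The function names and the idea of listing target cycles are invented for this example, not taken from any existing driver.

```python
CYCLES = {"nop": 2, "sty": 3, "sty.w": 4}


def pad_and_store(gap):
    """Instructions that take exactly `gap` cycles and end with a store."""
    if gap < 3:
        raise ValueError("a store needs at least 3 cycles")
    if (gap - 3) % 2 == 0:
        # Odd gap: pad with nops, finish with the 3-cycle store.
        return ["nop"] * ((gap - 3) // 2) + ["sty"]
    # Even gap: pad with nops, finish with the 4-cycle store.
    return ["nop"] * ((gap - 4) // 2) + ["sty.w"]


def schedule(write_cycles):
    """Build a program whose stores finish at the given cycle counts."""
    program, now = [], 0
    for target in write_cycles:
        program += pad_and_store(target - now)
        now = target
    return program


program = schedule([3, 7, 11, 17])
print(program)
```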

Will we finally get Hellbound Hellraiser ported to the 2600?!?

http://www.cenobite.com/collect/vg.htm


  • 3 years later...

After reading a lot about bus stuffing, I had another idea for how it might be used (if it's possible at all).

 

The problem I want to focus on is the different color codes of NTSC and PAL (and maybe SECAM).

 

A possible solution that doesn't require modifying all the ROM code:

 

When the CPU executes the following:

LDX #$2A

STX COLUPF

 

When the card detects a write to a color register (like COLUPF), it reads the value from the bus ($2A) and overrides it with the correct color for the system (maybe $4A).

 

In theory it should work, but there may be some problems:

 

1.) Normally the CPU must output $FF to the bus before it is overridden, so overriding another value could heat up or even damage the CPU.

 

2.) The card's detection and reaction have to be very fast for bus stuffing to work.

 

Has anybody ever tried something like this?


My understanding on how it works is the values put on the bus by the 6502 and the cartridge are ANDed together. This is how I implemented it in Stella:

 

  value &= myCart.busOverdrive();

Looking at the TIA Color Charts, NTSC blue #000088* is $80 while the equivalent for PAL is $D0. There's nothing you could AND with $80 to end up with $D0.

 

* if you leave your mouse pointer over a color you should see a tooltip popup similar to this in Safari (Mac browser):

post-3056-0-89938300-1471274168.png
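The AND argument is easy to confirm by brute force. The sketch below assumes the wired-AND model described above, where the cart can only clear bits the CPU has set; `reachable` is a name made up for this example.

```python
def reachable(cpu_value, target):
    """True if some cart value ANDed with cpu_value yields target.

    A wired-AND can only clear bits, so every 1 bit in target must
    already be a 1 in cpu_value.
    """
    return (target & ~cpu_value & 0xFF) == 0


# Exhaustive search: cart values that turn NTSC $80 into PAL $D0.
candidates = [x for x in range(256) if (0x80 & x) == 0xD0]

print(candidates)
print(reachable(0x80, 0xD0), reachable(0xFF, 0xD0))
```

The search comes back empty for $80, while $FF can be turned into any value. That is why the kernel stores $FF.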

Well, if it's equivalent to an AND of the CPU and card outputs, then of course that means it does not work. :(

 

I had thought the CPU's $FF output was only needed to prevent overheating or hardware damage.


  • 2 weeks later...

We're making some good progress. This 48 pixel kernel:

 

ShowGraphic:
; X = # of rows to output
SGloop:
    sta WSYNC
;---------------------------------------
    sty GRP0        ; 3  3 
    sty GRP1        ; 3  6 GRP0 now showing 1st 8 pixels
    sty GRP0        ; 3  9 GRP1 now showing 2nd 8 pixels
    SLEEP 30        ;30 39
    dex             ; 2 41
    sty GRP1        ; 3 44 GRP0 now showing 3rd 8 pixels
    sty GRP0        ; 3 47 GRP1 now showing 4th 8 pixels
    sty GRP1        ; 3 50 GRP0 now showing 5th 8 pixels
    stx GRP0        ; 3 53 GRP1 now showing 6th 8 pixels, this update to GRP0 is not shown
    bne SGloop      ; 2 55 (3 56)
    sta WSYNC
;---------------------------------------
    stx GRP1        ; 3  3 GRP0 now blank (as X was 0)
    stx GRP0        ; 3  6 GRP1 now blank
    rts
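The running cycle counts in the loop's comments can be re-derived with a quick tally. This is only a sketch using standard 6502 timings: 3 cycles for a zero-page store, 2 for dex, and 2 or 3 for the branch assuming it does not cross a page boundary.

```python
# One pass through SGloop after the sta WSYNC, as (instruction, cycles).
kernel = [
    ("sty GRP0", 3),
    ("sty GRP1", 3),
    ("sty GRP0", 3),
    ("SLEEP 30", 30),
    ("dex", 2),
    ("sty GRP1", 3),
    ("sty GRP0", 3),
    ("sty GRP1", 3),
    ("stx GRP0", 3),
]

total = 0
timeline = []
for instruction, cycles in kernel:
    total += cycles
    timeline.append((instruction, total))

branch_not_taken = total + 2  # bne falls through
branch_taken = total + 3      # bne loops, no page crossing assumed

for instruction, at in timeline:
    print(f"{instruction:10s} ends at cycle {at}")
print(branch_not_taken, branch_taken)
```

Either way the loop leaves plenty of slack before the 76-cycle scanline ends, which is why the sta WSYNC at the top is needed to resynchronize.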

With this prep in vertical blank, which initializes datastream0 and tells BUS that STY GRP0 and STY GRP1 are to be bus stuffed using the values in datastream0:

    lda #>MMlogo
    sta DS0PTR
    lda #<MMlogo
    sta DS0PTR
     
    lda #1
    sta DS0INC
    lda #0
    sta DS0INC
     
    ldx #GRP0
    stx SETADDRESS
    lda #$00
    sta MAPADDRESS
    sta MAPADDRESS
    sta MAPADDRESS
    sta MAPADDRESS
     
    ldx #GRP1
    stx SETADDRESS
    sta MAPALL

and this data table, as the values in datastream0. Shown as a screenshot as graphic data stored in binary values is oh-so-legible when using jEdit :)

post-3056-0-48844300-1472082595_thumb.png
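Since the table only survives as a screenshot, here is a sketch of how such data could be packed. It assumes the datastream hands out one byte per stuffed STY in the order the kernel issues them, so each row is six bytes, left to right, with rows stored top to bottom. The 'X' notation and `pack_rows` are invented for this example and are not the real MMlogo data.

```python
def pack_rows(rows):
    """Pack 48-character rows ('X' = lit pixel) into datastream bytes.

    Each row becomes six bytes, leftmost 8 pixels first, most
    significant bit on the left.
    """
    stream = bytearray()
    for row in rows:
        if len(row) != 48:
            raise ValueError("each row must be exactly 48 pixels wide")
        for col in range(0, 48, 8):
            bits = "".join("1" if c == "X" else "0" for c in row[col:col + 8])
            stream.append(int(bits, 2))
    return bytes(stream)


demo = pack_rows([
    "X" * 8 + "." * 40,       # leftmost sprite copy fully lit
    "X......." + "." * 40,    # only the leftmost pixel lit
])
print(demo.hex())
```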

 

Generates this 48 pixel display on the Atari:

post-3056-0-89434500-1472082623_thumb.jpg

 

We've also got preliminary support in Stella:

post-3056-0-20574000-1472083139_thumb.png

 

including the BUS specific tab in the debugger:

post-3056-0-95986900-1472082667_thumb.png


For contrast, this is the 48 pixel kernel I used in Medieval Mayhem:

ShowMEgraphic SUBROUTINE
        ; call with Y holding the lines-1 to show
        ; G48 thru G48+$B must be preloaded with pointers to the
        ; 48 pixel graphic image
        STY G48temp1
;        sta WSYNC
.graphicLoop
        ldy G48temp1         ;+3  63  189
        lda (G48),y          ;+5  68  204
        sta GRP0             ;+3  71  213      D1     --      --     --
        sta WSYNC            ;go
        lda (G48+$2),y       ;+5   5   15
        sta GRP1             ;+3   8   24      D1     D1      D2     --
        lda (G48+$4),y       ;+5  13   39
        sta GRP0             ;+3  16   48      D3     D1      D2     D2
        lda (G48+$6),y       ;+5  21   63
        sta G48temp2         ;+3  24   72
        lda (G48+$8),y       ;+5  29   87
        tax                  ;+2  31   93
        lda (G48+$A),y       ;+5  36  108
        tay                  ;+2  38  114
        lda G48temp2         ;+3  41  123              !
        sta GRP1             ;+3  44  132      D3     D3      D4     D2!
        stx GRP0             ;+3  47  141      D5     D3!     D4     D4
        sty GRP1             ;+3  50  150      D5     D5      D6     D4!
        sta GRP0             ;+3  53  159      D4*    D5!     D6     D6
        dec G48temp1         ;+5  58  174                             !
        bpl .graphicLoop     ;+2  60  180
        lda #0
        sta GRP0
        sta GRP1
        sta GRP0
        sta GRP1
        rts        