36 Character Demo

38 replies to this topic

#26 enthusi OFFLINE  

enthusi

    Moonsweeper

  • 469 posts
  • Location:Potsdam, Germany

Posted Wed May 23, 2018 7:51 AM

With sloppy I meant that you'd hardly be pressed to optimize the ARM code a lot or even at all.

From what I've heard here and about other systems, it is usually even written in C.

Filling a linear bitmap with a 70 (?) MHz ARM will in most cases allow for 'sloppy ARM code'.

I was not referring to existing projects but the trend as I see it.



#27 Mr SQL OFFLINE  

Mr SQL

    Stargunner

  • 1,975 posts

Posted Wed May 23, 2018 10:03 AM

A full monochromatic 160 x whatever bitmap was the obvious thing to come from bus-stuffing/ARM.

Then you can realize ANY game imaginable. Doom, Super Mario Land, Final Fantasy.

All of that has been done before on other platforms. None of that has anything to do with the Atari 2600 in my opinion.

The ARM has the power of the Nintendo DS...

I wonder why people try so hard to escape the natural limits of the system?

I am all in for optimizing code for a given original system more and more.

This trend, however, seems to aim in the opposite direction. Sloppy ARM code for the beautiful 2600.

I hope no one feels offended by my thoughts on this.

 

These thoughts are all fair, enthusi. ARM bus-stuffing technology is interesting, and it should definitely be considered a modern processor driving the Atari like a dumb terminal rather than being compared to the DPC chip in Pitfall II.



#28 SpiceWare OFFLINE  

SpiceWare

    Draconian

  • 12,374 posts
  • Medieval Mayhem
  • Location:Planet Houston

Posted Wed May 23, 2018 10:37 AM

hardly be pressed to optimize?   :lol: the delusions are strong with this one.



#29 SpiceWare OFFLINE  

SpiceWare

    Draconian

  • 12,374 posts
  • Medieval Mayhem
  • Location:Planet Houston

Posted Wed May 23, 2018 11:04 AM

This is just one example, the results of spending a few weeks over the holidays optimizing Stay Frosty 2. This included many contributions from Thomas, Fred, Chris, etc.
 

1966 bytes freed during the Christmas Crunch! There's more that could be done, but I'll leave that for later if we need it.

Current ROM free state:
ARM - $3f8 - 1016 (+68) (Σ 712)
Bank4 - $227 - 551 (+71) (Σ 488)
Bank5 - $25b - 603 (+191) (Σ 512)
Display Data - $1ae - 430 (nc) (Σ 254)



#30 Mr SQL OFFLINE  

Mr SQL

    Stargunner

  • 1,975 posts

Posted Wed May 23, 2018 11:19 AM

Optimization is relative; designing a soft blitter chip to output full screen playfield animation at 60 FPS is a good example of highly optimized code in that there's nothing else like it and it's written in 6502 Assembly.

 

imo an ARM based soft blitter chip designed around a 160x192 pixel grid using bus stuffing wouldn't need as high a level of optimization and could be written in C.   

 

C optimizations don't compare to asm optimizations because it's relatively bloated slow code when you look at the assembly output (is anyone even optimizing that?); even BASIC has far more potential for optimization than C.



#31 Omegamatrix OFFLINE  

Omegamatrix

    Quadrunner

  • Topic Starter
  • 6,210 posts
  • Location:Canada

Posted Wed May 23, 2018 9:42 PM

A full monochromatic 160 x whatever bitmap was the obvious thing to come from bus-stuffing/ARM.


It does not have to be monochromatic. There are enough cycles to update the background color and player color each line.
 
.loopKernelA:
    sty    GRP0                  ;3  @4    starts at cycle 1, not 0.
    sty    GRP1                  ;3  @7
    sty    COLUBK                ;3  @10
    sty    COLUP0                ;3  @13
    sty    COLUP1                ;3  @16
    sty    COLUPF                ;3  @19   for pixel drawn by the ball
    sty    ENAM1                 ;3  @22
    sta    RESP0                 ;3  @25
    sty    GRP0                  ;3  @28
    sta    RESP0                 ;3  @31
    sty    GRP1                  ;3  @34
    sty    GRP0                  ;3  @37
    sty    ENAM1                 ;3  @40
    sty    GRP0                  ;3  @43
    sta    RESP0                 ;3  @46
    sty    GRP0                  ;3  @49
    sta    RESP0                 ;3  @52
    dex                          ;2  @54   line count
    sty    GRP0                  ;3  @57
    sty    GRP0                  ;3  @60
    sta    RESP0                 ;3  @63
    sty    GRP0                  ;3  @66
    sty    GRP0                  ;3  @69
    sty    GRP0                  ;3  @72
    SLEEP 2                      ;2  @74
    bne    .loopKernelA          ;2³ @76/1

.loopKernelB:
    SLEEP 4                      ;4  @5    starts at cycle 1, not 0.
    sty    GRP0                  ;3  @8
    sty    GRP1                  ;3  @11
    sty    COLUBK                ;3  @14
    sty    COLUP0                ;3  @17
    sty    COLUP1                ;3  @20
    sty    COLUPF                ;3  @23   for pixel drawn by the ball
    sta    ENAM1                 ;3  @26   A=0, disable
    sty    GRP0                  ;3  @29
    sta.w  RESP0                 ;4  @33
    sty    GRP0                  ;3  @36
    sta    RESP0                 ;3  @39
    sty    GRP1                  ;3  @42
    sty    GRP0                  ;3  @45
    sty    ENAM1                 ;3  @48
    sty    GRP0                  ;3  @51
    sta    RESP0                 ;3  @54
    sty    GRP0                  ;3  @57
    sta    RESP0                 ;3  @60
    dex                          ;2  @62   line count
    sty    GRP0                  ;3  @65
    sty    GRP0                  ;3  @68
    sta    RESP0                 ;3  @71
    sty    GRP0                  ;3  @74
    bne    .loopKernelB          ;2³ @76/1

It is also really close to having AUDV0 updates. The problem is that there are only 2 cycles left in Kernel A. Of course, dropping either the COLUBK update, or both the COLUP0 and COLUP1 updates, would still allow 1 color update per line plus audio updates. You could also do a two-line kernel where you update just COLUBK on one line and the player colors on the second. Then you can do AUDV0 updates every line along with color changes.
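
As a rough cross-check of the cycle budget described above, here is a minimal Python sketch that tallies the two listings. The cycle counts are transcribed from the comments in the kernels; the helper names (`total`, `padding`, `audio_store_fits`) are made up for this illustration, and the closing `bne` is counted as 3 cycles on the assumption that it is taken.

```python
CYCLES_PER_LINE = 76   # CPU cycles per scanline on the 2600
STORE = 3              # a zero-page sty/sta, as used throughout the listings

# (mnemonic, cycles) in program order, transcribed from the two kernels.
KERNEL_A = (
    [("store", 3)] * 17                 # sty/sta up to cycle @52
    + [("dex", 2)]
    + [("store", 3)] * 6
    + [("SLEEP 2", 2), ("bne", 3)]      # bne assumed taken
)
KERNEL_B = (
    [("SLEEP 4", 4)]
    + [("store", 3)] * 8
    + [("sta.w", 4)]                    # absolute addressing, one extra cycle
    + [("store", 3)] * 9
    + [("dex", 2)]
    + [("store", 3)] * 4
    + [("bne", 3)]                      # bne assumed taken
)

def total(kernel):
    """Sum of all instruction cycles in one pass of the loop."""
    return sum(cycles for _, cycles in kernel)

def padding(kernel):
    """Cycles spent in SLEEP macros, i.e. not doing useful work."""
    return sum(cycles for name, cycles in kernel if name.startswith("SLEEP"))

def audio_store_fits(kernel, dropped_stores=0):
    """True if a 3-cycle AUDV0 store fits after dropping some 3-cycle stores."""
    return padding(kernel) + dropped_stores * STORE >= STORE

print(total(KERNEL_A), total(KERNEL_B))              # 76 76
print(audio_store_fits(KERNEL_A))                    # False: only 2 spare cycles
print(audio_store_fits(KERNEL_A, dropped_stores=1))  # True once COLUBK is dropped
```

Both loops come out at exactly 76 cycles, and Kernel A's 2 cycles of padding fall one short of the 3 that a zero-page store needs.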

#32 ZackAttack OFFLINE  

ZackAttack

    Dragonstomper

  • 733 posts
  • Location:Orlando, FL US

Posted Thu May 24, 2018 8:22 AM

It is also real close to having AUDV0 updates. The problem is there are only 2 cycles left in Kernel A.


There's an easy fix for that. Just omit the branch instruction. Each line is 50 bytes of instructions, so the PC will need to be reset back to $f000 every 80 or so lines. When a color hasn't changed you replace the STA COLUxx with JMP $f000, and as long as that happens at least once every 80 lines it will be good to go.
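
The "80 or so lines" figure can be sketched in Python as follows. The 50 bytes per line comes from the post; the 4 KiB window at $F000-$FFFF is the standard 2600 cartridge space. The function names and the per-line list of "color changed" flags are hypothetical, made up for this illustration.

```python
ROM_WINDOW = 0x1000     # 4 KiB of cartridge space, $F000-$FFFF
BYTES_PER_LINE = 50     # figure quoted in the post

def lines_before_wrap(window=ROM_WINDOW, bytes_per_line=BYTES_PER_LINE):
    """Whole scanlines of straight-line code that fit before the PC runs off the end."""
    return window // bytes_per_line

def jump_in_time(colour_changed, limit=None):
    """Check that an unchanged-color line (a JMP $f000 slot) always arrives in time.

    colour_changed is a per-scanline list of booleans (hypothetical input):
    True means every color store is needed, so no store can become a JMP.
    """
    limit = lines_before_wrap() if limit is None else limit
    run = 0
    for changed in colour_changed:
        run = run + 1 if changed else 0
        if run >= limit:
            return False    # the PC would leave the 4 KiB window
    return True

print(lines_before_wrap())                  # 81
print(jump_in_time([True] * 80 + [False]))  # True
print(jump_in_time([True] * 81))            # False
```

With 81 whole lines fitting in the window, a jump slot has to show up no later than the 81st consecutive line, which is where the "at least once every 80 lines" rule comes from.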

 

I think the bigger problem is going to be the limited RAM in the Harmony cart. 3,840 bytes of graphics data and 384 bytes of color data leaves just 3,968 bytes of RAM to hold the driver, display kernel, and game variables. The kernel code for bus stuffing will be much larger because, in between servicing the 6502 bus, it has to figure out which values to write to the TIA registers based on the graphics buffer. If we did all the bit shifting, anding, and oring during overblank, it probably would take multiple frames just to draw the full bitmap and the buffer would need to be about 1.5K bigger. Plus there wouldn't be any time left for game logic. We only need to load one of the kernels into RAM each frame, so we could always swap them to save RAM. Obviously that will eat up some CPU time though.
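
The RAM figures above can be reproduced with a short Python sketch. The 8 KiB total is an assumption implied by the numbers in the post (3,840 + 384 + 3,968), as are the two color bytes per scanline and the MSB-first pixel layout used by the hypothetical `set_pixel` helper.

```python
WIDTH, HEIGHT = 160, 192
TOTAL_RAM = 8 * 1024                    # assumption: implied by the post's figures

BYTES_PER_ROW = WIDTH // 8              # 20 bytes for one 1-bit-per-pixel row
bitmap_bytes = BYTES_PER_ROW * HEIGHT   # 3840
colour_bytes = 2 * HEIGHT               # 384, assuming two color bytes per line
remaining = TOTAL_RAM - bitmap_bytes - colour_bytes

def set_pixel(buffer, x, y):
    """Set one pixel in a linear 1bpp buffer (MSB-first layout is an assumption)."""
    buffer[y * BYTES_PER_ROW + x // 8] |= 0x80 >> (x % 8)

frame = bytearray(bitmap_bytes)
set_pixel(frame, 159, 191)              # last pixel lands in the last byte

print(bitmap_bytes, colour_bytes, remaining)   # 3840 384 3968
```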



#33 SpiceWare OFFLINE  

SpiceWare

    Draconian

  • 12,374 posts
  • Medieval Mayhem
  • Location:Planet Houston

Posted Thu May 24, 2018 8:25 AM

Slick! Haven't kept up with Bus Stuffing, any luck getting it to work on those problem Jrs and 7800s?

I suspect somebody's trying to provoke shit again. As a reminder:

ignore.png

#34 ZackAttack OFFLINE  

ZackAttack

    Dragonstomper

  • 733 posts
  • Location:Orlando, FL US

Posted Thu May 24, 2018 8:49 AM

Slick! Haven't kept up with Bus Stuffing, any luck getting it to work on those problem Jrs and 7800s?

Yeah, everyone that tested the most recent attempts reported success. Batari's suggestion to use sta,stx,sty,sax so only 6 of 8 bits need to be stuffed really made a big difference. Couple that with stuffing high for low failures and it seems to be enough for even the most problematic systems. Unfortunately all that complexity eats up a lot of CPU cycles, but if you're very careful there's just enough time to figure out what value and where to stuff it between each store instruction.
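
One way to picture the "6 of 8 bits" idea, purely as a sketch: four store instructions can select two bits of the value on their own, so only the remaining six need to be stuffed. The post does not give the register contents, so the values below are hypothetical, as are the function names and the simplified model in which the cart only pulls bits low. SAX is the undocumented 6502 store of A AND X.

```python
# Hypothetical register contents: the top two bits differ, the low six are all 1.
DRIVEN = {
    "sty": 0xFF,          # Y = %11111111
    "sta": 0xBF,          # A = %10111111
    "stx": 0x7F,          # X = %01111111
    "sax": 0xBF & 0x7F,   # SAX stores A AND X = %00111111
}

def plan_store(value):
    """Pick the store whose top two bits already match, plus the low bits to pull low."""
    top = value & 0xC0
    for mnemonic, driven in DRIVEN.items():
        if driven & 0xC0 == top:
            return mnemonic, ~value & 0x3F
    raise ValueError("unreachable: all four top-bit patterns are covered")

def bus_result(mnemonic, pull_low):
    """Value seen on the bus under the simplified pull-low-only model."""
    return DRIVEN[mnemonic] & ~pull_low & 0xFF

print(plan_store(0x2C))   # ('sax', 19)
```

Under these assumptions every 8-bit value is reachable while only bits 0 to 5 are ever stuffed; the real implementation also has to handle the high-for-low failures mentioned above.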

#35 Mr SQL OFFLINE  

Mr SQL

    Stargunner

  • 1,975 posts

Posted Thu May 24, 2018 8:51 AM

Slick! Haven't kept up with Bus Stuffing, any luck getting it to work on those problem Jrs and 7800s?

I suspect somebody's trying to provoke shit again. As a reminder:

ignore.png

 

It's interesting to discuss pushing the technology, but potty mouth posts limit the exchange of ideas.

 

Why not stop doing that and keep an open mind, since new ideas you disagreed with before turned out to mirror new technology inventions discovered in parallel research with state-of-the-art modern technology:

 

https://forums.blurb...04c0ef29824fb5a

 

Since new discoveries are possible with just the TIA, think of what you can accomplish with the ARM through intelligent discussions instead of throwing insults.



#36 SpiceWare OFFLINE  

SpiceWare

    Draconian

  • 12,374 posts
  • Medieval Mayhem
  • Location:Planet Houston

Posted Thu May 24, 2018 8:56 AM

Awesome! There were a couple of very slick projects (not mine) started using BUS; it would be nice for them to see the light of day.

#37 ZackAttack OFFLINE  

ZackAttack

    Dragonstomper

  • 733 posts
  • Location:Orlando, FL US

Posted Thu May 24, 2018 10:34 AM

Updated the POC to include placeholder writes for audio and color. It fits perfectly! Next step is to generate the GRP values from the bitmap buffer.

 

160POC.png

Kernel A:
{
	vcsJmp3(); // AUDV0
	BusStuff(COLUP1, 0x2c);
	vcsJmp3(); // COLUP0
	vcsJmp3(); // COLUP1
	vcsJmp3(); // COLUBK
	BusStuff(COLUPF, 0xcc);
	BusStuff(ENAM1, i);
	
	vcssta3(RESP0);
	BusStuff(COLUP0, 0x3a);
	vcssta3(RESP0);
	BusStuff(COLUP1, 0x64);
	BusStuff(COLUP0, 0x48);
	BusStuff(ENAM1, i >> 2);
	BusStuff(COLUP0, 0x56);

	vcssta3(RESP0);
	BusStuff(COLUP0, 0x72);
	vcssta3(RESP0);
	vcslda2(aMask);
	BusStuff(COLUP0, 0x80);
	BusStuff(COLUP0, 0x9e);

	vcssta3(RESP0);
	BusStuff(COLUP0, 0xb2);
	BusStuff(COLUP0, 0xe4);
	BusStuff(COLUP0, 0xfa);
	vcslda2(aMask);
	BusStuff(COLUP0, 0x1e);
}

Kernel B:
{
	vcslda2(aMask);
	vcsJmp3(); // AUDV0
	BusStuff(COLUP0, 0x1e);
	BusStuff(COLUP1, 0x2c);
	vcsJmp3(); // COLUP0
	vcsJmp3(); // COLUP1
	vcsJmp3(); // COLUBK
	BusStuff(COLUPF, 0xcc);
	BusStuff(ENAM1, i);
	BusStuff(COLUP0, 0x3a);

	vcssta4(RESP0);
	BusStuff(COLUP0, 0x48);
	vcssta3(RESP0);
	BusStuff(COLUP1, 0x64);
	BusStuff(COLUP0, 0x56);
	BusStuff(ENAM1, i >> 2);
	BusStuff(COLUP0, 0x72);

	vcssta3(RESP0);
	BusStuff(COLUP0, 0x80);
	vcssta3(RESP0);
	vcslda2(aMask);
	BusStuff(COLUP0, 0x9e);
	BusStuff(COLUP0, 0xb2);

	vcssta3(RESP0);
	BusStuff(COLUP0, 0xe4);
	vcslda2(aMask);
}


Edited by ZackAttack, Thu May 24, 2018 10:35 AM.


#38 ZackAttack OFFLINE  

ZackAttack

    Dragonstomper

  • 733 posts
  • Location:Orlando, FL US

Posted Thu May 24, 2018 11:45 PM

160x192 procedurally generated bitmap is working in Stella. Need to optimize a bunch and then it can be tested on a Harmony cart.

 

160procedural_test_pattern.png

 

 



#39 stephena OFFLINE  

stephena

    River Patroller

  • 3,284 posts
  • Stella maintainer
  • Location:Newfoundland, Canada

Posted Fri May 25, 2018 5:11 AM

As always, I would suggest upgrading to the latest version of Stella (currently, 5.1.2).  I see you're using a pre-release of 5.1 ...





