
Chess


Andrew Davie


3 hours ago, SpiceWare said:

 

Correct.

 

 

My custom mode file (syntax highlighting rules) for jEdit colorizes binary, which really makes graphics in source code stand out. Since the categories went away, I created a jEdit index for my blog entries related to it.

 

Screen Shot 2019-12-15 at 11.40.43 AM.png

That's lovely and what I need, but I'm firmly committed to using Atari Dev Studio these days.

It's probably easy enough to fiddle with the syntax highlighting myself... will look into it.

 

BTW: while I have your attention, the getRandom() function in your CDFJ collect source code didn't work well at all for me (randomly moving the pieces).  I recall I was using the low bits to select squares. Perhaps it was my coding/usage, but I replaced it with the following which seems to be doing an excellent job...

 

unsigned int m_z  = 2342;
unsigned int m_w = 122561;

unsigned int getRandom32b() {
    m_z = 36969 * (m_z & 65535) + (m_z >> 16);
    m_w = 18000 * (m_w & 65535) + (m_w >> 16);
    return (m_z << 16) + m_w;
}

The initial values I just chose... randomly.
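A quick way to sanity-check a generator like this before trusting it to pick squares is to tally how often each value comes up. The following is only a rough sketch: the generator is copied from the post above (rewritten with fixed-width types, on the assumption that `unsigned int` is 32 bits on the ARM), while `tallySquares` is a hypothetical helper that is not part of the original code.

```c
#include <stdint.h>

/* Generator from the post above, using fixed-width types
   (assumption: unsigned int is 32 bits on the target). */
static uint32_t m_z = 2342;
static uint32_t m_w = 122561;

static uint32_t getRandom32b(void) {
    m_z = 36969 * (m_z & 65535) + (m_z >> 16);
    m_w = 18000 * (m_w & 65535) + (m_w >> 16);
    return (m_z << 16) + m_w;
}

/* Hypothetical helper: count how often each of the 8 files/ranks is
   chosen when only the low 3 bits are used (i.e. "& 7"). */
static void tallySquares(unsigned draws, unsigned counts[8]) {
    for (int i = 0; i < 8; i++)
        counts[i] = 0;
    for (unsigned n = 0; n < draws; n++)
        counts[getRandom32b() & 7]++;
}
```

If one bucket stays empty or is wildly over-represented after a few thousand draws, the low bits are probably not usable for selecting squares.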

Link to comment
Share on other sites

39 minutes ago, Andrew Davie said:

That's lovely and what I need, but I'm firmly committed to using Atari Dev Studio these days.

It's probably easy enough to fiddle with the syntax highlighting myself... will look into it.

 

I'd mentioned it to @mksmith about a year ago.  He looked into it and said the background colors could not be set. I told him that even just different foreground colors would help, as that's how I'd originally implemented it. Don't know if he looked into it anymore after that.

 

Screen Shot 2020-03-06 at 9.18.50 AM.png

 

 

 

39 minutes ago, Andrew Davie said:

BTW: while I have your attention, the getRandom() function in your CDFJ collect source code didn't work well at all for me (randomly moving the pieces).  I recall I was using the low bits to select squares. Perhaps it was my coding/usage, but I replaced it with the following which seems to be doing an excellent job...

 

I've been using that since I started doing DPC+ projects. I thought I got it from the wiki page I linked to, but don't see the code - possibly the page was edited, or I found it somewhere else. The ARM has an inline barrel shifter so bit-shifting is done for free. I cover that here in the comments of Part 3 of the CDFJ tutorial. So I tend to use different bits, such as this snippet of the robot initial position routine in Frantic:

 

                gSpriteX[spot] = 12 + (x * 28) + (((getRandom32() & 0xff) * 18) >> 8);
                gSpriteY[spot] = 10 + (y * 60) + (((getRandom32() & 0xffff) * 43) >> 16);
                // start each robot's eye in a different position
                gSpriteAnimFrame[spot] = (6 * (getRandom32() & 0xffffff)) >> 24;
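The multiply-and-shift idiom in that snippet maps a random value onto a range without a divide, which suits the ARM's free shifts. Here is a minimal sketch of the idea; `scale8` and `scale16` are hypothetical helper names, not functions from Frantic.

```c
#include <stdint.h>

/* Map an 8-bit random value (0..255) onto 0..range-1 without dividing.
   Same idea as "((getRandom32() & 0xff) * 18) >> 8" above. */
static uint32_t scale8(uint32_t r, uint32_t range) {
    return ((r & 0xff) * range) >> 8;
}

/* 16-bit variant, as in "((getRandom32() & 0xffff) * 43) >> 16". */
static uint32_t scale16(uint32_t r, uint32_t range) {
    return ((r & 0xffff) * range) >> 16;
}
```

With `range` = 18, the 8-bit version yields 0 to 17; the product must fit in 32 bits, so wider masks only work with correspondingly smaller ranges.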

 

 

 

 

 

Link to comment
Share on other sites

Here's how I define the pieces... I manually toggle the 1's and 0's in these blocks of code.

Yes, I could write a tool to convert a graphics image (I did that for the earlier engine), but you know, you start doing something manually just to figure it out, and then you find you're nearly done and the diversion to write the tool seems greater than just continuing hacking away the way you've been doing it.  You're nearly finished, and yet a year later you realise you're still hacking away manually and you could have written that tool after all and it would have been much easier. But... c'est la vie... it's too late now so you continue to hack.

 

const unsigned char WHITE_KNIGHT[] = {

//      RED     BLUE    GREEN     MASK      SCANLINE #

    0b00001, 0b00001, 0b00001, 0b11110,    //  0
    0b00001, 0b00001, 0b00001, 0b11110,    //  1
    0b00101, 0b00101, 0b00101, 0b11010,    //  2
    0b00110, 0b00110, 0b00110, 0b11000,    //  3
    0b01011, 0b01011, 0b01010, 0b10000,    //  4
    0b11011, 0b11011, 0b01010, 0b00000,    //  5
    0b11110, 0b11110, 0b01110, 0b00000,    //  6
    0b11111, 0b11111, 0b11110, 0b00000,    //  7
    0b11111, 0b11111, 0b11110, 0b00000,    //  8
    0b11110, 0b11110, 0b11110, 0b00000,    //  9
    0b01111, 0b01111, 0b01110, 0b00000,    // 10
    0b11111, 0b11111, 0b11110, 0b00000,    // 11
    0b00110, 0b00110, 0b00110, 0b10000,    // 12
    0b01111, 0b01111, 0b01110, 0b10000,    // 13
    0b01111, 0b01111, 0b01110, 0b10000,    // 14
    0b01110, 0b01110, 0b01110, 0b10001,    // 15
    0b01100, 0b11100, 0b01100, 0b00001,    // 16
    0b11010, 0b11010, 0b11000, 0b00001,    // 17
    0b11111, 0b11111, 0b11110, 0b00000,    // 18
    0b11111, 0b11111, 0b11110, 0b00000,    // 19
    0b11111, 0b11111, 0b11110, 0b00000,    // 20
    0b11111, 0b11111, 0b11110, 0b00000,    // 21
    0b00000, 0b00000, 0b00000, 0b00000,    // 22
    0b11111, 0b00000, 0b00000, 0b00000,    // 23
};

Each chess piece is defined as a 5-bit x 24 scanline "image" represented as binary in the code above.  Each line in the definition gives a single scanline of 5 pixels. Each pixel is one of the bits (left to right) in the binary numbers.  There are three columns - one for each colour, and a fourth column giving a mask.  So on line 1 of the screen where the shape is drawn, we'd see the pixels in the "RED" column, scanline 1.  On line 2, the colour "rolls" so we'd see the pixels from the "BLUE" column. On line 3, the "GREEN" column would be used.  So, we're continually "rolling" through RED/BLUE/GREEN/RED/BLUE/GREEN... etc... as we go down the scanlines.  But the next frame we start with BLUE instead of RED. And the frame after that start with GREEN... and then RED... etc.
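The rolling described above boils down to a single modulo. This is a sketch only: it assumes scanlines and frames are counted from 0, that frame 0 starts on RED, and that the column order matches the WHITE_KNIGHT table; `colourColumn` is a hypothetical name.

```c
/* Column order as laid out in the WHITE_KNIGHT table above. */
enum { COL_RED = 0, COL_BLUE = 1, COL_GREEN = 2, COL_MASK = 3 };

/* Which colour column is shown on a given scanline of a given frame.
   Assumption: frame 0 starts on RED, and each following frame starts
   one column later (RED, BLUE, GREEN, RED, ...). */
static int colourColumn(int scanline, int frame) {
    return (scanline + frame) % 3;
}
```

Over any three consecutive frames, every scanline shows each of the three colour columns exactly once.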

 

So, the 4th and final byte in each "line" is a mask.  This has 0 where there is any pixel in use in the corresponding RED/GREEN/BLUE columns.  You'd think you could generate this mask given you just have the RED/GREEN/BLUE already, but this is not the case. Even if your RGB pixels are completely unused, you may wish to mask out background before you draw. For example, black pixels - should they be transparent (showing background instead of black)... or opaque... showing black?  The mask is where you determine this. If you want black, then you set the mask pixel as 0 (= used).  If you want transparent, set to 1 (= unused).  This is useful for putting black borders around objects, or having non-seethrough stuff (like an eye on the knight).

 

Effectively we're going ....  screen[y][x] = (screen[y][x] & mask) | pixel
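To make the mask semantics concrete, here is a small sketch of that expression applied to one 5-pixel scanline. It uses the convention described above (mask bit 1 = leave the background alone, 0 = overwrite); `composite5` is a hypothetical name.

```c
#include <stdint.h>

/* Combine one 5-pixel scanline of a shape with what is already there.
   mask bit 1 = keep background, mask bit 0 = replace with the shape's pixel
   (which may itself be 0, giving an opaque black pixel). */
static uint8_t composite5(uint8_t background, uint8_t mask, uint8_t pixels) {
    return (uint8_t)(((background & mask) | pixels) & 0x1F);
}
```

For instance, with a fully lit background (11111), a mask of 10000 and pixels of 01010, the result is 11010: the leftmost pixel keeps the background, while the two zero pixels inside the masked area come out black rather than see-through.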

 

The squares are coloured simply by drawing a similarly-defined "square" shape.  It works exactly the same way, but has all-BLUE pixels, and of course the mask is set to all-zero; causing all the previous contents to be "erased" to the square colour.  So to draw a piece, we first draw the blank-square piece (blue or black), and then we draw the piece over that.

 

This differs significantly from the pure 6507 version of the earlier engine, which had to pre-define all the shifted versions of all the pieces on both square colours, for both piece colours. It took quite a few banks of shape definitions.  Here I'm using the power of the ARM to just define the pieces in isolation, and relying on the ARM's efficient shifting and speed to do the draw in "efficient" stages.  I have separate code for each of the 8 horizontal positions of the squares, specifically because you have to shift/mask/mirror the image based on the skewy 2600 PF registers. Specifically, PF0 and PF2 are mirrored and PF1 is not.  But the squares are 5 pixels wide, so each of the 8 positions is a unique combination of mirroring/masking/shifting.
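The source of `BitReversal` isn't shown in this thread, so the following is only a guess at what such a helper might look like: a plain 8-bit reversal done with three swap steps. The name `reverse8` is hypothetical.

```c
#include <stdint.h>

/* Reverse the bit order of one byte (bit 0 <-> bit 7, and so on).
   Assumed to behave like the BitReversal() used in the draw code. */
static uint8_t reverse8(uint8_t b) {
    b = (uint8_t)(((b & 0xF0) >> 4) | ((b & 0x0F) << 4)); /* swap nibbles */
    b = (uint8_t)(((b & 0xCC) >> 2) | ((b & 0x33) << 2)); /* swap pairs   */
    b = (uint8_t)(((b & 0xAA) >> 1) | ((b & 0x55) << 1)); /* swap bits    */
    return b;
}
```

For a mirrored register, the part of a 5-bit shape that lands in PF2 D0-D2 would then be something like `reverse8(pixels) >> 5`, which takes the shape's low three bits and flips their order.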

 

For example, here's file "C"...

 

// column C = PF1 D1D0, PF2<-D0D1D2
int column = 2;
image = *charSet[screen[row][column]];

for (int y = 0; y < 24; y++) {
  int scanline = row * 24 + y;
  int lineIndex = y * 4;
  int maskIndex = lineIndex + 3;
  int colourIndex = lineIndex + ((_rgb & 3));

  // the SQUARE first
  // remove span of pixels used
  arena_pf1_left[scanline] &= 0b11111100;
  arena_pf2_left[scanline] &= 0b11111000;

  // add the square colour

  if (!((row + column) & 1)) {
    arena_pf1_left[scanline] |= (squareBaseImage[colourIndex] >> 3) & 0b00000011;
    arena_pf2_left[scanline] |= (BitReversal(squareBaseImage[colourIndex]) >> 5) & 0b00000111;
  }

  if (screen[row][column]) {
    arena_pf1_left[scanline] &= (image[maskIndex] >> 3) | 0b11111100;
    arena_pf1_left[scanline] |= (image[colourIndex] >> 3) & 0b00000011;
    arena_pf2_left[scanline] &= (BitReversal(image[maskIndex]) >> 5) | 0b11111000;
    arena_pf2_left[scanline] |= (BitReversal(image[colourIndex]) >> 5) & 0b00000111;
  }

  _rgb++;
  if (_rgb > 2)
    _rgb = 0;
}

It's not terribly efficient - and I draw all 64 squares every single frame.  That's why I doubt this will run on hardware. But that's an easy fix.... triple-buffer the screen, and then just present the buffers on demand. And rather than drawing 64 pieces every single frame, I just draw one piece/frame, or as time allows. That should work, provided I have RAM for the triple-buffer.  It's all a learning experience becoming familiar with the limitations of CDFJ.
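One possible way to organise "one piece per frame" is to keep a dirty flag per square and redraw at most one flagged square each frame. This is purely an illustrative sketch: `markDirty`, `takeNextDirty` and the 64-bit flag word are hypothetical and not from the actual project (and 64-bit arithmetic may not be the cheapest choice on the ARM).

```c
#include <stdint.h>

#define BOARD_SQUARES 64

/* Hypothetical bookkeeping: one bit per square that needs redrawing. */
static uint64_t dirtySquares = 0;
static int nextToCheck = 0;

static void markDirty(int square) {
    dirtySquares |= (uint64_t)1 << square;
}

/* Returns the next dirty square (0..63) and clears its flag,
   or -1 when nothing needs redrawing this frame. */
static int takeNextDirty(void) {
    for (int i = 0; i < BOARD_SQUARES; i++) {
        int square = (nextToCheck + i) % BOARD_SQUARES;
        if (dirtySquares & ((uint64_t)1 << square)) {
            dirtySquares &= ~((uint64_t)1 << square);
            nextToCheck = (square + 1) % BOARD_SQUARES;
            return square;
        }
    }
    return -1;
}
```

A move would mark both the source and destination squares, and the per-frame draw would call `takeNextDirty()` once (or a few times, as time allows).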

 

Especially hard without the hardware I need to test it's all working properly.  So, I guess I'm saving up for a Harmony Encore cartridge!

 

 

 

Edit: I realise...

(a) I'm using char-based accesses and ARM is 32-bit-based. Inefficient.
(b) since I'm only using 20 bits/line, I could/should mangle them all together and use the ARM masking/shifting to extract. This would replace the four separate bytes per scanline with a single 32-bit word, i.e. 24 ints per piece.  So, yeah, that's the way to go. I'll do that ASAP.  And then, we also have (a) sorted, as we'll just declare them as 24 ints.


 

 

Edited by Andrew Davie
  • Like 2
Link to comment
Share on other sites

2 hours ago, SpiceWare said:

 

I'd mentioned it to @mksmith about a year ago.  He looked into it and said the background colors could not be set. I told him that even just different foreground colors would help, as that's how I'd originally implemented it. Don't know if he looked into it anymore after that.

 

Screen Shot 2020-03-06 at 9.18.50 AM.png

 

 

 

 

I've been using that since I started doing DPC+ projects. I thought I got it from the wiki page I linked to, but don't see the code - possibly the page was edited, or I found it somewhere else. The ARM has an inline barrel shifter so bit-shifting is done for free. I cover that here in the comments of Part 3 of the CDFJ tutorial. So I tend to use different bits, such as this snippet of the robot initial position routine in Frantic:

 


                gSpriteX[spot] = 12 + (x * 28) + (((getRandom32() & 0xff) * 18) >> 8);
                gSpriteY[spot] = 10 + (y * 60) + (((getRandom32() & 0xffff) * 43) >> 16);
                // start each robot's eye in a different position
                gSpriteAnimFrame[spot] = (6 * (getRandom32() & 0xffffff)) >> 24;

 

 

 

 

 

Hi guys, still on the list at this point as the way the rendering engine works in VS Code makes it almost impossible.  It's been 10 months since I opened an item on GitHub and VS Code has made some changes (I believe) around rendering so they may have some enhancements in this area (there has been a lot of feedback to allow this sort of thing).  I'll take another look shortly and see if it might be possible.

  • Like 4
Link to comment
Share on other sites

I've spent many hours trying to get some good shading/detail on the pieces. I've mostly finished white;  black pieces have a bit of work to do here and there. But most of my time has been spent with that really mind-killing reverse-order of PF0 and PF2 and trying to correctly mask a 5-pixel-wide shape onto a mirrored-normal-mirrored stretch of 20 pixels. Getting there, but I still have some masking issues (removing the previous content) so the blue squares disappear here and there, and columns C and G have incorrect colours for some pieces (because of what's left behind).  

 

 

But overall, I think it's looking nice.

I tried to record/convert the video to preserve the shimmer.  I still haven't seen this on a CRT.

 

  • Like 1
Link to comment
Share on other sites

So I ended up following my own advice from earlier, and rewrote the code to use the compacted shape definitions. 

I was hoping to be able to do some macro string manipulation so I could pass strings with discernable visuals for 0 and 1 (the shaded ASCII block characters would have done nicely).  But I couldn't see any way to do a substring operator in the macro, so I ended up just using 1s and 0s.  Here's what I ended up with...

 

// range: all 0 - 0b11111
// (outer parentheses keep the expansion safe inside larger expressions)
#define B(red, green, blue, mask) \
    ( ((0b##red & 0b11111) << 0) \
    | ((0b##green & 0b11111) << 5) \
    | ((0b##blue & 0b11111) << 10) \
    | ((0b##mask & 0b11111) << 15) )

 // Pieces defined thus... 
 const unsigned int WHITE_PAWN[] = {
    //     R      G      B   MASK
    B( 00000, 00000, 00000, 00000 ),    //  0
    B( 00000, 00000, 00000, 00000 ),    //  1
    B( 00000, 00000, 00000, 00000 ),    //  2
    B( 00100, 00000, 00100, 00100 ),    //  3
    B( 00100, 00100, 00100, 00100 ),    //  4
    B( 00100, 00100, 00100, 00100 ),    //  5
    B( 00100, 00100, 00100, 00100 ),    //  6
    B( 00100, 00000, 00000, 00100 ),    //  7
    B( 00000, 00000, 00000, 00100 ),    //  8
    B( 01110, 01100, 01110, 01110 ),    //  9
    B( 01110, 01100, 01110, 01110 ),    // 10
    B( 00000, 00000, 00000, 01110 ),    // 11
    B( 00100, 00100, 00100, 00100 ),    // 12
    B( 00100, 00100, 00100, 00100 ),    // 13
    B( 00100, 00100, 00100, 00100 ),    // 14
    B( 00100, 00100, 00100, 00100 ),    // 15
    B( 00100, 00100, 00100, 00100 ),    // 16
    B( 00100, 00100, 00100, 00100 ),    // 17
    B( 01110, 01100, 01110, 01110 ),    // 18
    B( 01110, 01100, 01110, 01110 ),    // 19
    B( 01110, 01100, 01110, 01110 ),    // 20
    B( 00000, 01110, 00000, 01110 ),    // 21
    B( 01110, 00000, 00000, 01110 ),    // 22
    B( 00000, 00000, 00000, 00000 ),    // 23
};
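Reading a field back out of one of these packed words is a shift and a mask. The sketch below takes its bit layout from the B() macro above (red in bits 0-4, green in 5-9, blue in 10-14, mask in 15-19); the function names are hypothetical.

```c
#include <stdint.h>

/* Same layout as the B() macro, written as a function. */
static uint32_t packLine(uint32_t red, uint32_t green, uint32_t blue, uint32_t mask) {
    return ((red & 0x1F) << 0) | ((green & 0x1F) << 5)
         | ((blue & 0x1F) << 10) | ((mask & 0x1F) << 15);
}

/* channel: 0 = red, 1 = green, 2 = blue */
static uint32_t lineColour(uint32_t packed, int channel) {
    return (packed >> (channel * 5)) & 0x1F;
}

static uint32_t lineMask(uint32_t packed) {
    return (packed >> 15) & 0x1F;
}
```

Scanline 3 of the pawn, `B( 00100, 00000, 00100, 00100 )`, packs to 0x21004, and `lineColour(0x21004, 1)` gives 0 because the green field is empty on that line.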

 

Link to comment
Share on other sites

Well, I figured out a kind of workaround that does make things a bit easier... even without highlighting...

 


// range: all 0 - 0b11111
// (arguments and the whole expansion are parenthesised for macro safety)
#define B(red, green, blue, mask) \
    ( (((red) & 0b11111) << 0) \
    | (((green) & 0b11111) << 5) \
    | (((blue) & 0b11111) << 10) \
    | (((mask) & 0b11111) << 15) )


#define _____ 0b00000
#define ____X 0b00001
#define ___X_ 0b00010
#define ___XX 0b00011
#define __X__ 0b00100
#define __X_X 0b00101
#define __XX_ 0b00110
#define __XXX 0b00111
#define _X___ 0b01000
#define _X__X 0b01001
#define _X_X_ 0b01010
#define _X_XX 0b01011
#define _XX__ 0b01100
#define _XX_X 0b01101
#define _XXX_ 0b01110
#define _XXXX 0b01111
#define X____ 0b10000
#define X___X 0b10001
#define X__X_ 0b10010
#define X__XX 0b10011
#define X_X__ 0b10100
#define X_X_X 0b10101
#define X_XX_ 0b10110
#define X_XXX 0b10111
#define XX___ 0b11000
#define XX__X 0b11001
#define XX_X_ 0b11010
#define XX_XX 0b11011
#define XXX__ 0b11100
#define XXX_X 0b11101
#define XXXX_ 0b11110
#define XXXXX 0b11111

const unsigned int WHITE_PAWN[] = {
    //     R      G      B   MASK
    B( _____, _____, _____, _____ ),    //  0
    B( _____, _____, _____, _____ ),    //  1
    B( _____, _____, _____, _____ ),    //  2
    B( __X__, _____, __X__, __X__ ),    //  3
    B( __X__, __X__, __X__, __X__ ),    //  4
    B( __X__, __X__, __X__, __X__ ),    //  5
    B( __X__, __X__, __X__, __X__ ),    //  6
    B( __X__, _____, _____, __X__ ),    //  7
    B( _____, _____, _____, __X__ ),    //  8
    B( _XXX_, _XX__, _XXX_, _XXX_ ),    //  9
    B( _XXX_, _XX__, _XXX_, _XXX_ ),    // 10
    B( _____, _____, _____, _XXX_ ),    // 11
    B( __X__, __X__, __X__, __X__ ),    // 12
    B( __X__, __X__, __X__, __X__ ),    // 13
    B( __X__, __X__, __X__, __X__ ),    // 14
    B( __X__, __X__, __X__, __X__ ),    // 15
    B( __X__, __X__, __X__, __X__ ),    // 16
    B( __X__, __X__, __X__, __X__ ),    // 17
    B( _XXX_, _XX__, _XXX_, _XXX_ ),    // 18
    B( _XXX_, _XX__, _XXX_, _XXX_ ),    // 19
    B( _XXX_, _XX__, _XXX_, _XXX_ ),    // 20
    B( _____, _XXX_, _____, _XXX_ ),    // 21
    B( _XXX_, _____, _____, _XXX_ ),    // 22
    B( _____, _____, _____, _____ ),    // 23
};

Quite the glorious hack.
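The same picture can be recovered from a packed word for debugging, for example to print a piece from a host-side test program. This is a sketch under the assumption that the bit layout is the one in the B() macro; `renderField` is a hypothetical helper.

```c
#include <stdint.h>

/* Turn one 5-bit field of a packed line back into "_"/"X" text,
   leftmost pixel first. 'shift' is 0, 5, 10 or 15 (red, green, blue,
   mask in the B() layout). 'out' must have room for 6 chars. */
static void renderField(uint32_t packed, int shift, char out[6]) {
    uint32_t bits = (packed >> shift) & 0x1F;
    for (int i = 0; i < 5; i++)
        out[i] = (bits & (0x10u >> i)) ? 'X' : '_';
    out[5] = '\0';
}
```

Rendering the red field of the pawn's scanline 3 should give back `__X__`, the same token that was typed into the table.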

 

2112912649_ScreenShot2021-01-22at3_49_51pm.thumb.png.ded4ce8767c1920a6d470535a0ce3baf.png

Link to comment
Share on other sites

Here's a version.  Firstly, it's pretty much @SpiceWare's collect demo with just a few tweaks. All the special CDFJ magic is nothing to do with me. So, thanks again for that.  What this is is a simple Interleaved Chronocolour(TM) display of a chessboard.  The board is generated every frame in ARM code, and frankly I don't see how this could work on real hardware. If you have a Harmony Cart, you might try to run it - but as noted... doubt it will work.

 

But, it works on Stella and that's all I really wanted.

 

If you do run on Stella, you should (to be fair) set the phosphor setting (TAB/Video&Audio/TV Effects) to something which you think approximates a real TV, in terms of phosphor persistence. I find that anything from 50% upwards seems "fair" to me, but YMMV.

 

I guess at this stage that's a wrap. I've done all I wanted to do and this one can be put in the filing cabinet.

 

 

 

CDFJChess.bin

Edited by Andrew Davie
  • Like 2
Link to comment
Share on other sites

9 hours ago, Andrew Davie said:

Well, I figured out a kind of workaround that does make things a bit easier... even without highlighting...

 

That's how I used to do it:

graphics.h

 

 

; graphics
	SEG.U VARS

; preceding with zz so these variables don't replace others in Stella's debugger


zz________ = %00000000; $0 0
zz_______X = %00000001; $1 1
zz______X_ = %00000010; $2 2
zz______XX = %00000011; $3 3
zz_____X__ = %00000100; $4 4
zz_____X_X = %00000101; $5 5
zz_____XX_ = %00000110; $6 6
...

 

Link to comment
Share on other sites

59 minutes ago, Andrew Davie said:

The board is generated every frame in ARM code, and frankly I don't see how this could work on real hardware. If you have a Harmony Cart, you might try to run it - but as noted... doubt it will work.

 

Board looks correct, though the screen rolls:

 

IMG_1559D.thumb.jpg.3acd918ceb4a1547ae3d8a1f56b17ba1.jpg

 

IMG_1558D.thumb.jpg.62f07e5012f5a2459e6685cba94b0630.jpg

 

 

 

 

One of the things I do in my projects early on is add a way to show VB and OS time remaining to make it easier to figure out where I'm having timing issues. From Part 8 - Score & Timer:

 

Quote

One of the challenges when developing DPC+ARM code is that Stella does not emulate how long ARM code takes to run. As far as it's concerned, all ARM code will finish executing in 0 cycles of 6507 time. Because of this it's very easy to write something that will run just fine in emulation, but will cause screen jitters and/or rolls, or even a fatal crash when run on a real Atari. We're already checking timers in our 6507 code, so we can easily save those values and display them in the score.

 

...

 

Left B and Right A - Timing Remaining in Vertical Blank and Overscan

 

collect3_3.png

 

 

 

Link to comment
Share on other sites

1 minute ago, SpiceWare said:

 

Board looks correct, though the screen rolls:

 

One of the things I do in my projects early on is add a way to show VB and OS time remaining to make it easier to figure out where I'm having timing issues. From Part 8 - Score & Timer:

 

Thanks for testing and for the timer info.

 

Can you explain how things work on the cart itself?  How does the ARM have "heaps" of time to draw stuff, when the 6507 is running too?  Specifically, in the PlusCart for example, the address bus needs to be serviced pretty regularly and quickly.  I'm unclear/unsure what exactly the relationship is between the CDFJ cart and the 6507.  How does the ARM "get away" with having heaps of time to do stuff, and yet the 6507 is still running?  Is the bus servicing interrupt driven?

Link to comment
Share on other sites

30 minutes ago, Andrew Davie said:

Can you explain how things work on the cart itself?

 

The ARM runs in 2 modes.  

 

  1. When 6507 code is running the ARM is put in a very tight loop monitoring the address bus and reacting as needed to implement the CDFJ registers and such.
  2. When the 6507 code does LDY #$FF*/STY CALLFN to trigger your custom C code the ARM captures the current 6507 address, puts a NOP on the databus to idle the 6507, runs your custom code, then puts a JMP ADDRESS+1 on the databus to return the 6507 to the instruction after STY CALLFN.

At the bottom of this blog entry is Further Reading with a few links to @cd-w blog entries about the Harmony that should prove enlightening.

 

 

If you used LDY #$FE/STY CALLFN for digital audio support then an interrupt is used on the ARM to have it output the proper values on the databus to update AUDV0 once per scanline. I'm not sure of the specifics, but it probably outputs LDA #xx/STA AUDV0/NOP

  • Like 1
Link to comment
Share on other sites

In regards to #2, as the 6507 executes the NOPs the PC will be incrementing. So the STY CALLFN should be towards the beginning of the ROM address space. If you have it towards the end you risk the PC wrapping around to $0000, which would crash the 6507 code.

Link to comment
Share on other sites

@SpiceWare thanks for the explanation. I understand.

 

I obviously need to get that timing info onscreen ASAP.


Meanwhile, it only works if I can either draw the entire board in the available time... OR I set up a buffered board (3 frames for RGB) on the ARM, and feed these directly as the original PrepArenaBuffers() did.  That would require a 2nd set (3 frames for RGB) to be available to draw into while the buffered set is being displayed. A fair whack of (ARM) memory;  6 bytes x 192 scanlines x 3 colours (RGB) x 2 buffer sets = 6912 bytes.

 

The current draw takes just the 6 x 192 bytes.
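The buffer arithmetic can be written down as constants so the compiler checks it. A small sketch; the names are made up for illustration and `_Static_assert` needs a C11 compiler.

```c
enum {
    PF_BYTES_PER_LINE = 6,    /* PF0/PF1/PF2 for the left and right halves */
    SCANLINES         = 192,
    ICC_FRAMES        = 3,    /* one frame per colour */
    BUFFER_SETS       = 2,    /* one set displayed, one being drawn into */

    SINGLE_FRAME_BYTES    = PF_BYTES_PER_LINE * SCANLINES,
    DOUBLE_BUFFERED_BYTES = SINGLE_FRAME_BYTES * ICC_FRAMES * BUFFER_SETS
};

/* Compile-time checks of the figures quoted above. */
_Static_assert(SINGLE_FRAME_BYTES == 1152, "6 x 192");
_Static_assert(DOUBLE_BUFFERED_BYTES == 6912, "6 x 192 x 3 x 2");
```

So the buffered approach costs six times the 1152 bytes that the current single-frame draw uses.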

 

I've had a go at optimising the current draw, just to see if it will fit.  It's almost twice as quick, I'm pretty sure. I guess I'm asking nicely for someone to test the attached binary on their Harmony Cart and see if I've fixed the timing and the screen no longer rolls.

 

For posterity, here is the complete screen draw source code...

void PrepArenaBuffers()
{
    // This function loads the selected Arena layout into the 6 playfield buffers.

    
    // Set the cycling colours for ICC
    int cno = cbase;
    for (int i = 1; i < 192; i++) {  //???
        if (cno > 2)
            cno = 0;
        RAM[_BUF_COLUPF + i] = ColorConvert(RGB[cno]);
        cno++;
    }

    unsigned char *arena_pf0_left   = RAM + _BUF_PF0_LEFT;
    unsigned char *arena_pf1_left   = RAM + _BUF_PF1_LEFT;
    unsigned char *arena_pf2_left   = RAM + _BUF_PF2_LEFT;
    unsigned char *arena_pf0_right  = RAM + _BUF_PF0_RIGHT;
    unsigned char *arena_pf1_right  = RAM + _BUF_PF1_RIGHT;
    unsigned char *arena_pf2_right  = RAM + _BUF_PF2_RIGHT;

#define PF0 0
#define PF1 1
#define PF2 2

    unsigned char *arenas[][3] = {
        { arena_pf0_left, arena_pf1_left, arena_pf2_left },
        { arena_pf0_right, arena_pf1_right, arena_pf2_right },
    };

    // Choose a random square. If there's a piece there, try to move it randomly
    int x = getRandom32b() & 7;
    int y = getRandom32b() & 7;

    if (screen[y][x] && (getRandom32b() & 0xFF) < 0x20) {

        int tox = getRandom32b() & 7;
        int toy = getRandom32b() & 7;

        if (!screen[toy][tox]) {
            screen[toy][tox] = screen[y][x];
            screen[y][x] = 0;
        }
    }


    int rgb = cbase;
    int scanline;
    int mask, mask2;
    unsigned int piece;
    const unsigned int *im;
    int boardSquareImage;
    int pixels;

    // The draw of all pieces

    for (int row = 0; row < 8; row++) {
        for (int half = 0; half < 2; half++) {

            // column A = PF0<-D4D5D6D7, PF1 D7

            piece = screen[row][half * 4];
            if (!piece)
                piece += row & 1;

            im = *charSet[piece];
            boardSquareImage = *(*charSet[row & 1]);
            mask = boardSquareImage >> 15;

            scanline = row * 24;
            for (int y = 0; y < 24; y++) {

                if (rgb > 2)
                    rgb = 0;

                mask2 = *im >> 15;
                pixels = (((boardSquareImage >> (rgb * 5)) & ~mask2)
                    | (*im >> (rgb * 5))) & 0b11111;
                mask2 |= boardSquareImage >> 15;

                arenas[half][PF0][scanline] =
                    (arenas[half][PF0][scanline] & (~BitReversal(mask2 >> 1) & 0b11110000))
                    | (BitReversal(pixels >> 1) & 0b11110000);

                arenas[half][PF1][scanline] =
                    (arenas[half][PF1][scanline] & ~(mask2 << 7))
                    | ((pixels << 7) & 0b10000000);

                scanline++;
                im++;

                if (++rgb > 2)
                    rgb = 0;
            }

            // column B = PF1 D6D5D4D3D2

            piece = screen[row][1 + half * 4];
            if (!piece)
                piece += (row + 1) & 1;

            im = *charSet[piece];
            boardSquareImage = *(*charSet[(row + 1) & 1]);

            scanline = row * 24;
            for (int y = 0; y < 24; y++) {

                mask2 = *im >> 15;
                pixels = (((boardSquareImage >> (rgb * 5)) & ~mask2)
                    | (*im >> (rgb * 5))) & 0b11111;
                mask2 |= boardSquareImage >> 15;

                arenas[half][PF1][scanline] =
                    (arenas[half][PF1][scanline] & ~(mask << 2))
                    | ((pixels << 2) & 0b01111100);

                scanline++;
                im++;

                if (++rgb > 2)
                    rgb = 0;
            }

            // column C = PF1 D1D0, PF2<-D0D1D2

            piece = screen[row][2 + half * 4];
            if (!piece)
                piece += (row + 2) & 1;

            im = *charSet[piece];
            boardSquareImage = *(*charSet[(row + 2) & 1]);

            scanline = row * 24;
            for (int y = 0; y < 24; y++) {

                mask2 = *im >> 15;
                pixels = (((boardSquareImage >> (rgb * 5)) & ~mask2)
                    | (*im >> (rgb * 5))) & 0b11111;
                mask2 |= boardSquareImage >> 15;

                arenas[half][PF1][scanline] =
                    (arenas[half][PF1][scanline] & ~(mask >> 3))
                    | ((pixels >> 3) & 0b11);

                arenas[half][PF2][scanline] =
                    (arenas[half][PF2][scanline] & ~(BitReversal(mask) >> 5))
                    | ((BitReversal(pixels) >> 5) & 0b111);

                scanline++;
                im++;

                if (++rgb > 2)
                    rgb = 0;
            }

            // column D = PF2<-D3D4D5D6D7

            piece = screen[row][3 + half * 4];
            if (!piece)
                piece += (row + 3) & 1;

            im = *charSet[piece];
            boardSquareImage = *(*charSet[(row + 3) & 1]);

            scanline = row * 24;
            for (int y = 0; y < 24; y++) {

                mask2 = *im >> 15;
                pixels = (((boardSquareImage >> (rgb * 5)) & ~mask2)
                    | (*im >> (rgb * 5))) & 0b11111;
                mask2 |= boardSquareImage >> 15;

                arenas[half][PF2][scanline] =
                    (arenas[half][PF2][scanline] & ~BitReversal(mask))
                    | (BitReversal(pixels) & 0b11111000);

                scanline++;
                im++;

                if (++rgb > 2)
                    rgb = 0;
            }
        }

        // NOTE: this sits inside the row loop, so it runs once per row
        if (++cbase > 2)
            cbase = 0;
    }
}

 

 

CDFJChess.bin

Edited by Andrew Davie
  • Like 1
Link to comment
Share on other sites

On 1/21/2021 at 3:20 AM, Andrew Davie said:

// range: all 0 - 0b11111
#define B(red, green, blue, mask) \
      ((0b##red & 0b11111) << 0) \
    | ((0b##green & 0b11111) << 5) \
    | ((0b##blue & 0b11111) << 10 ) \
    | ((0b##mask & 0b11111) << 15)

 // Pieces defined thus... 
 const unsigned int WHITE_PAWN[] = {
    //     R      G      B   MASK
    B( 00000, 00000, 00000, 00000 ),    //  0
    B( 00000, 00000, 00000, 00000 ),    //  1
    B( 00000, 00000, 00000, 00000 ),    //  2
    B( 00100, 00000, 00100, 00100 ),    //  3
    B( 00100, 00100, 00100, 00100 ),    //  4
    B( 00100, 00100, 00100, 00100 ),    //  5
...

Alternatively, you could define this data in 6502 assembly, and call it from C/ARM directly. I store all my CDFJ data in assembly banks, as I found C arrays use overhead-bytes in ROM.

Also, as you're only using 20 bits per line, you don't need the full 32 bits of int.

Link to comment
Share on other sites

Just now, Dionoid said:

Alternatively, you could define this data in 6502 assembly, and call it from C/ARM directly. I store all my CDFJ data in assembly banks, as I found C arrays use overhead-bytes in ROM.

Also, as you're only using 20 bits per line, you don't need the full 32 bits of int.

Noted, and thanks for the comments. I was wondering what the advantages were.

Having said that, the issue here is speed, not memory.

 

  • Like 1
Link to comment
Share on other sites

Just recording the best RGB triplet I've found so far...  

 

const int RGB[] = { 0x32, 0xD6, 0x82 };

White pieces use colours 0, 1, 2 (and 0, 2 for shading)

Black pieces use colour 0 (and 1 for shading)

Squares use colour 3

 

NOTE: It doesn't look like this to the eye... each line "shimmers", quite unlike interlaced flicker, but kind of like it.

It's a 20Hz colour interlace with the lines in a static (same) position. Hard to describe, even harder to emulate.

 

1998613213_ScreenShot2021-01-24at12_56_13am.thumb.png.def523ecf310513399116efbba29ab5f.png

 

These result in fairly "rich/vibrant" colours.

Image is 1/3 frames of the ICC triplet (i.e., just a screen grab).

Stella phosphor is set to 30% per others' recommendations.

 

Edit: brighter with...

 

const int RGB[] = { 0x36, 0xDA, 0x86 };

2105319541_ScreenShot2021-01-24at1_03_09am.thumb.png.8449cccfabe09b8257bf529f2cdcc8ab.png

Edited by Andrew Davie
Link to comment
Share on other sites

12 hours ago, Andrew Davie said:

I guess I'm asking nicely for someone to test the attached binary on their Harmony Cart and see if I've fixed the timing and the screen no longer rolls.

 

Still rolls :(

 

12 hours ago, Andrew Davie said:

A fair whack of (ARM) memory;  6 bytes x 192 scanlines x 3 colours (RGB) * 2 buffer sets = 6912 bytes.

 

In CDFJ the data streams for the 6507 are restricted to the 4K Display Data RAM. Display Data wraps, so after fetching $0fff the next byte would be returned from $0000.
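Put in code, the 4K limit and the wraparound look roughly like this. It is a sketch only, with hypothetical names; the 4K figure and the $0fff to $0000 wrap are from the post above, the rest is illustration.

```c
#include <stdint.h>

#define DISPLAY_DATA_SIZE 0x1000u   /* 4K Display Data RAM */

/* Address of the byte fetched after 'addr', with wraparound. */
static uint32_t nextDisplayAddr(uint32_t addr) {
    return (addr + 1) & (DISPLAY_DATA_SIZE - 1);
}

/* Whether a buffer of 'bytes' can fit in Display Data at all. */
static int fitsInDisplayData(uint32_t bytes) {
    return bytes <= DISPLAY_DATA_SIZE;
}
```

The 6912-byte double-buffered layout does not fit in 4096 bytes, while a single 1152-byte set of playfield buffers does.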

 

The forthcoming CDFJ+ supports more memory, and Display Data is no longer limited to 4K.

On 11/16/2020 at 2:42 PM, SpiceWare said:

...

CDFJ+ is currently in development. Still uses 2K of ROM and 2K of RAM, but supports a newer Melody board with configurations of:

  • 64K ROM & 16K RAM
  • 128K ROM & 16K RAM
  • 256K ROM & 32K RAM
  • 512K ROM & 32K RAM

There's at least one game in the pipeline that'll be using CDFJ+.

 

Link to comment
Share on other sites

This is a "how it works" version showing in slow-motion the three ICC frames.

This is playing at 1/8 normal cycling speed, so you can clearly see the makeup of each ICC 'pixel'.  If you look, for example, at the green dot in both the normal (60 Hz) version, and in this (7.5 Hz) version, you can see how the green scanlines "roll" over 3 lines.  But in the fast version this isn't readily apparent, and the 3 lines merge into a reasonable "green".

 

 

CDFJChess.bin

  • Like 1
Link to comment
Share on other sites

14 hours ago, SpiceWare said:

Still rolls :(

It's been bugging me big-time. So close.

 

OK, I've worked out a new algorithm for drawing the screen. This is at least twice as quick as the last version.

I'm running out of magic tricks for speedups.  Fingers crossed that this one will work though.

 

 

 

 

CDFJChess.bin

Edited by Andrew Davie
Even more speedup (TM)
Link to comment
Share on other sites

This is the entirety of the new draw. Not much room left for optimisation of code.

Perhaps data structures might be changed for efficiency; but there's not a lot there I can see to improve.

 

// The draw of all pieces

for (int row = 0; row < 8; row++) {
  for (int half = 0; half < 2; half++) {

    // column A = PF0<-D4D5D6D7, PF1 D7
    // column B = PF1 D6D5D4D3D2
    // column C = PF1 D1D0, PF2<-D0D1D2
    // column D = PF2<-D3D4D5D6D7

    piece = screen[row][half * 4] + (row & 1) * 32;
    im = *charSet[piece];
    piece2 = screen[row][1 + half * 4] + ((row + 1) & 1) * 32;
    im2 = *charSet[piece2];
    piece3 = screen[row][2 + half * 4] + ((row + 2) & 1) * 32;
    im3 = *charSet[piece3];
    piece4 = screen[row][3 + half * 4] + ((row + 3) & 1) * 32;
    im4 = *charSet[piece4];


    scanline = row * 24;
    for (int y = 0; y < 24; y++) {

      int shifter = rgb * 5;

      pixels = *im++ >> shifter;
      pixels2 = (*im2++ >> shifter) & 0b11111;
      pixels3 = (*im3++ >> shifter) & 0b11111;
      pixels4 = (*im4++ >> shifter) & 0b11111;

      arenas[half][PF0][scanline] = BitReversal(pixels >> 1);
      arenas[half][PF1][scanline] = ((pixels << 7) & 0b10000000) | (pixels2 << 2) | (pixels3 >> 3);
      arenas[half][PF2][scanline] = (BitReversal(pixels3) >> 5) | BitReversal(pixels4);

      scanline++;

      if (++rgb > 2)
        rgb = 0;
    }

  }
}
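The PF1 line is the densest part of that loop, since one byte takes pixels from three different board columns. Here is a standalone sketch of just that step, with the same shifts as the code above; `composePF1` is a hypothetical name and the inputs are assumed to be 5-bit values with the leftmost pixel in bit 4.

```c
#include <stdint.h>

/* PF1 holds: column A's rightmost pixel (D7), all five pixels of
   column B (D6-D2), and column C's two leftmost pixels (D1-D0). */
static uint8_t composePF1(uint32_t a, uint32_t b, uint32_t c) {
    return (uint8_t)(((a << 7) & 0x80) | ((b & 0x1F) << 2) | ((c & 0x1F) >> 3));
}
```

Because every bit of PF1 is written from scratch, there is no read-modify-write of the old buffer contents, which seems to be where much of the speedup over the masked version comes from.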

 

  • Like 1
Link to comment
Share on other sites

I see one more thing; replace 'BitReversal' with table lookup.

... and if I get really desperate (I'm close)... change the shape definitions to have both normal and mirrored shapes so that I don't have to do the bit reversal at runtime.
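As a rough idea of what the table-lookup version could look like, here is a sketch using a 16-entry nibble table. It assumes `BitReversal` is a plain 8-bit reversal (its source isn't shown in this thread); `reverse8Lookup` and the table name are hypothetical.

```c
#include <stdint.h>

/* Each entry is its 4-bit index with the bit order reversed. */
static const uint8_t REVERSED_NIBBLE[16] = {
    0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE,
    0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF
};

/* Reverse a byte: the reversed low nibble becomes the high nibble
   and the reversed high nibble becomes the low nibble. */
static uint8_t reverse8Lookup(uint8_t b) {
    return (uint8_t)((REVERSED_NIBBLE[b & 0x0F] << 4) | REVERSED_NIBBLE[b >> 4]);
}
```

A full 256-byte table would need only one lookup per byte, at the cost of more ROM; storing pre-mirrored shapes avoids the lookup altogether.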

 

Edited by Andrew Davie
Link to comment
Share on other sites
