Question regarding storing multiple values into a single 16-bit wide value

skywaffle · February 5, 2019

Hello everyone,

I have been playing around with some different methods to condense data being stored. I typically would store information in various look up tables that would be accessed to help set up levels among other things, however, many values only need to be between 0 and 1, or 0-15, or 0-255. It seems that regardless of the size of the value in these tables, they still take up 16-bits worth of data which was eating a bit of extra ROM space.

I found that I could save space by combining usually two 4 bit values and an 8 bit into a 16-bit value, but was wondering if there was an easier method to retrieve the data that would use less overhead.

Currently I have a subroutine like this:

DECODE_VALUE: PROCEDURE

#VALUE1 = #ENCODEDVALUE / 256

#VALUE2 = ((#ENCODEDVALUE - (#VALUE1 * 256)) / 16)

#VALUE3 = #ENCODEDVALUE - ((#VALUE1 * 256) + (#VALUE2 * 16))

RETURN

END

I will store the value of the data to be retrieved in #ENCODEDVALUE, and GOSUB the code above to return three separate values that I then will copy to other variables.

I just was curious if I was going about this all wrong, and maybe there was a much more efficient method. Saving every bit of space is useful, and I was even contemplating for other game ideas to try to store some values into single bits to avoid having arrays that only need to hold a 0 or 1, but want to avoid using lots of unnecessary computations.

Thanks in advance,

Matthew

carlsson · February 5, 2019

Perhaps the modulo operator would help you?

#VALUE1 = #ENCODEDVALUE / 256

#VALUE3 = #ENCODEDVALUE % 256

#VALUE2 = #VALUE3 / 16

#VALUE3 = #VALUE3 % 16

One assignment more, but fewer calculations.

Edit: I'm reading 22 assembler instructions for your original method, 13 with the % operator.

Edited February 5, 2019 by carlsson

intvnut · February 5, 2019

In IntyBASIC, I suspect you would be better off using bitwise masks in many cases, rather than referring to previously unpacked values. For example:

DECODE_VALUE:  PROCEDURE
    #VALUE1 = #ENCODEDVALUE / 256
    #VALUE2 = (#ENCODEDVALUE / 16) AND 15
    #VALUE3 = #ENCODEDVALUE AND 15
END

.

This generates (in 1.4.0):

.

label_DECODE_VALUE: PROC
    BEGIN
    ;[4]     #VALUE1 = #ENCODEDVALUE / 256
    SRCFILE "/tmp/unpack.bas",4
    MVI var_&ENCODEDVALUE,R0
    SWAP R0
    ANDI #255,R0
    MVO R0,var_&VALUE1
    ;[5]     #VALUE2 = (#ENCODEDVALUE / 16) AND 15
    SRCFILE "/tmp/unpack.bas",5
    MVI var_&ENCODEDVALUE,R0
    SLR R0,2
    SLR R0,2
    ANDI #15,R0
    MVO R0,var_&VALUE2
    ;[6]     #VALUE3 = #ENCODEDVALUE AND 15
    SRCFILE "/tmp/unpack.bas",6
    MVI var_&ENCODEDVALUE,R0
    ANDI #15,R0
    MVO R0,var_&VALUE3
    ;[7] END

.

You can do better in assembly language by reusing previous partially-decoded values held only in registers. In IntyBASIC it tends to not be a win, since it doesn't have a good way to make use of intermediate results without writing extra stuff to memory. In assembly, I might do:

.

;  Assume R0 is #ENCODEDVALUE at start
;  R0 is #VALUE1, R1 is #VALUE2, R2 is #VALUE3 at end.
    MOVR   R0,     R1
    ANDI   #$FF00, R0
    XORR   R0,     R1
    SWAP   R0
    MVII   #$000F, R2
    ANDR   R1,     R2
    SLR    R1,     2
    SLR    R1,     2

.

intvnut · February 5, 2019

And, if you want to replace your unpack routine with some assembly:

.

DECODE_VALUE: PROCEDURE
    ASM MOV  var_&ENCODEDVALUE, R0
    ASM MOVR R0,     R1
    ASM ANDI #$FF00, R0
    ASM XORR R0,     R1
    ASM SWAP R0
    ASM MVO  R0,     var_&VALUE1
    ASM MVII #$000F, R0
    ASM ANDR R1,     R0
    ASM MVO  R0,     var_&VALUE3
    ASM SLR  R1,     2
    ASM SLR  R1,     2
    ASM MVO  R0,     var_&VALUE2
    RETURN
END

.

You can do even slightly better if you instead declare this as an assembly function (rather than using inline assembly), do avoid the cost of "BEGIN/RETURN", replacing them with "JR R5".

You might also just consider unpacking the field you need only when you use a value.

And if you have a bunch of 1-bit values that you're just testing for "set / not-set" in a lookup table, consider something like this:

.

' Return zero / non-zero based on whether the corresponding bit
' is set in the ROM bitmap 'Foo':
DEF FN IsFooSet(index) = (Foo(index / 16) AND Pow2(index % 16))

' Power-of-2 table.  Used as a bit-mask in IsFooSet().  Useful
' elsewhere whenever you need 2^N (aka. 1 << N).
Pow2:   DATA $0001, $0002, $0004, $0008, $0010, $0020, $0040, $0080
        DATA $0100, $0200, $0400, $0800, $1000, $2000, $4000, $8000
          
' Bitmap with 1 bit per element.
Foo: DATA $AA55 '...

Edited February 5, 2019 by intvnut

skywaffle · February 5, 2019

This is actually something I had never used, but I see it simplifies things quite a bit. One of those things that I just never had really understood. Thank you for the example. I have other routines that would benefit from using this operator as well (routines to determine even/odd numbers for example)

Perhaps the modulo operator would help you?

#VALUE1 = #ENCODEDVALUE / 256

#VALUE2 = #ENCODEDVALUE % 256

#VALUE3 = #VALUE2 / 16

#VALUE2 = #VALUE2 % 16

One assignment more, but fewer calculations.

Edited February 5, 2019 by skywaffle

carlsson · February 5, 2019

Odd/even might also be doable with AND 1 like in intvnut's examples.

skywaffle · February 5, 2019

I was wondering if there was a way to mask the values. The AND function is something I have not used often, and using it does not come natural for me still. I do rely on it with sprite collision, along with a lookup table to determine which sprite another sprite is colliding with, but have never really found any other ways to really utilize it.

Thank you for the detailed example, along with the assembly code. I will have to take some time to really try to get a better understanding of that, as I have only dabbled in some 6809 assembler many years back and really never quite had a good understanding of it.

In IntyBASIC, I suspect you would be better off using bitwise masks in many cases, rather than referring to previously unpacked values. For example:
DECODE_VALUE:  PROCEDURE
    #VALUE1 = #ENCODEDVALUE / 256
    #VALUE2 = (#ENCODEDVALUE / 16) AND 15
    #VALUE3 = #ENCODEDVALUE AND 15
END
.

This generates (in 1.4.0):

.
label_DECODE_VALUE: PROC
    BEGIN
    ;[4]     #VALUE1 = #ENCODEDVALUE / 256
    SRCFILE "/tmp/unpack.bas",4
    MVI var_&ENCODEDVALUE,R0
    SWAP R0
    ANDI #255,R0
    MVO R0,var_&VALUE1
    ;[5]     #VALUE2 = (#ENCODEDVALUE / 16) AND 15
    SRCFILE "/tmp/unpack.bas",5
    MVI var_&ENCODEDVALUE,R0
    SLR R0,2
    SLR R0,2
    ANDI #15,R0
    MVO R0,var_&VALUE2
    ;[6]     #VALUE3 = #ENCODEDVALUE AND 15
    SRCFILE "/tmp/unpack.bas",6
    MVI var_&ENCODEDVALUE,R0
    ANDI #15,R0
    MVO R0,var_&VALUE3
    ;[7] END
.

You can do better in assembly language by reusing previous partially-decoded values held only in registers. In IntyBASIC it tends to not be a win, since it doesn't have a good way to make use of intermediate results without writing extra stuff to memory. In assembly, I might do:

.
;  Assume R0 is #ENCODEDVALUE at start
;  R0 is #VALUE1, R1 is #VALUE2, R2 is #VALUE3 at end.
    MOVR   R0,     R1
    ANDI   #$FF00, R0
    XORR   R0,     R1
    SWAP   R0
    MVII   #$000F, R2
    ANDR   R1,     R2
    SLR    R1,     2
    SLR    R1,     2
.

intvnut · February 5, 2019

I was wondering if there was a way to mask the values. The AND function is something I have not used often, and using it does not come natural for me still. I do rely on it with sprite collision, along with a lookup table to determine which sprite another sprite is colliding with, but have never really found any other ways to really utilize it.

Thank you for the detailed example, along with the assembly code. I will have to take some time to really try to get a better understanding of that, as I have only dabbled in some 6809 assembler many years back and really never quite had a good understanding of it.

FWIW, AND 15 is the same as % 16, and AND 255 is the same as % 256. In general, ANDing with (2ⁿ - 1) is the same as taking modulo 2ⁿ.

I just used AND because I think in terms of bitwise operators. Heck, I even made that my Texas license plate while I still lived there. ;-)

skywaffle · February 5, 2019

That's awesome! I really have only started thinking about masking bits and trying to store information just because of the constraints of the Intellivision. It is easy to get carried away with many arrays that don't necessarily hold much information. Not to mention on the Castlevania game I have been working on, I have had so many issues with rom space. I had decided to try avoiding page flipping if I could help it. Months back I was struggling to fit half of the game in at 84kb. Now I am still right around there, but managed to get the other half of the game implemented. I still continue to try to find ways to write more efficiently if I can, hence all this about packing data. 100 bytes saved here or there feels like an accomplishment. But even after learning some of these functions, I can see simplifying other routines. It is amazing how much space convoluted logic takes up.

FWIW, AND 15 is the same as % 16, and AND 255 is the same as % 256. In general, ANDing with (2ⁿ - 1) is the same as taking modulo 2ⁿ.

I just used AND because I think in terms of bitwise operators. Heck, I even made that my Texas license plate while I still lived there.

bitwise.png

+DZ-Jay · February 5, 2019

One thing that always bother me in some high-level languages is that the modulo operator is a division that discards the quotient. Consequently, if you need both the result and the remainder, you have to essentially perform the division twice, which can get expensive.

That's not only IntyBASIC, I remember the same issue back in my Delphi 7 days. These days, I tend to write an assembly language routine and wrap it with a macro to retrieve both values if necessary.

However, I would recommend one of the alternatives offered by intvnut for more efficiency.

Perhaps intyBASIC could one-up Delphi and provide a secondary Division statement or operator that performs division and returns the remainder and the quotient somehow?

dZ.

carlsson · February 5, 2019

The manual further says:

A=A/B Simple unsigned division (done by repeated substraction, it can be slow)

Division by 2/4/8/16/32/64/128/256 is internally optimized.

Division of variable by variable uses Nanochess' fast division routine (214 to 517 cycles)

When using --jlp switch the division is accelerated by hardware

A=A%B Simple unsigned remainder

Note it does remainder by repeated substraction (can be slow).

Remainder by powers of 2 internally uses AND.

Remainder of variable by variable uses Nanochess' fast remainder routine (214 to 517 cycles)

+DZ-Jay · February 6, 2019

The AND function is something I have not used often, and using it does not come natural for me still.

I wrote a description of how to do masking with AND, explaining the concepts from first very principles, to help another user in this forum. Perhaps it can provide some insight?

http://atariage.com/forums/topic/285644-how-are-the-walls-made-in-game2bas/?p=4173285

-dZ.

First Spear · February 7, 2019

This thread is awesome, I am humbled to learn what I do each time I spend some minutes in here.

I do wonder though, if a game runs on JLP, there are many, many variables at your disposal (8 and 16 bit). What kind of design choices did you make that led you to want to pack multiple values into a single 16-bit variable? Speed of value retrieval?

+DZ-Jay · February 7, 2019

This thread is awesome, I am humbled to learn what I do each time I spend some minutes in here.

I do wonder though, if a game runs on JLP, there are many, many variables at your disposal (8 and 16 bit). What kind of design choices did you make that led you to want to pack multiple values into a single 16-bit variable? Speed of value retrieval?

Not to answer for skywaffle but in general, the reasons that drive you to pack values like that are almost always related to storage space.

Keep in mind that there will always be additional overhead incurred in unpacking the values, so there is already a performance hit built-in.

If a game is to be released in JLP hardware only and the on-board cartridge RAM is sufficient, then by all means, use that instead. Accessing memory directly will always be faster than having to retrieve, then unpack or decode, a value.

However, just be aware of the limitation you are imposing: the ROM won't run in emulators or platforms which do not support the extended RAM. This may or may not be a problem.

dZ.

intvnut · February 7, 2019

This thread is awesome, I am humbled to learn what I do each time I spend some minutes in here.

I do wonder though, if a game runs on JLP, there are many, many variables at your disposal (8 and 16 bit). What kind of design choices did you make that led you to want to pack multiple values into a single 16-bit variable? Speed of value retrieval?

My understanding here was that the ROM itself was too large, and so structures in ROM are getting packed into 16-bit words. That's somewhat orthogonal to whether you use JLP's additional RAM.

+DZ-Jay · February 7, 2019

My understanding here was that the ROM itself was too large, and so structures in ROM are getting packed into 16-bit words. That's somewhat orthogonal to whether you use JLP's additional RAM.

Well, kinda. My understanding is that skywaffle thought that the active state of the world (e.g., opened doors, picked up objects, etc.) would be too large to fit into RAM as individual arrays for each screen throughout the entire game.

-dZ.

skywaffle · February 7, 2019

For the Castlevania game, it has been all about freeing up space. My enemy look up tables for placement in levels originally had 5 values in data statements for each 5 gram-wide segments of levels. These would be used for spawning enemies and moving platforms. Just condensing two 8-bit values into a single 16 bit value and then decoding it frees up some 400 bytes.

Thinking about other games, I had wondered if projectile on/off status, among other things that only need a 1 bit value would benefit from storing everything into a single 16 bit variable. Although I have not currently run into any issues with not having enough RAM with the Castlevania game.. I feel that I have been fairly wasteful on arrays, but even then still have 20 8 bit variables and 5 16 bit variables that are unused.

As far as JLP enhanced features, I have not played with any of that yet.

This thread is awesome, I am humbled to learn what I do each time I spend some minutes in here.

I do wonder though, if a game runs on JLP, there are many, many variables at your disposal (8 and 16 bit). What kind of design choices did you make that led you to want to pack multiple values into a single 16-bit variable? Speed of value retrieval?

intvnut · February 7, 2019

OK, so it sounds like it could end up being a little bit of both: Packing ROM lookup tables and possibly packing values in memory. The initial need was ROM tables, but RAM variables may also need it eventually.

Storing projectile on/off status in a single 16-bit variable shouldn't be too costly. I'd recommend having two lookup tables: One for setting bits and one for clearing bits. Then you could define some simple functions like this:

.

DEF FN SetBit(var, bit) = var = NOT (NOT var AND ClrMask(bit))
DEF FN ClrBit(var, bit) = var = var AND ClrMask(bit)
DEF FN TstBit(val, bit) = ((val) AND SetMask(bit))

' This is the same as the power-of-2 table, so you can reuse that
' if you've already got it somewhere, rather than adding a redundant table.
SetMask:  DATA $0001, $0002, $0004, $0008
          DATA $0010, $0020, $0040, $0080
          DATA $0100, $0200, $0400, $0800
          DATA $1000, $2000, $4000, $8000

' This is the bitwise-inverse of the SetMask table.
ClrMask:  DATA $FFFE, $FFFD, $FFFB, $FFF7
          DATA $FFEF, $FFDF, $FFBF, $FF7F
          DATA $FEFF, $FDFF, $FBFF, $F7FF
          DATA $EFFF, $DFFF, $BFFF, $7FFF

.

SetBit() is written the way it is due to the fact the CPU doesn't have an OR instruction. That particular sequence generates reasonable code it appears.

(Edited to add the missing equals signs in the DEF FN above.)

Edited February 9, 2019 by intvnut

skywaffle · February 7, 2019

I am interested in the possible benefits of both. Currently I have really poured my time into just one project. But I can't help but wonder about other types of games. Some may have a need for smarter management of memory. But currently my biggest obstacle has been ROM space. I just am wanting to avoid the need for page flipping, and honestly it seems like I won't have to now.

When I started my Castlevania project, I did not know anything about IntyBASIC, and really am not a programmer. However, over the months, I have learned an awful lot, and despite trying to always use the manual for reference, some benefits to commands are not always obvious from the get go. For example, in just the past month or so I learned just how invaluable Define Varptr is. I always wondered why it seemed as if I could not define grams dynamically, as the Screen command could do so (and did not use "Varptr"). Getting rid of tons of define commands alone freed up space, not to mention made for much easier animation routines.

OK, so it sounds like it could end up being a little bit of both: Packing ROM lookup tables and possibly packing values in memory. The initial need was ROM tables, but RAM variables may also need it eventually.

Storing projectile on/off status in a single 16-bit variable shouldn't be too costly. I'd recommend having two lookup tables: One for setting bits and one for clearing bits. Then you could define some simple functions like this:

.
DEF FN SetBit(var, bit) var = NOT (NOT var AND ClrMask(bit))
DEF FN ClrBit(var, bit) var = var AND ClrMask(bit)
DEF FN TstBit(val, bit) ((val) AND SetMask(bit))

' This is the same as the power-of-2 table, so you can reuse that
' if you've already got it somewhere, rather than adding a redundant table.
SetMask:  DATA $0001, $0002, $0004, $0008
          DATA $0010, $0020, $0040, $0080
          DATA $0100, $0200, $0400, $0800
          DATA $1000, $2000, $4000, $8000

' This is the bitwise-inverse of the SetMask table.
ClrMask:  DATA $FFFE, $FFFD, $FFFB, $FFF7
          DATA $FFEF, $FFDF, $FFBF, $FF7F
          DATA $FEFF, $FDFF, $FBFF, $F7FF
          DATA $EFFF, $DFFF, $BFFF, $7FFF
.

SetBit() is written the way it is due to the fact the CPU doesn't have an OR instruction. That particular sequence generates reasonable code it appears.

intvnut · February 8, 2019

One thing that always bother me in some high-level languages is that the modulo operator is a division that discards the quotient. Consequently, if you need both the result and the remainder, you have to essentially perform the division twice, which can get expensive.

Note that IntyBASIC optimizes modulo w/ 2ⁿ to the corresponding AND instruction, and divides by 2ⁿ to the corresponding shift instruction(s). So, while your observation is correct for general divides and modulos, there really is no difference between these two expressions:

.

#Y = #X AND 15
#Y = #X % 16

.

I say use whichever one is easiest for you to read. At my $DAYJOB, they actually prefer you to use the modulo rather than the bitwise-AND. Decent compilers will do the conversion when they optimize. (Note that for C and C++, you'll want to use an unsigned type. IntyBASIC always does unsigned divide and modulo.)

+DZ-Jay · February 8, 2019

Note that IntyBASIC optimizes modulo w/ 2ⁿ to the corresponding AND instruction, and divides by 2ⁿ to the corresponding shift instruction(s). So, while your observation is correct for general divides and modulos, there really is no difference between these two expressions:

.
#Y = #X AND 15
#Y = #X % 16
.

I say use whichever one is easiest for you to read. At my $DAYJOB, they actually prefer you to use the modulo rather than the bitwise-AND. Decent compilers will do the conversion when they optimize. (Note that for C and C++, you'll want to use an unsigned type. IntyBASIC always does unsigned divide and modulo.)

Yes. I meant more about general divisions of non-powers of 2. I guess for the specific application of unpacking and masking, it really is not a big problem.

-dZ.

Question regarding storing multiple values into a single 16-bit wide value

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation

Recently Browsing 0 members