
Optimize data overlapping


Thomas Jentzsch


During my work on Space Castle I have to manage quite a lot of graphics data. The castle explosion in particular is very big (currently 16 frames, each 5 blocks of 40 bytes = 3200 bytes), but there is also a lot of other graphics data needed for the various messages and the scores.

 

All data is compressed by overlapping the data blocks. For the explosion this was done by Chris before I took over from him. When Nathan asked me if there was enough space for more frames to improve the explosion, I was facing the problem that I would have to optimize the overlapping of the 80 (or more!) blocks by hand again, and that this would always result in less than optimal solutions.

 

So I decided to write my own program to do that. :)

 

The basic idea was to calculate the overlaps of each block with each other block and then let the code do a recursive depth-first search (DFS) to try all possible combinations. While this worked within reasonable time for 10 blocks, I soon realized that for 80 blocks it would take forever (the complexity is O(nⁿ)). So I needed to find a faster way, one which would deliver solutions close to the optimum in a short time.
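To make the idea concrete, here is a minimal sketch of such an exhaustive search (in Python for readability; the actual tool is written in FreePascal, so this is just an illustration of the approach):

from itertools import permutations

def overlap(a, b):
    # largest k such that the last k bytes of a equal the first k bytes of b
    return next((k for k in range(min(len(a), len(b)), 0, -1)
                 if a[-k:] == b[:k]), 0)

def saved_bytes(blocks, order):
    # total bytes removed by overlapping consecutive blocks in this order
    return sum(overlap(blocks[i], blocks[j]) for i, j in zip(order, order[1:]))

def best_order_exhaustive(blocks):
    # tries every possible ordering; the number of orderings explodes
    # with the block count, which is why this only works for small n
    return max(permutations(range(len(blocks))),
               key=lambda order: saved_bytes(blocks, order))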

 

I searched the web and found... nothing (well, nothing I could understand :)).

 

Then I experimented with the Greedy algorithm. My first code just started with the 1st block and reordered the other blocks by always selecting the next block with the maximum overlap. This delivered results very fast, but far from the optimum I had found with my DFS for lower block counts. So I decided to iterate over all blocks as starting blocks for Greedy and then chose the best solution. This returned somewhat better results, but was still not satisfying.
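A sketch of that first greedy variant and the all-starts version (Python again, reusing overlap() and saved_bytes() from the sketch above):

def greedy_order(blocks, start):
    # chain blocks: always append the unused block with the largest
    # overlap to the end of the chain built so far
    order, used = [start], {start}
    while len(order) < len(blocks):
        nxt = max((i for i in range(len(blocks)) if i not in used),
                  key=lambda i: overlap(blocks[order[-1]], blocks[i]))
        order.append(nxt)
        used.add(nxt)
    return order

def greedy_all_starts(blocks):
    # try every block as the starting block and keep the best chain
    return max((greedy_order(blocks, s) for s in range(len(blocks))),
               key=lambda order: saved_bytes(blocks, order))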

 

While doing the last improvement, I made a coding error which resulted in the blocks getting mixed up. The calculation result was still OK, but the data generated for DASM was wrong, because I had forgotten to reset the block ordering before each iteration. When I got the result from the last optimization, I remembered that my buggy code had delivered even better results. So I reimplemented the bug and just fixed the output. Then I let the code run for some more iterations and the results were GREAT! :)

 

So this is how my algorithm works now:

  1. calculate the maximum overlap between each block and each other block (in both directions)
  2. call Greedy with random starting blocks many times (e.g. n³ times) and feed the resulting block order into the next Greedy call (see the sketch below)

Pretty simple and reasonably fast for the number of blocks I need. And the results are perfect for low block counts and very consistent for higher ones (where I cannot verify with DFS anymore).
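Putting both steps together, a compact sketch of the whole procedure (Python for illustration, with ov being the precomputed overlap matrix from step 1; the real tool is FreePascal, so take this as the idea, not its actual code):

import random

def greedy(ov, seq, start):
    # walk a chain from `start`, always appending the remaining block with
    # the largest overlap; ties go to whichever block comes first in `seq`,
    # which is why feeding the previous result back in changes the outcome
    order, used = [start], {start}
    while len(order) < len(seq):
        nxt = max((i for i in seq if i not in used),
                  key=lambda i: ov[order[-1]][i])
        order.append(nxt)
        used.add(nxt)
    return order

def optimize(ov, iterations):
    # step 2: many greedy passes with random starting blocks, each fed
    # the block order produced by the previous pass; keep the best
    def saved(o):
        return sum(ov[a][b] for a, b in zip(o, o[1:]))
    order = list(range(len(ov)))
    best = order[:]
    for _ in range(iterations):          # e.g. n**3 iterations
        order = greedy(ov, order, random.choice(order))
        if saved(order) > saved(best):
            best = order[:]
    return best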

 

Attached you'll find the program and some test files. The program also offers a lot of options, like page sizes to consider, offsets into pages, various output formats etc. This should be pretty useful.

DOOD v0.91.zip DOOD v0.93.zip

18 Comments


Recommended Comments

All data is compressed by overlapping the data blocks.

 

Does this mean that, in order to get the 6th frame of animation for a sprite, you must iterate through frames 1, 2, 3, 4 and 5?


No, we use pointers to the beginning of each block of the frames.

 

ExplodePtrLo
DC.B <Explode0_0, <Explode0_1, <Explode0_2, <Explode0_3, <Explode0_4
DC.B <Explode1_0, <Explode1_1, <Explode1_2, <Explode1_3, <Explode1_4
DC.B <Explode2_0, <Explode2_1, <Explode2_2, <Explode2_3, <Explode2_4
...
DC.B <Explode15_0, <Explode15_1, <Explode15_2, <Explode15_3, <Explode15_4

And a 2nd table (ExplodePtrHi) for the high byte.
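So any frame is addressable directly. A hypothetical sketch of the lookup (Python-style, with ptr_lo/ptr_hi standing in for the ExplodePtrLo/ExplodePtrHi tables above, assuming 5 blocks per frame as in the post):

def block_address(frame, block, ptr_lo, ptr_hi):
    # 5 pointer entries per explosion frame, laid out as in the tables above
    idx = frame * 5 + block
    return ptr_lo[idx] | (ptr_hi[idx] << 8)

On the 6502 itself this boils down to a pair of indexed loads, one from each table.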


Greedy works great if the dataset is structured to match it. There will always be exceptions, usually when the data is random, that is, when you can't organize it.


Updated to version 0.93

Improvements:

  • much better scanner (handles tabs and comments, skips empty lines)
  • output can now be identical to the input (new default)
  • fixed a bug when data is completely included in other data
  • the overlap matrix file is now sorted identically to the data in the generated output file
  • can handle the special zz and zr formats (constants used for easier graphics data definition)
  • some new statistics added

Maybe a stupid question, but: how do I run this?

(I'm limited to Linux, and if Wine won't do, I'm screwed, but I'd still like to know ;-))


You just start the executable with the file containing the graphics you want to optimize as an argument. There are a few more arguments for fine tuning, but none of them is really required.

 

The output is two files:

  • a matrix showing how each graphic block overlaps with every other block
  • the optimized graphics file.
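For example (hypothetical file name; the post above only confirms that the input file is passed as an argument):

DOOD explosion.asm

This would write the overlap matrix file and the optimized graphics file.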

The code is a mess, so I won't post it here. It uses FreePascal, which is available for Linux too. I can PM you the code then.

 

Or you send me the graphics file and I optimize it for you.



I'm set up for Pascal now and should be able to create a Mac executable.


Sorry for the very late bump, but I just came across this post. Could you provide some background as to how the "overlapping blocks" method of packing data works, in principle?

 

I would like to better understand how to efficiently pack large amounts of repeating data.

 

dZ.


This is very simple.

 

If the last n bytes of one data block are identical with the first n bytes of another data block, we can remove the last n bytes of the first data block and overlap the end of this data block with the beginning of the other one.

 

It may also happen that the content of a whole data block is completely included in another data block, but that's very rare.
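In code, the check could look like this (an illustrative Python sketch, not the tool's actual FreePascal code):

def overlap(a, b):
    # largest n such that the last n bytes of a equal the first n bytes of b
    for n in range(min(len(a), len(b)), 0, -1):
        if a[-n:] == b[:n]:
            return n
    return 0

# example: bytes 01 02 03 04 followed by 03 04 05 overlap by 2 bytes,
# so both blocks fit into 01 02 03 04 05 (7 bytes stored as 5)
assert overlap(b"\x01\x02\x03\x04", b"\x03\x04\x05") == 2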


Thank you, Thomas, for that explanation, it makes sense.

 

I do have more questions. It seems that this algorithm will force you to mix the frames of multiple sequences, which will require some indexing strategy, consuming space in the process. This also incurs additional processing during reads.

 

I guess that since you are using this technique, the savings outweigh the costs in additional complexity and processing overhead, but I would like to hear your thoughts on the typical overall savings this produced, and what strategies there are to reduce the overhead.

 

-dZ.


Also, would you mind stating the problem in terms of the Greedy algorithm? I'm having a hard time understanding how you applied it to your optimization. (I'm new at this.)


I am not sure why you think it increases complexity. Unless you can calculate the address (e.g. for same-size data like digits), you need an address label anyway. Maybe I misunderstand you here. Can you give an example?

 

The savings greatly depend on the amount and type of data you have. The more random the data is, the lower the chances for overlapping. But usually the data has some structure, and then you can save 20% or more.

 

As for Greedy: Greedy only finds local minimums (see Wikipedia). A single pass will usually end far from the optimum. By randomizing the data before applying Greedy, you create different situations, which result in different local minimums. All you have to do is remember the lowest minimum. Usually it takes only a few hundred or thousand iterations until the best solution (or a very, very close one) is found.
