rensoup

peephole optimization: automatic code optimization for space?


 

So I've got this porting project that I'm working on... I've got about 25KB of 6502 code which I would like to reduce if possible.

Is there a tool that could automatically optimize it for space?

Perhaps I could use a C compiler, convert the code to inline asm, and let the compiler do some magic?

Any thoughts?

 

A quick search reveals these:

http://web.archive.org/web/20010627195141/http://www.heilbronn.netsurf.de/~dallmann/lunix/src/opt65.c

https://github.com/RussellSprouts/6502-enumerator

Does anyone have experience with them?

 

Thanks


With a compiler you get the overhead of included libraries etc. that need to be in the final product. The critical mass at which a compiled production could be smaller than 100% asm is generally far more than a standard machine's memory can hold. Ultimately, 100% assembler can always have the advantage anyway, since every other language has its basis in machine code.

 

An optimizer: I have some doubts, at least as far as hand-written programs go. Maybe it would work well on automatically generated code such as what RastaConverter produces (it has lots of NOPs and repeated sequences). The 6502 doesn't lend itself well to such techniques, since it's a poor CPU for relocatable object code.

 

A better approach might be compressing embedded data. For that to work, though, your program would need a relatively high proportion of data to instructions, the data would need to compress well, and you'd need the workspace to decompress it. And again there's that critical-mass issue: only medium to large programs will benefit.

Edited by Rybags


25k of pure code sounds like a lot, so I expect there is inline data.

On the other hand, 25k is no problem at all for the Atari 8-bit, so why reduce it?
If it is about reducing loading time, compressing the final XEX is the best option.


There is no automatic tool I am aware of. 6502 assembler is not really orthogonal enough to allow many peephole optimizations, so I doubt that much is possible. It is mostly manual work.
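For readers unfamiliar with the term, a peephole pass looks at a small window of adjacent instructions and swaps in a shorter equivalent. Here is a minimal Python sketch of two such rewrites on a toy instruction list. The `(label, mnemonic, operand)` tuple format and the `peephole` function are made up for illustration; this is not how opt65.c or any existing tool works. Note that the JSR/RTS rewrite assumes the callee does not inspect its return address on the stack.

```python
# Toy peephole pass. Each instruction is (label, mnemonic, operand);
# label and operand may be None. The format is hypothetical.

def peephole(program):
    """Apply two size-reducing rewrites until nothing changes."""
    out = list(program)
    changed = True
    while changed:
        changed = False
        for i in range(len(out) - 1):
            label, op, arg = out[i]
            next_label, next_op, _ = out[i + 1]
            # Rule 1: "JSR target / RTS" -> "JMP target" (saves 1 byte).
            # Skipped when the RTS carries a label, because other code
            # may branch to it.
            if op == "JSR" and next_op == "RTS" and next_label is None:
                out[i:i + 2] = [(label, "JMP", arg)]
                changed = True
                break
            # Rule 2: an unlabeled JMP to the very next instruction does
            # nothing and can be dropped (saves 3 bytes).
            if (op == "JMP" and label is None
                    and arg is not None and arg == next_label):
                del out[i]
                changed = True
                break
    return out


program = [
    ("draw", "LDA", "#$10"),
    (None,   "STA", "$D01A"),
    (None,   "JSR", "wait"),
    (None,   "RTS", None),
    ("wait", "LDX", "#$00"),
    (None,   "JMP", "loop"),
    ("loop", "DEX", None),
    (None,   "BNE", "loop"),
    (None,   "RTS", None),
]
optimized = peephole(program)
```

In this example the tail call becomes a JMP to the routine that immediately follows, so the JMP itself disappears and `draw` simply falls through into `wait`; together with the redundant `JMP loop`, 7 bytes go away. Real code needs far more care (self-modifying code, jump tables, flag dependencies), which is part of why such tools are rare for the 6502.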

 


C compilers generally treat inline assembler as an obstacle to optimization, so that's not the way to go.

You can give the opt65.c a try, there is nothing to lose. It is very dated, though.

 

 

 

 


- You can pack some code blocks and depack them only when you need them. That costs additional space for the depacking code and some additional code for handling it.

- If your porting project converted other asm to 6502 asm, you may be able to generate shorter code in the converter. Or use a token/interpreter technique where speed is not a problem.

- You can write code for 128 KB RAM. That costs you extra management of the 16 KB bank area.

- You can write code for an Atarimax cart or similar and put the code in 8 KB pages, using only 8 KB of the 64 KB RAM area.

- You can analyse your code by hand and find identical code blocks that can be called via JSR/RTS wherever they appear.
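The last suggestion can be partly automated. Below is a minimal Python sketch, assuming the source is available as a list of normalized instruction lines. `find_repeats` and `subroutine_saving` are hypothetical helper names, and the byte counts must come from your assembler listing. It finds repeated runs of lines and estimates whether factoring them into a subroutine pays off.

```python
from collections import defaultdict

JSR_BYTES = 3  # JSR with an absolute address
RTS_BYTES = 1


def subroutine_saving(block_bytes, occurrences):
    """Bytes saved by keeping one copy (plus RTS) and calling it everywhere."""
    before = block_bytes * occurrences
    after = block_bytes + RTS_BYTES + JSR_BYTES * occurrences
    return before - after


def find_repeats(lines, window):
    """Map each run of `window` consecutive lines to its start indices,
    keeping only runs that occur more than once (matches may overlap)."""
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        seen[tuple(lines[i:i + window])].append(i)
    return {seq: pos for seq, pos in seen.items() if len(pos) > 1}


lines = [
    "LDA $80", "CLC", "ADC #$08", "STA $80",
    "LDX #$00",
    "LDA $80", "CLC", "ADC #$08", "STA $80",
    "RTS",
]
repeats = find_repeats(lines, 4)
```

The repeated block here is 7 bytes (2 + 1 + 2 + 2), so `subroutine_saving(7, 2)` comes out at 0: two copies of a short block are not worth factoring, while `subroutine_saving(12, 4)` gives 23 bytes. Each call also costs 12 extra cycles (JSR plus RTS), and a candidate block must not contain branches out of the block or stack tricks.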

 

Any more Info on what you are doing or some code samples?

20 hours ago, Rybags said:

With a compiler you get the overhead of included libraries etc. that need to be in the final product. The critical mass at which a compiled production could be smaller than 100% asm is generally far more than a standard machine's memory can hold. Ultimately, 100% assembler can always have the advantage anyway, since every other language has its basis in machine code.

To be clear, I don't want to include C libs or compiled C code; I thought a compiler might still be able to optimize inline asm code.

 

 

13 hours ago, JAC! said:

25k of pure code sounds like a lot, so I expect there is inline data.

 

There is more or less 25KB of 6502 code, and about 13KB of game data.

 

I can give a bit more detail but the consensus seems to be negative.

 

Right now the game runs in a "virtual" 6502 environment, that means there's no hardware specific emulation (no graphics no sound), except for input. The draw functions are intercepted and redirected to D3D.

 

Here's a screenshot at init:

 

[Screenshot: memuse.png]

 

- Each square is a byte.

- Each white frame is a label (I can hover over it to get its name).

- Each black byte has never been accessed.

- Each red byte has been written to.

- Each blue byte has been read but never written (uninitialized access!).

- Each green byte has been both written to and read.

 

 

After running the game for 2 frames:

 

[Screenshot: memuse2.png]

 

The area from $2700 to $7D00 is code; everything before and after it is data.

 

All the memory up to $9200 is used (the black areas will be used)

 

Zero page (ZP) is almost full, which gives back a decent amount of code space. I'm already using the bottom of the stack page as well.

 

After just 10 seconds of gameplay it's mostly all green, so yes the game touches a lot of code and data.

 

I'm using 37 KB of RAM *without* the screen buffers (mode E, 2 × 160 pixels × 216 lines), PMG area, DL, disk buffer, and Atari code. I think my 64 KB are pretty much full.

Not counting graphics/sound yet!
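To put rough numbers on that budget, here is a small Python calculation. It assumes mode E uses 2 bits per pixel (four colours), so a 160-pixel line takes 40 bytes. The variable names are illustrative, and the "remaining" figure is only an upper bound because it ignores hardware registers and any OS ROM left mapped in.

```python
KB = 1024

BYTES_PER_LINE = 160 * 2 // 8   # 160 pixels at 2 bits per pixel = 40 bytes
LINES = 216
BUFFERS = 2

screen_bytes = BYTES_PER_LINE * LINES * BUFFERS   # both screen buffers: 17280
game_bytes = 37 * KB                              # code + data figure from the post
remaining = 64 * KB - game_bytes - screen_bytes   # upper bound for everything else: 10368
```

So the two screen buffers take about 16.9 KB, and at most roughly 10 KB is left for the PMG area, display list, disk buffer and system code.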

 

10 hours ago, 1NG said:

- You can pack some code blocks and depack them only if you need them. That would cost you additional space for the depacking code and some additional code for the handling

Dmsc, another forum member, provided a fast LZ4 decompressor that manages roughly 1500 bytes per frame. The game touches a lot of code per frame, so decompressing code on the fly doesn't seem like a good option.

I'm hoping I'll be able to decompress graphics data on the fly though.
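A quick back-of-the-envelope check of what that rate means, as a Python sketch. `frames_to_unpack` is a hypothetical helper, and the 1500 bytes/frame figure is the approximate rate quoted above.

```python
import math

BYTES_PER_FRAME = 1500  # approximate decompression rate quoted in the post


def frames_to_unpack(size_bytes, rate=BYTES_PER_FRAME):
    """Whole frames needed to decompress `size_bytes` of output."""
    return math.ceil(size_bytes / rate)


code_frames = frames_to_unpack(25 * 1024)   # all 25 KB of code: 18 frames
screen_frames = frames_to_unpack(8640)      # one 160x216 mode E screen: 6 frames
```

Unpacking all the code would take about 18 frames, which fits a level load but not per-frame swapping; a full screen of graphics takes about 6 frames, so streaming graphics in pieces looks more realistic.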

 

10 hours ago, 1NG said:

You can write code for 128 KB RAM. That costs you extra management of the 16 KB bank area.

The project is 128KB. All the graphics/sound data will go into the extra 64KB.

11 hours ago, 1NG said:

If you have converted other asm to 6502 asm in your porting project, then you may develop shorter code in the converter. Or use tokens/interpreter technique where speed is no problem

Nope the code is 6502 originally.

 

11 hours ago, 1NG said:

you can analyse your code by hand and find identical codeblocks that can be used via JSR/RTS on every appearance

I've done a little bit of code/data optimization on the project and I'm not very good at it :) but I managed to save a few KB.

 

11 hours ago, 1NG said:

Any more Info on what you are doing or some code samples?

Not yet :) but perhaps people will be interested in contributing once it gets a public release.

 

 

 

 

 


Well, in the 80s we used bit/byte crunchers that removed zeros from any ML program. The result was an ML file with dozens (or hundreds) of short segments. Nowadays I would use Exomizer, Inflate/Deflate or LZ4; afaik, Superpacker includes them all: http://madteam.atari8.info/uzytki/sp.7z

(That's not a ZIP, it is a .7z, so use e.g. 7-Zip to unpack: https://7-zip.org/ )
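The zero-removal idea can be sketched in a few lines of Python. This is only an illustration of the principle, not a reconstruction of any particular 80s cruncher. `split_on_zero_runs` is a made-up name, and the `min_run=5` default assumes each extra segment in an Atari binary load file costs a 4-byte header (start and end address). It also assumes the target memory is already zero, or gets cleared, before loading.

```python
def split_on_zero_runs(image, base, min_run=5):
    """Split `image` (bytes loaded at address `base`) into (address, data)
    segments, dropping runs of at least `min_run` zero bytes."""
    segments = []
    start = None   # index where the current segment began
    zeros = 0      # length of the zero run currently being scanned
    for i, b in enumerate(image):
        if b == 0:
            zeros += 1
            continue
        if start is None:
            start = i                    # leading zeros are dropped
        elif zeros >= min_run:
            end = i - zeros              # close the segment before the run
            segments.append((base + start, image[start:end]))
            start = i
        zeros = 0
    if start is not None:
        end = len(image) - zeros         # trailing zeros are dropped too
        segments.append((base + start, image[start:end]))
    return segments


image = bytes([1, 2, 0, 0, 3] + [0] * 8 + [4, 5, 0])
segments = split_on_zero_runs(image, 0x2000)
```

Here the short two-byte zero run stays inside the first segment, while the eight-byte run is cut out, leaving one segment at $2000 and one at $200D. A real packer such as the ones named above does much better than this on typical data.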

Edited by CharlieChaplin
