Hmm, I looked at the code of Graviten and there seems to be a VERY compact and optimized way of moving objects around. No additional if branches, just pure logic in one line using bitwise arithmetic. A similar trick is used for rotating an object, using offsets into an array of data.
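For anyone wondering what that kind of branch-free code can look like, here is a minimal sketch in C. It is not Graviten's actual code: the playfield width, the 8-step direction table and the function names are all made up for illustration, and the mask trick only works when the sizes are powers of two.

```c
#include <stdint.h>

/* Hypothetical playfield width; must be a power of two for the mask trick. */
#define FIELD_W 256u

/* Hypothetical number of rotation steps; also a power of two. */
#define ROT_STEPS 8u

/* Precomputed (dx, dy) unit offsets, one pair per rotation step. */
static const int8_t rot_table[ROT_STEPS][2] = {
    { 1,  0}, { 1,  1}, { 0,  1}, {-1,  1},
    {-1,  0}, {-1, -1}, { 0, -1}, { 1, -1}
};

/* Move x by dx and wrap at the playfield edge without an if:
 * ANDing with FIELD_W - 1 equals modulo FIELD_W for powers of two,
 * and unsigned arithmetic makes negative dx wrap correctly too. */
static inline uint32_t move_wrap(uint32_t x, int32_t dx)
{
    return (x + (uint32_t)dx) & (FIELD_W - 1u);
}

/* Rotate by `turn` steps and return the matching (dx, dy) pair.
 * The new angle is just a masked offset into the table. */
static inline const int8_t *rot_entry(uint32_t angle, int32_t turn)
{
    return rot_table[(angle + (uint32_t)turn) & (ROT_STEPS - 1u)];
}
```

Moving right from x = 255 with `move_wrap(255, 1)` lands on 0, and turning left from angle 0 with `rot_entry(0, -1)` picks the last table entry, all without a single compare.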
Be very careful with one-liners in higher-level languages. Some translate to pages of ASM code...
The best way to produce fast code in a higher-level language (say, C or Pascal) is to have the compiler also emit an assembly file alongside its normal output, so that you can quickly compare your subroutine with the generated ASM code.
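As a rough sketch of that workflow, assuming a GCC-style toolchain (other compilers have their own switch for this, so check the manual for yours): the file name and the routine below are invented for the example, and the command in the comment is what produces the assembly listing.

```c
/* sum.c - a small routine to inspect in the generated assembly.
 *
 * With GCC or Clang, -S stops after compilation and writes sum.s
 * instead of an object file:
 *
 *     gcc -O2 -S sum.c
 *
 * Open sum.s next to this file and compare the loop line by line.
 */
#include <stddef.h>
#include <stdint.h>

/* Adds up `count` 16-bit words into a 32-bit total. */
int32_t sum_words(const int16_t *data, size_t count)
{
    int32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += data[i];
    }
    return total;
}
```

Try small rewrites (counting down instead of up, walking a pointer instead of indexing) and regenerate the listing each time to see what the compiler actually does with them.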
This will give you insight that goes waaay beyond any benchmarks you run, since after a week or two of watching the generated ASM code, you'll understand what works best (though your source code will no longer look like a human wrote it, but hey - it's fast, so who cares).
It will also be a source of major entertainment, as certain WTFs are truly mind-blowing (talking about the 68000 C compiler on the Jaguar here).