ricortes Posted April 25, 2015 (edited)
Sorry, double post.

Shawn Jefferson Posted April 26, 2015
Quoting an earlier post: Memcopy using indirect addressing for speed? What about using self-modifying code instead?
Sometimes you can't use self-modifying code, for instance, when running code from cart.
I think we sometimes obsess about speeding up specific pieces of code, when really they are "good enough" and you'd be better off working on gameplay or other mechanics. Not you specifically, of course... just us 8-bit programmers in general.

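To put rough numbers on the indirect versus self-modifying trade-off discussed here, the sketch below adds up documented 6502 cycle counts for a simple page-copy loop. It is only an estimate under stated assumptions: no page-crossing penalties, every `BNE` counted as taken, and a hypothetical loop shape of load, store, `INY`, `BNE`. The helper name `cycles_per_byte` and the loop layouts are illustrative, not taken from any routine posted in this thread.

```python
# Documented 6502 cycle counts, assuming no page-crossing penalties.
CYCLES = {
    "lda (zp),y": 5,  # indirect indexed load (works from ROM/cart)
    "sta (zp),y": 6,  # indirect indexed store
    "lda abs,y": 4,   # absolute indexed load (operand patched by self-modifying code)
    "sta abs,y": 5,   # absolute indexed store
    "iny": 2,
    "bne taken": 3,
}

def cycles_per_byte(load, store, unroll=1):
    """Estimate cycles per copied byte for a hypothetical loop whose body has
    `unroll` load/store pairs (each pair aimed at a different page), followed
    by a single INY and a taken BNE."""
    body = unroll * (CYCLES[load] + CYCLES[store])
    overhead = CYCLES["iny"] + CYCLES["bne taken"]
    return (body + overhead) / unroll

indirect = cycles_per_byte("lda (zp),y", "sta (zp),y")            # zero-page pointers
absolute = cycles_per_byte("lda abs,y", "sta abs,y")              # self-modified operands
unrolled4 = cycles_per_byte("lda abs,y", "sta abs,y", unroll=4)   # 4 pairs per pass

print(indirect, absolute, unrolled4)
```

Under these assumptions the self-modified loop saves about two cycles per byte over the indirect one, and unrolling mostly helps by spreading the `INY`/`BNE` overhead over more bytes. Whether that matters for a given game is exactly the "good enough" question raised above.
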
emkay Posted April 27, 2015
Quoting Shawn Jefferson: Sometimes you can't use self-modifying code, for instance, when running code from cart.
Running from a cart should automatically imply using unrolled code and lookup tables. Using a cart also means more of the real machine's RAM is available, so the self-modifying code can go there.
Quoting Shawn Jefferson: I think we sometimes obsess about speeding up specific pieces of code, when really they are "good enough" and you'd be better off working on gameplay or other mechanics. Not you specifically, of course... just us 8-bit programmers in general.
Good enough? Every single frame counts, so why not push the speed to the limit by using the fastest routines?

flashjazzcat Posted April 27, 2015 (edited)
And Franco's routine advocates (and lends itself to) loop unrolling anyway, so we come full circle...

Heaven/TQA Posted April 27, 2015
@Franco I just stumbled across this in your source: '; Generic memcpy optimized for speed. No overlapping'. That was my intention, and I was wondering why it uses ZP then. But as usual... we are not talking about the "fastest" generic memcopy routine here... so no offence.

Heaven/TQA Posted April 27, 2015
@shawn... yup. I always forget the "ROM" dev guys...

Creature XL Posted April 27, 2015 (edited)
Quoting emkay: Running from a cart should automatically imply using unrolled code and lookup tables. Using a cart also means more of the real machine's RAM is available, so the self-modifying code can go there. Good enough? Every single frame counts, so why not push the speed to the limit by using the fastest routines?
Because it uses more memory? Because in most cases it just doesn't matter?
We are in the Programming part of the forums here. So, for you, in German: Wenn der Kuchen spricht, haben die Krümel Pause [literally: "When the cake speaks, the crumbs keep quiet"].
To clarify for the non-German speakers: http://www.phrasen.com/uebersetze,Wenn-der-Kuchen-spricht-haben-die-Kruemel-Pause,76808,d.html
Although I am not sure the English version really means the same as the German, it is close enough. After all, we are all thinking the same thing here.

Franco Catrin Posted April 27, 2015
Quoting Heaven/TQA: @Franco I just stumbled across this in your source: '; Generic memcpy optimized for speed. No overlapping'. That was my intention, and I was wondering why it uses ZP then. But as usual... we are not talking about the "fastest" generic memcopy routine here... so no offence.
I have no problem with that! Setup code will be similar for the fastest version (direct addr) without loop unrolling. If you use loop unrolling, the setup code will be ugly. Of course you can use macros, but I hate macros. (It's not something rational, it's just a feeling.) Be sure that I would go for the uglier version if I really needed to.

emkay Posted April 27, 2015 (edited)
Quoting Creature XL: Although I am not sure the English version really means the same as the German, it is close enough. After all, we are all thinking the same thing here.
I know one too: ngan rub maitschop , ngan lai maiköy ... guess the origin.
Or, in other words: a PICTURE that isn't worth drawing doesn't need a single line drawn either...
That sentence fits most "new A8 projects" fantastically well... you know, exactly those "unfinished"... or rather, unfinishable projects.
NOT using the optimized routines for memory copying means having speed issues in simple games, just like Prince of Persia... but that's another story... or not...?
You could put it another way: if you can look over the completed project, it's worth creating every part of it.

Franco Catrin Posted April 28, 2015
Quoting emkay: NOT using the optimized routines for memory copying means having speed issues in simple games, just like Prince of Persia... but that's another story... or not...? You could put it another way: if you can look over the completed project, it's worth creating every part of it.
Well, now that you've put it on topic... My current version of Prince of Persia is "fast enough" to render each screen, at least faster than the original version, and it's using my "slow" memcpy: [video]
BTW: The original Prince of Persia uses self-modifying code, but the screen handling on the Apple can get very complex.

emkay Posted April 28, 2015
Quoting Franco Catrin: Well, now that you've put it on topic... My current version of Prince of Persia is "fast enough" to render each screen, at least faster than the original version, and it's using my "slow" memcpy: [video] BTW: The original Prince of Persia uses self-modifying code, but the screen handling on the Apple can get very complex.
Wrong video?

emkay Posted April 28, 2015 (edited)
Look at "Karateka", side by side. Even with all the "adapted" optimizations, the A8 version doesn't run faster. There are also nifty glitches in the masking on the Atari due to the 8-bit instead of 7-bit pixels... and people want 4-colour objects there. How will you solve that without every possible speed optimization?

Franco Catrin Posted April 28, 2015
Quoting emkay: Wrong video?
Why wrong?

Franco Catrin Posted April 28, 2015 (edited)
Quoting emkay: Look at "Karateka", side by side. Even with all the "adapted" optimizations, the A8 version doesn't run faster. There are also nifty glitches in the masking on the Atari due to the 8-bit instead of 7-bit pixels... and people want 4-colour objects there. How will you solve that without every possible speed optimization?
Karateka needs to move the whole screen and do the masking with the mountain and the big portals. In that video mode, just a complete "sta" over the screen takes how long? 3 frames? I'm not surprised that Karateka has slow scrolling.
Prince of Persia, on the other hand, doesn't need scrolling. OK, the rendering IS complex (isometric tiles with masks), but there is no need to move the entire screen for each movement of the player, so I can turn off DMA and do selective drawing, and as shown in the video, it takes "only" 8-9 frames to redraw the entire screen.

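The "3 frames?" guess above can be sanity-checked with a back-of-the-envelope calculation. This is only a sketch under assumptions that are not stated in the thread: a 40-byte by 192-line bitmap screen (7,680 bytes), a plain `STA abs,Y` / `INY` / `BNE` store loop at 10 cycles per byte, 114 machine cycles per scanline, and no cycles lost to ANTIC DMA or interrupts. The names `frames_to_fill`, `SCREEN_BYTES`, and `STORE_LOOP` are illustrative.

```python
import math

CYCLES_PER_SCANLINE = 114              # Atari 8-bit machine cycles per scanline
SCANLINES = {"NTSC": 262, "PAL": 312}  # scanlines per frame

def frames_to_fill(screen_bytes, cycles_per_byte, tv="NTSC"):
    """Frames needed to touch every screen byte once, ignoring DMA stealing."""
    cycles_per_frame = CYCLES_PER_SCANLINE * SCANLINES[tv]
    return screen_bytes * cycles_per_byte / cycles_per_frame

SCREEN_BYTES = 40 * 192   # assumed screen size, not confirmed by the thread
STORE_LOOP = 5 + 2 + 3    # sta abs,y + iny + taken bne, in cycles

ntsc = frames_to_fill(SCREEN_BYTES, STORE_LOOP, "NTSC")
pal = frames_to_fill(SCREEN_BYTES, STORE_LOOP, "PAL")

print(f"NTSC: {ntsc:.2f} frames, PAL: {pal:.2f} frames")
print("Whole frames on NTSC:", math.ceil(ntsc))
```

With these assumptions a store-only pass already needs more than two frames, and real code that also loads, masks, and loses cycles to display DMA would take longer, so the estimate in the post looks plausible.
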
emkay Posted April 28, 2015
Quoting Franco Catrin: Karateka needs to move the whole screen and do the masking with the mountain and the big portals. In that video mode, just a complete "sta" over the screen takes how long? 3 frames? I'm not surprised that Karateka has slow scrolling.
I'm not sure that "some scanlines" overlay can be called scrolling. PoP uses 3 parts of the screen for movement, and the "back" and "front" masking sometimes happens at the same time at different positions.
What the Atari can do and what people want are often NOT the same, so they don't mention all the problems when they want a game to be "C64"-like...
And with the following picture, I'll end the PoP discussion in this thread. [picture]
Let's wait and see the final stuff...

danwinslow (thread author) Posted April 28, 2015
Quoting Creature XL: Wenn der Kuchen spricht, haben die Krümel Pause
Hmm, I would translate it as: "When the Cake talks, the cupcakes listen".

Franco Catrin Posted April 28, 2015
Quoting emkay: Let's wait and see the final stuff...
Agreed!

emkay Posted April 28, 2015
Quoting danwinslow: Hmm, I would translate it as: "When the Cake talks, the cupcakes listen".
Depending on the Atari scene, it's more like "when the cupcakes talk, the cake pieces should listen".