ilmenit

Quantizator


Missile DMA is off and GRAFM is set to $FF.

Note that if Player DMA is enabled in Antic, it also does Missile DMA - so that extra cycle is still lost.

 

Good point. I was disappointed to learn that disabling ANTIC's missile DMA doesn't yield an extra cycle for CPU usage as I suggested earlier in this thread. As you said, ANTIC missile DMA can't be turned off independently of player DMA but GTIA's reading of ANTIC's DMA into GRAFM can be independently controlled which is how RC works. That is to say that ANTIC's DMACTL is fixed at $3a and GTIA's GRACTL is fixed at $02.
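To make the two register values above easier to read, here is a minimal sketch that decodes them bit by bit. It assumes the commonly documented bit layout for ANTIC's DMACTL and GTIA's GRACTL; the function and key names are made up for illustration and are not taken from RC's source.

```python
# Hypothetical helpers; bit meanings follow the commonly documented
# layout of ANTIC DMACTL ($D400) and GTIA GRACTL ($D01D).
PLAYFIELD_WIDTHS = {0: "off", 1: "narrow", 2: "normal", 3: "wide"}


def decode_dmactl(value):
    """Split an ANTIC DMACTL value into its individual fields."""
    return {
        "playfield": PLAYFIELD_WIDTHS[value & 0x03],  # bits 0-1
        "missile_dma": bool(value & 0x04),            # bit 2
        "player_dma": bool(value & 0x08),             # bit 3
        "single_line_pm": bool(value & 0x10),         # bit 4
        "display_list_dma": bool(value & 0x20),       # bit 5
    }


def decode_gractl(value):
    """Split a GTIA GRACTL value into its individual fields."""
    return {
        "missiles_to_grafm": bool(value & 0x01),  # bit 0
        "players_to_grafp": bool(value & 0x02),   # bit 1
        "latch_triggers": bool(value & 0x04),     # bit 2
    }


if __name__ == "__main__":
    print(decode_dmactl(0x3A))
    print(decode_gractl(0x02))
```

With $3a the missile DMA bit is clear while player DMA is set, and as noted above ANTIC still performs the missile fetch in that case. With $02 only the player graphics are taken from DMA on the GTIA side, so GRAFM keeps whatever was written to it.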


Thanks for the answers.

 

Just started RC for the first time now - with a picture I tried to rasterize/split manually in 2010 (and stopped at about 60%).

Watching RC is a bit like watching DEFRAG - lots of cool output and the results appear to get better over time.

I always liked watching DEFRAG :-)

 

Since we are always in the millions of evals, it would be nice to have some number formatting (thousands separator or similar).

Just realized that my CPU usage is only at about 30%. Is that normal? I'd expect it to be greedy with regard to CPU.

I can start a second instance, then CPU is at about 60%.
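As an illustration of the formatting request, here is a minimal sketch of a thousands separator for an eval counter. The helper name is hypothetical and this is not RC's actual display code.

```python
def format_evals(count):
    """Format an evaluation counter with thousands separators."""
    return f"{count:,}"


if __name__ == "__main__":
    print(format_evals(133000000))
```

The same idea carries over to C++ with a locale-aware stream, but the exact approach would depend on how RC prints its counters.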

Edited by JAC!


I assume you have a quad-core computer?


Simple math has obviously never been my strength :-) I was confused because the task manager shows 2 CPUs, but device manager does indeed show 4 cores. I should stick to the single-core 6502, I think.

Edited by JAC!

When I took a look at RC running this morning (133 million evals), everything was black except the eval counters. Probably because the screen/power saver kicked in during the night.

Pressing "S" displayed the current image again. Any way to display the others again? Minimize/Maximize didn't cause a redraw either.

Edited by JAC!


When that sort of thing happens to a window, moving it offscreen then on again sometimes refreshes it, or try minimize/restore.


330 million evals and the result is quite good, judging by visual appearance alone. In terms of pixel perfection, the parts of the picture I already did by hand are much more exact, most likely because I used ANTIC 4 (5 colors) and lots of pixel-exact PM stuff. On the other hand, RC proved that it is possible to solve the middle part of the picture "almost exactly" with ANTIC E and splits. That gives me confidence that I'll find an exact solution, because now I can really split every line correctly without badlines. Really a great tool.


Great, now, here's a BeagleBone port. I hope that there are people out there who have BeagleBones! If you do, please reply with a 'yay' to let me know that this is worth doing! (otherwise I may go and sulk in a corner!)

 

Nice work! I've added your build to the downloads page.

 

I don't have a BeagleBone but I might be interested in getting one. I want to try hooking up a microcontroller board to the Atari's PBI connector to try my hand at making some kind of "new equipment" device, for example, as emkay alluded to, a huge bank-switching memory expansion for RastaMovie :). I've read that Beaglebone will die if you hook 5v devices to its GPIO pins without level shifters. The Fez Panda II is 5V tolerant but has a lot less ram and cpu horsepower. Full screen movies would require ~500KB/s throughput (25fps * ~20KB per frame).
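The throughput estimate above can be checked with a few lines. This is only the arithmetic from the post (25 fps, roughly 20KB per frame); the function names are made up for illustration.

```python
def required_throughput_kb_per_s(frames_per_second, frame_size_kb):
    """Sustained data rate needed to stream full frames."""
    return frames_per_second * frame_size_kb


def frame_budget_ms(frames_per_second):
    """Time available to deliver one frame, in milliseconds."""
    return 1000 / frames_per_second


if __name__ == "__main__":
    print(required_throughput_kb_per_s(25, 20), "KB/s")
    print(frame_budget_ms(25), "ms per frame")
```

So each ~20KB frame has to arrive within a 40 ms window, which is the figure any PBI-attached device would need to sustain.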

 

It took a while, but I had tremendous fun actually going through with this:

 

http://www.youtube.com/watch?v=1irR4TQ5aMA

 

That's a stock Atari 600XL "expanded" to 140MB via a Beaglebone. I put it in quotes because it's not truly random access. There's too much latency in the DDR path on the Beaglebone, so instead it bursts 24K pages. During that time, the 6502 and Antic must steer clear of the page memory.

 

I put a write-up on my blog. The Beaglebone and Atari code is available on github. Some new movie executables for emulators are available on the RastaMovie downloads page. The latest movies (birds.zip and epic.zip) are made for NTSC. I used Altirra's Authentic NTSC palette when rendering the images in RastaConverter. There is a 60Hz buzz in Altirra but not in Atari800. It appears that Altirra doesn't quite like my abuse of instantaneous segment loads and inserts a zero sample where the loads happen, maybe?


A little better detail. Sorry for bad video quality.

That's really excellent!! Not that I think I'll be taking on a hardware project, but could you let everyone know what you've done with your connections between the 600XL and the BeagleBone?

 

Ilmenit, would it be possible to merge in the "No Allegro" changes from Xuel (lybrown)? This will enable me to produce a binary for the latest versions of RastaConverter on Linux machines. If you have already done this, my apologies, I'm still pretty new with GitHub. If that is possible to do, would anyone like a Raspberry Pi version also? (with no on-screen graphics).

Edited by snicklin



Thanks! The schematic is on the abx github page. It's just six 8-bit level converters.

 

Thanks for reminding me. I just pushed my latest audio-related changes to my no-allegro RastaConverter fork.


snicklin, I merged ilmenit's changes up to 5.1 in my fork, so you can use it to build a new linux executable.



Ahh superb!! I'll see what I can do.... thanks!


I put a write-up on my blog. The Beaglebone and Atari code is available on github. Some new movie executables for emulators are available on the RastaMovie downloads page. The latest movies (birds.zip and epic.zip) are made for NTSC. I used Altirra's Authentic NTSC palette when rendering the images in RastaConverter. There is a 60Hz buzz in Altirra but not in Atari800. It appears that Altirra doesn't quite like my abuse of instantaneous segment loads and inserts a zero sample where the loads happen, maybe?

 

Altirra's EXE loader emulates the SIOV behavior of silencing all POKEY channels at the conclusion of a load. This is required for some games that actually depend on this behavior to kill sounds that are played during the loading sequence.


@Xuel:

The download of Epic.Zip does not seem to work. Whenever I click on it, nothing happens (and the counter also says 0/zero downloads).

-Andreas Koch.


@phaeron: That makes sense! Thanks for the explanation. I modified abxmovie to rewrite AUDC1 immediately after every segment load and voila - no more buzz.

 

@CharlieChaplin: Thanks for the heads up!

 

I've uploaded fixed versions of all three movies:


I've completed a non-Allegro Raspberry Pi build but am disappointed with 150 evaluations per second with gcc flag '-Ofast' and 115 evals/s with gcc flag '-O3'. I get triple that on the BeagleBone.

 

To those who know the code, is FreeImage a major overhead? Is it used after the main processing (after the initial solution calculations)? Should I check the optimisation settings on the FreeImage compilation?



I think there is not much to boost. The kernel algorithm needs a lot of time and since the calculation power of the RasPi can be compared with a 300MHz Pentium 2, there isn't much to get. (But according to Eben, the RasPi should be (only) about 20% slower than the BeagleBoard for general purpose computing, so something is wrong?!?)

Edited by Irgendwer



I'm not sure how they compare but could it be a difference between the BeagleBoard and the BeagleBone?

 

Actually, come to think of it, the figures I gave for the BeagleBone were for version 3. I've not done a version 5 compile for the BB yet. I'll try that now....

 

<EDIT>

I've now done a version 5 compile for the BeagleBone and the evaluation rate is now down to 107 evals/s, compared to around 350 with version 3. The Raspberry Pi version is now quicker.

 

I'll attach the files if anyone is interested.... you'll need FreeImage downloaded, compiled and installed on your system to use them.

 

 

rastaconv.zip

 

Command line used:

 

time ./rastaconv /i=./steve2.JPG /o=./steve /s=3 /h=240 /distance=ciede /max_evals=10000

 

I won't attach the picture; any picture can be used to see how they compare. Yes, I know ciede is slow, but even so....

 

 

I accept that the program isn't quick overall (that's not a problem). I'd just expect my BeagleBone / Raspberry Pi to have about an eighth to a quarter of the power of my PC, whereas they actually have around a 25th of the speed of my 2.6GHz PC.

Edited by snicklin


I needed an excuse to learn C++11 threading, so I ported RastaConverter beta 5.1 to Visual Studio Express 2012 and rewrote the evaluation loop to work on multiple threads. Of course, that's when I discovered via the debugger that VS2012's condition_variable is unusably broken, so then I had to strip out the C++11 threading and replace it with pthreads-win32. Updated build and source are attached (VS2010 SSE2 pogo build). It has an improved line cache for better single threaded performance and the core evaluation loop broken out to enable multithreading. Hopefully I haven't added too many bugs. :)

 

Use /threads=<n> to enable multithreading. On my Core i7 I get about 6-8Kiter/sec with one thread and 15-30Kiter/sec with /threads=8; the dual-core Atom is hitting 2-4Kiter/sec with /threads=4. The threading is less effective with >1 solution until the solutions start getting harder to find, because of swapping solutions around between the threads. I did have to replace the global Mersenne Twister random number generator with a local 64-bit LFSR, but hopefully that won't make a difference.
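To illustrate the idea of swapping a shared generator for a per-thread one, here is a minimal sketch of a 64-bit Galois LFSR. The post does not say which polynomial or LFSR form the threaded build uses, so the tap mask below (the often-tabulated x^64 + x^63 + x^61 + x^60 + 1) and the class name are assumptions, not the actual RastaConverter code.

```python
MASK64 = (1 << 64) - 1
# Assumed tap mask for x^64 + x^63 + x^61 + x^60 + 1 (Galois form).
TAPS = 0xD800000000000000


class Lfsr64:
    """Small local generator: each worker can own one, no shared state."""

    def __init__(self, seed):
        seed &= MASK64
        if seed == 0:
            # An all-zero state would stay zero forever.
            raise ValueError("seed must be non-zero")
        self.state = seed

    def next_bit(self):
        """Shift right by one; apply the taps if a 1 fell off the end."""
        lsb = self.state & 1
        self.state >>= 1
        if lsb:
            self.state ^= TAPS
        return lsb

    def next_bits(self, count):
        """Collect `count` output bits into one integer."""
        value = 0
        for _ in range(count):
            value = (value << 1) | self.next_bit()
        return value


if __name__ == "__main__":
    rng = Lfsr64(seed=0x1234)
    print([rng.next_bits(8) for _ in range(4)])
```

Because every instance carries its own state, threads never contend on a global generator. The price is weaker statistical quality than Mersenne Twister, which is the trade-off mentioned above.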

 

The changes should port to Linux without too much trouble, but the setup for the call to pthread_cond_timedwait() will have to be fixed up because that function is awkward to use as shipped in pthreads-win32, which exposes this function without providing a portable way to get the time (C++11 handles this better, which would have been nice if VS2012's implementation actually worked). The instruction sequence cache has lowered the memory usage somewhat and the per-thread cache size can now be tuned directly in bytes in Evaluator.cpp, which should help those of you building RastaConverter on embedded platforms.
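Part of the awkwardness is that pthread_cond_timedwait() takes an absolute deadline as a seconds/nanoseconds pair rather than a relative timeout. Here is a minimal sketch of just that arithmetic, with the current time passed in explicitly because obtaining it portably is the open question. The function name is hypothetical and this is not the code from the attached source.

```python
NSEC_PER_SEC = 1_000_000_000
NSEC_PER_MSEC = 1_000_000


def absolute_deadline(now_sec, now_nsec, timeout_ms):
    """Turn 'now' plus a relative timeout into a normalized (sec, nsec) pair."""
    total_nsec = now_nsec + (timeout_ms % 1000) * NSEC_PER_MSEC
    # Carry any nanosecond overflow into the seconds field.
    sec = now_sec + timeout_ms // 1000 + total_nsec // NSEC_PER_SEC
    return sec, total_nsec % NSEC_PER_SEC


if __name__ == "__main__":
    print(absolute_deadline(10, 900_000_000, 250))
```

On Linux the "now" values would typically come from the system clock. The nanosecond field must stay below one second, which is why the carry step matters.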

rastaconv-5.1b-threaded.zip

rastaconv-5.1b-threaded-src.zip


The changes should port to Linux without too much trouble

...

 

I've had good experiences with the 'boost_threads' package. I don't have the time, but maybe someone else would like to use it to get more portable code...


I did a quick test on my AMD Phenom X4 955, and with 4 threads it was bouncing between 20,000 and 23,000 iterations per second. Great update!

