[Users] cactus performance

Ian Hinder ian.hinder at aei.mpg.de
Thu Mar 29 10:54:08 CDT 2012


On 29 Mar 2012, at 08:55, Jose Fiestas Iquira wrote:

> Hello,
> 
> I reduced the simulation time by setting Cactus::cctk_final_time = .01 in order to measure performance with CrayPat. It ran only 8 iterations. I used 16 and 24 cores for testing, and obtained almost the same performance (~1310 sec. simulation time, and ~16 MFlops). 
> 
> It reminds me of Fig. 2 in the reference you sent
> http://arxiv.org/abs/1111.3344
> 
> which I don't really understand. I would expect shorter times with a larger number of cores. Why does that not happen here? 

Hi Jose,

You will need to investigate this yourself by looking at the timings from the different parts of the code.  Look at the timer parameters in the Carpet and TimerReport thorns.  I recommend Carpet::output_timer_tree_every = 32 or so, though this only gives you timings from process 0.  The code is very complex, and many factors can inhibit scaling if the run has not been set up carefully.  One possibility is that the initial data solver takes a significant fraction of your run time, depending on the parameters you have set; this solver is not parallelised.  You should therefore look at the time spent in Evolve, not the time spent in Initialise.
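
For example, the relevant parameter-file lines might look something like the following (Carpet::output_timer_tree_every is as above; the TimerReport parameter names are from memory, so please check the thorn's param.ccl before relying on them):

  ActiveThorns = "TimerReport"              # in addition to your existing thorn list

  Carpet::output_timer_tree_every = 32      # timer tree from process 0 every 32 iterations
  TimerReport::out_every          = 32      # write a timer report at the same interval (name from memory)
  TimerReport::n_top_timers       = 20      # list the ~20 most expensive timers (name from memory)

This makes it easy to see whether Initialise (e.g. the initial data solver) or Evolve dominates, and which scheduled routines inside Evolve stop scaling as you add cores.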

> I am using McLachlan to simulate a binary system, so all my questions concern this specific application. Do you think it will scale, in the sense that the simulation time will be shorter the larger the number of cores I use?

In some situations yes, in others no.  There is no general answer to this type of question.  Both very small and very large problems will likely scale badly.  There is certainly a region in between where you can expect to see good scaling with Cactus/Carpet/McLachlan.
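
If you want to measure this for your own setup, the usual approach is a strong-scaling test: keep the physical problem (domain, resolution, final time) fixed in the parameter file, vary only the number of cores between runs, and compare the Evolve timer.  As a rough sketch (the grid values and executable name below are placeholders for whatever your parameter file actually uses):

  # Fixed across all runs of the scaling test:
  Cactus::cctk_final_time = 10.0      # long enough that Evolve dominates Initialise
  CoordBase::dx           = 0.4       # placeholder resolution; keep it constant
  CoordBase::dy           = 0.4
  CoordBase::dz           = 0.4

  # Then run, e.g.:
  #   mpirun -np 16 cactus_sim binary.par
  #   mpirun -np 32 cactus_sim binary.par
  # and compare the time reported for Evolve in each run.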


> 
> Thanks,
> Jose
> 
> 
> 
> On Wed, Mar 21, 2012 at 5:08 AM, Erik Schnetter <schnetter at cct.lsu.edu> wrote:
> On Tue, Mar 20, 2012 at 10:45 PM, Frank Loeffler <knarf at cct.lsu.edu> wrote:
> > Hi,
> >
> > On Tue, Mar 20, 2012 at 05:14:38PM -0700, Jose Fiestas Iquira wrote:
> >> Is there documentation about performance of Cactus ETK in large machines. I
> >> have some questions regarding best performance according to initial
> >> conditions, calculation time required, etc.
> >
> > Performance depends very much on the specific setup. One poorly scaling
> > function can ruin an otherwise good run.
> >
> >> If there are performance plots like Flops vs. Number of nodes would help me
> >> as well.
> >
> > Flops are very problem-dependent. There is no such thing as a flop/s figure
> > for Cactus, not even for one given machine. If we talk about the Einstein
> > equations and a typical production run, I would expect a few percent of
> > the peak performance of any given CPU, since most of the time we are
> > bound by memory bandwidth.
> 
> I would like to add some more numbers to Frank's description:
> 
> On some problems (e.g. evaluating the BSSN equations with a
> higher-order stencil), I have measured more than 20% of the
> theoretical peak performance. The bottleneck seems to be L1 data cache
> accesses, because the BSSN equation kernels require a large number of
> local (temporary) variables.
> 
> If you are looking for parallel scaling, then e.g.
> <http://arxiv.org/abs/1111.3344> contains a scaling graph for the BSSN
> equations evolved with mesh refinement. It shows that, for this
> benchmark, the Einstein Toolkit scales well to more than 12k cores.
> 
> -erik
> 
> --
> Erik Schnetter <schnetter at cct.lsu.edu>
> http://www.perimeterinstitute.ca/personal/eschnetter/
> 

-- 
Ian Hinder
http://numrel.aei.mpg.de/people/hinder
