[Users] cactus performance

Jose Fiestas Iquira jafiestas at lbl.gov
Thu Mar 29 11:20:44 CDT 2012


Dear Christian,
Thank you. I understand what you are saying. My questions are mainly about
McLachlan. I am sorry if it appeared that I wanted to learn about HPC from you.
I am certainly working with people in the lab. They are just not familiar
with Cactus, and I am still learning as well.
I apologize for that and will try to avoid it in the future.
Sincerely,
Jose


On Thu, Mar 29, 2012 at 8:01 AM, Christian D. Ott <cott at tapir.caltech.edu> wrote:

>
> Hi Jose,
>
> look, the Einstein Toolkit team is very happy to help new users like
> you to get started and sort out specific questions regarding parts
> of the toolkit.
>
> What we really can't do is provide you with very basic
> high-performance computing training via the mailing list. This is
> because many if not most people on this list actually volunteer to
> help in their spare time and are not paid as consultants for general
> HPC questions. You are at Berkeley lab and there are many experts that
> can help you with basic HPC questions, plus there are tons of resources
> available on-line, that I would kindly ask you to consult first.
>
> Regarding your scaling question:
>
>
> https://support.scinet.utoronto.ca/wiki/index.php/Introduction_To_Performance
>
> gives a good introduction to performance measurements. There are many
> more webpages like this available on the internet. The plot shown in
> the Einstein Toolkit paper (arXiv:1111.3344) is a weak scaling test.
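
To make the distinction concrete: in a strong scaling test the total problem size is fixed, so the run time should drop as cores are added. In a weak scaling test the problem size grows with the core count, so the run time should ideally stay flat. Below is a minimal Python sketch of both efficiency measures. The function names are illustrative and not part of Cactus or CrayPat. The 16- and 24-core timings are the ~1310 s figures quoted further down in this thread, and the weak scaling timings are made-up placeholders.

```python
def strong_scaling(t_base, p_base, t_new, p_new):
    """Speedup and parallel efficiency for a FIXED total problem size.

    t_base, t_new: wall-clock times in seconds
    p_base, p_new: core counts for the two runs
    """
    speedup = t_base / t_new
    ideal_speedup = p_new / p_base
    return speedup, speedup / ideal_speedup


def weak_scaling_efficiency(t_base, t_new):
    """Efficiency when the problem size grows in proportion to the core count.

    Ideal weak scaling keeps the run time constant, giving 1.0.
    """
    return t_base / t_new


# Strong scaling: same setup on 16 and 24 cores, ~1310 s each (from this thread).
speedup, efficiency = strong_scaling(1310.0, 16, 1310.0, 24)
print(f"16 -> 24 cores: speedup {speedup:.2f}, efficiency {efficiency:.2f}")

# Weak scaling: hypothetical timings, with the grid enlarged along with the cores.
print(f"weak scaling efficiency: {weak_scaling_efficiency(100.0, 125.0):.2f}")
```

A flat curve in a weak scaling plot therefore indicates good scaling, whereas identical timings in a strong scaling comparison indicate that the extra cores did not help.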
>
> Best,
>
>  - Christian Ott
>
>
>
>
> On Wed, Mar 28, 2012 at 11:55:30PM -0700, Jose Fiestas Iquira wrote:
> > Hello,
> >
> > I reduced the simulation time by setting Cactus::cctk_final_time = .01 in
> > order to measure performance with CrayPat. It ran only 8 iterations. I used
> > 16 and 24 cores for testing, and obtained almost the same performance
> > (~1310 sec. simulation time, and ~16 MFlops).
> >
> > It reminds me of Fig. 2 in the reference you sent
> > http://arxiv.org/abs/1111.3344
> >
> > which I don't really understand. I would expect shorter times with a larger
> > number of cores. Why does it not happen here?
> >
> > I am using McLachlan to simulate a binary system, so all my questions
> > concern this specific application. Do you think it will scale in the
> > sense that the simulation time gets shorter the more cores I use?
> >
> > Thanks,
> > Jose
> >
> >
> >
> > On Wed, Mar 21, 2012 at 5:08 AM, Erik Schnetter <schnetter at cct.lsu.edu> wrote:
> >
> > > On Tue, Mar 20, 2012 at 10:45 PM, Frank Loeffler <knarf at cct.lsu.edu>
> > > wrote:
> > > > Hi,
> > > >
> > > > On Tue, Mar 20, 2012 at 05:14:38PM -0700, Jose Fiestas Iquira wrote:
> > > >> Is there documentation about the performance of Cactus ETK on large
> > > >> machines? I have some questions regarding best performance according to
> > > >> initial conditions, calculation time required, etc.
> > > >
> > > > Performance very much depends on the specific setup. One poorly scaling
> > > > function can ruin the otherwise best run.
> > > >
> > > >> Performance plots such as Flops vs. number of nodes would help me
> > > >> as well.
> > > >
> > > > Flops are very problem-dependent. There is no such thing as flops/s for
> > > > Cactus, not even for one given machine. If we talk about the Einstein
> > > > equations and a typical production run, I would expect a few percent of
> > > > the peak performance of any given CPU, as we are bound by memory
> > > > bandwidth most of the time.
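
One way to see why a bandwidth-bound code reaches only a few percent of peak is a roofline-style bound: the attainable rate is the smaller of the peak rate and the memory bandwidth multiplied by the arithmetic intensity (flops performed per byte moved). The Python sketch below illustrates this. Every number in it is a hypothetical placeholder rather than a measurement of Cactus or of any particular machine, and the function name is illustrative.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline-style upper bound on the sustained floating-point rate.

    A kernel cannot run faster than the hardware peak, nor faster than
    the rate at which memory can feed it with operands.
    """
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Hypothetical per-core numbers, chosen only for illustration.
peak = 10.0       # GFlop/s, assumed peak rate
bandwidth = 5.0   # GB/s, assumed sustained memory bandwidth
intensity = 0.1   # flops per byte moved, assumed for a bandwidth-bound kernel

bound = attainable_gflops(peak, bandwidth, intensity)
print(f"attainable: {bound:.2f} GFlop/s ({100.0 * bound / peak:.0f}% of peak)")
```

With these assumed values the bound is 0.5 GFlop/s, i.e. 5% of peak. Raising the arithmetic intensity, for instance by reusing data held in cache, is what moves a kernel toward the compute limit.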
> > >
> > > I would like to add some more numbers to Frank's description:
> > >
> > > On some problems (e.g. evaluating the BSSN equations with a
> > > higher-order stencil), I have measured more than 20% of the
> > > theoretical peak performance. The bottleneck seems to be L1 data cache
> > > accesses, because the BSSN equation kernels require a large number of
> > > local (temporary) variables.
> > >
> > > If you look for parallel scaling, then e.g.
> > > <http://arxiv.org/abs/1111.3344> contains a scaling graph for the BSSN
> > > equations evolved with mesh refinement. This shows that, for this
> > > benchmark, the Einstein Toolkit scales well to more than 12k cores.
> > >
> > > -erik
> > >
> > > --
> > > Erik Schnetter <schnetter at cct.lsu.edu>
> > > http://www.perimeterinstitute.ca/personal/eschnetter/
> > >
>
> > _______________________________________________
> > Users mailing list
> > Users at einsteintoolkit.org
> > http://lists.einsteintoolkit.org/mailman/listinfo/users
>
>