[Users] Possible performance issue

Haas, Roland rhaas at illinois.edu
Mon Oct 7 09:26:11 CDT 2019


Hello Vaishak,

hmm, still very slow.

One question that I forgot to ask before: did you make sure to build an
optimized Cactus executable (setting OPTIMISE=yes and DEBUG=no so that
the -O2 or -O3 optimisation flags are enabled)?

Ideally, if you could send the file configs/sim/config-info, that would
tell me.
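
For reference, in a Cactus option list those settings usually look like
the following (the exact flag values depend on your machine's option
list, so take these lines only as an example):

OPTIMISE = yes
DEBUG = no
C_OPTIMISE_FLAGS = -O2
CXX_OPTIMISE_FLAGS = -O2
F90_OPTIMISE_FLAGS = -O2

The config-info file should record the values that were actually used
for the build.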

Yours,
Roland

> Dear Sir,
> 
> I am a little worried about the performance because this is a new cluster
> we have, and it is supposed to perform well. I am inclined to think
> that some library, compiler option, or setting might be the bottleneck.
> 
> 
> I am presently running two simulations, both using the same parameter file
> GW150914.rpar.
> 
> The first one is using mpich-3.3.1, the same as in the simulation mentioned
> in the previous thread. I am using one node consisting of 2 x 16 cores, with
> 32 mpiprocs.
> 
> The second one is using openmpi-3.1.2 with OpenMP. It uses 128 cores in
> total, distributed among 16 mpiprocs with 8 OpenMP threads per mpiproc.
> Since I have 32 PPN, it launches 4 mpiprocs per node.
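> 
> Roughly, the second run is launched along these lines (the options below
> only illustrate the layout of 4 ranks per node with 8 threads each, and
> may not match my job script verbatim):
> 
> export OMP_NUM_THREADS=8
> mpirun -np 16 --map-by ppr:4:node:pe=8 ./cactus_sim GW150914.par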
> 
> I am attaching the carpet-timing..asc files from both of these runs.
> 
> Thanking you
> 
> Regards,
> Vaishak
> 
> 
> On Fri, Oct 4, 2019 at 8:05 PM Haas, Roland <rhaas at illinois.edu> wrote:
> 
> > Hello Vaishak,
> >
> > I do not see anything obviously wrong with the setup.
> >
> > It uses 128 MPI ranks for the 4 nodes which fits with there being 2x16
> > cores per node.
> >
> > Looking at the timer tree output at iteration 1024 (search for
> > "gettimeof " and you will find the spot), out of 5977s spent during
> > Evolve about 2143s were spent in "syncs", which is communication, and
> > about the same amount of time in "thorns", which is computation.
> > While this ratio is not great (spending more time sending data than
> > doing computation) it is also not unheard of.
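> >
> > (For example, something like
> >
> > grep -n "gettimeof" GW150914.out
> >
> > will jump you to the timer tree blocks; substitute the name of your
> > own .out file.)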
> >
> > Getting the original output files for the gallery data from Zenodo
> > (link is on the gallery page):
> >
> > wget
> > https://zenodo.org/record/155394/files/GW150914_28.tar.xz?download=1
> >
> > you can see (in GW150914_28/output-0000/GW150914_28.out) that that one
> > took about 137s for syncs and 198s for thorns, so the same ratio but
> > about a factor of 10 faster.
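> >
> > (Depending on your wget invocation the downloaded file may keep the
> > "?download=1" suffix; something like
> >
> > mv 'GW150914_28.tar.xz?download=1' GW150914_28.tar.xz
> > tar -xJf GW150914_28.tar.xz
> >
> > then unpacks it so you can look at
> > GW150914_28/output-0000/GW150914_28.out.)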
> >
> > I am reaching for straws here, but sometimes having too many MPI ranks
> > can be detrimental if there is not enough work to split up (OpenMP can
> > be a bit more forgiving in that respect; the original gallery run
> > used 120 cores on 10 nodes with 6 OpenMP threads per MPI rank).
> >
> > Since each node has lots of RAM (more than the 96GB required to run the
> > simulation), can you try and see what would happen if you were to run
> > on only a single node?
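> >
> > If you use simfactory, a single-node test could be submitted along
> > these lines (queue, walltime and machine settings are site specific,
> > so treat this only as a sketch):
> >
> > ./simfactory/bin/sim create-submit gw150914_1node --parfile=par/GW150914.rpar --procs=32 --num-threads=1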
> >
> > Also if you could add the parameter:
> >
> > Carpet::output_timers_every = 1024
> >
> > then provide the files carpet-timing-statistics*.asc that would let us
> > know in even more detail where the time is spent.
> >
> > Running for a short time (2048 iterations) is enough to get data to
> > compare.
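> >
> > (If the parameter file terminates on physical time, you can force a
> > stop at iteration 2048 by also setting, for example,
> >
> > Cactus::terminate = "iteration"
> > Cactus::cctk_itlast = 2048
> >
> > in the parameter file.)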
> >
> > Yours,
> > Roland
> >
> > > Dear All,
> > >
> > > I am running the GW150914 simulation using the parameter file available
> > > at the ETK gallery (GW150914-ETK gallery
> > > <https://einsteintoolkit.org/gallery/bbh/index.html>) on 128 cores.
> > >
> > > Each compute node consists of 2 x 16 cores of Intel Skylake (Intel(R)
> > > Xeon(R) Gold 6142 CPU @ 2.60GHz) and 384 GB RAM. I have compiled and am
> > > running the Einstein Toolkit without OpenMP, using mpich-3.3.1.
> > >
> > >
> > > The issue is that the simulation seems to be running at a very slow pace.
> > > The amount of physical time it completes per hour is only about 1.3
> > > units. At this rate it would take about 54 days to complete 1700 units,
> > > in contrast to 2.8 days on (Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz)
> > > as per the details of the GW150914 example run available at the
> > > gallery (GW150914-ETK gallery
> > > <https://einsteintoolkit.org/gallery/bbh/index.html>).
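> > >
> > > (For reference, that estimate is just 1700 units / 1.3 units per hour
> > > ≈ 1300 hours ≈ 54 days of wall time.)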
> > >
> > > I have also tried using Intel MPI (impi), but with similar results.
> > >
> > > I am also attaching the out file from the simulation.
> > >
> > > Looking forward to your inputs.
> > >
> > >
> > > Thanks and regards,
> > >
> > >
> > >
> > >
> > >
> > > Vaishak P
> > >
> > > PhD Scholar,
> > > Shyama Prasad Mukherjee Fellow
> > > Inter-University Center for Astronomy and Astrophysics (IUCAA)
> > > Pune, India
> >
> >
> >
> > --
> > My email is as private as my paper mail. I therefore support encrypting
> > and signing email messages. Get my PGP key from http://pgp.mit.edu .
> >
> 
> 



-- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu .