[Users] Einstein Toolkit and modern AMD supercomputer

Erik Schnetter schnetter at cct.lsu.edu
Fri Aug 27 11:56:51 CDT 2021


Thanks for your thoughts.

I have some general remarks:

You mention that mvapich has the best performance. Is there any reason
to use any other MPI implementation?

Did you check that mvapich is configured correctly? Does it use the
network efficiently?

You need to use SystemTopology, or otherwise ensure that threads and
processes are mapped to the hardware in a reasonable way.
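
As a rough way to check the mapping from inside a job, here is a
minimal Python sketch (not part of the Einstein Toolkit). It compares
the cores the current process is allowed to run on with the number of
OpenMP threads requested via OMP_NUM_THREADS. The helper name
binding_report is hypothetical, and os.sched_getaffinity is only
available on Linux, so the check is skipped elsewhere.

```python
import os


def binding_report(allowed_cores, num_threads):
    """Compare the cores a process may use with its thread count."""
    ncores = len(allowed_cores)
    if ncores < num_threads:
        return f"oversubscribed: {num_threads} threads on {ncores} cores"
    if ncores > num_threads:
        return f"ok, but loosely bound: {num_threads} threads on {ncores} cores"
    return f"ok: {num_threads} threads on {ncores} cores"


# OMP_NUM_THREADS may be a comma-separated list for nested parallelism;
# only the outermost level is used here. Default to 1 thread if unset.
omp_threads = int(os.environ.get("OMP_NUM_THREADS", "1").split(",")[0])

if hasattr(os, "sched_getaffinity"):
    # Set of core ids this process (pid 0 = self) is allowed to run on.
    cores = os.sched_getaffinity(0)
    print(sorted(cores))
    print(binding_report(cores, omp_threads))
else:
    print("os.sched_getaffinity is not available on this platform")
```

Running this once per MPI rank (for example through the job launcher)
shows whether ranks overlap on the same cores or are spread across the
node as intended.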

What is the ratio of ghost/buffer to actually evolved grid points in your setup?

If MPI performance is slow, then the usual way out is to use OpenMP.
You implied using 4 threads per process; did you try using 8 threads
per process or more? This will also reduce memory consumption since
there are fewer ghost zones. Unfortunately, OpenMP multi-threading in
Carpet is not as efficient as it could be. CarpetX is much better in
this respect.
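
To make the ghost-zone argument concrete, here is a minimal Python
sketch that estimates the ratio of ghost points to owned points per
process. The 128^3 domain per node and the two process grids are
made-up illustration values, not taken from the benchmark, and the
sketch assumes every block face carries a full ghost layer (it ignores
outer boundaries, AMR, and buffer zones).

```python
from math import prod


def ghost_ratio(block_shape, nghost):
    """Ratio of ghost points to owned (evolved) points for one block."""
    owned = prod(block_shape)
    padded = prod(n + 2 * nghost for n in block_shape)
    return (padded - owned) / owned


def block_shape(domain_shape, process_grid):
    """Shape of the block owned by each process (assumes even division)."""
    return tuple(n // p for n, p in zip(domain_shape, process_grid))


domain = (128, 128, 128)  # hypothetical grid on one 128-core node
nghost = 3                # ghost zones, as in the setup discussed here

# 4 threads/process -> 32 processes; 8 threads/process -> 16 processes.
# The process grids are assumed decompositions, chosen for illustration.
for threads, grid in [(4, (4, 4, 2)), (8, (4, 2, 2))]:
    shape = block_shape(domain, grid)
    ratio = ghost_ratio(shape, nghost)
    print(f"{threads} threads/process, block {shape}: ghost/owned = {ratio:.2f}")
```

With these assumed numbers the ratio drops from roughly 0.54 to
roughly 0.42 when going from 4 to 8 threads per process. Buffer zones
at refinement boundaries would add to this overhead.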

Our way of discretizing equations (high-order methods with 3 ghost
zones, AMR with buffer zones), combined with having many evolved
variables, requires a lot of storage and also has a rather high
parallelization overhead. A few ways out (none are production ready):
- Use DGFE instead of finite differences; see e.g. Jonah Miller's PhD
thesis and the respective McLachlan branch
- Avoid buffer zones by using an improved time interpolation scheme
(I've seen papers, I don't know about 3d code)
- Switch to CarpetX to avoid subcycling in time.

If memory usage is high on only a single node, then this is probably
caused by serial code. Known serial pieces of code are ASCII output,
wave extraction, and the apparent horizon finder. Try disabling these
to see which (if any) is the culprit.

Finally, if OpenMP performance is bad, you can try using only every
second core and leaving the remainder idle, and see whether this


On Fri, Aug 27, 2021 at 12:45 PM Gabriele Bozzola
<bozzola.gabriele at gmail.com> wrote:
> Hello,
> Last week I opened a PR to add the configuration files
> for Expanse to simfactory. Expanse is an example of
> the new generation of AMD supercomputers. Others are
> Anvil, one of the other new XSEDE machines, or Puma,
> the newest cluster at The University of Arizona.
> I have some experience with Puma and Expanse and
> I would like to share some thoughts, some of which come
> from interacting with the admins of Expanse. The problem
> is that I am finding terrible multi-node performance on both
> these machines, and I don't know if this will be a common
> thread among new AMD clusters.
> These supercomputers have similar characteristics.
> First, they have very high cores/node count (typically
> 128/node) but low memory per core (typically 2 GB / core).
> In these conditions, it is very easy to have a job killed by
> the OOM daemon. My suspicion is that it is rank 0 that
> goes out of memory, and the entire run is aborted.
> Second, depending on the MPI implementation, MPI collective
> operations can be extremely expensive. I was told that
> the best implementation is mvapich 2.3.6 (at the moment).
> This seems to be due to the high core count.
> I found that the code does not scale well. This is possibly
> related to the previous point. If your job can fit on a single node,
> it will run wonderfully. However, when you perform the same
> simulation on two nodes, the code will actually be slower.
> This indicates that there's no strong scaling at all from
> 1 node to 2 (128 to 256 cores, or 32 to 64 MPI ranks).
> Using mvapich 2.3.6 improves the situation, but it is still
> faster to use fewer nodes.
> (My benchmark is a par file I've tested extensively on Frontera)
> I am working with Expanse's support staff to see what we can
> do, but I wonder if anyone has had a positive experience with
> this architecture and has some tips to share.
> Gabriele

Erik Schnetter <schnetter at cct.lsu.edu>
