[Users] Changing the number of MPI processes on recovery using simfactory

Konrad Topolski k.topolski2 at student.uw.edu.pl
Tue Jul 13 16:22:46 CDT 2021


I am currently trying to find the optimal number of MPI processes for my
purposes.
I have managed to change the number of MPI processes when restarting a
simulation from a checkpoint - but using the bare executable, not
simfactory.

Now, I would like to learn how to do it in simfactory.

I have learned that to steer the number of threads per MPI process
(which, combined with the total number of threads requested, determines
the total number of MPI processes), I can change the num-thread variable
in the machine.ini file.
This is probably (certainly?) suboptimal, so if there is a proper way, I'd
like to learn it.
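To illustrate the arithmetic I mean (a sketch of my own, not simfactory
code - the function name is mine):

```python
# Sketch of how the MPI process count falls out of the thread settings
# (my own illustration, not simfactory code).
def num_mpi_procs(total_threads: int, threads_per_proc: int) -> int:
    """Total threads requested (--procs) divided by the threads per MPI
    process (num-thread in machine.ini) gives the MPI process count."""
    if total_threads % threads_per_proc != 0:
        raise ValueError("total threads must be divisible by threads per process")
    return total_threads // threads_per_proc

# e.g. 96 total threads with 24 threads per process -> 4 MPI processes,
# matching the 96-vs-4 numbers in the log below.
print(num_mpi_procs(96, 24))  # -> 4
```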

I submit/recover simulations via:
/simfactory/bin/sim submit <sim_name> --parfile <parfile_name>  --recover
--procs NUM_PROCS  --machine=okeanos --configuration=okeanos

If I don't use the --machine option specifying my cluster, it defaults
to some config with max nodes = 1 (generic?), which is why I steer MPI
processes via num-thread.
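As far as I can tell, simfactory also accepts a --num-threads option on
the command line (this is an assumption on my part - please check
./simfactory/bin/sim submit --help on your install), which would avoid
editing machine.ini:

```shell
# Assumed alternative (verify with `sim submit --help`): request 96 cores
# total with 24 OpenMP threads per MPI process, i.e. 96/24 = 4 MPI
# processes, without touching machine.ini.
./simfactory/bin/sim submit <sim_name> --parfile <parfile_name> --recover \
    --procs 96 --num-threads 24 --machine=okeanos --configuration=okeanos
```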

Trying to recover a simulation via simfactory with a new machine file (with
num-thread changed) yields an error message:

INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 4 processes
WARNING level 0 from host nid00392 process 0
  in thorn Carpet, file
  -> The environment variable CACTUS_NUM_PROCS is set to 96, but there are
4 MPI processes. This may indicate a severe problem with the MPI startup
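My understanding of what triggers this warning (a sketch in Python, not
Carpet's actual code): the run script that simfactory generated at the
original submission exports CACTUS_NUM_PROCS, and Carpet compares that
value against the process count MPI actually reports.

```python
import os

# Sketch (my own, not Carpet's code) of the consistency check behind the
# warning: the run script exports CACTUS_NUM_PROCS, and Carpet compares
# it to the actual MPI communicator size.
def check_num_procs(mpi_size, env=None):
    env = os.environ if env is None else env
    expected = int(env.get("CACTUS_NUM_PROCS", mpi_size))
    if expected != mpi_size:
        return ("The environment variable CACTUS_NUM_PROCS is set to "
                f"{expected}, but there are {mpi_size} MPI processes")
    return None

# Reproduce the mismatch from the log above: the old run script expected
# 96 processes, but MPI started only 4.
print(check_num_procs(4, {"CACTUS_NUM_PROCS": "96"}))
```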

What can I do to recover a simulation via simfactory and use a different
number of MPI processes?

While I'm at it, can I also change parameters such as the number of
refinement levels or make new guesses for AHFinderDirect, in case the
previously-used parameters did not provide high enough resolution for a
successful find?

Best regards
Konrad Topolski