[Users] Issue with Multiple Node Simulation on cluster

Spandan Sarma 19306 spandan19 at iiserb.ac.in
Thu Dec 8 01:49:47 CST 2022


Hello,

This mail is a follow-up to the ticket "Issue with compiling ET on
cluster" opened by Shamim.

Following Roland's suggestion, we found that passing the --prefix
<openmpi-directory> option together with a hostfile worked successfully
for running a multiple-node simulation on our HPC cluster.
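
Concretely, the working multi-node launch was along the lines of the
sketch below (the Open MPI prefix, hostfile, and executable paths are
placeholders rather than our exact ones):

  # Sketch only: prefix, hostfile, and executable paths are placeholders
  mpirun --prefix /path/to/openmpi \
         --hostfile /path/to/hostfile \
         -np 32 \
         ./exe/cactus_sim par/nsnstohmns1.par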

Now we find that the BNSM gallery simulation evolves for only 240
iterations on 2 nodes (16+16 procs, 24 hr walltime), which is very slow
compared to the 120988 iterations reached on 1 node (16 procs, 24 hr
walltime). The parallelization scales well within a single node: we
reached 120988, 67756, and 40008 iterations for 16, 8, and 4 procs
respectively (24 hr walltime each). We are unable to understand what
causes this slowdown when Open MPI is given 2 nodes (16+16 procs).

In the output files we found the following, which may point to the
issue:

INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 32 processes
INFO (Carpet): This is process 0
INFO (Carpet): OpenMP is enabled
INFO (Carpet): This process contains 1 threads, this is thread 0
INFO (Carpet): There are 144 threads in total
INFO (Carpet): There are 4.5 threads per process
INFO (Carpet): This process runs on host n129, pid=20823
INFO (Carpet): This process runs on 1 core: 0
INFO (Carpet): Thread 0 runs on 1 core: 0
INFO (Carpet): This simulation is running in 3 dimensions
INFO (Carpet): Boundary specification for map 0:
   nboundaryzones: [[3,3,3],[3,3,3]]
   is_internal   : [[0,0,0],[0,0,0]]
   is_staggered  : [[0,0,0],[0,0,0]]
   shiftout      : [[1,0,1],[0,0,0]]

WARNING level 1 from host n131 process 21
  in thorn Carpet, file
  /home2/mallick/ET9/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:426:
  -> The number of threads for this process is larger than its number of
     cores. This may indicate a performance problem.
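
In case it helps with diagnosing this, a minimal way to check where
Open MPI places and binds the ranks (assuming the runscript launches
with Open MPI's mpirun; paths below are placeholders) would be
something like:

  # Sketch: one OpenMP thread per rank, 16 ranks per node, report bindings
  # --map-by, --bind-to and --report-bindings are standard Open MPI options
  export OMP_NUM_THREADS=1
  mpirun --prefix /path/to/openmpi --hostfile /path/to/hostfile \
         -np 32 --map-by ppr:16:node --bind-to core --report-bindings \
         ./exe/cactus_sim par/nsnstohmns1.par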

This is something we could not understand, since we asked for only 32
procs with num-threads set to 1. The command we used to submit the job
was:

  ./simfactory/bin/sim create-submit p32_mpin_npn --procs=32 --ppn=16 \
      --num-threads=1 --ppn-used=16 --num-smt=1 \
      --parfile=par/nsnstohmns1.par --walltime=24:10:00
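
For context, a simfactory runscript for Open MPI typically boils down
to something like the sketch below (a generic example using
simfactory's @...@ substitution variables, not a verbatim copy of the
attached p32_mpin_npn.run):

  # Generic sketch; @...@ are simfactory substitutions, paths are placeholders
  export OMP_NUM_THREADS=@NUM_THREADS@
  mpirun --prefix /path/to/openmpi \
         --hostfile /path/to/hostfile \
         -np @NUM_PROCS@ \
         @EXECUTABLE@ @PARFILE@

With such a runscript, --num-threads=1 should end up as
OMP_NUM_THREADS=1 for every rank, which is why the reported total of
144 threads surprises us.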

I have attached the out file, runscript, submit script, option list, and
machine file for reference. Thanks in advance for the help.

Sincerely,
--
Spandan Sarma
BS-MS' 19
Department of Physics (4th Year),
IISER Bhopal