[Users] Issue with Multiple Node Simulation on cluster
Peter Diener
diener at cct.lsu.edu
Wed Dec 21 03:55:04 CST 2022
Dear Spandan,
You say your simulations performed 2840 timesteps in half an hour on 32
procs, which is 5680 timesteps per hour. Running for a full day you got
132105 timesteps, i.e. 5504 timesteps per hour. So you're right, there is
a small difference in speed. However, remember that the grid structure
changes as the black holes move across the grid, so some variation in
speed is to be expected. I think the small difference you observed is
within the natural range of variation.
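(Explicitly: 2840 steps / 0.5 h = 5680 steps/h, while 132105 steps / 24 h
gives about 5504 steps/h, i.e. only roughly 3% slower.)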
Cheers,
Peter
On Thu, 15 Dec 2022, Spandan Sarma 19306 wrote:
> Dear Erik and Steven,
>
> Thank you so much for the suggestions. We changed the runscript to add -x
> OMP_NUM_THREADS to the mpirun command line, and that solved the issue of
> the total number of threads being 144. The total is now 32 (equal to the
> number of procs).
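>
> For reference, the relevant line in our runscript now looks roughly like
> this (the hostfile and executable names are placeholders):
>
>   mpirun -np 32 -hostfile <hostfile> -x OMP_NUM_THREADS <cactus-exe> <parfile>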
>
> Also, the iterations have increased to 132105 for 32 procs (24 hr walltime),
> compared to just 240 before. Although this is a huge increase, we expected
> a bit more. For a shorter walltime (30 mins) we got 2840, 2140, and 1216
> iterations on 32, 16, and 8 procs, respectively. Are there any further
> changes we can make to improve on this?
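>
> (For scale: going from 8 to 32 procs, a 4x increase, gives 2840/1216, or
> about 2.3x the iterations, i.e. roughly 60% parallel efficiency.)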
>
> The new runscript and the output file (as a drive link) are attached below
> (no changes were made to the machine file, option list, and submit script
> from before).
>
> p32_omp.out
>
> On Fri, Dec 9, 2022 at 8:13 PM Steven R. Brandt <sbrandt at cct.lsu.edu> wrote:
> It's not too late to do a check, though, to see if all other nodes have
> the same OMP_NUM_THREADS value. Maybe that's the warning? It sounds like
> it should be an error.
>
> --Steve
>
> On 12/8/2022 5:23 PM, Erik Schnetter wrote:
> > Steve
> >
> > Code that runs as part of the Cactus executable is running too late
> > for this. At that time, OpenMP has already been initialized.
> >
> > There is the environment variable "CACTUS_NUM_THREADS", which is
> > checked at run time, but only if it is set (for backward
> > compatibility). Most people do not bother setting it, leaving this
> > error undetected. There is a warning output, but these are generally
> > ignored.
> >
> > -erik
> >
> > On Thu, Dec 8, 2022 at 3:48 PM Steven R. Brandt <sbrandt at cct.lsu.edu> wrote:
> >> We could probably add some startup code in which MPI broadcasts the
> >> OMP_NUM_THREADS setting to all the other processes and either checks
> >> the value of the environment variable or calls omp_set_num_threads()
> >> or some such.
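> >>
> >> A minimal sketch of what such startup code could look like, assuming it
> >> runs after MPI_Init and before any OpenMP parallel region (the function
> >> name is just illustrative):
> >>
> >>   #include <stdio.h>
> >>   #include <stdlib.h>
> >>   #include <mpi.h>
> >>   #include <omp.h>
> >>
> >>   /* Broadcast rank 0's OMP_NUM_THREADS setting and apply it everywhere. */
> >>   static void sync_omp_num_threads(void)
> >>   {
> >>     int rank, nthreads = 0;
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >>     if (rank == 0) {
> >>       const char *env = getenv("OMP_NUM_THREADS");
> >>       nthreads = env ? atoi(env) : omp_get_max_threads();
> >>     }
> >>     MPI_Bcast(&nthreads, 1, MPI_INT, 0, MPI_COMM_WORLD);
> >>     if (nthreads > 0 && nthreads != omp_get_max_threads()) {
> >>       fprintf(stderr, "rank %d: setting number of threads to %d\n",
> >>               rank, nthreads);
> >>       omp_set_num_threads(nthreads);
> >>     }
> >>   }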
> >>
> >> --Steve
> >>
> >> On 12/8/2022 9:03 AM, Erik Schnetter wrote:
> >>> Spandan
> >>>
> >>> The problem is likely that MPI does not automatically forward your
> >>> OpenMP setting to the other nodes. You are setting the environment
> >>> variable OMP_NUM_THREADS in the run script, and it is likely
> >>> necessary to forward this environment variable to the other
> >>> processes as well. Your MPI documentation will tell you how to do
> >>> this. This is likely an additional option you need to pass when
> >>> calling "mpirun".
> >>>
> >>> -erik
> >>>
> >>> On Thu, Dec 8, 2022 at 2:50 AM Spandan Sarma 19306 <spandan19 at iiserb.ac.in> wrote:
> >>>> Hello,
> >>>>
> >>>>
> >>>> This mail is in continuation of the ticket “Issue with compiling ET
> >>>> on cluster” by Shamim.
> >>>>
> >>>>
> >>>> So after Roland’s suggestion, we found that using the --prefix
> >>>> <openmpi-directory> option along with a hostfile let us successfully
> >>>> run a multiple-node simulation on our HPC.
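> >>>>
> >>>> (Schematically: mpirun --prefix <openmpi-directory> -hostfile
> >>>> <hostfile> -np 32 <cactus-exe> <parfile>, with the names in angle
> >>>> brackets as placeholders.)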
> >>>>
> >>>>
> >>>> Now we find that the BNSM gallery simulation evolves for only 240
> >>>> iterations on 2 nodes (16+16 procs, 24 hr walltime), which is very
> >>>> slow compared with the simulation on 1 node (16 procs, 24 hr
> >>>> walltime), which evolved for 120988 iterations. Parallelization
> >>>> works well within 1 node: we got 120988, 67756, and 40008 iterations
> >>>> for 16, 8, and 4 procs (24 hr walltime), respectively. We are unable
> >>>> to understand what is causing this issue when openmpi is given 2
> >>>> nodes (16+16 procs).
> >>>>
> >>>>
> >>>> In the output files we found the following, which may be an
> >>>> indication of the issue:
> >>>>
> >>>> INFO (Carpet): MPI is enabled
> >>>> INFO (Carpet): Carpet is running on 32 processes
> >>>> INFO (Carpet): This is process 0
> >>>> INFO (Carpet): OpenMP is enabled
> >>>> INFO (Carpet): This process contains 1 threads, this is thread 0
> >>>> INFO (Carpet): There are 144 threads in total
> >>>> INFO (Carpet): There are 4.5 threads per process
> >>>> INFO (Carpet): This process runs on host n129, pid=20823
> >>>> INFO (Carpet): This process runs on 1 core: 0
> >>>> INFO (Carpet): Thread 0 runs on 1 core: 0
> >>>> INFO (Carpet): This simulation is running in 3 dimensions
> >>>> INFO (Carpet): Boundary specification for map 0:
> >>>>    nboundaryzones: [[3,3,3],[3,3,3]]
> >>>>    is_internal : [[0,0,0],[0,0,0]]
> >>>>    is_staggered : [[0,0,0],[0,0,0]]
> >>>>    shiftout : [[1,0,1],[0,0,0]]
> >>>>
> >>>> WARNING level 1 from host n131 process 21
> >>>>   in thorn Carpet, file
> >>>>   /home2/mallick/ET9/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:426:
> >>>>   -> The number of threads for this process is larger than its
> >>>>      number of cores. This may indicate a performance problem.
> >>>>
> >>>>
> >>>> This is something that we couldn’t understand, as we asked for only
> >>>> 32 procs with num-threads set to 1. The command that we used to
> >>>> submit our job was:
> >>>>
> >>>> ./simfactory/bin/sim create-submit p32_mpin_npn \
> >>>>   --procs=32 --ppn=16 --num-threads=1 --ppn-used=16 --num-smt=1 \
> >>>>   --parfile=par/nsnstohmns1.par --walltime=24:10:00
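> >>>>
> >>>> (With --procs=32, --ppn=16, and --num-threads=1 this should mean 2
> >>>> nodes running 16 single-threaded processes each, i.e. 32 threads in
> >>>> total, so the 144 threads reported above means some processes must
> >>>> have started with more than one thread.)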
> >>>>
> >>>>
> >>>> I have attached the out file, runscript, submit script, optionlist,
> >>>> and machine file for reference. Thanks in advance for your help.
> >>>>
> >>>>
> >>>> Sincerely,
> >>>>
> >>>> --
> >>>> Spandan Sarma
> >>>> BS-MS' 19
> >>>> Department of Physics (4th Year),
> >>>> IISER Bhopal
>
> --
> Spandan Sarma
> BS-MS' 19
> Department of Physics (4th Year),
> IISER Bhopal
More information about the Users mailing list