Thank you, Peter, Samuel, Steve, and Erik for the suggestions and comments. We will look into more options for optimizing the performance.

Regards,
Spandan Sarma

On Wed, 21 Dec 2022, 15:25 Peter Diener, <diener@cct.lsu.edu> wrote:

Dear Spandan,

You say your simulation performed 2840 timesteps in half an hour on 32
procs, which is 5680 timesteps per hour. Running for a full day you got
132105 timesteps, i.e. 5504 timesteps per hour. So you're right, there is a
small difference in speed. However, remember that the grid structure
changes as the neutron stars move across the grid, so some variation in
speed is to be expected. I think the small difference you observed is
within the natural range of variation.

Cheers,

Peter


On Thu, 15 Dec 2022, Spandan Sarma 19306 wrote:
> Dear Erik and Steven,
>
> Thank you so much for the suggestions. We changed the runscript to add
> -x OMP_NUM_THREADS to the mpirun command line, and that solved the issue
> of the total number of threads being 144. It is now 32 (equal to the
> number of procs).
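>
> For reference, the relevant part of the updated runscript now looks
> roughly like this (a sketch; the hostfile, executable, and parameter-file
> variables are placeholders, not the exact names in our script):
>
>     # set the per-rank thread count, then export it to all ranks via -x
>     export OMP_NUM_THREADS=1
>     mpirun -np 32 --hostfile ${HOSTFILE} -x OMP_NUM_THREADS \
>         ${CACTUS_EXE} ${PARFILE}
>
> With Open MPI, -x exports the named environment variable from the
> launching shell to every spawned rank, so the remote node sees the same
> thread count as the local one.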
>
> Also, the number of iterations has increased to 132105 for 32 procs
> (24 hr walltime), compared to just 240 before. Although this is a huge
> increase, we expected a bit more. For a shorter walltime (30 min) we
> got 2840, 2140, and 1216 iterations on 32, 16, and 8 procs,
> respectively. Are there any further changes we can make to improve on
> this?
>
> The new runscript and the output file (as a drive link) are attached
> below (no changes were made to the machine file, option list, and
> submit script from before).
>
> p32_omp.out
>
> On Fri, Dec 9, 2022 at 8:13 PM Steven R. Brandt <sbrandt@cct.lsu.edu>
> wrote:
>
> It's not too late to do a check, though, to see if all other nodes have
> the same OMP_NUM_THREADS value. Maybe that's the warning? It sounds like
> it should be an error.
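>
> (Such a check is easy to run by hand; a minimal sketch, assuming an
> Open MPI mpirun and a hostfile named "hosts":
>
>     # print each rank's host and thread setting; mismatches stand out
>     mpirun -np 32 --hostfile hosts \
>         sh -c 'echo "$(hostname): OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"'
>
> Any node that prints a different value, or "unset", is the one spawning
> extra threads.)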
>
> --Steve
>
> On 12/8/2022 5:23 PM, Erik Schnetter wrote:
> > Steve
> >
> > Code that runs as part of the Cactus executable runs too late for
> > this. At that time, OpenMP has already been initialized.
> >
> > There is the environment variable "CACTUS_NUM_THREADS", which is
> > checked at run time, but only if it is set (for backward
> > compatibility). Most people do not bother setting it, leaving this
> > error undetected. There is a warning output, but these warnings are
> > generally ignored.
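> >
> > (Setting it costs one extra line in the runscript; a sketch, assuming
> > the Open MPI -x forwarding discussed below:
> >
> >     export OMP_NUM_THREADS=1
> >     # tell Cactus how many threads each rank is supposed to have
> >     export CACTUS_NUM_THREADS=${OMP_NUM_THREADS}
> >     mpirun -np 32 -x OMP_NUM_THREADS -x CACTUS_NUM_THREADS ...
> >
> > With CACTUS_NUM_THREADS set, a rank whose actual OpenMP thread count
> > differs can fail with an error instead of the warning being the only
> > trace.)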
> >
> > -erik
> >
> > On Thu, Dec 8, 2022 at 3:48 PM Steven R. Brandt
> > <sbrandt@cct.lsu.edu> wrote:
> >> We could probably add some startup code in which MPI broadcasts the
> >> OMP_NUM_THREADS setting to all the other processes and either checks
> >> the value of the environment variable or calls omp_set_num_threads()
> >> or some such.
> >>
> >> --Steve
> >>
> >> On 12/8/2022 9:03 AM, Erik Schnetter wrote:
> >>> Spandan
> >>>
> >>> The problem is likely that MPI does not automatically forward your
> >>> OpenMP setting to the other nodes. You are setting the environment
> >>> variable OMP_NUM_THREADS in the run script, and it is likely
> >>> necessary to forward this environment variable to the other
> >>> processes as well. Your MPI documentation will tell you how to do
> >>> this. This is likely an additional option you need to pass when
> >>> calling "mpirun".
> >>>
> >>> -erik
> >>>
> >>> On Thu, Dec 8, 2022 at 2:50 AM Spandan Sarma 19306
> >>> <spandan19@iiserb.ac.in> wrote:
> >>>> Hello,
> >>>>
> >>>> This mail is in continuation of the ticket, “Issue with compiling
> >>>> ET on cluster”, by Shamim.
> >>>>
> >>>> Following Roland's suggestion, we found that using the
> >>>> --prefix <openmpi-directory> option along with a hostfile worked
> >>>> successfully for running a multiple-node simulation on our HPC.
> >>>>
> >>>> Now we find that the BNSM gallery simulation evolves for only 240
> >>>> iterations on 2 nodes (16+16 procs, 24 hr walltime), which is very
> >>>> slow compared to the simulation on 1 node (16 procs, 24 hr
> >>>> walltime), which evolved for 120988 iterations. The
> >>>> parallelization works well within 1 node: we got 120988, 67756,
> >>>> and 40008 iterations on 16, 8, and 4 procs (24 hr walltime),
> >>>> respectively. We are unable to understand what causes this issue
> >>>> when Open MPI is given 2 nodes (16+16 procs).
> >>>>
> >>>> In the output files we found the following, which may be an
> >>>> indication of the issue:
> >>>>
> >>>> INFO (Carpet): MPI is enabled
> >>>> INFO (Carpet): Carpet is running on 32 processes
> >>>> INFO (Carpet): This is process 0
> >>>> INFO (Carpet): OpenMP is enabled
> >>>> INFO (Carpet): This process contains 1 threads, this is thread 0
> >>>> INFO (Carpet): There are 144 threads in total
> >>>> INFO (Carpet): There are 4.5 threads per process
> >>>> INFO (Carpet): This process runs on host n129, pid=20823
> >>>> INFO (Carpet): This process runs on 1 core: 0
> >>>> INFO (Carpet): Thread 0 runs on 1 core: 0
> >>>> INFO (Carpet): This simulation is running in 3 dimensions
> >>>> INFO (Carpet): Boundary specification for map 0:
> >>>>    nboundaryzones: [[3,3,3],[3,3,3]]
> >>>>    is_internal   : [[0,0,0],[0,0,0]]
> >>>>    is_staggered  : [[0,0,0],[0,0,0]]
> >>>>    shiftout      : [[1,0,1],[0,0,0]]
> >>>>
> >>>> WARNING level 1 from host n131 process 21
> >>>>   in thorn Carpet, file
> >>>>   /home2/mallick/ET9/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:426:
> >>>>   -> The number of threads for this process is larger its number
> >>>>      of cores. This may indicate a performance problem.
> >>>>
> >>>> This is something that we couldn't understand, as we asked for
> >>>> only 32 procs with num-threads set to 1. The command that we used
> >>>> to submit our job was:
> >>>>
> >>>>     ./simfactory/bin/sim create-submit p32_mpin_npn \
> >>>>         --procs=32 --ppn=16 --num-threads=1 --ppn-used=16 \
> >>>>         --num-smt=1 --parfile=par/nsnstohmns1.par \
> >>>>         --walltime=24:10:00
> >>>>
> >>>> I have attached the out file, runscript, submit script, option
> >>>> list, and machine file for reference. Thanks in advance for the
> >>>> help.
> >>>>
> >>>> Sincerely,
> >>>>
> >>>> --
> >>>> Spandan Sarma
> >>>> BS-MS '19
> >>>> Department of Physics (4th Year),
> >>>> IISER Bhopal
>
> --
> Spandan Sarma
> BS-MS '19
> Department of Physics (4th Year),
> IISER Bhopal