<div dir="ltr"><div class="gmail_default" style="color:#000000">Hello Steve,</div><div class="gmail_default" style="color:#000000"><br></div><div class="gmail_default" style="color:#000000">Thanks for pointing this out. I'll try to write a fresh runscript by looking at the example runscripts.</div><div class="gmail_default" style="color:#000000"><br></div><div class="gmail_default" style=""><font color="#a64d79">Since you're using slurm, MPI should be smart enough that you don't need to pass -n, -npernode,<br></font></div><div class="gmail_default" style="color:#000000"><br></div><div class="gmail_default" style="color:#000000">So I don't need to pass -n either? I can see -n @NUM_PROCS@ in the SBATCH runscripts that use Open MPI (for example, graham and expanse). Could you please explain what the simplest/safest MPI execution command to start with would be, and how we can build on it to optimise further?</div><div class="gmail_default" style="color:#000000"><br></div><div class="gmail_default" style=""><font color="#a64d79">How did you get a Runscript and Submitscript for this machine. Did you create yourself?</font><br></div><div class="gmail_default" style="color:#000000"><br></div><div class="gmail_default" style="color:#000000">My first attempt was on an HPC at my home institute, IISER Bhopal, which had PBS. Then I installed ETK at the NSM facility (India), which has SLURM. I changed most of the machinefile and submitscript to suit SLURM (by looking at the available example scripts) and copied the runscript from the PBS setup, which was working fine. </div><div class="gmail_default" style="color:#000000"><br></div><div class="gmail_default" style="color:#000000">The current HPC also has SLURM, so I copied all the scripts from the NSM facility. 
It always worked fine, so I never doubted the runscript, especially because the error has shown up only on the current HPC, and only occasionally.</div><div class="gmail_default" style="color:#000000"><br></div><div class="gmail_default" style="color:#000000">Now I can see from the example runscripts that the MPI execution commands for SBATCH look very different from the PBS ones.</div><div class="gmail_default" style="color:#000000"><br></div><div class="gmail_default" style="color:#000000">Regards</div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><font color="#666666">Shamim Haque</font></div><div dir="ltr"><font color="#666666">Senior Research Fellow (SRF)<br></font><div><font color="#666666">Department of Physics</font></div><div><font color="#666666">IISER Bhopal</font></div></div></div></div></div></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, May 1, 2024 at 11:05 PM Steven Brandt &lt;<a href="mailto:sbrandt@cct.lsu.edu">sbrandt@cct.lsu.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>
<div>
<p>Hello Shamim,</p>
<p>The error says that you're calling MPI with the wrong parameters,
      specifically -npernode. Since you're using slurm, MPI should be
      smart enough that you don't need to pass -n or -npernode. How did
      you get a Runscript and Submitscript for this machine? Did you
      create them yourself?</p>
<p>--Steve<br>
</p>
<div>On 5/1/2024 6:54 AM, Shamim Haque
1910511 wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="color:rgb(0,0,0)">Hi all,</div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
        <div class="gmail_default" style="color:rgb(0,0,0)">I am
          attempting an ETK installation on the KALINGA cluster at NISER,
          India. This cluster has 40 procs per node and uses the SLURM
          workload manager.</div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default" style="color:rgb(0,0,0)">I compiled ETK
with gcc-7.5 and openmpi-4.0.5 (attached the machinefile,
optionlist, submitscript and runscript). The installation is
mostly alright, as I can run parfiles for test TOV and BNS
mergers.</div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default" style="color:rgb(0,0,0)">I tried to run
a simulation with procs=160 (nodes 4) and num-threads=1 but
landed with this error (error file also attached):</div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default"><i><font color="#a64d79">+ mpiexec -n 640 -npernode 40.0
/home/kamal/simulations/dx25_r500_rg7_t30_p640-1_2/SIMFACTORY/exe/cactus_sim
-L 3
/home/kamal/simulations/dx25_r500_rg7_t30_p640-1_2/output-0000/eos20_dx25_r500_rg7.par<br>
----------------------------------------------------------------------------<br>
Open MPI has detected that a parameter given to a command
line<br>
option does not match the expected format:<br>
<br>
Option: npernode<br>
Param: 40.0<br>
<br>
This is frequently caused by omitting to provide the
parameter<br>
to an option that requires one. Please check the command
line and try again.<br>
----------------------------------------------------------------------------</font><br>
</i></div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default" style="color:rgb(0,0,0)">Strangely, this
error is not at all regular. Mostly, the error won't appear,
and the simulation works just fine (with no changes being made
in the scripts or simfactory command). In fact, this exact
simulation has worked fine before. Since I am unable to find
the source of this issue, I am also unable to recreate the
error on my own. But it does kick in occasionally.</div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default" style="color:rgb(0,0,0)">My command for
mpi execution in runscript looks like this:</div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default"><i><font color="#3d85c6">time mpiexec
-n @NUM_PROCS@ -npernode @(@PPN_USED@ / @NUM_THREADS@)@
@EXECUTABLE@ -L 3 @PARFILE@</font></i><br>
</div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default" style="color:rgb(0,0,0)">If I replace <i style="color:rgb(34,34,34)"><font color="#3d85c6"> @(@PPN_USED@
/ @NUM_THREADS@)@ </font></i>with a desired value, then
the script always works. My simfactory command looks like
this:</div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default"><i><font color="#3d85c6">./simfactory/bin/sim create-submit
dx25_r500_rg7_t30_p640-1_2
--parfile=par-smooth/scale_test/eos20_dx25_r500_rg7.par
--queue=large1 --procs=640 --num-threads=1
--walltime=00:45:00<br>
</font></i></div>
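        <div class="gmail_default" style="color:rgb(0,0,0)"><br>
        </div>
        <div class="gmail_default" style="color:rgb(0,0,0)">The value
          40.0 in the error above looks like the result of a floating-point
          division. A minimal sketch of how that could arise, assuming
          (not verified here) that simfactory evaluates the
          @(@PPN_USED@ / @NUM_THREADS@)@ expression as Python under a
          Python 3 interpreter. The variable names below are hypothetical
          stand-ins for the substituted values:</div>
<pre>
```python
# Hypothetical stand-ins for the values substituted for @PPN_USED@ and
# @NUM_THREADS@ in this job (40 cores per node, 1 thread per process).
ppn_used = 40
num_threads = 1

# In Python 3, "/" is true division and always returns a float.
true_div = ppn_used / num_threads

# "//" is floor division; with integer operands the result stays an integer.
floor_div = ppn_used // num_threads

print("-npernode", true_div)   # float form, as seen in the failing command
print("-npernode", floor_div)  # integer form
```
</pre>
        <div class="gmail_default" style="color:rgb(0,0,0)">Under Python
          2 the same "/" on two integers yields an integer, so one possible
          (unconfirmed) explanation for the irregular behaviour is that
          different Python versions are picked up at different times. If
          the assumption holds, writing the runscript expression with "//"
          instead of "/" may be worth trying; I have not tested this on
          this cluster.</div>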
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default" style="color:rgb(0,0,0)">I am unable to
understand how to solve this issue. Any help with this issue
is appreciated. Please let me know if you need more
information. Thank you.</div>
<div class="gmail_default" style="color:rgb(0,0,0)"><br>
</div>
<div class="gmail_default" style="color:rgb(0,0,0)">Regards</div>
<div>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr"><font color="#666666">Shamim
Haque</font></div>
<div dir="ltr"><font color="#666666">Senior
Research Fellow (SRF)<br>
</font>
<div><font color="#666666">Department of
Physics</font></div>
<div><font color="#666666">IISER Bhopal</font></div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<pre>_______________________________________________
Users mailing list
<a href="mailto:Users@einsteintoolkit.org" target="_blank">Users@einsteintoolkit.org</a>
<a href="http://lists.einsteintoolkit.org/mailman/listinfo/users" target="_blank">http://lists.einsteintoolkit.org/mailman/listinfo/users</a>
</pre>
</blockquote>
</div>
</blockquote></div>