<div dir="ltr">Hello,<div><br></div><div>Reverting to srun fixes the problem. I updated the master branches for the testsuite </div><div>results and simfactory.</div><div><br></div><div>Gabriele</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jun 2, 2022 at 9:12 AM Gabriele Bozzola &lt;<a href="mailto:bozzola.gabriele@gmail.com">bozzola.gabriele@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi Roland,<div><br></div><div>That sounds reasonable. I think I was originally using srun, but was advised</div><div>to move to ibrun. I will try with srun to see if it works, in which case I will update the</div><div>simfactory entry and the testsuite results.</div><div><br></div><div>Gabriele</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jun 2, 2022 at 8:03 AM Roland Haas &lt;<a href="mailto:rhaas@illinois.edu" target="_blank">rhaas@illinois.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello Gabriele,<br>
<br>
ok, I can at least partially answer this. Indeed, RNS's A2 test is coded<br>
to use only 1 MPI rank:<br>
<br>
TEST rnsA2<br>
{<br>
PROCS 1<br>
}<br>
<br>
and thus the most likely reason is that ibrun just pulls the number of<br>
MPI ranks from SLURM rather than from whatever simfactory tries to use.<br>
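<br>
As a rough illustration of that mismatch (a sketch only, not the actual Carpet<br>
or simfactory code): CACTUS_NUM_PROCS is the variable named in the error<br>
message, SLURM_NTASKS is assumed to be what the launcher falls back to, and<br>
check_rank_mismatch is a hypothetical helper.<br>
<pre>
```python
import os


def check_rank_mismatch(env=None):
    """Compare the rank count Cactus expects with the one SLURM advertises.

    Returns None when the two agree or when either variable is unset,
    otherwise a short diagnostic string.
    """
    env = os.environ if env is None else env
    requested = env.get("CACTUS_NUM_PROCS")  # per the thread, set in the RunScript
    allocated = env.get("SLURM_NTASKS")      # assumption: what the launcher reads
    if requested is None or allocated is None:
        return None
    if int(requested) != int(allocated):
        return (f"CACTUS_NUM_PROCS is {int(requested)}, but SLURM_NTASKS is "
                f"{int(allocated)}")
    return None


if __name__ == "__main__":
    # Values mirroring the failing rnsA2 run: 1 rank requested, 2 allocated.
    print(check_rank_mismatch({"CACTUS_NUM_PROCS": "1", "SLURM_NTASKS": "2"}))
```
</pre>
Running something like this inside the job would only confirm the two counts<br>
disagree; it would not show which one ibrun actually used.<br>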
<br>
Since ibrun is no longer documented on the SDSC page (at least I do not<br>
see it on <a href="https://www.sdsc.edu/support/user_guides/expanse.html" rel="noreferrer" target="_blank">https://www.sdsc.edu/support/user_guides/expanse.html</a>), maybe<br>
the easiest fix is to remove it and use the srun command they document<br>
now?<br>
<br>
Yours,<br>
Roland<br>
<br>
> Hello Gabriele,<br>
> <br>
> hmm.<br>
> <br>
> > /home/sbozzolo/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:148: <br>
> > -> The environment variable CACTUS_NUM_PROCS is set to 1, but there are 2 <br>
> > MPI processes. This may indicate a severe problem with the MPI startup<br>
> > mechanism. <br>
> <br>
> > IBRUN: launch command: srun -n 2 --ntasks-per-node 2<br>
> > /expanse/lustre/projects/uic383/sbozzolo/ettests_2proc/SIMFACTORY/exe/cactus_sim <br>
> <br>
> Looking at these, I would have expected that CACTUS_NUM_PROCS is set to<br>
> 2 given that -n is 2 (being the number of MPI ranks). <br>
> <br>
> The current submitscript uses ibrun, though the current documentation uses<br>
> srun. Maybe changing to srun helps? Though the srun command does seem<br>
> to launch 2 MPI procs, as you expect.<br>
> <br>
> Can you check (in the RunScript in<br>
> simulations/foo/output-0000/SIMFACTORY) what CACTUS_NUM_PROCS is set to?<br>
> <br>
> If this works with "regular" runs but fails with the testsuite using<br>
> --testsuite then the issue is most likely related to the complicated<br>
> method simfactory has to use to set the number of MPI ranks.<br>
> <br>
> I would check if the failing test is actually runnable only on 1 MPI<br>
> rank (set in test.ccl). In that case, Cactus will try to run it in a<br>
> 2 MPI rank test suite but use only 1 MPI rank. Possibly ibrun ignores<br>
> Cactus' request and uses only information provided by SLURM.<br>
> <br>
> Yours,<br>
> Roland<br>
> <br>
<br>
-- <br>
My email is as private as my paper mail. I therefore support encrypting<br>
and signing email messages. Get my PGP key from <a href="http://pgp.mit.edu" rel="noreferrer" target="_blank">http://pgp.mit.edu</a> .<br>
</blockquote></div>
</blockquote></div>