<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hi all,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I am running ETK (2019_10) on a home-built cluster consisting of two nodes (8 cores, 16 threads, 64 GB, 4.3 GHz each). I just finished my second node and am trying to run a simulation (BBHMedRes) across both nodes. For starters, I am running just one process (one
thread per process) on each node. When I execute my simfactory submit command, I get one process with one thread on the node I submitted the simulation from. However, I get one process with 16 threads on the second node, which I don't want. When I run on just
the first node, the number of processes and threads per process are exactly what I specify in the simfactory submit command. If I submit the simulation on the second node and run on the second node only, I also get exactly the processes/threads I specify in the
simfactory submit command. It's only when I run on multiple nodes that I don't get the number of processes/threads that I specify. Is there something I am doing wrong? I am using OpenMPI.<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thanks for any help, Tony...<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Relevant data is:</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<ol>
<li>RunScript:<br>
<br>
<span>#!/bin/sh<br>
</span>
<div><br>
</div>
<div># This runscript is used internally by simfactory as a template during the<br>
</div>
<div># sim setup and sim setup-silent commands<br>
</div>
<div># Edit at your own risk<br>
</div>
<div><br>
</div>
<div>echo "Preparing:"<br>
</div>
<div>set -x # Output commands<br>
</div>
<div>set -e # Abort on errors<br>
</div>
<div><br>
</div>
<div>cd @RUNDIR@-active<br>
</div>
<div><br>
</div>
<div>echo "Checking:"<br>
</div>
<div>pwd<br>
</div>
<div>hostname<br>
</div>
<div>date<br>
</div>
<div><br>
</div>
<div>echo "Environment:"<br>
</div>
<div>export CACTUS_NUM_PROCS=@NUM_PROCS@<br>
</div>
<div>export CACTUS_NUM_THREADS=@NUM_THREADS@<br>
</div>
<div>export GMON_OUT_PREFIX=gmon.out<br>
</div>
<div>export OMP_NUM_THREADS=@NUM_THREADS@<br>
</div>
<div>env | sort > SIMFACTORY/ENVIRONMENT<br>
</div>
<div><br>
</div>
<div>echo "Starting:"<br>
</div>
<div>export CACTUS_STARTTIME=$(date +%s)<br>
</div>
<div><br>
</div>
<div>if [ ${CACTUS_NUM_PROCS} = 1 ]; then<br>
</div>
<div> if [ @RUNDEBUG@ -eq 0 ]; then<br>
</div>
<div> @EXECUTABLE@ -L 3 @PARFILE@<br>
</div>
<div> else<br>
</div>
<div> gdb --args @EXECUTABLE@ -L 3 @PARFILE@<br>
</div>
<div> fi<br>
</div>
<div>else<br>
</div>
<div>mpirun --hostfile /home/mpiuser/mpi-hosts -np @NUM_PROCS@ @EXECUTABLE@ -L 3 @PARFILE@<br>
</div>
<div>fi<br>
</div>
<div><br>
</div>
<div>echo "Stopping:"<br>
</div>
<div>date<br>
</div>
<div>echo "Done."<br>
</div>
<span></span><br>
</li><li>mpi-hosts file:<br>
<br>
<span>localhost slots=1<br>
</span>
<div>RZNode2 slots=1<br>
</div>
<span></span><br>
</li><li>simfactory submit command: <span>./simfactory/bin/sim submit BBHMedRes --parfile=par/BBHMedRes.par --procs=2 --num-smt=1 --num-threads=1 --ppn-used=1 --ppn=1 --walltime=99:0:0 | cat<br>
<br>
</span></li><li><span>Machine file on first node (RZNode1):<br>
<br>
<span><br>
</span>
<div>[RZNode1]<br>
</div>
<div><br>
</div>
<div># This machine description file is used internally by simfactory as a template<br>
</div>
<div># during the sim setup and sim setup-silent commands<br>
</div>
<div># Edit at your own risk<br>
</div>
<div># Machine description<br>
</div>
<div>nickname = RZNode1<br>
</div>
<div>name = RZNode1<br>
</div>
<div>location = somewhere<br>
</div>
<div>description = Whatever<br>
</div>
<div>status = personal<br>
</div>
<div><br>
</div>
<div># Access to this machine<br>
</div>
<div>hostname = RZNode1<br>
</div>
<div>aliaspattern = ^generic\.some\.where$<br>
</div>
<div><br>
</div>
<div># Source tree management<br>
</div>
<div>sourcebasedir = /home/Cactus<br>
</div>
<div>optionlist = generic.cfg<br>
</div>
<div>submitscript = generic.sub<br>
</div>
<div>runscript = generic.run<br>
</div>
<div>make = make -j@MAKEJOBS@<br>
</div>
<div>basedir = /home/mpiuser/simulations<br>
</div>
<div>ppn = 1 # was 16<br>
</div>
<div>max-num-threads = 1 # was 16<br>
</div>
<div>num-threads = 1 # was 16<br>
</div>
<div>nodes = 2<br>
</div>
<div>submit = exec nohup @SCRIPTFILE@ < /dev/null > @RUNDIR@/@SIMULATION_NAME@.out 2> @RUNDIR@/@SIMULATION_NAME@.err & echo $!<br>
</div>
<div>getstatus = ps @JOB_ID@<br>
</div>
<div>stop = kill @JOB_ID@<br>
</div>
<div>submitpattern = (.*)<br>
</div>
<div>statuspattern = "^ *@JOB_ID@ "<br>
</div>
<div>queuedpattern = $^<br>
</div>
<div>runningpattern = ^<br>
</div>
<div>holdingpattern = $^<br>
</div>
<div>exechost = echo localhost<br>
</div>
<div>exechostpattern = (.*)<br>
</div>
<div>stdout = cat @SIMULATION_NAME@.out<br>
</div>
<div>stderr = cat @SIMULATION_NAME@.err<br>
</div>
<div>stdout-follow = tail -n 100 -f @SIMULATION_NAME@.out @SIMULATION_NAME@.err<br>
<br>
</div>
<span></span></span></li><li><span><span>Machine file on second node (RZNode2):<br>
<br>
<span>[RZNode2]<br>
</span>
<div><br>
</div>
<div># This machine description file is used internally by simfactory as a template<br>
</div>
<div># during the sim setup and sim setup-silent commands<br>
</div>
<div># Edit at your own risk<br>
</div>
<div># Machine description<br>
</div>
<div>nickname = RZNode2<br>
</div>
<div>name = RZNode2<br>
</div>
<div>location = somewhere<br>
</div>
<div>description = Whatever<br>
</div>
<div>status = personal<br>
</div>
<div><br>
</div>
<div># Access to this machine<br>
</div>
<div>hostname = RZNode2<br>
</div>
<div>aliaspattern = ^generic\.some\.where$<br>
</div>
<div><br>
</div>
<div># Source tree management<br>
</div>
<div>sourcebasedir = /home/ET_2019_10<br>
</div>
<div>optionlist = generic.cfg<br>
</div>
<div>submitscript = generic.sub<br>
</div>
<div>runscript = generic.run<br>
</div>
<div>make = make -j@MAKEJOBS@<br>
</div>
<div>basedir = /home/mpiuser/simulations<br>
</div>
<div>ppn = 1<br>
</div>
<div>max-num-threads = 1<br>
</div>
<div>num-threads = 1<br>
</div>
<div>nodes = 1<br>
</div>
<div>submit = exec nohup @SCRIPTFILE@ < /dev/null > @RUNDIR@/@SIMULATION_NAME@.out 2> @RUNDIR@/@SIMULATION_NAME@.err & echo $!<br>
</div>
<div>getstatus = ps @JOB_ID@<br>
</div>
<div>stop = kill @JOB_ID@<br>
</div>
<div>submitpattern = (.*)<br>
</div>
<div>statuspattern = "^ *@JOB_ID@ "<br>
</div>
<div>queuedpattern = $^<br>
</div>
<div>runningpattern = ^<br>
</div>
<div>holdingpattern = $^<br>
</div>
<div>exechost = echo localhost<br>
</div>
<div>exechostpattern = (.*)<br>
</div>
<div>stdout = cat @SIMULATION_NAME@.out<br>
</div>
<div>stderr = cat @SIMULATION_NAME@.err<br>
</div>
<div>stdout-follow = tail -n 100 -f @SIMULATION_NAME@.out @SIMULATION_NAME@.err<br>
</div>
<span></span></span></span></li></ol>
<div><br>
</div>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
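<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
In case it helps with diagnosis, here is a minimal sketch of a check that could be launched through the same mpirun line in place of the Cactus executable, to see which thread setting each rank actually receives. The function name report_threads and the script name report_threads.sh are hypothetical and not part of simfactory; the script only prints the node name and the value of OMP_NUM_THREADS visible to the process.<br>
</div>

```shell
#!/bin/sh
# Hypothetical diagnostic helper (not part of simfactory or Cactus).
# Prints the node name and the OMP_NUM_THREADS value seen by this process,
# or "unset" if the variable is not present in the environment.
report_threads() {
    echo "$(uname -n): OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"
}

report_threads
```

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Assumed usage: mpirun --hostfile /home/mpiuser/mpi-hosts -np 2 sh ./report_threads.sh. If the rank on RZNode2 reports "unset", the variable exported in the RunScript is not reaching the remote node. Open MPI's mpirun has a -x option (for example -x OMP_NUM_THREADS) for forwarding an environment variable to remote ranks, but whether that is the cause here is only an assumption.<br>
</div>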
<div id="Signature">
<div style="font-family:Tahoma; font-size:13px">
<font size="2" face="Helvetica" color="#BB0000"><span style="font-size:9pt"><b>Anthony Shoup</b></span></font><font size="2" face="Helvetica" color="#333333"><span style="font-size:9pt"> PhD, Senior Lecturer<br>
</span></font><font size="2" face="Helvetica" color="#BB0000"><span style="font-size:9pt">College of Arts & Sciences, College of Engineering</span></font><font size="2" face="Helvetica" color="#333333"><span style="font-size:9pt"> Departments of Physics, Astronomy,
EEIC<br>
</span></font><font size="2" face="Helvetica" color="#333333"><span style="font-size:9pt">315 Science Bldg. | 4250 Campus Dr. Lima, OH 45807<br>
419-995-8018 Office | 419-516-2257 Mobile<br>
</span></font><a href="https://email.osu.edu/owa/redir.aspx?C=j5WpnJiBk0W5oVlCbtvB-xiCkA_lbdEIi9hlk7ByHiG7ARrxjwDFmAW8S_XespJbMLJRblY5JKc.&URL=mailto%3ashoup.31%40osu.edu" target="_blank" style="font-family:Calibri; font-size:15px"><font size="2" face="Helvetica" color="blue"><span style="font-size:9pt"><b><u>shoup.31@osu.edu</u></b></span></font></a><font size="2" face="Helvetica" color="#333333"><span style="font-size:9pt"> </span></font><a href="https://email.osu.edu/owa/redir.aspx?C=j5WpnJiBk0W5oVlCbtvB-xiCkA_lbdEIi9hlk7ByHiG7ARrxjwDFmAW8S_XespJbMLJRblY5JKc.&URL=http%3a%2f%2fosu.edu" target="_blank" style="font-family:Calibri; font-size:15px"><font size="2" face="Helvetica" color="blue"><span style="font-size:9pt"><b><u>osu.edu</u></b></span></font></a></div>
</div>
</body>
</html>