[ET Trac] [Einstein Toolkit] #1850: Severe performance problem on Stampede
Einstein Toolkit
trac-noreply at einsteintoolkit.org
Wed Dec 16 15:59:43 CST 2015
#1850: Severe performance problem on Stampede
-------------------------+--------------------------------------------------
Reporter: hinder | Owner:
Type: defect | Status: confirmed
Priority: major | Milestone:
Component: SimFactory | Version: development version
Resolution: | Keywords:
-------------------------+--------------------------------------------------
Comment (by michael.clark@…):
Replying to [comment:16 hinder]:
> Note that Michael Clark reported different results, but he says that
> they were probably not accurate, as he cannot reproduce them now.
>
> Michael: is it possible that the run script you were using was not the
> updated one you had modified? Editing the run script in
> simfactory/mdb/runscripts is not sufficient. It then needs to be added to
> the Cactus configuration before rerunning. This requires a "sim build
> <config> --runscript <runscriptname>".
Short version: I'm aware this is required to change the runscript for a
configuration. I double-checked the simulation directories to make sure
that the runscripts were correct.
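For reference, the rebuild-and-rerun sequence in question looks roughly like the following. The configuration name, runscript name, and submission parameters here are placeholders, not the exact values I used:

```shell
# Placeholder names throughout -- adjust to your configuration.
# Rebuild so the modified runscript is baked into the configuration:
./simfactory/bin/sim build sim --runscript stampede.run

# Then create and submit the simulation as usual:
./simfactory/bin/sim create-submit runB --parfile qc0.par \
    --procs 256 --walltime 2:00:00
```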
Longer version: I ran a simulation "runA" with executable "exeA", and
"runB" with executable "exeB". These executables used the same optionlist
(default), thornlist (containing hwloc and SystemTopology), and
submitscript (default), and the runscripts differed only in that one had an
additional, commented-out line. The simulations used the same parameter
file, which activates both hwloc and SystemTopology. Both runscripts had the
"export KMP_AFFINITY..." line commented out as well. I ran "runA" on
Monday, and it ran 9-10x slower than baseline, leading me to make my
previous comment. I ran "runB" yesterday, however, and I saw baseline
performance.
For good measure, I performed a few other tests: I reconfigured with
identical runscripts, and I also used the executable exeA to perform the
same simulation runA again, without recovery, to see if that executable
was still slow. I found it ran yesterday at the same (high) speed as
baseline.
Some miscellaneous notes on obstacles I ran into during these tests: I
found it inconvenient that...
(a) the configuration has to be rebuilt merely to change the default
runscript; in particular, this means the resulting executables differ (in
the sense of having different md5 hashes) despite being built from the same
optionlist and thornlist. This is why I went through with redoing the
simulation runA with exeA. I suspect the executables differ in part because
they embed information about the date of compilation, which is printed at
the beginning of a run.
I think having a command-line option to provide the runscript would be
convenient, albeit unlikely to be used once the performance considerations
have been resolved. Moreover, you can pass --runscript to simfactory's
create-submit command, and simfactory will silently ignore it.
(Thankfully, this wasted only 5 minutes of my time.)
(b) The option --norecover flatly did not work for rerunning a simulation
from the beginning, despite being advertised in simfactory as the default.
I had to manually delete checkpoint directories to perform the simulation
again in the same simulation directory.
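For anyone hitting the same problem, the manual cleanup amounted to something like the sketch below. The checkpoint filename pattern (checkpoint.chkpt.it_*) is the Cactus default and may differ depending on your parameter file; the demo here operates on a throwaway directory standing in for a real simulation restart directory:

```shell
# Hypothetical sketch: delete checkpoint files by hand so a rerun starts
# from t=0, since --norecover did not do this for me. The filename pattern
# is the Cactus default; adjust it to match your parameter file settings.
SIMDIR=$(mktemp -d)                        # stand-in for the restart directory
touch "$SIMDIR/checkpoint.chkpt.it_0.h5" \
      "$SIMDIR/checkpoint.chkpt.it_256.h5" \
      "$SIMDIR/runA.par"                   # parameter file should survive

# Remove only the checkpoints, leaving everything else in place:
find "$SIMDIR" -type f -name 'checkpoint.chkpt.it_*' -delete
```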
As to what could have caused the discrepancy I originally observed: I
cannot say with any certainty. As far as factors under my control go, I
considered whether it had to do with envsetup or module loads. I have
often in the past run with envsetup set to "sleep 0", but I have never
observed a performance impact from changing envsetup on runs using
mvapich2. I tested this with the Intel MPI module loaded by default as well
as with Intel MPI loaded in envsetup; neither had any performance impact.
That is all I have at this time.
--
Ticket URL: <https://trac.einsteintoolkit.org/ticket/1850#comment:18>
Einstein Toolkit <http://einsteintoolkit.org>