[ET Trac] [Einstein Toolkit] #1796: Include IllinoisGRMHD into the Toolkit, as an Arrangement
Einstein Toolkit
trac-noreply at einsteintoolkit.org
Fri Oct 9 19:59:29 CDT 2015
#1796: Include IllinoisGRMHD into the Toolkit, as an Arrangement
------------------------------------+---------------------------------------
  Reporter:  zachetie@…             |      Owner:  Zachariah Etienne
      Type:  enhancement            |     Status:  new
  Priority:  major                  |  Milestone:  ET_2015_11
 Component:  EinsteinToolkit thorn  |    Version:  development version
Resolution:                         |   Keywords:  GRMHD IllinoisGRMHD
------------------------------------+---------------------------------------
Comment (by Zach Etienne):
Replying to [comment:18 hinder]:
> Replying to [comment:14 Zach Etienne]:
> > My understanding, based on Ian Hinder's message to the [Users] list
> > (Subject: Test case notes), was that tests cannot be long in duration:
> >
> > "Tests should run as quickly as possible; ideally in less than one
> > second. Longer tests are OK if this is absolutely necessary to test for
> > regressions."
> >
> > I interpret this to mean that my usual tests for correctness, which
> > might require evolutions of order an hour long, running in parallel
> > across machines, are disallowed in ET.
> The point is that the tests in the thorns' 'test' directory are intended
> to be tests which can be run frequently and easily. They are run roughly
> after every commit on some fairly small virtual machines, and people are
> encouraged to run the tests after checking out the ET, to make sure that
> everything is currently working on their machine. Thus, they should be
> fast to run. Usually this means regression tests with small amounts of
> output.
My correctness tests require very little output (for TOV stars, the
maximum density plus the L2 norm of density versus time suffice). The
issue is that they require orders of magnitude more resources than a few
seconds of testing on a server.
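
To make that concrete: the two diagnostics themselves are cheap to
extract. Here is a minimal sketch, assuming Carpet-style ASCII reduction
output ('#' comment lines; columns iteration, time, value); the file
names are hypothetical:

    import numpy as np

    def load_reduction(path):
        # Carpet-style ASCII reduction file: '#' comments, columns
        # (iteration, time, value). Returns (time, value) arrays.
        data = np.loadtxt(path, comments="#")
        return data[:, 1], data[:, 2]

    # Hypothetical output file names for the two TOV diagnostics:
    t, rho_max = load_reduction("rho.maximum.asc")
    _, rho_l2  = load_reduction("rho.norm2.asc")
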
> In addition, a code should have correctness tests, and these may need to
> be run on a cluster, consist of multiple simulations (resolutions), run
> for longer, produce more output, etc. The existing Cactus test mechanism
> is not suitable for this, and there is no alternative framework. I have
> argued in the past for the need for such a framework, and of course
> nobody disagreed, but it is a question of resources to develop such a
> framework.
Great to hear! I would argue that resource requirements depend most
sensitively on the timescale of development: if major patches land on a
weekly timescale, then as long as a complete, continuous correctness test
takes ~a day or two, we should be fine. For IllinoisGRMHD, a souped-up
laptop or a desktop collecting dust could do all the necessary
correctness testing within hours. Then it would just be a matter of
automatically generating roundoff-error build-up comparison plots (a la
Fig. 1 of the IllinoisGRMHD paper,
http://arxiv.org/pdf/1501.07276.pdf) for a couple of quantities of
physical interest, and automatically uploading these to a web server. I
already have the basic infrastructure components in place here at WVU;
I'd just need to write the scripts and run them as a cronjob.
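
A minimal sketch of such a script, under stated assumptions:
Carpet-style ASCII maximum-density output; all file names, the web
host, and the destination path below are placeholders, not an existing
setup:

    import subprocess
    import numpy as np
    import matplotlib
    matplotlib.use("Agg")   # headless rendering, suitable for a cronjob
    import matplotlib.pyplot as plt

    # Hypothetical file names; columns assumed (iteration, time, value).
    t, rho_new = np.loadtxt("rho.maximum.asc", comments="#",
                            usecols=(1, 2), unpack=True)
    _, rho_old = np.loadtxt("rho.maximum.trusted.asc", comments="#",
                            usecols=(1, 2), unpack=True)

    # Roundoff-error build-up in maximum density, a la Fig. 1 of
    # arXiv:1501.07276: relative difference, new run vs. trusted run.
    rel = np.abs(rho_new - rho_old) / np.abs(rho_old)
    plt.semilogy(t, rel)
    plt.xlabel("t")
    plt.ylabel("relative difference in max(rho)")
    plt.savefig("rho_max_buildup.png")

    # Placeholder upload step; host and path are illustrative only.
    subprocess.run(["scp", "rho_max_buildup.png",
                    "user@webhost:/var/www/grmhd-tests/"], check=True)
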
> If you have existing correctness tests, then it would be really good if
> you could include instructions for running them, e.g. in the README or
> documentation. If they need access to large test data, I would be
> reluctant to include it in the thorn directory. Maybe a separate
> repository somewhere?
The easiest correctness test is simply to run the parfile for more
iterations and compare the roundoff-error build-up in maximum density
against the trusted code. It is remarkable how sensitive this quantity is
to any small, beyond-roundoff-level difference. I'll write the
instructions for doing so in the README, along with a small table of
expected results. Thanks for the tip!
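
A pass/fail wrapper around that comparison might look like the sketch
below; the tolerance and file names are illustrative only, and the real
expected values would come from the table in the README:

    import numpy as np

    def buildup_ok(new_file, trusted_file, tol):
        # Compare maximum-density time series from two runs
        # (Carpet-style ASCII assumed; third column is the value).
        rho_new = np.loadtxt(new_file, comments="#")[:, 2]
        rho_old = np.loadtxt(trusted_file, comments="#")[:, 2]
        n = min(len(rho_new), len(rho_old))  # compare common iterations
        rel = np.abs(rho_new[:n] - rho_old[:n]) / np.abs(rho_old[:n])
        return rel.max() < tol

    # Illustrative threshold: flag any beyond-roundoff-level divergence.
    if not buildup_ok("rho.maximum.asc", "rho.maximum.trusted.asc", 1e-10):
        raise SystemExit("max-density build-up exceeds roundoff level")
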
> I would also be interested in discussions about how a
> correctness-testing framework would look. It would need to be able to
> submit simulations on clusters and collect data from them, hence
> simfactory. It would also need to be able to analyse the data, e.g.
> collecting data from multiple processes as you normally do when
> analysing your data, i.e. it would need an analysis framework
> (https://docs.einsteintoolkit.org/et-docs/Analysis_and_post-processing).
> It would also need IT infrastructure similar to a Jenkins server to
> automate running testing jobs, providing interfaces for people to see
> what failed, look at results, etc. It's nontrivial, which is probably
> why it doesn't already exist.
Such an automated framework indeed sounds nontrivial, requiring a lot of
human time to complete. I would suggest starting with something quick and
dirty, like scripts that simply publish plots (like Fig. 1 of
http://arxiv.org/pdf/1501.07276.pdf) to a web page automatically, so
that within a second a human could recognize *by eye* whether
correctness was violated. This idea is unlikely to scale up to many
thorns, but I think it will work well for WVUThorns at least, and I'm
happy to share my codes for doing so, in case they are useful.
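
As one possible quick-and-dirty endpoint, this sketch collects whatever
comparison plots the test scripts produced and emits a bare-bones page a
human can scan by eye; the paths and the cron entry are all
placeholders:

    import html
    import pathlib

    plot_dir = pathlib.Path("/var/www/grmhd-tests")  # hypothetical web root
    rows = "\n".join(
        '<p>{0}<br><img src="{0}"></p>'.format(html.escape(p.name))
        for p in sorted(plot_dir.glob("*.png")))
    (plot_dir / "index.html").write_text(
        "<html><body><h1>IllinoisGRMHD correctness plots</h1>\n"
        + rows + "\n</body></html>")

    # Illustrative crontab entry to regenerate the page nightly:
    #   0 3 * * * python3 /path/to/make_index.py
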
--
Ticket URL: <https://trac.einsteintoolkit.org/ticket/1796#comment:19>
Einstein Toolkit <http://einsteintoolkit.org>