[ET Trac] [Einstein Toolkit] #1370: Provide a framework for simulation metadata
Einstein Toolkit
trac-noreply at einsteintoolkit.org
Fri May 24 05:12:21 CDT 2013
#1370: Provide a framework for simulation metadata
-------------------------+--------------------------------------------------
 Reporter:  hinder       |      Owner:
     Type:  enhancement  |     Status:  new
 Priority:  major        |  Milestone:
Component:  Cactus       |    Version:
 Keywords:               |
-------------------------+--------------------------------------------------
When writing tools to analyse the output of a Cactus simulation, it would
be very useful to have more information than is currently available, and
some of the available information could be provided in a more convenient
way. For example, users set many parameters, and this provides a good
source of information about the simulation, but the parameter file is not
always output (e.g. in testsuite data), and it does not contain the value
of parameters which are unset, and hence have their default value.
Similarly, the value of a parameter is not necessarily a good indicator of
what actually happened. Often, a group of parameters needs to be
interpreted together to determine the required quantity. For example, if
I want to know what the intended final time of the simulation was, I would
have to look at Cactus::terminate, Cactus::cctk_itlast and
Cactus::cctk_final_time. If I want to know what the timestep or grid
spacing on the coarsest grid is, I have to look at a similar set of
parameters, or parse a grid structure file from Carpet for which there is
no well-defined filename. If I want to know what the last iteration
actually was, I have to find an output file and look at it, and there
might not even be any appropriate files, depending on the user's choices.
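
To illustrate the kind of interpretation an analysis tool currently has to
do, here is a minimal Python sketch of deriving the intended final time from
those three parameters. It is only an assumption-laden illustration: the
function name, the dt and initial_time arguments, and the constant-timestep
assumption are mine, and the handling of the Cactus::terminate values
reflects my reading of their documented meaning ("either" stops when one
condition is met, "both" when both are). Other values are deliberately left
unresolved.

```python
def intended_final_time(terminate, itlast, final_time, dt, initial_time=0.0):
    """Return the coordinate time at which the run is meant to stop, or None.

    terminate, itlast and final_time stand for Cactus::terminate,
    Cactus::cctk_itlast and Cactus::cctk_final_time. dt (coarse-grid
    timestep, assumed constant) and initial_time are hypothetical inputs
    that the tool would have to obtain separately.
    """
    time_from_iteration = initial_time + itlast * dt
    if terminate == "iteration":
        return time_from_iteration
    if terminate == "time":
        return final_time
    if terminate == "either":  # stop as soon as one condition is met
        return min(time_from_iteration, final_time)
    if terminate == "both":    # stop only when both conditions are met
        return max(time_from_iteration, final_time)
    # "never", "runtime", etc.: not determined by these parameters alone
    return None
```

With a metadata file, the flesh could record this value once and every tool
could read a single key instead of repeating this logic.
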
I propose that Cactus provides a framework for simulation metadata. The
following is one possible way that it could work.
1. Metadata for the simulation is collected and output to disk
2. The metadata comes from both the flesh and from thorns
3. The metadata format is extensible
4. The metadata format is easy to parse (hence, it is in a standard,
well-specified and commonly-supported format)
5. The metadata file is easily human-readable
6. The metadata file is always output, so that analysis tools can expect
that it is present in modern simulations
7. The metadata file is not too large
8. The framework for metadata is managed by the flesh, as it is important
and will be available for every Cactus simulation
9. One possible format for the metadata file is the "ini" file format, as
used by SimFactory. This satisfies 3, 4 and 5 above.
10. There would be one section per implementation active in the
simulation, and one for the flesh.
11. Each thorn is responsible for determining what metadata keys should be
output.
12. The flesh will output essential characteristics of the simulation that
it knows about, e.g. start and end iteration and times, run title, etc.
13. Output thorns will output the names of output files, and a description
of what they contain.
14. Some metadata will be available at startup, some at termination, and
some will become available only periodically. For example, due to
parameter steering, the set of available output files might get larger
during the simulation. We could either handle this by parsing and
rewriting the metadata file to insert extra information into existing
sections, or allow sections to be repeated. We have a parsing framework
in the flesh now (Piraha), so this should be straightforward.
15. Metadata files will be modified safely (e.g. by writing a new one to a
temporary file and moving it over the old one)
16. A distinction will be made between metadata items and parameters.
Often, there will be a 1-1 correspondence between these. As a result, it
would be good to have a convenient way for thorn authors to easily mark
parameters as suitable for direct inclusion in the metadata file. For
example, marking a parameter with a keyword "metadata = yes" or equivalent
in the param.ccl file would cause a metadata key for this parameter to be
automatically included in the metadata file.
17. Information which can change during a simulation might not be a good
candidate for metadata; maybe then it becomes "data" and should be output
in a separate file (pointed to by a metadata entry, of course). In that
case, setting "steerable" and "metadata" for a parameter in param.ccl
should lead to an error.
18. Metadata entries could be restricted to string values, or could have
richer types. Supporting integers, floating point numbers, and lists
(possibly with nesting) in addition to strings might be convenient.
19. The flesh could provide a function CCTK_RecordMetadata(key, value)
[surely the implementation does not need to be told to the flesh by the
caller?]. This function would store the data in a flesh data structure,
and note whether the on-disk file needed to be updated.
20. Every iteration, the flesh (on the first process) would update the
on-disk metadata file if it needed to be changed.
21. The sections in the metadata file will correspond to implementations.
Multiple thorns providing the same implementation [who chose this name?]
should, where they provide the same information, use the same key names.
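
As a rough illustration of points 9, 10, 15, 19 and 20, the following Python
sketch models the bookkeeping the flesh might do. It is not a proposed
implementation: the real framework would live in the flesh, and the class
name MetadataStore, its methods, the read_metadata helper and the section
and key names in the usage note are all hypothetical. Values are stored as
strings (the simplest option in point 18), the file is in "ini" format, and
updates go through a temporary file that is renamed over the old one. Unlike
the CCTK_RecordMetadata(key, value) idea, the section is passed explicitly
here, since a standalone sketch has no way to infer the calling
implementation.

```python
import configparser
import os
import tempfile


class MetadataStore:
    """In-memory metadata, one section per implementation, flushed on demand."""

    def __init__(self, path):
        self.path = path
        self.sections = {}  # section name -> {key: value}
        self.dirty = False  # True when the on-disk file is out of date

    def record(self, section, key, value):
        """Store a value as a string; mark the file stale only if it changed."""
        value = str(value)
        entries = self.sections.setdefault(section, {})
        if entries.get(key) != value:
            entries[key] = value
            self.dirty = True

    def flush(self):
        """Rewrite the file if needed; return True if a write happened."""
        if not self.dirty:
            return False
        parser = configparser.ConfigParser(interpolation=None)
        parser.optionxform = str  # keep key case as given
        parser.read_dict(self.sections)
        directory = os.path.dirname(os.path.abspath(self.path))
        # Write to a temporary file in the same directory, then rename it
        # over the old file so readers never see a half-written file.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as tmp:
                parser.write(tmp)
            os.replace(tmp_path, self.path)
        except BaseException:
            os.unlink(tmp_path)
            raise
        self.dirty = False
        return True


def read_metadata(path):
    """Parse a metadata file back into {section: {key: value}}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str
    with open(path) as handle:
        parser.read_file(handle)
    return {name: dict(parser[name]) for name in parser.sections()}
```

In this sketch a thorn would call something like
store.record("Cactus", "run_title", "example") whenever it has new
information, and the driver would call store.flush() once per iteration on
the first process; flush() does nothing unless a value has actually changed.
An analysis tool then needs only read_metadata() (or any other ini parser)
to get at the result.
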
Related:
* An example of this sort of idea is already implemented by TwoPunctures
(#551), which outputs a TwoPunctures.bbh metadata file in the "numerical
relativity data format".
* The thorn Formaline currently handles a limited amount of metadata, but
the scope is more limited than this ticket. The above proposal could be
implemented using Formaline, but then you could not always expect that the
metadata file is available as Formaline might not have been activated.
--
Ticket URL: <https://trac.einsteintoolkit.org/ticket/1370>
Einstein Toolkit <http://einsteintoolkit.org>