[ET Trac] [Einstein Toolkit] #1033: support a combined "map" file in CarpetIOHDF5
Einstein Toolkit
trac-noreply at einsteintoolkit.org
Thu Aug 9 01:04:16 CDT 2012
#1033: support a combined "map" file in CarpetIOHDF5
--------------------------+-------------------------------------------------
Reporter: rhaas | Owner: eschnett
Type: enhancement | Status: new
Priority: minor | Milestone:
Component: Carpet | Version:
Keywords: CarpetIOHDF5 |
--------------------------+-------------------------------------------------
Recovering on a different number of processes than was used to write a
checkpoint is painfully slow. Part of the reason seems to be that each
process essentially has to read all files to find out where each piece of
data it requires is located. The attached patch (not to be included in the
main code because of its poor file format and coding) enables CarpetIOHDF5
to read all the information stored in the union of the index files from a
single file. This means (together with the other patches proposed today)
that CarpetIOHDF5 only ever opens those HDF5 files that are required to
restore the simulation on a given process. It significantly speeds up
recovery with many more processes than wrote the files (by a factor > 4; I
don't know the exact speedup since the unpatched version ran out of
walltime).
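To illustrate the idea, here is a minimal C++ sketch of how a process could
use a combined map to decide which files to open. It is not taken from the
patch: the MapEntry record, its fields and the files_needed helper are
hypothetical names chosen for this example, and the real patch_t in
CarpetIOHDF5 carries more information.

```cpp
#include <array>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical map-file record: one entry per dataset in any output file.
struct MapEntry {
  int filenum;                // number of the HDF5 file holding the dataset
  std::string varname;        // full variable name, e.g. "ADMBASE::gxx"
  std::array<int, 3> lo, hi;  // inclusive index bounds of the stored piece
};

// True if two inclusive index boxes overlap in every dimension.
bool overlaps(const std::array<int, 3>& alo, const std::array<int, 3>& ahi,
              const std::array<int, 3>& blo, const std::array<int, 3>& bhi) {
  for (int d = 0; d < 3; ++d)
    if (ahi[d] < blo[d] || bhi[d] < alo[d]) return false;
  return true;
}

// Return the set of file numbers that contain data overlapping the region
// [lo, hi] owned by the calling process. Only these files need opening.
std::set<int> files_needed(const std::vector<MapEntry>& map,
                           const std::array<int, 3>& lo,
                           const std::array<int, 3>& hi) {
  for (int d = 0; d < 3; ++d) assert(lo[d] <= hi[d]);
  std::set<int> files;
  for (const MapEntry& e : map)
    if (overlaps(e.lo, e.hi, lo, hi)) files.insert(e.filenum);
  return files;
}
```

With such a map read once from a single file, each process scans the
entries in memory instead of opening every output file to look for its data.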
The patch also adds an optimization for the CCTK_VarIndex calls inside
CarpetIOHDF5 (which happen for every dataset in the file).
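One common way to reduce the cost of repeated name lookups is to memoize
them. The sketch below shows that technique only; it is an assumption about
the kind of optimization meant, not the code in the patch. VarIndexCache is
a hypothetical class, and the lookup function is passed in as a stand-in
for CCTK_VarIndex so that the example does not depend on Cactus.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Memoizes a name -> index lookup. `lookup` stands in for an expensive
// call such as CCTK_VarIndex (hypothetical wiring, not the patch itself).
class VarIndexCache {
public:
  explicit VarIndexCache(std::function<int(const std::string&)> lookup)
      : lookup_(std::move(lookup)) {}

  // Return the index for `fullname`, calling the underlying lookup only
  // the first time a given name is seen. Failed lookups (negative
  // results) are cached as well.
  int get(const std::string& fullname) {
    auto it = cache_.find(fullname);
    if (it != cache_.end()) return it->second;
    ++misses_;
    const int idx = lookup_(fullname);
    cache_.emplace(fullname, idx);
    return idx;
  }

  // Number of times the underlying lookup was actually called.
  std::size_t misses() const { return misses_; }

private:
  std::function<int(const std::string&)> lookup_;
  std::unordered_map<std::string, int> cache_;
  std::size_t misses_ = 0;
};
```

Since a checkpoint holds many datasets for the same variable (one per
refinement level, component and timelevel), most calls would hit the cache.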
This is intended only as a proof of what might speed up recovery. A proper
implementation would need a more sensible file format. Two options seem
possible:
1) extend the index file format by adding a "filename" or "filenum"
attribute to each dataset, and use a concatenation of all index files as
the map file
2) define a custom HDF5 data type corresponding to the information in a
single patch_t, which would have mostly integer fields plus two variable
length / enumerated ASCII fields (for the patch name and variable name)
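Option 1 can be sketched in a few lines of C++. This is only an
illustration under simplifying assumptions: IndexEntry and build_map are
hypothetical names, an index file is modelled as a plain list of dataset
names, and the HDF5 reading and writing are left out entirely.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified view of one dataset entry in an index file.
struct IndexEntry {
  std::string dataset;  // dataset name as stored in the index file
  int filenum;          // proposed new attribute: originating file number
};

// Concatenate the per-file index listings into one map listing, tagging
// each entry with the number of the file it came from.
std::vector<IndexEntry> build_map(
    const std::vector<std::vector<IndexEntry>>& per_file) {
  std::vector<IndexEntry> map;
  for (std::size_t f = 0; f < per_file.size(); ++f) {
    for (IndexEntry e : per_file[f]) {  // copy, then tag the copy
      e.filenum = static_cast<int>(f);
      map.push_back(std::move(e));
    }
  }
  return map;
}
```

The appeal of this option is that the map file reuses the existing index
file layout, so a reader only has to understand one extra attribute.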
--
Ticket URL: <https://trac.einsteintoolkit.org/ticket/1033>
Einstein Toolkit <http://einsteintoolkit.org>
The Einstein Toolkit