[ET Trac] [Einstein Toolkit] #1033: support a combined "map" file in CarpetIOHDF5

Einstein Toolkit trac-noreply at einsteintoolkit.org
Thu Aug 9 01:04:16 CDT 2012


#1033: support a combined "map" file in CarpetIOHDF5
--------------------------+-------------------------------------------------
 Reporter:  rhaas         |       Owner:  eschnett
     Type:  enhancement   |      Status:  new     
 Priority:  minor         |   Milestone:          
Component:  Carpet        |     Version:          
 Keywords:  CarpetIOHDF5  |  
--------------------------+-------------------------------------------------
 Recovering on a different number of processes than was used to write a
 checkpoint is painfully slow. Part of the reason seems to be that each
 process essentially has to read all files to find out where each piece of
 data it requires is located. The attached patch (not to be included in the
 main code due to its poor file format and coding) enables CarpetIOHDF5 to
 read all the information stored in the union of the index files from a
 single file. This means (together with the other patches proposed today)
 that CarpetIOHDF5 only ever opens those HDF5 files that are required to
 restore the simulation on a given process. It significantly speeds up
 recovery with many more processors than wrote the files (by a factor > 4;
 I don't know the exact speedup since the unpatched version ran out of
 walltime).
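
 To illustrate why a combined map helps, here is a minimal sketch of the
 lookup it enables. It is not the code in the attached patch: MapEntry,
 overlaps and files_needed are hypothetical names, and the fields are a
 simplified guess at what a map entry would have to carry. Given one entry
 per dataset, a process can work out which files intersect its own region
 without opening any of them.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical, simplified map entry: one per dataset listed in the map file.
struct MapEntry {
  int filenum;          // which checkpoint file holds the dataset (assumed field)
  int reflevel;         // refinement level of the dataset
  std::vector<int> lo;  // inclusive lower corner of the stored region
  std::vector<int> hi;  // inclusive upper corner of the stored region
};

// True if two inclusive index boxes overlap in every dimension.
bool overlaps(const std::vector<int>& alo, const std::vector<int>& ahi,
              const std::vector<int>& blo, const std::vector<int>& bhi) {
  assert(alo.size() == ahi.size() && alo.size() == blo.size() &&
         alo.size() == bhi.size());
  for (std::size_t d = 0; d < alo.size(); ++d) {
    if (ahi[d] < blo[d] || bhi[d] < alo[d]) return false;
  }
  return true;
}

// Collect the file numbers that hold data this process needs on one level.
std::set<int> files_needed(const std::vector<MapEntry>& map, int reflevel,
                           const std::vector<int>& lo,
                           const std::vector<int>& hi) {
  std::set<int> files;
  for (const MapEntry& e : map) {
    if (e.reflevel == reflevel && overlaps(e.lo, e.hi, lo, hi)) {
      files.insert(e.filenum);
    }
  }
  return files;
}
```

 With such a table in memory, only the files in the returned set need to be
 opened; without it, each process has to scan every file to build the same
 information.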

 It also adds an optimization for the CCTK_VarIndex calls inside
 CarpetIOHDF5 (which happen for every dataset in the file).
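
 The ticket does not say what form this optimization takes. One plausible
 approach is to memoize the name-to-index lookup, sketched below under that
 assumption. VarIndexCache is a hypothetical helper, and the lookup function
 is passed in so that the sketch compiles without Cactus; in the thorn it
 would wrap CCTK_VarIndex.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical memoizing wrapper around a slow name -> index lookup.
class VarIndexCache {
public:
  explicit VarIndexCache(std::function<int(const std::string&)> lookup)
      : lookup_(std::move(lookup)) {}

  // Return the cached index if the name was seen before; otherwise call the
  // underlying lookup once and remember its result.
  int operator()(const std::string& name) {
    auto it = cache_.find(name);
    if (it != cache_.end()) return it->second;
    const int index = lookup_(name);
    cache_.emplace(name, index);
    return index;
  }

private:
  std::function<int(const std::string&)> lookup_;
  std::unordered_map<std::string, int> cache_;
};
```

 Since the same variable names recur across many datasets, repeated names
 cost one hash lookup instead of a full search through the variable list.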

 This is intended only as a proof of what might speed up recovery. A proper
 implementation would need a more sensible file format. Two options seem
 possible:
 1) extend the index file format by a "filename" or "filenum" attribute on
 each dataset and use a concatenation of all index files as the map file
 2) define a custom HDF5 data type corresponding to the information in a
 single patch_t, which would have mostly integer fields plus two variable
 length / enumerated ASCII fields (for the patch name, variable name)
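
 As a rough sketch of what option 2 could look like in memory: the record
 below has a block of integer fields followed by two string fields. The
 field list is an assumption for illustration, not the actual contents of
 patch_t, and PatchRecord, Member and patch_record_members are hypothetical
 names. The member table carries the name and byte offset that an HDF5
 compound type definition (for example via H5Tinsert) would need for each
 field; the HDF5 calls themselves are left out so the sketch stays
 self-contained.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical record for one patch; the field list is illustrative only.
struct PatchRecord {
  int filenum;            // file that holds the dataset (assumed field)
  int mglevel;
  int reflevel;
  int timestep;
  int timelevel;
  int map;
  int component;
  int rank;
  const char* patchname;  // variable-length ASCII string
  const char* varname;    // variable-length ASCII string
};

// offsetof is only well defined for standard-layout types.
static_assert(std::is_standard_layout<PatchRecord>::value,
              "PatchRecord must be standard layout");

// Description of one member: what a compound-type definition needs to know.
struct Member {
  std::string name;
  std::size_t offset;
  bool is_string;
};

// Table of members in declaration order.
std::vector<Member> patch_record_members() {
  return {
      {"filenum", offsetof(PatchRecord, filenum), false},
      {"mglevel", offsetof(PatchRecord, mglevel), false},
      {"reflevel", offsetof(PatchRecord, reflevel), false},
      {"timestep", offsetof(PatchRecord, timestep), false},
      {"timelevel", offsetof(PatchRecord, timelevel), false},
      {"map", offsetof(PatchRecord, map), false},
      {"component", offsetof(PatchRecord, component), false},
      {"rank", offsetof(PatchRecord, rank), false},
      {"patchname", offsetof(PatchRecord, patchname), true},
      {"varname", offsetof(PatchRecord, varname), true},
  };
}
```

 Storing all patches as one array of such records would let a reader load
 the whole map with a single dataset read, at the cost of a format that is
 no longer a plain concatenation of the existing index files as in option 1.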

-- 
Ticket URL: <https://trac.einsteintoolkit.org/ticket/1033>
Einstein Toolkit <http://einsteintoolkit.org>
The Einstein Toolkit

