[ET Trac] [Einstein Toolkit] #479: speed up VisIt's CarpetHDF5 plugin, set 2d masks

Einstein Toolkit trac-noreply at einsteintoolkit.org
Wed Jul 20 10:43:52 CDT 2011


#479: speed up VisIt's CarpetHDF5 plugin, set 2d masks
-------------------+--------------------------------------------------------
 Reporter:  rhaas  |        Type:  enhancement
   Status:  new    |    Priority:  minor      
Milestone:         |   Component:  Other      
  Version:         |    Keywords:  CarpetHDF5 
-------------------+--------------------------------------------------------
 Hello all,

 The attached patches (when all of them are applied to VisIt's CarpetHDF5
 plugin) speed up opening (large) HDF5 output files in VisIt by about a
 factor of 8. The two main speedups come from replacing a linear search
 when translating the Cactus iteration number (cctk_iteration) to a
 timestep index, and (surprisingly enough) from parsing the Cactus
 variable name out of the dataset name rather than reading the "name"
 attribute (the plugin falls back to reading the attribute if the parsing
 fails). The third speed improvement is only visible when more than one
 file is opened and plotted. VisIt seems to consider reading metadata a
 cheap operation and creates and destroys the metadata object
 (avtCarpet...) very often. The patch caches the metadata when VisIt
 destroys the object. This speeds up plotting several frames (or using
 the timestep slider) considerably, but has the unfortunate side effect
 that one cannot fully close files anymore (re-loading still works,
 though).
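
 (As an illustration only: the actual patches are C++ inside the plugin.
 A minimal Python/h5py sketch of the three ideas, assuming Carpet's
 usual dataset naming "THORN::var it=I tl=T rl=R c=C":)

    import re
    import h5py

    # Datasets are named like "ADMBASE::gxx it=1024 tl=0 rl=2 c=5"; the
    # variable name can be split off the dataset name cheaply instead of
    # reading the much slower "name" attribute for every dataset.
    _DSET_RE = re.compile(r"^(\S+::\S+) it=(\d+)")

    def variable_name(h5file, path):
        m = _DSET_RE.match(path.rsplit("/", 1)[-1])
        if m:
            return m.group(1)
        name = h5file[path].attrs["name"]   # fallback if parsing fails
        return name.decode() if isinstance(name, bytes) else str(name)

    def iteration_index(iterations):
        # Build a cctk_iteration -> timestep-index table once; each
        # later lookup is then O(1) instead of a linear scan over all
        # timesteps.
        return {it: idx for idx, it in enumerate(sorted(set(iterations)))}

    _metadata_cache = {}

    def cached_metadata(filename, parse):
        # Keep metadata alive across VisIt's frequent create/destroy
        # cycles; "parse" stands for the expensive full scan of the file.
        if filename not in _metadata_cache:
            _metadata_cache[filename] = parse(filename)
        return _metadata_cache[filename]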

 I attach timing information to show the gains. Data files and a Python
 script (for use with "visit -cli -s openfile.py") are provided at
 http://www.numrel.org/~rhaas3/CarpetHDF5/ . The data files are about
 20 MB when compressed and about 2-4 GB when decompressed (they are
 actual data from one of my simulations, with all values set to zero).
 Three data sets are provided. "flat" is the direct output from the
 Cactus simulation. "grouped" contains each timestep's datasets in a
 group of their own (grouping is mostly useful when merging HDF5 files,
 which is unbearably slow otherwise). "unchunked" was produced by a
 Python script that merges the individual Carpet components into the
 CarpetRegrid2 boxes (or equivalent), which results in much faster load
 times and smaller files (but is itself a slow operation); the sketch
 below outlines the idea. The code and scripts to group and unchunk HDF5
 data are not yet public. If there is interest I can post them as well,
 but they are not nicely coded at all.
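
 (The actual script is not public; the following is only a rough sketch
 of the unchunking idea, assuming each component dataset carries
 Carpet's integer-origin attribute "iorigin" and ignoring ghost zones:)

    import numpy as np
    import h5py

    def unchunk(infile, outfile, varname, it, rl):
        # Merge the per-component datasets of one variable, iteration
        # and refinement level into a single dataset covering their
        # bounding box (component selection by name is an assumption
        # about the file layout; real data also needs ghost-zone
        # handling).
        with h5py.File(infile, "r") as fin, h5py.File(outfile, "w") as fout:
            comps = [fin[k] for k in fin
                     if k.startswith("%s it=%d " % (varname, it))
                     and " rl=%d " % rl in k]
            # "iorigin" is stored x,y,z; h5py shapes are z,y,x (C order)
            origins = [c.attrs["iorigin"][::-1] for c in comps]
            lo = np.min(origins, axis=0)
            hi = np.max([o + c.shape for o, c in zip(origins, comps)],
                        axis=0)
            merged = fout.create_dataset("%s it=%d rl=%d" % (varname, it, rl),
                                         shape=tuple(hi - lo),
                                         dtype=comps[0].dtype)
            for c, o in zip(comps, origins):
                sel = tuple(slice(a, a + n) for a, n in zip(o - lo, c.shape))
                merged[sel] = c[...]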

 The last change (contained in the first two patches by number) alters
 the way CarpetHDF5 sets up the nesting of components for 2D data so
 that e.g. contour plots work properly (no more duplicate lines from
 coarse points underneath fine points).
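
 (Conceptually, with hypothetical names and a factor-2 refinement
 assumed, the masking amounts to blanking the coarse points that a
 finer 2D patch covers:)

    import numpy as np

    def mask_coarse_under_fine(coarse, fine_box, refinement=2):
        # fine_box = ((ilo, jlo), (ihi, jhi)) in fine-grid index space;
        # blank the covered coarse points so a contour plot does not
        # draw both levels on top of each other in the overlap region.
        (ilo, jlo), (ihi, jhi) = fine_box
        masked = np.ma.masked_array(coarse, copy=True)
        masked[jlo // refinement : jhi // refinement + 1,
               ilo // refinement : ihi // refinement + 1] = np.ma.masked
        return masked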

-- 
Ticket URL: <https://trac.einsteintoolkit.org/ticket/479>
Einstein Toolkit <http://einsteintoolkit.org>
The Einstein Toolkit

