[Users] visualization-friendly HDF5 format
Erik Schnetter
schnetter at cct.lsu.edu
Wed Aug 19 12:30:05 CDT 2015
On Wed, Aug 19, 2015 at 12:46 PM, Frank Loeffler <knarf at cct.lsu.edu> wrote:
> Hi,
>
> I am not a visualization expert, so I am only answering what I know.
> Please others: chime in. I only wanted to get the discussion going and
> avoid different people working on the same thing separately after the
> workshop.
>
> On Wed, Aug 19, 2015 at 11:51:00AM -0400, Erik Schnetter wrote:
> > As Jonah described during the ET workshop, we're working with the yt
> > developers to make Carpet's HDF5 format yt-friendly. At the moment, we are
> > adding the missing information,
>
> Great. It might be good to share this with others.
>
> > which so far included the set of active
> > grid points -- i.e. those that should be displayed for a given level, as
> > opposed to those that should be cut off (ghost, buffer, symmetry points).
> > One may argue that these points should not have been output at all, but
> > that would be a major change to the current file format.
>
> We already have parameters to do that.
>
> > The items you list are a good starting point, but are too high-level to be
> > useful as a guide to implementing this. To make things concrete, I'd rather
> > collaborate with someone who is actually implementing a reader, and provide
> > the data that this reader needs. For example, you say you want a "list of
> > variables (as a given iteration)" -- do you really want a two-step table,
> > containing first a list of iterations, and then (for each iteration) a list
> > of variables?
>
> That is the kind of discussion I wanted to get started.
>
> > That's likely very different from what a user wants to
> > extract; rather, people want a set of variables, and for each variable, the
> > set of iterations at which this variable has data.
>
> I agree.
>
> > How do you want the AMR structure to be presented? Currently, Carpet can
> > output a string that can be parsed reasonably easily, and which describes
> > the grid structure. Is that sufficient?
>
> I didn't try this myself, but I remember someone at the workshop
> mentioning parsing the string as 'quite complicated' (no quote). Does
> this string also include information about all the components, so that a
> reader can easily figure out which component is containing a specific
> point/region?
>
The yt developers -- who probably have more of a computer science
background -- were not bothered by the format.
Feel free to suggest a different format, or a different mechanism. The grid
structure consists essentially of two nested arrays: A set of refinement
levels, and each has a set of components. A component is described by the
location of the origin and its size, i.e. two integer three-vectors. In
C++, you would have something like
    struct bbox { int origin[3], size[3]; };
    typedef vector<vector<bbox>> grid_structure;
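To make the question about finding the component that contains a given point concrete, here is a minimal sketch of such a lookup on the nested structure. It is only an illustration, not what Carpet or any existing reader does: the helper names `contains` and `find_component` are made up, and it assumes that `size` counts grid points, that the point is given in the integer index space of the level being searched, and that boxes are half-open.

```cpp
#include <cstddef>
#include <vector>

// Component description as in the message: origin and size, both integer
// three-vectors.
struct bbox { int origin[3], size[3]; };
typedef std::vector<std::vector<bbox> > grid_structure;

// Hypothetical helper: true if the integer point p lies in the half-open
// box [origin, origin + size) in every direction.
inline bool contains(const bbox& b, const int p[3]) {
  for (int d = 0; d < 3; ++d)
    if (p[d] < b.origin[d] || p[d] >= b.origin[d] + b.size[d]) return false;
  return true;
}

// Hypothetical helper: index of the first component on the given refinement
// level that contains p, or -1 if there is none (or the level does not exist).
inline int find_component(const grid_structure& gs, std::size_t level,
                          const int p[3]) {
  if (level >= gs.size()) return -1;
  for (std::size_t c = 0; c < gs[level].size(); ++c)
    if (contains(gs[level][c], p)) return static_cast<int>(c);
  return -1;
}
```

A linear scan like this is probably fine for a few hundred components per level; a reader that does many lookups might want to sort or bin the boxes first.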
There's a slight complication if you have a multi-block system (adding
either another level to the arrays, or adding an integer to the bbox
structure describing the patch number). You also need to be careful about
how the integer coordinates between the refinement levels are related,
which is different for vertex and cell centred grids.
One way to store this without using a hierarchy would be to define instead
    struct bbox {
      int reflevel;
      int patch;
      int origin[3];
      int size[3];
    };
and then store all such bboxes in a single, large array. It is then up to
the reader to split this into a tree structure according to the reflevel
and patch entries. That's easier to read, but more difficult to process.
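The splitting step a reader would have to do could look roughly like the sketch below. This is only an illustration under assumptions: the `grid_tree` type and the function name `split_by_level_and_patch` are made up, and keying by reflevel and then by patch is just one possible choice of tree layout.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Flat record as described in the message.
struct bbox {
  int reflevel;
  int patch;
  int origin[3];
  int size[3];
};

// Hypothetical nested form: tree[reflevel][patch] is the list of components
// on that refinement level and patch.
typedef std::map<int, std::map<int, std::vector<bbox> > > grid_tree;

// Split a single flat array of bboxes into the nested form, keeping the
// original order of the components within each (reflevel, patch) pair.
inline grid_tree split_by_level_and_patch(const std::vector<bbox>& flat) {
  grid_tree tree;
  for (std::size_t i = 0; i < flat.size(); ++i) {
    const bbox& b = flat[i];
    tree[b.reflevel][b.patch].push_back(b);
  }
  return tree;
}
```

Using maps rather than nested vectors avoids assuming that reflevel and patch numbers are contiguous or start at zero.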
-erik
> > We also may want to have a mechanism to "glue" the different output files
> > from different processors together, other than just looking for files with
> > similar names.
>
> I experimented with external links in HDF5. They work rather nicely
> (after changing one byte in the source of the VisIt reader: enabling
> following links). I use this to have one (very small) HDF5 file pointing
> to all three components of a vector, so that it is easy to combine them
> to a vector in VisIt (since then, from VisIt's point of view, they come
> from the same database). The 'recombiner' for this is a very tiny Python
> script, but this could be done during a simulation as well.
>
> > Finally, I disagree with the "established the need for a meta-data file".
> > This capability exists, and it speeds up reading for the current output
> > format, but the current output format has several obvious shortcomings; if
> > those were remedied, things may be much faster.
>
> The main idea this reasoning comes from is that if meta-data and data
> are written intermixed, then just reading the meta-data is always going
> to be slower than if it would be contained in a file missing the actual
> data, due to disk access usually reading much more than requested. This
> is separate from what the meta data actually looks like.
>
> Frank
>
>
--
Erik Schnetter <schnetter at cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/