[Users] visualization-friendly HDF5 format

Frank Loeffler knarf at cct.lsu.edu
Wed Aug 19 11:46:06 CDT 2015


Hi,

I am not a visualization expert, so I am only answering what I know.
Others, please chime in. I only wanted to get the discussion going and
avoid different people working on the same thing separately after the
workshop.

On Wed, Aug 19, 2015 at 11:51:00AM -0400, Erik Schnetter wrote:
> As Jonah described during the ET workshop, we're working with the yt
> developers to make Carpet's HDF5 format yt-friendly. At the moment, we are
> adding the missing information,

Great. It might be good to share this with others.

> which so far included the set of active
> grid points -- i.e. those that should be displayed for a given level, as
> opposed to those that should be cut off (ghost, buffer, symmetry points).
> One may argue that these points should not have been output at all, but
> that would be a major change to the current file format.

We already have parameters to do that.

> The items you list are a good starting point, but are too high-level to be
> useful as a guide to implementing this. To make things concrete, I'd rather
> collaborate with someone who is actually implementing a reader, and provide
> the data that this reader needs. For example, you say you want a "list of
> variables (as a given iteration)" -- do you really want a two-step table,
> containing first a list of iterations, and then (for each iteration) a list
> of variables?

That is the kind of discussion I wanted to get started.

> That's likely very different from what a user wants to
> extract; rather, people want a set of variables, and for each variable, the
> set of iterations at which this variable has data.

I agree.
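
As a rough sketch of that per-variable view, the following builds a map
from each variable to the iterations that have data for it. It assumes
dataset names of the form "GROUP::var it=N tl=N rl=N c=N"; that layout,
the regular expression, and the name index_by_variable are assumptions
for illustration, not a description of any existing reader.

```python
import re
from collections import defaultdict

# Assumed dataset-name layout: "<variable> it=<iteration> <more key=value fields>".
_NAME = re.compile(r"^(?P<var>\S+) it=(?P<it>\d+)\b")


def index_by_variable(dataset_names):
    """Map each variable name to the sorted list of iterations with data."""
    found = defaultdict(set)
    for name in dataset_names:
        match = _NAME.match(name)
        if match is None:
            continue  # skip datasets that do not follow the assumed layout
        found[match.group("var")].add(int(match.group("it")))
    return {var: sorted(iterations) for var, iterations in found.items()}


# Hypothetical dataset names, as they might be listed from one output file.
names = [
    "ADMBASE::alp it=0 tl=0 rl=0 c=0",
    "ADMBASE::alp it=0 tl=0 rl=1 c=0",
    "ADMBASE::alp it=256 tl=0 rl=0 c=0",
    "HYDROBASE::rho it=0 tl=0 rl=0 c=0",
    "Parameters and Global Attributes",
]
index = index_by_variable(names)
```

A table like this could be written once at output time, so that a
reader does not have to scan every dataset name to build it.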

> How do you want the AMR structure to be presented? Currently, Carpet can
> output a string that can be parsed reasonably easily, and which describes
> the grid structure. Is that sufficient?

I didn't try this myself, but I remember someone at the workshop
describing parsing the string as 'quite complicated' (not an exact
quote). Does this string also include information about all the
components, so that a reader can easily figure out which component
contains a specific point/region?
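
To make the question concrete: if a reader had each component's
bounding box available, the lookup could be as simple as the sketch
below. The Component tuple, its inclusive integer index bounds, and
find_component are all hypothetical; nothing here reflects what the
grid-structure string actually contains.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

Index3 = Tuple[int, int, int]


class Component(NamedTuple):
    """Hypothetical description of one component on one refinement level."""
    level: int
    number: int
    lower: Index3  # inclusive lower corner, in grid-point indices
    upper: Index3  # inclusive upper corner, in grid-point indices


def find_component(components: Sequence[Component], level: int,
                   point: Index3) -> Optional[Component]:
    """Return the first component on `level` whose box contains `point`."""
    for comp in components:
        if comp.level != level:
            continue
        if all(lo <= p <= hi
               for lo, p, hi in zip(comp.lower, point, comp.upper)):
            return comp
    return None


# Two made-up components that split refinement level 1 along x.
components = [
    Component(level=1, number=0, lower=(0, 0, 0), upper=(15, 31, 31)),
    Component(level=1, number=1, lower=(16, 0, 0), upper=(31, 31, 31)),
]
hit = find_component(components, level=1, point=(20, 5, 5))
miss = find_component(components, level=1, point=(40, 5, 5))
```

A linear scan like this is only reasonable for a modest number of
components; the point is which information the file would need to
provide, not how to search it.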

> We also may want to have a mechanism to "glue" the different output file
> from different processors together, other than just looking for files with
> similar names.

I experimented with external links in HDF5. They work rather nicely
(after changing one byte in the source of the VisIt reader to enable
following links). I use this to have one (very small) HDF5 file pointing
to all three components of a vector, so that it is easy to combine them
into a vector in VisIt (since then, from VisIt's point of view, they come
from the same database). The 'recombiner' for this is a very tiny Python
script, but this could be done during a simulation as well.
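
For anyone who wants to try this, here is a minimal sketch of what such
a recombiner might look like using h5py. It is not the script mentioned
above: the file names, the dataset path "/data", and the helper names
plan_links and write_link_file are made up for illustration. The only
library call it relies on is h5py.ExternalLink(filename, path).

```python
def plan_links(component_files, dataset_path):
    """Return {link name: (target file, path inside the target file)}."""
    return {name: (filename, dataset_path)
            for name, filename in component_files.items()}


def write_link_file(link_filename, links):
    """Write a small HDF5 file containing one external link per entry."""
    import h5py  # third-party; imported here so planning works without it

    with h5py.File(link_filename, "w") as out:
        for name, (target_file, target_path) in links.items():
            # The link stores only the file name and path, not the data.
            out[name] = h5py.ExternalLink(target_file, target_path)


# Hypothetical per-component output files of one vector.
links = plan_links(
    {"velx": "velx.h5", "vely": "vely.h5", "velz": "velz.h5"},
    dataset_path="/data",
)
```

Calling write_link_file("vel.h5", links) would then produce the small
file. The targets are resolved only when a link is followed, so the
component files do not have to exist when the link file is written.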

> Finally, I disagree with the "established the need for a meta-data file".
> This capability exists, and it speeds up reading for the current output
> format, but the current output format has several obvious shortcomings; if
> those were remedied, things may be much faster.

The main idea behind this reasoning is that if meta-data and data
are written intermixed, then just reading the meta-data is always going
to be slower than if it were contained in a file without the actual
data, because disk access usually reads much more than requested. This
is separate from what the meta-data actually looks like.

Frank
