[Users] CarpetIOHDF5::output_index = "yes" by default?

Erik Schnetter schnetter at cct.lsu.edu
Thu Feb 28 08:47:44 CST 2013


Ian

I have nothing against the index files. It is just that, if it can be done
in post-processing, it should be done there, because creating index files
doesn't require 1000 cores. With Simfactory, or maybe even with an
easy-to-use shell script that is automatically placed in the output
directory, we would not have to create them during the simulation. This is
a point of principle, i.e. it can easily be overruled by practical
considerations.
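
A minimal sketch of what such a post-processing helper might look like,
written in Python rather than shell. It only works out which output files
still lack an index. It assumes that index files sit next to the data files
and are named by replacing ".h5" with ".idx.h5"; that naming, and the
make_index hook that would actually write an index file, are assumptions
for illustration and not existing Simfactory or CarpetIOHDF5 interfaces.

```python
import sys
from pathlib import Path
from typing import Callable, List

DATA_SUFFIX = ".h5"
INDEX_SUFFIX = ".idx.h5"  # assumption: index files are named <base>.idx.h5


def index_path_for(data_file: Path) -> Path:
    """Map e.g. rho.h5 to rho.idx.h5 (assumed naming convention)."""
    base = data_file.name[: -len(DATA_SUFFIX)]
    return data_file.with_name(base + INDEX_SUFFIX)


def files_missing_index(output_dir: Path) -> List[Path]:
    """Return data files in output_dir that have no index file beside them."""
    missing = []
    for path in sorted(output_dir.glob("*" + DATA_SUFFIX)):
        if path.name.endswith(INDEX_SUFFIX):
            continue  # already an index file, not a data file
        if not index_path_for(path).exists():
            missing.append(path)
    return missing


def generate_missing(output_dir: Path,
                     make_index: Callable[[Path, Path], None]) -> List[Path]:
    """Call make_index(data_file, index_file) for each file lacking an index.

    make_index is a hypothetical hook; this sketch does not write HDF5 itself.
    """
    created = []
    for data_file in files_missing_index(output_dir):
        index_file = index_path_for(data_file)
        make_index(data_file, index_file)
        created.append(index_file)
    return created


if __name__ == "__main__":
    # Dry run: only report which files would need an index.
    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for data_file in files_missing_index(target):
        print(f"{data_file} -> {index_path_for(data_file)}")
```

Run with the output directory as its only argument, it prints one line per
data file that has no index yet. Keeping the index writer behind a hook
leaves open which tool does the real work.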

Regarding documentation: I think a small section describing what you just
said (same structure, no data content, etc.) would suffice. I'm just
pushing for documentation here; this is not a show-stopper.

-erik





On Thu, Feb 28, 2013 at 8:16 AM, Ian Hinder <ian.hinder at aei.mpg.de> wrote:

>
> On 27 Feb 2013, at 22:33, Erik Schnetter <schnetter at cct.lsu.edu> wrote:
>
>> If it is possible to generate the index files as a post-processing step,
>> then I would prefer that, as it reduces the time the simulation spends
>> waiting for I/O. We could add it e.g. to Simfactory's cleanup step to have
>> it happen automatically.
>
>
> It is possible; the index files are identical to the original files, but
> with no data written to the datasets.  People don't visualise their data
> enough, and having the index files makes it more likely that people will
> visualise their data, as reading the data will be faster.  Having an
> additional step of post-processing the data is annoying.
>
>> How well-documented are the index files? If they are generated by default,
>> then there should be a section in CarpetIOHDF5's thorn guide (or in the
>> VisIt reader's documentation) describing why they are a good idea, what
>> information they contain, and how they can be used to speed up input.
>
>
> I don't think there is documentation for them apart from the description
> of the parameter in param.ccl.
>
> It sounds like you don't want the additional index files appearing in
> users' output directories without them understanding what they are. I agree
> that having the index files in the output directory is a bit ugly.  Another
> alternative would be to embed the content of the index files in the
> original HDF5 files when the simulation terminates, or periodically.  The
> index files are small, so this should not be a large overhead.  This would
> be a binary dataset which could be accessed by name without iterating over
> all the datasets in the file.
>
>
>
>
>
> -erik
>
>
>
> On Wed, Feb 27, 2013 at 2:41 PM, Roland Haas <
> roland.haas at physics.gatech.edu> wrote:
>
>> Hello all,
>>
>> > Would somebody object to making CarpetIOHDF5::output_index = "yes" the
>> > default? (It is currently "no".)
>> >
>> > It should not hurt to create these small files. (Or, does it?)
>> It creates files, which in itself might be bad. Creating files can be an
>> issue on some file systems where creating files is slow (or where there
>> is a limit on the number of files, which apparently on some clusters
>> [admittedly I know of no Cactus users there] can be as low as 1e6 files
>> *total*).
>>
>> Also currently there is a bug/mis-feature in that output_index writes
>> indices for both data and checkpoint files. The latter probably should
>> have a separate switch since index files are less useful there.
>>
>> If the issue is that we currently have no method for users to create
>> index files for existing HDF5 files once, e.g., they find that
>> visualization is slow, then for that I have a modified version of
>> hdf5_merge (or was it hdf5_extract... anyway, one of them) where I simply
>> commented out the final H5Dwrite and which generates perfectly fine index
>> files (of the "new" format, which does not have the extra attribute, with
>> just that change, or of the "old" format, with the attribute, with the
>> obvious changes [that are also in]). It's of course a bit of a hack, since
>> the tools were not originally meant to do that.
>>
>> Yours,
>> Roland
>>
>> --
>> My email is as private as my paper mail. I therefore support encrypting
>> and signing email messages. Get my PGP key from http://keys.gnupg.net.
>> _______________________________________________
>> Users mailing list
>> Users at einsteintoolkit.org
>> http://lists.einsteintoolkit.org/mailman/listinfo/users
>>
>
>
>
> --
> Erik Schnetter <schnetter at cct.lsu.edu>
> http://www.perimeterinstitute.ca/personal/eschnetter/
>
>
> --
> Ian Hinder
> http://numrel.aei.mpg.de/people/hinder
>
>


-- 
Erik Schnetter <schnetter at cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/