[Users] hdf5 error

Hee Il Kim heeilkim at gmail.com
Wed Nov 24 00:53:23 CST 2010


Hi,

It seems the error has the same cause that Erik described. It occurs while
checkpointing, and it can be avoided either by disabling checkpointing or by
setting "unchunked=no".
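
In parameter-file terms, the two workarounds would look roughly like the
sketch below; only one of them should be needed. It assumes that "unchunked"
refers to the IO::out_unchunked setting from my original parameter file, and
that checkpointing was switched on through IO::checkpoint_every and
IO::checkpoint_on_terminate. Those two checkpoint parameters are not part of
the setup quoted below, so treat them as assumptions and check them against
your own parameter file.

```
# Workaround 1: write chunked output instead of unchunked output
# (same parameter as in the quoted setup, with the value flipped)
IO::out_unchunked = "no"

# Workaround 2 (assumed parameter names): disable checkpointing
IO::checkpoint_every        = -1
IO::checkpoint_on_terminate = "no"
```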

However, there are two things I don't understand:
- Why did it not happen for another run of the same computational size (not
exactly the same, though, because it uses a different EOS)?
- Why did it not happen on the other machines with Lustre filesystems, even
with "unchunked=yes"?

Cheers,

Hee Il



2010/11/8 Erik Schnetter <schnetter at cct.lsu.edu>

> Hee Il
>
> HDF5 errors usually have one of two causes: you may be running out of
> disk space, or the HDF5 file may be corrupt. The latter happens if you copy
> a file while it is being written, if you interrupt a program that is
> writing to a file, or if two processes write to the same file at the
> same time.
>
> -erik
>
> On Mon, Nov 8, 2010 at 8:00 AM, Hee Il Kim <heeilkim at gmail.com> wrote:
>
>> Hi,
>>
>> Recently I've encountered an HDF5 error. It doesn't happen on all my
>> machines, but I have seen it at least on QueenBee. I'm using the latest
>> development version of the Einstein Toolkit.
>>
>> I attached below my hdf5 parameter setup, stderr, and stdout.
>>
>> Thanks in advance,
>>
>> Hee Il
>>
>> # IOHDF5
>> IOHDF5::out_every = 128
>> iohdf5::out_dir = "h5"
>> iohdf5::compression_level = 9
>> IOHDF5::output_symmetry_points = no
>> IOHDF5::out3D_ghosts           = no
>> IOHDF5::out_vars = "
>>                        hydrobase::rho{ downsample={4 4 4} }
>>                        weylscal4::psi4r{ downsample={4 4 4} }
>>                        weylscal4::psi4i{ downsample={4 4 4} }
>>                    "
>>
>> IO::out_unchunked    = "yes"
>> IO::out_mode         = "onefile"
>>
>>
>> ### stderr ###
>> ...
>> HDF5-DIAG: Error detected in HDF5 (1.8.5-patch1) thread 0:
>>   #000: H5Dio.c line 266 in H5Dwrite(): can't write data
>>     major: Dataset
>>     minor: Write failed
>>   #001: H5Dio.c line 578 in H5D_write(): can't write data
>>     major: Dataset
>>     minor: Write failed
>>   #002: H5Dcontig.c line 557 in H5D_contig_write(): contiguous write failed
>>     major: Dataset
>>     minor: Write failed
>>
>>
>> ### stdout ###
>>
>> WARNING[L1,P0] (CarpetIOHDF5): HDF5 call 'H5Dwrite (memdataset,
>> memdatatype, overlap_dataspace, dataspace, H5P_DEFAULT, data)' returned
>> error code -1
>> WARNING[L1,P0] (CarpetIOHDF5): HDF5 call 'H5Dwrite (memdataset,
>> memdatatype, overlap_dataspace, dataspace, H5P_DEFAULT, data)' returned
>> error code -1
>> WARNING[L1,P0] (CarpetIOHDF5): HDF5 call 'H5Dwrite (memdataset,
>> memdatatype, overlap_dataspace, dataspace, H5P_DEFAULT, data)' returned
>> error code -1
>>
>>
>>