Hi,<br><br>It seems the error was caused by one of the reasons Erik explained. It happens while checkpointing, and it can be bypassed by disabling checkpointing or by setting "unchunked=no".<br><br>However, what I don't understand is:<br>
- Why did it not happen for the other run of the same computational size (not exactly the same, since it used a different EOS)?<br>- Why did it not happen on the other machines with a Lustre filesystem, even with "unchunked=yes"?<br>
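<br>For reference, the workaround looks roughly like this in the parameter file (a sketch assuming the standard IOUtil parameter names; adjust to your own setup):<br><br>IO::out_unchunked = "no"   # avoid the unchunked output path that triggered the error<br># or, alternatively, disable periodic checkpointing:<br># IO::checkpoint_every = -1<br>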
<br>Cheers,<br><br>Hee Il<br><br> <br><br><div class="gmail_quote">2010/11/8 Erik Schnetter <span dir="ltr"><<a href="mailto:schnetter@cct.lsu.edu">schnetter@cct.lsu.edu</a>></span><br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
Hee Il<div><br></div><div>HDF5 errors like this usually have one of two sources: you may be running out of disk space, or the HDF5 file may be corrupt. The latter happens if you copy a file while it is being written, if you interrupt a program that is writing to a file, or if two processes write to the same file at the same time.</div>
<div><br></div><div>-erik<br><br><div class="gmail_quote"><div><div></div><div class="h5">On Mon, Nov 8, 2010 at 8:00 AM, Hee Il Kim <span dir="ltr"><<a href="mailto:heeilkim@gmail.com" target="_blank">heeilkim@gmail.com</a>></span> wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div><div></div><div class="h5">
Hi,<br><br>Recently I encountered an HDF5 error. It didn't happen on all of my machines, but I saw it at least on QueenBee. I'm using the latest development version of the Einstein Toolkit.<br><br>I have attached my HDF5 parameter setup, stderr, and stdout below.<br>
<br>Thanks in advance,<br><br>Hee Il<br><br># IOHDF5<br>IOHDF5::out_every = 128<br>iohdf5::out_dir = "h5"<br>iohdf5::compression_level = 9<br>IOHDF5::output_symmetry_points = no<br>IOHDF5::out3D_ghosts = no<br>
IOHDF5::out_vars = "<br> hydrobase::rho{ downsample={4 4 4} }<br> weylscal4::psi4r{ downsample={4 4 4} }<br> weylscal4::psi4i{ downsample={4 4 4} }<br>
"<br><br>IO::out_unchunked = "yes"<br>IO::out_mode = "onefile"<br><br><br>### stderr ###<br>...<br>HDF5-DIAG: Error detected in HDF5 (1.8.5-patch1) thread 0:<br> #000: H5Dio.c line 266 in H5Dwrite(): can't write data<br>
major: Dataset<br> minor: Write failed<br> #001: H5Dio.c line 578 in H5D_write(): can't write data<br> major: Dataset<br> minor: Write failed<br> #002: H5Dcontig.c line 557 in H5D_contig_write(): contiguous write failed<br>
major: Dataset<br> minor: Write failed<br><br><br>### stdout ###<br><br clear="all">WARNING[L1,P0] (CarpetIOHDF5): HDF5 call 'H5Dwrite (memdataset, memdatatype, overlap_dataspace, dataspace, H5P_DEFAULT, data)' returned error code -1<br>
WARNING[L1,P0] (CarpetIOHDF5): HDF5 call 'H5Dwrite (memdataset, memdatatype, overlap_dataspace, dataspace, H5P_DEFAULT, data)' returned error code -1<br>WARNING[L1,P0] (CarpetIOHDF5): HDF5 call 'H5Dwrite (memdataset, memdatatype, overlap_dataspace, dataspace, H5P_DEFAULT, data)' returned error code -1<br>
<br><br>
</div></div></blockquote></div></div></blockquote></div>