[Users] Another restart from checkpoint failure
Jakob Hansen
jakobidetsortehul at gmail.com
Thu Feb 17 00:24:04 CST 2011
Hi,
I am having problems restarting from checkpoint with high-resolution
Carpet/McLachlan simulations. With coarser resolutions I have no problems
restarting from checkpoint, nor do I have any problems if the
high-resolution simulations run for only a few M. But when I run
high-resolution simulations until they are stopped by the wall-time limit
and then try to restart from checkpoint, I get the following error output:
#################################################################
HDF5-DIAG: Error detected in HDF5 (1.8.5-patch1) thread 0:
  #000: H5Dio.c line 174 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: H5Dio.c line 404 in H5D_read(): can't read data
    major: Dataset
    minor: Read failed
  #002: H5Dchunk.c line 1724 in H5D_chunk_read(): unable to read raw data chunk
    major: Low-level I/O
    minor: Read failed
  #003: H5Dchunk.c line 2737 in H5D_chunk_lock(): data pipeline read failed
    major: Data filters
    minor: Filter operation failed
  #004: H5Z.c line 1116 in H5Z_pipeline(): filter returned failure during read
    major: Data filters
    minor: Read failed
  #005: H5Zdeflate.c line 133 in H5Z_filter_deflate(): memory allocation failed for deflate uncompression
    major: Resource unavailable
    minor: No space available for allocation
WARNING level 1 in thorn CarpetIOHDF5 processor 104 host tachyon3167
  (line 1102 of /home01/r632kgw/jakob/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Input.cc):
  -> HDF5 call 'H5Dread (dataset, datatype, memspace, filespace, xfer, cctkGH->data[patch->vindex][timelevel])' returned error code -1
... (etc., etc.)
##################################################################
Any ideas about the possible cause of this, and how to solve it?
Thanks,
Jakob