[ET Trac] [Einstein Toolkit] #1283: Missing data in HDF5 files

Einstein Toolkit trac-noreply at einsteintoolkit.org
Mon Mar 11 09:53:52 CDT 2013


#1283: Missing data in HDF5 files
---------------------+------------------------------------------------------
  Reporter:  hinder  |       Owner:     
      Type:  defect  |      Status:  new
  Priority:  major   |   Milestone:     
 Component:  Cactus  |     Version:     
Resolution:          |    Keywords:     
---------------------+------------------------------------------------------

Comment (by hinder):

 How about just introducing a convention that files should be renamed as
 <file>.tmp.<extension> while they are being written to?  If any such files
 are present in a restart, those files are very likely to be corrupt, and
 simfactory can make its decisions accordingly.  We need to be careful to
 avoid a conflict with any "safe file write" filename convention, as those
 files are really temporary, and their existence does not imply that
 anything important is corrupt.
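
 A minimal sketch of how such a convention could work, in Python. The
 helper names (in_progress_name, marked_as_in_progress, find_in_progress)
 are hypothetical and are not part of Cactus or simfactory; the sketch also
 assumes every output file has an extension and that no other tool uses
 ".tmp." in its filenames, which is exactly the conflict mentioned above.

```python
import os
from contextlib import contextmanager
from pathlib import Path


def in_progress_name(path):
    """Map <file>.<extension> to <file>.tmp.<extension> (hypothetical helper)."""
    p = Path(path)
    return p.with_name(p.stem + ".tmp" + p.suffix)


@contextmanager
def marked_as_in_progress(path):
    """Keep the file under its in-progress name while it is being written.

    If the body raises or the process dies, the rename back never happens,
    so the in-progress name stays behind as evidence of an incomplete write.
    """
    final = Path(path)
    tmp = in_progress_name(final)
    if final.exists():
        os.replace(final, tmp)  # existing file: mark it before appending
    yield tmp
    os.replace(tmp, final)  # reached only if the write finished cleanly


def find_in_progress(directory):
    """Return files that still carry the in-progress name (likely corrupt)."""
    return sorted(
        p for p in Path(directory).rglob("*")
        if p.is_file() and ".tmp." in p.name
    )
```

 A restart script could call find_in_progress on the previous restart's
 output directory and treat any hit as a file that needs attention before
 continuing.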

 Do you mean "checkpoints" or "restarts" where you have written "backups"?
 Are you proposing to archive checkpoint files?  Does anybody do this?  For
 a 500-core job with 2 GB/core, this is 1 TB of data per checkpoint
 iteration (we may not checkpoint everything, so it could be about half
 that).  I don't think the archive services would be very happy with us
 regularly archiving terabytes of checkpoint data.
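
 The estimate above can be written out as a short Python calculation. The
 function name and the "fraction" parameter are illustrative assumptions,
 and 1 TB is taken as 1000 GB.

```python
def checkpoint_size_tb(cores, gb_per_core, fraction=1.0):
    """Rough size of one checkpoint in TB (1 TB = 1000 GB).

    fraction is the assumed share of memory that is actually checkpointed.
    """
    return cores * gb_per_core * fraction / 1000


full = checkpoint_size_tb(500, 2)        # everything checkpointed
half = checkpoint_size_tb(500, 2, 0.5)   # only half of memory checkpointed
print(f"{full} TB per checkpoint, or {half} TB at half the memory")
```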

-- 
Ticket URL: <https://trac.einsteintoolkit.org/ticket/1283#comment:11>
Einstein Toolkit <http://einsteintoolkit.org>
The Einstein Toolkit