[ET Trac] [Einstein Toolkit] #316: Checkpoint recovery nonfunctional

Einstein Toolkit trac-noreply at einsteintoolkit.org
Tue Mar 15 08:09:59 CDT 2011


#316: Checkpoint recovery nonfunctional
-------------------------+--------------------------------------------------
  Reporter:  hinder      |       Owner:  mthomas   
      Type:  defect      |      Status:  new       
  Priority:  blocker     |   Milestone:            
 Component:  SimFactory  |     Version:            
Resolution:              |    Keywords:  regression
-------------------------+--------------------------------------------------

Comment (by hinder):

 It looks like the patch only attempts to recover from the previous
 restart.  What happens if that restart never ran, perhaps because it was
 deleted from the queue, and hence contains no checkpoint files?  I think
 simfactory should try to recover from the last restart which actually has
 checkpoint files, not just the last restart.  If there are no restarts
 with checkpoint files, then it should not try to recover.
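
 The selection rule described above could be sketched roughly as follows.
 This is only an illustration, not simfactory's actual code: the function
 name find_recovery_restart, the "output-NNNN" restart directory naming,
 and the "checkpoint.chkpt.it_*" file pattern are all assumptions made
 for the example and would need to match the real setup.

```python
import glob
import os
import re

# Assumed layout (not confirmed by the ticket): each restart lives in
# <simulation_dir>/output-NNNN, and checkpoint files match CHECKPOINT_GLOB.
RESTART_DIR_PATTERN = re.compile(r"^output-(\d+)$")
CHECKPOINT_GLOB = "checkpoint.chkpt.it_*"


def find_recovery_restart(simulation_dir, checkpoint_glob=CHECKPOINT_GLOB):
    """Return the path of the newest restart directory that contains
    checkpoint files, or None if no restart has any."""
    restarts = []
    for name in os.listdir(simulation_dir):
        match = RESTART_DIR_PATTERN.match(name)
        path = os.path.join(simulation_dir, name)
        if match and os.path.isdir(path):
            restarts.append((int(match.group(1)), path))

    # Walk from the newest restart to the oldest, skipping restarts that
    # never ran and therefore left no checkpoint files behind.
    for _, path in sorted(restarts, reverse=True):
        pattern = os.path.join(path, "**", checkpoint_glob)
        if glob.glob(pattern, recursive=True):
            return path

    # No restart has checkpoint files: the caller should not try to recover.
    return None
```

 A caller would enable recovery only when the result is not None, which
 covers the case where the last few restarts never ran.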

 I don't think the patch should be applied until it works for the case
 where a job never ran, and hence there are no checkpoint files in the last
 few restarts.

 Aside: all of this would have been trivial if we used a single common
 directory for checkpoint files, common across restarts.  All the logic is
 present in Cactus already.

-- 
Ticket URL: <https://trac.einsteintoolkit.org/ticket/316#comment:6>
Einstein Toolkit <http://einsteintoolkit.org>
The Einstein Toolkit

