[ET Trac] [Einstein Toolkit] #316: Checkpoint recovery nonfunctional
Einstein Toolkit
trac-noreply at einsteintoolkit.org
Tue Mar 15 08:09:59 CDT 2011
#316: Checkpoint recovery nonfunctional
-------------------------+--------------------------------------------------
  Reporter:  hinder      |       Owner:  mthomas
      Type:  defect      |      Status:  new
  Priority:  blocker     |   Milestone:
 Component:  SimFactory  |     Version:
Resolution:              |    Keywords:  regression
-------------------------+--------------------------------------------------
Comment (by hinder):
It looks like the patch only attempts to recover from the previous restart.
What happens if that restart never ran, perhaps because it was deleted from
the queue, and hence contains no checkpoint files? I think simfactory
should try to recover from the most recent restart which actually has
checkpoint files, not just the most recent restart. If no restart has
checkpoint files, then it should not try to recover.
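
A minimal sketch of the selection logic described above, in Python. This is
not SimFactory's actual code: the function names, the `output-NNNN` naming of
restart directories, and the `checkpoint.chkpt` file name prefix are all
assumptions made for illustration.

```python
import os
import re


def has_checkpoint_files(restart_dir, checkpoint_prefix="checkpoint.chkpt"):
    """Return True if any file below restart_dir starts with checkpoint_prefix.

    The prefix is an assumed naming convention, not taken from the ticket.
    """
    for _root, _dirs, files in os.walk(restart_dir):
        if any(name.startswith(checkpoint_prefix) for name in files):
            return True
    return False


def find_recovery_restart(simulation_dir, checkpoint_prefix="checkpoint.chkpt"):
    """Return the newest restart directory that has checkpoint files.

    Returns None when no restart has any, in which case the caller
    should not attempt recovery at all.
    """
    # Assumed layout: restarts are subdirectories named output-0000, output-0001, ...
    restart_re = re.compile(r"^output-(\d+)$")
    restarts = []
    for name in os.listdir(simulation_dir):
        match = restart_re.match(name)
        path = os.path.join(simulation_dir, name)
        if match and os.path.isdir(path):
            restarts.append((int(match.group(1)), path))

    # Walk from the newest restart backwards, skipping restarts that never
    # ran (and therefore wrote no checkpoint files).
    for _number, path in sorted(restarts, reverse=True):
        if has_checkpoint_files(path, checkpoint_prefix):
            return path
    return None
```

With restarts `output-0000` (has checkpoints), `output-0001` and `output-0002`
(both never ran), this would pick `output-0000` instead of giving up on
`output-0002`.
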
I don't think the patch should be applied until it works for the case
where one or more jobs never ran, and hence there are no checkpoint files
in the last few restarts.
Aside: all of this would have been trivial if we used a single directory
for checkpoint files, shared across restarts. All the logic is present in
Cactus already.
--
Ticket URL: <https://trac.einsteintoolkit.org/ticket/316#comment:6>
Einstein Toolkit <http://einsteintoolkit.org>