Hello Erik,<div>I couldn't find any recover.par. Maybe I have an older version of simfactory. </div><div>So I just tried to run with my own par file. First I used this command:</div><div>./simfactory/sim create-submit recover_test2 --parfile=par/myparfile.par --procs=8 --walltime=0:10:0<br>
then </div><div>./simfactory/sim submit recover_test2 --parfile=par/myparfile.par --procs=8 --walltime=0:10:0</div><div>and I got the following error message:</div><div><div>Simulation Factory:</div><div>[log] Submitting:</div>
<div>[log] Restart id "0000" is active</div><div>Aborting.</div></div><div><br></div><div>Thank you. Petr. </div><div><br></div><div>Here is a LOG file: </div><div><font class="Apple-style-span" face="Verdana"><div style="font-weight: bold; ">
<span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">LOG FILE for simulation "recover_test2"</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">================================================================================</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;"><br></span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">--------------------------------------------------------------------------------</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">2011-11-21 10:08:19 <a href="mailto:ettest04@qb4.loni.org">ettest04@qb4.loni.org</a>:</span></div><div style="font-weight: bold; ">
<span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Skeleton created</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Job directory: "/scratch/ettest04/simulations/recover_test2"</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Machine: "queenbee"</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Simulation id: "simulation-recover_test2-queenbee-qb4.loni.org-ettest04-2011.11.21-10.08.19-8396"</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Source dir: "/home/ettest04/Cactus"</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Configuration: "sim"</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Config id: "config-sim-qb4.loni.org-home-ettest04-Cactus"</span></div><div style="font-weight: bold; ">
<span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Build id: "build-sim-qb4.loni.org-ettest04-2011.11.08-19.43.33-31159"</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Executable: "/home/ettest04/Cactus/exe/cactus_sim"</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Option list: "configs/sim/OptionList"</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Script file: "/home/ettest04/Cactus/configs/sim/ScriptFile"</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Parameter file: "par/rns_64_25km_1lev_tvd_vanleerMC2_hlle_MADM_1.432_HarmF_2_HarmN_1_g12.par"</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">--------------------------------------------------------------------------------</span></div><div style="font-weight: bold; ">
<span class="Apple-style-span" style="font-size: 12px; line-height: 15px;"><br></span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">--------------------------------------------------------------------------------</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">2011-11-21 10:08:19 <a href="mailto:ettest04@qb4.loni.org">ettest04@qb4.loni.org</a>:</span></div><div style="font-weight: bold; ">
<span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Submitting:</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Using restart id "0000"</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Not recovering since there are no restart ids</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Created restart directory</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Activated restart directory</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Created job script</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Created parameter file</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">About to submit job</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Executing: cd /scratch/ettest04/simulations/recover_test2/output-0000 && { qsub SIMFACTORY/ScriptFile ; }</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Submitted job</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Job id is "583945"</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">--------------------------------------------------------------------------------</span></div><div style="font-weight: bold; ">
<span class="Apple-style-span" style="font-size: 12px; line-height: 15px;"><br></span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">--------------------------------------------------------------------------------</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">2011-11-21 10:47:32 <a href="mailto:ettest04@qb4.loni.org">ettest04@qb4.loni.org</a>:</span></div><div style="font-weight: bold; ">
<span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Submitting:</span></div><div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">Restart id "0000" is active</span></div>
<div style="font-weight: bold; "><span class="Apple-style-span" style="font-size: 12px; line-height: 15px;">--------------------------------------------------------------------------------</span></div><div style="font-weight: bold; font-size: 12px; line-height: 15px; ">
<br></div><div style="font-weight: bold; font-size: 12px; line-height: 15px; "><br></div></font></div><div><br><div class="gmail_quote">On Sun, Nov 13, 2011 at 8:48 AM, Erik Schnetter <span dir="ltr"><<a href="mailto:schnetter@cct.lsu.edu">schnetter@cct.lsu.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Petr<br>
<br>
In most cases, a simple "submit" (with the same simulation name)<br>
should suffice. --from-restart-id allows restarting from a particular<br>
restart, in case you don't want to restart from the previous restart.<br>
<br>
Simfactory has a parameter file simfactory/etc/parfiles/recover.par.<br>
Could you try this? You would submit it on a single process, let it<br>
run for a few minutes, and then submit it again:<br>
<br>
./bin/sim submit recover --parfile=simfactory/etc/parfiles/recover.par<br>
--walltime=0:10:0<br>
<br>
and later<br>
<br>
./bin/sim submit recover --parfile=simfactory/etc/parfiles/recover.par<br>
--walltime=0:10:0<br>
This should work.<br>
<br>
-erik<br>
<div class="im"><br>
On Sat, Nov 12, 2011 at 8:57 PM, Petr Tsatsin <<a href="mailto:ptsatsin@fau.edu">ptsatsin@fau.edu</a>> wrote:<br>
> Hello,<br>
><br>
> I'm trying to restart my simulation on queenbee using simfactory. I tried<br>
> "./sim submit --from-restart-id ..." and I got<br>
</div><div><div></div><div class="h5">> "Simulation Factory:<br>
> [log] Submitting:<br>
> [log] Using restart id "0002"<br>
> [log] Requested recovering from restart id "0"<br>
> [log] Could not find checkpoint files in restart id "0000"<br>
> Aborting."<br>
> Inside my parameter file for this simulation I set up HDF5 checkpointing in<br>
> the following way:<br>
> "IO::out_dir = $parfile<br>
> IOHDF5::checkpoint = "yes"<br>
> IO::checkpoint_every = 100000<br>
> IO::checkpoint_file = $parfile<br>
> IO::checkpoint_dir = $parfile<br>
> IO::recover="auto"<br>
> IO::recover_file = $parfile<br>
> IO::recover_dir = $parfile"<br>
> I checked that the h5 checkpoint files are actually inside the output-0000<br>
> directory.<br>
> Any suggestions on how I can restart my simulation from an HDF5 checkpoint<br>
> with simfactory?<br>
> Thank you.<br>
> --<br>
> Petr Tsatsin<br>
> Graduate Student- Department of Physics<br>
> Charles E. Schmidt College of Science<br>
> Florida Atlantic University<br>
> <a href="tel:%28561%29-297-3386" value="+15612973386">(561)-297-3386</a><br>
> <a href="mailto:ptsatin@fau.edu">ptsatin@fau.edu</a>, <a href="mailto:ptsatsin@gmail.com">ptsatsin@gmail.com</a><br>
><br>
<br>
<br>
<br>
--<br>
</div></div><font color="#888888">Erik Schnetter <<a href="mailto:schnetter@cct.lsu.edu">schnetter@cct.lsu.edu</a>> <a href="http://www.cct.lsu.edu/~eschnett/" target="_blank">http://www.cct.lsu.edu/~eschnett/</a><br>
</font></blockquote></div><br><br clear="all"><div><br></div>-- <br><font color="#888888">Petr Tsatsin<br>Graduate Student- Department of Physics<br>
Charles E. Schmidt College of Science<br>
Florida Atlantic University<br>
<a value="+15612973386">(561)-297-3386</a><br><a href="mailto:ptsatin@fau.edu" target="_blank">ptsatin@fau.edu</a>, <a href="mailto:ptsatsin@gmail.com" target="_blank">ptsatsin@gmail.com</a><br></font><br>
</div>