[Users] restart with simfactory

Tsatsin Petr ptsatsin at gmail.com
Mon Nov 21 11:20:02 CST 2011


Hello Erik,
I haven't found any recover.par. Maybe I have an older version of
simfactory.
So I tried to run with my own par file instead. First I used this command:
./simfactory/sim create-submit recover_test2 --parfile=par/myparfile.par \
    --procs=8 --walltime=0:10:0
then
./simfactory/sim submit recover_test2 --parfile=par/myparfile.par \
    --procs=8 --walltime=0:10:0
and I got the following error message:
Simulation Factory:
[log] Submitting:
[log] Restart id "0000" is active
Aborting.

Thank you. Petr.

Here is a LOG file:
LOG FILE for simulation "recover_test2"
================================================================================

--------------------------------------------------------------------------------
2011-11-21 10:08:19 ettest04 at qb4.loni.org:
Skeleton created
Job directory: "/scratch/ettest04/simulations/recover_test2"
Machine: "queenbee"
Simulation id: "simulation-recover_test2-queenbee-qb4.loni.org-ettest04-2011.11.21-10.08.19-8396"
Source dir: "/home/ettest04/Cactus"
Configuration: "sim"
Config id: "config-sim-qb4.loni.org-home-ettest04-Cactus"
Build id: "build-sim-qb4.loni.org-ettest04-2011.11.08-19.43.33-31159"
Executable: "/home/ettest04/Cactus/exe/cactus_sim"
Option list: "configs/sim/OptionList"
Script file: "/home/ettest04/Cactus/configs/sim/ScriptFile"
Parameter file: "par/rns_64_25km_1lev_tvd_vanleerMC2_hlle_MADM_1.432_HarmF_2_HarmN_1_g12.par"
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
2011-11-21 10:08:19 ettest04 at qb4.loni.org:
Submitting:
Using restart id "0000"
Not recovering since there are no restart ids
Created restart directory
Activated restart directory
Created job script
Created parameter file
About to submit job
Executing: cd /scratch/ettest04/simulations/recover_test2/output-0000 && { qsub SIMFACTORY/ScriptFile ; }
Submitted job
Job id is "583945"
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
2011-11-21 10:47:32 ettest04 at qb4.loni.org:
Submitting:
Restart id "0000" is active
--------------------------------------------------------------------------------
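
For a longer LOG it can help to pull out only the restart-related lines. The following is a minimal sketch, not part of simfactory: `summarize_restarts` is a hypothetical helper name, and the patterns are taken only from the messages that appear in the log above, so other simfactory versions may word them differently. Applied to the log above, it would show that restart "0000" was activated at 10:08:19 and was still reported as active by the second submit at 10:47:32.

```shell
#!/bin/sh
# Sketch: extract the restart-related lines from a simfactory LOG file.
# summarize_restarts is a hypothetical helper, not a simfactory command.
summarize_restarts() {
    # $1: path to the LOG file of one simulation
    # Keep the timestamped entry headers, plus any line that mentions a
    # restart id, a restart directory, or the queue job id.
    grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} |[Rr]estart (id|directory)|^Job id' "$1"
}

# Example call (replace the argument with the actual location of the LOG file):
# summarize_restarts path/to/LOG
```

The helper only reads the file; it does not change the state of any restart.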



On Sun, Nov 13, 2011 at 8:48 AM, Erik Schnetter <schnetter at cct.lsu.edu> wrote:

> Petr
>
> In most cases, a simple "submit" (with the same simulation name)
> should suffice. --from-restart-id allows restarting from a particular
> restart, in case you don't want to restart from the previous restart.
>
> Simfactory has a parameter file simfactory/etc/parfiles/recover.par.
> Could you try this? You would submit it on a single process, let it
> run for a few minutes, and then submit it again:
>
> ./bin/sim submit recover --parfile=simfactory/etc/parfiles/recover.par \
>     --walltime=0:10:0
>
> and later
>
> ./bin/sim submit recover --parfile=simfactory/etc/parfiles/recover.par \
>     --walltime=0:10:0
>
> This should work.
>
> -erik
>
> On Sat, Nov 12, 2011 at 8:57 PM, Petr Tsatsin <ptsatsin at fau.edu> wrote:
> >  Hello,
> >
> > I'm trying to restart my simulation on queenbee using simfactory. I tried
> > "./sim submit --from-restart-id ..." and I got
> > "Simulation Factory:
> >  [log] Submitting:
> >  [log] Using restart id "0002"
> >  [log] Requested recovering from restart id "0"
> >  [log] Could not find checkpoint files in restart id "0000"
> >  Aborting."
> > Inside my parameter file for this simulation I set up HDF5 checkpointing
> > in the following way:
> > "IO::out_dir      = $parfile
> > IOHDF5::checkpoint = "yes"
> > IO::checkpoint_every = 100000
> > IO::checkpoint_file = $parfile
> > IO::checkpoint_dir = $parfile
> > IO::recover="auto"
> > IO::recover_file = $parfile
> > IO::recover_dir = $parfile"
> > I checked that the HDF5 checkpoint files are actually inside the output-0000
> > directory.
> > Any suggestions on how I can restart my simulation from an HDF5 checkpoint
> > with simfactory?
> > Thank you.
> > --
> > Petr Tsatsin
> > Graduate Student- Department of Physics
> > Charles E. Schmidt College of Science
> > Florida Atlantic University
> > (561)-297-3386
> > ptsatin at fau.edu, ptsatsin at gmail.com
> >
> > _______________________________________________
> > Users mailing list
> > Users at einsteintoolkit.org
> > http://lists.einsteintoolkit.org/mailman/listinfo/users
> >
> >
>
>
>
> --
> Erik Schnetter <schnetter at cct.lsu.edu>   http://www.cct.lsu.edu/~eschnett/
>
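
One way to double-check that checkpoint files really are present in a restart directory such as output-0000 is to list the HDF5 files below it. This is only a sketch under stated assumptions: `list_h5_files` is a hypothetical helper name, and it assumes the checkpoint files end in `.h5`. Because the quoted parameter file sets both `IO::out_dir` and `IO::checkpoint_dir` to `$parfile`, ordinary HDF5 output files may show up in the same listing, and the listing says nothing about where simfactory itself searches for checkpoints.

```shell
#!/bin/sh
# Sketch: list every HDF5 file below a restart directory.
# list_h5_files is a hypothetical helper, not a simfactory or Cactus command.
list_h5_files() {
    # $1: a restart directory, for example <simulation dir>/output-0000
    # Assumption: checkpoint files carry the .h5 extension.
    find "$1" -type f -name '*.h5' | sort
}

# Example call (replace the argument with the actual restart directory):
# list_h5_files path/to/output-0000
```

An empty listing would suggest the checkpoint files were written somewhere else.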



-- 
Petr Tsatsin
Graduate Student- Department of Physics
Charles E. Schmidt College of Science
Florida Atlantic University
(561)-297-3386
ptsatin at fau.edu, ptsatsin at gmail.com