<div dir="ltr"><div dir="ltr">Hi Erik,<div><br></div><div>Thanks for your comments. </div><div><br></div><div>I ran a few tests last week and found that the "sim submit" command does work on my cluster, although in an odd way. The
"nohup" error I shared earlier is only screen output; the simulation keeps running in the background. This is inconvenient, though, because the simulation does not appear as a running job in my cluster's queue. It shows up as running only when I list the simulations with ./simfactory/bin/sim list-simulations. So I will have to continue using the "run" command. </div><div><br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
If that is true, then I would create a new simulation (i.e. use a<br>
different name for the simulation), and modify the parameter file to<br>
point to the checkpoint files in the old simulations as restart files.<br>
The respective parameters are provided by the "IOUtil" thorn. This<br>
way, you sidestep Simfactory's automatic restart mechanism, and you<br>
are only using the "sim run" command you already know is working for<br>
your machine. The disadvantage is that you need to modify the<br>
parameter file each time you restart to point to the checkpoint files<br>
written by the previous run.<br><br></blockquote><div>I am working on this method now and trying to figure out which parameters need to be changed. As far as I can gather, I should only need to change the recover mode to 'manual' and specify the path and file name of the checkpoint. I will let you know if I make progress on it.
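<br><br>For reference, here is a rough sketch of the parameter-file changes I have in mind. The parameter names are the ones I found in the IOUtil thorn; the directory and file name are placeholders, and I have not tested this yet, so please treat it as a guess rather than a working recipe.
<pre>
# Assumed changes in the new simulation's parameter file (untested sketch).
# The directory and file name below are placeholders, not real paths.
IO::recover      = "manual"
IO::recover_dir  = "/path/to/&lt;old_sim_name&gt;/output-0000/checkpoints"
IO::recover_file = "checkpoint.chkpt"
</pre>
I still need to check whether the file name for manual recovery has to include the iteration number of the checkpoint.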
</div><div><br></div><div>Thank you,</div><div>Best regards,</div><div>Atul</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> Warning: job status is U<br>
> Warning: Job chaining requested but job id 999999 is not in the queue. Its status is U. Aborting submission.<br>
><br>
> I guess the issue is with the "status is U" part now.<br>
><br>
> Best regards,<br>
> Atul.<br>
><br>
> On Sat, Aug 15, 2020 at 8:57 PM Erik Schnetter <<a href="mailto:schnetter@cct.lsu.edu" target="_blank">schnetter@cct.lsu.edu</a>> wrote:<br>
>><br>
>> Atul<br>
>><br>
>> "sim run" starts a simulation right away. Have you tried "sim submit"<br>
>> instead? This should check whether the simulation is still active, and<br>
>> if so, deactivate it before running the next restart.<br>
>><br>
>> -erik<br>
>><br>
>> On Sat, Aug 15, 2020 at 7:54 PM Atul Kedia <<a href="mailto:akedia@nd.edu" target="_blank">akedia@nd.edu</a>> wrote:<br>
>> ><br>
>> > Hello,<br>
>> ><br>
>> > I want to restart a simulation to make it run for longer. It stopped at the final time set in my par file. It has checkpoints enabled in the par file.<br>
>> ><br>
>> > I have increased the time in the par file at <sim_name>/output-0000/ and at <sim_name>/SIMFACTORY/par and I tried the commands :<br>
>> ><br>
>> > simfactory/bin/sim cleanup <sim_name><br>
>> > followed by<br>
>> > simfactory/bin/sim run <sim_name><br>
>> > and added a line "jobid = 999999" as suggested at : <a href="http://lists.einsteintoolkit.org/pipermail/users/2018-September/006528.html" rel="noreferrer" target="_blank">http://lists.einsteintoolkit.org/pipermail/users/2018-September/006528.html</a><br>
>> ><br>
>> > and I get the error message :<br>
>> > "Error: Internal error: Cannot submit simulation <sim_name> because it is already active"<br>
>> ><br>
>> > Another email thread I used for reference was this one: <a href="http://lists.einsteintoolkit.org/pipermail/users/2018-May/006281.html" rel="noreferrer" target="_blank">http://lists.einsteintoolkit.org/pipermail/users/2018-May/006281.html</a><br>
>> ><br>
>> > I am using ET_Mayer with the default simfactory that it comes with (simfactory 2, I think).<br>
>> ><br>
>> > Any help would be really appreciated.<br>
>> ><br>
>> > Thank you,<br>
>> ><br>
>> > --<br>
>> > Atul Kedia<br>
>> > PhD student,<br>
>> > Physics department,<br>
>> > University of Notre Dame.<br>
>> > _______________________________________________<br>
>> > Users mailing list<br>
>> > <a href="mailto:Users@einsteintoolkit.org" target="_blank">Users@einsteintoolkit.org</a><br>
>> > <a href="http://lists.einsteintoolkit.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.einsteintoolkit.org/mailman/listinfo/users</a><br>
>><br>
>><br>
>><br>
>> --<br>
>> Erik Schnetter <<a href="mailto:schnetter@cct.lsu.edu" target="_blank">schnetter@cct.lsu.edu</a>><br>
>> <a href="http://www.perimeterinstitute.ca/personal/eschnetter/" rel="noreferrer" target="_blank">http://www.perimeterinstitute.ca/personal/eschnetter/</a><br>
><br>
><br>
><br>
> --<br>
> Atul Kedia<br>
> PhD student,<br>
> Physics department,<br>
> University of Notre Dame.<br>
<br>
<br>
<br>
-- <br>
Erik Schnetter <<a href="mailto:schnetter@cct.lsu.edu" target="_blank">schnetter@cct.lsu.edu</a>><br>
<a href="http://www.perimeterinstitute.ca/personal/eschnetter/" rel="noreferrer" target="_blank">http://www.perimeterinstitute.ca/personal/eschnetter/</a><br>
</blockquote></div></div>