[ET Trac] #420: newest revision of simfactory 2.0 submits three jobs instead of one

Roland Haas trac-noreply at einsteintoolkit.org
Tue May 9 12:59:10 CDT 2023


#420: newest revision of simfactory 2.0 submits three jobs instead of one

 Reporter: anonymous
   Status: closed
Milestone: 
  Version: ET_2010_11
     Type: bug
 Priority: minor
Component: SimFactory

Changes (by Roland Haas):
I am submitting a simulation using simfactory with the command:

```
[snip]
simfactory/bin/sim submit test-whisky-openmp --configuration test-whisky-openmp --parfile=whisky-openmp-test-ali.par --verbose --walltime=48:00:00 --procs=16 --ppn=4 --num-threads=4 --machine=damiana --queue=intel.q
[snip]
```

sim then prints the following messages:

```text
[snip]
Info: Simfactory command: simfactory/bin/../lib/sim.py "submit" "test-whisky-openmp" "--configuration" "test-whisky-openmp" "--parfile=whisky-openmp-test-ali.par" "--verbose" "--walltime=48:00:00" "--procs=16" "--ppn=4" "--num-threads=4" "--machine=damiana" "--queue=intel.q"
Info: Version 1331M The Simulation Factory: Manage Cactus simulations
Info: defs: /home/alibeck/programme/Cactus-Luca/Cactus/simfactory/etc/defs.ini
Info: defs.local: /home/alibeck/programme/Cactus-Luca/Cactus/simfactory/etc/defs.local.ini 
Info: Cactus Directory: /home/alibeck/programme/Cactus-Luca/Cactus 
Info: simenv.COMMAND: submit 
Info: Executing command: submit 
Info: Assigned restart_id of: 0002 
Info: Found the following restart_ids: [0, 1] 
Info: Maximum restart id determined to be: 0001 Assigned restart id: 2 
Info: Simulation is inactive: submitting 
Info: Job allocation information: 
Info: System: nodes=170 cores/node=4 threads/process=4 
Info: Requested: nodes=4 cores=16 cores/node=4 
Info: Run: processes=4 threads=16 threads/process=4 
Info: Distribution: processes/node=1 threads/node=4 
Info: Ratio: threads/core=1.000 cores/thread=1.000 
Info: writing to internalDir: /lustre/AEI/alibeck/simulations/test-whisky-openmp/output-0002/SIMFACTORY 
Info: saving substituted submitscript contents to: /lustre/AEI/alibeck/simulations/test-whisky-openmp/output-0002/SIMFACTORY/SubmitScript
Executing submit command: qsub /lustre/AEI/alibeck/simulations/test-whisky-openmp/output-0002/SIMFACTORY/SubmitScript
Submit finished, job id is 259460 
Info: Restart 2 is active 
Info: Assigned restart_id of: 0003 
Info: Found the following restart_ids: [0, 1, 2, 2] 
Info: Maximum restart id determined to be: 0002 Assigned restart id: 3 
Info: Simulation is active: presubmitting 
Info: Job allocation information: 
Info: System: nodes=170 cores/node=4 threads/process=4 
Info: Requested: nodes=4 cores=16 cores/node=4 
Info: Run: processes=4 threads=16 threads/process=4 
Info: Distribution: processes/node=1 threads/node=4 
Info: Ratio: threads/core=1.000 cores/thread=1.000 
Info: writing to internalDir: /lustre/AEI/alibeck/simulations/test-whisky-openmp/output-0003/SIMFACTORY 
Info: saving substituted submitscript contents to: /lustre/AEI/alibeck/simulations/test-whisky-openmp/output-0003/SIMFACTORY/SubmitScript
Executing submit command: qsub /lustre/AEI/alibeck/simulations/test-whisky-openmp/output-0003/SIMFACTORY/SubmitScript
Submit finished, job id is 259461 
Info: Restart 2 is active 
Info: Assigned restart_id of: 0004 
Info: Found the following restart_ids: [0, 1, 2, 2, 3] 
Info: Maximum restart id determined to be: 0003 Assigned restart id: 4 
Info: Simulation is active: presubmitting 
Info: Job allocation information: 
Info: System: nodes=170 cores/node=4 threads/process=4 
Info: Requested: nodes=4 cores=16 cores/node=4 
Info: Run: processes=4 threads=16 threads/process=4 
Info: Distribution: processes/node=1 threads/node=4 
Info: Ratio: threads/core=1.000 cores/thread=1.000 
Info: writing to internalDir: /lustre/AEI/alibeck/simulations/test-whisky-openmp/output-0004/SIMFACTORY 
Info: saving substituted submitscript contents to: /lustre/AEI/alibeck/simulations/test-whisky-openmp/output-0004/SIMFACTORY/SubmitScript
Executing submit command: qsub /lustre/AEI/alibeck/simulations/test-whisky-openmp/output-0004/SIMFACTORY/SubmitScript
Submit finished, job id is 259462
[snip]
```
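
For anyone triaging this, the repeated submissions can be pulled out of the log mechanically. Below is a minimal sketch, assuming the output of `sim submit` has been captured as text; `submissions` is a hypothetical helper written for this ticket, not part of simfactory, and it only relies on the two message formats visible in the log above.

```python
import re

# Hypothetical helper, not part of simfactory: scan captured `sim submit`
# output and pair each assigned restart id with the job id that followed it.
ASSIGNED = re.compile(r"Assigned restart_id of: (\d+)")
SUBMITTED = re.compile(r"Submit finished, job id is (\d+)")


def submissions(log_text):
    """Return a list of (restart_id, job_id) pairs in log order."""
    pairs = []
    restart_id = None
    for line in log_text.splitlines():
        m = ASSIGNED.search(line)
        if m:
            # Remember the most recently assigned restart id.
            restart_id = int(m.group(1))
            continue
        m = SUBMITTED.search(line)
        if m:
            pairs.append((restart_id, int(m.group(1))))
    return pairs


# Shortened copy of the relevant lines from the log above.
sample = """\
Info: Assigned restart_id of: 0002
Submit finished, job id is 259460
Info: Assigned restart_id of: 0003
Submit finished, job id is 259461
Info: Assigned restart_id of: 0004
Submit finished, job id is 259462
"""

result = submissions(sample)
print(result)  # [(2, 259460), (3, 259461), (4, 259462)]
```

More than one pair for a single `sim submit` call is the symptom reported here: one command produced restarts 0002, 0003 and 0004.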

As a result, three jobs are queued:

```text
[snip]
qstat
job-ID  prior    name        user     state  submit/start at      queue  slots  ja-task-ID
259460  0.00000  test-whisk  alibeck  qw     04/29/2011 10:26:55         16
259461  0.00000  test-whisk  alibeck  hqw    04/29/2011 10:26:56         16
259462  0.00000  test-whisk  alibeck  hqw    04/29/2011 10:26:56         16
[snip]
```

What is going wrong here?

**Keyword:**

--
Ticket URL: https://bitbucket.org/einsteintoolkit/tickets/issues/420/newest-revision-of-simfactory-20-submit