[ET Trac] [Einstein Toolkit] #420: newest revision of simfactory 2.0 submits three jobs instead of one

Einstein Toolkit trac-noreply at einsteintoolkit.org
Fri Apr 29 03:44:44 CDT 2011


#420: newest revision of simfactory 2.0 submits three jobs instead of one
----------------------------------------------+-----------------------------
 Reporter:  alexander.beck-ratzka@…           |       Owner:  mthomas   
     Type:  defect                            |      Status:  new       
 Priority:  minor                             |   Milestone:            
Component:  SimFactory                        |     Version:  ET_2010_11
 Keywords:                                    |  
----------------------------------------------+-----------------------------
 I am submitting a simulation using simfactory with the command:

 [snip]
 simfactory/bin/sim submit test-whisky-openmp --configuration test-whisky-
 openmp --parfile=whisky-openmp-test-ali.par --verbose --walltime=48:00:00
 --procs=16 --ppn=4 --num-threads=4 --machine=damiana --queue=intel.q
 [snip]

 sim then prints the following messages:

 [snip]
 Info: Simfactory command: simfactory/bin/../lib/sim.py "submit" "test-
 whisky-openmp" "--configuration" "test-whisky-openmp" "--parfile=whisky-
 openmp-test-ali.par" "--verbose" "--walltime=48:00:00" "--procs=16" "--
 ppn=4" "--num-threads=4" "--machine=damiana" "--queue=intel.q"
 Info: Version 1331M
 The Simulation Factory: Manage Cactus simulations

 Info: defs: /home/alibeck/programme/Cactus-
 Luca/Cactus/simfactory/etc/defs.ini
 Info: defs.local: /home/alibeck/programme/Cactus-
 Luca/Cactus/simfactory/etc/defs.local.ini
 Info: Cactus Directory: /home/alibeck/programme/Cactus-Luca/Cactus
 Info: simenv.COMMAND: submit
 Info: Executing command: submit
 Info: Assigned restart_id of: 0002
 Info: Found the following restart_ids: [0, 1]
 Info: Maximum restart id determined to be: 0001
 Assigned restart id: 2
 Info: Simulation is inactive: submitting
 Info: Job allocation information:
 Info:    System:       nodes=170 cores/node=4 threads/process=4
 Info:    Requested:    nodes=4 cores=16 cores/node=4
 Info:    Run:          processes=4 threads=16 threads/process=4
 Info:    Distribution: processes/node=1 threads/node=4
 Info:    Ratio:        threads/core=1.000 cores/thread=1.000
 Info: writing to internalDir: /lustre/AEI/alibeck/simulations/test-whisky-
 openmp/output-0002/SIMFACTORY
 Info: saving substituted submitscript contents to:
 /lustre/AEI/alibeck/simulations/test-whisky-
 openmp/output-0002/SIMFACTORY/SubmitScript
 Executing submit command: qsub /lustre/AEI/alibeck/simulations/test-
 whisky-openmp/output-0002/SIMFACTORY/SubmitScript
 Submit finished, job id is 259460
 Info: Restart 2 is active
 Info: Assigned restart_id of: 0003
 Info: Found the following restart_ids: [0, 1, 2, 2]
 Info: Maximum restart id determined to be: 0002
 Assigned restart id: 3
 Info: Simulation is active: presubmitting
 Info: Job allocation information:
 Info:    System:       nodes=170 cores/node=4 threads/process=4
 Info:    Requested:    nodes=4 cores=16 cores/node=4
 Info:    Run:          processes=4 threads=16 threads/process=4
 Info:    Distribution: processes/node=1 threads/node=4
 Info:    Ratio:        threads/core=1.000 cores/thread=1.000
 Info: writing to internalDir: /lustre/AEI/alibeck/simulations/test-whisky-
 openmp/output-0003/SIMFACTORY
 Info: saving substituted submitscript contents to:
 /lustre/AEI/alibeck/simulations/test-whisky-
 openmp/output-0003/SIMFACTORY/SubmitScript
 Executing submit command: qsub /lustre/AEI/alibeck/simulations/test-
 whisky-openmp/output-0003/SIMFACTORY/SubmitScript
 Submit finished, job id is 259461
 Info: Restart 2 is active
 Info: Assigned restart_id of: 0004
 Info: Found the following restart_ids: [0, 1, 2, 2, 3]
 Info: Maximum restart id determined to be: 0003
 Assigned restart id: 4
 Info: Simulation is active: presubmitting
 Info: Job allocation information:
 Info:    System:       nodes=170 cores/node=4 threads/process=4
 Info:    Requested:    nodes=4 cores=16 cores/node=4
 Info:    Run:          processes=4 threads=16 threads/process=4
 Info:    Distribution: processes/node=1 threads/node=4
 Info:    Ratio:        threads/core=1.000 cores/thread=1.000
 Info: writing to internalDir: /lustre/AEI/alibeck/simulations/test-whisky-
 openmp/output-0004/SIMFACTORY
 Info: saving substituted submitscript contents to:
 /lustre/AEI/alibeck/simulations/test-whisky-
 openmp/output-0004/SIMFACTORY/SubmitScript
 Executing submit command: qsub /lustre/AEI/alibeck/simulations/test-
 whisky-openmp/output-0004/SIMFACTORY/SubmitScript
 Submit finished, job id is 259462
 [snip]
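
 The messages above can be condensed with a short script. This is only a
 sketch for reading the log, not part of simfactory: the function name
 summarize_submissions and the LOG excerpt are hypothetical, and the
 patterns assume the three message formats exactly as they appear above.

```python
import re

# Excerpt of the messages above, reduced to the lines the script looks at.
LOG = """\
Assigned restart id: 2
Info: Simulation is inactive: submitting
Submit finished, job id is 259460
Assigned restart id: 3
Info: Simulation is active: presubmitting
Submit finished, job id is 259461
Assigned restart id: 4
Info: Simulation is active: presubmitting
Submit finished, job id is 259462
"""


def summarize_submissions(log_text):
    """Return one (restart_id, mode, job_id) tuple per reported qsub call."""
    restart_id = None
    mode = None
    result = []
    for line in log_text.splitlines():
        line = line.strip()
        m = re.match(r"Assigned restart id: (\d+)$", line)
        if m:
            restart_id = int(m.group(1))
            continue
        m = re.match(r"Info: Simulation is (?:in)?active: (\w+)$", line)
        if m:
            mode = m.group(1)
            continue
        m = re.match(r"Submit finished, job id is (\d+)$", line)
        if m:
            # A job id closes the record opened by the last restart id.
            result.append((restart_id, mode, int(m.group(1))))
    return result


submissions = summarize_submissions(LOG)
for restart_id, mode, job_id in submissions:
    print(f"restart {restart_id}: {mode}, job {job_id}")
print(f"{len(submissions)} job(s) submitted by one invocation")
```

 Run against the excerpt, this lists one "submitting" job for restart 2 and
 two "presubmitting" jobs for restarts 3 and 4, all from a single call to
 sim submit.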

 As a result, three jobs are queued:

 [snip]
 qstat
 job-ID  prior   name       user         state submit/start at     queue      slots ja-task-ID
 ---------------------------------------------------------------------------------------------
  259460 0.00000 test-whisk alibeck      qw    04/29/2011 10:26:55            16
  259461 0.00000 test-whisk alibeck      hqw   04/29/2011 10:26:56            16
  259462 0.00000 test-whisk alibeck      hqw   04/29/2011 10:26:56            16
 [snip]
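
 The state column is worth a closer look. In Grid Engine state codes, "qw"
 means queued and waiting, and a leading "h" means the job is on hold. The
 sketch below separates held jobs from the rest. The helper name job_states
 and the QSTAT_ROWS excerpt are hypothetical, and each row is assumed to be
 on a single line with the state in the fifth column. That the two held
 jobs are follow-up restarts waiting for job 259460 is an assumption based
 on the "presubmitting" messages, not something qstat shows.

```python
# Job rows from the qstat listing, one job per line (slots value last).
QSTAT_ROWS = """\
259460 0.00000 test-whisk alibeck      qw    04/29/2011 10:26:55 16
259461 0.00000 test-whisk alibeck      hqw   04/29/2011 10:26:56 16
259462 0.00000 test-whisk alibeck      hqw   04/29/2011 10:26:56 16
"""


def job_states(rows):
    """Map job id -> state string for every row that starts with a job id."""
    states = {}
    for row in rows.splitlines():
        fields = row.split()
        # Skip headers, separators and blank lines.
        if len(fields) >= 5 and fields[0].isdigit():
            states[int(fields[0])] = fields[4]
    return states


states = job_states(QSTAT_ROWS)
# An "h" in the state string marks a job that is on hold.
held = sorted(job for job, state in states.items() if "h" in state)
runnable = sorted(job for job, state in states.items() if "h" not in state)
print("eligible to run:", runnable)
print("on hold:", held)
```

 For the listing above, only job 259460 is eligible to start, and jobs
 259461 and 259462 stay on hold.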

 What is going wrong here?

-- 
Ticket URL: <https://trac.einsteintoolkit.org/ticket/420>
Einstein Toolkit <http://einsteintoolkit.org>
The Einstein Toolkit

