[Users] simulation problem

Udayaraj Khanal urkhanal at tucdp.edu.np
Sat Aug 3 05:37:27 CDT 2013


Dear All,

I have submitted two simulations. In the Cactus directory, ./simfactory/bin/sim list-simulations gives

ks-mclachlan     [ACTIVE (FINISHED), restart 0000, job id 702161]
static_tov           [ACTIVE (FINISHED), restart 0001, job id 701299]

In /scratch/ettest52/simulations three directories, CACHE, ks-mclachlan and static_tov, are present.
Inside static_tov there are four directories (output-0000, output-0001, output-0001-active, SIMFACTORY) and one file, log.txt, which contains the following details.

[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::Creating simulation static_tov
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::Simulation directory: /scratch/ettest52/simulations/static_tov
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::Simulation Properties:
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::[properties]
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::machine         = queenbee
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::simulationid    = simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.07.21-11.10.48-15714
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::sourcedir       = /home/ettest52/Cactus
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::configuration   = sim
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::configid        = config-sim-qb4.loni.org-home-ettest52-Cactus
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::buildid         = build-sim-qb4.loni.org-ettest52-2013.07.18-07.25.24-12102
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::testsuite       = False
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::executable      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::optionlist      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/cfg/OptionList
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::submitscript    = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/SubmitScript
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::runscript       = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/RunScript
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::parfile         = /scratch/ettest52/simulations/static_tov/SIMFACTORY/par/static_tov.par
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::
[LOG:2013-07-21 11:10:48] self.create(simulationName, parfile)::Simulation static_tov created
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::Restart for simulation static_tov created with restart id 0, long restart id 0000
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::Prepping for submission
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::No previous walltime available to be used, using walltime 8:00:00
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::Defined substituion properties for submission
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::{'SIMULATION_ID': 'simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.07.21-11.10.48-15714', 'NODE_PROCS': 1, 'PPN_USED': 8, 'PPN': 8, 'ALLOCATION': 'NO_ALLOCATION', 'WALLTIME_HH': '08', 'CPUFREQ': '2.33', 'USER': 'ettest52', 'RUNDIR': '/scratch/ettest52/simulations/static_tov/output-0000', 'NODES': 4, 'SIMULATION_NAME': 'static_tov', 'WALLTIME': '8:00:00', 'NUM_THREADS': 8, 'EXECUTABLE': '/scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim', 'PROCS_REQUESTED': 32, 'EMAIL': 'urkhanal at tucdp.edu.np', 'RESTART_ID': 0, 'CHAINED_JOB_ID': '', 'FROM_RESTART_COMMAND': '', 'NUM_SMT': 1, 'WALLTIME_SECONDS': 28800, 'SIMFACTORY': '/home/ettest52/Cactus/simfactory/bin/sim', 'PROCS': 32, 'SUBMITSCRIPT': '/scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript', 'WALLTIME_HOURS': 8.0, 'WALLTIME_MM': '00', 'PARFILE': '/scratch/ettest52/simulations/static_tov/output-0000/static_tov.par', 'WALLTIME_SS': '00', 'QUEUE': 'checkpt', 'CONFIGURATION': 'sim', 'SOURCEDIR': '/home/ettest52/Cactus', 'HOSTNAME': 'qb3.loni.org', 'NUM_PROCS': 4, 'SCRIPTFILE': '/scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript', 'MEMORY': '8192', 'WALLTIME_MINUTES': 480, 'SHORT_SIMULATION_NAME': 'static_tov-0000'}
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::self.Properties: /scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/properties.ini
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::[properties]
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::machine         = queenbee
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::simulationid    = simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.07.21-11.10.48-15714
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::sourcedir       = /home/ettest52/Cactus
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::configuration   = sim
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::configid        = config-sim-qb4.loni.org-home-ettest52-Cactus
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::buildid         = build-sim-qb4.loni.org-ettest52-2013.07.18-07.25.24-12102
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::testsuite       = False
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::executable      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::optionlist      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/cfg/OptionList
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::submitscript    = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/SubmitScript
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::runscript       = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/RunScript
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::parfile         = /scratch/ettest52/simulations/static_tov/SIMFACTORY/par/static_tov.par
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::chainedjobid    = -1
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::ppn             = 8
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::procsrequested  = 32
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::allocation      = NO_ALLOCATION
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::user            = ettest52
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::numsmt          = 1
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::walltime        = 8:00:00
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::numprocs        = 4
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::nodeprocs       = 1
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::numthreads      = 8
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::hostname        = qb3.loni.org
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::ppnused         = 8
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::queue           = checkpt
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::cpufreq         = 2.33
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::procs           = 32
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::memory          = 8192
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::nodes           = 4
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::pbsSimulationName= static_tov-0000
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::saving substituted submitscript contents to: /scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript
[LOG:2013-07-21 11:10:48] self.submit(submitScript)::Executing submission command: qsub /scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript
[LOG:2013-07-21 11:10:48] self.makeActive()::Simulation static_tov with restart-id 0 has been made active
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::received raw output: Invalid allocation "NO_ALLOCATION".
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::===================== Allocation information for ettest52 =====================
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::   Proj. Name|        Alloc|  Balance| Deposited|    %Used| Days Left|       End
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::--------------------------------------------------------------------------------
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::loni_cactus08|loni_cactus08 on @Dell_Cluster|106210.43| 250000.00|    57.52|       164|2014-01-01
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::Note: Balance and Deposit are measured in CPU-hours
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::qsub: submit filter returned an error code, aborting job submission.
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::
[LOG:2013-07-21 11:10:49] job_id = self.extractJobId(output)::using submitRegex: (\d+)[.]qb2
[LOG:2013-07-21 11:10:49] self.submit(submitScript)::After searching raw output, it was determined that the job_id is: -1
[LOG:2013-07-21 11:10:49] self.submit(submitScript)::The regex did NOT match anything. No job_id means no control.
[LOG:2013-07-21 11:11:26] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 0, long restart id 0000
[LOG:2013-07-22 10:23:47] restart.load(simulationName, active_id)::For simulation static_tov, loaded restart id 0, long restart id 0000
[LOG:2013-07-22 10:23:47] restart.load(simulationName, active_id)::For simulation static_tov, loaded restart id 0, long restart id 0000
[LOG:2013-07-22 10:23:47] restart.finish()::For simulation static_tov, Finishing restart 0000
[LOG:2013-07-22 10:23:47] restart.finish()::Force option: False
[LOG:2013-07-22 10:23:47] restart.finish()::Job ID: -1, Job Status: U
[LOG:2013-07-22 10:23:47] restart.finish()::Cleaning up simulation static_tov, restart 0, with job_status U
[LOG:2013-07-22 10:23:47] restart.finish()::Simulation static_tov, restart 0, with job id -1 has been successfully cleaned up
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::Restart for simulation static_tov created with restart id 1, long restart id 0001
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::Prepping for submission
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::No previous walltime available to be used, using walltime 8:00:00
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::Defined substituion properties for submission
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::{'SIMULATION_ID': 'simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.07.21-11.10.48-15714', 'NODE_PROCS': 1, 'PPN_USED': 8, 'PPN': 8, 'ALLOCATION': 'loni_cactus08', 'WALLTIME_HH': '08', 'CPUFREQ': '2.33', 'USER': 'ettest52', 'RUNDIR': '/scratch/ettest52/simulations/static_tov/output-0001', 'NODES': 4, 'SIMULATION_NAME': 'static_tov', 'WALLTIME': '8:00:00', 'NUM_THREADS': 8, 'EXECUTABLE': '/scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim', 'PROCS_REQUESTED': 32, 'EMAIL': 'urkhanal at tucdp.edu.np', 'RESTART_ID': 1, 'CHAINED_JOB_ID': '', 'FROM_RESTART_COMMAND': '', 'NUM_SMT': 1, 'WALLTIME_SECONDS': 28800, 'SIMFACTORY': '/home/ettest52/Cactus/simfactory/bin/sim', 'PROCS': 32, 'SUBMITSCRIPT': '/scratch/ettest52/simulations/static_tov/output-0001/SIMFACTORY/SubmitScript', 'WALLTIME_HOURS': 8.0, 'WALLTIME_MM': '00', 'PARFILE': '/scratch/ettest52/simulations/static_tov/output-0001/static_tov.par', 'WALLTIME_SS': '00', 'QUEUE': 'checkpt', 'CONFIGURATION': 'sim', 'SOURCEDIR': '/home/ettest52/Cactus', 'HOSTNAME': 'qb3.loni.org', 'NUM_PROCS': 4, 'SCRIPTFILE': '/scratch/ettest52/simulations/static_tov/output-0001/SIMFACTORY/SubmitScript', 'MEMORY': '8192', 'WALLTIME_MINUTES': 480, 'SHORT_SIMULATION_NAME': 'static_tov-0001'}
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::self.Properties: /scratch/ettest52/simulations/static_tov/output-0001/SIMFACTORY/properties.ini
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::[properties]
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::machine         = queenbee
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::simulationid    = simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.07.21-11.10.48-15714
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::sourcedir       = /home/ettest52/Cactus
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::configuration   = sim
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::configid        = config-sim-qb4.loni.org-home-ettest52-Cactus
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::buildid         = build-sim-qb4.loni.org-ettest52-2013.07.18-07.25.24-12102
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::testsuite       = False
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::executable      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::optionlist      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/cfg/OptionList
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::submitscript    = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/SubmitScript
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::runscript       = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/RunScript
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::parfile         = /scratch/ettest52/simulations/static_tov/SIMFACTORY/par/static_tov.par
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::chainedjobid    = -1
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::ppn             = 8
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::procsrequested  = 32
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::allocation      = loni_cactus08
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::user            = ettest52
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::numsmt          = 1
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::walltime        = 8:00:00
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::numprocs        = 4
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::nodeprocs       = 1
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::numthreads      = 8
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::hostname        = qb3.loni.org
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::ppnused         = 8
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::queue           = checkpt
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::cpufreq         = 2.33
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::procs           = 32
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::memory          = 8192
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::nodes           = 4
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::pbsSimulationName= static_tov-0001
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::saving substituted submitscript contents to: /scratch/ettest52/simulations/static_tov/output-0001/SIMFACTORY/SubmitScript
[LOG:2013-07-22 10:23:47] self.submit(submitScript)::Executing submission command: qsub /scratch/ettest52/simulations/static_tov/output-0001/SIMFACTORY/SubmitScript
[LOG:2013-07-22 10:23:47] self.makeActive()::Simulation static_tov with restart-id 1 has been made active
[LOG:2013-07-22 10:23:48] job_id = self.extractJobId(output)::received raw output: 701299.qb2
[LOG:2013-07-22 10:23:48] job_id = self.extractJobId(output)::
[LOG:2013-07-22 10:23:48] job_id = self.extractJobId(output)::using submitRegex: (\d+)[.]qb2
[LOG:2013-07-22 10:23:48] self.submit(submitScript)::After searching raw output, it was determined that the job_id is: 701299
[LOG:2013-07-22 10:23:48] self.submit(submitScript)::Simulation static_tov, with restart id 1, and job id 701299 has been submitted
[LOG:2013-07-22 10:26:43] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 1, long restart id 0001
[LOG:2013-07-23 00:12:00] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 1, long restart id 0001
[LOG:2013-07-23 10:29:21] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 1, long restart id 0001
[LOG:2013-07-27 10:12:50] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 1, long restart id 0001
[LOG:2013-07-27 10:25:09] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 1, long restart id 0001
[LOG:2013-07-31 00:59:46] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 1, long restart id 0001
[LOG:2013-08-01 10:58:25] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 1, long restart id 0001
[LOG:2013-08-03 04:34:24] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 1, long restart id 0001
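
The job-id step recorded in this log can be reproduced in isolation. The following is only a minimal Python sketch, not SimFactory's actual code: the function name extract_job_id and the convention of returning -1 on no match are assumptions modelled on the log messages above. It applies the logged submitRegex, (\d+)[.]qb2, to the two raw qsub outputs shown in the log.

```python
import re

# Regular expression copied from the log line "using submitRegex: (\d+)[.]qb2".
SUBMIT_REGEX = r"(\d+)[.]qb2"


def extract_job_id(raw_output: str) -> int:
    """Hypothetical helper: return the job id found in qsub's output, or -1."""
    match = re.search(SUBMIT_REGEX, raw_output)
    return int(match.group(1)) if match else -1


# Raw output of the first submission (restart 0000), abbreviated to two lines.
failed_output = (
    'Invalid allocation "NO_ALLOCATION".\n'
    "qsub: submit filter returned an error code, aborting job submission."
)
# Raw output of the second submission (restart 0001).
accepted_output = "701299.qb2"

print(extract_job_id(failed_output))
print(extract_job_id(accepted_output))
```

Under these assumptions the first submission yields -1 because qsub rejected the allocation NO_ALLOCATION and printed no job id, while the second yields 701299, which agrees with the two "it was determined that the job_id is" entries in the log.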


The directory output-0000 contains only SIMFACTORY, but output-0001 contains the SIMFACTORY directory and two files, static_tov.err and static_tov.out. static_tov, which is required, is not present. static_tov.err contains (PBS: exec of shell " failed) and static_tov.out contains the following details.

--------------------------------------
Running PBS prologue script
--------------------------------------
User and Job Data:
--------------------------------------
Job ID:    701299.qb2
Username:  ettest52
Group:     lsuusers
Date:      22-Jul-2013 10:23
Node:      qb223 (19932)
--------------------------------------
PBS has allocated the following nodes:

qb223 qb218 qb217 qb216

A total of 32 processors on 4 nodes allocated
---------------------------------------------
Check nodes and clean them of stray processes
---------------------------------------------
Checking node qb223 10:23:55
Checking node qb218 10:23:57
Checking node qb217 10:23:59
Checking node qb216 10:24:01
Done clearing all the allocated nodes
------------------------------------------------------
Concluding PBS prologue script - 22-Jul-2013 10:24:01
------------------------------------------------------
------------------------------------------------------
Running PBS epilogue script    - 22-Jul-2013 10:24:01
------------------------------------------------------
Checking node qb223 (MS)
Checking node qb216 ok
Checking node qb217 ok
Checking node qb218 ok
Checking node qb223 ok
------------------------------------------------------
Concluding PBS epilogue script - 22-Jul-2013 10:24:08
------------------------------------------------------
Exit Status:
Job ID:          701299.qb2
Username:        ettest52
Group:           lsuusers
Job Name:        static_tov-0001
Session Id:      19931
Resource Limits: ncpus=1,nodes=4:ppn=8,walltime=08:00:00
Resources Used:  cput=00:00:00,mem=428kb,vmem=1416kb,walltime=00:00:01
Queue Used:      checkpt
Account String:  loni_cactus08
Node:            qb223
Process id:      20454
------------------------------------------------------

Similarly, the directory ks-mclachlan contains three directories (output-0000, output-0000-active and SIMFACTORY) and a log.txt file, which contains the following details.

[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::Creating simulation ks-mclachlan
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::Simulation directory: /scratch/ettest52/simulations/ks-mclachlan
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::Simulation Properties:
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::[properties]
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::machine         = queenbee
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::simulationid    = simulation-ks-mclachlan-queenbee-qb3.loni.org-ettest52-2013.07.27-10.19.03-7922
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::sourcedir       = /home/ettest52/Cactus
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::configuration   = sim
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::configid        = config-sim-qb4.loni.org-home-ettest52-Cactus
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::buildid         = build-sim-qb4.loni.org-ettest52-2013.07.18-07.25.24-12102
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::testsuite       = False
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::executable      = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/exe/cactus_sim
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::optionlist      = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/cfg/OptionList
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::submitscript    = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/run/SubmitScript
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::runscript       = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/run/RunScript
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::parfile         = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/par/ks-mclachlan.par
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::
[LOG:2013-07-27 10:19:03] self.create(simulationName, parfile)::Simulation ks-mclachlan created
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::Restart for simulation ks-mclachlan created with restart id 0, long restart id 0000
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::Prepping for submission
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::No previous walltime available to be used, using walltime 8:00:00
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::Defined substituion properties for submission
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::{'SIMULATION_ID': 'simulation-ks-mclachlan-queenbee-qb3.loni.org-ettest52-2013.07.27-10.19.03-7922', 'NODE_PROCS': 1, 'PPN_USED': 8, 'PPN': 8, 'ALLOCATION': 'loni_cactus08', 'WALLTIME_HH': '08', 'CPUFREQ': '2.33', 'USER': 'ettest52', 'RUNDIR': '/scratch/ettest52/simulations/ks-mclachlan/output-0000', 'NODES': 4, 'SIMULATION_NAME': 'ks-mclachlan', 'WALLTIME': '8:00:00', 'NUM_THREADS': 8, 'EXECUTABLE': '/scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/exe/cactus_sim', 'PROCS_REQUESTED': 32, 'EMAIL': 'urkhanal at tucdp.edu.np', 'RESTART_ID': 0, 'CHAINED_JOB_ID': '', 'FROM_RESTART_COMMAND': '', 'NUM_SMT': 1, 'WALLTIME_SECONDS': 28800, 'SIMFACTORY': '/home/ettest52/Cactus/simfactory/bin/sim', 'PROCS': 32, 'SUBMITSCRIPT': '/scratch/ettest52/simulations/ks-mclachlan/output-0000/SIMFACTORY/SubmitScript', 'WALLTIME_HOURS': 8.0, 'WALLTIME_MM': '00', 'PARFILE': '/scratch/ettest52/simulations/ks-mclachlan/output-0000/ks-mclachlan.par', 'WALLTIME_SS': '00', 'QUEUE': 'checkpt', 'CONFIGURATION': 'sim', 'SOURCEDIR': '/home/ettest52/Cactus', 'HOSTNAME': 'qb3.loni.org', 'NUM_PROCS': 4, 'SCRIPTFILE': '/scratch/ettest52/simulations/ks-mclachlan/output-0000/SIMFACTORY/SubmitScript', 'MEMORY': '8192', 'WALLTIME_MINUTES': 480, 'SHORT_SIMULATION_NAME': 'ks-mclachlan-00'}
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::self.Properties: /scratch/ettest52/simulations/ks-mclachlan/output-0000/SIMFACTORY/properties.ini
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::[properties]
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::machine         = queenbee
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::simulationid    = simulation-ks-mclachlan-queenbee-qb3.loni.org-ettest52-2013.07.27-10.19.03-7922
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::sourcedir       = /home/ettest52/Cactus
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::configuration   = sim
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::configid        = config-sim-qb4.loni.org-home-ettest52-Cactus
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::buildid         = build-sim-qb4.loni.org-ettest52-2013.07.18-07.25.24-12102
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::testsuite       = False
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::executable      = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/exe/cactus_sim
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::optionlist      = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/cfg/OptionList
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::submitscript    = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/run/SubmitScript
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::runscript       = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/run/RunScript
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::parfile         = /scratch/ettest52/simulations/ks-mclachlan/SIMFACTORY/par/ks-mclachlan.par
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::chainedjobid    = -1
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::ppn             = 8
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::procsrequested  = 32
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::allocation      = loni_cactus08
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::user            = ettest52
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::numsmt          = 1
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::walltime        = 8:00:00
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::numprocs        = 4
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::nodeprocs       = 1
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::numthreads      = 8
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::hostname        = qb3.loni.org
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::ppnused         = 8
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::queue           = checkpt
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::cpufreq         = 2.33
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::procs           = 32
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::memory          = 8192
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::nodes           = 4
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::pbsSimulationName= ks-mclachlan-00
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::saving substituted submitscript contents to: /scratch/ettest52/simulations/ks-mclachlan/output-0000/SIMFACTORY/SubmitScript
[LOG:2013-07-27 10:19:03] self.submit(submitScript)::Executing submission command: qsub /scratch/ettest52/simulations/ks-mclachlan/output-0000/SIMFACTORY/SubmitScript
[LOG:2013-07-27 10:19:03] self.makeActive()::Simulation ks-mclachlan with restart-id 0 has been made active
[LOG:2013-07-27 10:19:04] job_id = self.extractJobId(output)::received raw output: 702161.qb2
[LOG:2013-07-27 10:19:04] job_id = self.extractJobId(output)::
[LOG:2013-07-27 10:19:04] job_id = self.extractJobId(output)::using submitRegex: (\d+)[.]qb2
[LOG:2013-07-27 10:19:04] self.submit(submitScript)::After searching raw output, it was determined that the job_id is: 702161
[LOG:2013-07-27 10:19:04] self.submit(submitScript)::Simulation ks-mclachlan, with restart id 0, and job id 702161 has been submitted
[LOG:2013-07-27 10:25:09] ret = restart.load(sim, activeId)::For simulation ks-mclachlan, loaded restart id 0, long restart id 0000
[LOG:2013-07-31 00:59:45] ret = restart.load(sim, activeId)::For simulation ks-mclachlan, loaded restart id 0, long restart id 0000
[LOG:2013-08-01 10:58:25] ret = restart.load(sim, activeId)::For simulation ks-mclachlan, loaded restart id 0, long restart id 0000
[LOG:2013-08-03 04:34:24] ret = restart.load(sim, activeId)::For simulation ks-mclachlan, loaded restart id 0, long restart id 0000


Inside output-0000 a directory SIMFACTORY and two files, ks-mclachlan.err and ks-mclachlan.out, are present. ks-mclachlan.err contains (PBS: exec of shell " failed) and ks-mclachlan.out contains the following:

--------------------------------------
Running PBS prologue script
--------------------------------------
User and Job Data:
--------------------------------------
Job ID:    702161.qb2
Username:  ettest52
Group:     lsuusers
Date:      28-Jul-2013 01:20
Node:      qb215 (24943)
--------------------------------------
PBS has allocated the following nodes:

qb215 qb184 qb151 qb130

A total of 32 processors on 4 nodes allocated
---------------------------------------------
Check nodes and clean them of stray processes
---------------------------------------------
Checking node qb215 01:20:30
Checking node qb184 01:20:31
Checking node qb151 01:20:33
Checking node qb130 01:20:35
Done clearing all the allocated nodes
------------------------------------------------------
Concluding PBS prologue script - 28-Jul-2013 01:20:35
------------------------------------------------------
------------------------------------------------------
Running PBS epilogue script    - 28-Jul-2013 01:20:35
------------------------------------------------------
Checking node qb215 (MS)
Checking node qb130 ok
Checking node qb151 ok
Checking node qb184 ok
Checking node qb215 ok
------------------------------------------------------
Concluding PBS epilogue script - 28-Jul-2013 01:20:43
------------------------------------------------------
Exit Status:
Job ID:          702161.qb2
Username:        ettest52
Group:           lsuusers
Job Name:        ks-mclachlan-00
Session Id:      24942
Resource Limits: ncpus=1,nodes=4:ppn=8,walltime=08:00:00
Resources Used:  cput=00:00:00,mem=428kb,vmem=3884kb,walltime=00:00:00
Queue Used:      checkpt
Account String:  loni_cactus08
Node:            qb215
Process id:      25465
------------------------------------------------------
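
Both PBS epilogues report a used walltime of at most one second against a limit of 08:00:00, which suggests that the jobs ended almost immediately after starting. The following minimal Python sketch reads the walltime field from the "Resource Limits" and "Resources Used" lines quoted above. The helper name walltime_seconds is hypothetical and is not part of PBS or SimFactory.

```python
import re


def walltime_seconds(line: str) -> int:
    """Hypothetical helper: convert the walltime=HH:MM:SS field of a line to seconds."""
    match = re.search(r"walltime=(\d+):(\d+):(\d+)", line)
    if match is None:
        raise ValueError("no walltime field found")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Lines copied from the two .out files quoted in this message.
limit = "Resource Limits: ncpus=1,nodes=4:ppn=8,walltime=08:00:00"
used_static_tov = "Resources Used:  cput=00:00:00,mem=428kb,vmem=1416kb,walltime=00:00:01"
used_ks_mclachlan = "Resources Used:  cput=00:00:00,mem=428kb,vmem=3884kb,walltime=00:00:00"

print(walltime_seconds(limit))
print(walltime_seconds(used_static_tov))
print(walltime_seconds(used_ks_mclachlan))
```

The limit works out to 28800 seconds, the same value as WALLTIME_SECONDS in log.txt, while the two jobs used 1 and 0 seconds respectively.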

The directory output-0000-active also contains the same details. I am wondering where I made a mistake.
Udayaraj Khanal
 		 	   		  