[Users] Simulation error

Udayaraj Khanal urkhanal at tucdp.edu.np
Mon Aug 19 10:56:32 CDT 2013


Dear All,
I submitted a simulation from the Cactus directory with:
./simfactory/bin/sim submit static_tov --parfile=par/static_tov.par --procs=32 --walltime=8:0:0 --allocation=loni_cactus08

Then ./simfactory/bin/sim list-simulations reports:

static_tov     [ACTIVE(FINISHED), restart 0000, job id 705851]

In /scratch/ettest52/simulations/static_tov/output-0000 I found a SIMFACTORY folder and two files, static_tov.err and static_tov.out, but the static_tov directory mentioned in the Tutorial for New Users is not present. The file static_tov.err contains {PBS: exec of shell failed}, and static_tov.out contains the following:
--------------------------------------
Running PBS prologue script
--------------------------------------
User and Job Data:
--------------------------------------
Job ID:    705851.qb2
Username:  ettest52
Group:     lsuusers
Date:      17-Aug-2013 13:52
Node:      qb031 (11969)
--------------------------------------
PBS has allocated the following nodes:

qb031
qb030
qb029
qb028

A total of 32 processors on 4 nodes allocated
---------------------------------------------
Check nodes and clean them of stray processes
---------------------------------------------
Checking node qb031 13:52:20 
Checking node qb030 13:52:21 
Checking node qb029 13:52:23 
Checking node qb028 13:52:25 
Done clearing all the allocated nodes
------------------------------------------------------
Concluding PBS prologue script - 17-Aug-2013 13:52:25
------------------------------------------------------
------------------------------------------------------
Running PBS epilogue script    - 17-Aug-2013 13:52:26
------------------------------------------------------
Checking node qb031 (MS)
Checking node qb028 ok
Checking node qb029 ok
Checking node qb030 ok
Checking node qb031 ok
------------------------------------------------------
Concluding PBS epilogue script - 17-Aug-2013 13:52:33
------------------------------------------------------
Exit Status:    
Job ID:          705851.qb2
Username:        ettest52
Group:           lsuusers
Job Name:        static_tov-0000
Session Id:      11968
Resource Limits: ncpus=1,nodes=4:ppn=8,walltime=08:00:00
Resources Used:  cput=00:00:00,mem=428kb,vmem=3884kb,walltime=00:00:00
Queue Used:      checkpt
Account String:  loni_cactus08
Node:            qb031
Process id:      12492
------------------------------------------------------
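
The output above indicates that the requested resources were granted but that the job ended almost immediately. A minimal Python sketch of that reading, using only values copied from static_tov.out (the variable names are illustrative and are not part of PBS or SimFactory):

```python
from datetime import datetime

# Values copied from static_tov.out above; the names are illustrative.
nodes = 4
ppn = 8                      # from "nodes=4:ppn=8"
processors_allocated = 32    # from "A total of 32 processors on 4 nodes allocated"

# The allocation is consistent: 4 nodes x 8 processors per node = 32.
allocation_consistent = nodes * ppn == processors_allocated

# Time between the end of the prologue and the start of the epilogue.
fmt = "%d-%b-%Y %H:%M:%S"
prologue_end = datetime.strptime("17-Aug-2013 13:52:25", fmt)
epilogue_start = datetime.strptime("17-Aug-2013 13:52:26", fmt)
elapsed_seconds = (epilogue_start - prologue_end).total_seconds()

print(allocation_consistent)
print(elapsed_seconds)
```

A gap of about one second, together with cput=00:00:00 and walltime=00:00:00 under Resources Used, suggests that the job script itself never started running.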

I also found a log.txt file in /scratch/ettest52/simulations/static_tov, which contains:

[LOG:2013-08-17 04:21:54] self.create(simulationName, parfile)::Creating simulation static_tov
[LOG:2013-08-17 04:21:54] self.create(simulationName, parfile)::Simulation directory: /scratch/ettest52/simulations/static_tov
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::Simulation Properties:
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::[properties]
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::machine         = queenbee
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::simulationid    = simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.08.17-04.21.54-5675
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::sourcedir       = /home/ettest52/Cactus
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::configuration   = sim
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::configid        = config-sim-qb4.loni.org-home-ettest52-Cactus
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::buildid         = build-sim-qb4.loni.org-ettest52-2013.08.17-09.01.14-32035
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::testsuite       = False
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::executable      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::optionlist      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/cfg/OptionList
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::submitscript    = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/SubmitScript
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::runscript       = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/RunScript
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::parfile         = /scratch/ettest52/simulations/static_tov/SIMFACTORY/par/static_tov.par
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::
[LOG:2013-08-17 04:21:56] self.create(simulationName, parfile)::Simulation static_tov created
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::Restart for simulation static_tov created with restart id 0, long restart id 0000
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::Prepping for submission
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::No previous walltime available to be used, using walltime 8:00:00
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::Defined substituion properties for submission
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::{'SIMULATION_ID': 'simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.08.17-04.21.54-5675', 'NODE_PROCS': 1, 'PPN_USED': 8, 'PPN': 8, 'ALLOCATION': 'loni_cactus08', 'WALLTIME_HH': '08', 'CPUFREQ': '2.33', 'USER': 'ettest52', 'RUNDIR': '/scratch/ettest52/simulations/static_tov/output-0000', 'NODES': 4, 'SIMULATION_NAME': 'static_tov', 'WALLTIME': '8:00:00', 'NUM_THREADS': 8, 'EXECUTABLE': '/scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim', 'PROCS_REQUESTED': 32, 'EMAIL': 'urkhanal at tucdp.edu.np', 'RESTART_ID': 0, 'CHAINED_JOB_ID': '', 'FROM_RESTART_COMMAND': '', 'NUM_SMT': 1, 'WALLTIME_SECONDS': 28800, 'SIMFACTORY': '/home/ettest52/Cactus/simfactory/bin/sim', 'PROCS': 32, 'SUBMITSCRIPT': '/scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript', 'WALLTIME_HOURS': 8.0, 'WALLTIME_MM': '00', 'PARFILE': '/scratch/ettest52/simulations/static_tov/output-0000/static_tov.par', 'WALLTIME_SS': '00', 'QUEUE': 'checkpt', 'CONFIGURATION': 'sim', 'SOURCEDIR': '/home/ettest52/Cactus', 'HOSTNAME': 'qb3.loni.org', 'NUM_PROCS': 4, 'SCRIPTFILE': '/scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript', 'MEMORY': '8192', 'WALLTIME_MINUTES': 480, 'SHORT_SIMULATION_NAME': 'static_tov-0000'}
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::self.Properties: /scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/properties.ini
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::[properties]
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::machine         = queenbee
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::simulationid    = simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.08.17-04.21.54-5675
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::sourcedir       = /home/ettest52/Cactus
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::configuration   = sim
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::configid        = config-sim-qb4.loni.org-home-ettest52-Cactus
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::buildid         = build-sim-qb4.loni.org-ettest52-2013.08.17-09.01.14-32035
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::testsuite       = False
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::executable      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::optionlist      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/cfg/OptionList
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::submitscript    = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/SubmitScript
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::runscript       = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/RunScript
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::parfile         = /scratch/ettest52/simulations/static_tov/SIMFACTORY/par/static_tov.par
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::chainedjobid    = -1
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::ppn             = 8
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::procsrequested  = 32
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::allocation      = loni_cactus08
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::user            = ettest52
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::numsmt          = 1
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::walltime        = 8:00:00
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::numprocs        = 4
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::nodeprocs       = 1
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::numthreads      = 8
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::hostname        = qb3.loni.org
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::ppnused         = 8
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::queue           = checkpt
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::cpufreq         = 2.33
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::procs           = 32
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::memory          = 8192
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::nodes           = 4
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::pbsSimulationName= static_tov-0000
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::saving substituted submitscript contents to: /scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::Executing submission command: qsub /scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript
[LOG:2013-08-17 04:21:56] self.makeActive()::Simulation static_tov with restart-id 0 has been made active
[LOG:2013-08-17 04:21:56] job_id = self.extractJobId(output)::received raw output: 705851.qb2
[LOG:2013-08-17 04:21:56] job_id = self.extractJobId(output)::
[LOG:2013-08-17 04:21:56] job_id = self.extractJobId(output)::using submitRegex: (\d+)[.]qb2
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::After searching raw output, it was determined that the job_id is: 705851
[LOG:2013-08-17 04:21:56] self.submit(submitScript)::Simulation static_tov, with restart id 0, and job id 705851 has been submitted
[LOG:2013-08-17 04:23:24] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 0, long restart id 0000
[LOG:2013-08-17 11:46:32] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 0, long restart id 0000
[LOG:2013-08-18 10:18:50] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 0, long restart id 0000
[LOG:2013-08-18 11:40:56] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 0, long restart id 0000
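
The submission log itself appears internally consistent. As a minimal sketch (plain Python with values copied from the log; this is not SimFactory's own code, and the variable names are illustrative), the job ID extraction and the process layout can be re-checked like this:

```python
import re

# Raw qsub output and regular expression, both copied from the log above.
raw_output = "705851.qb2"
submit_regex = r"(\d+)[.]qb2"

match = re.search(submit_regex, raw_output)
job_id = match.group(1) if match else None

# Selected substitution properties copied from the log above.
props = {
    "PROCS": 32,
    "PPN": 8,
    "NODES": 4,
    "NUM_THREADS": 8,
    "NUM_PROCS": 4,
    "WALLTIME": "8:00:00",
    "WALLTIME_SECONDS": 28800,
}

# Convert H:MM:SS to seconds.
hours, minutes, seconds = (int(part) for part in props["WALLTIME"].split(":"))
walltime_seconds = hours * 3600 + minutes * 60 + seconds

# 4 nodes x 8 cores, and 4 processes x 8 threads, should both give 32.
layout_consistent = (
    props["NODES"] * props["PPN"] == props["PROCS"]
    and props["NUM_PROCS"] * props["NUM_THREADS"] == props["PROCS"]
    and walltime_seconds == props["WALLTIME_SECONDS"]
)

print(job_id, layout_consistent)
```

Since these values agree with each other and with the PBS output, the submission parameters do not obviously explain the failure.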

How can I get the simulation to run to completion?




Udayaraj Khanal