[Users] simulation problem

Udayaraj Khanal urkhanal at tucdp.edu.np
Wed Aug 7 03:36:44 CDT 2013


Dear All,

I submitted the static_tov simulation from inside the Cactus directory with

./simfactory/bin/sim submit static_tov --parfile=par/static_tov.par --procs=32 --walltime=8:0:0 --allocation=loni_cactus08

Inside the Cactus directory, ./simfactory/bin/sim list-simulations gives

static_tov                      [ACTIVE(FINISHED), restart 0000, job id 703467]

But in /scratch/ettest52/simulations/static_tov I found a log.txt file along with the output-0000, output-0000-active and SIMFACTORY directories. The log.txt file contains the following:

[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::Creating simulation static_tov
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::Simulation directory: /scratch/ettest52/simulations/static_tov
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::Simulation Properties:
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::[properties]
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::machine         = queenbee
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::simulationid    = simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.08.04-11.09.54-31394
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::sourcedir       = /home/ettest52/Cactus
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::configuration   = sim
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::configid        = config-sim-qb4.loni.org-home-ettest52-Cactus
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::buildid         = build-sim-qb4.loni.org-ettest52-2013.07.18-07.25.24-12102
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::testsuite       = False
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::executable      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::optionlist      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/cfg/OptionList
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::submitscript    = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/SubmitScript
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::runscript       = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/RunScript
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::parfile         = /scratch/ettest52/simulations/static_tov/SIMFACTORY/par/static_tov.par
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::
[LOG:2013-08-04 11:09:54] self.create(simulationName, parfile)::Simulation static_tov created
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::Restart for simulation static_tov created with restart id 0, long restart id 0000
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::Prepping for submission
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::No previous walltime available to be used, using walltime 8:00:00
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::Defined substituion properties for submission
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::{'SIMULATION_ID': 'simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.08.04-11.09.54-31394', 'NODE_PROCS': 1, 'PPN_USED': 8, 'PPN': 8, 'ALLOCATION': 'loni_cactus08', 'WALLTIME_HH': '08', 'CPUFREQ': '2.33', 'USER': 'ettest52', 'RUNDIR': '/scratch/ettest52/simulations/static_tov/output-0000', 'NODES': 4, 'SIMULATION_NAME': 'static_tov', 'WALLTIME': '8:00:00', 'NUM_THREADS': 8, 'EXECUTABLE': '/scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim', 'PROCS_REQUESTED': 32, 'EMAIL': 'urkhanal at tucdp.edu.np', 'RESTART_ID': 0, 'CHAINED_JOB_ID': '', 'FROM_RESTART_COMMAND': '', 'NUM_SMT': 1, 'WALLTIME_SECONDS': 28800, 'SIMFACTORY': '/home/ettest52/Cactus/simfactory/bin/sim', 'PROCS': 32, 'SUBMITSCRIPT': '/scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript', 'WALLTIME_HOURS': 8.0, 'WALLTIME_MM': '00', 'PARFILE': '/scratch/ettest52/simulations/static_tov/output-0000/static_tov.par', 'WALLTIME_SS': '00', 'QUEUE': 'checkpt', 'CONFIGURATION': 'sim', 'SOURCEDIR': '/home/ettest52/Cactus', 'HOSTNAME': 'qb3.loni.org', 'NUM_PROCS': 4, 'SCRIPTFILE': '/scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript', 'MEMORY': '8192', 'WALLTIME_MINUTES': 480, 'SHORT_SIMULATION_NAME': 'static_tov-0000'}
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::self.Properties: /scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/properties.ini
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::[properties]
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::machine         = queenbee
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::simulationid    = simulation-static_tov-queenbee-qb3.loni.org-ettest52-2013.08.04-11.09.54-31394
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::sourcedir       = /home/ettest52/Cactus
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::configuration   = sim
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::configid        = config-sim-qb4.loni.org-home-ettest52-Cactus
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::buildid         = build-sim-qb4.loni.org-ettest52-2013.07.18-07.25.24-12102
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::testsuite       = False
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::executable      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/exe/cactus_sim
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::optionlist      = /scratch/ettest52/simulations/static_tov/SIMFACTORY/cfg/OptionList
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::submitscript    = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/SubmitScript
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::runscript       = /scratch/ettest52/simulations/static_tov/SIMFACTORY/run/RunScript
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::parfile         = /scratch/ettest52/simulations/static_tov/SIMFACTORY/par/static_tov.par
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::chainedjobid    = -1
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::ppn             = 8
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::procsrequested  = 32
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::allocation      = loni_cactus08
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::user            = ettest52
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::numsmt          = 1
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::walltime        = 8:00:00
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::numprocs        = 4
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::nodeprocs       = 1
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::numthreads      = 8
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::hostname        = qb3.loni.org
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::ppnused         = 8
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::queue           = checkpt
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::cpufreq         = 2.33
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::procs           = 32
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::memory          = 8192
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::nodes           = 4
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::pbsSimulationName= static_tov-0000
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::saving substituted submitscript contents to: /scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::Executing submission command: qsub /scratch/ettest52/simulations/static_tov/output-0000/SIMFACTORY/SubmitScript
[LOG:2013-08-04 11:09:54] self.makeActive()::Simulation static_tov with restart-id 0 has been made active
[LOG:2013-08-04 11:09:54] job_id = self.extractJobId(output)::received raw output: 703467.qb2
[LOG:2013-08-04 11:09:54] job_id = self.extractJobId(output)::
[LOG:2013-08-04 11:09:54] job_id = self.extractJobId(output)::using submitRegex: (\d+)[.]qb2
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::After searching raw output, it was determined that the job_id is: 703467
[LOG:2013-08-04 11:09:54] self.submit(submitScript)::Simulation static_tov, with restart id 0, and job id 703467 has been submitted
[LOG:2013-08-04 11:11:02] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 0, long restart id 0000
[LOG:2013-08-07 02:43:53] ret = restart.load(sim, activeId)::For simulation static_tov, loaded restart id 0, long restart id 0000


Inside output-0000 there is no static_tov directory; it contains a SIMFACTORY directory and the files static_tov.err and static_tov.out. static_tov.err contains [PBS: exec of shell '' failed], and static_tov.out contains the following:

--------------------------------------
Running PBS prologue script
--------------------------------------
User and Job Data:
--------------------------------------
Job ID:    703467.qb2
Username:  ettest52
Group:     lsuusers
Date:      04-Aug-2013 11:10
Node:      qb453 (6947)
--------------------------------------
PBS has allocated the following nodes:

qb453 qb452 qb451 qb450

A total of 32 processors on 4 nodes allocated
---------------------------------------------
Check nodes and clean them of stray processes
---------------------------------------------
Checking node qb453 11:10:06
Checking node qb452 11:10:08
Checking node qb451 11:10:09
Checking node qb450 11:10:11
Done clearing all the allocated nodes
------------------------------------------------------
Concluding PBS prologue script - 04-Aug-2013 11:10:11
------------------------------------------------------
------------------------------------------------------
Running PBS epilogue script    - 04-Aug-2013 11:10:12
------------------------------------------------------
Checking node qb453 (MS)
Checking node qb450 ok
Checking node qb451 ok
Checking node qb452 ok
Checking node qb453 ok
------------------------------------------------------
Concluding PBS epilogue script - 04-Aug-2013 11:10:19
------------------------------------------------------
Exit Status:
Job ID:          703467.qb2
Username:        ettest52
Group:           lsuusers
Job Name:        static_tov-0000
Session Id:      6946
Resource Limits: ncpus=1,nodes=4:ppn=8,walltime=08:00:00
Resources Used:  cput=00:00:00,mem=112kb,vmem=1416kb,walltime=00:00:00
Queue Used:      checkpt
Account String:  loni_cactus08
Node:            qb453
Process id:      7476
------------------------------------------------------

How can I solve this problem?

Udayaraj Khanal
 		 	   		  

