[ET Trac] [Einstein Toolkit] #645: Error concerning missing -V option when submitting job on LoneStar
Einstein Toolkit
trac-noreply at einsteintoolkit.org
Sat Oct 22 12:24:29 CDT 2011
#645: Error concerning missing -V option when submitting job on LoneStar
------------------------+---------------------------------------------------
Reporter: hinder | Owner: eschnett
Type: defect | Status: new
Priority: major | Milestone: ET_2011_10
Component: SimFactory | Version:
Keywords: |
------------------------+---------------------------------------------------
I used the following command to submit an ET testsuite job on LoneStar.
The intention is to run on 1 process with 6 threads.
  sim --remote lonestar create-submit maxwell_1proc_2 --testsuite --procs 6 --num-threads 6 --walltime 4:00:00 --ppn-used 6
The submission failed with an unexpected error from the queuing system
(see the log excerpt below). SimFactory did not report an error or return
a nonzero exit code, even though it was unable to determine a job ID. I
repeated the submission and the second time it worked, so the fault
appears to be intermittent. The two submit scripts were identical apart
from the job name.
The final error message from the log file is:
{{{
[LOG:2011-10-22 11:29:05] self.submit(submitScript)::Executing submission command: qsub /scratch/00915/hinder/simulations/maxwell_1proc/output-0000/SIMFACTORY/SubmitScript
[LOG:2011-10-22 11:29:05] self.makeActive()::Simulation maxwell_1proc with restart-id 0 has been made active
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::received raw output: Unable to run job: JSV rejected job.
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::Exiting.
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::-----------------------------------------------------------------
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::-- Welcome to the Lonestar4 Westmere/QDR IB Linux Cluster --
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::-----------------------------------------------------------------
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::--> Checking that you specified -V...
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::--------------------------> Rejecting job <--------------------------
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::-V is now a required option. Please specify it in your submit script.
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::---------------------------------------------------------------------
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::
[LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::using submitRegex: Your job (\d+) \(.*?\) has been submitted
[LOG:2011-10-22 11:29:06] self.submit(submitScript)::After searching raw output, it was determined that the job_id is: -1
[LOG:2011-10-22 11:29:06] self.submit(submitScript)::If this is -1, that means the regex did NOT match anything. No job_id means no control.
}}}
Full log.txt file is attached. The job was not submitted.
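To illustrate the behaviour one would expect when the job ID cannot be determined, here is a minimal Python sketch. Only the regular expression is taken from the log above; the function names `extract_job_id` and `check_submission` are hypothetical and are not SimFactory's actual code. The idea is simply that a result of -1 should become a visible error and a nonzero exit status.

```python
import re
import sys

# Regex copied from the "using submitRegex" log line; all names below are
# hypothetical and only illustrate the expected behaviour.
SUBMIT_REGEX = r"Your job (\d+) \(.*?\) has been submitted"


def extract_job_id(raw_output, submit_regex=SUBMIT_REGEX):
    """Return the job ID found in the qsub output, or -1 if nothing matches."""
    match = re.search(submit_regex, raw_output)
    return int(match.group(1)) if match else -1


def check_submission(raw_output):
    """Return an exit status: 0 if a job ID was found, 1 otherwise."""
    job_id = extract_job_id(raw_output)
    if job_id == -1:
        # Surface the failure instead of only writing it to the log file.
        sys.stderr.write(
            "Error: submission failed, no job ID found in qsub output:\n%s\n"
            % raw_output
        )
        return 1
    print("Submitted job %d" % job_id)
    return 0
```

A caller could then end with `sys.exit(check_submission(output))`, so that scripts driving the submission notice the failure.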
The odd thing is that -V is present in my submission script. The file
/scratch/00915/hinder/simulations/maxwell_1proc/output-0000/SIMFACTORY/SubmitScript
contains:
{{{
#! /bin/bash
#$ -A TG-MCA02N014
#$ -q normal
#$ -r n
#$ -l h_rt=4:00:00
#$ -pe 1way 12
#$
#$ -V
#$ -N maxwell_1proc-0
#$ -M ian.hinder at aei.mpg.de
#$ -m abe
#$ -o /scratch/00915/hinder/simulations/maxwell_1proc/output-0000/maxwell_1proc.out
#$ -e /scratch/00915/hinder/simulations/maxwell_1proc/output-0000/maxwell_1proc.err
cd /work/00915/hinder/Cactus/EinsteinToolkit
/work/00915/hinder/Cactus/EinsteinToolkit/simfactory/bin/sim run maxwell_1proc --machine=lonestar --restart-id=0
}}}
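As a cross-check that does not depend on the queue's own verification, the presence of a directive in a submit script can be tested mechanically. The following is a small Python sketch under the assumption that directives are the lines beginning with `#$`, as in the script above; the helper names `sge_directives` and `has_flag` are made up for this example, and nothing here claims to reproduce how the JSV on LoneStar performs its check.

```python
def sge_directives(script_text):
    """Collect the option strings from '#$' directive lines in a submit script."""
    directives = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#$"):
            option = stripped[2:].strip()
            if option:  # skip bare '#$' lines that carry no option
                directives.append(option)
    return directives


def has_flag(script_text, flag):
    """Return True if some directive's first token equals flag, e.g. '-V'."""
    return any(d.split()[0] == flag for d in sge_directives(script_text))
```

Reading the SubmitScript file and calling `has_flag(text, "-V")` on its contents would confirm whether the directive is really there at submission time.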
Any ideas?
--
Ticket URL: <https://trac.einsteintoolkit.org/ticket/645>
Einstein Toolkit <http://einsteintoolkit.org>