[ET Trac] [Einstein Toolkit] #645: Error concerning missing -V option when submitting job on LoneStar

Einstein Toolkit trac-noreply at einsteintoolkit.org
Sat Oct 22 12:24:29 CDT 2011


#645: Error concerning missing -V option when submitting job on LoneStar
------------------------+---------------------------------------------------
 Reporter:  hinder      |       Owner:  eschnett  
     Type:  defect      |      Status:  new       
 Priority:  major       |   Milestone:  ET_2011_10
Component:  SimFactory  |     Version:            
 Keywords:              |  
------------------------+---------------------------------------------------
 I used the following command to submit an ET testsuite job on LoneStar.
 The intention is to run on 1 process with 6 threads.

   sim --remote lonestar create-submit maxwell_1proc_2 --testsuite --procs 6 --num-threads 6 --walltime 4:00:00 --ppn-used 6

 The submission failed with an unexpected error.  SimFactory didn't report
 any error or return a nonzero exit code, even though it was unable to
 determine a job ID.  I repeated the submission, and the second time it
 worked, so the fault appears to be intermittent.  Apart from the job name,
 the submit scripts were identical in the two cases.

 The full log file (log.txt) is attached; its final part, which contains
 the error message, is:

 {{{
 [LOG:2011-10-22 11:29:05] self.submit(submitScript)::Executing submission command: qsub /scratch/00915/hinder/simulations/maxwell_1proc/output-0000/SIMFACTORY/SubmitScript
 [LOG:2011-10-22 11:29:05] self.makeActive()::Simulation maxwell_1proc with restart-id 0 has been made active
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::received raw output: Unable to run job: JSV rejected job.
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::Exiting.
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::-----------------------------------------------------------------
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::-- Welcome to the Lonestar4 Westmere/QDR IB Linux Cluster --
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::-----------------------------------------------------------------
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::--> Checking that you specified -V...
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::--------------------------> Rejecting job <--------------------------
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::-V is now a required option. Please specify it in your submit script.
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::---------------------------------------------------------------------
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::
 [LOG:2011-10-22 11:29:06] job_id = self.extractJobId(output)::using submitRegex: Your job (\d+) \(.*?\) has been submitted
 [LOG:2011-10-22 11:29:06] self.submit(submitScript)::After searching raw output, it was determined that the job_id is: -1
 [LOG:2011-10-22 11:29:06] self.submit(submitScript)::If this is -1, that means the regex did NOT match anything. No job_id means no control.
 }}}

 The job was not submitted.
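
 To illustrate the behaviour I would have expected, here is a minimal
 sketch (not SimFactory's actual code) of extracting the job ID with the
 submitRegex shown in the log and turning a failed match into a nonzero
 exit status.  The function names extract_job_id and check_submission, the
 job number 123456 and the "accepted" sample output are assumptions made up
 for illustration; only the regex and the rejection text come from the log.

```python
import re
import sys

# Regex copied from the log ("using submitRegex: ...").
SUBMIT_REGEX = r"Your job (\d+) \(.*?\) has been submitted"


def extract_job_id(output):
    """Return the job ID found in the qsub output, or -1 if nothing matches."""
    match = re.search(SUBMIT_REGEX, output)
    return int(match.group(1)) if match else -1


def check_submission(output):
    """Hypothetical wrapper: map the qsub output to a process exit status."""
    job_id = extract_job_id(output)
    if job_id == -1:
        # Report the failure instead of silently continuing without a job ID.
        print("Error: could not determine job ID; qsub output was:\n" + output,
              file=sys.stderr)
        return 1
    print("Submitted job %d" % job_id)
    return 0


# Assumed example of a successful submission (the job number is made up).
accepted = 'Your job 123456 ("maxwell_1proc-0") has been submitted'
# Start of the rejection message from the log.
rejected = "Unable to run job: JSV rejected job.\nExiting."
```

 With something like this, sys.exit(check_submission(output)) would make
 the rejected submission visible to the caller as a nonzero exit code.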

 The weird thing is that I do have -V in my submission script.  The file
 /scratch/00915/hinder/simulations/maxwell_1proc/output-0000/SIMFACTORY/SubmitScript
 has

 {{{
 #! /bin/bash
 #$ -A TG-MCA02N014
 #$ -q normal
 #$ -r n
 #$ -l h_rt=4:00:00
 #$ -pe 1way 12
 #$
 #$ -V
 #$ -N maxwell_1proc-0
 #$ -M ian.hinder at aei.mpg.de
 #$ -m abe
 #$ -o /scratch/00915/hinder/simulations/maxwell_1proc/output-0000/maxwell_1proc.out
 #$ -e /scratch/00915/hinder/simulations/maxwell_1proc/output-0000/maxwell_1proc.err
 cd /work/00915/hinder/Cactus/EinsteinToolkit
 /work/00915/hinder/Cactus/EinsteinToolkit/simfactory/bin/sim run maxwell_1proc --machine=lonestar --restart-id=0
 }}}
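
 For anyone who wants to check a submit script mechanically rather than by
 eye, here is a small sketch that lists the "#$" directive lines and tests
 whether one of them passes -V.  The helper names sge_directives and
 has_export_env are hypothetical, and the sample script is an abbreviated
 copy of the one above.  It only confirms what the file contains; it says
 nothing about why the JSV rejected the job.

```python
def sge_directives(script_text):
    """Return the option strings from the '#$' directive lines of a script."""
    directives = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#$"):
            body = stripped[2:].strip()
            if body:  # skip empty "#$" lines
                directives.append(body)
    return directives


def has_export_env(script_text):
    """True if some directive line contains the -V option as a separate word."""
    return any("-V" in d.split() for d in sge_directives(script_text))


# Abbreviated copy of the submit script quoted in this ticket.
script = """#! /bin/bash
#$ -A TG-MCA02N014
#$ -q normal
#$
#$ -V
#$ -N maxwell_1proc-0
"""

print(has_export_env(script))
```

 Matching -V as a whole word avoids false positives from paths or job
 names that happen to contain the characters "-V".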

 Any ideas?

-- 
Ticket URL: <https://trac.einsteintoolkit.org/ticket/645>
Einstein Toolkit <http://einsteintoolkit.org>