[ET Trac] [Einstein Toolkit] #1168: simfactory rev 1677 fails to start on lonestar

Einstein Toolkit trac-noreply at einsteintoolkit.org
Mon Nov 5 23:06:24 CST 2012


#1168: simfactory rev 1677 fails to start on lonestar
------------------------+---------------------------------------------------
 Reporter:  rhaas       |       Owner:  eschnett
     Type:  defect      |      Status:  new     
 Priority:  blocker     |   Milestone:          
Component:  SimFactory  |     Version:          
 Keywords:              |  
------------------------+---------------------------------------------------
 It seems as if
 {{{
 module load TACC cuda cuda_SDK
 }}}
 fails with the warning
 {{{
 Lmod Warning: Did not find: cuda cuda_SDK
 }}}
 after which simfactory fails to continue. My suspicion is that cuda is not
 available for OpenMPI (which is the selected MPI stack), so the module
 command returns an error, and simfactory aborts because the env command
 does not succeed.
 The output I get is:
 {{{
 ==> qc0_cuda.err <==
 + cd /work/00945/rhaas/ET_trunk
 + /work/00945/rhaas/ET_trunk/simfactory/bin/sim run qc0_cuda --machine=lonestar --restart-id=0
 Inactive Modules:
   1) cuda     2) cuda_SDK

 Lmod Warning: Did not find: cuda cuda_SDK

 Try: "module spider cuda cuda_SDK"

 ==> qc0_cuda.out <==
 TACC: Setting memory limits for job 842190 to unlimited KB
 TACC: Dumping job script:
 --------------------------------------------------------------------------------
 #! /bin/bash
 #$ -A TG-PHY100033
 #$ -q normal
 #$ -r n
 #$ -l h_rt=0:15:00
 #$ -pe 2way 36
 #$
 #$ -V
 #$ -N qc0_cuda-0000
 #$ -M rhaas
 #$ -m abe
 #$ -o /scratch/00945/rhaas/simulations/qc0_cuda/output-0000/qc0_cuda.out
 #$ -e /scratch/00945/rhaas/simulations/qc0_cuda/output-0000/qc0_cuda.err
 set -x
 cd /work/00945/rhaas/ET_trunk
 /work/00945/rhaas/ET_trunk/simfactory/bin/sim run qc0_cuda --machine=lonestar --restart-id=0
 --------------------------------------------------------------------------------
 TACC: Done.
 Simulation name: qc0_cuda
 Running simulation qc0_cuda
 Mon Nov  5 23:03:27 CST 2012
 Simfactory Done at date: 0
 TACC: Cleaning up after job: 842190
 TACC: Done.
 }}}

 Removing this command from envsetup lets me run. However, since I do not
 run CUDA myself (the commit claims this is for OpenCL, which also seems
 wrong), could someone who uses OpenCL/CUDA on lonestar suggest an
 alternative?
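
 One possible alternative, sketched here under assumptions rather than
 tested on lonestar: keep TACC as a hard requirement, but make the CUDA
 modules best-effort, so that a missing module only prints a warning. The
 helper name `optional` is hypothetical and not part of simfactory. The
 sketch also assumes that `module load` returns a non-zero exit status when
 a module is not found, which may depend on the Lmod version.

```shell
# Hypothetical helper (not part of simfactory): run a command, but never
# let its failure propagate to the surrounding env setup. On failure a
# warning is written to stderr and the helper still returns 0.
optional() {
    if ! "$@"; then
        echo "warning: optional step failed: $*" >&2
    fi
    return 0
}

# Only attempt this where a `module` command exists at all.
# A failing `module load TACC` still yields a non-zero status here;
# a failing load of cuda/cuda_SDK does not.
# Assumption: `module load` reports a missing module via its exit status.
if type module >/dev/null 2>&1; then
    module load TACC && optional module load cuda cuda_SDK
fi
```

 With something like this in envsetup, runs that do not need CUDA should
 still start, while the CUDA modules are loaded wherever they are available.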

-- 
Ticket URL: <https://trac.einsteintoolkit.org/ticket/1168>
Einstein Toolkit <http://einsteintoolkit.org>
The Einstein Toolkit