[Users] Problem with batch submission on XC30 CFCA.

Max Barkov barmv05 at gmail.com
Mon Apr 20 02:16:00 CDT 2015


Dear Sir or Madam,

I have a problem submitting jobs on the Cray XC30 at CFCA, NAOJ.
I used a script similar to the one for the Edison cluster (also an XC30).
Unfortunately, it fails when I submit the job:

ET/Cactus> simfactory/bin/sim run cir_1 --configuration whisky_15
--parfile=par/Collapsar.par --procs=48 --machine=cfca_intel

I get the following diagnostic output:

Simulation name: cir_1
Assigned restart id: 0
Running simulation cir_1
ModuleCmd_Switch.c(172):ERROR:152: Module 'PrgEnv-cray' is currently not loaded
Preparing:
+ set -e
+ cd /work/barkovmm/ET_simulations/cir_1/output-0000-active
+ module list
++ /opt/modules/3.2.6.7/bin/modulecmd bash list
Currently Loaded Modulefiles:
  1) modules/3.2.6.7                      10) cray-libsci/13.0.1                   19) alps/5.2.1-2.0502.8712.10.32.ari
  2) eswrap/1.1.0-1.020200.1130.0         11) udreg/2.3.2-1.0502.8763.1.11.ari     20) rca/1.0.0-2.0502.51491.3.92.ari
  3) switch/1.0-1.0502.51632.2.84.ari     12) ugni/5.0-1.0502.9037.7.26.ari        21) atp/1.7.5
  4) craype-network-aries                 13) pmi/5.0.5-1.0000.10300.134.8.ari     22) PrgEnv-intel/5.2.25
  5) craype/2.2.0                         14) dmapp/7.0.1-1.0502.9080.9.32.ari     23) cray-hdf5/1.8.13
  6) pbs/12.2.3.141156                    15) gni-headers/3.0-1.0502.9038.7.4.ari  24) cray-petsc/3.5.1.0
  7) craype-haswell                       16) xpmem/0.1-2.0502.51169.1.11.ari      25) fftw/3.3.4.0
  8) cray-mpich/7.0.3                     17) job/1.5.5-0.1_2.0502.52111.3.39.ari  26) gsl/115
  9) intel/15.0.0.090                     18) csa/3.0.0-1_2.0502.51200.1.108.ari   27) papi/5.3.2
+ eval
+ echo Checking:
Checking:
+ pwd
/work/barkovmm/ET_simulations/cir_1/output-0000-active
+ hostname
xc01
+ date
2015年  4月 20日 月曜日 16:01:23 JST
+ echo Environment:
Environment:
+ export CACTUS_NUM_PROCS=4
+ CACTUS_NUM_PROCS=4
+ export CACTUS_NUM_THREADS=12
+ CACTUS_NUM_THREADS=12
+ export GMON_OUT_PREFIX=gmon.out
+ GMON_OUT_PREFIX=gmon.out
+ export OMP_NUM_THREADS=12
+ OMP_NUM_THREADS=12
+ env
+ sort
+ export 'NODE_PROCS=-N 2'
+ NODE_PROCS='-N 2'
+ export 'SOCKET_PROCS=-S 1'
+ SOCKET_PROCS='-S 1'
+ export CORE_PROCS=
+ CORE_PROCS=
+ echo Starting:
Starting:
++ date +%s
+ export CACTUS_STARTTIME=1429513283
+ CACTUS_STARTTIME=1429513283
+ aprun -cc numa_node -n 4 -d 12 -N 2 -S 1 /work/barkovmm/ET_simulations/cir_1/SIMFACTORY/exe/cactus_whisky_15 -L 3 /work/barkovmm/ET_simulations/cir_1/output-0000/Collapsar.par
/work/barkovmm/ET_simulations/cir_1/output-0000/SIMFACTORY/RunScript: line 39: aprun: command not found
2015年  4月 20日 月曜日 16:01:23 JST
Simfactory Done at date: 0

As far as I understand, the RunScript is not being launched through the qsub system.
grep cannot find the word "qsub" anywhere in the current configuration:
grep -irn qsub ./config/whisky_15/.

In the machine configuration file I have:
submit          = qsub @SCRIPTFILE@
getstatus       = qstat @JOB_ID@
stop            = qdel @JOB_ID@
submitpattern   = (\d+)[.]
statuspattern   = ^@JOB_ID@[. ]
queuedpattern   = " Q "
runningpattern  = " R "
holdingpattern  = " H "
scratchbasedir  = /work/@USER@
exechost        = qstat -f @JOB_ID@
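
To illustrate what the two patterns `(\d+)[.]` and `^@JOB_ID@[. ]` from this configuration would match, here is a minimal Python sketch. It assumes the patterns are applied as ordinary Python regular expressions with a search over the command output; how simfactory applies them internally is not shown here. The helper names and the sample qsub/qstat outputs are hypothetical and made up for illustration only.

```python
import re

# Patterns copied from the machine configuration entries.
SUBMIT_PATTERN = r"(\d+)[.]"
STATUS_PATTERN = r"^@JOB_ID@[. ]"


def extract_job_id(qsub_output):
    """Return the numeric job ID found in qsub's reply, or None."""
    match = re.search(SUBMIT_PATTERN, qsub_output)
    return match.group(1) if match else None


def job_is_listed(job_id, qstat_output):
    """Check whether any line of the qstat output starts with the job ID."""
    # Substitute the placeholder by hand (assumption: plain text replacement).
    pattern = STATUS_PATTERN.replace("@JOB_ID@", re.escape(job_id))
    return re.search(pattern, qstat_output, re.MULTILINE) is not None


# Hypothetical outputs, not taken from the real machine.
sample_qsub = "123456.sdb\n"
sample_qstat = (
    "Job id       Name   User      S\n"
    "123456.sdb   cir_1  barkovmm  R\n"
)

job_id = extract_job_id(sample_qsub)
print(job_id)
print(job_is_listed(job_id, sample_qstat))
```

If the real qsub reply on this machine does not have the form of digits followed by a dot, the submit pattern would not capture a job ID and would need adjusting.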

What additional information can I supply?
Thank you in advance,
Maxim Barkov.