[Users] questions about compilation

林家暉 r06222015 at ntu.edu.tw
Mon Dec 18 03:47:09 CST 2017


Dear Roland,
Thank you for your testing.
It also took several hours for me to compile the code, which should be done in about 20 minutes. Did something similar happen when Edison was last tested?
There are actually two other clusters I can use, but they are not supported by the Einstein Toolkit. I have tried to compile the code on them but failed, since the information required to set up a new cluster is quite detailed and I cannot find all of it.
Would you suggest that I switch to one of these unsupported clusters?

Best regards,
Chia Hui

________________________________________
From: Roland Haas [roland.haas at physics.gatech.edu]
Sent: December 18, 2017, 1:04 AM
To: 林家暉
Cc: rhaas at ncsa.illinois.edu; users at einsteintoolkit.org
Subject: Re: [Users] questions about compilation

Hello Chia Hui,

this may take a bit longer. Edison seems to be incredibly slow at
compiling the code, literally taking hours for the full compile and
what feels like minutes for each test in the configure stage of the
Cactus compile.

I am not sure why. My guess would actually be that it is related to the
Intel license server, since the login node is not busy with high CPU
load, nor does the compiler spend a lot of time in "D" state (i.e. waiting
for IO).
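
One way to check the "D" state observation is sketched below. It is only an
illustration, assuming a Linux login node with a procps-style ps; the helper
name filter_d_state is hypothetical and not part of Cactus or simfactory.

```shell
#!/bin/sh
# Hypothetical helper: reads "PID STAT COMMAND" lines on stdin and prints
# only those whose STAT field starts with D (uninterruptible sleep,
# which usually means the process is waiting for IO).
filter_d_state() {
    awk '$2 ~ /^D/ { print }'
}

# Take a few samples one second apart while the build is running.
# Compiler processes showing up here repeatedly would point at IO;
# an empty result is consistent with the wait being somewhere else.
for i in 1 2 3; do
    ps -eo pid=,stat=,comm= | filter_d_state
    sleep 1
done
```

If the compiler never appears in this output while the build is crawling, IO
is an unlikely culprit, which fits the guess about the license server.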

I can confirm though that I am seeing similar linking issues as you
were seeing.

Yours,
Roland


> Hello Chia Hui,
>
> given that Edison was tested in July, it would seem that NERSC updated
> their module stack (and removed old ones).
>
> I am updating the simfactory settings and testing again right now.
> Should not take too long.
>
> sourcebasedir and basedir may well require adjusting for each user
> though we normally try to keep them fairly generic. Note that edison
> shares some directories with cori which is why we sandwich in a
> work_edison in the directories.
>
> The files you included look sane to me.
>
> Yours,
> Roland
>
> > Dear Roland,
> > Thank you again.
> > I am afraid that I may have missed something important or made a mistake during installation and setup. However, I followed the tutorial almost exactly, except for parts of the machine setup such as sourcebasedir and basedir, which depend on the user.
> > I also found that some parts of the default edison.ini file are outdated. For example, some module versions are no longer available, so I commented them out or changed them.
> > I also wonder what is causing so many problems.
> >
> > Attached are the four Edison files that I used. Maybe there is a problem in them that I did not notice. If you could take a minute to check them, I would appreciate it very much.
> > The example simulation I ran is from the new version of the tutorial:
> > https://nbviewer.jupyter.org/github/nds-org/jupyter-et/blob/master/CactusTutorial.ipynb
> >
> > ./simfactory/bin/sim create-submit helloworld --parfile arrangements/CactusExamples/HelloWorld/par/HelloWorld.par --walltime 0:5:0
> >
> > The error message is shown in another attached figure. It seems the job was submitted successfully but failed to run.
> >
> > Best regards,
> > Chia Hui
> > ________________________________________
> > From: Roland Haas [roland.haas at physics.gatech.edu]
> > Sent: December 15, 2017, 3:27 AM
> > To: 林家暉
> > Cc: users at einsteintoolkit.org
> > Subject: Re: [Users] questions about compilation
> >
> > Hello Chia Hui,
> >
> > Hmm, unfortunately I am not sure why you would get such an error
> > message and have not personally seen this type of behaviour before.
> >
> > You are doing all of this on NERSC's Edison machine (at least
> > one of the screenshots seems to show you being on edison09), aren't you?
> >
> > In that case I have to admit, I am a bit confused why you have so much
> > trouble with the machine since edison is already supported by
> > simfactory. You should be able to build without problems using the
> > regular simfactory definition files like so:
> >
> > cd Cactus
> > simfactory/bin/sim build --thornlist manifest/einsteintoolkit.th
> >
> > and then run eg the qc0 example:
> >
> > simfactory/bin/sim submit qc0-test1 --procs 24 --walltime 4:0:0
> >
> > which would run on one node (24 cores per node on edison) for 4 hours.
> >
> > At least on July 15th 2017 edison was still functional in that I could
> > run the Cactus test suites on it.
> >
> > You should never have to use srun directly when using simfactory.
> >
> > Looking at simfactory/mdb/runscripts/edison.run, srun should be
> > used like so (once your job has started, i.e. inside of a SLURM script or
> > an interactive job):
> >
> > export OMP_NUM_THREADS=<number_of_threads_per_mpi_rank>
> > srun -n <number_of_mpi_ranks> -c $OMP_NUM_THREADS cactus_sim qc0.par
> >
> > For completeness: simfactory uses the file
> > simfactory/mdb/optionlists/edison.cfg as the options list,
> > simfactory/mdb/runscripts/edison.run is what runs inside of the SLURM
> > job, and simfactory/mdb/submitscripts/edison.sub contains the required
> > SLURM headers.
> >
> > Yours,
> > Roland
> >
> > > Dear Roland,
> > > Thanks for your information.
> > > Actually, none of the flags you listed are missing from my option list, except the last one (LDFLAGS = -fopenmp). However, adding it did not help; the error message did not change.
> > > I then tried to ignore the utilities and run an example directly. It seems to be almost successful, except for these errors:
> > > srun: error: ioctl(TIOCGWINSZ): Inappropriate ioctl for device
> > > srun: error: Not using a pseudo-terminal, disregarding --pty option
> > > I guess this error is related to ssh. I tried ssh -t, but it did not work. Am I on the right track?
> > >
> > > Best regards,
> > > Chia Hui
> > > ________________________________________
> > > From: Roland Haas [roland.haas at physics.gatech.edu]
> > > Sent: December 14, 2017, 10:31 PM
> > > To: 林家暉
> > > Cc: users at einsteintoolkit.org
> > > Subject: Re: [Users] questions about compilation
> > >
> > > Hello Chia Hui Lin,
> > >
> > > sorry for the delay, your email got sorted into the wrong folder.
> > >
> > > The error that you are seeing (kmpc stuff missing) is usually caused by
> > > missing OpenMP flags. Please make sure you have:
> > >
> > > OPENMP           = yes
> > > CPP_OPENMP_FLAGS = -fopenmp
> > > FPP_OPENMP_FLAGS = -fopenmp
> > > C_OPENMP_FLAGS   = -fopenmp
> > > CXX_OPENMP_FLAGS = -fopenmp
> > > F77_OPENMP_FLAGS = -fopenmp
> > > F90_OPENMP_FLAGS = -fopenmp
> > >
> > > in your option list. You can also try and see if
> > >
> > > LDFLAGS = -fopenmp
> > >
> > > helps.
> > >
> > > Note that this assumes that you are using the gcc or newer Intel
> > > compilers. If you are using an older Intel compiler then the option is
> > > called -openmp (no "f") instead.
> > >
> > > The utilities are normally not crucial though and you can use Cactus
> > > without them.
> > >
> > > Yours,
> > > Roland
> > >
> > > > Dear Roland,
> > > > Thanks for your kind reply!
> > > > Sorry for bothering you several times.
> > > > I tried your suggestion and it solved both the -ljpeg and the dlopen errors.
> > > > The output showed:
> > > > Done creating cactus_sim.
> > > > All done !
> > > > However, the compilation did not finish there. The build of the utilities for sim started and another error occurred, as shown in the screenshot. It seems to be similar to the problem described at this link:
> > > > http://lists.einsteintoolkit.org/pipermail/trac/2011-October/002449.html
> > > > But I did not find a solution to this error there. Is there a possible solution?
> > > > By the way, I commented out some module loads in edison.ini since some of them could not be loaded (maybe because the versions of those modules are outdated, and some modules do not even appear in the module list on Edison), as shown in another screenshot. I am afraid that this may have caused some of the errors.
> > > >
> > > > Best regards,
> > > > Chia Hui Lin
> > > > ________________________________________
> > > > From: Roland Haas [roland.haas at physics.gatech.edu]
> > > > Sent: November 28, 2017, 3:28 AM
> > > > To: 林家暉
> > > > Cc: users at einsteintoolkit.org
> > > > Subject: Re: [Users] questions about compilation
> > > >
> > > > Hello Chia Hui Lin,
> > > >
> > > > hmm. I am not sure about the -ljpeg since ExternalLibraries/libjpeg
> > > > should provide this.
> > > >
> > > > You are also receiving a link time warning about using dlopen in a
> > > > statically linked application. While not directly related to the jpeg
> > > > library issue, a possible workaround is to force dynamic linking (not the
> > > > default on Edison) by setting:
> > > >
> > > > export CRAYPE_LINK_TYPE=dynamic
> > > > export CRAY_ADD_RPATH=yes
> > > >
> > > > either as part of the envsetup lines in the file
> > > > simfactory/mdb/machine/edison.ini or on your command line (if eg not
> > > > using simfactory).
> > > >
> > > > Yours,
> > > > Roland
> > > >
> > > > > Dear Roland,
> > > > > Thanks for your kind help, and sorry for replying so late.
> > > > > The second method you suggested did indeed solve the error.
> > > > > However, another error occurred; the attached file is a screenshot of the error message.
> > > > > It seems the build cannot find the jpeg library, but I do not think it is missing, since I found several versions of it with the command: ldconfig -p | grep jpeg (shown in the screenshot).
> > > > > I tried several compilers, including Intel compiler version 16, but it did not work.
> > > > > What kind of problem is this, and are there any possible solutions?
> > > > > Thank you.
> > > > > Best regards,
> > > > > Chia Hui Lin
> > > > > ________________________________________
> > > > > From: Roland Haas [roland.haas at physics.gatech.edu]
> > > > > Sent: November 21, 2017, 12:42 AM
> > > > > To: 林家暉
> > > > > Cc: users at einsteintoolkit.org
> > > > > Subject: Re: [Users] questions about compilation
> > > > >
> > > > > Hello Chia Hui Lin,
> > > > >
> > > > > looking at your output (internal compiler failure when compiling
> > > > > bbox.cc) this seems to be an instance of this bug:
> > > > >
> > > > > https://trac.einsteintoolkit.org/ticket/2021
> > > > >
> > > > > there seem to be two ways around this:
> > > > >
> > > > > 1. change the file edison.ini to load an older Intel compiler (version
> > > > > 16)
> > > > > 2. edit the file repos/carpet/CarpetLib/src/bbox.cc and add at the
> > > > > beginning:
> > > > >
> > > > > #if __INTEL_COMPILER >= 1700
> > > > > #pragma GCC optimization_level 1
> > > > > #endif
> > > > >
> > > > > which reduces the optimization level and avoids the problem.
> > > > >
> > > > > Yours,
> > > > > Roland
> > > > >
> > > > > > Dear sir/madam,
> > > > > > I am a master's student in the Department of Physics, National Taiwan University, and a beginner with the Einstein Toolkit.
> > > > > > I started on my laptop and everything worked fine. When I moved to the NERSC supercomputer Edison, some errors came up during compilation (that is, after I typed the command: $ ./simfactory/bin/sim build).
> > > > > > 1. The attached file is a screenshot of the error message. The error seems to be caused by the compiler. However, the same error appeared after switching from the Intel compiler to the Cray or GNU compiler. How can I solve this error?
> > > > > >
> > > > > > 2. The machine definition file for Edison (edison.ini) was last tested in May 2015, so I am afraid that some of the machine information is outdated. Is this related to problem 1? How can I update this file?
> > > > > >
> > > > > > Thanks for your help.
> > > > > > Best regards,
> > > > > > Chia Hui Lin
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > My email is as private as my paper mail. I therefore support encrypting
> > > > > and signing email messages. Get my PGP key from http://pgp.mit.edu .



--
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://keys.gnupg.net.

