[Users] Segmentation fault in TOV test

Roland Haas rhaas at illinois.edu
Sat Jul 27 11:47:31 CDT 2024


Hello Riannon,

> I've managed to compile ET after switching to gnu compilers and
> openmpi, but I'm getting this warning:
> 
> /usr/bin/ld: warning: libgfortran.so.3, needed by
> /usr/lib64/../lib64/liblapack.so, may conflict with libgfortran.so.5

That warning means that the liblapack library, which in your case is the
system-provided one, was compiled with a gfortran compiler of a
(significantly) different version than the rest of your code and
therefore uses a different Fortran runtime library. This can lead to
difficult-to-debug issues at runtime since only *one* runtime library
will be loaded, and which one that is is not immediately obvious (if
the older version is the one loaded, one can expect the newer code to
fail).
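If you want to check which Fortran runtimes actually get pulled in, something
along these lines may help (the executable name "cactus_sim" is just a guess;
use the executable under exe/ for your configuration):

ldd exe/cactus_sim | grep libgfortran

ldd resolves shared libraries the same way the dynamic loader would, so this
shows which libgfortran versions are requested and from where.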

The fix for this would be to either load an environment module for
LAPACK (it could well be a module named OpenBLAS or ATLAS) that was
compiled with the same (or a similar) version of gfortran that you are
using, or to ask Cactus to compile the included copy of LAPACK from
scratch. Note that LAPACK (certainly the copy included as source code
in the Einstein Toolkit, and possibly also the system-provided one) is
the "reference implementation" and is quite slow (but nothing in the
toolkit really relies on LAPACK / BLAS for speed).

When using an environment module, you would have to add

LAPACK_DIR = <path-to-folder-containing-liblapack.a>
BLAS_DIR = <path-to-folder-containing-libblas.a>

to your option list file.

Or set both to the word BUILD to force compilation from source:

LAPACK_DIR = BUILD
BLAS_DIR = BUILD

Then recompile (ideally from scratch).
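With the plain Cactus make targets a from-scratch rebuild could look roughly
like this (assuming a configuration named "sim"; adjust the configuration name
and option list file to yours):

make sim-realclean
make sim-config options=<your-optionlist>.cfg
make sim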

> I'm also wondering about another message I get:
> "This OpenMPI build is integrated with Slurm. Use 'srun' to launch in
> a Slurm job rather than 'mpirun'."

Since this is a cluster for which there are no publicly accessible
simfactory files, would it be possible to include them?

Those PMI messages usually point to the MPI version used at run time
being incompatible with the one used to compile.

I am not quite sure (without having seen the option list, machine ini
file, submit and run scripts) what is causing this.

As a guess I would say to make sure that the same MPI module that was
used to compile is also loaded when submitting and running the jobs.
This is most easily achieved by adding it to the envsetup block of the
machine ini file.
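For example, the entry in the machine ini file could look roughly like this
(the module names are made up; use the ones from your cluster, and the same
ones that were loaded when compiling):

envsetup = module load gcc/12.2.0 openmpi/4.1.5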

You can also verify that things are sane by running e.g.:

./simfactory/bin/sim execute "which mpirun"

which will show you the full path of the mpirun executable used when
the modules from envsetup are loaded.

Compare this to the paths that 

./simfactory/bin/sim execute "module show <your-mpi-module>"

shows.

You can also consider following the suggestion of the error message and
using `srun` instead of `mpirun` (there are examples of how to do that
among the existing run scripts in mdb/runscripts). Whether this helps
depends on how the cluster was set up.
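A run script line using srun could look roughly like the following (the @...@
placeholders are the usual simfactory substitutions; double-check against the
existing run scripts mentioned above):

srun -n @NUM_PROCS@ @EXECUTABLE@ -L 3 @PARFILE@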

Yours,
Roland

-- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu .