[Users] ET and recent Intel compilers

Hee Il Kim heeilkim at gmail.com
Wed Aug 25 01:42:11 CDT 2021


Hi Steve,

I'm replying to my original post. You can reproduce the crash with NaNs by
using the gallery parfile (nsnstohmns.par) and the binary neutron star
merger initial data (ID).
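
In case it helps to pin down where the NaNs first appear, here is a minimal
NaNChecker snippet for the parfile; the variable list is only an
illustration, and the gallery parfile may already contain something similar:

  ActiveThorns = "NaNChecker"
  NaNChecker::check_every     = 1
  NaNChecker::check_vars      = "ADMBase::metric ADMBase::lapse HydroBase::rho"
  NaNChecker::action_if_found = "terminate"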

The oneAPI version (Intel MPI) is 2021.3.0:
/opt/intel/oneapi/mpi/2021.3.0/bin/mpicc

The command used is:
$ mpiexec.hydra -env OMP_NUM_THREADS 1 -n 48  ./SIMFACTORY/cactus_bns_1api
-L 3 nsnstohmns.par

Other than the NaN production, the remaining issues seem to have been fixed
by tweaking runtime options. If you have any working runtime configurations,
please let me know.
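
For reference, here is a sketch of the kind of runtime settings I have been
tweaking; the pinning values below are only placeholders, not a verified
recipe:

$ export OMP_NUM_THREADS=1
$ export I_MPI_PIN_DOMAIN=core    # Intel MPI process pinning
$ export I_MPI_DEBUG=4            # prints the rank-to-core mapping at startup
$ mpiexec.hydra -n 48 ./SIMFACTORY/cactus_bns_1api -L 3 nsnstohmns.par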

You can find more (minor?) Intel compiler issues below. But please note the
failure of Intel-2020.

Thanks,

Hee Il

On Sat, Aug 7, 2021 at 6:20 PM Hee Il Kim <heeilkim at gmail.com> wrote:

> Hi,
>
> I've encountered various issues with recent Intel compilers. Except for
> some versions having header-file issues, I could manage to build ET
> executables, but the actual runs stalled at various steps depending on the
> compiler version. For example,
>
> Case 1. Before initial data generation
> ...
> INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
> INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
> INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
>

I can't recall exactly when this happened, but as mentioned below, it might
be fixed by choosing proper runtime options.


>
> Case 2. While reading the Lorene data (e.g., resu.d)
>

This also seems to be related to the runtime environment: load balancing
breaks at some point and makes the calculation extremely slow.
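
In case it is useful to see where the time goes when this happens, a minimal
timer-report sketch (the values are placeholders, and the gallery parfile
may already include something similar):

  ActiveThorns = "TimerReport"
  TimerReport::out_every    = 128
  TimerReport::n_top_timers = 20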

> Case 3. Evolution started with normal initial data, but NaNs were produced
> from the next evolution step.
>
> I've encountered the issues in various combinations of oneAPI/Intel MPI
> and oneAPI/OpenMPI on CentOS 8 (gcc-8.3.1) machines.
>

This is the main issue.
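
For reference, one thing worth checking here is whether aggressive
floating-point optimization plays a role. A hedged option-list sketch with
more conservative Intel FP flags (the spellings are for the classic
icc/ifort compilers and are not verified against this configuration):

  CFLAGS   = -g -traceback -fp-model precise
  CXXFLAGS = -g -traceback -fp-model precise
  F90FLAGS = -g -traceback -fp-model precise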


>
> There had been no issues with OpenMPI/GCC, but I've just found that Case 1
> is reproduced on an old machine with devtoolset gcc-6.3.1 / openmpi-1.8.4.
>

Please forget about this. There were some conflicts between OpenMPI
shared-memory options (sm, vader, knem, etc.), which were fixed by choosing
proper runtime options.
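
For the record, the kind of workaround I mean looks like the sketch below;
the exact BTL list that helped is from memory, and the executable name is
just a stand-in:

$ mpirun --mca btl self,sm,tcp -np 48 ./cactus_exe -L 3 nsnstohmns.par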

I didn't put a lot of effort into these because the machine was small and
only temporarily accessible, but here are additional notes for the other
Intel versions:

- Intel 2020 Update 2 (Cluster Edition), Intel 2019u0, and Intel 2019u5:
runtime error from CarpetRegrid2; NaNs are produced because of a wrong grid
setup.
...
ERROR from host xeon2.localdomain.com process 5
  in thorn CarpetRegrid2, file
/home/khi/ET/Turing/arrangements/Carpet/CarpetRegrid2/src/regrid.cc:91:
  -> Region 3 has 8 levels active, which is larger than the maximum number
of refinement levels 6
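
The message refers to the consistency check between the per-centre level
counts and Carpet's global maximum. Since the same parfile runs fine with
GCC, the mismatch may itself be a symptom of the compiler issue rather than
a parfile error; for context, the parameters involved look roughly like this
(the numbers are placeholders only):

  Carpet::max_refinement_levels = 8   # must be >= every num_levels_* below
  CarpetRegrid2::num_centres    = 3
  CarpetRegrid2::num_levels_1   = 8
  CarpetRegrid2::num_levels_2   = 8
  CarpetRegrid2::num_levels_3   = 8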

- Intel 2018u1:
the build could not find a working C++11 compiler; both -std=gnu++11 and
-std=c++11 failed even for hello.c.
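
A quick way to check this outside of the ET build, as a sketch (the file
name and compiler invocation are just an example):

$ echo 'int main() { auto x = 0; return x; }' > c11check.cc
$ icpc -std=c++11 c11check.cc && echo "C++11 OK"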


- Intel 2017u2 & 2017u8:
checking for M_PI... no
configure: error: M_PI not defined. Try adding -D_XOPEN_SOURCE to CPPFLAGS.
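
Following the configure hint, the workaround would go into the option list,
roughly like this (not re-verified with these exact versions):

  CPPFLAGS = -D_XOPEN_SOURCE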

