[Users] ET and recent Intel compilers
Hee Il Kim
heeilkim at gmail.com
Wed Aug 25 01:42:11 CDT 2021
Hi Steve,
I'm replying to my original post. You can reproduce the crash with NaNs by
using the gallery parfile (nsnstohmns.par) and the initial data (ID) for
binary neutron star mergers.
The version of OneAPI is:
/opt/intel/oneapi/mpi/2021.3.0/bin/mpicc
The command used is:
$ mpiexec.hydra -env OMP_NUM_THREADS 1 -n 48 ./SIMFACTORY/cactus_bns_1api \
    -L 3 nsnstohmns.par
Other than the NaN production, the remaining issues seem to be fixed
by tweaking runtime options. If you have any working runtime
configurations, please let me know.
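For the NaN case, a minimal sketch of how one might find the first line of
captured screen output that contains a NaN, so that runs with different
compilers can be compared step by step. It assumes only that NaNs appear as
standalone tokens such as "nan" or "-nan" in the text; the function name
first_nan_line and the sample lines are hypothetical, not actual ET output.

```python
import re

# Matches a standalone NaN token such as "nan", "NaN", "-nan" or "+NAN".
# The lookarounds keep it from matching inside words or file names.
NAN_TOKEN = re.compile(r"(?<![\w.])[+-]?nan(?![\w.])", re.IGNORECASE)


def first_nan_line(lines):
    """Return (line_number, text) for the first line with a NaN token, else None."""
    for number, text in enumerate(lines, start=1):
        if NAN_TOKEN.search(text):
            return number, text.rstrip("\n")
    return None


# Hypothetical sample; the column layout is an assumption for illustration.
sample = [
    "       0     0.000 |    1.0000000    0.0012345",
    "       1     0.125 |          nan          nan",
]
print(first_nan_line(sample))
```

To scan a real log, pass an open file object instead of the sample list,
since iterating over a file yields its lines.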
You can find more (minor?) Intel compiler issues below. But please note the
failure of Intel-2020.
Thanks,
Hee Il
On Sat, Aug 7, 2021 at 6:20 PM Hee Il Kim <heeilkim at gmail.com> wrote:
> Hi,
>
> I've encountered various issues from recent Intel compilers. Except for
> some versions having header file issues, I could manage to build ET
> executables but actual run stalled at various steps depending on the
> compiler versions. For example,
>
> Case 1. Before initial data generation
> ...
> INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
> INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
> INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
>
I can't recall when this happened. But as mentioned below, it might be
fixed by using the proper runtime options.
>
> Case 2. During reading Lorene data (e.g., while reading resu.d)
>
This also seems to be related to the runtime environment. Broken
load-balancing at some point makes the calculation extremely slow.
>
> Case 3. Evolution started with normal IDs but NaNs produced from the next
> evolution step.
>
> I've encountered the issues in various combinations of OneAPI/IntelMPI
> and OneAPI/OpenMPI on Centos8 (gcc-8.3.1) machines.
>
This is the main issue.
>
> There had been no issues for OpenMPI/GCC. But I've just found Case 1 is
> produced on an old machine with devtoolset gcc-6.3.1/openmpi-1.8.4.
>
Please forget about this. There were some conflicts between shared memory
options (sm, vader, knem, etc.), which were fixed by using the proper runtime
options.
I didn't put much effort into this because the machine was small and only
temporarily accessible, but here are some additional notes for the other
Intel versions:
- Intel 2020 update 2 cluster edition, Intel 2019u0 and 2019u5
runtime error due to CarpetRegrid2: NaNs are produced because of a wrong
grid setup.
...
ERROR from host xeon2.localdomain.com process 5
in thorn CarpetRegrid2, file
/home/khi/ET/Turing/arrangements/Carpet/CarpetRegrid2/src/regrid.cc:91:
-> Region 3 has 8 levels active, which is larger than the maximum number
of refinement levels 6
- Intel 2018u1
failed to find a C++11 compiler. -std=gnu++11 and -std=c++11 both failed,
even for hello.c
- Intel 2017u2 & 2017u8
checking for M_PI... no
configure: error: M_PI not defined. Try adding -D_XOPEN_SOURCE to CPPFLAGS.
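
When comparing logs from several compiler versions, it may help to pull the
numbers out of the CarpetRegrid2 error quoted above. The following is only a
sketch: the pattern is built from the single message shown in this post, and
the names ERROR_PATTERN and parse_regrid_error are hypothetical.

```python
import re

# Built from the one message quoted in this post. \s+ also matches the line
# break that the mail wrapping put in the middle of the message.
ERROR_PATTERN = re.compile(
    r"Region\s+(\d+)\s+has\s+(\d+)\s+levels\s+active,\s+which\s+is\s+larger\s+"
    r"than\s+the\s+maximum\s+number\s+of\s+refinement\s+levels\s+(\d+)"
)


def parse_regrid_error(log_text):
    """Return region, active level count and maximum level count, or None."""
    match = ERROR_PATTERN.search(log_text)
    if match is None:
        return None
    region, active, maximum = (int(group) for group in match.groups())
    return {"region": region, "active_levels": active, "max_levels": maximum}


message = (
    "-> Region 3 has 8 levels active, which is larger than the maximum number\n"
    "of refinement levels 6"
)
print(parse_regrid_error(message))
```

An active level count larger than the configured maximum, as in the message
above, is the condition that makes the run abort.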