[Users] Unable to set Cactus to use all threads in hyper-threaded cores
José Ferreira
jpmferreira at ua.pt
Thu Oct 31 12:03:54 CDT 2024
Dear Toolkit Community,
I’m struggling to make the toolkit use all of the available threads
when running on a machine that has hyper-threading enabled.
On my local machine, which does not have hyper-threading, if I invoke
the toolkit’s binary using |OMP_NUM_THREADS=2 mpirun -np 4 -- exe/base
-p par/parfile.par|, it outputs
|INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 4 processes
INFO (Carpet): This is process 0
INFO (Carpet): OpenMP is enabled
INFO (Carpet): This process contains 2 threads, this is thread 0
INFO (Carpet): There are 8 threads in total
INFO (Carpet): There are 2 threads per process|
This creates 4 processes with 2 threads each and uses all 8 of the
available threads in my CPU, as expected.
I am now free to change the number of processes and threads as I see
fit, in order to look for the configuration that maximizes the physical
time simulated per wall-clock hour.
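Locally, that search is just a small sweep over those two numbers, along
these lines (a rough sketch; the ranges are arbitrary):
|for np in 1 2 4; do
  for nt in 1 2; do
    # keep np * nt within the 8 hardware threads of my local CPU
    if [ $((np * nt)) -le 8 ]; then
      OMP_NUM_THREADS=$nt mpirun -np $np -- exe/base -p par/parfile.par
    fi
  done
done|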
However, most of my computations are performed on Marenostrum5, where
each machine has 2 sockets, each with 56 physical cores and
hyper-threading enabled, for a total of 112 physical cores or 224
hardware threads per machine. For some reason, the toolkit does not use
all of the available threads.
To replicate the scenario above, I use the following Slurm submission script
|#!/usr/bin/env bash
#SBATCH -N 1
#SBATCH -n 4
#SBATCH -c 1
#SBATCH -t 30

export OMP_NUM_THREADS=2

srun --cpu-bind=none exe/base par/parfile.par|
where I ask for a single machine, 4 tasks per machine (to me, a task is
a process), and 1 CPU per task, which, due to hyper-threading, should
provide 2 threads per task.
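As an aside, the variant I would naively try is sketched below; I have
not verified that --cpus-per-task=2 together with --hint=multithread and
--cpu-bind=cores is the right combination on Marenostrum5, so the exact
flags are my assumption:
|#!/usr/bin/env bash
#SBATCH -N 1
#SBATCH -n 4
#SBATCH -c 2                # 2 CPUs (hardware threads) per task -- my assumption
#SBATCH --hint=multithread  # allow both hyper-threads of each core to be used
#SBATCH -t 30

export OMP_NUM_THREADS=2

# bind each task to its own cores instead of disabling binding entirely
srun --cpu-bind=cores exe/base par/parfile.par|
Everything reported below, however, comes from the --cpu-bind=none
script above.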
The output of the toolkit is
|INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 4 processes
INFO (Carpet): This is process 0
INFO (Carpet): OpenMP is enabled
INFO (Carpet): This process contains 1 threads, this is thread 0
INFO (Carpet): There are 4 threads in total
INFO (Carpet): There are 1 threads per process
INFO (Carpet): This process runs on host gs22r3b16, pid=1514092
INFO (Carpet): This process runs on 8 cores: 54-55, 97, 105, 166-167, 209, 217
INFO (Carpet): Thread 0 runs on 8 cores: 54-55, 97, 105, 166-167, 209, 217|
From the output above you can see that each process has been given 8
cores, even though I requested only 4 CPUs in total, which means that
the toolkit can see the extra hardware threads coming from
hyper-threading. It also shows that my request for 2 threads per
process, which I set via the environment variable |OMP_NUM_THREADS|,
was ignored.
If I force |CACTUS_NUM_THREADS=2|, it crashes with the error
|INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 4 processes
INFO (Carpet): This is process 0
INFO (Carpet): OpenMP is enabled
INFO (Carpet): This process contains 1 threads, this is thread 0
WARNING level 0 from host gs06r3b13 process 1
  in thorn Carpet, file /gpfs/home/uapt/uapt015213/projects/cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:187:
  -> The environment variable CACTUS_NUM_THREADS is set to 2, but there
     are 1 threads on this process. This may indicate a severe problem
     with the OpenMP startup mechanism.|
which leads me to believe that it is MPI that is refusing to initialize
more threads, and not the toolkit itself.
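To try to confirm where the limit comes from, I assume a throwaway job
step like the one below (same Slurm settings, but running standard tools
instead of the toolkit) would show the affinity mask and CPU count that
each task actually receives:
|#!/usr/bin/env bash
#SBATCH -N 1
#SBATCH -n 4
#SBATCH -c 1
#SBATCH -t 5

export OMP_NUM_THREADS=2

# for every task: its rank, the CPUs it is bound to, and how many CPUs it may use
srun --cpu-bind=none bash -c 'echo "task $SLURM_PROCID: $(taskset -cp $$), $(nproc) usable CPUs"'|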
My questions are:
1. Is there a performance gain from making use of hyper-threading, given
   that the toolkit is memory bound and the hyper-threads of a core
   share the same cache?
2. If yes, how can I adapt my submission scripts to tell Cactus to make
   use of hyper-threading?
Thank you in advance,
Best regards,
José Ferreira