[Users] Unable to set Cactus to use all threads in hyper-threaded cores

José Ferreira jpmferreira at ua.pt
Thu Oct 31 12:03:54 CDT 2024


Dear Toolkit Community,


I’m struggling to make the toolkit use all of the available hardware 
threads when running on a machine that has hyper-threading enabled.

On my local machine, which does not have hyper-threading, if I invoke 
the toolkit’s binary with

OMP_NUM_THREADS=2 mpirun -np 4 -- exe/base -p par/parfile.par

it outputs

INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 4 processes
INFO (Carpet): This is process 0
INFO (Carpet): OpenMP is enabled
INFO (Carpet): This process contains 2 threads, this is thread 0
INFO (Carpet): There are 8 threads in total
INFO (Carpet): There are 2 threads per process

This creates 4 processes with 2 threads each and uses all 8 of the 
available hardware threads on my CPU, as expected.

I am now free to change the number of processes and threads as I see 
fit, in order to look for the configuration that maximizes the simulated 
physical time per hour of walltime.


However, most of my computations are performed on MareNostrum 5, where 
each machine has 2 sockets, each with 56 physical cores and 
hyper-threading enabled, for a total of 112 physical cores, or 224 
hardware threads, per machine. For some reason, the toolkit does not use 
all of the available threads.

To replicate the scenario above, I use the following Slurm submission script

#!/usr/bin/env bash
#SBATCH -N 1
#SBATCH -n 4
#SBATCH -c 1
#SBATCH -t 30
export OMP_NUM_THREADS=2
srun --cpu-bind=none exe/base par/parfile.par

where I ask for a single machine, 4 tasks per machine (to me, a task = a 
process), and 1 CPU per task, which due to hyper-threading should provide 
2 threads per task.

The output of the toolkit is

INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 4 processes
INFO (Carpet): This is process 0
INFO (Carpet): OpenMP is enabled
INFO (Carpet): This process contains 1 threads, this is thread 0
INFO (Carpet): There are 4 threads in total
INFO (Carpet): There are 1 threads per process
INFO (Carpet): This process runs on host gs22r3b16, pid=1514092
INFO (Carpet): This process runs on 8 cores: 54-55, 97, 105, 166-167, 209, 217
INFO (Carpet): Thread 0 runs on 8 cores: 54-55, 97, 105, 166-167, 209, 217

From the output above you can see that I have been provided with 8 
cores, even though I requested 4 CPUs in total, which means that 
the toolkit can see the additional threads coming from hyper-threading.
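
If I read those core numbers correctly, they come in pairs that are 112 
apart (54/166, 55/167, 97/209, 105/217), which under the usual Linux CPU 
numbering would be the two hardware threads of the same 4 physical cores. 
Something like the following should confirm that, although I have not yet 
checked it on the compute nodes:

# list the hyper-thread siblings of logical CPU 54 as seen by the kernel
cat /sys/devices/system/cpu/cpu54/topology/thread_siblings_list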

It also shows that my request for 2 threads per process, which I set via 
the environment variable OMP_NUM_THREADS, was ignored.
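
A quick check I could do, independently of Cactus, to see what srun 
actually hands to each task would look something like this (a sketch, not 
yet tested on my side):

# report the CPU binding applied by srun and the environment each task sees
srun -n 4 --cpu-bind=verbose,none bash -c \
    'echo "task $SLURM_PROCID: OMP_NUM_THREADS=$OMP_NUM_THREADS"; taskset -cp $$'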

If I force CACTUS_NUM_THREADS=2, it crashes with the error

INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 4 processes
INFO (Carpet): This is process 0
INFO (Carpet): OpenMP is enabled
INFO (Carpet): This process contains 1 threads, this is thread 0
WARNING level 0 from host gs06r3b13 process 1
  in thorn Carpet, file /gpfs/home/uapt/uapt015213/projects/cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:187:
  -> The environment variable CACTUS_NUM_THREADS is set to 2, but there
     are 1 threads on this process. This may indicate a severe problem
     with the OpenMP startup mechanism.

which leads me to believe that it is MPI that is refusing to initialize 
more threads, and not the toolkit itself.
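
To help narrow down whether it is MPI, the OpenMP runtime or the batch 
system, I have a Cactus-independent OpenMP check in mind, along these 
lines (a minimal sketch; the file name, the compiler and the srun line are 
my own assumptions, mirroring the submission script above):

# write a tiny OpenMP program that only reports its thread count
cat > omp_check.c <<'EOF'
#include <omp.h>
#include <stdio.h>
int main(void) {
    #pragma omp parallel
    {
        #pragma omp single
        printf("OpenMP sees %d threads\n", omp_get_num_threads());
    }
    return 0;
}
EOF
gcc -fopenmp omp_check.c -o omp_check
# launch it exactly like the toolkit: 4 tasks, 2 requested threads each
OMP_NUM_THREADS=2 srun -n 4 --cpu-bind=none ./omp_check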


My questions are:

 1. Is there a performance gain from making use of hyper-threading, knowing
    that the toolkit is memory bound and the different threads share the
    same cache?

 2. If yes, how can I adapt my submission scripts to tell Cactus to make
    use of hyper-threading? (A sketch of what I would naively try follows
    below.)
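
For concreteness, the adaptation I would naively try is the one below, but 
I do not know whether it is the intended way to expose both hardware 
threads of each core to Cactus; the -c, --threads-per-core and --cpu-bind 
values are only my reading of the Slurm documentation, not something I 
have verified on MareNostrum 5:

#!/usr/bin/env bash
#SBATCH -N 1
#SBATCH -n 4
# request 2 CPUs per task and allow both hardware threads of each core
#SBATCH -c 2
#SBATCH --threads-per-core=2
#SBATCH -t 30
export OMP_NUM_THREADS=2
# bind each task to whole cores so that both hyper-threads remain usable
srun --cpu-bind=cores exe/base par/parfile.par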


Thank you in advance,

Best regards,

José Ferreira
