[Users] Generalising to OpenMP

Chris Stevens c.stevens at ru.ac.za
Thu Mar 29 07:05:59 CDT 2018


Hi Erik,

thanks for your reply.

I did not know that OpenMP can change the number of threads during 
execution! I will definitely put in a check for this.

I also came across something rather odd: omp_get_num_threads() does not 
seem to work with gcc (it always returns 1), while Intel seems fine. I found 
a little snippet online that instead counts the threads with a reduction, 
and that works:

int n = 0;
#pragma omp parallel reduction(+:n)
n += 1;
return n;
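For what it's worth, the OpenMP standard specifies that omp_get_num_threads() returns 1 when called outside a parallel region, which may explain the gcc behaviour. A minimal compilable sketch of the reduction-based workaround (the name count_threads is mine):

```cpp
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#endif

// Count the threads in a team portably: each thread adds 1 into a
// reduction variable, so the sum equals the team size. This works even
// where omp_get_num_threads() would return 1 (outside a parallel region),
// and compiles unchanged when OpenMP is disabled.
int count_threads() {
  int n = 0;
  #pragma omp parallel reduction(+:n)
  n += 1;  // executed once per thread in the team
  return n;
}
```

Built without OpenMP support, the pragma is ignored and the function returns 1, matching the serial case.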

Thanks for the info,

Chris


On 03/28/2018 03:06 PM, Erik Schnetter wrote:
> Chris
>
> The standard approach is to have one scratch array per OpenMP thread.
> This looks like
>
> #pragma omp parallel
> {
>     ... initialise array ...
> #pragma omp for
>    for (...) {
>      ... loop ...
>    }
> }
>
> I'm repeating what you wrote since the inner "omp for" should not (!)
> have a "parallel" directive, since it is already located in a parallel
> region. Depending on how you clean up the scratch array, you could
> also look at the "nowait" directive for the "omp for" block.
>
> If it is expensive to set up the scratch array, then you can cache
> this. Remember that the number of OpenMP threads can change during
> execution. Even if you do not want to handle this case, you should
> still detect it and abort. I typically use a C++ vector for the
> thread-specific data structures.
>
> vector<foo> scratch;
>
> And the loop that uses this:
>
> #pragma omp parallel
> {
> #pragma omp single
>    if (scratch.size() < omp_get_num_threads()) {
>      ... call initialization routine ...
>    }
> #pragma omp for
>    for (...) {
>      ... loop ...
>    }
> }
>
> The "omp single" region is executed by exactly one thread, and all
> other threads will wait until this thread has finished, ensuring the
> scratch data structure is correctly set up.
>
> -erik
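Erik's cached-scratch pattern above, fleshed out into a compilable sketch. The names Scratch and compute, and the dummy loop body, are placeholders of mine, not from the actual code under discussion:

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch compiles without OpenMP.
static int omp_get_num_threads() { return 1; }
static int omp_get_thread_num()  { return 0; }
#endif

struct Scratch { std::vector<double> buf; };  // hypothetical per-thread scratch

std::vector<Scratch> scratch;  // one entry per OpenMP thread, cached across calls

void compute(std::size_t npoints) {
  #pragma omp parallel
  {
    // Exactly one thread (re)sizes the vector; the implicit barrier at the
    // end of the "single" construct guarantees every thread sees the fully
    // initialised scratch before entering the loop.
    #pragma omp single
    if (scratch.size() < (std::size_t)omp_get_num_threads())
      scratch.resize(omp_get_num_threads(), Scratch{std::vector<double>(npoints)});

    Scratch &mine = scratch[omp_get_thread_num()];  // this thread's private slot
    #pragma omp for
    for (std::size_t i = 0; i < npoints; ++i)
      mine.buf[i % mine.buf.size()] = double(i);  // dummy work in private scratch
  }
}
```

If each thread also cleans up its own slot after the loop, a nowait clause on the "omp for" can avoid the barrier there, as Erik mentions.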
>
>
>
>
>
> On Wed, Mar 28, 2018 at 4:24 AM, Chris Stevens <c.stevens at ru.ac.za> wrote:
>> Hi everyone,
>>
>> I am new to OpenMP programming, and have quickly come across an (I think)
>> interesting task that I want to parallelise with OpenMP but that does not
>> seem straightforwardly generalisable. I would like to hear your feedback on
>> the potential solutions and whether there is a better way to do it:
>>
>> Task: Create a scratch array, say of CCTK_REAL, during an initialisation
>> phase that is shared with other code using a header file. This array is
>> defined only once. This scratch array is then used as temporary storage
>> space in a function that does some computation within a for loop. i.e.
>>
>> function blah ()
>> {
>>     for (....)
>>     {
>>         computation involving scratch array
>>     }
>> }
>>
>> This works fine when OpenMP is turned off. However, when you add #pragma
>> omp parallel for to the for loop above, the single scratch array is used
>> by multiple threads at once, causing data races.
>>
>> Solution 1: Each thread has its own scratch space. I am not sure whether it
>> is within the OpenMP paradigm to create scratch spaces for each thread in
>> the initialisation file and share them through a header. This doesn't seem
>> to work, and I suspect it isn't possible?
>>
>> Solution 2: Create one serial scratch space during initialisation that has
>> size dependent on omp_get_num_threads(). One would then have to change the
>> computation involving the scratch array to be dependent on
>> omp_get_thread_num() so as to use its allocated part of the scratch space.
>>
>> Solution 3: One could do something like
>>
>> #pragma omp parallel
>> {
>>     create scratch space
>>
>>     #pragma omp parallel for
>>     for (...)
>>     {
>>         computation involving thread specific scratch space
>>     }
>> }
>>
>> In summary: Solution 1 doesn't seem to work and probably doesn't fit into
>> the OpenMP paradigm. Solution 2 would work but doesn't seem very nice.
>> Solution 3 is nicer but inefficient as I don't want to be creating scratch
>> spaces all the time.
>>
>> Is there perhaps another way that fits better with the OpenMP paradigm?
>>
>> Many thanks!
>>
>> Chris
>>
>
>
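For completeness, Solution 2 from the original message (one flat shared array partitioned by thread number) might look like the following sketch; the names and the fixed per-thread chunk size are my assumptions:

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch compiles without OpenMP.
static int omp_get_max_threads() { return 1; }
static int omp_get_thread_num()  { return 0; }
#endif

std::vector<double> flat_scratch;  // shared scratch, sized once at initialisation
std::size_t chunk = 0;             // length of each thread's slice

// Initialisation phase: size for the worst-case thread count, which
// omp_get_max_threads() reports even outside a parallel region.
void init_flat_scratch(std::size_t per_thread) {
  chunk = per_thread;
  flat_scratch.assign(std::size_t(omp_get_max_threads()) * chunk, 0.0);
}

// Compute phase: each thread touches only its own slice, so no races.
void compute_flat(std::size_t npoints) {
  #pragma omp parallel for
  for (std::size_t i = 0; i < npoints; ++i) {
    double *mine = &flat_scratch[std::size_t(omp_get_thread_num()) * chunk];
    mine[i % chunk] = double(i);  // dummy work confined to this thread's slice
  }
}
```

As Erik notes, the thread count can change at run time, so a check that the current team still fits the allocation is worthwhile before each use.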

-- 
Dr Chris Stevens

Claude Leon Postdoctoral Fellow

Department of Mathematics

Rhodes University

Room 5

Ph: +27 46 603 8932

Web: www.chrisdoesmaths.com


