[Users] OpenMP is making it slower?

Scott Hawley scott.hawley at belmont.edu
Wed May 18 16:37:28 CDT 2011


Erik, Frank, Peter: Thanks guys.  I will pursue your suggestions.  


--
Scott H. Hawley, Ph.D. 	 		Asst. Prof. of Physics                            
Chemistry & Physics Dept       		Office: Hitch 100D             
Belmont University                 	Tel:  +1-615-460-6206
Nashville, TN 37212 USA           	Fax: +1-615-460-5458
PGP Key at http://sks-keyservers.net



On May 18, 2011, at 12:14 AM, Peter Diener wrote:

> Hi Scott,
> 
> 
> On Tue, 17 May 2011, Frank Loeffler wrote:
> 
>> Hi,
>> 
>> On Tue, May 17, 2011 at 03:02:00PM -0700, Scott Hawley wrote:
>>> Do these all need to be declared as private?
>> 
>> If the temporary variables are declared only inside the loop they are
>> automatically thread-local. Oh wait, that is Fortran. Well - in that
>> case you should either declare them all private, or (maybe easier) put
>> the include files into separate functions, declare the temporary
>> variables only there and call the functions from within the loop, in
>> which case they also don't have to be specified for OpenMP (as long as
>> they are not static).
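
A minimal sketch of the two options Frank describes, written in C as a stand-in for the Fortran routine. All names here (`t1`, `t2`, `rhs_point`, `fill_private`, `fill_helper`) are hypothetical and not taken from the code under discussion. `fill_private` keeps the temporaries at routine scope and lists them in a `private` clause; `fill_helper` moves them into a separate function so that every call gets its own copies.

```c
#include <assert.h>
#include <stddef.h>

/* Option 2: the temporaries live inside a helper function. They are
   automatic (non-static) variables, so each call, and therefore each
   thread, works on its own copies. */
static double rhs_point(double u)
{
    double t1 = u * u;    /* hypothetical temporary */
    double t2 = 2.0 * u;  /* hypothetical temporary */
    return t1 + t2;
}

/* Option 1: the temporaries are declared at routine scope, as they
   would be in Fortran, and are listed explicitly as private. The loop
   index of the parallel loop is private automatically. */
void fill_private(const double *u, double *out, int n)
{
    double t1, t2;
    int i;

    assert(u != NULL && out != NULL);

#pragma omp parallel for private(t1, t2)
    for (i = 0; i < n; i++) {
        t1 = u[i] * u[i];
        t2 = 2.0 * u[i];
        out[i] = t1 + t2;
    }
}

/* Option 2 in use: no data-sharing clause is needed for the temporaries. */
void fill_helper(const double *u, double *out, int n)
{
    int i;

#pragma omp parallel for
    for (i = 0; i < n; i++)
        out[i] = rhs_point(u[i]);
}
```

Both functions should produce the same output array. Without the `private` clause in `fill_private`, all threads would share one `t1` and one `t2` and could overwrite each other's values.
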
>> 
>>> I certainly don't want the various processors overwriting each other's
>>> work, which might be what they're doing. -- maybe they're even
>>> generating NaNs which would slow things down a bit!
> 
> Alternatively you may use the DEFAULT(PRIVATE) clause, so that you only
> have to specify the shared variables. However, in that case you have to
> make sure to really declare all the shared variables as shared, since
> otherwise every thread will have to allocate its own copy, and if they
> are 3D variables this will slow down the code and increase memory
> consumption. Also, private variables have undefined values on entry to
> the parallel region, so not declaring all shared variables properly can
> also adversely affect the result. So be careful.
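
A rough sketch of the bookkeeping Peter describes, again in C. Note the substitution: to my knowledge older OpenMP versions offer DEFAULT(PRIVATE) only in Fortran, so this sketch uses `default(none)` instead (check this against your compiler's OpenMP version). `default(none)` makes the compiler reject any variable that is not listed explicitly, which enforces the same discipline of deciding for every variable whether it is shared or private. The function `scale` and its arguments are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply every grid point by a constant factor (hypothetical example).
   default(none) forces every variable used in the region to appear in a
   data-sharing clause:
     - grid, n, factor are shared: they are only read, or written at
       distinct indices, and must keep their values from before the region;
     - tmp is private: each thread assigns it before reading it, which
       matters because a private variable has no defined value on entry. */
void scale(double *grid, int n, double factor)
{
    double tmp;
    int i;

    assert(grid != NULL);

#pragma omp parallel for default(none) shared(grid, n, factor) private(tmp)
    for (i = 0; i < n; i++) {
        tmp = factor * grid[i];
        grid[i] = tmp;
    }
}
```

Had `factor` been made private by mistake, each thread would read an uninitialized value. That is the kind of wrong result the warning above refers to.
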
> 
>> You should see that in the results though. It might make sense to first
>> make sure that the results with different numbers of threads are the
>> same (depending on the problem you might actually get bit-by-bit
>> identical results), and work on optimization later. I agree that your
>> slow-down actually points towards some bug.
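
One way to run the check Frank suggests, sketched in C with hypothetical names (`update`, `same_bits`): run the same loop with different thread counts and compare the outputs byte for byte. The loop body is a made-up stand-in for the real update.

```c
#include <assert.h>
#include <string.h>

/* Apply a made-up pointwise update using the requested number of threads.
   Each iteration writes only out[i], so the iterations are independent. */
void update(const double *u, double *out, int n, int nthreads)
{
    int i;

    assert(nthreads > 0);

#pragma omp parallel for num_threads(nthreads)
    for (i = 0; i < n; i++)
        out[i] = 0.5 * u[i] * u[i] + 1.0;
}

/* Return 1 if the two arrays are bit-by-bit identical, 0 otherwise. */
int same_bits(const double *a, const double *b, int n)
{
    return memcmp(a, b, (size_t)n * sizeof(double)) == 0;
}
```

For example, call `update` once with `nthreads = 1` and once with `nthreads = 4`, then pass both outputs to `same_bits`. Independent pointwise loops like this one should match exactly. Loops with reductions may differ in the last bits because the summation order changes with the thread count, so a small tolerance may be more appropriate there.
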
>> 
>> Frank
> 
> Cheers,
> 
>   Peter
> 
