[Users] ET test failures on Stampede

Ian Hinder ian.hinder at aei.mpg.de
Wed Nov 5 04:04:29 CST 2014


On 31 Oct 2014, at 09:52, Ian Hinder <ian.hinder at aei.mpg.de> wrote:

> 
> On 30 Oct 2014, at 21:55, Erik Schnetter <schnetter at cct.lsu.edu> wrote:
> 
>> Ian
>> 
>> This new MPI version leads to problems running the benchmarks, and
>> runs at half the speed. (This test was on a single node.)
> 
> Ouch. I didn't notice that with my runs; either I wasn't paying attention or it didn't happen there.  I will check the next time I run on Stampede.  

In my production simulations, I see a speed drop of 20% when going from mvapich2/1.9 (the current simfactory and Stampede default) to mvapich2-x/2.0b, as suggested by the TACC admins.  However, since the original simulations were hanging with 1.9, it's not clear which is the better choice!

Looking at the timers, prolongation takes 868.4s with 1.9 and 1313.6s with 2.0b.  Sync is about the same speed with both, as are the computational functions.  The -x suffix on the MVAPICH2 version seems to indicate that it supports the MICs; maybe there is some trade-off being made there.
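As a quick sanity check on those timer numbers (a trivial sketch, just restating the values above), the prolongation step alone is roughly 50% slower with 2.0b, which is consistent with an overall 20% drop once the unchanged sync and computation times are folded in:

```python
# Prolongation timer values reported above for the two MPI builds.
t_mvapich2_19 = 868.4     # seconds with mvapich2/1.9
t_mvapich2x_20b = 1313.6  # seconds with mvapich2-x/2.0b

# Fractional increase in wall time for the prolongation step.
slowdown = t_mvapich2x_20b / t_mvapich2_19 - 1.0
print(f"prolongation is {slowdown:.0%} slower with mvapich2-x/2.0b")
```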

> 
>> 
>> -erik
>> 
>> On Thu, Oct 30, 2014 at 11:34 AM, Ian Hinder <ian.hinder at aei.mpg.de> wrote:
>>> 
>>> On 30 Oct 2014, at 15:03, Erik Schnetter <schnetter at cct.lsu.edu> wrote:
>>> 
>>>> I've begun to run the automated tests for the ET on our production
>>>> machines. Things look very good almost everywhere, except on Stampede,
>>>> one of the machines that is most important to us. It seems that there
>>>> are many test failures for GRHydro, and these seem to be caused by
>>>> segfaults. Does anybody volunteer to investigate?
>>> 
>>> I don't know anything about the problems with GRHydro.
>>> 
>>> I was having problems a while back with the current simfactory default version of mvapich2, and TACC support suggested I try the mvapich2-x version.  The problem I saw was that MPI reductions would hang.  I have been using that version for a few months with no problems.  The required change to the optionlist is:
>>> 
>>> 
>>> < MPI_DIR  = /opt/apps/intel13/mvapich2/1.9
>>> ---
>>> > MPI_DIR  = /home1/apps/intel13/mvapich2-x/2.0b
>>> > MPI_LIB_DIRS = /home1/apps/intel13/mvapich2-x/2.0b/lib64
>>> 
>>> 
>>> At least, this was before the changes to the MPI thorn.  It's possible that this is no longer enough.
>>> 
>>> Should we change to this version in simfactory for the release?
>>> 
>>> --
>>> Ian Hinder
>>> http://numrel.aei.mpg.de/people/hinder
>>> 
>> 
>> 
>> 
>> -- 
>> Erik Schnetter <schnetter at cct.lsu.edu>
>> http://www.perimeterinstitute.ca/personal/eschnetter/
> 
> -- 
> Ian Hinder
> http://numrel.aei.mpg.de/people/hinder
> 
> _______________________________________________
> Users mailing list
> Users at einsteintoolkit.org
> http://lists.einsteintoolkit.org/mailman/listinfo/users

-- 
Ian Hinder
http://numrel.aei.mpg.de/people/hinder
