[Users] logic of scheduling SelectBoundConds in McLachlan?
ian.hinder at aei.mpg.de
Mon Feb 18 14:40:59 CST 2013
On 18 Feb 2013, at 21:11, "Kelly, Bernard J. (GSFC-660.0)[UNIVERSITY OF MARYLAND BALTIMORE COUNTY]" <bernard.j.kelly at nasa.gov> wrote:
> [re-sent, with smaller attachment]
> Hi Roland, and thanks for your reply. I'm still a bit confused, I confess
> (see below) ...
> On 2/16/13 12:38 AM, "Roland Haas" <roland.haas at physics.gatech.edu> wrote:
>> Hello Bernard,
>>> Hi. Can someone explain to me why ML_BSSN calls SelectBoundConds within
>>> MoL_PostStep? It seems like the kind of once-off routine that would
>>> run near the start of a simulation, rather than something that has to be
>>> performed every single timestep.
>> In the "new" boundary/symmetry interface (i.e. using thorns Boundary and
>> Symbase) one has to Select the variables for boundaries each time before
>> ApplyBCs is scheduled. There is a routine in Boundaries that clears the
> OK; but can you tell me *why* this is? Why should we ever have to
> re-specify what kind of boundary conditions we use during a simulation,
> any more than we re-specify the evolution equations? Perhaps I don't
> really understand what "select the variables" means here.
Take a look at the documentation for thorn Boundary (http://einsteintoolkit.org/documentation/ThornDoc/CactusBase/Boundary/documentation.html). The "selection" process is part of the API of the thorn. Maybe one of the old-timers can explain why it was done this way.
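To illustrate what Roland describes, here is a toy model in Python of a select/apply cycle. It is only a sketch and not the Boundary thorn's actual code: the class ToyBoundary, its methods, and the group and boundary condition names are all hypothetical, and it assumes (as Roland's message suggests) that the list of selected variables is cleared once the boundary conditions have been applied.

```python
class ToyBoundary:
    """Toy stand-in for a select/apply boundary interface (hypothetical)."""

    def __init__(self):
        self.selected = []   # (group, bc_name) pairs registered for the next apply
        self.applied = []    # log of what each apply_bcs call acted on

    def select(self, group, bc_name):
        # Register a group for the next apply_bcs call only.
        self.selected.append((group, bc_name))

    def apply_bcs(self):
        # Act on whatever is currently selected, then forget the selection.
        self.applied.append(list(self.selected))
        self.selected.clear()   # assumption: selections do not survive an apply


boundary = ToyBoundary()
boundary.select("MyThorn::my_group", "flat")   # hypothetical names
boundary.apply_bcs()   # acts on the one selected group
boundary.apply_bcs()   # nothing is selected any more, so nothing is applied
```

Under that assumption, a selection made once at startup would only affect the first application, which is consistent with the selection routine being scheduled before every ApplyBCs.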
>>> I wouldn't mind, but while trying to understand why ML_BSSN was evolving
>>> so slowly on one of our machines, I looked at the TimerReport files, and
>>> saw that SelectBoundConds was taking *much* more time (like 20 times as
>>> long) than the actual RHS calculation routines.
>> The long time is most likely caused by the fact that the boundary
>> selection routine tends to be the one calling SYNC which means it is the
>> one that does an MPI wait (if there is load imbalance) and communicates
>> data for buffer zone prolongation etc.
> So it might be spending most of the time waiting for other cores to catch
If you look at the timer output for just one process, you will almost certainly reach erroneous conclusions because of effects like this. I recommend looking at the output from all processes (yes, performance profiling is hard).
> But if it's really waiting for prior routines to finish on other
> processors, then on the handful of cores where SBC appears significantly
> *quicker* than usual (e.g. ~50,000 seconds instead of ~100,000) I should
> see earlier routines taking correspondingly *longer*, right? But I don't.
It may also be that timings change significantly from one iteration to the next. Have you set your CPU affinity settings correctly?
I recommend setting the parameters
Carpet::schedule_barriers = yes
Carpet::sync_barriers = yes
This will insert an MPI barrier before and after each scheduled function call and each sync. You can then rely on the timings of the individual functions, and also see how much time is spent waiting for other processes to catch up (i.e. in load imbalance). At the moment, the timers for functions which do communication include the time spent waiting for the other processes to catch up.
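The effect of the barriers on the timer readings can be sketched with a toy model in Python. This is not Carpet code: the function timings and the numbers are made up for illustration, and the model assumes that a communicating routine cannot finish until the slowest process has reached it.

```python
def timings(compute, comm, barriers):
    """Return the timer readings each process would report (toy model).

    compute:  seconds each process spends in its local computation
    comm:     seconds of actual communication in the sync routine
    barriers: if True, a barrier separates the computation from the sync
    """
    slowest = max(compute)
    report = []
    for c in compute:
        wait = slowest - c   # idle time until the slowest process arrives
        if barriers:
            # The wait is recorded against the barrier, not the sync routine.
            report.append({"compute": c, "barrier": wait, "sync": comm})
        else:
            # Without a barrier the wait is hidden inside the sync timer.
            report.append({"compute": c, "barrier": 0.0, "sync": wait + comm})
    return report


# Two processes with unequal work (illustrative numbers only).
no_barriers = timings([10.0, 4.0], comm=1.0, barriers=False)
with_barriers = timings([10.0, 4.0], comm=1.0, barriers=True)
```

In this model the faster process reports 7 seconds in the sync routine without barriers, but 1 second of sync plus 6 seconds of barrier wait with them, so the load imbalance becomes visible as a separate entry.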
> I'm attaching TimerReport files for two cores on the same (128-core)
> evolution. Core 000 is typical. Line 184 (the most up-to-date instance of
> "large" SBC behaviour) shows about 100K seconds spent cumulatively over
> the simulation so far. Core 052 shows only about half as much time used in
> the same routine, but I can't see what other EVOL routines might be taking
> up the slack.
> (Note, BTW, that what I'm running isn't vanilla ML_BSSN, but a locally
> modified version called MH_BSSN. The scheduling and most routines are
> almost identical to McLachlan)