[ET Trac] [Einstein Toolkit] #2211: gh::regrid contains code whose runtime is quadratic in number of components

Einstein Toolkit trac-noreply at einsteintoolkit.org
Thu Dec 6 12:45:00 CST 2018


#2211: gh::regrid contains code whose runtime is quadratic in number of components
--------------------------+---------------------------------
  Reporter:  Roland Haas  |      Owner:  (none)
      Type:  defect       |     Status:  new
  Priority:  minor        |  Milestone:
 Component:  Carpet       |    Version:  development version
Resolution:               |   Keywords:
--------------------------+---------------------------------
Description changed by Roland Haas:

New description:

 The member function gh::regrid contains this code

 {{{
   // Check component consistency
   for (int ml = 0; ml < mglevels(); ++ml) {
     for (int rl = 0; rl < reflevels(); ++rl) {
       assert(components(rl) >= 0);
       for (int c = 0; c < components(rl); ++c) {
         ibbox const &b = extent(ml, rl, c);
         ibbox const &b0 = extent(ml, rl, 0);
         assert(all(b.stride() == b0.stride()));
         assert(b.is_aligned_with(b0));
         for (int cc = c + 1; cc < components(rl); ++cc) {
           assert((b & extent(ml, rl, cc)).empty());
         }
       }
     }
   }
 }}}

 which, because of the nested {{{c}}} and {{{cc}}} loops, is quadratic in
 the number of components (i.e. MPI ranks).

 For many (tens of thousands) components this becomes a dominant cost of
 the regrid operation.
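 A pairwise-disjointness check does not have to be quadratic. One
 alternative, sketched here for 1-D intervals as a stand-in for Carpet's
 ibboxes (the names {{{interval}}} and {{{all_disjoint}}} are illustrative,
 not Carpet API), sorts the boxes by lower bound and then compares only
 adjacent entries, for O(n log n) total:

 {{{
 #include <algorithm>
 #include <cassert>
 #include <utility>
 #include <vector>

 // Stand-in for one axis of an ibbox: the half-open interval [first, second).
 using interval = std::pair<int, int>;

 // Return true iff all intervals are pairwise disjoint.
 // Sorting costs O(n log n); the scan afterwards is linear, because once
 // the intervals are ordered by lower bound, any overlap must show up
 // between immediate neighbours.
 bool all_disjoint(std::vector<interval> v) {
   std::sort(v.begin(), v.end());
   for (std::size_t i = 1; i < v.size(); ++i)
     if (v[i].first < v[i - 1].second)
       return false;
   return true;
 }

 int main() {
   assert(all_disjoint({{0, 4}, {8, 12}, {4, 8}}));  // disjoint, any input order
   assert(!all_disjoint({{0, 5}, {4, 8}}));          // [4,5) is shared
   return 0;
 }
 }}}

 For full 3-D ibboxes the same idea becomes a sweep along one axis that
 prunes most candidate pairs in typical grid layouts.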

 The Carpet branch {{{rhaas/quadratic_regrid_time}}} contains a test thorn
 ArrayTest and a hacked version of Carpet that can simulate a number of MPI
 ranks using just one rank.

 One can run:
 {{{
 mpirun -n 1 exe/cactus_sim arrangements/Carpet/ArrayTest/par/array.par
 }}}
 and control the number of pretend ranks by setting {{{ArrayTest::size}}}
 in array.par.
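 For example, array.par would contain a line such as (the value is
 illustrative; 16384 matches the 16k-rank measurement below):
 {{{
 ArrayTest::size = 16384
 }}}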

 On my workstation regrid takes ~3.5s for 16k ranks with the consistency
 check and ~0.5s without.

 This cost is incurred once per grid array and once per grid scalar. So for
 a typical setup with ~400 grid arrays and grid scalars (regardless of
 whether they have storage), this amounts to roughly 400 * 3 s = 1200 s
 spent in the consistency check.

 This happens only once per simulation (since grid arrays and grid scalars
 are regridded only once), but even for a test simulation it can be quite
 significant at scale.

 The branch also arranges for the consistency check to be skipped when
 {{{CARPET_OPTIMISE}}} is defined.
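 In source form this presumably amounts to a compile-time guard of roughly
 this shape (a sketch, not the literal branch code):
 {{{
 #ifndef CARPET_OPTIMISE
   // Check component consistency (quadratic in the number of components)
   ...
 #endif
 }}}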

 I attach a gnuplot script that takes carpet-timing-statistics.0000.txt and
 shows how much time is spent in dh and gh regrid.

--

-- 
Ticket URL: <https://trac.einsteintoolkit.org/ticket/2211#comment:7>
Einstein Toolkit <http://einsteintoolkit.org>
The Einstein Toolkit

