[Users] Problem with CarpetRegrid2/AMR

Erik Schnetter schnetter at cct.lsu.edu
Thu Sep 1 10:37:02 CDT 2011


On Thu, Sep 1, 2011 at 10:53 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> On Tue, 2011-08-30 at 21:06 -0400, Erik Schnetter wrote:
>> On Tue, Aug 30, 2011 at 5:28 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> > Could I also decrease the block size? I currently have
>> > CarpetRegrid2::adaptive_block_size = 4, could it be smaller than that?
>> > Is there a restriction based on the number of ghost points?
>>
>> Yes, you can reduce the block size. I assume that both the regridding
>> operation and the time evolution will become slower if you do that,
>> because more blocks will have to be handled.
>
> Regardless of what I do, once we get past the first coarse time step,
> the program seems to "hang" at "INFO (Carpet): [ml=0][rl=0][m=0][tl=0]
> Regridding map 0...".
>
> Overall, the time is spent in dh::regrid(do_init=true). Most of it goes
> to bboxset<int, 3>::normalize() and, specifically, to the loop:
> for (typename bset::iterator nsi = nbs.begin(); nsi != nbs.end(); ++nsi).
> The normalize() function does exit, however, so it is not hanging
> in that function.
>
> The core problem seems to be that it takes a long time to execute
> boxes = boxes.shift(-dir) - boxes;
> in dh::regrid(do_init=true), probably because boxes has 129064 elements.
> The coarse grid is now only 30^3 and I've left the regrid box size at 4,
> so I'd think the coarse grid should have at most 30^3/4^3 ~ 420
> refinement regions.
>
> What is the best way to figure out what is going on?

Hal

Yes, this function is known to be slow, but I did not expect it to be
prohibitively slow. Are you compiling with optimisation enabled?
The bboxset represents the set of refined regions; internally it is a
list of bboxes (rectangular regions). Carpet performs set operations on
these (intersection, union, complement, etc.) to determine the
communication schedule, i.e. which ghost zones of which bbox need to be
filled from which other bbox. Unfortunately, the algorithm that builds
the schedule is O(n^2) in the number of refined regions, and the set
operations themselves, implemented via lists, are also O(n^2) in the
set size, leading to a rather unfortunate overall complexity. The only
cure is to reduce the number of bboxes (make them larger) and to regrid
less often.
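
To make the cost concrete, here is a minimal sketch of why list-based
set operations are quadratic (this is simplified illustration code, not
Carpet's actual bboxset implementation; the names bbox and intersect
are stand-ins): every box in one set is compared against every box in
the other.

  #include <algorithm>
  #include <list>

  struct bbox {
    int lo[3], hi[3];  // inclusive bounds per dimension
    bool intersects(const bbox &o) const {
      for (int d = 0; d < 3; ++d)
        if (hi[d] < o.lo[d] || o.hi[d] < lo[d]) return false;
      return true;
    }
    bbox operator&(const bbox &o) const {  // assumes intersects(o)
      bbox r;
      for (int d = 0; d < 3; ++d) {
        r.lo[d] = std::max(lo[d], o.lo[d]);
        r.hi[d] = std::min(hi[d], o.hi[d]);
      }
      return r;
    }
  };

  // O(n*m) pairwise comparisons per operation; with ~10^5 boxes this
  // dominates the run time.
  std::list<bbox> intersect(const std::list<bbox> &a,
                            const std::list<bbox> &b) {
    std::list<bbox> out;
    for (const bbox &x : a)
      for (const bbox &y : b)
        if (x.intersects(y)) out.push_back(x & y);
    return out;
  }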

We are running on several hundred MPI processes and are therefore
dealing with set sizes of several hundred in production mode. (Carpet's
domain decomposition splits boxes, so that each process receives a set
of boxes.) It may be that the irregularity of the AMR box distribution
leads to a much worse case than our domain decomposition produces.
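
For rough scale: with n ~ 500 boxes, a quadratic set operation costs
about 2.5 * 10^5 pairwise comparisons, which is harmless; with the
n ~ 1.3 * 10^5 boxes you report, it costs about 1.7 * 10^10
comparisons, which easily explains the apparent hang.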

We have implemented / are implementing set operations based on a tree
data structure, which should be much more efficient. This is the work
of Ashley Zebrowski, a graduate student at LSU, who just tried
(unsuccessfully -- a hurricane was in the way) to attend the ParCo 2011
conference in Belgium to present this work.
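
To illustrate the tree idea (the actual design in Ashley's code may
well differ; this reuses the bbox struct from the sketch above),
grouping the bboxes into a bounding-volume hierarchy lets a query skip
whole subtrees whose bounds miss the query box, so each intersection
query costs roughly O(log n + k) instead of O(n):

  #include <memory>
  #include <vector>

  struct node {
    bbox bounds;                        // covers everything below this node
    std::unique_ptr<node> left, right;  // null for leaves
    std::vector<bbox> boxes;            // leaf payload
  };

  // Collect the intersections of q with all stored boxes. Subtrees whose
  // bounding box misses q are pruned, so a query typically touches
  // O(log n + k) nodes instead of all n boxes.
  void query(const node *n, const bbox &q, std::vector<bbox> &out) {
    if (!n || !n->bounds.intersects(q)) return;
    for (const bbox &b : n->boxes)
      if (b.intersects(q)) out.push_back(b & q);
    query(n->left.get(), q, out);
    query(n->right.get(), q, out);
  }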

Frank, could you give a brief update of the status of Ashley's work?

-erik

-- 
Erik Schnetter <schnetter at cct.lsu.edu>   http://www.cct.lsu.edu/~eschnett/

