[Users] CarpetIOHDF5 recover failure with manual topology

Haas, Roland rhaas at illinois.edu
Fri Sep 13 14:21:20 CDT 2019


Hello all,

map=51 seems like a bug to me (access to uninitialized variables most
likely) since the "map" is Llama's coordinate patch (with the Cartesian
patch being map 0).

Unless you really have at least 52 patches (and are using Llama), this
seems like a bug.
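
To illustrate why a stray map value matters, here is a minimal sketch. It
does not use Carpet's actual code: Region, make_region and this
level_did_change are hypothetical, simplified stand-ins for region_t and
the check in gh::recompose. The sketch assumes the comparison looks at
every field, so a garbage map alone makes the level look changed. Giving
map a default member initializer is one possible way to keep it from
being left indeterminate.

```cpp
#include <array>
#include <vector>

// Hypothetical, simplified stand-in for Carpet's region_t (not the real class).
struct Region {
  std::array<int, 3> lower{};  // lower corner of the extent
  std::array<int, 3> upper{};  // upper corner of the extent
  int map = 0;                 // default initializer: 0 is the Cartesian patch
  int processor = -1;          // -1 means "not yet assigned" in this sketch
};

// Two regions are equal only if every field matches, including map.
bool operator==(const Region &a, const Region &b) {
  return a.lower == b.lower && a.upper == b.upper && a.map == b.map &&
         a.processor == b.processor;
}

// Hypothetical helper that builds a region with all fields set explicitly.
Region make_region(std::array<int, 3> lower, std::array<int, 3> upper,
                   int map, int processor) {
  Region r;
  r.lower = lower;
  r.upper = upper;
  r.map = map;
  r.processor = processor;
  return r;
}

// Assumed analogue of the level_did_change check: the level counts as
// changed if the old and new region lists differ in any field.
bool level_did_change(const std::vector<Region> &old_regions,
                      const std::vector<Region> &new_regions) {
  return !(old_regions == new_regions);
}
```

With the extents from the report, a new region that differs from the old
one only by map=51 makes level_did_change return true; once map is set
back to 0 the two lists compare equal.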

Yours,
Roland

> Do you want to submit a PR?
> 
> --Steve
> 
> On 9/13/2019 9:27 AM, Yosef Zlochower wrote:
> > Thanks. The issue seems to be that with manual topology a region_t 
> > structure has its map entry incorrectly set.
> >
> > What happens is that in
> >
> > bool gh::recompose there is the check
> >   bool const do_recompose = level_did_change(rl);
> >
> > In level_did_change, the level is considered to change because
> >
> > the new region_t is
> >
> > region_t(extent=([41,0,0]:[80,10,10]:[1,1,1]/[41,0,0]:[80,10,10]/[40,11,11]/4840),outer_boundaries=[[0,1,1],[1,1,1]],map=51,processor=1) 
> >
> >
> > while the old is
> > region_t(extent=([41,0,0]:[80,10,10]:[1,1,1]/[41,0,0]:[80,10,10]/[40,11,11]/4840),outer_boundaries=[[0,1,1],[1,1,1]],map=0,processor=1)
> >
> > The only difference is the new map is 51.
> >
> > If I add a line to Carpet/src/Recompose.cc:SplitRegions_AsSpecified
> > to force the map entry to be zero, then all seems to work.
> >
> >
> > Without the change, Carpet recomposes the grid but never calls the 
> > postregrid functions. Hence the NaNs in grid::x.
> >
> >
> >
> > On 9/12/19 2:37 PM, Steven R. Brandt wrote:  
> >> I said on the call there was an easy way to trace what function call you
> >> are in...
> >>
> >> Add this to your thornlist...
> >>
> >> !TARGET  = $ARR
> >> !TYPE = git
> >> !URL = https://github.com/stevenrbrandt/ReadWriteDiagnostics.git
> >> !REPO_PATH=$2
> >> !CHECKOUT =
> >> ReadWriteDiagnostics/FCall
> >>
> >> Then add FCall to your ActiveThorns and you'll see a message printed
> >> before and after each scheduled function.
> >>
> >> --Steve
> >>
> >> On 9/10/2019 3:03 PM, Yosef Zlochower wrote:  
> >>> It seems that there may be multiple issues. The parfile I sent before
> >>> tests for NaNs in grid::x. grid::x is not a checkpointed variable. It
> >>> seems that with manual topology, grid::x is filled with NaNs during
> >>> the recover step (the pointer is actually pointing to a new area of
> >>> memory). With standard topology, the array pointer and contents do not
> >>> change on recover. I have also seen NaNs in the recovered variables,
> >>> but this parfile doesn't show that.
> >>>
> >>>
> >>>
> >>> On 9/9/19 4:24 PM, Yosef Zlochower wrote:  
> >>>> Hi,
> >>>>
> >>>>      I have been trying to debug why some runs I was performing
> >>>> could not recover from a checkpoint file, but would otherwise
> >>>> proceed as normal.
> >>>>
> >>>> I attached a minimalist parfile showing the problem. A small grid is
> >>>> manually distributed over 8 processors and terminates at iteration 2.
> >>>> An attempt at recovery fails with NaNs in grid::x. If the manual
> >>>> topology section is commented out, no problems are seen.
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Users mailing list
> >>>> Users at einsteintoolkit.org
> >>>> http://lists.einsteintoolkit.org/mailman/listinfo/users
> >>>>  



-- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu .