[Users] CarpetIOHDF5 recover failure with manual topology

Yosef Zlochower yosef at astro.rit.edu
Tue Sep 10 15:03:35 CDT 2019


There appear to be multiple issues. The parfile I sent before tests for 
NaNs in grid::x, which is not a checkpointed variable. With manual 
topology, grid::x is filled with NaNs during the recover step (its 
pointer actually points to a new area of memory). With standard 
topology, the array pointer and contents do not change on recover. I 
have also seen NaNs in the recovered variables themselves, but this 
parfile doesn't show that.
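
For anyone who wants to reproduce this kind of check, here is a minimal 
sketch in plain C of the comparison described above: record where an 
array lives and how many NaNs it holds before recovery, then again 
afterwards. It uses no Cactus or Carpet API. The names array_snapshot, 
count_nans, take_snapshot and storage_moved are hypothetical helpers 
made up for illustration, and obtaining the pointer to grid::x inside a 
thorn is left to the caller.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: where an array lived and how many NaNs it held. */
typedef struct {
  uintptr_t address;   /* address of the first element */
  size_t    nan_count; /* number of NaN entries found */
} array_snapshot;

/* Count the NaN entries among the first npoints values of data. */
size_t count_nans(const double *data, size_t npoints)
{
  assert(data != NULL);
  size_t count = 0;
  for (size_t i = 0; i < npoints; ++i) {
    if (isnan(data[i])) {
      ++count;
    }
  }
  return count;
}

/* Capture the address and NaN count of an array at one moment. */
array_snapshot take_snapshot(const double *data, size_t npoints)
{
  array_snapshot snap = { (uintptr_t)data, count_nans(data, npoints) };
  return snap;
}

/* Return 1 if the array's storage address differs between snapshots. */
int storage_moved(const array_snapshot *before, const array_snapshot *after)
{
  return before->address != after->address;
}
```

Taking one snapshot before the recover step and one after it, a changed 
address together with a nonzero nan_count would match the behaviour I 
see with manual topology.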



On 9/9/19 4:24 PM, Yosef Zlochower wrote:
> Hi,
> 
>    I have been trying to debug why some runs I was performing could not 
> recover from a checkpoint file, but would otherwise proceed as normal.
> 
> I attached a minimal parfile showing the problem. A small grid is 
> manually distributed over 8 processors and the run terminates at 
> iteration 2. An attempt to recover fails with NaNs in grid::x. If the 
> manual topology section is commented out, no problems are seen.
> 
> 
> 

