[Users] Crash due to CarpetLib (apparently)

Lorenzo Ennoggi lorenzo.ennoggi at gmail.com
Wed Nov 17 18:32:02 CST 2021


Dear Erik,
thank you very much for your help. I resubmitted the run and the error has
indeed disappeared: the run has now passed the time at which it previously
crashed, with no issues so far. I will continue and hope no further errors
occur. Unless I am forced to, I do not plan to change the grid structure:
I am comparing Spritz and IllinoisGRMHD and have already completed the run
with the latter code, so changing the grid structure in the Spritz run at
this point would make the comparison somewhat less fair than it is now.

I have just noticed that I forgot to actually attach the files in my
previous message: I'm attaching them for real now, even though they are
probably not useful anymore.

Thank you very much again.

Cheers,
Lorenzo


On Wed, Nov 17, 2021 at 15:16, Erik Schnetter <schnetter at gmail.com> wrote:

> Lorenzo
>
> Thank you for the detailed analysis of the error location. This helps a
> lot!
>
> This is likely caused by an integer division by zero. If you look at
> the first error
> "Cactus/arrangements/Carpet/CarpetLib/src/defs.hh:144", you see that
> this function calculates an integer modulo. Signal 8 is a floating
> point exception, which is also raised for integer math, in particular
> for division by zero.
>
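
To check what signal 8 means on a given system, a minimal Python sketch can
look up the signal name. The helper names below are hypothetical and are not
part of Cactus or Carpet. As background: in C and C++ an integer division or
modulo by zero is undefined behaviour, and on x86 Linux it usually traps so
that the kernel delivers SIGFPE, even though no floating-point operation is
involved.

```python
import signal


def signal_name(signum: int) -> str:
    """Return the symbolic name of a signal number on this platform."""
    return signal.Signals(signum).name


def shell_exit_status(signum: int) -> int:
    """Exit status that bash reports for a process killed by a signal.

    bash uses 128 + signal number; other shells may differ.
    """
    return 128 + signum


if __name__ == "__main__":
    # Signal number taken from the crash discussed in this thread.
    print(signal_name(8), shell_exit_status(8))
```

Whether a job scheduler reports the raw signal number or the shell-style
status depends on how the executable was launched.
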
> It seems that this happens during output, while determining which
> regions of the grid are output. The modulo operations there are
> usually used either for error checking or to determine whether coarse
> and fine grid points are aligned.
>
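
The kind of alignment test described above can be sketched as follows. This
is a hypothetical illustration, not Carpet's actual code and not a diagnosis
of this crash: it assumes that grid points are labelled by integer indices
and that a coarser level uses a larger integer stride. A zero stride would
make the modulo undefined, so the sketch rejects it explicitly.

```python
from typing import List


def is_aligned(index: int, stride: int) -> bool:
    """Return True if `index` falls on a grid with spacing `stride`."""
    if stride <= 0:
        # Guard the modulo: a zero divisor is the failure mode discussed above.
        raise ValueError(f"stride must be positive, got {stride}")
    return index % stride == 0


def aligned_points(lower: int, upper: int,
                   fine_stride: int, coarse_stride: int) -> List[int]:
    """Fine-grid indices in [lower, upper] that coincide with coarse points."""
    if fine_stride <= 0:
        raise ValueError(f"fine_stride must be positive, got {fine_stride}")
    return [i for i in range(lower, upper + 1, fine_stride)
            if is_aligned(i, coarse_stride)]
```

For example, `aligned_points(0, 8, 1, 2)` selects every second fine point,
while `is_aligned(4, 0)` raises a `ValueError` instead of dividing by zero.
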
> I don't know what would cause this problem. It could be that the error
> goes away if you try again. It might also be caused by a weird grid
> structure. Carpet has a lot of checks to ensure that the grid
> structure is reasonable, but these checks seem to fail sometimes. If
> that is the case, then a minor change to the grid structure (making
> refined regions slightly larger or smaller) might avoid the error. It
> might also be that changing the number of MPI processes helps, since
> this would change the domain decomposition of the grid structure.
>
> To investigate further we would need to know the grid structure at the
> time when the error occurs.
>
> -erik
>
> On Wed, Nov 17, 2021 at 2:30 PM Lorenzo Ennoggi
> <lorenzo.ennoggi at gmail.com> wrote:
> >
> > Hi,
> > one of my BNS simulations with the Spritz code started from a checkpoint
> > and, after running for about one day, crashed with exit code 8. The error
> > seems to be related to the Einstein Toolkit infrastructure, so I am posting
> > this message on this mailing list. I am attaching the stdout and stderr
> > (even though they don't look very informative) and the backtrace.
> >
> > Running addr2line -e <Cactus executable> <address> with the <address>
> > listed at point 3 in the backtrace, I see that the error originates from
> > Cactus/arrangements/Carpet/CarpetLib/src/defs.hh:144 . In order to see how
> > we get to that point, I am listing here the files and line numbers
> > corresponding to points 4 to 11 in the backtrace (points 12 and 13 are not
> > relevant, I think):
> >
> > 4.   Cactus/arrangements/Carpet/CarpetIOHDF5/src/OutputSlice.cc:1087
> > 5.   Cactus/arrangements/Carpet/CarpetIOHDF5/src/OutputSlice.cc:562
> > 6.   Cactus/arrangements/Carpet/CarpetIOHDF5/src/OutputSlice.cc:469
> > 7.   Cactus/arrangements/Carpet/CarpetIOHDF5/src/OutputSlice.cc:356
> > 8.   Cactus/arrangements/Carpet/Carpet/src/OutputGH.cc:51
> > 9.   Cactus/arrangements/Carpet/Carpet/src/Evolve.cc:730
> > 10. Cactus/arrangements/Carpet/Carpet/src/Evolve.cc:703
> > 11. Cactus/src/main/flesh.cc:88
> >
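
The address lookup described above can be scripted when a backtrace has many
frames. The following is a minimal sketch: the executable path and addresses
passed to it are placeholders to be filled in from your own backtrace, and
the `-f` (print function names) and `-C` (demangle C++ names) flags of GNU
addr2line are additions that the original command did not use.

```python
import subprocess
from typing import List


def addr2line_command(executable: str, addresses: List[str]) -> List[str]:
    """Build the addr2line command line for a list of backtrace addresses."""
    # -e selects the executable; -f and -C are optional conveniences.
    return ["addr2line", "-e", executable, "-f", "-C", *addresses]


def resolve(executable: str, addresses: List[str]) -> str:
    """Run addr2line and return its text output (function and file:line)."""
    result = subprocess.run(
        addr2line_command(executable, addresses),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if addr2line fails
    )
    return result.stdout
```

The executable needs to have been built with debug information for the
file and line numbers to be resolved.
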
> > Visually, the 1D and 2D output does not show any obviously wrong
> > features, so I have no clue about what is going on. Do you have any ideas?
> >
> > I am also attaching the parameter file I am running with and the
> > optionlist I used to compile. Kindly let me know if I can provide further
> > info and/or attach any other file you may find useful.
> >
> > Thank you very much in advance for your help,
> > Lorenzo Ennoggi
>
>
>
> --
> Erik Schnetter <schnetter at gmail.com>
> http://www.perimeterinstitute.ca/personal/eschnetter/
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: output_crash.zip
Type: application/zip
Size: 756000 bytes
Desc: not available
Url : http://lists.einsteintoolkit.org/pipermail/users/attachments/20211117/cd7ff32d/attachment-0001.zip 