[Users] GRHydro: NaN checks

Erik Schnetter schnetter at cct.lsu.edu
Mon Jul 19 14:47:32 CDT 2010


On Mon, Jul 19, 2010 at 9:04 AM, Christian D. Ott
<cott at tapir.caltech.edu> wrote:
>
> Hi,
>
> On Mon, Jul 19, 2010 at 08:43:00AM -0500, Frank Loeffler wrote:
>>
>> I agree with Erik that if they affect performance that much we should
>> protect them with #defines, but leave them in the code - even if this
>> clutters it. They are much easier to put back in by setting a #define
>> than by putting them all back in by hand. They can help quite a bit when
>> you know that there are NaNs produced somewhere in GRHydro and you want
>> to know quickly where and why.
>
> they don't do that.
>
> (a) They are not general -- i.e., in Prim2Con, there is a check on
> the Lorentz factor and on dens, but that's it (it's even redundant!).
> This test won't tell you if there is a nan in another var.
>
> (b) They are only in some parts of the code (i.e., the parts that
> the person who needed them was using).
>
> (c) They don't tell you 'quickly where and why'. If you get an error
>    message like:
>
>   call CCTK_WARN(GRHydro_NaN_verbose,
>     "c2p failed and being told not to reset the pressure")
>
>    Do you have any clue what happened? You are not even being told
>    where (what physical location; what reflevel).
>
> Of course, one could spend one's life making these messages better.
> But we are paid for doing science, not for writing foolproof code.
>
> I maintain that there is no point in having such messages.  They don't
> help -- if you catch yourself a NaN (like I did yesterday!), you still
> have to go in and do old-fashioned debugging to see what is going on.
>
> Since we are not reaching a conclusion via e-mail, I would like
> to discuss this in today's call.

For the record, I would like to list some of the possibilities
raised during today's phone call:

(1) There could be a macro CCTK_NANCHECK that checks a scalar variable
for nan, outputting a warning message.  This would be easy to disable,
and would be much cleaner than the current multi-line code for each
variable.

(2) The NaNChecker thorn could be used to generate a nanmask, either
automatically after each scheduled function (e.g. via Carpet's
CallFunction), or manually after each 3D loop.

(3) The thorn NaNCatcher can enable floating point exceptions, so that
the job aborts (or a debugger starts) as soon as any nan is encountered.
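
A rough sketch of what the macro in (1) might look like in C. Only the
name CCTK_NANCHECK comes from the call; no such macro exists in Cactus
today. The NANCHECK_DISABLE switch, the nancheck_count counter, and
the use of fprintf in place of a Cactus warning call are all
assumptions made for illustration.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical sketch: CCTK_NANCHECK is a proposal, not an existing
   Cactus macro. Define NANCHECK_DISABLE (assumed name) at compile
   time to turn every check into a no-op. */
static int nancheck_count = 0;  /* number of NaNs reported so far */

#ifndef NANCHECK_DISABLE
#define CCTK_NANCHECK(var)                                   \
  do {                                                       \
    if (isnan(var)) {                                        \
      ++nancheck_count;                                      \
      /* report source location and the variable's name */   \
      fprintf(stderr, "%s:%d: NaN in variable '%s'\n",       \
              __FILE__, __LINE__, #var);                     \
    }                                                        \
  } while (0)
#else
#define CCTK_NANCHECK(var) do { } while (0)
#endif
```

Each check then becomes a single line, e.g. CCTK_NANCHECK(dens) after
the conversion in Prim2Con. A real version would presumably also
report the grid location and refinement level, which this sketch
does not attempt.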

These approaches do not work well at the moment, because there are
coarse grid points which do not contribute to the overall solution,
but it is not obvious which coarse grid points these are.  Carpet
needs to provide a mask describing which grid points remain "unused"
after restriction.
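
To illustrate why such a mask matters for approach (2), here is a
rough sketch of a nanmask loop over one 3D variable that skips points
flagged as unused. This is not the NaNChecker thorn's actual code. The
function name fill_nanmask and the "unused" array are hypothetical;
the latter stands in for the mask that Carpet does not yet provide.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical sketch. Sets nanmask[idx] = 1 where var[idx] is NaN
   and the point is in use, 0 otherwise. `unused` is an assumed mask:
   nonzero marks points that do not contribute to the solution; pass
   NULL to check every point. Returns the number of NaNs flagged. */
static size_t fill_nanmask(const double *var,
                           const unsigned char *unused,
                           unsigned char *nanmask,
                           int ni, int nj, int nk)
{
  size_t count = 0;
  for (int k = 0; k < nk; ++k)
    for (int j = 0; j < nj; ++j)
      for (int i = 0; i < ni; ++i) {
        /* linear index with i varying fastest */
        const size_t idx = i + (size_t)ni * (j + (size_t)nj * k);
        const int skip = unused != NULL && unused[idx];
        nanmask[idx] = (!skip && isnan(var[idx])) ? 1 : 0;
        count += nanmask[idx];
      }
  return count;
}
```

Without the unused mask (NULL above), a nan on a coarse grid point
that plays no role in the solution is reported just like a real one,
which is the problem described above.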

-erik

-- 
Erik Schnetter <schnetter at cct.lsu.edu>   http://www.cct.lsu.edu/~eschnett/

