[ET Trac] [Einstein Toolkit] #1751: [Pull request: CactusUtils/WatchDog] new thorn to automatically terminate jobs that hang
Einstein Toolkit
trac-noreply at einsteintoolkit.org
Tue Mar 3 11:07:19 CST 2015
------------------------------------+---------------------------------------
Reporter: dradice@… | Owner: dradice@…
Type: enhancement | Status: assigned
Priority: optional | Milestone:
Component: EinsteinToolkit thorn | Version: development version
Resolution: | Keywords:
------------------------------------+---------------------------------------
Changes (by rhaas):
* owner: => dradice@…
* priority: unset => optional
* status: new => assigned
Comment:
I had a look through the code. At this point the thorn is not yet ready
to be included in the ET or Cactus, but it seems worthwhile to have
such a thorn. Comments, from most to least severe:
* the thorn needs documentation in the standard documentation.tex file,
this can be very short, I would think that the description in the pull
request plus the usual boilerplate text will be sufficient
* the thorn has no test case. However, since it would have to test for an
abort, a test case may be hard to design
* the thorn uses fprintf(stderr, ...) for both error and informational
messages. For informational messages (the "Everything is fine" message) it
could use CCTK_VInfo() in the main thread's ANALYSIS routine. The warnings
cannot be changed, since they are emitted by the secondary thread and
Cactus is not thread-safe.
* reading the man page for asctime_r, I do not think that the explicit zero
termination in lines 32 and 49 is required, since asctime_r null-terminates
its output (which is also guaranteed to fit within 26 characters).
* if possible, the thorn should check for the presence of PTHREADS.
Currently, since PTHREADS is a Cactus extra, the only way to do so seems
to be at compile time via:
{{{
/* Fail the build early if Cactus was configured without PTHREADS. */
#ifndef CCTK_PTHREADS
#error "WATCHDOG requires PTHREADS. Please set PTHREADS=yes in your option list."
#endif
}}}
* it may be interesting to make check_every STEERABLE=ALWAYS by resetting
it inside the ANALYSIS routine (and protecting access to it by the mutex).
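
The points about asctime_r and check_every could look roughly like the
plain C sketch below. It is only an illustration under assumptions: the
names wd_mutex, wd_check_every, wd_set_check_every, wd_get_check_every
and wd_timestamp are hypothetical and not taken from the thorn, the
default of 3600 seconds is made up, and the Cactus side (reading the
parameter in the ANALYSIS routine) is left out.

```c
#define _POSIX_C_SOURCE 200809L  /* for asctime_r and localtime_r */
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical shared state; these names are illustrative only. */
static pthread_mutex_t wd_mutex = PTHREAD_MUTEX_INITIALIZER;
static int wd_check_every = 3600;  /* assumed default, in seconds */

/* Main thread: would be called from the ANALYSIS routine to publish
   the current value of the steerable parameter to the watchdog. */
void wd_set_check_every(int seconds)
{
  pthread_mutex_lock(&wd_mutex);
  wd_check_every = seconds;
  pthread_mutex_unlock(&wd_mutex);
}

/* Watchdog thread: read the interval under the same mutex. */
int wd_get_check_every(void)
{
  pthread_mutex_lock(&wd_mutex);
  int seconds = wd_check_every;
  pthread_mutex_unlock(&wd_mutex);
  return seconds;
}

/* Write the current local time into buf, which must hold at least
   26 bytes. asctime_r null-terminates its output, so no explicit
   terminator is written here. Returns buf, or NULL on failure. */
char *wd_timestamp(char buf[26])
{
  assert(buf != NULL);
  time_t now = time(NULL);
  struct tm tm_now;
  if (localtime_r(&now, &tm_now) == NULL)
    return NULL;
  return asctime_r(&tm_now, buf);
}
```

In this sketch the watchdog thread would call wd_get_check_every() before
each sleep, so a value steered at run time takes effect on its next
iteration without the secondary thread ever touching Cactus itself.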
--
Ticket URL: <https://trac.einsteintoolkit.org/ticket/1751#comment:2>
Einstein Toolkit <http://einsteintoolkit.org>