[Users] Thorn ADM

Wed Dec 4 17:55:34 CST 2013

On 4 Dec 2013, at 17:57, Erik Schnetter <schnetter at cct.lsu.edu> wrote:

> On Dec 4, 2013, at 10:28 , Ian Hinder <ian.hinder at aei.mpg.de> wrote:
> 
>> On 3 Dec 2013, at 05:05, Frank Loeffler <knarf at cct.lsu.edu> wrote:
>> 
>>> Hi
>>> 
>>> I would like to take thorn ADM out of the Einstein Toolkit thornlist
>>> entirely. It's using ADMMacros (with all the problems that come with
>>> it), and all test cases use a static conformal metric, which isn't
>>> supported anymore and which currently makes them all fail as well.
>>> 
>>> I just committed a bunch of changes to testsuites of other thorns that
>>> were using ADM. Most of them were trivial; in one case I had to create a
>>> new testsuite using ML_BSSN instead. Now no testsuite of another thorn
>>> within the toolkit uses ADM. It's time to retire it. If somebody really
>>> would like to use ADM it would likely be better to look at ML_ADM
>>> instead (which isn't in the toolkit, but could probably be if there is
>>> interest).
>>> 
>>> Please object here loudly if you disagree. If we don't hear from you in
>>> some time, thorn ADM will be removed from the thornlist and will no
>>> longer be part of releases.
>> 
>> Hi Frank,
>> 
>> On what timescale will the thorn be removed?  As long as it is in the thornlist, its tests fail.  Having failing tests is bad, as it obscures new failures.  Since the test cases use the now-unsupported static conformal metric, I suggest that they are removed now, even if the thorn itself remains in the thornlist for some time.  In the very unlikely event that someone wants to resurrect the test cases, they are in the version history anyway.
> 
> [Slightly off-topic; this is a general argument, so please don't shoot it down by arguing about the current state of thorn ADM. Also, please don't dissect my argument paragraph by paragraph -- please respond to the whole argument.]
> 
> I believe that it is unrealistic to address all test case failures immediately. The only feasible way to do so would be a hard "revert" policy, where all changes that introduce test case failures are quickly reverted. Note that this means that changes that are harmless in and of themselves also need to be reverted if they happen to uncover failures that were hidden before. This is probably not what we want.
> 
> We can then either live with test cases that fail for extended periods of time (probably weeks at least) until the underlying issue has been addressed, or remove them. In some cases, there will even need to be some discussion on how to address the test case failures. Removing failing test cases that are not quickly corrected may also not be what we want to do generally; presumably, someone put effort into the test case.
> 
> What we need is thus a way to "disable" a test case so that it doesn't continue to fail and detract from real issues, such as a commit that introduces a genuine bug. We want to be able to "disable" a test case in such a way that it is still marked as "valid", for example as "valid in principle, but currently under construction".
> 
> Of course, marking a test case as "xfail" (expected to fail) can't be a knee-jerk reaction; this would defeat the purpose. However, this gives people a legitimate way to take their time to properly discuss and implement a solution to a complex issue that may be uncovered, without pressure to either remove a thorn or remove a test case. When used in moderation, such a mechanism can be quite valuable.
> 
> Obviously, when the time for a release comes, one would hope to reduce the number of xfail test cases, similar to how one would hope to reduce the number of open bugs. In an ideal world, each bug report would be accompanied by a test case, so that one knew whether the bug had been corrected. In this case, we would definitely want to have all the (currently failing) test cases, even if it may take weeks or months to address the bugs.
> 
> Thus I suggest adding a flag for each test case that can be set to either XSUCCEED or XFAIL. Test cases that don't behave according to this expectation are then counted as failures.

After much pondering, I agree with you.  Test failures indicate where parts of the code are not working correctly; the analogy with tickets is a good one.  We don't stop work on everything else just because there is a bug; we allocate resources according to importance, and if a test failure is not important (e.g. affects an ancient thorn only) then it shouldn't require immediate action.
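
A minimal sketch of the proposed flag, to make the counting rule concrete. It assumes Python, and the names `Expectation`, `counts_as_failure` and the test names are hypothetical; nothing like this exists in the test system or in Jenkins today.

```python
from enum import Enum


class Expectation(Enum):
    """Per-test flag as proposed: the outcome we currently expect."""
    XSUCCEED = "XSUCCEED"
    XFAIL = "XFAIL"


def counts_as_failure(expectation: Expectation, passed: bool) -> bool:
    """A test counts as a failure when its outcome contradicts its flag."""
    expected_to_pass = expectation is Expectation.XSUCCEED
    return passed != expected_to_pass


# Hypothetical results: test name -> (flag, did the test pass?)
results = {
    "thornA/test1": (Expectation.XSUCCEED, True),   # as expected
    "thornA/test2": (Expectation.XFAIL, False),     # as expected
    "thornB/test3": (Expectation.XFAIL, True),      # unexpectedly passes
    "thornB/test4": (Expectation.XSUCCEED, False),  # genuine new failure
}

failures = sorted(
    name for name, (flag, passed) in results.items()
    if counts_as_failure(flag, passed)
)
print(failures)
```

Note that an XFAIL test that starts passing is also reported, which is what prompts someone to flip its flag back to XSUCCEED.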

Unfortunately, Jenkins does not directly support what you propose, as far as I can tell.

The closest I can find is a "skipped" status for a test.  We would have to maintain a list of tests to "skip" and Jenkins would then report these as skipped.  I can experiment with this; I don't know how it would affect the overall job status.  On the per-machine tests, we would want to specify this on a per-machine basis.  This might not do what we want, in the end.

I have installed the "claim" plugin for Jenkins.  This allows a developer to "claim" a particular test failure and attach a message to it.  This could be a reason for failure, for example.  In the test report, you can then see whether the failed tests have been claimed.  Ideally, we would want the overall job status to have an additional state "Claimed failures only" in addition to "Tests failed" and "All tests passed", but this is not supported.
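
As a rough illustration of the three-state job status described above, here is a sketch in Python. It is not a Jenkins or claim-plugin API; `job_status` and the test names are hypothetical, and it assumes the lists of failed and claimed tests have already been obtained by some other means.

```python
def job_status(failed_tests, claimed_tests):
    """Derive an overall status from the failed tests and the claimed ones."""
    failed = set(failed_tests)
    claimed = set(claimed_tests)
    if not failed:
        return "All tests passed"
    if failed <= claimed:
        # Every failure has been claimed by a developer.
        return "Claimed failures only"
    return "Tests failed"


# Hypothetical example: one claimed failure, one unclaimed failure.
print(job_status(["thornA/test2", "thornB/test4"], ["thornA/test2"]))
```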

I am looking at making the email notifications more transparent; they are supposed to only be sent when the situation changes, but they are also sent if you manually abort a build and then run it again, as the situation has "changed".  This happened just now, so it sent an email saying "new failures".  
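
One possible rule for the notifications, sketched in Python under stated assumptions: compare against the most recent build that was not aborted, so that an aborted build followed by a re-run does not count as a change. The function `should_notify`, the history format and the test names are hypothetical; this is not how Jenkins is currently configured.

```python
def should_notify(history, current_failures):
    """Decide whether to send an email for the current build.

    history: earlier builds, oldest first, as (result, failed_tests) pairs,
             where result is "ABORTED" for manually aborted builds.
    current_failures: failed tests of the build that just finished.
    """
    current = set(current_failures)
    for result, failures in reversed(history):
        if result != "ABORTED":
            # Notify only if the set of failing tests actually changed.
            return set(failures) != current
    # No earlier completed build: notify only if something is failing.
    return bool(current)


# Hypothetical example: a completed build, then an aborted one, then a re-run.
history = [("COMPLETED", {"thornA/test2"}), ("ABORTED", set())]
print(should_notify(history, {"thornA/test2"}))
```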

-- 
Ian Hinder
http://numrel.aei.mpg.de/people/hinder
