[Users] Test failures in CT_MultiLevel

Ian Hinder ian.hinder at aei.mpg.de
Wed Oct 14 13:40:16 CDT 2015


Hi Eloisa and Erik,

I have been running the CT_MultiLevel elliptic solver tests on Datura and ran into two problems:

1. BLAS/LAPACK vs OpenBLAS
2. Differences above tolerance in the Hamiltonian constraint

The first is that the test parameter files ask for thorns BLAS and LAPACK, which are disabled on Datura.  The ET thornlist contains

ExternalLibraries/BLAS
ExternalLibraries/LAPACK
#DISABLED ExternalLibraries/OpenBLAS

whereas datura.ini contains

disabled-thorns = <<EOT
    ExternalLibraries/BLAS
    ExternalLibraries/LAPACK
EOT

enabled-thorns = <<EOT
    ExternalLibraries/OpenBLAS
EOT

So by default, machines will use BLAS and LAPACK, but datura is configured to use OpenBLAS instead.  

I don't quite understand this, because a parameter file activates one or the other of these thorns, so a parameter file that requires either of them cannot work both on Datura and on a machine which does not do this enabling/disabling.  There is a ticket (https://trac.einsteintoolkit.org/ticket/1674) which suggests switching to OpenBLAS across the ET, but as far as the ticket records, this was never implemented.
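
(Schematically: a test parameter file selects the library via its ActiveThorns line, something like

ActiveThorns = "... BLAS LAPACK ..."

which on Datura would have to become

ActiveThorns = "... OpenBLAS ..."

I'm quoting the general form of the line here, not the exact contents of any of the CT_MultiLevel test files.)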

If we follow the suggestion in the ticket, this would be solved by using OpenBLAS everywhere and editing all the parameter files accordingly.  If not, it would probably be better for the choice of library to be a machine-specific option, rather than something that has to be selected in a parameter file by activating the right thorn.  Also, I'm not sure how the other options in datura.cfg affect this, since on Datura the intention is to use MKL anyway:

LDFLAGS = -rdynamic -Wl,-rpath,/cluster/Compiler/Intel/ics_2013.1.039/mkl/lib/intel64 -Wl,-rpath,/cluster/Compiler/Intel/ics_2013.1.039/lib/intel64 # -Wl,-rpath,/usr/lib64
BLAS_DIR  = /cluster/Compiler/Intel/ics_2013.1.039/mkl/lib/intel64
BLAS_LIBS = -Wl,--start-group mkl_intel_lp64 mkl_intel_thread mkl_core -Wl,--end-group   iomp5   pthread
LAPACK_DIR  = /cluster/Compiler/Intel/ics_2013.1.039/mkl/lib/intel64
LAPACK_LIBS = -Wl,--start-group mkl_intel_lp64 mkl_intel_thread mkl_core -Wl,--end-group
OPENBLAS_DIR  = NO_BUILD
OPENBLAS_LIBS = -mkl
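
As a sanity check, one can look at which libraries the executable actually ends up linked against, e.g.

ldd exe/cactus_<config> | grep -i -E 'blas|lapack|mkl'

(assuming the libraries are linked dynamically; the executable name depends on the configuration name).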

What is the best way forward?

The second problem is that when I switch to OpenBLAS in the test parameter files, all tests pass except for CTThorns/CT_MultiLevel/test/boostedpuncture.par, which fails with 

   ml_bssn-ml_ham.d.asc: substantial differences
      significant differences on 56 (out of 109) lines
      maximum absolute difference in column 13 is 16.796401082424
      maximum relative difference in column 13 is 10.214243488001
      (insignificant differences on 45 lines)
   ml_bssn-ml_ham.x.asc: substantial differences
      significant differences on 88 (out of 182) lines
      maximum absolute difference in column 13 is 5.30707663071086
      maximum relative difference in column 13 is 0.897391183181081
      (insignificant differences on 94 lines)
   ml_bssn-ml_ham.y.asc: substantial differences
      significant differences on 88 (out of 182) lines
      maximum absolute difference in column 13 is 5.30900732375572
      maximum relative difference in column 13 is 0.897408813491657
      (insignificant differences on 94 lines)
   ml_bssn-ml_ham.z.asc: substantial differences
      significant differences on 56 (out of 109) lines
      maximum absolute difference in column 13 is 5.30900949367607
      maximum relative difference in column 13 is 0.8974032303582
      (insignificant differences on 53 lines)

All other solution variables are within tolerance, which suggests that the solver is getting the right answer, though ml_ham would in any case be more sensitive to small differences in the solution.  None of the other CT_MultiLevel tests seem to check ml_ham, so it may well be that they would also differ if they did.  Could this be related to the McLachlan rewrite, or is the Hamiltonian constraint expected to be unchanged across the rewrite?  It could be the usual roundoff-error problem, but the differences listed are quite large.  Or maybe it comes from the method used by the elliptic solver, which may not be 100% deterministic due to thread ordering in the Gauss-Seidel iteration?
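
To illustrate that last point, here is a toy example (my own sketch, not the actual CT_MultiLevel code) of an in-place Gauss-Seidel sweep parallelised with OpenMP.  Whether a thread reads an already-updated or a not-yet-updated neighbour depends on the scheduling, so the result is not bitwise reproducible between runs:

/* Toy 1D Gauss-Seidel sweep for -u'' = f, updated in place.
   Compile with e.g.  gcc -fopenmp gs_sketch.c -o gs_sketch
   The parallel loop deliberately races on u: whether u[i-1] and u[i+1]
   hold old or new values depends on thread ordering, so the output can
   vary slightly from run to run. */
#include <stdio.h>

#define N 1000

int main(void)
{
    static double u[N], f[N];
    for (int i = 0; i < N; i++) { u[i] = 0.0; f[i] = 1.0; }
    const double h = 1.0 / (N - 1);

    for (int sweep = 0; sweep < 100; sweep++) {
        /* A serial Gauss-Seidel sweep would run this loop strictly in order. */
        #pragma omp parallel for
        for (int i = 1; i < N - 1; i++)
            u[i] = 0.5 * (u[i-1] + u[i+1] + h * h * f[i]);
    }

    printf("u[N/2] = %.17g\n", u[N/2]);
    return 0;
}

Of course, a red-black or otherwise coloured update would restore determinism, so this only matters if the solver really does a plain in-place sweep in parallel.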

-- 
Ian Hinder
http://members.aei.mpg.de/ianhin
