[Users] std::out_of_range error while checkpointing

Miguel Zilhão mzilhao at ffn.ub.es
Fri Aug 7 07:31:08 CDT 2015


Erik,

some more info. i briefly went through git history and found the offending commit. up to Carpet 
commit d52042b867eea1e0770b6de87ff3929ab7b9d297 everything works fine for me. from commit 
69c73fb12d2ee41f01928122cd178d1bae9f8e13 onward i get the error i mentioned.

indeed, in the current state, if i remove writing of the "active" attribute everything works just fine:

diff --git a/CarpetIOHDF5/src/Output.cc b/CarpetIOHDF5/src/Output.cc
index ec9b090..30c017d 100644
--- a/CarpetIOHDF5/src/Output.cc
+++ b/CarpetIOHDF5/src/Output.cc
@@ -894,10 +894,11 @@ static int AddAttributes (const cGH *const cctkGH, const char *fullname,
      HDF5_ERROR (H5Awrite (attr, H5T_NATIVE_INT, &ioffset[0]));
      HDF5_ERROR (H5Aclose (attr));

-    ostringstream buf;
-    buf << (vdd.at(Carpet::map)->
-            local_boxes.at(mglevel).at(refinementlevel).at(component).active);
-    WriteAttribute(dataset, "active", buf.str().c_str());
+  //   ostringstream buf;
+  //   buf << (vdd.at(Carpet::map)->
+  //           local_boxes.at(mglevel).at(refinementlevel).at(component).active);
+  //   WriteAttribute(dataset, "active", buf.str().c_str());
+
    }

    if (is_index) {


i can't tell why it breaks once that bit of code is in, though...
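
One possible explanation, offered as a guess rather than a confirmed diagnosis: if local_boxes holds only the components owned by the calling process, while "component" is a global component index, then on rank 1 of a two-process run the vector has size 1 and the index is 1. That matches the what() message quoted below. The sketch reproduces the pattern in isolation. LocalBox, lookup_active, to_local_index and first_owned are hypothetical names used for illustration and are not Carpet's API; only the field name "active" comes from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of local_boxes; only the field
// name "active" is taken from the diff, its real type is not assumed.
struct LocalBox {
  std::string active;
};

// Checked lookup mimicking the last .at(component) in the chained call.
// The exception is caught and turned into text so the failure is visible
// instead of terminating the process.
std::string lookup_active(const std::vector<LocalBox>& local_boxes,
                          std::size_t component) {
  try {
    return local_boxes.at(component).active;
  } catch (const std::out_of_range& e) {
    return std::string("out_of_range: ") + e.what();
  }
}

// Assumed layout (not taken from Carpet): this process owns the global
// components [first_owned, first_owned + local_boxes.size()).
std::size_t to_local_index(std::size_t global_component,
                           std::size_t first_owned) {
  assert(global_component >= first_owned);
  return global_component - first_owned;
}
```

With a one-element vector, lookup_active(boxes, 1) reports the range error, while lookup_active(boxes, to_local_index(1, 1)) returns the stored value. Under the same assumption a single-process run would be unaffected, because global and local indices coincide there.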

thanks,
Miguel


On 06/08/15 15:43, Miguel Zilhão wrote:
> hi Erik,
>
>> Can you look at <https://trac.einsteintoolkit.org/ticket/1800>? Can you try replacing "reflevel" in
>> line 816 of the file CarpetIOHDF5/src/Output.cc with "refinementlevel", and see whether this avoids
>> the problem?
>
> thanks, but this does not seem to help... i still get the same error and backtrace in gdb:
>
> #6  0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()
>      from /usr/lib64/libstdc++.so.6
> #7  0x0000000000c70402 in _M_range_check (__n=<optimized out>,
>       this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803
> #8  at (__n=<optimized out>, this=<optimized out>)
>       at /usr/include/c++/5/bits/stl_vector.h:824
> #9  CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b1b930,
>       fullname=fullname@entry=0x3e52d760 "ML_BSSN::cA", vdim=3,
>       refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df4c990,
>       bbox=..., dataset=83886080, is_index=false)
>       at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899
> #10 0x0000000000c72e5a in CarpetIOHDF5::WriteVarChunkedParallel (
>       cctkGH=cctkGH@entry=0x1b1b930, outfile=outfile@entry=167
>
>
> Miguel
>
>
>> On Wed, Aug 5, 2015 at 8:11 PM, Miguel Zilhão <mzilhao at ffn.ub.es> wrote:
>>
>>      hi all,
>>
>>      i'm running latest ET Hilbert on openSUSE tumbleweed and i'm having the following issue. upon
>>      trying to run a simple head-on collision configuration with McLachlan (attached parameter file),
>>      i get the error
>>
>>         INFO (CarpetIOHDF5): ---------------------------------------------------------
>>         INFO (CarpetIOHDF5): Dumping initial checkpoint at iteration 0, simulation time 0
>>         INFO (CarpetIOHDF5): ---------------------------------------------------------
>>         terminate called after throwing an instance of 'std::out_of_range'
>>           what():  vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)
>>           Rank 1 with PID 5958 received signal 6
>>
>>      when writing the checkpoint file.
>>      this only happens if i run with more than one MPI process; with a single processor it runs fine.
>>
>>      i'm compiling with gcc-5, but i find the same problem with gcc-4.8. i was running this very same
>>      configuration just fine a couple of months ago, so it must have been some update i've made in
>>      the meantime (either to my OS or to ET).
>>      i've also tried with different configurations and the outcome is the same.
>>
>>      i've run this through gdb, here's the relevant output:
>>
>>
>>      #6  0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()
>>          from /usr/lib64/libstdc++.so.6
>>      #7  0x00000000005bb398 in _M_range_check (__n=<optimized out>,
>>           this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803
>>      #8  at (__n=<optimized out>, this=<optimized out>)
>>           at /usr/include/c++/5/bits/stl_vector.h:824
>>      #9  CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b507d0,
>>           fullname=fullname@entry=0x3f2434a0 "ML_BSSN::cA", vdim=3,
>>           refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df96370,
>>           bbox=..., dataset=83886080, is_index=false)
>>           at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899
>>      #10 0x00000000005bdea4 in CarpetIOHDF5::WriteVarChunkedParallel (
>>           cctkGH=cctkGH@entry=0x1b507d0, outfile=outfile@entry=16777216,
>>           io_bytes=@0x7fffffffc980: 1110772, request=0x3df96370,
>>           called_from_checkpoint=called_from_checkpoint@entry=true, indexfile=indexfile@entry=-1)
>>           at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:706
>>      #11 0x00000000005a233e in CarpetIOHDF5::Checkpoint (cctkGH=0x1b507d0,
>>           called_from=0)
>>           at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/CarpetIOHDF5.cc:1277
>>      #12 0x000000000041f0d5 in CCTK_CallFunction (
>>           function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,
>>           fdata=fdata@entry=0x1b4a4e8, data=data@entry=0x1b507d0)
>>           at /home/mzilhao/Trabalho/projectos/ET/Cactus/src/main/ScheduleInterface.c:312
>>      #13 0x0000000000ef6499 in Carpet::CallScheduledFunction (
>>           time_and_mode=time_and_mode@entry=0x1174842 "Meta mode",
>>           function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,
>>           attribute=attribute@entry=0x1b4a4e8, data=data@entry=0x1b507d0,
>>           user_timer=...)
>>           at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/Carpet/src/CallFunction.cc:380
>>
>>
>>      so the relevant bits of code seem to be in CarpetIOHDF5/src/Output.cc:706 and
>>      CarpetIOHDF5/src/Output.cc:899
>>
>>      this seems to be triggered when writing hdf5 output in parallel. if i remove checkpointing the
>>      run goes fine, and i do get regular 2D hdf5 output. this does not seem to be written in
>>      parallel, though, as i get only one file per grid function/group. so it seems to be the parallel
>>      output that triggers the crash.
>>
>>      i have also tried removing all my hdf5 libs and configuring ET with HDF5_DIR=BUILD, but the
>>      outcome was the same.
>>
>>      has anyone seen such an error before? anything else i could provide to help diagnose this?
>>
>>      thanks,
>>      Miguel
>>
>>      _______________________________________________
>>      Users mailing list
>>      Users at einsteintoolkit.org <mailto:Users at einsteintoolkit.org>
>>      http://lists.einsteintoolkit.org/mailman/listinfo/users
>>
>>
>>
>>
>> --
>> Erik Schnetter <schnetter at cct.lsu.edu>
>> http://www.perimeterinstitute.ca/personal/eschnetter/