[Users] std::out_of_range error while checkpointing

Miguel Zilhão mzilhao at ffn.ub.es
Fri Aug 7 10:22:10 CDT 2015


hi Erik,

actually, i've just noticed that, for whatever reason, i was using Carpet's master branch... the 
rest of my ET installation does point to ET_2015_05 (and i thought Carpet did as well). i must have 
typed 'git checkout master' in the wrong place at some point, and never noticed it.

so the 'fix' i mentioned in my previous email applies to Carpet's current master branch. if i
check out branch ET_2015_05 and compile that (as i thought i was doing), it works fine without me
needing to change anything.

thanks,
Miguel


On 07/08/15 15:08, Erik Schnetter wrote:
> Miguel
>
> Thanks for tracking this down. I believe this code looks already different on the current master.
> Can you just disable it, as you did? The set of active grid points is important for some very new
> post-processing and visualization tools, but you probably don't need it, and you don't need it in a
> checkpoint file unless you want to visualize the data stored there.
>
> -erik
>
>
> On Fri, Aug 7, 2015 at 8:31 AM, Miguel Zilhão <mzilhao at ffn.ub.es> wrote:
>
>     Erik,
>
>     some more info. i briefly went through git history and found the offending commit. up to Carpet
>     commit d52042b867eea1e0770b6de87ff3929ab7b9d297 everything works fine for me. from commit
>     69c73fb12d2ee41f01928122cd178d1bae9f8e13 onward i get the error i mentioned.
>
>     indeed, in the current state, if i remove writing of the "active" attribute everything works
>     just fine:
>
>     diff --git a/CarpetIOHDF5/src/Output.cc b/CarpetIOHDF5/src/Output.cc
>     index ec9b090..30c017d 100644
>     --- a/CarpetIOHDF5/src/Output.cc
>     +++ b/CarpetIOHDF5/src/Output.cc
>     @@ -894,10 +894,11 @@ static int AddAttributes (const cGH *const cctkGH, const char *fullname,
>           HDF5_ERROR (H5Awrite (attr, H5T_NATIVE_INT, &ioffset[0]));
>                HDF5_ERROR (H5Aclose (attr));
>
>     -    ostringstream buf;
>     -    buf << (vdd.at(Carpet::map)->
>     - local_boxes.at(mglevel).at(refinementlevel).at(component).active);
>     -    WriteAttribute(dataset, "active", buf.str().c_str());
>     +  //   ostringstream buf;
>     +  //   buf << (vdd.at(Carpet::map)->
>     +  // local_boxes.at(mglevel).at(refinementlevel).at(component).active);
>     +  //   WriteAttribute(dataset, "active", buf.str().c_str());
>     +
>         }
>
>         if (is_index) {
>
>
>     i can't tell why it breaks once that bit of code is in, though...
>
>     thanks,
>     Miguel
>
>
>
>     On 06/08/15 15:43, Miguel Zilhão wrote:
>
>         hi Erik,
>
>             Can you look at <https://trac.einsteintoolkit.org/ticket/1800>? Can you try replacing
>             "reflevel" in line 816 of the file CarpetIOHDF5/src/Output.cc with "refinementlevel",
>             and see whether this avoids the problem?
>
>
>         thanks, but this does not seem to help... i still get the same error and backtrace in gdb:
>
>         #6  0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()
>                from /usr/lib64/libstdc++.so.6
>         #7  0x0000000000c70402 in _M_range_check (__n=<optimized out>,
>                this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803
>         #8  at (__n=<optimized out>, this=<optimized out>)
>                at /usr/include/c++/5/bits/stl_vector.h:824
>         #9  CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b1b930,
>                fullname=fullname@entry=0x3e52d760 "ML_BSSN::cA", vdim=3,
>                refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df4c990,
>                bbox=..., dataset=83886080, is_index=false)
>                at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899
>         #10 0x0000000000c72e5a in CarpetIOHDF5::WriteVarChunkedParallel (
>                cctkGH=cctkGH@entry=0x1b1b930, outfile=outfile@entry=167
>
>
>         Miguel
>
>
>             On Wed, Aug 5, 2015 at 8:11 PM, Miguel Zilhão <mzilhao at ffn.ub.es> wrote:
>
>                   hi all,
>
>                   i'm running latest ET Hilbert on openSUSE tumbleweed and i'm having the following
>                   issue. upon trying to run a simple head-on collision configuration with McLachlan
>                   (attached parameter file), i get the error
>
>                      INFO (CarpetIOHDF5): ---------------------------------------------------------
>                      INFO (CarpetIOHDF5): Dumping initial checkpoint at iteration 0, simulation time 0
>                      INFO (CarpetIOHDF5): ---------------------------------------------------------
>                      terminate called after throwing an instance of 'std::out_of_range'
>                        what():  vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)
>                        Rank 1 with PID 5958 received signal 6
>
>                   when writing the checkpoint file.
>                   this only happens if i run with more than one MPI process; with a single processor
>                   it runs fine.
>
>                   i'm compiling with gcc-5, but i find the same problem with gcc-4.8. i was running
>                   this very same configuration just fine a couple of months ago, so it must have been
>                   some update i've made in the meantime (either to my OS or to ET).
>                   i've also tried with different configurations and the outcome is the same.
>
>                   i've run this through gdb, here's the relevant output:
>
>
>                   #6  0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()
>                        from /usr/lib64/libstdc++.so.6
>                   #7  0x00000000005bb398 in _M_range_check (__n=<optimized out>,
>                        this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803
>                   #8  at (__n=<optimized out>, this=<optimized out>)
>                        at /usr/include/c++/5/bits/stl_vector.h:824
>                   #9  CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b507d0,
>                        fullname=fullname@entry=0x3f2434a0 "ML_BSSN::cA", vdim=3,
>                        refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df96370,
>                        bbox=..., dataset=83886080, is_index=false)
>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899
>                   #10 0x00000000005bdea4 in CarpetIOHDF5::WriteVarChunkedParallel (
>                        cctkGH=cctkGH@entry=0x1b507d0, outfile=outfile@entry=16777216,
>                        io_bytes=@0x7fffffffc980: 1110772, request=0x3df96370,
>                        called_from_checkpoint=called_from_checkpoint@entry=true, indexfile=indexfile@entry=-1)
>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:706
>                   #11 0x00000000005a233e in CarpetIOHDF5::Checkpoint (cctkGH=0x1b507d0,
>                        called_from=0)
>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/CarpetIOHDF5.cc:1277
>                   #12 0x000000000041f0d5 in CCTK_CallFunction (
>                        function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,
>                        fdata=fdata@entry=0x1b4a4e8, data=data@entry=0x1b507d0)
>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/src/main/ScheduleInterface.c:312
>                   #13 0x0000000000ef6499 in Carpet::CallScheduledFunction (
>                        time_and_mode=time_and_mode@entry=0x1174842 "Meta mode",
>                        function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,
>                        attribute=attribute@entry=0x1b4a4e8, data=data@entry=0x1b507d0,
>                        user_timer=...)
>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/Carpet/src/CallFunction.cc:380
>
>
>                   so the relevant bits of code seem to be in CarpetIOHDF5/src/Output.cc:706 and
>                   CarpetIOHDF5/src/Output.cc:899
>
>                   this seems to be triggered when writing hdf5 output in parallel. if i remove
>                   checkpointing the run goes fine, and i do get regular 2D hdf5 output. this does not
>                   seem to be written in parallel, though, as i get only one file per grid
>                   function/group. so it seems to be the parallel output that triggers the crash.
>
>                   i have also tried removing all my hdf5 libs and configuring ET with HDF5_DIR=BUILD,
>                   but the outcome was the same.
>
>                   has anyone seen such an error before? anything else i could provide to help
>                   diagnose this?
>
>                   thanks,
>                   Miguel
>
>                   _______________________________________________
>                   Users mailing list
>                   Users at einsteintoolkit.org
>                   http://lists.einsteintoolkit.org/mailman/listinfo/users
>
>
>
>
>             --
>             Erik Schnetter <schnetter at cct.lsu.edu>
>             http://www.perimeterinstitute.ca/personal/eschnetter/
>
>
>
>
>
> --
> Erik Schnetter <schnetter at cct.lsu.edu>
> http://www.perimeterinstitute.ca/personal/eschnetter/

