[Users] std::out_of_range error while checkpointing

Erik Schnetter schnetter at cct.lsu.edu
Wed Aug 12 12:46:56 CDT 2015


Miguel

This was an error in the development version of CarpetIOHDF5 -- at one
place, "component" should have been "local_component". This has now been
corrected.
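
A minimal sketch of how indexing a per-process vector with a global
component number can fail, consistent with the reported message "__n (which
is 1) >= this->size() (which is 1)" on rank 1. Everything here is a
hypothetical stand-in and not Carpet's actual code: LocalBox, the owner
table, make_local_boxes, to_local_component and lookup_throws are invented
for illustration, and the sketch assumes each process stores only the
components it owns.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-component data; the real Carpet type
// is different and is not reproduced here.
struct LocalBox {
  std::string active;
};

// Assumption: owner[c] is the MPI rank that owns global component c, and
// each rank keeps boxes only for the components it owns, in order.
std::vector<LocalBox> make_local_boxes(const std::vector<int>& owner, int rank) {
  std::vector<LocalBox> boxes;
  for (std::size_t c = 0; c < owner.size(); ++c) {
    if (owner[c] == rank) {
      boxes.push_back(LocalBox{"active region of component " + std::to_string(c)});
    }
  }
  return boxes;
}

// Translate a global component index into an index into the rank-local
// vector by counting the components this rank owns before it.
std::size_t to_local_component(const std::vector<int>& owner, int rank,
                               std::size_t component) {
  assert(component < owner.size() && owner[component] == rank);
  std::size_t local = 0;
  for (std::size_t c = 0; c < component; ++c) {
    if (owner[c] == rank) ++local;
  }
  return local;
}

// Returns true if boxes.at(index) throws std::out_of_range.
bool lookup_throws(const std::vector<LocalBox>& boxes, std::size_t index) {
  try {
    (void)boxes.at(index);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

With two processes owning one component each (owner = {0, 1}), rank 1 holds
a vector of size 1, so lookup_throws(boxes, 1) is true for the global index
while the translated local index 0 is valid. With a single process the two
indices coincide, which would match the run working on one processor.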

-erik

On Fri, Aug 7, 2015 at 11:22 AM, Miguel Zilhão <mzilhao at ffn.ub.es> wrote:

> hi Erik,
>
> actually, i've just noticed that, for whatever reason, i was using
> Carpet's master branch... the rest of my ET installation does point to
> ET_2015_05 (and i thought Carpet did as well). i must have typed 'git
> checkout master' in the wrong place at some point, and never noticed it.
>
> so the 'fix' i mentioned in my previous email applies to Carpet's current
> master branch. if i checkout branch ET_2015_05 and compile that (as i
> thought i was doing), it works fine without me needing to change anything.
>
> thanks,
> Miguel
>
>
> On 07/08/15 15:08, Erik Schnetter wrote:
>
>> Miguel
>>
>> Thanks for tracking this down. I believe this code already looks different on the
>> current master. Can you just disable it, as you did? The set of active grid points is
>> important for some very new post-processing and visualization tools, but you probably
>> don't need it, and you don't need it in a checkpoint file unless you want to visualize
>> the data stored there.
>>
>> -erik
>>
>>
>> On Fri, Aug 7, 2015 at 8:31 AM, Miguel Zilhão <mzilhao at ffn.ub.es> wrote:
>>
>>     Erik,
>>
>>     some more info. i briefly went through git history and found the offending
>>     commit. up to Carpet commit d52042b867eea1e0770b6de87ff3929ab7b9d297
>>     everything works fine for me. from commit
>>     69c73fb12d2ee41f01928122cd178d1bae9f8e13 onward i get the error i mentioned.
>>
>>     indeed, in the current state, if i remove writing of the "active" attribute
>>     everything works just fine:
>>
>>     diff --git a/CarpetIOHDF5/src/Output.cc b/CarpetIOHDF5/src/Output.cc
>>     index ec9b090..30c017d 100644
>>     --- a/CarpetIOHDF5/src/Output.cc
>>     +++ b/CarpetIOHDF5/src/Output.cc
>>     @@ -894,10 +894,11 @@ static int AddAttributes (const cGH *const cctkGH, const char *fullname,
>>           HDF5_ERROR (H5Awrite (attr, H5T_NATIVE_INT, &ioffset[0]));
>>           HDF5_ERROR (H5Aclose (attr));
>>
>>     -    ostringstream buf;
>>     -    buf << (vdd.at(Carpet::map)->
>>     -            local_boxes.at(mglevel).at(refinementlevel).at(component).active);
>>     -    WriteAttribute(dataset, "active", buf.str().c_str());
>>     +  //   ostringstream buf;
>>     +  //   buf << (vdd.at(Carpet::map)->
>>     +  //           local_boxes.at(mglevel).at(refinementlevel).at(component).active);
>>     +  //   WriteAttribute(dataset, "active", buf.str().c_str());
>>     +
>>         }
>>
>>         if (is_index) {
>>
>>
>>     i can't tell why it breaks once that bit of code is in, though...
>>
>>     thanks,
>>     Miguel
>>
>>
>>
>>     On 06/08/15 15:43, Miguel Zilhão wrote:
>>
>>         hi Erik,
>>
>>             Can you look at <https://trac.einsteintoolkit.org/ticket/1800>? Can you try
>>             replacing "reflevel" in line 816 of the file CarpetIOHDF5/src/Output.cc with
>>             "refinementlevel", and see whether this avoids the problem?
>>
>>
>>         thanks, but this does not seem to help... i still get the same error and
>>         backtrace in gdb:
>>
>>         #6  0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()
>>               from /usr/lib64/libstdc++.so.6
>>         #7  0x0000000000c70402 in _M_range_check (__n=<optimized out>,
>>                this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803
>>         #8  at (__n=<optimized out>, this=<optimized out>)
>>                at /usr/include/c++/5/bits/stl_vector.h:824
>>         #9  CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b1b930,
>>                fullname=fullname@entry=0x3e52d760 "ML_BSSN::cA", vdim=3,
>>                refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df4c990,
>>                bbox=..., dataset=83886080, is_index=false)
>>                at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899
>>         #10 0x0000000000c72e5a in CarpetIOHDF5::WriteVarChunkedParallel (
>>                cctkGH=cctkGH@entry=0x1b1b930, outfile=outfile@entry=167
>>
>>
>>         Miguel
>>
>>
>>             On Wed, Aug 5, 2015 at 8:11 PM, Miguel Zilhão <mzilhao at ffn.ub.es> wrote:
>>
>>                   hi all,
>>
>>                   i'm running latest ET Hilbert on openSUSE tumbleweed and i'm having the
>>                   following issue. upon trying to run a simple head-on collision configuration
>>                   with McLachlan (attached parameter file), i get the error
>>
>>                      INFO (CarpetIOHDF5): ---------------------------------------------------------
>>                      INFO (CarpetIOHDF5): Dumping initial checkpoint at iteration 0, simulation time 0
>>                      INFO (CarpetIOHDF5): ---------------------------------------------------------
>>                      terminate called after throwing an instance of 'std::out_of_range'
>>                        what():  vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)
>>                        Rank 1 with PID 5958 received signal 6
>>
>>                   when writing the checkpoint file.
>>                   this only happens if i run with more than one MPI process; with a single
>>                   processor it runs fine.
>>
>>                   i'm compiling with gcc-5, but i find the same problem with gcc-4.8. i was
>>                   running this very same configuration just fine a couple of months ago, so it
>>                   must have been some update i've made in the meantime (either to my OS or to ET).
>>                   i've also tried with different configurations and the outcome is the same.
>>
>>                   i've run this through gdb, here's the relevant output:
>>
>>
>>                   #6  0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()
>>                       from /usr/lib64/libstdc++.so.6
>>                   #7  0x00000000005bb398 in _M_range_check (__n=<optimized out>,
>>                        this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803
>>                   #8  at (__n=<optimized out>, this=<optimized out>)
>>                        at /usr/include/c++/5/bits/stl_vector.h:824
>>                   #9  CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b507d0,
>>                        fullname=fullname@entry=0x3f2434a0 "ML_BSSN::cA", vdim=3,
>>                        refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df96370,
>>                        bbox=..., dataset=83886080, is_index=false)
>>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899
>>                   #10 0x00000000005bdea4 in CarpetIOHDF5::WriteVarChunkedParallel (
>>                        cctkGH=cctkGH@entry=0x1b507d0, outfile=outfile@entry=16777216,
>>                        io_bytes=@0x7fffffffc980: 1110772, request=0x3df96370,
>>                        called_from_checkpoint=called_from_checkpoint@entry=true, indexfile=indexfile@entry=-1)
>>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:706
>>                   #11 0x00000000005a233e in CarpetIOHDF5::Checkpoint (cctkGH=0x1b507d0,
>>                        called_from=0)
>>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/CarpetIOHDF5.cc:1277
>>                   #12 0x000000000041f0d5 in CCTK_CallFunction (
>>                        function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,
>>                        fdata=fdata@entry=0x1b4a4e8, data=data@entry=0x1b507d0)
>>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/src/main/ScheduleInterface.c:312
>>                   #13 0x0000000000ef6499 in Carpet::CallScheduledFunction (
>>                        time_and_mode=time_and_mode@entry=0x1174842 "Meta mode",
>>                        function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,
>>                        attribute=attribute@entry=0x1b4a4e8, data=data@entry=0x1b507d0,
>>                        user_timer=...)
>>                        at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/Carpet/src/CallFunction.cc:380
>>
>>
>>                   so the relevant bits of code seem to be in CarpetIOHDF5/src/Output.cc:706 and
>>                   CarpetIOHDF5/src/Output.cc:899
>>
>>                   this seems to be triggered when writing hdf5 output in parallel. if i remove
>>                   checkpointing the run goes fine, and i do get regular 2D hdf5 output. this does
>>                   not seem to be written in parallel, though, as i get only one file per grid
>>                   function/group. so it seems to be the parallel output that triggers the crash.
>>
>>                   i have also tried removing all my hdf5 libs and configuring ET with
>>                   HDF5_DIR=BUILD, but the outcome was the same.
>>
>>                   has anyone seen such an error before? anything else i could provide to help
>>                   diagnose this?
>>
>>                   thanks,
>>                   Miguel
>>
>>                   _______________________________________________
>>                   Users mailing list
>>             Users at einsteintoolkit.org
>>             http://lists.einsteintoolkit.org/mailman/listinfo/users
>>
>>
>>
>>
>>             --
>>             Erik Schnetter <schnetter at cct.lsu.edu>
>>             http://www.perimeterinstitute.ca/personal/eschnetter/
>>
>>
>>
>>
>>
>> --
>> Erik Schnetter <schnetter at cct.lsu.edu>
>> http://www.perimeterinstitute.ca/personal/eschnetter/
>>
>


-- 
Erik Schnetter <schnetter at cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/