<div dir="ltr">Miguel<div><br></div><div>Can you look at <<a href="https://trac.einsteintoolkit.org/ticket/1800">https://trac.einsteintoolkit.org/ticket/1800</a>>? Can you try replacing "reflevel" in line 816 of the file CarpetIOHDF5/src/Output.cc with "refinementlevel", and see whether this avoids the problem?</div><div><br></div><div>-erik</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 5, 2015 at 8:11 PM, Miguel Zilhão <span dir="ltr"><<a href="mailto:mzilhao@ffn.ub.es" target="_blank">mzilhao@ffn.ub.es</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">hi all,<br>
<br>
i'm running the latest ET Hilbert on openSUSE Tumbleweed and i'm having the following issue. when trying to run a simple head-on collision configuration with McLachlan (attached parameter file), i get the error<br>
<br>
INFO (CarpetIOHDF5): ---------------------------------------------------------<br>
INFO (CarpetIOHDF5): Dumping initial checkpoint at iteration 0, simulation time 0<br>
INFO (CarpetIOHDF5): ---------------------------------------------------------<br>
terminate called after throwing an instance of 'std::out_of_range'<br>
what(): vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)<br>
Rank 1 with PID 5958 received signal 6<br>
<br>
when writing the checkpoint file.<br>
this only happens if i run with more than one MPI process; with a single process it runs fine.<br>
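<br>
For reference, the what() text above is the message libstdc++ emits when std::vector::at is called with an index that is not smaller than the vector's size, here index 1 on a vector of size 1. A minimal C++ sketch of that failure mode follows; try_access is a hypothetical helper made up for illustration and is not part of Carpet.<br>

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not Carpet code): performs a bounds-checked read.
// Returns std::nullopt if the access is valid, otherwise the text of the
// std::out_of_range exception thrown by std::vector::at.
std::optional<std::string> try_access(const std::vector<int>& v,
                                      std::size_t n) {
  try {
    (void)v.at(n);  // throws std::out_of_range when n >= v.size()
    return std::nullopt;
  } catch (const std::out_of_range& e) {
    return std::string(e.what());
  }
}
```

With a one-element vector, try_access(v, 0) succeeds and try_access(v, 1) reports the range error. If the exception is not caught, the program calls std::terminate, which is consistent with the "terminate called" line and signal 6 above.<br>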
<br>
i'm compiling with gcc-5, but i find the same problem with gcc-4.8. i was running this very same configuration just fine a couple of months ago, so it must have been some update i've made in the meantime (either to my OS or to ET).<br>
i've also tried with different configurations and the outcome is the same.<br>
<br>
i've run this through gdb, here's the relevant output:<br>
<br>
<br>
#6 0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()<br>
from /usr/lib64/libstdc++.so.6<br>
#7 0x00000000005bb398 in _M_range_check (__n=<optimized out>,<br>
this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803<br>
#8 at (__n=<optimized out>, this=<optimized out>)<br>
at /usr/include/c++/5/bits/stl_vector.h:824<br>
#9 CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b507d0,<br>
fullname=fullname@entry=0x3f2434a0 "ML_BSSN::cA", vdim=3,<br>
refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df96370,<br>
bbox=..., dataset=83886080, is_index=false)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899<br>
#10 0x00000000005bdea4 in CarpetIOHDF5::WriteVarChunkedParallel (<br>
cctkGH=cctkGH@entry=0x1b507d0, outfile=outfile@entry=16777216,<br>
io_bytes=@0x7fffffffc980: 1110772, request=0x3df96370,<br>
called_from_checkpoint=called_from_checkpoint@entry=true,indexfile=indexfile@entry=-1)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:706<br>
#11 0x00000000005a233e in CarpetIOHDF5::Checkpoint (cctkGH=0x1b507d0,<br>
called_from=0)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/CarpetIOHDF5.cc:1277<br>
#12 0x000000000041f0d5 in CCTK_CallFunction (<br>
function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>, fdata=fdata@entry=0x1b4a4e8, data=data@entry=0x1b507d0)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/src/main/ScheduleInterface.c:312<br>
#13 0x0000000000ef6499 in Carpet::CallScheduledFunction (<br>
time_and_mode=time_and_mode@entry=0x1174842 "Meta mode",<br>
function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>, attribute=attribute@entry=0x1b4a4e8, data=data@entry=0x1b507d0,<br>
user_timer=...)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/Carpet/src/CallFunction.cc:380<br>
<br>
<br>
so the relevant bits of code seem to be in CarpetIOHDF5/src/Output.cc:706 and CarpetIOHDF5/src/Output.cc:899<br>
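<br>
The suggested replacement of "reflevel" by "refinementlevel" hints at one way such a range error could arise: a function receives the level as an argument but indexes a per-level container with a different, global level variable. The sketch below only illustrates that pattern under this assumption. All names in it (the global reflevel, value_on_level_buggy, value_on_level_fixed, per_level) are hypothetical stand-ins and do not reproduce the actual code in Output.cc.<br>

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical global "current level"; assumed here to hold a value that
// is not a valid index for the container being accessed.
int reflevel = 1;

// Buggy pattern: ignores the argument and indexes with the global.
int value_on_level_buggy(const std::vector<int>& per_level,
                         int refinementlevel) {
  (void)refinementlevel;  // argument is unused in the buggy variant
  return per_level.at(static_cast<std::size_t>(reflevel));
}

// Fixed pattern: indexes with the level that was passed in.
int value_on_level_fixed(const std::vector<int>& per_level,
                         int refinementlevel) {
  return per_level.at(static_cast<std::size_t>(refinementlevel));
}
```

With a container holding a single level and refinementlevel equal to 0 (as in frame #9 of the backtrace), the fixed variant returns the entry, while the buggy variant throws std::out_of_range because it reads index 1.<br>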
<br>
this seems to be triggered when writing hdf5 output in parallel. if i remove checkpointing the run goes fine, and i do get regular 2D hdf5 output. this does not seem to be written in parallel, though, as i get only one file per grid function/group. so it seems to be the parallel output that triggers the crash.<br>
<br>
i have also tried removing all my hdf5 libs and configuring ET with HDF5_DIR=BUILD, but the outcome was the same.<br>
<br>
has anyone seen such an error before? anything else i could provide to help diagnose this?<br>
<br>
thanks,<br>
Miguel<br>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature">Erik Schnetter <<a href="mailto:schnetter@cct.lsu.edu" target="_blank">schnetter@cct.lsu.edu</a>><br><a href="http://www.perimeterinstitute.ca/personal/eschnetter/" target="_blank">http://www.perimeterinstitute.ca/personal/eschnetter/</a></div>
</div>