<div dir="ltr">Miguel<div><br></div><div>Thanks for tracking this down. I believe this code already looks different on the current master. For now, can you just leave it disabled, as you did? The set of active grid points is important for some very new post-processing and visualization tools, but you probably don't need it, and you don't need it in a checkpoint file unless you want to visualize the data stored there.</div><div><br></div><div>-erik</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 7, 2015 at 8:31 AM, Miguel Zilhão <span dir="ltr"><<a href="mailto:mzilhao@ffn.ub.es" target="_blank">mzilhao@ffn.ub.es</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Erik,<br>
<br>
some more info. i briefly went through git history and found the offending commit. up to Carpet commit d52042b867eea1e0770b6de87ff3929ab7b9d297 everything works fine for me. from commit 69c73fb12d2ee41f01928122cd178d1bae9f8e13 onward i get the error i mentioned.<br>
<br>
indeed, in the current state, if i remove writing of the "active" attribute everything works just fine:<br>
<br>
diff --git a/CarpetIOHDF5/src/Output.cc b/CarpetIOHDF5/src/Output.cc<br>
index ec9b090..30c017d 100644<br>
--- a/CarpetIOHDF5/src/Output.cc<br>
+++ b/CarpetIOHDF5/src/Output.cc<br>
@@ -894,10 +894,11 @@ static int AddAttributes (const cGH *const cctkGH, const char *fullname,<br>
HDF5_ERROR (H5Awrite (attr, H5T_NATIVE_INT, &ioffset[0]));<br>
HDF5_ERROR (H5Aclose (attr));<br>
<br>
- ostringstream buf;<br>
- buf << (vdd.at(Carpet::map)-><br>
- local_boxes.at(mglevel).at(refinementlevel).at(component).active);<br>
- WriteAttribute(dataset, "active", buf.str().c_str());<br>
+ // ostringstream buf;<br>
+ // buf << (vdd.at(Carpet::map)-><br>
+ // local_boxes.at(mglevel).at(refinementlevel).at(component).active);<br>
+ // WriteAttribute(dataset, "active", buf.str().c_str());<br>
+<br>
}<br>
<br>
if (is_index) {<br>
<br>
<br>
i can't tell why it breaks once that bit of code is in, though...<br>
<br>
thanks,<br>
Miguel<div><div class="h5"><br>
<br>
<br>
On 06/08/15 15:43, Miguel Zilhão wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
hi Erik,<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Can you look at <<a href="https://trac.einsteintoolkit.org/ticket/1800" rel="noreferrer" target="_blank">https://trac.einsteintoolkit.org/ticket/1800</a>>? Can you try replacing "reflevel" in<br>
line 816 of the file CarpetIOHDF5/src/Output.cc with "refinementlevel", and see whether this avoids<br>
the problem?<br>
</blockquote>
<br>
thanks, but this does not seem to help... i still get the same error and backtrace in gdb:<br>
<br>
#6 0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()<br>
from /usr/lib64/libstdc++.so.6<br>
#7 0x0000000000c70402 in _M_range_check (__n=<optimized out>,<br>
this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803<br>
#8 at (__n=<optimized out>, this=<optimized out>)<br>
at /usr/include/c++/5/bits/stl_vector.h:824<br>
#9 CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b1b930,<br>
fullname=fullname@entry=0x3e52d760 "ML_BSSN::cA", vdim=3,<br>
refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df4c990,<br>
bbox=..., dataset=83886080, is_index=false)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899<br>
#10 0x0000000000c72e5a in CarpetIOHDF5::WriteVarChunkedParallel (<br>
cctkGH=cctkGH@entry=0x1b1b930, outfile=outfile@entry=167<br>
<br>
<br>
Miguel<br>
<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Wed, Aug 5, 2015 at 8:11 PM, Miguel Zilhão <<a href="mailto:mzilhao@ffn.ub.es" target="_blank">mzilhao@ffn.ub.es</a>> wrote:<br>
<br>
hi all,<br>
<br>
i'm running latest ET Hilbert on openSUSE tumbleweed and i'm having the following issue. upon<br>
trying to run a simple head-on collision configuration with McLachlan (attached parameter file),<br>
i get the error<br>
<br>
INFO (CarpetIOHDF5): ---------------------------------------------------------<br>
INFO (CarpetIOHDF5): Dumping initial checkpoint at iteration 0, simulation time 0<br>
INFO (CarpetIOHDF5): ---------------------------------------------------------<br>
terminate called after throwing an instance of 'std::out_of_range'<br>
what(): vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)<br>
Rank 1 with PID 5958 received signal 6<br>
<br>
when writing the checkpoint file.<br>
this only happens if i run with more than one MPI process; with a single processor it runs fine.<br>
<br>
i'm compiling with gcc-5, but i find the same problem with gcc-4.8. i was running this very same<br>
configuration just fine a couple of months ago, so it must have been some update i've made in<br>
the meantime (either to my OS or to ET).<br>
i've also tried with different configurations and the outcome is the same.<br>
<br>
i've run this through gdb, here's the relevant output:<br>
<br>
<br>
#6 0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()<br>
from /usr/lib64/libstdc++.so.6<br>
#7 0x00000000005bb398 in _M_range_check (__n=<optimized out>,<br>
this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803<br>
#8 at (__n=<optimized out>, this=<optimized out>)<br>
at /usr/include/c++/5/bits/stl_vector.h:824<br>
#9 CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b507d0,<br>
fullname=fullname@entry=0x3f2434a0 "ML_BSSN::cA", vdim=3,<br>
refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df96370,<br>
bbox=..., dataset=83886080, is_index=false)<br>
at<br>
/home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899<br>
#10 0x00000000005bdea4 in CarpetIOHDF5::WriteVarChunkedParallel (<br>
cctkGH=cctkGH@entry=0x1b507d0, outfile=outfile@entry=16777216,<br>
io_bytes=@0x7fffffffc980: 1110772, request=0x3df96370,<br>
called_from_checkpoint=called_from_checkpoint@entry=true,indexfile=indexfile@entry=-1)<br>
at<br>
/home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:706<br>
#11 0x00000000005a233e in CarpetIOHDF5::Checkpoint (cctkGH=0x1b507d0,<br>
called_from=0)<br>
at<br>
/home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/CarpetIOHDF5.cc:1277<br>
#12 0x000000000041f0d5 in CCTK_CallFunction (<br>
function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,<br>
fdata=fdata@entry=0x1b4a4e8, data=data@entry=0x1b507d0)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/src/main/ScheduleInterface.c:312<br>
#13 0x0000000000ef6499 in Carpet::CallScheduledFunction (<br>
time_and_mode=time_and_mode@entry=0x1174842 "Meta mode",<br>
function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,<br>
attribute=attribute@entry=0x1b4a4e8, data=data@entry=0x1b507d0,<br>
user_timer=...)<br>
at<br>
/home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/Carpet/src/CallFunction.cc:380<br>
<br>
<br>
so the relevant bits of code seem to be in CarpetIOHDF5/src/Output.cc:706 and<br>
CarpetIOHDF5/src/Output.cc:899<br>
<br>
this seems to be triggered when writing hdf5 output in parallel. if i remove checkpointing the<br>
run goes fine, and i do get regular 2D hdf5 output. this does not seem to be written in<br>
parallel, though, as i get only one file per grid function/group. so it seems to be the parallel<br>
output that triggers the crash.<br>
<br>
i have also tried removing all my hdf5 libs and configuring ET with HDF5_DIR=BUILD, but the<br>
outcome was the same.<br>
<br>
has anyone seen such an error before? anything else i could provide to help diagnose this?<br>
<br>
thanks,<br>
Miguel<br>
<br>
_______________________________________________<br>
Users mailing list<br>
<a href="mailto:Users@einsteintoolkit.org" target="_blank">Users@einsteintoolkit.org</a><br>
<a href="http://lists.einsteintoolkit.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.einsteintoolkit.org/mailman/listinfo/users</a><br>
<br>
<br>
<br>
<br>
--<br>
Erik Schnetter <<a href="mailto:schnetter@cct.lsu.edu" target="_blank">schnetter@cct.lsu.edu</a>><br>
<a href="http://www.perimeterinstitute.ca/personal/eschnetter/" rel="noreferrer" target="_blank">http://www.perimeterinstitute.ca/personal/eschnetter/</a><br>
</blockquote>
</div></div>
<br>
</blockquote>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature">Erik Schnetter <<a href="mailto:schnetter@cct.lsu.edu" target="_blank">schnetter@cct.lsu.edu</a>><br><a href="http://www.perimeterinstitute.ca/personal/eschnetter/" target="_blank">http://www.perimeterinstitute.ca/personal/eschnetter/</a></div>
</div>
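The failure reported in this thread is a `std::out_of_range` thrown by `std::vector::at` with index 1 on a vector of size 1, inside the chained `.at(...)` expression that builds the "active" attribute. The thread does not establish which of the chained lookups is out of range or why, so the following is only a minimal sketch of the general pattern, not the Carpet code or the fix on master. `LocalBox`, `active_unchecked`, and `try_get_active` are hypothetical names introduced for illustration; the sketch assumes, without confirmation, that the index passed in can exceed the number of entries held by the vector on a given rank.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-component record; NOT Carpet's real type.
struct LocalBox {
  std::string active;
};

// Unchecked variant: same shape as the lookup in the diff above.
// std::vector::at throws std::out_of_range when component >= boxes.size().
std::string active_unchecked(const std::vector<LocalBox> &boxes,
                             std::size_t component) {
  std::ostringstream buf;
  buf << boxes.at(component).active;
  return buf.str();
}

// Guarded variant (an assumption, not the upstream fix): reports whether the
// index is valid so the caller can skip writing the attribute instead of
// aborting the run. 'out' is only modified when the function returns true.
bool try_get_active(const std::vector<LocalBox> &boxes, std::size_t component,
                    std::string &out) {
  if (component >= boxes.size()) {
    return false;
  }
  out = boxes[component].active;
  return true;
}
```

With a vector holding a single `LocalBox`, calling `active_unchecked(boxes, 1)` throws `std::out_of_range`, matching the kind of error seen in the backtrace, whereas `try_get_active(boxes, 1, out)` returns `false` and leaves `out` untouched. Skipping the attribute in that case has the same effect as commenting the code out, which, per the reply above, only matters for tools that read the set of active grid points.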