<div dir="ltr">Miguel<div><br></div><div>This was an error in the development version of CarpetIOHDF5 -- at one place, "component" should have been "local_component". This has now been corrected.</div><div><br></div><div>-erik</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 7, 2015 at 11:22 AM, Miguel Zilhão <span dir="ltr"><<a href="mailto:mzilhao@ffn.ub.es" target="_blank">mzilhao@ffn.ub.es</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">hi Erik,<br>
<br>
actually, i've just noticed that, for whatever reason, i was using Carpet's master branch... the rest of my ET installation does point to ET_2015_05 (and i thought Carpet did as well). i must have typed 'git checkout master' in the wrong place at some point, and never noticed it.<br>
<br>
so the 'fix' i mentioned in my previous email applies to Carpet's current master branch. if i check out branch ET_2015_05 and compile that (as i thought i was doing), it works fine without me needing to change anything.<br>
<br>
thanks,<br>
Miguel<span class=""><br>
<br>
<br>
On 07/08/15 15:08, Erik Schnetter wrote:<br>
</span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
Miguel<br>
<br>
Thanks for tracking this down. I believe this code already looks different on the current master.<br>
Can you just disable it, as you did? The set of active grid points is important for some very new<br>
post-processing and visualization tools, but you probably don't need it, and you don't need it in a<br>
checkpoint file unless you want to visualize the data stored there.<br>
<br>
-erik<br>
<br>
<br></span><span class="">
On Fri, Aug 7, 2015 at 8:31 AM, Miguel Zilhão <<a href="mailto:mzilhao@ffn.ub.es" target="_blank">mzilhao@ffn.ub.es</a>> wrote:<br>
<br>
Erik,<br>
<br>
some more info. i briefly went through git history and found the offending commit. up to Carpet<br>
commit d52042b867eea1e0770b6de87ff3929ab7b9d297 everything works fine for me. from commit<br>
69c73fb12d2ee41f01928122cd178d1bae9f8e13 onward i get the error i mentioned.<br>
<br>
indeed, in the current state, if i remove writing of the "active" attribute everything works<br>
just fine:<br>
<br>
diff --git a/CarpetIOHDF5/src/Output.cc b/CarpetIOHDF5/src/Output.cc<br>
index ec9b090..30c017d 100644<br>
--- a/CarpetIOHDF5/src/Output.cc<br>
+++ b/CarpetIOHDF5/src/Output.cc<br>
@@ -894,10 +894,11 @@ static int AddAttributes (const cGH *const cctkGH, const char *fullname,<br>
HDF5_ERROR (H5Awrite (attr, H5T_NATIVE_INT, &ioffset[0]));<br>
HDF5_ERROR (H5Aclose (attr));<br>
<br>
- ostringstream buf;<br></span>
- buf << (vdd.at(Carpet::map)-><br>
- local_boxes.at(mglevel).at(refinementlevel).at(component).active);<span class=""><br>
- WriteAttribute(dataset, "active", buf.str().c_str());<br>
+ // ostringstream buf;<br></span>
+ // buf << (vdd.at(Carpet::map)-><br>
+ // local_boxes.at(mglevel).at(refinementlevel).at(component).active);<div><div class="h5"><br>
+ // WriteAttribute(dataset, "active", buf.str().c_str());<br>
+<br>
}<br>
<br>
if (is_index) {<br>
<br>
<br>
i can't tell why it breaks once that bit of code is in, though...<br>
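<br>
A possible explanation, offered as a guess consistent with the fix described at the top of this thread ("component" should have been "local_component"): if local_boxes holds only the components owned by the current process, then indexing it with a global component number can run past the end once there is more than one MPI process. The sketch below is not Carpet code; LocalBoxes, to_local_component, active_by_local_index and active_by_global_index are hypothetical names used only to illustrate that kind of index mix-up.<br>

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-process box list: each process stores
// only the components it owns, so this vector can be shorter than the
// global number of components.
struct LocalBoxes {
  std::vector<std::string> active; // one entry per locally owned component
};

// Hypothetical helper: map a global component number to its position in
// the local list. `owned` lists the global numbers owned by this process.
// Throws std::out_of_range if the component is not owned here.
std::size_t to_local_component(const std::vector<int> &owned, int component) {
  for (std::size_t lc = 0; lc < owned.size(); ++lc) {
    if (owned[lc] == component) {
      return lc;
    }
  }
  throw std::out_of_range("component is not local to this process");
}

// Intended lookup: translate to the local index first, then use at().
std::string active_by_local_index(const LocalBoxes &boxes,
                                  const std::vector<int> &owned,
                                  int component) {
  return boxes.active.at(to_local_component(owned, component));
}

// Suspected mistake: use the global component number directly. at() is
// bounds-checked, so this throws std::out_of_range whenever the global
// number is not a valid position in the shorter local list.
std::string active_by_global_index(const LocalBoxes &boxes, int component) {
  return boxes.active.at(static_cast<std::size_t>(component));
}
```

For a process that owns a single component whose global number is 1, active_by_global_index calls at(1) on a one-element vector and throws, which would match the reported "__n (which is 1) >= this->size() (which is 1)" on rank 1, while active_by_local_index returns the stored entry.<br>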
<br>
thanks,<br>
Miguel<br>
<br>
<br>
<br>
On 06/08/15 15:43, Miguel Zilhão wrote:<br>
<br>
hi Erik,<br>
<br>
Can you look at <<a href="https://trac.einsteintoolkit.org/ticket/1800" rel="noreferrer" target="_blank">https://trac.einsteintoolkit.org/ticket/1800</a>>? Can you try replacing "reflevel" in line 816 of the file CarpetIOHDF5/src/Output.cc with "refinementlevel", and see whether this avoids the problem?<br>
<br>
<br>
thanks, but this does not seem to help... i still get the same error and backtrace in gdb:<br>
<br>
#6 0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()<br>
from /usr/lib64/libstdc++.so.6<br>
#7 0x0000000000c70402 in _M_range_check (__n=<optimized out>,<br>
this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803<br>
#8 at (__n=<optimized out>, this=<optimized out>)<br>
at /usr/include/c++/5/bits/stl_vector.h:824<br>
#9 CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b1b930,<br>
fullname=fullname@entry=0x3e52d760 "ML_BSSN::cA", vdim=3,<br>
refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df4c990,<br>
bbox=..., dataset=83886080, is_index=false)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899<br>
#10 0x0000000000c72e5a in CarpetIOHDF5::WriteVarChunkedParallel (<br>
cctkGH=cctkGH@entry=0x1b1b930, outfile=outfile@entry=167<br>
<br>
<br>
Miguel<br>
<br>
<br>
On Wed, Aug 5, 2015 at 8:11 PM, Miguel Zilhão <<a href="mailto:mzilhao@ffn.ub.es" target="_blank">mzilhao@ffn.ub.es</a>> wrote:<br></div></div><div><div class="h5">
<br>
hi all,<br>
<br>
i'm running the latest ET Hilbert on openSUSE tumbleweed and i'm having the following issue. upon trying to run a simple head-on collision configuration with McLachlan (attached parameter file), i get the error<br>
<br>
INFO (CarpetIOHDF5): ---------------------------------------------------------<br>
INFO (CarpetIOHDF5): Dumping initial checkpoint at iteration 0, simulation time 0<br>
INFO (CarpetIOHDF5): ---------------------------------------------------------<br>
terminate called after throwing an instance of 'std::out_of_range'<br>
what(): vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)<br>
Rank 1 with PID 5958 received signal 6<br>
<br>
when writing the checkpoint file.<br>
this only happens if i run with more than one MPI process; with a single processor it runs fine.<br>
<br>
i'm compiling with gcc-5, but i find the same problem with gcc-4.8. i was running this very same configuration just fine a couple of months ago, so it must have been some update i've made in the meantime (either to my OS or to ET).<br>
i've also tried with different configurations and the outcome is the same.<br>
<br>
i've run this through gdb, here's the relevant output:<br>
<br>
<br>
#6 0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()<br>
from /usr/lib64/libstdc++.so.6<br>
#7 0x00000000005bb398 in _M_range_check (__n=<optimized out>,<br>
this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803<br>
#8 at (__n=<optimized out>, this=<optimized out>)<br>
at /usr/include/c++/5/bits/stl_vector.h:824<br>
#9 CarpetIOHDF5::AddAttributes (cctkGH=cctkGH@entry=0x1b507d0,<br>
fullname=fullname@entry=0x3f2434a0 "ML_BSSN::cA", vdim=3,<br>
refinementlevel=refinementlevel@entry=0, request=request@entry=0x3df96370,<br>
bbox=..., dataset=83886080, is_index=false)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899<br>
#10 0x00000000005bdea4 in CarpetIOHDF5::WriteVarChunkedParallel (<br>
cctkGH=cctkGH@entry=0x1b507d0, outfile=outfile@entry=16777216,<br>
io_bytes=@0x7fffffffc980: 1110772, request=0x3df96370,<br>
called_from_checkpoint=called_from_checkpoint@entry=true, indexfile=indexfile@entry=-1)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:706<br>
#11 0x00000000005a233e in CarpetIOHDF5::Checkpoint (cctkGH=0x1b507d0,<br>
called_from=0)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/CarpetIOHDF5.cc:1277<br>
#12 0x000000000041f0d5 in CCTK_CallFunction (<br>
function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,<br>
fdata=fdata@entry=0x1b4a4e8, data=data@entry=0x1b507d0)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/src/main/ScheduleInterface.c:312<br>
#13 0x0000000000ef6499 in Carpet::CallScheduledFunction (<br>
time_and_mode=time_and_mode@entry=0x1174842 "Meta mode",<br>
function=function@entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,<br>
attribute=attribute@entry=0x1b4a4e8, data=data@entry=0x1b507d0,<br>
user_timer=...)<br>
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/Carpet/src/CallFunction.cc:380<br>
<br>
<br>
so the relevant bits of code seem to be in CarpetIOHDF5/src/Output.cc:706 and CarpetIOHDF5/src/Output.cc:899<br>
<br>
this seems to be triggered when writing hdf5 output in parallel. if i remove checkpointing the run goes fine, and i do get regular 2D hdf5 output. this does not seem to be written in parallel, though, as i get only one file per grid function/group. so it seems to be the parallel output that triggers the crash.<br>
<br>
i have also tried removing all my hdf5 libs and configuring ET with HDF5_DIR=BUILD, but the outcome was the same.<br>
<br>
has anyone seen such an error before? anything else i could provide to help diagnose this?<br>
<br>
thanks,<br>
Miguel<br>
<br>
_______________________________________________<br>
Users mailing list<br>
<a href="mailto:Users@einsteintoolkit.org" target="_blank">Users@einsteintoolkit.org</a></div></div>
<span class=""><br>
<a href="http://lists.einsteintoolkit.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.einsteintoolkit.org/mailman/listinfo/users</a><br>
<br>
<br>
<br>
<br>
--<br>
Erik Schnetter <<a href="mailto:schnetter@cct.lsu.edu" target="_blank">schnetter@cct.lsu.edu</a>></span>
<span class=""><br>
<a href="http://www.perimeterinstitute.ca/personal/eschnetter/" rel="noreferrer" target="_blank">http://www.perimeterinstitute.ca/personal/eschnetter/</a><br>
<br>
</span></blockquote>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature">Erik Schnetter <<a href="mailto:schnetter@cct.lsu.edu" target="_blank">schnetter@cct.lsu.edu</a>><br><a href="http://www.perimeterinstitute.ca/personal/eschnetter/" target="_blank">http://www.perimeterinstitute.ca/personal/eschnetter/</a></div>
</div>