[Users] std::out_of_range error while checkpointing
Miguel Zilhão
mzilhao at ffn.ub.es
Wed Aug 5 19:11:12 CDT 2015
hi all,
i'm running latest ET Hilbert on openSUSE tumbleweed and i'm having the following issue. upon trying
to run a simple head-on collision configuration with McLachlan (attached parameter file), i get the
error
INFO (CarpetIOHDF5): ---------------------------------------------------------
INFO (CarpetIOHDF5): Dumping initial checkpoint at iteration 0, simulation time 0
INFO (CarpetIOHDF5): ---------------------------------------------------------
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)
Rank 1 with PID 5958 received signal 6
when writing the checkpoint file.
this only happens if i run with more than one MPI process; with a single processor it runs fine.
i'm compiling with gcc-5, but i find the same problem with gcc-4.8. i was running this very same
configuration just fine a couple of months ago, so it must have been some update i've made in the
meantime (either to my OS or to ET).
i've also tried with different configurations and the outcome is the same.
i've ran this through gdb, here's the relevant output:
#6 0x00007ffff4f6d4d5 in std::__throw_out_of_range_fmt(char const*, ...) ()
from /usr/lib64/libstdc++.so.6
#7 0x00000000005bb398 in _M_range_check (__n=<optimized out>,
this=<optimized out>) at /usr/include/c++/5/bits/stl_vector.h:803
#8 at (__n=<optimized out>, this=<optimized out>)
at /usr/include/c++/5/bits/stl_vector.h:824
#9 CarpetIOHDF5::AddAttributes (cctkGH=cctkGH at entry=0x1b507d0,
fullname=fullname at entry=0x3f2434a0 "ML_BSSN::cA", vdim=3,
refinementlevel=refinementlevel at entry=0, request=request at entry=0x3df96370,
bbox=..., dataset=83886080, is_index=false)
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:899
#10 0x00000000005bdea4 in CarpetIOHDF5::WriteVarChunkedParallel (
cctkGH=cctkGH at entry=0x1b507d0, outfile=outfile at entry=16777216,
io_bytes=@0x7fffffffc980: 1110772, request=0x3df96370,
called_from_checkpoint=called_from_checkpoint at entry=true,indexfile=indexfile at entry=-1)
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:706
#11 0x00000000005a233e in CarpetIOHDF5::Checkpoint (cctkGH=0x1b507d0,
called_from=0)
at
/home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/CarpetIOHDF5/src/CarpetIOHDF5.cc:1277
#12 0x000000000041f0d5 in CCTK_CallFunction (
function=function at entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,
fdata=fdata at entry=0x1b4a4e8, data=data at entry=0x1b507d0)
at /home/mzilhao/Trabalho/projectos/ET/Cactus/src/main/ScheduleInterface.c:312
#13 0x0000000000ef6499 in Carpet::CallScheduledFunction (
time_and_mode=time_and_mode at entry=0x1174842 "Meta mode",
function=function at entry=0x5a2da0 <CarpetIOHDF5::CarpetIOHDF5_InitialDataCheckpoint(cGH*)>,
attribute=attribute at entry=0x1b4a4e8, data=data at entry=0x1b507d0,
user_timer=...)
at /home/mzilhao/Trabalho/projectos/ET/Cactus/arrangements/Carpet/Carpet/src/CallFunction.cc:380
so the relevant bits of code seem to be in CarpetIOHDF5/src/Output.cc:706 and
CarpetIOHDF5/src/Output.cc:899
this seems to be triggered when writing hdf5 output in parallel. if i remove checkpointing the run
goes fine, and i do get regular 2D hdf5 output. this does not seem to be written in parallel,
though, as i get only one file per grid function/group. so it seems to be the parallel output that
triggers the crash.
i have also tried removing all my hdf5 libs and configuring ET with HDF5_DIR=BUILD, but the outcome
was the same.
has anyone seen such an error before? anything else i could provide to help diagnose this?
thanks,
Miguel
-------------- next part --------------
# Basic Cactus parameters
#------------------------------------------------------------------------------
Cactus::cctk_run_title = $parfile
Cactus::cctk_full_warnings = yes
Cactus::highlight_warning_messages = no
# Cactus::terminate = "any"
Cactus::terminate = "time"
# Cactus::cctk_final_time = 100.0
Cactus::cctk_final_time = 16.0
# Cactus::max_runtime = 5 # 5.0 min
# Basic thorns necessary to run
#------------------------------------------------------------------------------
ActiveThorns = "AEILocalInterp"
ActiveThorns = "Fortran GSL HDF5"
ActiveThorns = "GenericFD"
ActiveThorns = "LocalInterp LoopControl Slab"
ActiveThorns = "InitBase IOUtil"
#------------------------------------------------------------------------------
InitBase::initial_data_setup_method = "init_all_levels"
# Grid setup
#------------------------------------------------------------------------------
ActiveThorns = "Boundary CartGrid3D CoordBase"
ActiveThorns = "ReflectionSymmetry SymBase"
#ActiveThorns = "RotatingSymmetry180"
#------------------------------------------------------------------------------
CoordBase::domainsize = "minmax"
# make sure all (xmax - xmin)/dx are integers!
CoordBase::xmin = 0.00
CoordBase::ymin = 0.00
CoordBase::zmin = 0.00
CoordBase::xmax = +16.2849
CoordBase::ymax = +16.2849
CoordBase::zmax = +16.2849
CoordBase::dx = 0.2857
CoordBase::dy = 0.2857
CoordBase::dz = 0.2857
CoordBase::boundary_size_x_lower = 3
CoordBase::boundary_size_y_lower = 3
CoordBase::boundary_size_z_lower = 3
CoordBase::boundary_size_x_upper = 3
CoordBase::boundary_size_y_upper = 3
CoordBase::boundary_size_z_upper = 3
CoordBase::boundary_shiftout_x_lower = 1
CoordBase::boundary_shiftout_y_lower = 1
CoordBase::boundary_shiftout_z_lower = 1
CartGrid3D::type = "coordbase"
ReflectionSymmetry::reflection_x = yes
ReflectionSymmetry::reflection_y = yes
ReflectionSymmetry::reflection_z = yes
ReflectionSymmetry::avoid_origin_x = no
ReflectionSymmetry::avoid_origin_y = no
ReflectionSymmetry::avoid_origin_z = no
# Some Carpet parameters
#------------------------------------------------------------------------------
ActiveThorns = "Carpet CarpetLib CarpetInterp CarpetReduce CarpetSlab"
#------------------------------------------------------------------------------
Carpet::verbose = no
Carpet::veryverbose = no
Carpet::schedule_barriers = no
Carpet::storage_verbose = no
#Carpet::timers_verbose = no
CarpetLib::output_bboxes = no
Carpet::domain_from_coordbase = yes
Carpet::max_refinement_levels = 5
driver::ghost_size = 3
Carpet::use_buffer_zones = yes
Carpet::prolongation_order_space = 5
Carpet::prolongation_order_time = 2
Carpet::convergence_level = 0
Carpet::init_fill_timelevels = yes
Carpet::init_3_timelevels = no
Carpet::poison_new_timelevels = yes
CarpetLib::poison_new_memory = yes
Carpet::output_timers_every = 32
CarpetLib::print_timestats_every = 5120
CarpetLib::print_memstats_every = 5120
Carpet::grid_structure_filename = "carpet-grid-structure"
Carpet::grid_coordinates_filename = "carpet-grid-coordinates"
# Check for NaNs
#------------------------------------------------------------------------------
ActiveThorns = "NaNChecker"
#------------------------------------------------------------------------------
NaNChecker::check_every = 512
#NaNChecker::verbose = "all"
#NaNChecker::action_if_found = "just warn"
NaNChecker::action_if_found = "terminate"
NaNChecker::check_vars = "
ADMBase::metric
ADMBase::curv
ADMBase::lapse
ADMBase::shift
"
# Puncture tracking and regrid
#------------------------------------------------------------------------------
ActiveThorns = "SphericalSurface"
ActiveThorns = "ADMBase CarpetRegrid2 PunctureTracker"
ActiveThorns = "CarpetTracker"
#------------------------------------------------------------------------------
SphericalSurface::nsurfaces = 5
CarpetTracker::surface[0] = 0
CarpetTracker::surface[1] = 1
PunctureTracker::track [0] = "yes"
PunctureTracker::initial_z [0] = 3.001
PunctureTracker::which_surface_to_store_info[0] = 0
PunctureTracker::track [1] = "yes"
PunctureTracker::initial_z [1] = -3.001
PunctureTracker::which_surface_to_store_info[1] = 1
PunctureTracker::verbose = "no"
CarpetRegrid2::regrid_every = 64
CarpetRegrid2::freeze_unaligned_levels = yes
CarpetRegrid2::symmetry_rotating180 = no
CarpetRegrid2::verbose = no
CarpetRegrid2::num_centres = 2
CarpetRegrid2::num_levels_1 = 5
CarpetRegrid2::position_z_1 = 3.0
CarpetRegrid2::radius_1[ 1] = 8.0
CarpetRegrid2::radius_1[ 2] = 2.0
CarpetRegrid2::radius_1[ 3] = 1.0
CarpetRegrid2::radius_1[ 4] = 0.5
CarpetRegrid2::movement_threshold_1 = 0.16
CarpetRegrid2::num_levels_2 = 5
CarpetRegrid2::position_z_2 = -3.0
CarpetRegrid2::radius_2[ 1] = 8.0
CarpetRegrid2::radius_2[ 2] = 2.0
CarpetRegrid2::radius_2[ 3] = 1.0
CarpetRegrid2::radius_2[ 4] = 0.5
CarpetRegrid2::movement_threshold_2 = 0.16
# ActiveThorns = "CarpetMask"
# CarpetMask::verbose = no
# CarpetMask::excluded_surface [0] = 0
# CarpetMask::excluded_surface_factor[0] = 1.0
# CarpetMask::excluded_surface [1] = 1
# CarpetMask::excluded_surface_factor[1] = 1.0
# CarpetMask::excluded_surface [2] = 2
# CarpetMask::excluded_surface_factor[2] = 1.0
# Integration method
#------------------------------------------------------------------------------
ActiveThorns = "MoL Time"
#------------------------------------------------------------------------------
MoL::ODE_Method = "RK4"
MoL::MoL_Intermediate_Steps = 4
MoL::MoL_Num_Scratch_Levels = 1
Carpet::time_refinement_factors = "[1, 2, 4, 8, 16, 32]"
Time::dtfac = 0.5
# Initial data
#------------------------------------------------------------------------------
ActiveThorns = "ADMCoupling ADMMacros CoordGauge SpaceMask StaticConformal TmunuBase"
ActiveThorns = "TwoPunctures"
#------------------------------------------------------------------------------
ADMMacros::spatial_order = 4
ADMBase::metric_type = "physical"
ADMBase::initial_data = "twopunctures"
ADMBase::initial_lapse = "twopunctures-averaged"
ADMBase::initial_shift = "zero"
ADMBase::initial_dtlapse = "zero"
ADMBase::initial_dtshift = "zero"
# needed for AHFinderDirect
ADMBase::metric_timelevels = 3
TwoPunctures::swap_xz = "yes"
TwoPunctures::par_b = 3.001
TwoPunctures::par_m_plus = 0.5
TwoPunctures::par_m_minus = 0.5
TwoPunctures::par_P_plus [1] = -0.0
TwoPunctures::par_P_minus[1] = 0.0
TwoPunctures::TP_epsilon = 1.0e-10
TwoPunctures::TP_Tiny = 1.0e-10
# TwoPunctures::verbose = yes
# Thorns for the evolution
#------------------------------------------------------------------------------
ActiveThorns = "ML_BSSN ML_BSSN_Helper NewRad"
ActiveThorns = "ML_ADMConstraints"
#------------------------------------------------------------------------------
ADMBase::evolution_method = "ML_BSSN"
ADMBase::lapse_evolution_method = "ML_BSSN"
ADMBase::shift_evolution_method = "ML_BSSN"
ADMBase::dtlapse_evolution_method = "ML_BSSN"
ADMBase::dtshift_evolution_method = "ML_BSSN"
ML_BSSN::harmonicN = 1 # 1+log
ML_BSSN::harmonicF = 2.0 # 1+log
ML_BSSN::ShiftGammaCoeff = 0.75
ML_BSSN::BetaDriver = 1.0
ML_BSSN::LapseAdvectionCoeff = 1.0
ML_BSSN::ShiftAdvectionCoeff = 1.0
ML_BSSN::MinimumLapse = 1.0e-8
ML_BSSN::my_initial_boundary_condition = "extrapolate-gammas"
ML_BSSN::my_rhs_boundary_condition = "NewRad"
Boundary::radpower = 2
ML_BSSN::ML_log_confac_bound = "none"
ML_BSSN::ML_metric_bound = "none"
ML_BSSN::ML_Gamma_bound = "none"
ML_BSSN::ML_trace_curv_bound = "none"
ML_BSSN::ML_curv_bound = "none"
ML_BSSN::ML_lapse_bound = "none"
ML_BSSN::ML_dtlapse_bound = "none"
ML_BSSN::ML_shift_bound = "none"
ML_BSSN::ML_dtshift_bound = "none"
# Numerical dissipation
#------------------------------------------------------------------------------
ActiveThorns = "Dissipation"
#------------------------------------------------------------------------------
Dissipation::order = 5
Dissipation::vars = "
ML_BSSN::ML_log_confac
ML_BSSN::ML_metric
ML_BSSN::ML_trace_curv
ML_BSSN::ML_curv
ML_BSSN::ML_Gamma
ML_BSSN::ML_lapse
ML_BSSN::ML_shift
ML_BSSN::ML_dtlapse
ML_BSSN::ML_dtshift
"
# Further necessary thorns
#------------------------------------------------------------------------------
ActiveThorns = "SummationByParts"
#------------------------------------------------------------------------------
SummationByParts::order = 4
# # Wave extraction
# #------------------------------------------------------------------------------
# ActiveThorns = "WeylScal4 Multipole"
# #------------------------------------------------------------------------------
# WeylScal4::offset = 1e-8
# WeylScal4::fd_order = "4th"
# WeylScal4::verbose = 0
# Multipole::nradii = 3
# Multipole::out_every = 128
# Multipole::radius[0] = 60
# Multipole::radius[1] = 80
# Multipole::radius[2] = 100
# Multipole::variables = "WeylScal4::Psi4r{sw=-2 cmplx='WeylScal4::Psi4i' name='Psi4'}"
# Multipole::l_max = 4
# #Multipole::m_mode = 4
# Horizon thorns
#------------------------------------------------------------------------------
ActiveThorns = "AHFinderDirect"
#------------------------------------------------------------------------------
AHFinderDirect::find_every = 128
AHFinderDirect::run_at_CCTK_POST_RECOVER_VARIABLES = no
AHFinderDirect::move_origins = yes
AHFinderDirect::reshape_while_moving = yes
AHFinderDirect::predict_origin_movement = yes
AHFinderDirect::geometry_interpolator_name = "Lagrange polynomial interpolation"
AHFinderDirect::geometry_interpolator_pars = "order=4"
AHFinderDirect::surface_interpolator_name = "Lagrange polynomial interpolation"
AHFinderDirect::surface_interpolator_pars = "order=4"
AHFinderDirect::output_h_every = 0
AHFinderDirect::N_horizons = 3
AHFinderDirect::origin_z [1] = +3.0
AHFinderDirect::initial_guess__coord_sphere__z_center[1] = +3.0
AHFinderDirect::initial_guess__coord_sphere__radius [1] = 0.25
AHFinderDirect::which_surface_to_store_info [1] = 2
AHFinderDirect::reset_horizon_after_not_finding [1] = no
AHFinderDirect::track_origin_from_grid_scalar [1] = yes
AHFinderDirect::track_origin_source_x [1] = "PunctureTracker::pt_loc_x[0]"
AHFinderDirect::track_origin_source_y [1] = "PunctureTracker::pt_loc_y[0]"
AHFinderDirect::track_origin_source_z [1] = "PunctureTracker::pt_loc_z[0]"
AHFinderDirect::max_allowable_horizon_radius [1] = 3
#AHFinderDirect::dont_find_after_individual_time [1] = 30.0
AHFinderDirect::origin_z [2] = -3.0
AHFinderDirect::initial_guess__coord_sphere__z_center[2] = -3.0
AHFinderDirect::initial_guess__coord_sphere__radius [2] = 0.25
AHFinderDirect::which_surface_to_store_info [2] = 3
AHFinderDirect::reset_horizon_after_not_finding [2] = no
AHFinderDirect::track_origin_from_grid_scalar [2] = yes
AHFinderDirect::track_origin_source_x [2] = "PunctureTracker::pt_loc_x[1]"
AHFinderDirect::track_origin_source_y [2] = "PunctureTracker::pt_loc_y[1]"
AHFinderDirect::track_origin_source_z [2] = "PunctureTracker::pt_loc_z[1]"
AHFinderDirect::max_allowable_horizon_radius [2] = 3
#AHFinderDirect::dont_find_after_individual_time [2] = 30.0
AHFinderDirect::origin_x [3] = 0
AHFinderDirect::find_after_individual_time [3] = 20.0
AHFinderDirect::initial_guess__coord_sphere__z_center[3] = 0
AHFinderDirect::initial_guess__coord_sphere__radius [3] = 1.2
AHFinderDirect::which_surface_to_store_info [3] = 4
AHFinderDirect::reset_horizon_after_not_finding [3] = no
AHFinderDirect::max_allowable_horizon_radius [3] = 6
# I/O thorns
#------------------------------------------------------------------------------
ActiveThorns = "CarpetIOBasic"
ActiveThorns = "CarpetIOScalar"
ActiveThorns = "CarpetIOASCII"
Activethorns = "CarpetIOHDF5"
#------------------------------------------------------------------------------
IOBasic::outInfo_every = 8
# IOBasic::outInfo_reductions = "norm2"
IOBasic::outInfo_reductions = "norm2 minimum maximum"
IOBasic::outInfo_vars = "
Carpet::physical_time_per_hour
ADMBase::lapse
ML_ADMConstraints::H
"
IO::out_dir = $parfile
# # for scalar reductions of 3D grid functions
# IOScalar::one_file_per_group = no
# IOScalar::outScalar_every = 128
# IOScalar::outScalar_vars = "
# ADMBase::lapse
# "
IOASCII::one_file_per_group = no
# IOASCII::output_symmetry_points = no
# IOASCII::out3D_ghosts = no
# IOASCII::out1D_every = 128
IOASCII::out1D_every = 1
IOASCII::out1D_d = "no"
IOASCII::out1D_vars = "ADMBase::shift
ADMBase::lapse
ML_BSSN::phi
ML_BSSN::ML_metric
ML_BSSN::ML_curv
ML_BSSN::ML_trace_curv
ML_BSSN::ML_Gamma
"
IOHDF5::out2D_every = 256
IOHDF5::out2D_vars = "
ADMBase::metric
ADMBase::curv
ADMBase::lapse
ADMBase::shift
ML_ADMConstraints::ML_Ham
ML_ADMConstraints::ML_mom
ML_BSSN::phi
"
# IOHDF5::use_checksums = yes
# IOHDF5::compression_level = 1
# IOHDF5::one_file_per_group = yes
# IOHDF5::output_symmetry_points = no
# IOHDF5::out3D_ghosts = no
# Checkpointing and recovery
# -----------------------------------------------------------------------------
IOHDF5::checkpoint = yes
IO::checkpoint_dir = $parfile
IO::checkpoint_ID = yes
IO::checkpoint_every_walltime_hours = 6.0
IO::checkpoint_on_terminate = yes
IO::recover = "autoprobe"
IO::recover_dir = $parfile
# Formaline and TimerReport
# -----------------------------------------------------------------------------
# ActiveThorns = "Formaline"
ActiveThorns = "TimerReport"
# -----------------------------------------------------------------------------
TimerReport::out_every = 5120
TimerReport::out_filename = "TimerReport"
TimerReport::output_all_timers_together = yes
TimerReport::output_all_timers_readable = yes
TimerReport::n_top_timers = 20
More information about the Users
mailing list