[Users] CarpetIOHDF5 recover failure with manual topology
Yosef Zlochower
yosef at astro.rit.edu
Mon Sep 9 15:24:52 CDT 2019
Hi,
I have been trying to debug why some of my runs could not recover from a
checkpoint file, even though they otherwise proceed normally. I have attached
a minimal parfile that shows the problem. A small grid is manually distributed
over 8 processors, and the run terminates at iteration 2. An attempt to recover
then fails with NaNs in grid::x. If the manual topology section is commented
out, no problems are seen.
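
For reference, the manual topology section is the block below. It appears
commented out in the attached parfile; removing the comment markers, as shown
here, should give the failing configuration (8 x 1 x 1 processes along x, y, z).
I am assuming the run is started on 8 MPI processes to match.

```
# Manual processor decomposition: 8 processes along x, 1 along y and z.
# Assumes the job is launched on exactly 8 MPI processes.
Carpet::processor_topology      = "manual"
Carpet::processor_topology_3d_x = 8
Carpet::processor_topology_3d_y = 1
Carpet::processor_topology_3d_z = 1
```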
--
Dr. Yosef Zlochower
Center for Computational Relativity and Gravitation
Associate Professor
School of Mathematical Sciences
Rochester Institute of Technology
85 Lomb Memorial Drive
Rochester, NY 14623
Office:74-2067
Phone: +1 585-475-6103
yosef at astro.rit.edu
-------------- next part --------------
ActiveThorns = "time"
#ActiveThorns = "SystemTopology hwloc"
ActiveThorns = "Carpet CarpetLib CarpetReduce CarpetIOBasic"
ActiveThorns = "CartGrid3D CoordBase SymBase ioutil carpetioascii CarpetIOScalar CarpetIOHDF5 LocalReduce"
ActiveThorns = "NaNChecker"
NaNChecker::check_every = 1 #16384 #2^number of refinement levels
NaNChecker::action_if_found = "terminate" #, "just warn", "abort"
NaNChecker::check_vars = "
grid::x
"
NaNChecker::verbose = "all"
NaNChecker::out_NaNmask = "yes"
################################################################################
# Time stepping
################################################################################
# grid parameters
CartGrid3D::type = "coordbase"
CartGrid3D::domain = "full"
CoordBase::xmin = -1
CoordBase::ymin = -1
CoordBase::zmin = -1
CoordBase::xmax = 1
CoordBase::ymax = 1
CoordBase::zmax = 1
CoordBase::domainsize = "minmax"
CoordBase::spacing = "numcells"
CoordBase::ncells_x = 200
CoordBase::ncells_y = 12
CoordBase::ncells_z = 12
CoordBase::boundary_size_x_lower = 3
CoordBase::boundary_size_y_lower = 3
CoordBase::boundary_size_z_lower = 3
CoordBase::boundary_size_x_upper = 3
CoordBase::boundary_size_y_upper = 3
CoordBase::boundary_size_z_upper = 3
Carpet::max_refinement_levels = 1
driver::ghost_size = 3
#
Carpet::domain_from_coordbase = "yes"
Carpet::enable_all_storage = no
Carpet::poison_new_timelevels = "yes"
Carpet::poison_value = 255
Carpet::init_3_timelevels = "no"
Carpet::init_fill_timelevels = "yes"
CarpetLib::poison_new_memory = "yes"
CarpetLib::poison_value = 255
Carpet::refinement_centering = vertex
Carpet::use_tapered_grids = no
Carpet::prolongation_order_space = 5
Carpet::prolongation_order_time = 2
Carpet::max_timelevels = 3
Carpet::verbose = no
Carpet::veryverbose = no
#Carpet::processor_topology = "manual"
#Carpet::processor_topology_3d_x = 8
#Carpet::processor_topology_3d_y = 1
#Carpet::processor_topology_3d_z = 1
IOBasic::outInfo_every = 2
IOBasic::outInfo_vars = "grid::x"
IO::out_fileinfo = "all"
IOScalar::outScalar_every = 128
IOScalar::one_file_per_group = yes
IOScalar::outScalar_vars = "
Grid::x
"
IO::out_dir = $parfile
################################################################################
# Checkpointing and recovery
################################################################################
CarpetIOHDF5::checkpoint = yes
IO::checkpoint_ID = no
IO::recover = "autoprobe"
IO::checkpoint_every_walltime_hours = 12
#IO::out_proc_every = 2
IO::checkpoint_keep = 3
IO::checkpoint_on_terminate = yes
IO::checkpoint_dir = "checkpoints"
IO::recover_dir = "checkpoints"
IO::abort_on_io_errors = yes
CarpetIOHDF5::open_one_input_file_at_a_time = yes
cactus::cctk_itlast = 2