[ET Trac] #2890: CarpetX: crash/failed assertion in Z4c_RHS

Miren Radia trac-noreply at einsteintoolkit.org
Tue Oct 7 10:47:43 CDT 2025


#2890: CarpetX: crash/failed assertion in Z4c_RHS

 Reporter: Miren Radia
   Status: submitted
Milestone: 
  Version: development version
     Type: bug
 Priority: major
Component: CarpetX

Changes (by Miren Radia):
Hello,

I’ve been trying to run a binary black hole simulation with CarpetX on [tursa](https://epcced.github.io/dirac-docs/tursa-user-guide/hardware/) (Nvidia A100 GPUs), but the code crashes with a failed assertion that I haven’t been able to resolve. I have been using Liwei Ji’s development branches in order to use subcycling (following the guidance described [here](https://github.com/lwJi/Tutorial-Subcycling)), so it’s possible my problems are related to this.

Please find attached the following:

* My thornlist: `carpetx-subcycling.th`
* My parameter file: `test-bbh.par` (a very low-resolution, small configuration intended only to reproduce the crash; I was originally trying something bigger)
* My OptionsList: `tursa.cfg`
* The full output from running the code through `cuda-gdb` including a backtrace at the failed assertion and printing out some variables that seem to be leading to the failed assertion: `test-bbh-debug.out`

Here’s the end of the output printed from the simulation (I have increased the verbosity):

```
INFO (CarpetX): Starting evolution...
INFO (CarpetX): Regridding...
INFO (CarpetX): Setting max_grid_size values for all levels before regridding
INFO (CarpetX): ErrorEst patch 0 level 0
INFO (CarpetX): ErrorEst patch 0 level 0 done. Set/clear/total=288/3808/4096=7%/93%/100%
INFO (CarpetX):   old levels 2, new levels 2
INFO (CarpetX):   level 0: 1 boxes, 4096 cells (100%)
INFO (CarpetX):   level 1: 1 boxes, 8192 cells (25%, 25%)
INFO (CarpetX): ScheduleTraverseGH iteration 1 CCTK_PRESTEP
INFO (CarpetX): ScheduleTraverseGH iteration 1 CCTK_EVOL
INFO (CarpetX): CallFunction iteration 1 CCTK_EVOL: ODESolvers::ODESolvers_Solve_Subcycling
INFO (ODESolvers): Integrator is RK4
INFO (ODESolvers):   Integrating 22 variables
INFO (CarpetX): SyncGroupsProlongateOnly Z4C::CHI_OLD, Z4C::GAMMA_TILDE_OLD, Z4C::K_HAT_OLD, Z4C::A_TILDE_OLD, Z4C::GAM_TILDE_OLD, Z4C::THETA_OLD, Z4C::ALPHAG_OLD, Z4C::BETAG_OLD
INFO (CarpetX): SyncGroupsProlongateOnly Z4C::CHI_K1, Z4C::GAMMA_TILDE_K1, Z4C::K_HAT_K1, Z4C::A_TILDE_K1, Z4C::GAM_TILDE_K1, Z4C::THETA_K1, Z4C::ALPHAG_K1, Z4C::BETAG_K1
INFO (CarpetX): SyncGroupsProlongateOnly Z4C::CHI_K2, Z4C::GAMMA_TILDE_K2, Z4C::K_HAT_K2, Z4C::A_TILDE_K2, Z4C::GAM_TILDE_K2, Z4C::THETA_K2, Z4C::ALPHAG_K2, Z4C::BETAG_K2
INFO (CarpetX): SyncGroupsProlongateOnly Z4C::CHI_K3, Z4C::GAMMA_TILDE_K3, Z4C::K_HAT_K3, Z4C::A_TILDE_K3, Z4C::GAM_TILDE_K3, Z4C::THETA_K3, Z4C::ALPHAG_K3, Z4C::BETAG_K3
INFO (CarpetX): SyncGroupsProlongateOnly Z4C::CHI_K4, Z4C::GAMMA_TILDE_K4, Z4C::K_HAT_K4, Z4C::A_TILDE_K4, Z4C::GAM_TILDE_K4, Z4C::THETA_K4, Z4C::ALPHAG_K4, Z4C::BETAG_K4
INFO (ODESolvers): Set interior old state at t=0, to be prolongated later
INFO (ODESolvers): Fill refinement boundary ghost zones using Ys for stage #1 at t=0
INFO (ODESolvers): Calculating RHS #1 at t=0
INFO (CarpetX): CallFunction iteration 1 Z4c_RHSGroup: Z4c::Z4c_RHS
terminate called recursively
terminate called after throwing an instance of 'std::runtime_error'
  what():  Assertion `this->domain.numPts() > 0' failed, file "/mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/configs/sim-debug/scratch/build/AMReX/amrex-24.10/Src/Base/AMReX_BaseFab.H", line 1930
```
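
As a quick sanity check on the regridding lines above, the percentages can be reproduced from the cell counts. This is only a sketch of the arithmetic: the refinement factor of 2 in each of three dimensions is an assumption (it is not stated in the log), and `percent` is a helper name introduced here for illustration.

```python
def percent(part: int, whole: int) -> int:
    """Round part/whole to the nearest whole percent."""
    return round(100 * part / whole)


# Counts copied from the ErrorEst line of the log
set_cells, clear_cells, total_cells = 288, 3808, 4096
# Counts copied from the "level 0" and "level 1" lines of the log
level0_cells, level1_cells = 4096, 8192

# Assumption: each level refines the previous one by 2 in each of 3 dimensions
refinement_factor = 2

# Number of fine cells that would be needed to cover all of level 0
fine_cells_full = level0_cells * refinement_factor**3

print(percent(set_cells, total_cells), percent(clear_cells, total_cells))
print(percent(level1_cells, fine_cells_full))
```

Under that assumption the counts are self-consistent (set plus clear equals the total, and level 1 covers a quarter of the domain), so the grid structure reported just before the crash does not look empty at the level-wide scale.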

Here’s the relevant part of the backtrace:

```none
#0  0x000015554c8de52f in raise () from /lib64/libc.so.6
#1  0x000015554c8b1e65 in abort () from /lib64/libc.so.6
#2  0x000015554d176bd9 in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x000015554d18225a in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
#4  0x000015554d1812c9 in __cxa_call_terminate (ue_header=0x63b60110) at ../../../../libstdc++-v3/libsupc++/eh_call.cc:54
#5  0x000015554d1819e6 in __gxx_personality_v0 (version=<optimized out>, actions=6, exception_class=5138137972254386944, ue_header=<optimized out>,
    context=0x7fffffff46c0) at ../../../../libstdc++-v3/libsupc++/eh_personality.cc:688
#6  0x000015554cc7cd64 in _Unwind_RaiseException_Phase2 (exc=0x63b60110, context=0x7fffffff46c0, frames_p=0x7fffffff47b0)
    at ../../../libgcc/unwind.inc:64
#7  0x000015554cc7d421 in _Unwind_RaiseException (exc=0x63b60110) at ../../../libgcc/unwind.inc:136
#8  0x000015554d18250a in __cxa_throw (obj=<optimized out>, tinfo=0x52feb9e0 <typeinfo for std::runtime_error@@GLIBCXX_3.4>,
    dest=0x423c00 <std::runtime_error::~runtime_error()@plt>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:93
#9  0x0000000002d5bab7 in amrex::Assert_host (EX=0x437ad0f "this->domain.numPts() > 0",
    file=0x4379500 "/mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/configs/sim-debug/scratch/build/AMReX/amrex-24.10/Src/Base/AMReX_BaseFab.H", line=1930, msg=0x0 <cub::CUB_200200_600_610_700_750_800_860_NS::EmptyKernel<void>()>)
    at /mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/configs/sim-debug/scratch/build/AMReX/amrex-24.10/Src/Base/AMReX.cpp:292
#10 0x0000000002e58032 in amrex::Assert (msg=0x0 <cub::CUB_200200_600_610_700_750_800_860_NS::EmptyKernel<void>()>, line=1930,
    file=0x4379500 "/mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/configs/sim-debug/scratch/build/AMReX/amrex-24.10/Src/Base/AMReX_BaseFab.H", EX=0x437ad0f "this->domain.numPts() > 0")
    at /mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/configs/sim-debug/scratch/build/AMReX/amrex-24.10/Src/Base/AMReX.H:186
#11 amrex::BaseFab<double>::define (this=0x7fffffff5a08)
    at /mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/configs/sim-debug/scratch/build/AMReX/amrex-24.10/Src/Base/AMReX_BaseFab.H:1930
#12 0x0000000002e55e09 in amrex::BaseFab<double>::BaseFab (this=0x7fffffff5a08, bx=..., n=154, ar=0x5458ddc0)
    at /mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/configs/sim-debug/scratch/build/AMReX/amrex-24.10/Src/Base/AMReX_BaseFab.H:1970
#13 0x0000000002e4fe32 in amrex::FArrayBox::FArrayBox (this=0x7fffffff5a08, b=..., ncomp=154, ar=0x5458ddc0)
    at /mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/configs/sim-debug/scratch/build/AMReX/amrex-24.10/Src/Base/AMReX_FArrayBox.cpp:115
#14 0x0000000001a860a6 in Loop::GF3D5vector<double>::make_fab (layout=..., nvars=154)
    at /mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/arrangements/CarpetX/Loop/src/loop.hxx:1566
#15 0x0000000001a83166 in Loop::GF3D5vector<double>::GF3D5vector (this=0x7fffffff59d0, layout=..., nvars=154)
    at /mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/arrangements/CarpetX/Loop/src/loop.hxx:1570
#16 0x0000000001ab399f in Z4c_RHS (cctkGH=0x155530004ed0)
    at /mnt/lustre/tursafs1/home/dp325/dp325/dc-radi1/ETK/Cactus-CarpetX/arrangements/SpacetimeX/Z4c/src/rhs.cxx:97
```
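
Reading the backtrace, the assertion `this->domain.numPts() > 0` fires in `BaseFab<double>::define` while `GF3D5vector<double>::make_fab` is constructing an `FArrayBox` with 154 components, so the box passed in apparently contains no points. The following toy model illustrates that kind of check. It is not AMReX or CarpetX code: `ToyBox` and `make_toy_fab` are hypothetical names, and the inclusive-corner convention is an assumption made for the sketch.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class ToyBox:
    """Hypothetical stand-in for a 3D index box with inclusive corners."""

    lo: tuple[int, int, int]
    hi: tuple[int, int, int]

    def num_pts(self) -> int:
        # A box with hi < lo in any direction is empty and holds no points.
        extents = [h - l + 1 for l, h in zip(self.lo, self.hi)]
        return prod(extents) if all(e > 0 for e in extents) else 0


def make_toy_fab(box: ToyBox, nvars: int) -> list[float]:
    """Allocate nvars values per point, refusing to allocate on an empty box."""
    if box.num_pts() <= 0:
        raise RuntimeError("Assertion `domain.numPts() > 0' failed")
    return [0.0] * (box.num_pts() * nvars)
```

In this model, `make_toy_fab(ToyBox((0, 0, 0), (15, 15, 15)), 154)` succeeds, while a box such as `ToyBox((0, 0, 0), (-1, 3, 3))` raises the error. If the real failure is analogous, the thing to inspect would be the extents of the `layout` argument in frame #14.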

The software versions I am using are:

* GCC 12.2.0
* CUDA 12.3
* Open MPI 4.1.5

Please let me know if any further information would be helpful to debug this.

--
Ticket URL: https://bitbucket.org/einsteintoolkit/tickets/issues/2890/carpetx-crash-failed-assertion-in-z4c_rhs