[Users] Issue with black hole binary evolution employing IGM

Federico Cattorini f.cattorini at campus.unimib.it
Thu Sep 22 00:37:12 CDT 2022


Hello Erik,

   Thanks for your reply.

I followed your suggestions and found two strings that may be indicative of
what's going on. In the standard outputs 'CCTK_Proc14.out' and
'CCTK_Proc15.out' the last lines read

INFO (IllinoisGRMHD): Font fix failed!
INFO (IllinoisGRMHD): i,j,k = 67 63 16, stats.failure_checker = 0 x,y,z = 3.392857e+00 8.892857e+00 2.392857e+00 , index=111739 st_i = -1.002115e+08 2.298583e+08 -1.221746e+08, rhostar = 1.573103e+02, Bi = -1.064528e+03 1.120144e+03 2.972675e+03, gij = 6.643816e+00 5.521615e-01 4.380688e-01 7.244355e+00 -1.685406e-03 6.830374e+00, Psi6 = 1.803534e+01
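
(Concretely, your suggestions translate to adding the two Carpet parameters below to the parameter file before recovering, and passing "-roe" on the Cactus command line; the latter is what produced the per-process CCTK_Proc*.out files quoted above.)

    Carpet::verbose     = "yes"
    Carpet::veryverbose = "yes"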

I assume this means that there are issues in the conservative-to-primitive (con2prim) solver of IGM. That INFO message is printed by harm_primitives_lowlevel.C:

    // Use the new Font fix subroutine
    int font_fix_applied=0;
    if(check!=0) {
      font_fix_applied=1;
      CCTK_REAL u_xl=1e100, u_yl=1e100, u_zl=1e100; // Set to insane values to ensure they are overwritten.
      if (gamma_equals2==1) {
        check = font_fix_gamma_equals2(u_xl,u_yl,u_zl,CONSERVS,PRIMS,METRIC_PHYS,METRIC_LAP_PSI4,eos);
      } else {
        check = font_fix_general_gamma(u_xl,u_yl,u_zl,CONSERVS,PRIMS,METRIC_PHYS,METRIC_LAP_PSI4,eos);
      }
      //Translate to HARM primitive now:
      prim[UTCON1] = METRIC_PHYS[GUPXX]*u_xl + METRIC_PHYS[GUPXY]*u_yl + METRIC_PHYS[GUPXZ]*u_zl;
      prim[UTCON2] = METRIC_PHYS[GUPXY]*u_xl + METRIC_PHYS[GUPYY]*u_yl + METRIC_PHYS[GUPYZ]*u_zl;
      prim[UTCON3] = METRIC_PHYS[GUPXZ]*u_xl + METRIC_PHYS[GUPYZ]*u_yl + METRIC_PHYS[GUPZZ]*u_zl;
      if (check==1) {
        CCTK_VInfo(CCTK_THORNSTRING,"Font fix failed!");
        CCTK_VInfo(CCTK_THORNSTRING,
                   "i,j,k = %d %d %d, stats.failure_checker = %d x,y,z = %e %e %e , index=%d st_i = %e %e %e, rhostar = %e, Bi = %e %e %e, gij = %e %e %e %e %e %e, Psi6 = %e",
                   i,j,k,stats.failure_checker,X[index],Y[index],Z[index],index,
                   mhd_st_x_orig,mhd_st_y_orig,mhd_st_z_orig,rho_star_orig,
                   PRIMS[BX_CENTER],PRIMS[BY_CENTER],PRIMS[BZ_CENTER],
                   METRIC_PHYS[GXX],METRIC_PHYS[GXY],METRIC_PHYS[GXZ],
                   METRIC_PHYS[GYY],METRIC_PHYS[GYZ],METRIC_PHYS[GZZ],
                   METRIC_LAP_PSI4[PSI6]);
        exit(1);  // Let's exit instead of printing potentially GBs of log files. Uncomment if you really want to deal with a mess.
      }
    }
    stats.failure_checker+=font_fix_applied*10000;
    stats.font_fixed=font_fix_applied;
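
If it would help, I could also recompile with the abort routed through the Cactus warning machinery instead of the bare exit(1), so that the failure location is reported through the standard warning stream before the run is terminated. A rough, untested sketch (reusing only variables already in scope above; CCTK_WARN_ABORT makes Cactus abort the run):

      if (check==1) {
        // Report the failure location and abort via Cactus rather than exit(1).
        CCTK_VWarn(CCTK_WARN_ABORT, __LINE__, __FILE__, CCTK_THORNSTRING,
                   "Font fix failed at i,j,k = %d %d %d, x,y,z = %e %e %e, rhostar = %e",
                   i, j, k, X[index], Y[index], Z[index], rho_star_orig);
      }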


Can I do anything that may help pinpoint the cause of this error?

Thanks in advance,

Federico

On Thu, Sep 15, 2022 at 3:30 PM Erik Schnetter <schnetter at gmail.com>
wrote:

> Federico
>
> Thanks for including the output, that is helpful.
>
> There are parameters "Carpet::verbose" and "Carpet::veryverbose". You
> can set them to "yes" and recover from a checkpoint. This gives more
> information about what the code is doing, and thus where it crashes.
>
> The output you attached is only from the first MPI process. Other
> processes' output might contain a clue. You can add the command line
> option "-roe" to Cactus when you run the simulation. This will collect
> output from all processes.
>
> -erik
>
> On Thu, Sep 15, 2022 at 9:20 AM Federico Cattorini
> <f.cattorini at campus.unimib.it> wrote:
> >
> > Hello everyone,
> >
> > I am experiencing some issues in a number of GRMHD simulations of black hole binaries employing IllinoisGRMHD.
> >
> > As an example, I will write about an unequal-mass BHB configuration
> (with q = 2) that I'm running.
> >
> > After approximately ten orbits, the run stops with no error codes or any
> other message that could help me identify the issue. The last lines of the
> standard output are
> >
> > INFO (IllinoisGRMHD): ***** Iter. # 353949, Lev: 9, Integrating to time: 3.160260e+03 *****
> > INFO (IllinoisGRMHD): C2P: Lev: 9 NumPts= 569160 | Fixes: Font= 393 VL= 179 rho*= 2 | Failures: 0 InHoriz= 0 / 0 | Error: 7.124e-02, ErrDenom: 4.838e+13 | 4.51 iters/gridpt
> > INFO (IllinoisGRMHD): ***** Iter. # 353949, Lev: 9, Integrating to time: 3.160269e+03 *****
> > Simfactory Done at date: Thu 04 Aug 2022 11:43:01 CEST
> >
> >
> >
> > I tried restarting my simulation from the latest checkpoint, but the
> same sudden stop occurred at the same timestep.
> >
> > At first, I suspected a problem with IGM. The last INFO is printed by IllinoisGRMHD_driver_evaluate_MHD_rhs.C, so I put some print statements in it to identify the spot where the error occurs.
> > Unfortunately, I drew a blank, since the stop seems to occur just after the end of IllinoisGRMHD_driver_evaluate_MHD_rhs:
> >
> > INFO (IllinoisGRMHD): ***** line 52: entering IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> > INFO (IllinoisGRMHD): ***** Iter. # 353949, Lev: 10, Integrating to time: 3.160251e+03 *****
> > INFO (IllinoisGRMHD): ***** line 100: IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> > INFO (IllinoisGRMHD): ***** line 204: just before reconstruct_set_of_prims_PPM *****
> > INFO (IllinoisGRMHD): ***** DEBUG END of IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> > Simfactory Done at date: Thu 04 Aug 2022 19:44:55 CEST
> >
> >
> > I tried to restart the simulation and run it with pure MPI. It ran for a few more iterations, then stopped as well:
> >
> > INFO (IllinoisGRMHD): ***** line 52: entering IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> > INFO (IllinoisGRMHD): ***** Iter. # 353565, Lev: 10, Integrating to time: 3.156831e+03 *****
> > INFO (IllinoisGRMHD): ***** line 100: IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> > INFO (IllinoisGRMHD): ***** line 204: just before reconstruct_set_of_prims_PPM *****
> > INFO (IllinoisGRMHD): ***** DEBUG END of IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> > Simfactory Done at date: Fri 05 Aug 2022 19:00:13 CEST
> >
> >
> > The simulation setup is as follows:
> >
> >    Allocated:
> >       Nodes:                      10
> >       Cores per node:             48
> >    SLURM setting
> >       SLURM_NNODES :  10
> >       SLURM_NPROCS :  20
> >       SLURM_NTASKS :  20
> >       SLURM_CPUS_ON_NODE  :  48
> >       SLURM_CPUS_PER_TASK :  24
> >       SLURM_TASKS_PER_NODE:  2(x10)
> >    Running:
> >       MPI processes:              20
> >       OpenMP threads per process: 24
> >       MPI processes per node:     2.0
> >       OpenMP threads per core:    1.0
> >       OpenMP threads per node:    48
> >
> >
> > while the pure-MPI setup is
> >
> >    Allocated:
> >       Nodes:                      10
> >       Cores per node:             48
> >    SLURM setting
> >       SLURM_NNODES :  10
> >       SLURM_NPROCS :  480
> >       SLURM_NTASKS :  480
> >       SLURM_CPUS_ON_NODE  :  48
> >       SLURM_CPUS_PER_TASK :  1
> >       SLURM_TASKS_PER_NODE:  48(x10)
> >    Running:
> >       MPI processes:              480
> >       OpenMP threads per process: 1
> >       MPI processes per node:     48.0
> >       OpenMP threads per core:    1.0
> >       OpenMP threads per node:    48
> >
> >
> > I am using the Lorentz version of ET.
> >
> > I've had this issue for two binary BH simulations, both unequal-mass
> with q = 2. My colleague Giacomo Fedrigo experienced the same problem
> running an equal-mass simulation.
> >
> > I attach the q = 2 (s_UUmis_Q2) parameter file and the ET config-info file. I also attach the standard error and output of my q = 2 run and of Giacomo's run (b1_UUmis_a12b_pol3_r56_gauss_9). The standard outputs were truncated for readability.
> >
> > Can someone please help me with this?
> >
> > Thanks in advance,
> >
> > Federico
> > _______________________________________________
> > Users mailing list
> > Users at einsteintoolkit.org
> > http://lists.einsteintoolkit.org/mailman/listinfo/users
>
>
>
> --
> Erik Schnetter <schnetter at gmail.com>
> http://www.perimeterinstitute.ca/personal/eschnetter/
>