[Users] Issue with black hole binary evolution employing IGM

Roland Haas rhaas at illinois.edu
Thu Sep 22 08:29:22 CDT 2022


Hello all,

The exit(1) really should be a CCTK_ERROR("Failed after Font fix"),
ideally with output of the offending variable values.
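
For illustration only, a minimal (untested) sketch of what that
replacement could look like, reusing the variable names from the snippet
quoted below:

    /* Hypothetical replacement for the exit(1) in
       harm_primitives_lowlevel.C: abort through the flesh so the run
       stops with a proper error message instead of dying silently. */
    if (check==1) {
      CCTK_VError(__LINE__, __FILE__, CCTK_THORNSTRING,
                  "Font fix failed at i,j,k = %d %d %d (x,y,z = %e %e %e): "
                  "st_i = %e %e %e, rhostar = %e, Psi6 = %e",
                  i, j, k, X[index], Y[index], Z[index],
                  mhd_st_x_orig, mhd_st_y_orig, mhd_st_z_orig,
                  rho_star_orig, METRIC_LAP_PSI4[PSI6]);
    }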

Would one of you mind creating a bug report for IllinoisGRMHD so that
the IGM maintainers become aware and can produce a fix before the next
release?

Yours,
Roland

> Hello Federico,
> I was having the same problem with binary black hole evolutions with
> IllinoisGRMHD. In my case, the "exit(1)" statement in
> harm_primitives_lowlevel.C in the piece of code you copied above silently
> killed my runs when the Font fix failed, pretty much as in your run. Just
> commenting it out solved that problem for me, so you can give it a try if
> you want.
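> 
> If you would rather not lose the diagnostic entirely, a possible middle
> ground (untested sketch) is to downgrade the abort to a warning, so the
> run continues as with the commented-out exit(1) but the failures stay
> visible in the log:
> 
>     /* instead of exit(1): emit a level-1 warning and carry on */
>     CCTK_VWarn(1, __LINE__, __FILE__, CCTK_THORNSTRING,
>                "Font fix failed at i,j,k = %d %d %d", i, j, k);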
> 
> On Thu, Sep 22, 2022, 07:37 Federico Cattorini <f.cattorini at campus.unimib.it>
> wrote:
> 
> > Hello Erik,
> >
> >    Thanks for your reply.
> >
> > I followed your suggestions and found two strings that may be indicative
> > of what's going on. In the standard outputs 'CCTK_Proc14.out' and
> > 'CCTK_Proc15.out' the last lines read
> >
> > INFO (IllinoisGRMHD): Font fix failed!
> > INFO (IllinoisGRMHD): i,j,k = 67 63 16, stats.failure_checker = 0 x,y,z = 3.392857e+00 8.892857e+00 2.392857e+00 , index=111739 st_i = -1.002115e+08 2.298583e+08 -1.221746e+08, rhostar = 1.573103e+02, Bi = -1.064528e+03 1.120144e+03 2.972675e+03, gij = 6.643816e+00 5.521615e-01 4.380688e-01 7.244355e+00 -1.685406e-03 6.830374e+00, Psi6 = 1.803534e+01
> >
> > I assume this means that there are issues in the con2prim of IGM. That
> > INFO is printed by harm_primitives_lowlevel.C:
> >
> >>     // Use the new Font fix subroutine
> >>     int font_fix_applied=0;
> >>     if(check!=0) {
> >>       font_fix_applied=1;
> >>       CCTK_REAL u_xl=1e100, u_yl=1e100, u_zl=1e100; // Set to insane values to ensure they are overwritten.
> >>       if (gamma_equals2==1) {
> >>         check = font_fix_gamma_equals2(u_xl,u_yl,u_zl,CONSERVS,PRIMS,METRIC_PHYS,METRIC_LAP_PSI4,eos);
> >>       } else {
> >>         check = font_fix_general_gamma(u_xl,u_yl,u_zl,CONSERVS,PRIMS,METRIC_PHYS,METRIC_LAP_PSI4,eos);
> >>       }
> >>       //Translate to HARM primitive now:
> >>       prim[UTCON1] = METRIC_PHYS[GUPXX]*u_xl + METRIC_PHYS[GUPXY]*u_yl + METRIC_PHYS[GUPXZ]*u_zl;
> >>       prim[UTCON2] = METRIC_PHYS[GUPXY]*u_xl + METRIC_PHYS[GUPYY]*u_yl + METRIC_PHYS[GUPYZ]*u_zl;
> >>       prim[UTCON3] = METRIC_PHYS[GUPXZ]*u_xl + METRIC_PHYS[GUPYZ]*u_yl + METRIC_PHYS[GUPZZ]*u_zl;
> >>       if (check==1) {
> >>         CCTK_VInfo(CCTK_THORNSTRING,"Font fix failed!");
> >>         CCTK_VInfo(CCTK_THORNSTRING,"i,j,k = %d %d %d, stats.failure_checker = %d x,y,z = %e %e %e , index=%d st_i = %e %e %e, rhostar = %e, Bi = %e %e %e, gij = %e %e %e %e %e %e, Psi6 = %e",i,j,k,stats.failure_checker,X[index],Y[index],Z[index],index,mhd_st_x_orig,mhd_st_y_orig,mhd_st_z_orig,rho_star_orig,PRIMS[BX_CENTER],PRIMS[BY_CENTER],PRIMS[BZ_CENTER],METRIC_PHYS[GXX],METRIC_PHYS[GXY],METRIC_PHYS[GXZ],METRIC_PHYS[GYY],METRIC_PHYS[GYZ],METRIC_PHYS[GZZ],METRIC_LAP_PSI4[PSI6]);
> >>         exit(1);  // Let's exit instead of printing potentially GBs of log files. Uncomment if you really want to deal with a mess.
> >>       }
> >>     }
> >>     stats.failure_checker+=font_fix_applied*10000;
> >>     stats.font_fixed=font_fix_applied;
> >>
> >
> >
> > Can I do anything that may help pinpoint the cause of this error?
> >
> > Thanks in advance,
> >
> > Federico
> >
> > On Thu, Sep 15, 2022 at 3:30 PM Erik Schnetter <schnetter at gmail.com> wrote:
> >  
> >> Federico
> >>
> >> Thanks for including the output, that is helpful.
> >>
> >> There are parameters "Carpet::verbose" and "Carpet::veryverbose". You
> >> can set them to "yes" and recover from a checkpoint. This gives more
> >> information about what the code is doing, and thus where it crashes.
> >>
> >> The output you attached is only from the first MPI process. Other
> >> processes' output might contain a clue. You can add the command line
> >> option "-roe" to Cactus when you run the simulation. This will collect
> >> output from all processes.
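> >>
> >> For example, a quick sketch (the executable and parameter file names
> >> below are placeholders -- adapt them to your setup):
> >>
> >>   # extra verbosity in the recovery parameter file
> >>   Carpet::verbose     = "yes"
> >>   Carpet::veryverbose = "yes"
> >>
> >>   # redirect per-process output when launching Cactus
> >>   ./cactus_sim -roe my_run.par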
> >>
> >> -erik
> >>
> >> On Thu, Sep 15, 2022 at 9:20 AM Federico Cattorini
> >> <f.cattorini at campus.unimib.it> wrote:  
> >> >
> >> > Hello everyone,
> >> >
> >> > I am experiencing some issues in a number of GRMHD simulations of black hole binaries employing IllinoisGRMHD.
> >> >
> >> > As an example, I will write about an unequal-mass BHB configuration (with q = 2) that I'm running.
> >> >
> >> > After approximately ten orbits, the run stops with no error codes or any other message that could help me identify the issue. The last lines of the standard output are
> >> >
> >> > INFO (IllinoisGRMHD): ***** Iter. # 353949, Lev: 9, Integrating to time: 3.160260e+03 *****
> >> > INFO (IllinoisGRMHD): C2P: Lev: 9 NumPts= 569160 | Fixes: Font= 393 VL= 179 rho*= 2 | Failures: 0 InHoriz= 0 / 0 | Error: 7.124e-02, ErrDenom: 4.838e+13 | 4.51 iters/gridpt
> >> > INFO (IllinoisGRMHD): ***** Iter. # 353949, Lev: 9, Integrating to time: 3.160269e+03 *****
> >> > Simfactory Done at date: gio 04 ago 2022 11:43:01 CEST
> >> >
> >> >
> >> >
> >> > I tried restarting my simulation from the latest checkpoint, but the same sudden stop occurred at the same timestep.
> >> >
> >> > At first, I suspected a problem with IGM. The last INFO is printed by IllinoisGRMHD_driver_evaluate_MHD_rhs.C, so I put some prints in it to identify the spot where the error occurs.
> >> > Unfortunately, I drew a blank, since the stop seems to occur just after the end of IllinoisGRMHD_driver_evaluate_MHD_rhs:
> >> >
> >> > INFO (IllinoisGRMHD): ***** line 52: entering IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> >> > INFO (IllinoisGRMHD): ***** Iter. # 353949, Lev: 10, Integrating to time: 3.160251e+03 *****
> >> > INFO (IllinoisGRMHD): ***** line 100: IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> >> > INFO (IllinoisGRMHD): ***** line 204: just before reconstruct_set_of_prims_PPM *****
> >> > INFO (IllinoisGRMHD): ***** DEBUG END of IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> >> > Simfactory Done at date: gio 04 ago 2022 19:44:55 CEST
> >> >
> >> >
> >> > I tried to restart the simulation and run it on pure MPI. It ran for a few more iterations, then stopped as well:
> >> >
> >> > INFO (IllinoisGRMHD): ***** line 52: entering IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> >> > INFO (IllinoisGRMHD): ***** Iter. # 353565, Lev: 10, Integrating to time: 3.156831e+03 *****
> >> > INFO (IllinoisGRMHD): ***** line 100: IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> >> > INFO (IllinoisGRMHD): ***** line 204: just before reconstruct_set_of_prims_PPM *****
> >> > INFO (IllinoisGRMHD): ***** DEBUG END of IllinoisGRMHD_driver_evaluate_MHD_rhs *****
> >> > Simfactory Done at date: ven 05 ago 2022 19:00:13 CEST
> >> >
> >> >
> >> > The simulation setup is as follows:
> >> >
> >> >    Allocated:
> >> >       Nodes:                      10
> >> >       Cores per node:             48
> >> >    SLURM setting
> >> >       SLURM_NNODES :  10
> >> >       SLURM_NPROCS :  20
> >> >       SLURM_NTASKS :  20
> >> >       SLURM_CPUS_ON_NODE  :  48
> >> >       SLURM_CPUS_PER_TASK :  24
> >> >       SLURM_TASKS_PER_NODE:  2(x10)
> >> >    Running:
> >> >       MPI processes:              20
> >> >       OpenMP threads per process: 24
> >> >       MPI processes per node:     2.0
> >> >       OpenMP threads per core:    1.0
> >> >       OpenMP threads per node:    48
> >> >
> >> >
> >> > while the pure-MPI setup is
> >> >
> >> >    Allocated:
> >> >       Nodes:                      10
> >> >       Cores per node:             48
> >> >    SLURM setting
> >> >       SLURM_NNODES :  10
> >> >       SLURM_NPROCS :  480
> >> >       SLURM_NTASKS :  480
> >> >       SLURM_CPUS_ON_NODE  :  48
> >> >       SLURM_CPUS_PER_TASK :  1
> >> >       SLURM_TASKS_PER_NODE:  48(x10)
> >> >    Running:
> >> >       MPI processes:              480
> >> >       OpenMP threads per process: 1
> >> >       MPI processes per node:     48.0
> >> >       OpenMP threads per core:    1.0
> >> >       OpenMP threads per node:    48
> >> >
> >> >
> >> > I am using the Lorentz version of ET.
> >> >
> >> > I've had this issue for two binary BH simulations, both unequal-mass with q = 2. My colleague Giacomo Fedrigo experienced the same problem running an equal-mass simulation.
> >> >
> >> > I attach the q = 2 (s_UUmis_Q2) parameter file and the ET config-info file. I also attach the standard error and output of my q = 2 run and of Giacomo's run (b1_UUmis_a12b_pol3_r56_gauss_9). The standard outputs were cut for readability.
> >> >
> >> > Can someone please help me with this?
> >> >
> >> > Thanks in advance,
> >> >
> >> > Federico
> >>
> >>
> >>
> >> --
> >> Erik Schnetter <schnetter at gmail.com>
> >> http://www.perimeterinstitute.ca/personal/eschnetter/
> >>  



-- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://keys.gnupg.net.