<div dir="auto"><div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Hello Erik,<br></div><div dir="ltr"><br> Thanks for your reply.<br> <br>I followed your suggestions and found two strings that may be indicative of what's going on. In the standard outputs 'CCTK_Proc14.out' and 'CCTK_Proc15.out' the last lines read<br><br>INFO (IllinoisGRMHD): Font fix failed!<br>INFO (IllinoisGRMHD): i,j,k = 67 63 16, stats.failure_checker = 0 x,y,z = 3.392857e+00 8.892857e+00 2.392857e+00 , index=111739 st_i = -1.002115e+08 2.298583e+08 -1.221746e+08, rhostar = 1.573103e+02, Bi = -1.064528e+03 1.120144e+03 2.972675e+03, gij = 6.643816e+00 5.521615e-01 4.380688e-01 7.244355e+00 -1.685406e-03 6.830374e+00, Psi6 = 1.803534e+01<br><br>I assume this means that there are issues in the con2prim of IGM. That INFO is printed by harm_primitives_lowlevel.C:<br><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> // Use the new Font fix subroutine<br> int font_fix_applied=0;<br> if(check!=0) {<br> font_fix_applied=1;<br> CCTK_REAL u_xl=1e100, u_yl=1e100, u_zl=1e100; // Set to insane values to ensure they are overwritten.<br> if (gamma_equals2==1) {<br> check = font_fix_gamma_equals2(u_xl,u_yl,u_zl,CONSERVS,PRIMS,METRIC_PHYS,METRIC_LAP_PSI4,eos);<br> } else {<br> check = font_fix_general_gamma(u_xl,u_yl,u_zl,CONSERVS,PRIMS,METRIC_PHYS,METRIC_LAP_PSI4,eos);<br> }<br> //Translate to HARM primitive now:<br> prim[UTCON1] = METRIC_PHYS[GUPXX]*u_xl + METRIC_PHYS[GUPXY]*u_yl + METRIC_PHYS[GUPXZ]*u_zl;<br> prim[UTCON2] = METRIC_PHYS[GUPXY]*u_xl + METRIC_PHYS[GUPYY]*u_yl + METRIC_PHYS[GUPYZ]*u_zl;<br> prim[UTCON3] = METRIC_PHYS[GUPXZ]*u_xl + METRIC_PHYS[GUPYZ]*u_yl + METRIC_PHYS[GUPZZ]*u_zl;<br> if (check==1) {<br> CCTK_VInfo(CCTK_THORNSTRING,"Font fix failed!");<br> CCTK_VInfo(CCTK_THORNSTRING,"i,j,k = %d %d %d, stats.failure_checker = %d x,y,z = %e %e %e , index=%d st_i = %e %e %e, rhostar = %e, Bi = %e %e %e, gij = 
%e %e %e %e %e %e, Psi6 = %e",i,j,k,stats.failure_checker,X[index],Y[index],Z[index],index,mhd_st_x_orig,mhd_st_y_orig,mhd_st_z_orig,rho_star_orig,PRIMS[BX_CENTER],PRIMS[BY_CENTER],PRIMS[BZ_CENTER],METRIC_PHYS[GXX],METRIC_PHYS[GXY],METRIC_PHYS[GXZ],METRIC_PHYS[GYY],METRIC_PHYS[GYZ],METRIC_PHYS[GZZ],METRIC_LAP_PSI4[PSI6]);<br> exit(1); // Let's exit instead of printing potentially GBs of log files. Uncomment if you really want to deal with a mess.<br> }<br> }<br> stats.failure_checker+=font_fix_applied*10000;<br> stats.font_fixed=font_fix_applied;<br></blockquote> <br><br>Is there anything I can do to help pinpoint the cause of this error?<br><br>Thanks in advance,<br><br>Federico</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 15 Sep 2022 at 15:30, Erik Schnetter <<a href="mailto:schnetter@gmail.com" target="_blank" rel="noreferrer">schnetter@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Federico<br>
<br>
Thanks for including the output, that is helpful.<br>
<br>
There are parameters "Carpet::verbose" and "Carpet::veryverbose". You<br>
can set them to "yes" and recover from a checkpoint. This gives more<br>
information about what the code is doing, and thus where it crashes.<br>
<br>
The output you attached is only from the first MPI process. Other<br>
processes' output might contain a clue. You can add the command line<br>
option "-roe" to Cactus when you run the simulation. This will collect<br>
output from all processes.<br>
<br>
-erik<br>
<br>
On Thu, Sep 15, 2022 at 9:20 AM Federico Cattorini<br>
<<a href="mailto:f.cattorini@campus.unimib.it" target="_blank" rel="noreferrer">f.cattorini@campus.unimib.it</a>> wrote:<br>
><br>
> Hello everyone,<br>
><br>
> I am experiencing an issue in a number of GRMHD simulations of black hole binaries employing IllinoisGRMHD.<br>
><br>
> As an example, I will write about an unequal-mass BHB configuration (with q = 2) that I'm running.<br>
><br>
> After approximately ten orbits, the run stops with no error codes or any other message that could help me identify the issue. The last lines of the standard output are<br>
><br>
> INFO (IllinoisGRMHD): ***** Iter. # 353949, Lev: 9, Integrating to time: 3.160260e+03 *****<br>
> INFO (IllinoisGRMHD): C2P: Lev: 9 NumPts= 569160 | Fixes: Font= 393 VL= 179 rho*= 2 | Failures: 0 InHoriz= 0 / 0 | Error: 7.124e-02, ErrDenom: 4.838e+13 | 4.51 iters/gridpt<br>
> INFO (IllinoisGRMHD): ***** Iter. # 353949, Lev: 9, Integrating to time: 3.160269e+03 *****<br>
> Simfactory Done at date: gio 04 ago 2022 11:43:01 CEST<br>
><br>
><br>
><br>
> I tried restarting my simulation from the latest checkpoint, but the same sudden stop occurred at the same timestep.<br>
><br>
> At first, I thought about some problem with IGM. The last INFO is printed by IllinoisGRMHD_driver_evaluate_MHD_rhs.C, so I put some prints in it to identify the spot where the error occurs.<br>
> Unfortunately, I drew a blank, since the stop seems to occur just after the end of IllinoisGRMHD_driver_evaluate_MHD_rhs:<br>
><br>
> INFO (IllinoisGRMHD): ***** line 52: entering IllinoisGRMHD_driver_evaluate_MHD_rhs *****<br>
> INFO (IllinoisGRMHD): ***** Iter. # 353949, Lev: 10, Integrating to time: 3.160251e+03 *****<br>
> INFO (IllinoisGRMHD): ***** line 100: IllinoisGRMHD_driver_evaluate_MHD_rhs *****<br>
> INFO (IllinoisGRMHD): ***** line 204: just before reconstruct_set_of_prims_PPM *****<br>
> INFO (IllinoisGRMHD): ***** DEBUG END of IllinoisGRMHD_driver_evaluate_MHD_rhs *****<br>
> Simfactory Done at date: gio 04 ago 2022 19:44:55 CEST<br>
><br>
><br>
> I tried restarting the simulation and running it with pure MPI. It ran for a few more iterations, then stopped as well:<br>
><br>
> INFO (IllinoisGRMHD): ***** line 52: entering IllinoisGRMHD_driver_evaluate_MHD_rhs *****<br>
> INFO (IllinoisGRMHD): ***** Iter. # 353565, Lev: 10, Integrating to time: 3.156831e+03 *****<br>
> INFO (IllinoisGRMHD): ***** line 100: IllinoisGRMHD_driver_evaluate_MHD_rhs *****<br>
> INFO (IllinoisGRMHD): ***** line 204: just before reconstruct_set_of_prims_PPM *****<br>
> INFO (IllinoisGRMHD): ***** DEBUG END of IllinoisGRMHD_driver_evaluate_MHD_rhs *****<br>
> Simfactory Done at date: ven 05 ago 2022 19:00:13 CEST<br>
><br>
><br>
> The simulation setup is as follows:<br>
><br>
> Allocated:<br>
> Nodes: 10<br>
> Cores per node: 48<br>
> SLURM setting<br>
> SLURM_NNODES : 10<br>
> SLURM_NPROCS : 20<br>
> SLURM_NTASKS : 20<br>
> SLURM_CPUS_ON_NODE : 48<br>
> SLURM_CPUS_PER_TASK : 24<br>
> SLURM_TASKS_PER_NODE: 2(x10)<br>
> Running:<br>
> MPI processes: 20<br>
> OpenMP threads per process: 24<br>
> MPI processes per node: 2.0<br>
> OpenMP threads per core: 1.0<br>
> OpenMP threads per node: 48<br>
><br>
><br>
> while the pure-MPI setup is<br>
><br>
> Allocated:<br>
> Nodes: 10<br>
> Cores per node: 48<br>
> SLURM setting<br>
> SLURM_NNODES : 10<br>
> SLURM_NPROCS : 480<br>
> SLURM_NTASKS : 480<br>
> SLURM_CPUS_ON_NODE : 48<br>
> SLURM_CPUS_PER_TASK : 1<br>
> SLURM_TASKS_PER_NODE: 48(x10)<br>
> Running:<br>
> MPI processes: 480<br>
> OpenMP threads per process: 1<br>
> MPI processes per node: 48.0<br>
> OpenMP threads per core: 1.0<br>
> OpenMP threads per node: 48<br>
><br>
><br>
> I am using the Lorentz release of the Einstein Toolkit (ET).<br>
><br>
> I've had this issue for two binary BH simulations, both unequal-mass with q = 2. My colleague Giacomo Fedrigo experienced the same problem running an equal-mass simulation.<br>
><br>
> I attach the q = 2 (s_UUmis_Q2) parameter file and the ET config-info file. I also attach the standard error and standard output of my q = 2 run and of Giacomo's run (b1_UUmis_a12b_pol3_r56_gauss_9). The standard outputs were truncated for readability.<br>
><br>
> Can someone please help me with this?<br>
><br>
> Thanks in advance,<br>
><br>
> Federico<br>
<br>
<br>
<br>
-- <br>
Erik Schnetter <<a href="mailto:schnetter@gmail.com" target="_blank" rel="noreferrer">schnetter@gmail.com</a>><br>
<a href="http://www.perimeterinstitute.ca/personal/eschnetter/" rel="noreferrer noreferrer" target="_blank">http://www.perimeterinstitute.ca/personal/eschnetter/</a><br>
</blockquote></div>
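<div dir="ltr"><br>One check that can be done offline, using only the numbers printed in the "Font fix failed" line: the C++ sketch below is not part of IllinoisGRMHD and rests on several assumptions. It assumes that gij in the log is the physical 3-metric in the order xx, xy, xz, yy, yz, zz (the order of the CCTK_VInfo arguments), that Psi6 is the square root of its determinant, and that st_i carries lower indices while Bi carries upper ones. The names Sym3, det, inverse, quad, print_cell_diagnostics and run_logged_cell are hypothetical helpers introduced only for illustration. The normalisation of Bi cannot be inferred from the log, so the printed |B| is only indicative.<br>
<pre>
```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Symmetric 3x3 tensor, components ordered as in the log line: xx xy xz yy yz zz.
struct Sym3 { double xx, xy, xz, yy, yz, zz; };

// Determinant of a symmetric 3x3 tensor.
double det(const Sym3& g) {
  return g.xx * (g.yy * g.zz - g.yz * g.yz)
       - g.xy * (g.xy * g.zz - g.yz * g.xz)
       + g.xz * (g.xy * g.yz - g.yy * g.xz);
}

// Inverse via cofactors; requires a nonzero determinant.
Sym3 inverse(const Sym3& g) {
  const double d = det(g);
  assert(d != 0.0);
  return Sym3{(g.yy * g.zz - g.yz * g.yz) / d,
              (g.xz * g.yz - g.xy * g.zz) / d,
              (g.xy * g.yz - g.xz * g.yy) / d,
              (g.xx * g.zz - g.xz * g.xz) / d,
              (g.xy * g.xz - g.xx * g.yz) / d,
              (g.xx * g.yy - g.xy * g.xy) / d};
}

// Quadratic form: sum over a,b of m_ab * v_a * v_b.
double quad(const Sym3& m, const double v[3]) {
  return m.xx * v[0] * v[0] + m.yy * v[1] * v[1] + m.zz * v[2] * v[2]
       + 2.0 * (m.xy * v[0] * v[1] + m.xz * v[0] * v[2] + m.yz * v[1] * v[2]);
}

// Print norms of the logged conservative momentum and magnetic field.
// Assumption: st has lower indices (contracted with the inverse metric),
// B has upper indices (contracted with the metric itself).
void print_cell_diagnostics(const Sym3& g, const double st[3],
                            const double B[3], double rhostar, double psi6) {
  const Sym3 ginv = inverse(g);
  const double st_norm = std::sqrt(quad(ginv, st));
  const double B_norm = std::sqrt(quad(g, B));
  std::printf("sqrt(det gij) = %.6e (logged Psi6 = %.6e)\n",
              std::sqrt(det(g)), psi6);
  std::printf("|st|          = %.6e\n", st_norm);
  std::printf("|B|           = %.6e\n", B_norm);
  std::printf("|st|/rhostar  = %.6e\n", st_norm / rhostar);
}

// Values copied from the "Font fix failed" line for cell i,j,k = 67 63 16.
void run_logged_cell() {
  const Sym3 g{6.643816e+00, 5.521615e-01, 4.380688e-01,
               7.244355e+00, -1.685406e-03, 6.830374e+00};
  const double st[3] = {-1.002115e+08, 2.298583e+08, -1.221746e+08};
  const double B[3] = {-1.064528e+03, 1.120144e+03, 2.972675e+03};
  print_cell_diagnostics(g, st, B, 1.573103e+02, 1.803534e+01);
}
```
</pre>
Calling run_logged_cell() from main prints the norms for the failing cell. If sqrt(det gij) agrees closely with the logged Psi6, the assumed metric ordering is probably right. A ratio |st|/rhostar many orders of magnitude above one may indicate a strongly momentum-dominated or magnetically dominated cell, which could be worth comparing against neighbouring cells.<br></div>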
</div></div></div>