[Users] Thorn setup taking too much time in cluster

Steven R. Brandt sbrandt at cct.lsu.edu
Tue Apr 4 09:28:00 CDT 2023


Shamim,

I'm glad to see it ran on your next attempt. Did you do anything 
differently?

--Steve

On 4/4/2023 3:08 AM, Shamim Haque 1910511 wrote:
> Dear Steven,
>
> I can confirm that this was the first time I submitted the simulation. 
> I used "sim create-submit", which would not have submitted the job if 
> a simulation with the same name had been run earlier.
>
> Secondly, I found this same message in the output files from the 
> debug queue (1 node, with GRHydro) and the high memory queue (3 nodes, 
> with IllinoisGRMHD), where the simulation ran successfully. I have 
> attached the output files for reference.
>
> Regards
> Shamim Haque
> Senior Research Fellow (SRF)
> Department of Physics
> IISER Bhopal
>
> On Tue, Apr 4, 2023 at 12:35 AM Steven R. Brandt <sbrandt at cct.lsu.edu> 
> wrote:
>
>     I see this error message in your output:
>
>       -> No HDF5 checkpoint files with basefilename
>     'checkpoint.chkpt' and file extension '.h5' found in recovery
>     directory 'nsns_toy1.2_DDME2BPS_quark_1.2vs1.6M_40km_g25'
>
>     I suspect you did a "sim submit" for a job, got a failure, and did
>     a second "sim submit" without purging. That immediately triggered
>     the error. Then, for some reason, MPI didn't shut down cleanly and
>     the processes hung doing nothing until they used up the walltime.
>
>     --Steve
>
>     On 4/2/2023 5:16 AM, Shamim Haque 1910511 wrote:
>>     Hello,
>>
>>     I am trying to run a BNSM simulation using IllinoisGRMHD on HPC
>>     Kanad at IISER Bhopal. While I have verified that the parfile runs
>>     fine on the debug queue (1 node) and the high memory queue
>>     (3 nodes), I am unable to run the simulation in a queue with
>>     9 nodes (144 cores).
>>
>>     The output file suggests that the setup of the listed thorns does
>>     not complete within 24 hours, which is the maximum walltime for
>>     this queue.
>>
>>     Is there a way to sort out this issue? I have attached the
>>     parfile and outfile for reference.
>>
>>     Regards
>>     Shamim Haque
>>     Senior Research Fellow (SRF)
>>     Department of Physics
>>     IISER Bhopal
>>     _______________________________________________
>>     Users mailing list
>>     Users at einsteintoolkit.org
>>     http://lists.einsteintoolkit.org/mailman/listinfo/users
>