[Users] Number of processors and ppn used

Nisa Amir nisaamir at math.qau.edu.pk
Tue Mar 29 20:44:49 CDT 2022


This happens on the laptop that I auto-configured by following the tutorial.
After the warning, the job does start. I think the issue is that the job is
killed because it runs out of memory, which stops further processing of the
simulation.
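
One way to check the out-of-memory suspicion is to search the kernel log for
OOM-killer messages. The following is only a minimal sketch, assuming a Linux
laptop; find_oom_kills is a hypothetical helper name and is not part of
simfactory.

```shell
# Hypothetical helper (not part of simfactory): filter kernel log text on
# stdin and keep only lines that look like OOM-killer messages.
find_oom_kills() {
    grep -i -E 'out of memory|oom-killer|killed process'
}

# Typical use on Linux; reading the kernel log may require root on some
# systems:
#   dmesg | find_oom_kills
```

If this prints a line naming the simulation executable around the time the
output stopped, the run was most likely killed for lack of memory.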

On Wed, 30 Mar 2022, 4:35 am Roland Haas, <rhaas at illinois.edu> wrote:

> Hello Nisa,
>
> > When I submit my simulation
> > %%bash
> > # start simulation segment
> > ./simfactory/bin/sim submit NH --cores=1 --ppn-used=8 --walltime=0:2:00
> > Also tried this %%bash
> >
> > ./simfactory/bin/sim submit NH --cores=2 --num-threads=1 --walltime=0:20:00
> > again it gives the same warning
> > it gives the warning that Total number of threads and number of cores per
> > node are inconsistent: procs=1, ppn-used=8 (procs must be an integer
> > multiple of ppn-used)
> > and after that when i run the parameter file it does not run completely
> > and shows only half or more than half output.
> > How can I resolve this issue so that I get the complete output.
>
> If there is output missing then most likely the job was killed by the
> queuing system since it ran out of walltime. Note that the first
> command requested only 2 minutes of walltime which is almost certainly
> too short for any "real" run.
>
> Usually this will show up at the bottom of the *.err file.
>
> You can either let simfactory print both the *.out and the *.err file
> to screen (or pipe into less) using:
>
> ./simfactory/bin/sim show-output NH  | less
>
> or query where the simulation output directory is:
>
> ./simfactory/bin/sim get-output-dir NH
>
> then use cd to go there and less to take a look at the err file.
>
> The other option is that the job hung, which will usually also show up
> as the queueing system killing your run due to it running out of
> walltime, but also will typically mean that the last output (timestamp
> of the output files eg *.asc visible via ls -l) is much older than the
> time the job was killed by the queuing system.
>
> If there is no queueing system (laptop) then something else could kill
> the job (eg runs out of memory).
>
> The warning about ppn-used is due to inconsistent options. Namely you
> are claiming via ppn-used=8 to use 8 cores per node but then are
> requesting only 1 core. It is just a warning though, if the job started
> then you do not have to worry. If you would like to avoid the warning
> you could use --cores 1 --ppn-used 1. Does this happen on a cluster
> (private? One officially supported by the ET?)? Or on your laptop that you
> auto-configured via "sim setup-silent" or on the tutorial server?
>
> Yours,
> Roland
>
> --
> My email is as private as my paper mail. I therefore support encrypting
> and signing email messages. Get my PGP key from http://keys.gnupg.net.
>
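
The rule stated in the ppn-used warning quoted above ("procs must be an
integer multiple of ppn-used") can be checked before submitting. This is only
an illustrative sketch: ppn_consistent is a hypothetical helper, not a
simfactory command, and it assumes both arguments are positive integers.

```shell
# Hypothetical helper (not part of simfactory): applies the rule from the
# warning text, "procs must be an integer multiple of ppn-used".
# Assumes both arguments are positive integers.
ppn_consistent() {
    procs=$1
    ppn_used=$2
    if [ $((procs % ppn_used)) -eq 0 ]; then
        echo "consistent"
    else
        echo "inconsistent"
    fi
}

ppn_consistent 1 8   # options from the first submit command; prints "inconsistent"
ppn_consistent 1 1   # the suggested --cores 1 --ppn-used 1; prints "consistent"
```

As noted in the reply, the mismatch only produces a warning, so this check
is about silencing the message rather than fixing the missing output.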