[Users] Inconsistency warnings: cores/threads mismatch [Leonardo cluster]
Steven Brandt
sbrandt at cct.lsu.edu
Fri Oct 24 13:49:04 CDT 2025
On 10/24/2025 9:16 AM, IOSIF PANAGIOTIS wrote:
> Dear Steve,
>
> Thank you for your reply.
>
> At this point, I reckon it makes sense to just use a full node (of 112
> cores) for plain tests.
> In any case, they will finish quickly enough and not consume a lot of
> my allocated time.
Probably so. Maybe even a fraction of a node.
>
> I will leave the performance/convergence study varying the number of
> cores for the time being.
>
> Regarding the "--memory" option, I read the minutes from the last ETK
> meeting
> <https://lists.einsteintoolkit.org/pipermail/users/2025-October/009786.html>,
> and it seems that there is no obvious answer.
Yes. No one seems to know. I suspect there's some way to pass it along
to the submitscript where it can be given to slurm, but I'd have to try
and read through the source code to figure that out.
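
As a stopgap, one can of course hard-code the request in the machine's
submitscript the way the Leonardo one apparently already does, e.g.
(an untested sketch; the number is just a placeholder for roughly half
a node):

    #SBATCH --mem 247000MB

Whether the value given to simfactory's --memory option can be
substituted into that line automatically is exactly what would need
checking in the source.
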
--Steve
>
> In case it is useful, I summarise my current understanding below:
>
>   * This wiki link
>     <https://docs.einsteintoolkit.org/et-docs/Configuring_a_new_machine>
>     says that "memory is currently only used by simfactory's
>     distribute utility script".
>   * The distribute script seems to be a testing script only, so it
>     does not seem to be relevant when we submit a job.
>   * There doesn't seem to be any source documenting exactly how
>     simfactory handles memory.
>   * Without taking a look into the respective source code, my guess
>     is that simfactory will try to use as much memory as needed by
>     the job (but not necessarily the whole memory of the node).
>   * In any case, if we request a full node, we will get billed
>     accordingly.
>   * Only if we request less than a full node would the --memory
>     option possibly come into play and affect the billing (see the
>     sketch below).
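>
> (For illustration, a less-than-full-node request with that option
> might look something like
>
>     ./simfactory/bin/sim create-submit mysim --parfile mysim.par \
>         --procs 56 --num-threads 8 --memory 247000
>
> where the simulation name, parfile, and numbers are placeholders for
> roughly half of a 112-core node; whether the Leonardo submitscript
> actually consumes that memory value is the open question above.)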
>
>
> Best,
> Panagiotis
>
>
> ------------------------------------------------------------------------
> From: Users <users-bounces at einsteintoolkit.org> on behalf of Steven
> Brandt via Users <users at einsteintoolkit.org>
> Sent: Thursday, October 23, 2025 4:28 PM
> To: users at einsteintoolkit.org
> Subject: Re: [Users] Inconsistency warnings: cores/threads mismatch
> [Leonardo cluster]
>
>
> On 10/14/2025 3:55 AM, IOSIF PANAGIOTIS wrote:
>> Hi all,
>>
>> I am sending a reminder regarding two unanswered questions on the
>> mailing list, in case someone has a suggestion.
>>
>> 1. Clarification about how SimFactory handles the "--memory" option
>>    and how this affects how one should navigate the cluster's
>>    billing policy:
>>    https://lists.einsteintoolkit.org/pipermail/users/2025-September/009761.html
>>
>> 2. Using 'leonardo-dcgp.ini' and understanding how to properly
>>    request one full node:
>>    https://lists.einsteintoolkit.org/pipermail/users/2025-September/009762.html
>>
> Normally, one requests --procs equal to the number of cores on the node.
>
>
> So, imagine one has a machine with nodes that have 32 cores each.
>
>
> One could say --procs 32, and that should be an entire node. However,
> maybe you want to run with 8 threads per MPI task. In that case, you
> would say --procs 32 --num-threads 8.
>
>
> If you want to run on N nodes, then the number of procs would be 32*N,
> and Simfactory will figure it out.
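>
> As a concrete sketch for such a 32-core machine (the simulation name,
> parfile, and walltime below are placeholders, not taken from this
> thread):
>
>     ./simfactory/bin/sim create-submit mysim --parfile mysim.par \
>         --procs 32 --num-threads 8 --walltime 24:00:00
>
> which should end up as 32/8 = 4 MPI processes of 8 OpenMP threads
> each on one node.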
>
>
> --Steve
>
>>
>> Thanks,
>> Panayotis
>> ------------------------------------------------------------------------
>> From: Users <users-bounces at einsteintoolkit.org> on behalf of
>> IOSIF PANAGIOTIS <PANAGIOTIS.IOSIF at units.it>
>> Sent: Monday, September 29, 2025 12:24 PM
>> To: Roland Haas <rhaas at mail.ubc.ca>; Bruno Giacomazzo
>> <bruno.giacomazzo at unimib.it>
>> Cc: Einstein Toolkit Users <users at einsteintoolkit.org>
>> Subject: Re: [Users] Inconsistency warnings: cores/threads mismatch
>> [Leonardo cluster]
>> Hi Roland,
>>
>> Thanks for your reply.
>> You touch on an important point, i.e. the cluster's billing policy,
>> which hadn't crossed my mind.
>>
>> From the billing policy of Leonardo, it seems that it is possible to
>> use only a fraction of a node's total CPUs:
>> https://docs.hpc.cineca.it/hpc/hpc_intro.html#billing-policy
>>
>> However, the documentation also stresses that:
>> "...if a job reserves all of a node’s RAM — even without utilizing
>> all its CPUs — the node becomes unusable for other jobs and is
>> therefore billed accordingly."
>>
>> So, apart from the cores requested, should I also try to calculate
>> the RAM requirements?
>> For example, I see that Bruno's "leonardo-dcgp.ini" file specifies:
>>     memory = 494000
>> And the respective submitscript also has this line:
>>     #SBATCH --mem 494000MB
>>
>> I note that each node in Leonardo has 512 GB of RAM, so that means
>> that the script requests ~94.2% of the RAM.
>> I am not sure I follow the reasoning behind this.
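>>
>> (For what it's worth, the 94.2% figure is reproduced if 512 GB is
>> taken as 512 x 1024 = 524288 MB: 494000 / 524288 ≈ 0.942, i.e. about
>> 30 GB of the node's RAM is left unrequested, presumably as headroom
>> for the system, though that is only my guess.)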
>>
>> What is the default behavior of SimFactory if I were to remove the
>> above specifications from the config files?
>> Because, if by default Simfactory requests/uses all the RAM available
>> in a node, then as far as I understand, it does not make sense to
>> request fewer cores than a full node.
>> Let me know what you think.
>>
>> Best,
>> Panayotis
>>
>>
>> ------------------------------------------------------------------------
>> From: Roland Haas <rhaas at mail.ubc.ca>
>> Sent: Friday, September 26, 2025 4:31 PM
>> To: Bruno Giacomazzo <bruno.giacomazzo at unimib.it>
>> Cc: IOSIF PANAGIOTIS <PANAGIOTIS.IOSIF at units.it>; Einstein Toolkit
>> Users <users at einsteintoolkit.org>
>> Subject: Re: [Users] Inconsistency warnings: cores/threads mismatch
>> [Leonardo cluster]
>> Hello all,
>>
>> > I never used --cores and I don't know the difference with procs.
>>
>> --cores is a synonym for --procs in simfactory. The hope was to avoid
>> the confusion of "procs" being "Processes" or "Processors". Though it
>> has been pointed out that the best name would actually be "--threads"
>> since that is what simfactory actually starts, which then collides with
>> "--num-threads" (threads per process).
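>>
>> (To make the naming concrete with made-up numbers: on a 112-core
>> node, "--procs 112 --num-threads 8" and "--cores 112 --num-threads 8"
>> should request the same thing, namely 112 threads in total, arranged
>> as 112/8 = 14 MPI processes of 8 OpenMP threads each.)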
>>
>> Does Leonardo actually charge you for partial nodes if you do not use a
>> full one? Simfactory is mostly written under the assumption (true at
>> the time) that HPC systems would give you full nodes all the time, so
>> if you use 1 core or 112 cores of a node, the charge would be the same
>> (though shared node systems are becoming more common for HPC now [or
>> again]).
>>
>> Yours,
>> Roland
>>
>> --
>> My email is as private as my paper mail. I therefore support encrypting
>> and signing email messages. Get my PGP key from http://pgp.mit.edu .
>>
>> _______________________________________________
>> Users mailing list
>> Users at einsteintoolkit.org
>> http://lists.einsteintoolkit.org/mailman/listinfo/users
>