[Users] Einstein toolkit with Sun Grid Engine

Chris Stevens chris.stevens at canterbury.ac.nz
Thu Oct 7 16:19:18 CDT 2021


Hi Erik,

Thanks for your suggestion.

I am happy using these in the scripts, but I think the problem is how to pass these expressions to SGE. From what I can tell, the output of @(@PPN_USED@/@NUM_THREADS@)@way is, for example, "6way", given @PPN_USED@=48 and @NUM_THREADS@=8. This means that I have requested the parallel environment called 6way with @PROCS_REQUESTED@ slots; if I requested 48 slots, I would then run mpirun -np 6. So, as far as I can tell, for this to work the specific parallel environment 6way needs to exist. I am now figuring out how to configure parallel environments in this way, most likely by changing the allocation rule.
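For what it's worth, here is the kind of parallel environment I have in mind (a rough sketch based on the SGE documentation; the name and numbers are only illustrative, and whether this is the right rule depends on whether a slot stands for a core or an MPI rank on the cluster):

  # create the PE (opens an editor; only the relevant fields shown)
  qconf -ap 6way
    pe_name            6way
    slots              999
    allocation_rule    6      # integer rule: exactly 6 slots per host
    control_slaves     TRUE
    job_is_first_task  FALSE

  # add the PE to a queue's pe_list
  qconf -aattr queue pe_list 6way all.q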

Let me know if you think this is wrong, as it does seem rather stupid not to be able to simply set something like Slurm's --cpus-per-task in the submission script.

Cheers,

Chris





Dr Chris Stevens

Lecturer in Applied Mathematics

Rm 602, Jack Erskine building

School of Mathematics and Statistics

T: +64 3 369 0396 (Internal 90396)

University of Canterbury | Te Whare Wānanga o Waitaha

Private Bag 4800, Christchurch 8140, New Zealand

http://www.chrisdoesmaths.com


Director
SCRI Ltd
http://www.scri.co.nz

________________________________
From: Erik Schnetter <schnetter at cct.lsu.edu>
Sent: 08 October 2021 09:40
To: Chris Stevens <chris.stevens at canterbury.ac.nz>
Cc: users at einsteintoolkit.org <users at einsteintoolkit.org>
Subject: Re: [Users] Einstein toolkit with Sun Grid Engine

Chris

It might not be necessary to hard-code the number of threads. You can use arbitrary Python expressions via "@( ... )@" in the templates. See e.g. the template for Blue Waters which uses this to choose between CPU and GPU queues.
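For example, a submit-script template could contain something like the following (illustrative only, not copied from the Blue Waters template; the queue names are hypothetical):

  # the "wayness" of the PE is computed at submission time
  #$ -pe @(@PPN_USED@/@NUM_THREADS@)@way @PROCS_REQUESTED@
  # choose a queue depending on how many nodes are requested
  #$ -q @("large" if @NODES@ > 1 else "small")@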

-erik


On Thu, Oct 7, 2021 at 4:04 PM Chris Stevens <chris.stevens at canterbury.ac.nz> wrote:
Hi Roland,

That's fantastic, thanks for linking those files.

It works as expected with only MPI processes. I am careful to compile and run with the same (and only) OpenMPI installation on the cluster, so this should be OK.

Looking at a Slurm-to-SGE conversion table, there is no SGE equivalent of Slurm's --cpus-per-task; rather, it is the allocation rule of the chosen parallel environment, i.e. the backend, that controls this (see the rough comparison below the link).

https://srcc.stanford.edu/sge-slurm-conversion
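For concreteness, the correspondence I am after is roughly this (a sketch only; the PE name is whatever ends up being configured on the cluster):

  # Slurm: 6 MPI ranks with 8 cores each
  #SBATCH --ntasks=6
  #SBATCH --cpus-per-task=8

  # SGE: no per-task CPU count; the split is implied by the PE's allocation rule
  #$ -pe 6way 48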

Further, in the Ranger submit script, the crucial line

#$ -pe @(@PPN_USED@/@NUM_THREADS@)@way @PROCS_REQUESTED@

shows that you request @PROCS_REQUESTED@ slots (as I currently do), and that the name of the requested parallel environment depends on @NUM_THREADS@. From this I take it that I need to set up a parallel environment that hard-codes the number of threads I want per MPI process and then use that parallel environment. I'll see how I go, but it isn't initially obvious how to do this!
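Presumably the matching runscript then does something along these lines (a sketch, not the actual Ranger runscript):

  export OMP_NUM_THREADS=@NUM_THREADS@
  mpirun -np @NUM_PROCS@ @EXECUTABLE@ @PARFILE@

With @PPN_USED@=48 and @NUM_THREADS@=8 this comes out as mpirun -np 6, matching the 6way request above.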

Cheers,

Chris




Dr Chris Stevens

Lecturer in Applied Mathematics

Rm 602, Jack Erskine building

School of Mathematics and Statistics

T: +64 3 369 0396 (Internal 90396)

University of Canterbury | Te Whare Wānanga o Waitaha

Private Bag 4800, Christchurch 8140, New Zealand

http://www.chrisdoesmaths.com


Director
SCRI Ltd
http://www.scri.co.nz


________________________________
From: Roland Haas
Sent: Thursday, October 07, 2021 06:22
To: Chris Stevens
Cc: users at einsteintoolkit.org
Subject: Re: [Users] Einstein toolkit with Sun Grid Engine

Hello Chris,

We used SGE a long time ago on some of the TACC machines.

You can find an old setup for TACC's Ranger cluster in an old commit
like so:

git checkout fed9f8d6fae4c52ed2d0a688fcc99e51b94e608e

and then look at the "ranger" files in the OUTDATED subdirectories of
machines, runscripts, and submitscripts.

Having all MPI ranks on a single node might also be caused by using
different MPI stacks when compiling and when running, so you must make
sure that the "mpirun" (or equivalent command) you use is the one that
belongs to the MPI library that you linked your code against.
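
A quick way to check is to compare the mpirun on your PATH with the MPI
library the executable is linked against, for example (the executable
path depends on your configuration name):

  which mpirun
  mpirun --version
  ldd exe/cactus_sim | grep -i mpi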

Finally, you may also have to check whether this is an issue with threads
vs. MPI ranks. I.e. I would check whether things are still wrong if you use
only MPI processes and no OpenMP threads at all (in that case you would have
to check what SGE counts: threads (cores) or MPI ranks (processes)).
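
With simfactory such a test would look something like this (the simulation
name and parameter file are placeholders):

  ./simfactory/bin/sim submit mpi-only-test \
    --parfile par/static_tov.par --procs 48 --num-threads 1

i.e. 48 MPI ranks with a single OpenMP thread each, which should also make
it clear whether SGE counts cores or ranks.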

Yours,
Roland

> Hi everyone,
>
> I have set up the Einstein toolkit on a local cluster of 20 nodes with the SGE scheduler. I have not seen any examples of this scheduler being used with the Einstein toolkit.
>
> I have managed to get it working; however, if I ask for a number of slots that requires more than one node, the nodes are allocated correctly, but all processes and threads run on a single node, which is then oversubscribed.
>
> My question is whether anybody has used SGE with the Einstein toolkit and whether this is feasible. If it is, I can send more details if there are people willing to help solve this inter-node communication problem.
>
> Thanks in advance,
>
> Chris
>
> Dr Chris Stevens
>
> Lecturer in Applied Mathematics
>
> Rm 602, Jack Erskine building
>
> School of Mathematics and Statistics
>
> T: +64 3 369 0396 (Internal 90396)
>
> University of Canterbury | Te Whare Wānanga o Waitaha
>
> Private Bag 4800, Christchurch 8140, New Zealand
>
> http://www.chrisdoesmaths.com
>
>
> Director
> SCRI Ltd
> http://www.scri.co.nz
>



--
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu .
_______________________________________________
Users mailing list
Users at einsteintoolkit.org
http://lists.einsteintoolkit.org/mailman/listinfo/users


--
Erik Schnetter <schnetter at cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/

