[Users] Failure to run static_tov simulation using more than 1 node on cluster

Roland Haas rhaas at illinois.edu
Fri Jul 7 10:29:28 CDT 2023


Hello Wei,

sorry for the long delay in responding.

The automated setup commands `setup` and `setup-silent` are only designed
to set up simfactory on a workstation, not on a cluster.

Clusters are complex and individual enough that a simple script does not
work, so you will have to adjust or set things up manually (e.g. the
queuing system).

See e.g.:

https://www.einsteintoolkit.org/seminars/2022_02_24/index.html

and the wiki documentation

https://docs.einsteintoolkit.org/et-docs/Configuring_a_new_machine

Note that under the hood this all just uses your cluster's queuing
system, and you can also "run your own" sbatch script, e.g. starting
from an example provided in the cluster's documentation:

--8<--

#!/bin/bash
# various SBATCH or similar options

# e.g. 4 OpenMP threads per MPI rank
export OMP_NUM_THREADS=4

# 16 MPI ranks, i.e. 16 x 4 = 64 cores in total
srun -n 16 --cpus-per-task "$OMP_NUM_THREADS" cactus_sim myparfile.par

--8<--
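
For the case in the original question (48 cores on nodes with 24 cores
each), the node and rank counts that go into such a script follow from
simple arithmetic. Below is a minimal sketch, assuming a Slurm-style
launcher as in the example above. The helper name `cactus_layout` and its
arguments are made up for illustration and are not part of simfactory or
Slurm:

```shell
#!/bin/bash
# Hypothetical helper (not part of simfactory or Slurm).
# Usage: cactus_layout TOTAL_CORES CORES_PER_NODE THREADS_PER_RANK
# Prints "<nodes> <mpi_ranks>".
cactus_layout() {
    local total=$1 ppn=$2 threads=$3
    # Round up: a partially used node still has to be requested as a whole.
    local nodes=$(( (total + ppn - 1) / ppn ))
    # Each MPI rank occupies $threads cores.
    local ranks=$(( total / threads ))
    echo "$nodes $ranks"
}

# 48 cores, 24 cores per node, 1 thread per rank (values from the question)
layout=$(cactus_layout 48 24 1)
nodes=${layout% *}
ranks=${layout#* }
export OMP_NUM_THREADS=1

# Print the launch line instead of running it, so it can be checked first.
echo "srun -N $nodes -n $ranks --cpus-per-task $OMP_NUM_THREADS cactus_sim myparfile.par"
```

With these values the script prints
`srun -N 2 -n 48 --cpus-per-task 1 cactus_sim myparfile.par`. The matching
scheduler options still have to be taken from your cluster's documentation.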

Yours,
Roland

> Hello,
> 
> I am trying to run the static_tov simulation on my university's cluster (CRC
> Notre Dame <https://docs.crc.nd.edu/new_user/quick_start.html>). However, I
> got this error when acquiring 48 cores:
> 
> Error: Too many nodes specified: nodes=2 (maxnodes is 1)
> 
> I am assuming this is due to the *#Source tree management* module in the
> Cactus/simfactory/mdb/machines/<host>.ini file. In my case, the automatic
> *.ini file I got has this configuration:
> 
> ppn             = 24
> max-num-threads = 24
> num-threads     = 12
> nodes           = 1
> 
> 
> However, the machine I use has 24 cores for each node, and the number of
> threads for each core is 1. That means if I run the simulation using 48
> cores, I need 2 nodes.
> 
> Although I didn't get 2 hosts successfully (I got only 1), I am wondering if
> I have to edit the <host>.ini file by hand after running the command
> ./simfactory/bin/sim setup-silent, for example: num-threads=1, nodes=2? Or is
> there any other correct way of getting more than 1 node?
> 
> I would appreciate it if you could give me some ideas on this problem.
> 
> Best regards,
> Wei



-- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu .