From sbrandt at cct.lsu.edu Fri Nov 1 14:35:03 2024
From: sbrandt at cct.lsu.edu (Steven Brandt)
Date: Fri, 1 Nov 2024 14:35:03 -0500
Subject: [Users] Unable to set Cactus to use all threads in hypter-threaded cores
In-Reply-To: <1c0862b8-59de-4054-b2f8-a4422b658367@ua.pt>
References: <1c0862b8-59de-4054-b2f8-a4422b658367@ua.pt>
Message-ID: <9182c454-ddf3-4653-9a4a-5c5ce96ed834@cct.lsu.edu>

On 10/31/2024 12:03 PM, José Ferreira wrote:
>
> Dear Toolkit Community,
>
> I'm struggling to make use of all of the available threads in the
> toolkit when running on a machine that has hyper-threading enabled.
>
> On my local machine, which does not have hyper-threading, if I invoke
> the toolkit's binary using |OMP_NUM_THREADS=2 mpirun -np 4 -- exe/base
> -p par/parfile.par|, it outputs
>
> |INFO (Carpet): MPI is enabled
> INFO (Carpet): Carpet is running on 4 processes
> INFO (Carpet): This is process 0
> INFO (Carpet): OpenMP is enabled
> INFO (Carpet): This process contains 2 threads, this is thread 0
> INFO (Carpet): There are 8 threads in total
> INFO (Carpet): There are 2 threads per process
> |
>
> This creates 4 processes with 2 threads each and uses all 8 of the
> available threads in my CPU, as expected.
>
> I am now free to change the number of processes and threads as I see
> fit, in order to look for the configuration that minimizes the
> physical time per hour.
>
> However, most of my computations are performed on Marenostrum5, where
> each machine has 2 sockets, each with 56 physical cores with
> hyper-threading enabled, totaling 112 physical cores or 224 threads
> per machine. For some reason, the toolkit does not use all of the
> available threads.
>
> To replicate the scenario above, I use the following Slurm submission
> script
>
> |#!/usr/bin/env bash
> #SBATCH -N 1
> #SBATCH -n 4
> #SBATCH -c 1
> #SBATCH -t 30
> export OMP_NUM_THREADS=2
> srun --cpu-bind=none exe/base par/parfile.par
> |
>
> where I ask for a single machine, 4 tasks (to me, task = a process)
> per machine and 1 CPU per task, which due to hyper-threading should
> provide 2 threads per task.
>
> The output of the toolkit is
>
> |754 INFO (Carpet): MPI is enabled
> 755 INFO (Carpet): Carpet is running on 4 processes
> 756 INFO (Carpet): This is process 0
> 757 INFO (Carpet): OpenMP is enabled
> 758 INFO (Carpet): This process contains 1 threads, this is thread 0
> 759 INFO (Carpet): There are 4 threads in total
> 760 INFO (Carpet): There are 1 threads per process
> 761 INFO (Carpet): This process runs on host gs22r3b16, pid=1514092
> 762 INFO (Carpet): This process runs on 8 cores: 54-55, 97, 105, 166-167, 209,217
> 763 INFO (Carpet): Thread 0 runs on 8 cores: 54-55, 97, 105, 166-167, 209, 217
> |
>
> From the output above you can see that I have been provided with 8
> cores, even though I have requested 4 CPUs in total, which means that
> the toolkit can see the available threads coming from hyper-threading.
>
> It also shows that it ignored my request for 2 threads per process,
> which I set via the environment variable |OMP_NUM_THREADS|.
>
> If I force |CACTUS_NUM_THREADS=2|, it crashes with the error
>
> |INFO (Carpet): MPI is enabled
> INFO (Carpet): Carpet is running on 4 processes
> INFO (Carpet): This is process 0
> INFO (Carpet): OpenMP is enabled
> INFO (Carpet): This process contains 1 threads, this is thread 0
> WARNING level 0 from host gs06r3b13 process 1 in thorn Carpet, file
> /gpfs/home/uapt/uapt015213/projects/cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:187:
> -> The environment variable CACTUS_NUM_THREADS is set to 2, but there
> are 1 threads on this process. This may indicate a severe problem with
> the OpenMP startup mechanism.
> |
>
> which leads me to believe that it is MPI that is refusing to
> initialize more threads, and not the toolkit itself.
>
I believe you are correct about this.

> My questions are:
>
>  1. Is there a performance gain by making use of hyper-threading
>     knowing that the toolkit is memory bound and the different threads
>     share the same cache?
>
I'm inclined to doubt it.

>  2. If yes, how can I adapt my submission scripts to tell Cactus to
>     make use of hyper-threading?
>
Maybe ask your sysadmins if there is some magic formula to give to Slurm?

--Steve

> Thank you in advance,
>
> Best regards,
>
> José Ferreira
>
> _______________________________________________
> Users mailing list
> Users at einsteintoolkit.org
> http://lists.einsteintoolkit.org/mailman/listinfo/users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rhaas at illinois.edu Mon Nov 4 15:18:02 2024
From: rhaas at illinois.edu (rhaas at illinois.edu)
Date: Mon, 04 Nov 2024 15:18:02 -0600
Subject: [Users] Agenda for Thursday's Meeting
Message-ID: 

Please update the Wiki with agenda items for Thursday's meeting. Thanks!

https://docs.einsteintoolkit.org/et-docs/meeting_agenda

--The Maintainers

From physik at fangwolg.de Mon Nov 4 16:12:24 2024
From: physik at fangwolg.de (Wolfgang Kastaun)
Date: Mon, 04 Nov 2024 23:12:24 +0100
Subject: [Users] hybrid star setup in the TOVSolver thorn
In-Reply-To: 
References: 
Message-ID: <4339edc80184e60fa00388f815d3673d35264a54.camel@fangwolg.de>

Hi,

there is another TOV solver available in the "reprimand" thorn. However,
it only solves the NS equations and does not set up any initial data.
It provides ordinary C++ functions for EOS handling and TOV solving,
but there is no Cactus interface. To set up initial data, you would
need to write a small thorn or replace the TOV solver part in the
available TOVSolver thorn with calls to reprimand.
The thorn is based on the standalone C++ library also named reprimand
(which also has a Python interface that can be pip-installed).
https://github.com/wokast/RePrimAnd
It is documented here
https://wokast.github.io/RePrimAnd/index.html

It might be well suited for your special use case since it is also
designed for dealing with discontinuous EOS, as described in this article:
https://arxiv.org/abs/2404.11346

Cheers,
Wolfgang.

On Thu, 2024-10-31 at 20:58 +0000, CJ Osakwe wrote:
>
> Hello,
>
> I am trying to set up a stable hybrid star (with a quark matter core
> and hadronic crust, each with their own equation of state) in the
> Einstein Toolkit. My initial approach was to adapt the TOVSolver
> thorn to achieve this, but the code is removing the density
> discontinuity at the hadronic-quark matter interface when it
> interpolates from 1D to 3D. Also, the discontinuity doesn't seem to
> be reflected in the pressure anyway. I'm wondering if there is a
> thorn that is better suited for setting up a stable hybrid star?
>
> Cheers,
> CJ Osakwe
> PhD candidate, Department of Physics and Astronomy
> University of Calgary

From rhaas at illinois.edu Wed Nov 6 17:15:01 2024
From: rhaas at illinois.edu (rhaas at illinois.edu)
Date: Wed, 06 Nov 2024 17:15:01 -0600
Subject: [Users] Einstein Toolkit Meeting Reminder
Message-ID: 

Hello,

Please consider joining the weekly Einstein Toolkit phone call at
9:00 am US central time on Thursdays. For details on how to connect
and what agenda items are to be discussed, use the link below.

https://docs.einsteintoolkit.org/et-docs/Main_Page#Weekly_Users_Call

--The Maintainers

From rhaas at illinois.edu Thu Nov 7 08:52:04 2024
From: rhaas at illinois.edu (Roland Haas)
Date: Thu, 7 Nov 2024 09:52:04 -0500
Subject: [Users] upcoming November 2024 Einstein Toolkit release
Message-ID: <20241107095204.1f0ab4b8@illinois.edu>

Hello all,

we are pleased to announce the anticipated release of the next version
of the Einstein Toolkit, code-named "Annie Jump Cannon", for 2024-11-29.

Yours,
Roland

-- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu .
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: 

From rhaas at illinois.edu Thu Nov 7 09:50:55 2024
From: rhaas at illinois.edu (Roland Haas)
Date: Thu, 7 Nov 2024 10:50:55 -0500
Subject: [Users] meeting minutes for 2024-11-07
Message-ID: <20241107105055.116f2b9e@illinois.edu>

Present: Roland, Leo, Peter, Bill, Johnny, Lucas, Steve, Zach

ET release
==========
* Z4c will be postponed due to lack of tests
* TOVola reviewed positively, will be included
* testsuite status: none run yet
** Johnny will run on Frontera
** Roland mentioned failing IGM test, Leo will look at it, Sam may be
   aware
* lagging behind for release timeline, need to regenerate codes and
  contact contributors
* Leo will test on macOS using homebrew, Roland will try and provide
  documentation

Updated tickets
===============
* https://bitbucket.org/einsteintoolkit/tickets/issues/2773/make-carpetx-thorndoc-is-confused-in
  Steve will take a look
* https://bitbucket.org/einsteintoolkit/tickets/issues/2282/gallery-examples-use-low-order-integration
  could update BNS example easily, BBH example is set to match Zenodo,
  Steve has checked that gallery and Zenodo match, Peter will take a look
* https://bitbucket.org/einsteintoolkit/tickets/issues/2606/inconsistent-lapack-versions-in
  Zach found that latest version seems to now work fine, will update to
  latest version
* https://bitbucket.org/einsteintoolkit/tickets/issues/2633/summationbyparts-diff_gv-aliased-function
  Peter will take a look
* https://bitbucket.org/einsteintoolkit/tickets/issues/2503/various-small-problems-with
  Zach will take a look at Gabriele's comments
* https://bitbucket.org/einsteintoolkit/tickets/issues/2591/new-centering-syntax-not-documented
  Lucas will add documentation to flesh based on CarpetX docs
* https://bitbucket.org/einsteintoolkit/tickets/issues/2833/null-pointer-dereference-in-aeilocalinterp
  Steve will approve
* https://bitbucket.org/einsteintoolkit/tickets/issues/2832/possible-race-condition-in-loopcontrol
  is no longer being observed with the patch applied in ff99c0d16
* https://bitbucket.org/einsteintoolkit/tickets/issues/2818/failing-tests-with-gcc-14
  Steve suggests to apply

Misc
====
* Peter will not make it to the call next week

Yours,
Roland

From rhaas at illinois.edu Thu Nov 7 11:49:08 2024
From: rhaas at illinois.edu (Roland Haas)
Date: Thu, 7 Nov 2024 12:49:08 -0500
Subject: [Users] hybrid star setup in the TOVSolver thorn
In-Reply-To: <4339edc80184e60fa00388f815d3673d35264a54.camel@fangwolg.de>
References: <4339edc80184e60fa00388f815d3673d35264a54.camel@fangwolg.de>
Message-ID: <20241107124908.7187a596@illinois.edu>

Hello all,

There's also David Boyer's TOVola TOV initial data thorn (scheduled to
be included in the November release of the ET), which may work. I have
added David Boyer, its author; David, would you be able to comment on this?

Yours,
Roland

> Hi,
>
> there is another TOV solver available in the "reprimand" thorn. It only
> solves the NS equations, however, but does not set up any initial data.
> It provides ordinary C++ functions for EOS handling and TOV solving,
> but there is no cactus interface. To set up initial data, you would
> need to write a small thorn or replace the TOV solver part in the
> available TOVSolver thorn with calls to reprimand.
> The thorn is based on the standalone C++ library also named reprimand
> (which also has a python interface that can be pip-installed).
> https://github.com/wokast/RePrimAnd
> It is documented here
> https://wokast.github.io/RePrimAnd/index.html
>
> It might be well suited for your special use case since it is designed
> for dealing also with discontinuous EOS, as described in this article:
> https://arxiv.org/abs/2404.11346
>
>
> Cheers,
> Wolfgang.
>
> On Thu, 2024-10-31 at 20:58 +0000, CJ Osakwe wrote:
> >
> > Hello,
> >
> > I am trying to set up a stable hybrid star (with a quark matter core
> > and hadronic crust, each with their own equation of state) in the
> > Einstein Toolkit. My initial approach was to adapt the TOVSolver
> > thorn to achieve this, but the code is removing the density
> > discontinuity at the hadronic-quark matter interface when it
> > interpolates from 1D to 3D. As well the discontinuity doesn't seem to
> > be reflected in the pressure anyway. I'm wondering if there is a
> > thorn that is better suited for setting up a stable hybrid star?
> >
> > Cheers,
> > CJ Osakwe
> > PhD candidate, Department of Physics and Astronomy
> > University of Calgary

Yours,
Roland

From rhaas at illinois.edu Mon Nov 11 15:18:01 2024
From: rhaas at illinois.edu (rhaas at illinois.edu)
Date: Mon, 11 Nov 2024 15:18:01 -0600
Subject: [Users] Agenda for Thursday's Meeting
Message-ID: 

Please update the Wiki with agenda items for Thursday's meeting. Thanks!

https://docs.einsteintoolkit.org/et-docs/meeting_agenda

--The Maintainers

From dblas at ifae.es Tue Nov 12 08:52:26 2024
From: dblas at ifae.es (Diego Blas)
Date: Tue, 12 Nov 2024 08:52:26 -0600
Subject: [Users] ERC funded Postdoctoral and PhD positions at IFAE
Message-ID: <67336baa.br7cfjnYI8v+9Z7O%dblas@ifae.es>

Dear colleagues,

Please find below the announcement for 2 postdoctoral positions and
2 PhD positions in theoretical aspects of the detection of
high-frequency gravitational waves at IFAE, Barcelona, Spain.

PhDs: https://inspirehep.net/jobs/2845489
Postdocs: https://inspirehep.net/jobs/2845228

I would appreciate it if you could distribute it among students, or
lists that may reach potentially interested candidates. I apologize if
you get this mail from different sources.

Best regards,
Diego

-
Dr. Diego Blas, ICREA Research Professor
Institut de Fisica d'Altes Energies (IFAE)
The Barcelona Institute of Science and Technology (BIST)
Edifici Cn, Universitat Autonoma de Barcelona
08193 Bellaterra (Barcelona), Spain
Phone: +34-93-581-3095
http://www.icrea.cat/

-- 
Avis - Aviso - Legal Notice - (LOPD) - http://legal.ifae.es/

From rhaas at illinois.edu Wed Nov 13 17:15:01 2024
From: rhaas at illinois.edu (rhaas at illinois.edu)
Date: Wed, 13 Nov 2024 17:15:01 -0600
Subject: [Users] Einstein Toolkit Meeting Reminder
Message-ID: 

Hello,

Please consider joining the weekly Einstein Toolkit phone call at
9:00 am US central time on Thursdays. For details on how to connect
and what agenda items are to be discussed, use the link below.

https://docs.einsteintoolkit.org/et-docs/Main_Page#Weekly_Users_Call

--The Maintainers

From b.gabella at Vanderbilt.Edu Thu Nov 14 10:27:12 2024
From: b.gabella at Vanderbilt.Edu (Gabella, William E)
Date: Thu, 14 Nov 2024 16:27:12 +0000
Subject: [Users] Meeting Minutes 2024-11-14
Message-ID: 

https://docs.einsteintoolkit.org/et-docs/Main_Page#Weekly_Users_Call

Minutes ET 20241114 9am CST

Attending: Zach E, Lucas TS, David B, Leo RW, Maxwell R, Bill G, Steve B

Chair Leo RW, Minutes Bill G

Agenda https://docs.einsteintoolkit.org/et-docs/Meeting_agenda

* Release timeline
  https://docs.einsteintoolkit.org/et-docs/Release_Details#Schedule

** Announcement
   Roland, just finished draft of release announcement. Check it out
   and send any comments/corrections.
   https://www.einsteintoolkit.org/about/releases/ET_2024_11_announcement.html

** Contributors
   Roland, A little behind on contacting new contributors. David needs
   to tell us all the contributors to TOVola.

** Tests
   Roland, Behind on testing of the different clusters.
   Steve on Docker containers. Maxwell also testing, OpenPPD failed.
   Steve, had to switch from OpenMPI to MPICH, OpenMPI not working on
   Docker. Has one test that fails. Docker containers are all good.
   And can share the install command for each of those. Roland, then we
   can update the Jupyter notebook.
   Roland, can compile on his clusters but not run tests yet.
   GRHayL has a failing test. Leo is looking into it.
   Roland, issues with HDF5, can set to use external library. Should
   put that in the Jupyter notebook. Leo, probably not looking for
   different values in the list. Maybe changing the list is too invasive
   for a few weeks before release. Roland, Could back port it into the
   release after testing it further.
   Steve, testsuite on any LSU machines? Roland, Maybe DB1. Steve is
   signed up for all the LSU machines. Roland not able to login to other
   machines at LSU. Steve should receive requests for permissions.
   Steve added LONI allocation for Roland to run on those LSU machines.
   Steve still assigned the testing.
   Roland, trying to install more of the CarpetX so moving from Intel
   backend to LLVM-based so oneAPI compilers that are Intel can compile
   this, have better C++ support.

* Unanswered questions on mailing list
   None.

* Open Tickets

  #2830, CarpetX depends on BLOSC
   Not in the Toolkit. Roland has poked Erik.

  #2173, #2174, #2176, #2172 Poisson, Multi-Patch, BNS, GW150914 Gallery examples
   David needs write access, to upload the tar file to downloads
   folder. Roland will check on login to BitBucket and give access.

  #2818 Failing tests with GCC14.
   Roland, will close it, there is a warning coming out for AMReX.

  #2520 update gauge settings in TOV example
   David, will be uploading the gauge examples too.

  #2171, and other old tickets
   Roland poked them to bring them high on the list.

* Tickets for Review

  #1566 Update Cactus autoconf
   Roland, do not want to make the change this close to release.
   Currently using very old 2.13 (1999) and new is 2.72 and using the
   automatic script still needs some manual changes. So needs more
   testing. New Autoconf has advantages for many machines. Review the
   configure.ac .
   [ https://www.gnu.org/software/automake/history/automake-history.html
     January 1999, Automake 1.4 and Autoconf 2.13
     https://ftp.gnu.org/gnu/autoconf/
     December 2023, Autoconf 2.72 ]

  #2761 some ExternalLibraries require cmake to build
   Roland, several clusters do not have new enough cmake, wants 3.22
   version. Steve, for Docker had to install cmake package for each of
   them.

  #2920 NewRadX boundary condition conflicts with symmetry condition
   Roland, would like to have had it in. A two commit issue, one for
   NewRadX and one for CarpetX.

  #963 Improve McLachlan accuracy
   Roland we should get this in next time.

* Any Other Business
   None

* Next Week:
   chair Zach, minutes Leo
   Steve is at SuperComputing next week.

-- 
Dr. William Gabella

From rhaas at illinois.edu Mon Nov 18 15:18:01 2024
From: rhaas at illinois.edu (rhaas at illinois.edu)
Date: Mon, 18 Nov 2024 15:18:01 -0600
Subject: [Users] Agenda for Thursday's Meeting
Message-ID: 

Please update the Wiki with agenda items for Thursday's meeting. Thanks!

https://docs.einsteintoolkit.org/et-docs/meeting_agenda

--The Maintainers

From rhaas at illinois.edu Wed Nov 20 17:15:01 2024
From: rhaas at illinois.edu (rhaas at illinois.edu)
Date: Wed, 20 Nov 2024 17:15:01 -0600
Subject: [Users] Einstein Toolkit Meeting Reminder
Message-ID: 

Hello,

Please consider joining the weekly Einstein Toolkit phone call at
9:00 am US central time on Thursdays. For details on how to connect
and what agenda items are to be discussed, use the link below.

https://docs.einsteintoolkit.org/et-docs/Main_Page#Weekly_Users_Call

--The Maintainers

From maya.baireddy at gmail.com Sat Nov 30 17:44:44 2024
From: maya.baireddy at gmail.com (Maya Baireddy)
Date: Sat, 30 Nov 2024 18:44:44 -0500
Subject: [Users] Fwd: need help running simulation on slurm
In-Reply-To: 
References: 
Message-ID: 

Hello Everyone,

I am new to ETK. I am working on my high school research project,
trying to run a simulation of a BNS merger on the Amarel supercomputer
at my local university. Could you please help me start my simulation
on SLURM?

I have followed steps 1-5 of the ETK gallery example for the BNS
simulation, but I am not able to successfully create a machine to run
the simulation.

I ran the following steps

/home/sb1554/BNS/simfactory/bin/sim create bns --parfile /home/sb1554/BNS/bns.par --machine slurmbns
srun bns.sh -o slurm.bns.%N.%j.out

and got the error:

*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)

I am attaching my machine, submit script, run script, and log files.
I would appreciate any pointers from you, or if you could point me to
the right person. I was trying to post this on the ETK forum, but need
one credit to post.

Thank you,
Maya
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
+ set -e
+ cd /home/sb1554/simulations/bns/output-0000-active
+ echo Checking:
+ pwd
+ hostname
+ date
+ echo Environment:
+ export CACTUS_NUM_PROCS=1
+ CACTUS_NUM_PROCS=1
+ export CACTUS_NUM_THREADS=8
+ CACTUS_NUM_THREADS=8
+ export GMON_OUT_PREFIX=gmon.out
+ GMON_OUT_PREFIX=gmon.out
+ export OMP_NUM_THREADS=8
+ OMP_NUM_THREADS=8
+ sort
+ env
+ echo Starting:
++ date +%s
+ export CACTUS_STARTTIME=1732921316
+ CACTUS_STARTTIME=1732921316
+ '[' 1 = 1 ']'
+ '[' 0 -eq 0 ']'
+ /home/sb1554/simulations/bns/SIMFACTORY/exe/cactus_sim -L 3 /home/sb1554/simulations/bns/output-0000/bns.par
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[slepner085.amarel.rutgers.edu:07624] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
-------------- next part --------------
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::Creating simulation bns
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::Simulation directory: /home/sb1554/simulations/bns
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::Simulation Properties:
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::[properties]
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::machine = slurmbns
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::simulationid = simulation-bns-slurmbns-amarel1.amarel.rutgers.edu-sb1554-2024.11.29-18.01.08-16276
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::sourcedir = /home/sb1554/BNS
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::configuration = sim
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::configid = config-sim-slepner088.amarel.rutgers.edu-cache-home-sb1554-BNS
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::buildid = build-sim-slepner088.amarel.rutgers.edu-sb1554-2024.11.15-02.32.38-2196
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::testsuite = False
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::executable = /home/sb1554/simulations/bns/SIMFACTORY/exe/cactus_sim
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::optionlist = /home/sb1554/simulations/bns/SIMFACTORY/cfg/OptionList
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::submitscript = /home/sb1554/simulations/bns/SIMFACTORY/run/SubmitScript
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::runscript = /home/sb1554/simulations/bns/SIMFACTORY/run/RunScript
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::parfile = /home/sb1554/simulations/bns/SIMFACTORY/par/bns.par
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::
[LOG:2024-11-29 18:01:08] restart.create(simulationName, parfile)::Simulation bns created
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::Creating new properties because this is an independant run, not a run following a submit
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::Determined the following properties
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::[properties]
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::machine = slurmbns
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::simulationid = simulation-bns-slurmbns-amarel1.amarel.rutgers.edu-sb1554-2024.11.29-18.01.08-16276
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::sourcedir = /home/sb1554/BNS
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::configuration = sim
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::configid = config-sim-slepner088.amarel.rutgers.edu-cache-home-sb1554-BNS
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::buildid = build-sim-slepner088.amarel.rutgers.edu-sb1554-2024.11.15-02.32.38-2196
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::testsuite = False
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::executable = /home/sb1554/simulations/bns/SIMFACTORY/exe/cactus_sim
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::optionlist = /home/sb1554/simulations/bns/SIMFACTORY/cfg/OptionList
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::submitscript = /home/sb1554/simulations/bns/SIMFACTORY/run/SubmitScript
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::runscript = /home/sb1554/simulations/bns/SIMFACTORY/run/RunScript
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::parfile = /home/sb1554/simulations/bns/SIMFACTORY/par/bns.par
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::nodes = 1
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::procsrequested = 8
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::ppn = 8
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::numprocs = 1
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::nodeprocs = 1
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::procs = 1
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::numthreads = 8
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::ppnused = 8
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::numsmt = 1
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::hostname = amarel1.amarel.rutgers.edu
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::user = sb1554
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::memory = 124000
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::cpufreq = 
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::pbsSimulationName= bns-0000
[LOG:2024-11-29 18:01:56] restart.userRun(simulationName)::
[LOG:2024-11-29 18:01:56] self.makeActive()::Simulation bns with restart-id 0 has been made active
[LOG:2024-11-29 18:01:56] self.run(debug)::Prepping for execution/run
[LOG:2024-11-29 18:01:56] checkpointing = self.PrepareCheckpointing(recover_id)::PrepareCheckpointing: max_restart_id: -1
[LOG:2024-11-29 18:01:56] self.run(debug)::Defined substitution properties for execution/run
[LOG:2024-11-29 18:01:56] self.run(debug)::{'MACHINE': 'slurmbns', 'SOURCEDIR': '/home/sb1554/BNS', 'SIMULATION_NAME': 'bns', 'SHORT_SIMULATION_NAME': 'bns-0000', 'SIMULATION_ID': 'simulation-bns-slurmbns-amarel1.amarel.rutgers.edu-sb1554-2024.11.29-18.01.08-16276', 'RESTART_ID': 0, 'SCRIPTFILE': '/home/sb1554/simulations/bns/SIMFACTORY/run/SubmitScript', 'SUBMITSCRIPT': '/home/sb1554/simulations/bns/SIMFACTORY/run/SubmitScript', 'CONFIGURATION': 'sim', 'EXECUTABLE': '/home/sb1554/simulations/bns/SIMFACTORY/exe/cactus_sim', 'PARFILE': '/home/sb1554/simulations/bns/output-0000/bns.par', 'RUNDIR':
'/home/sb1554/simulations/bns/output-0000', 'HOSTNAME': 'amarel1.amarel.rutgers.edu', 'USER': 'sb1554', 'ALLOCATION': 'NO_ALLOCATION', 'NODES': 1, 'PROCS_REQUESTED': 8, 'PPN': 8, 'NUM_PROCS': 1, 'NODE_PROCS': 1, 'PROCS': 1, 'NUM_THREADS': 8, 'PPN_USED': 8, 'NUM_SMT': 1, 'MEMORY': '124000', 'CPUFREQ': None, 'RUNDEBUG': 0}
[LOG:2024-11-29 18:01:56] self.run(debug)::Executing run command: /home/sb1554/simulations/bns/output-0000/SIMFACTORY/RunScript
[LOG:2024-11-29 18:05:43] restart.load(simulationName, active_id)::For simulation bns, loaded restart id 0, long restart id 0000
[LOG:2024-11-29 18:29:40] ret = restart.load(sim, restart_id)::For simulation bns, loaded restart id 0, long restart id 0000
[LOG:2024-11-29 18:29:40] ret = restart.load(sim, restart_id)::For simulation bns, loaded restart id 0, long restart id 0000
-------------- next part --------------
#! /bin/bash

export SIMFACTORY=/home/sb1554/Cactus/simafactory/bin
export SOURCE_DIR=/home/sb1554/Cactus
export CACTUS_PATH=/home/sb1554/BNS

#BATCH --partition=main              # Partition (job queue)
#SBATCH --requeue                    # Return job to the queue if preempted
#SBATCH --job-name=bnsnew            # Assign a short name to your job
#SBATCH --nodes=1                    # Number of nodes you require
#SBATCH --ntasks=1                   # Total # of tasks across all nodes
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1            # Cores per task (>1 if multithread tasks)
#SBATCH --mem=124000                 # Real memory (RAM) required (MB)
#SBATCH --time=70:00:00              # Total run time limit (HH:MM:SS)
#SBATCH --output=slurm.bns.%N.%j.out # STDOUT output file
#SBATCH --error=slurm.bns.%N.%j.err  # STDERR output file (optional)

module use /projects/community/modulefiles
#module load gcc/10.2.0/openmpi/4.0.5-bz186
module load gcc/11.2/openmpi/4.1.3-kholodvl
module load libnl/3.2.25-sb1554
module load rdma-core/54.0-sb1554
module load gsl/2.5-bd387

cd /home/sb1554/BNS
/home/sb1554/BNS/simfactory/mdb/runscripts/slurmbns.run
-------------- next part --------------
[slurmbns]

# This machine description file is used internally by simfactory as a template
# during the sim setup and sim setup-silent commands
# Edit at your own risk

# Machine description
nickname        = slurmbns
name            = slurmbns
location        = LSU
description     = CCT
status          = production

# Access to this machine
hostname        = amarel1.amarel.rutgers.edu
aliaspattern    = ^\w+(\.amarel\.rutgers\.edu)?$

# Source tree management
sourcebasedir   = /home/sb1554
optionlist      = generic.cfg
submitscript    = slurmbns.sub
runscript       = slurmbns.run
make            = make -j@MAKEJOBS@
basedir         = /home/sb1554/simulations
ppn             = 8
max-num-threads = 128
num-threads     = 8
memory          = 124000
nodes           = 2
num-smt         = 1
#procs          = 16
submit          = sbatch /home/sb1554/BNS/simfactory/mdb/runscripts/slurmbns.run
getstatus       = squeue -j @JOB_ID@
# need to kill the whole set of processes descending from @JOB_ID@, not just the
# (simfactory) top-level process
stop            = scancel @JOB_ID@
submitpattern   = 'Submitted batch job (\d+)'
statuspattern   = '@JOB_ID@ '
queuedpattern   = ' PD '
queue           = checkpt
runningpattern  = ' (CF|CG|R|TO) '
holdingpattern  = '\(JobHeldUser\)'
[sb1554@amarel1 machines]$
exechostpattern = (.*)
stdout          = cat @SIMULATION_NAME@.out
stderr          = cat @SIMULATION_NAME@.err
stdout-follow   = sleep 10 ; sattach @JOB_ID@.0
# stdout-follow = while ! scontrol >/dev/null wait_job @JOB_ID@ ; do sleep 5 ; done ; tail -n 100 -f @SIMULATION_NAME@.out @SIMULATION_NAME@.err
maxwalltime     = 72:00:00
disabled-thorns = CactusUtils/SystemTopology

[slurmbns]

# This machine description file is used internally by simfactory as a template
# during the sim setup and sim setup-silent commands
# Edit at your own risk

# Machine description
nickname        = slurmbns
name            = slurmbns
location        = LSU
description     = CCT
status          = production

# Access to this machine
hostname        = amarel1.amarel.rutgers.edu
aliaspattern    = ^\w+(\.amarel\.rutgers\.edu)?$

# Source tree management
sourcebasedir   = /home/sb1554
optionlist      = generic.cfg
submitscript    = slurmbns.sub
runscript       = slurmbns.run
make            = make -j@MAKEJOBS@
basedir         = /home/sb1554/simulations
ppn             = 8
max-num-threads = 128
num-threads     = 8
memory          = 124000
nodes           = 33
submit          = sbatch /home/sb1554/BNS/simfactory/mdb/runscripts/slurmbns.run
getstatus       = squeue -j @JOB_ID@
# need to kill the whole set of processes descending from @JOB_ID@, not just the
# (simfactory) top-level process
stop            = scancel @JOB_ID@
submitpattern   = 'Submitted batch job (\d+)'
statuspattern   = '@JOB_ID@ '
queuedpattern   = ' PD '
queue           = checkpt
runningpattern  = ' (CF|CG|R|TO) '
holdingpattern  = '\(JobHeldUser\)'
exechost        = hostname -s
exechostpattern = (.*)
stdout          = cat @SIMULATION_NAME@.out
stderr          = cat @SIMULATION_NAME@.err
stdout-follow   = sleep 10 ; sattach @JOB_ID@.0
# stdout-follow = while ! scontrol >/dev/null wait_job @JOB_ID@ ; do sleep 5 ; done ; tail -n 100 -f @SIMULATION_NAME@.out @SIMULATION_NAME@.err
maxwalltime     = 72:00:00
disabled-thorns = CactusUtils/SystemTopology
-------------- next part --------------
#! /bin/bash

echo "Preparing:"
set -x # Output commands
set -e # Abort on errors

cd /home/sb1554/

echo "Checking:"
pwd
hostname
date

echo "Environment:"
export CACTUS_PATH=/home/sb1554/BNS
export CACTUS_NUM_PROCS=2
export CACTUS_NUM_THREADS=8
export GMON_OUT_PREFIX=gmon.out
export OMP_NUM_THREADS=8
export OMP_PLACES=cores # TODO: maybe use threads when smt is used?
# https://github.com/open-mpi/ompi/issues/4948
export OMPI_MCA_btl_vader_single_copy_mechanism=none
env | sort > /home/sb1554/BNS/simfactory/ENVIRONMENT

echo "Starting:"
export CACTUS_STARTTIME=$(date +%s)
#time srun -n ${CACTUS_NUM_PROCS} @EXECUTABLE@ -L 3 /home/sb1554/BNS/bns.par
time /home/sb1554/BNS/simfactory/bin/sim run bns --parfile /home/sb1554/BNS/bns.par --machine slurmbns
#time srun @EXECUTABLE@ -L 3 /home/sb1554/BNS/bns.par
echo "Stopping:"
date

echo "Done."
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bns.sh
Type: application/x-sh
Size: 607 bytes
Desc: not available
URL: 
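
One thing in the attached submit script can be checked mechanically. sbatch only processes #SBATCH directives that appear before the first executable command in the script, and a line such as "#BATCH --partition=main" is an ordinary comment to it. In the attached script the directives come after three export lines, so they may not be taking effect. The sketch below is a hypothetical helper (check_sbatch_header is an illustrative name, not part of simfactory or Slurm) that flags both patterns. It is only a lint aid and is not claimed to explain the MPI_Init_thread abort.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of simfactory or Slurm): scan a batch script
# for two header problems that sbatch itself does not report.
check_sbatch_header() {
    awk '
        # "#BATCH ..." is an ordinary comment as far as sbatch is concerned.
        /^#BATCH[ \t]/ {
            printf "line %d: \"#BATCH\" is a plain comment, did you mean \"#SBATCH\"?\n", NR
            next
        }
        # A directive placed after the first command is not processed.
        /^#SBATCH[ \t]/ {
            if (seen_command)
                printf "line %d: #SBATCH directive after first command (line %d) is ignored\n", NR, first_command
            next
        }
        # Blank lines and comments (including the shebang) do not end the header.
        /^[ \t]*$/ || /^[ \t]*#/ { next }
        # Anything else counts as the first executable command.
        !seen_command { seen_command = 1; first_command = NR }
    ' "$1"
}
```

Calling check_sbatch_header with the path of a batch script prints one line per finding and prints nothing when the header looks fine.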