[Users] Setting up ETK on an AMD Rome cluster

Roland Haas rhaas at illinois.edu
Thu Mar 16 09:35:23 CDT 2023


Hello Vaishak,

Thanks for the extra information.

I will try to see whether I can figure anything out on SDSC's Anvil,
which I think is also AMD Rome and where I have an allocation.

Yours,
Roland

> Respected Sir,
> 
> I succeeded in running ETK on the machine. I conducted a few experiments
> and I found the following:
> 
> 1. Intrinsic vectorization has to be disabled. gcc's auto vectorization
> with avx2 enabled then leads to successful evolution (at slower speeds).
> 2. Turning on intrinsic vectorization + disabling avx in gcc works for a
> short period of time. The evolution then stops due to punctures going to
> inf.
> 3. Intrinsic vectorization + gcc avx does not work due to seg faults at
> the testing vectorization stage.
> 
> I have read the page at
> 
> https://docs.einsteintoolkit.org/et-docs/Vectorisation
> 
> Is this outdated? Does intrinsic vectorization add the capability to use
> 256-bit wide data types on avx2 capable machines?
> 
> 
> Thanks and regards
> 
> On Thu, Mar 2, 2023 at 11:24 PM Roland Haas <rhaas at illinois.edu> wrote:
> 
> > Hello Vaishak,
> >
> > Sorry for the delay, and thank you for including the various log files.
> >
> > I have been running on a new AMD based system (NCSA Delta, Milan, not
> > Rome) during the last week (with Vectors active), though it is a
> > slightly older ET code (no changes to Vectors though). I also ran on
> > SDSC Expanse (Rome, Epyc 7742) for the ET testsuite for the 2022_11
> > release (http://einsteintoolkit.org/testsuite_results/index.php)
> > without SEGFAULT failures.
> >
> > This unfortunately makes debugging the issue that you are facing harder.
> >
> > One (possible) issue could be related to using -march=native in your
> > compilation flags. Since this instructs GCC to compile for the CPU
> > architecture it finds itself running on, I would double check that
> > indeed the login nodes on sonic use the same CPU as the compute nodes.
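
A minimal sketch of the check suggested above, assuming Linux nodes where
/proc/cpuinfo contains a "flags" line. The function names, the list of
features compared, and the sample flag strings are hypothetical and chosen
only for illustration; they are not taken from the logs in this thread.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # All cores normally report the same flags, so the first hit is enough.
            return set(value.split())
    return set()


def missing_features(build_flags, run_flags,
                     wanted=("avx", "avx2", "fma", "avx512f")):
    """List wanted features present on the build host but absent on the run host."""
    return [f for f in wanted if f in build_flags and f not in run_flags]


# Hypothetical sample data standing in for the two nodes' /proc/cpuinfo.
login_text = "processor : 0\nflags : fpu sse2 avx avx2 fma avx512f\n"
compute_text = "processor : 0\nflags : fpu sse2 avx avx2 fma\n"

print(missing_features(cpu_flags(login_text), cpu_flags(compute_text)))
```

To use it on a real system, one could save /proc/cpuinfo on a login node
and on a compute node (for example from inside a batch job) and pass the
contents of the two files to cpu_flags(). A non-empty result would suggest
that code built with -march=native on the login node may use instructions
the compute nodes do not support.
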
> >
> > Yours,
> > Roland
> >  
> > > Dear All,
> > >
> > > Greetings from India. I am trying to get the ETK working on an AMD Rome
> > > powered supercomputer at ICTS, India. I am working with gcc (11.1.0,
> > > 12.2.0) and openmpi. The compilation is successful but every one of the
> > > tests and runs fails due to seg faults at the vectorization stage. On
> > > recompiling the toolkit without vectorization, the tests run (except for
> > > one test of ML_BSSN which fails due to a relative error ~ 1e-14). I am
> > > attaching the backtrace (from the gallery BBH run), make log (with
> > > vectorization) and the optionlist herewith.
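
For scale, a relative error of about 1e-14 corresponds to a few tens of
double-precision rounding units. The sketch below only illustrates that
arithmetic; the numbers are hypothetical and are not taken from the
ML_BSSN test output, and relative_error() is a helper defined here, not
part of the toolkit.

```python
import sys


def relative_error(reference, value):
    """Return |value - reference| / |reference| for a non-zero reference."""
    return abs(value - reference) / abs(reference)


# Machine epsilon for IEEE double precision, roughly 2.22e-16.
EPS = sys.float_info.epsilon

# Hypothetical values differing at the 1e-14 level.
reference = 1.0
value = 1.0 + 1.0e-14

err = relative_error(reference, value)
print(f"relative error : {err:.3e}")
print(f"in units of eps: {err / EPS:.1f}")
```
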
> > >
> > > Requesting help!
> > >
> > >
> > > With regards
> > >  
> >
> >
> >
> > --
> > My email is as private as my paper mail. I therefore support encrypting
> > and signing email messages. Get my PGP key from http://keys.gnupg.net .
> >  
> 
> 


-- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu .