[Users] Possible performance issue

Vaishak P vaishak at iucaa.in
Wed Oct 9 07:09:47 CDT 2019


Dear Sir,

I am surprised by the difference between the results of -O3 and -Ofast!

1. In the link (
https://docs.einsteintoolkit.org/et-docs/Configuring_a_new_machine), I see
that not only the optimization flags but also CFLAGS, CXXFLAGS, and
FPPFLAGS are set (to -g3 -march=native ...). Would it help to change these
too?

2. Also, in CFLAGS and CXXFLAGS, "-std=gnu99" and "-std=gnu++0x" are used
(-std=gnu++11 in mine). Other alternatives I am aware of are "c99" and
"c++11". What are good, safe choices for these, keeping in mind that I am
using Intel libraries?

3. I am leaving CPP_OPTIMIZE_FLAGS blank. In the link (Configuring a new
machine), they use "-DKRANC_VECTORS". Should I consider this?
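To make the question concrete, here is a sketch of the kind of option-list
lines I am asking about, in the style of the Stampede2 example. The exact
flag values below are illustrative assumptions from my local setup, not
verified recommendations for Intel libraries:

```
# Illustrative Cactus option-list fragment (sketch only).
# The -std and -march values are examples, not recommendations.
CFLAGS   = -g3 -march=native -std=gnu99
CXXFLAGS = -g3 -march=native -std=gnu++11
F90FLAGS = -g3 -march=native

C_OPTIMISE_FLAGS   = -O3 -march=native
CXX_OPTIMISE_FLAGS = -O3 -march=native
F90_OPTIMISE_FLAGS = -O3 -march=native

CPP_OPTIMIZE_FLAGS =
```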



Thank you,

Yours sincerely,
Vaishak







On Tue, Oct 8, 2019 at 11:21 PM Haas, Roland <rhaas at illinois.edu> wrote:

> Hello Vaishak,
>
> > I have made the changes as suggested. In fact, I compiled using an
> > Intel MPI (which I compiled myself locally) using the option list of
> > the Stampede2 cluster as suggested by you, without OpenMP and with
> > appropriate library paths. I am glad that the speed has improved. I am
> > now getting around 25 physical units per hour instead of 1.5 for a
> > simulation running on 128 MPI processes.
> Glad to hear that.
>
> > The optimization parameters I am using are the same as on the
> > Stampede2 cluster (-Ofast -march=native) and not "-O3 -march=native".
> > Would it make any difference?
> -Ofast can be faster than -O3 (and likely is), at the cost of reducing
> gcc's adherence to the IEEE floating-point standard for C/C++.
> Basically, it allows gcc to perform value-unsafe optimizations (see
> https://gcc.gnu.org/wiki/FloatingPointMath), e.g. turning a+(b+c) into
> (a+b)+c or assuming that no infinities or NaNs will occur.
>
> These optimizations are often fine (and the Intel compiler performs a
> subset of these by default) but can occasionally (see
>
> https://bitbucket.org/einsteintoolkit/tickets/issues/2119/the-binary-neutron-star-gallery-example#comment-54050358)
> lead to unexpected results.
>
> I would think you should be fine using them, though you may want to run
> at least one simulation with both options and compare the results.
>
> > I have not tried these changes on mpich-3.3.1 or openmpi yet.
> If Intel MPI works for you, then using it is fine.
>
> > Also, where can I find more information about these optimization
> > parameters?
> The gcc wiki link I provided above has a discussion of the
> floating-point optimization parameters the compiler offers. There is
> also https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html, which
> goes into great detail.
>
> If you need the last bit of performance, you can also set NDEBUG=1
> (which disables assert() statements) and CARPET_OPTIMIZE=1 in your
> CPPFLAGS, which will skip some internal consistency checks in Carpet,
> and you can try turning off Carpet's poisoning of uninitialized data
> (see the poison parameters in Carpet's and CarpetLib's param.ccl files).
>
> For options that you can set in the option list, I would have a look
> at:
> https://docs.einsteintoolkit.org/et-docs/Configuring_a_new_machine and
> the references therein.
>
> > Thank you very much for your time. This was really helpful!!
> No problem.
>
> Yours,
> Roland
>
> --
> My email is as private as my paper mail. I therefore support encrypting
> and signing email messages. Get my PGP key from http://pgp.mit.edu .
>


-- 
Regards,
Vaishak P

PhD Scholar,
Shyama Prasad Mukherjee Fellow
Inter-University Center for Astronomy and Astrophysics (IUCAA)
Pune, India