[Users] Benchmarking results for McLachlan rewrite

Erik Schnetter schnetter at cct.lsu.edu
Wed Jul 8 08:14:42 CDT 2015


I added a second benchmark, using a Thornburg04 patch system, 8th-order
finite differencing, and 4th-order patch interpolation. The results are:

original: 8.53935e-06 sec per grid point

rewrite:  8.55188e-06 sec per grid point


This time I used 1 OpenMP thread per MPI process, since that was most
efficient for both versions. Most of the time is spent in inter-patch
interpolation, which is much more expensive here than in a "regular"
production run, since this benchmark runs on a single node and hence with
very small grids.
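
For reference, here is a minimal sketch of the kind of settings such a
benchmark would use. The thorn and parameter names below come from the
Llama/Coordinates and McLachlan thorns; treat the exact names and values as
illustrative rather than as the actual benchmark parameter file:

  # Illustrative sketch only -- not the benchmark parameter file
  CartGrid3D::type                 = "multipatch"
  Carpet::domain_from_multipatch   = yes
  Coordinates::coordinate_system   = "Thornburg04"  # central cube + 6 angular patches
  Coordinates::patch_boundary_size = 5              # ghost zones wide enough for 8th-order stencils
  ML_BSSN::fdOrder                 = 8              # 8th-order finite differencing
  Interpolate::interpolator_order  = 4              # 4th-order inter-patch interpolation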


With these numbers under our belt, can we merge the rewrite branch?


-erik



On Sat, Jul 4, 2015 at 5:19 PM, Ian Hinder <ian.hinder at aei.mpg.de> wrote:

> hi Erik,
>
> You could try the ones at
>
>
> https://bitbucket.org/ianhinder/cactusbench/src/faea4e13ed4232968e81edd1bbc80519198fe2b2/examples/ML_BSSN_Test/benchmark/?at=master
>
> I haven't updated them in a while, but hopefully the ET is sufficiently
> backward compatible for them to still work.
>
> --
> Ian Hinder
> http://members.aei.mpg.de/ianhin
>
> On 4 Jul 2015, at 17:04, Erik Schnetter <schnetter at cct.lsu.edu> wrote:
>
> On Sat, Jul 4, 2015 at 10:21 AM, Ian Hinder <ian.hinder at aei.mpg.de> wrote:
>
>>
>> On 3 Jul 2015, at 22:38, Erik Schnetter <schnetter at cct.lsu.edu> wrote:
>>
>> I ran the Simfactory benchmark for ML_BSSN on both the current version
>> and the "rewrite" branch to see whether this branch is ready for production
>> use. I ran this benchmark on a single node of Shelob at LSU. In both cases,
>> using 2 OpenMP threads and 8 MPI processes per node was fastest, so I am
>> reporting these results below. Since I was interested in the performance of
>> McLachlan, this is a unigrid vacuum benchmark using fourth order
>> differencing.
>>
>> One noteworthy difference is that dissipation as implemented in the
>> "rewrite" branch is finally approximately as fast as thorn Dissipation, and
>> I have thus used this option for the "rewrite" branch.
>>
>> Here are the high-level results:
>>
>> current: 3.03136e-06 sec per grid point
>> rewrite: 2.85734e-06 sec per grid point
>>
>> That is, the rewrite branch is about 5% faster.
>>
>>
>> Hi Erik,
>>
>> That is very reassuring!  However, for production use, I would be more
>> interested in 6th or 8th order finite differencing (where the advection
>> stencils become very large), and with Jacobians.  If 8th order with
>> Jacobians is at least a similar speed with the rewrite branch, then I would
>> be happy with switching.
>>
>
> Ian
>
> Do you want to suggest a particular benchmark parameter file?
>
> -erik
>
> --
> Erik Schnetter <schnetter at cct.lsu.edu>
> http://www.perimeterinstitute.ca/personal/eschnetter/
>
>


-- 
Erik Schnetter <schnetter at cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/