[Users] reported vs real memory usage + CarpetRegrid2?
Ian Hinder
ian.hinder at aei.mpg.de
Wed Sep 11 03:23:38 CDT 2013
On 10 Sep 2013, at 18:34, "Kelly, Bernard J. (GSFC-660.0)[UNIVERSITY OF MARYLAND BALTIMORE COUNTY]" <bernard.j.kelly at nasa.gov> wrote:
> Hi.
>
> I'm running a vacuum BHB evolution with a larger-than-usual set of inner
> refinement regions (levels 8, 9, 10, 11 have radii of 12M, 8M, 6M, and 4M,
> respectively) and consequently the memory usage is a bit higher than
> normal. But I'm finding that it jumps up almost 100% after the first
> regridding, and stays there.
>
> My diagnostic for this is the result of top on each of the nodes (via
> "qtop.pl", a script on the machine I'm using). Sampled before the first
> regridding, it shows each core using ~ 1.5% of the node's total memory,
> while after regridding, it's more like 2.9% (these are Sandy Bridge nodes,
> with 16 available cores).
>
> However, the periodic output message from Carpet reporting the Grid
> structure etc. shows regions only marginally larger than before, and ---
> crucially for me --- has a marginally larger "Total required memory" (164
> GB -> 167 GB, for instance).
>
> So (a) what's using the extra memory, and (b) why isn't Carpet reporting
> it? How seriously should I be taking that "Total required memory" message?
>
> I see this with executables generated from both the last (ET_2012_11) and
> current (ET_2013_05) stable releases, BTW. I'm attaching the current
> parameter file and SCROUT from a run.
I know there was a problem related to drastically increased memory usage, but I thought that this was introduced to the trunk after ET_2013_05, and that Erik had already fixed it. There was another problem in (I believe) ET_2012_11 where Carpet was always collapsing multiple grids on a refinement level into the smallest enclosing box, leading to huge memory usage, but that was also fixed, and I believe backported to ET_2012_11.
Have you tried using the SystemStatistics thorn to monitor memory usage? This should be easier than using top on the nodes.
How is the grid distributed among the nodes? Even if the total required memory is roughly constant, it's possible that the grid is distributed unevenly between nodes. Is every node showing the increased memory usage?
Given that you are still using a very small amount of memory, it's possible that Carpet is just "overallocating" on the first regridding as it anticipates that it might need more memory later, and memory allocation might be expensive. The amount of overallocation is presumably small in comparison to the total available memory, but might be of the order of 1%, as you are seeing.
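To put the two sets of numbers on the same footing, here is a small arithmetic sketch in Python. It uses only the figures quoted above (1.5% and 2.9% per core, 16 cores per node, 164 GB and 167 GB from Carpet). The assumption that every core runs exactly one process, and all variable and function names, are illustrative and not taken from the run itself.

```python
# Figures quoted in the thread. One process per core is an ASSUMPTION.
CORES_PER_NODE = 16
PCT_PER_PROCESS_BEFORE = 1.5  # % of node memory per process, before regridding
PCT_PER_PROCESS_AFTER = 2.9   # % of node memory per process, after regridding
CARPET_GB_BEFORE = 164.0      # Carpet "Total required memory" before
CARPET_GB_AFTER = 167.0       # Carpet "Total required memory" after


def node_percent(pct_per_process, processes_per_node=CORES_PER_NODE):
    """Percentage of one node's memory used by all of its processes."""
    return pct_per_process * processes_per_node


def relative_increase(before, after):
    """Fractional growth from `before` to `after`."""
    return (after - before) / before


node_before = node_percent(PCT_PER_PROCESS_BEFORE)
node_after = node_percent(PCT_PER_PROCESS_AFTER)
top_growth = relative_increase(PCT_PER_PROCESS_BEFORE, PCT_PER_PROCESS_AFTER)
carpet_growth = relative_increase(CARPET_GB_BEFORE, CARPET_GB_AFTER)

print(f"node memory in use: {node_before:.1f}% -> {node_after:.1f}%")
print(f"growth seen by top:       {100 * top_growth:.1f}%")
print(f"growth reported by Carpet: {100 * carpet_growth:.1f}%")
```

Under these assumptions each node goes from about a quarter to a little under half of its memory, while Carpet's own estimate grows by only about 2%.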
I wouldn't worry about this small amount of increased memory usage unless you can reproduce the problem on a more heavily-loaded system. From your description, I suspect you are not using OpenMP. Why is that? Using pure MPI leads to an unnecessary memory overhead.
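One common source of such overhead in a domain-decomposed code is ghost zones: every process pads its piece of the grid, so more processes means more padding. The following Python sketch is a toy estimate only, not Carpet's actual accounting. The cubic blocks, the ghost width of 3, the padding on every face (including outer boundaries), and all function names are assumptions made for illustration.

```python
def padded_points(n, ghost):
    """Points stored for a cubic block of n^3 interior points
    padded with `ghost` points on every face (toy model)."""
    return (n + 2 * ghost) ** 3


def storage_ratio(n_total, splits_per_dim, ghost):
    """Stored points divided by interior points when a cube of
    n_total^3 points is cut into splits_per_dim^3 equal blocks."""
    if n_total % splits_per_dim != 0:
        raise ValueError("n_total must be divisible by splits_per_dim")
    n = n_total // splits_per_dim
    blocks = splits_per_dim ** 3
    return blocks * padded_points(n, ghost) / n_total ** 3


if __name__ == "__main__":
    # Hypothetical 64^3 grid with 3 ghost points, cut into 1, 8 and 64 blocks.
    for splits in (1, 2, 4):
        ratio = storage_ratio(64, splits, ghost=3)
        print(f"{splits ** 3:3d} blocks: storage ratio {ratio:.2f}")
```

In this model the ratio rises as the same grid is cut into more blocks, which is the sense in which running fewer MPI processes with several OpenMP threads each can save memory.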
--
Ian Hinder
http://numrel.aei.mpg.de/people/hinder