[Users] memory leak in Carpet?
Miguel Zilhão
miguel.zilhao.nogueira at tecnico.ulisboa.pt
Thu Jul 19 05:58:06 CDT 2018
hi all,
i've noticed that my runs (using the latest ET release) with CarpetRegrid2 exhibit a significant
increase in memory usage during runtime. this seems to happen immediately after some non-trivial
regridding operation is done. the increase is steady, and at some point i run out of memory and the
simulation crashes. this is happening both on my workstation (running Ubuntu 18.04) and on our
local cluster (running Debian 9). i was wondering if someone has seen something like this?
i have not seen this happen for simulations without CarpetRegrid2. i show below some relevant
portions of the stdout file for a standard inspiral BH run (note the last column--maxrss_mb):
------------------------------------------------------------------------------------
Iteration Time | *me_per_hour | LEANBSSNMOL::conf_fac | *TISTICS::maxrss_mb
| | minimum maximum | minimum maximum
------------------------------------------------------------------------------------
0 0.000 | 0.0000000 | 0.2213400 0.9977828 | 1057 1359
4 0.025 | 1.8911444 | 0.2213388 0.9977828 | 1060 1361
8 0.050 | 2.8049414 | 0.2213330 0.9977828 | 1060 1361
12 0.075 | 3.2859229 | 0.2213195 0.9977828 | 1060 1361
16 0.100 | 3.6219375 | 0.2212959 0.9977828 | 1061 1361
20 0.125 | 3.7521230 | 0.2212596 0.9977828 | 1064 1361
24 0.150 | 3.9448186 | 0.2212081 0.9977828 | 1064 1361
28 0.175 | 4.0652624 | 0.2211358 0.9977828 | 1064 1361
32 0.200 | 4.1492619 | 0.2210378 0.9977828 | 1064 1361
36 0.225 | 4.1272708 | 0.2209119 0.9977828 | 1064 1361
40 0.250 | 4.2206476 | 0.2207447 0.9977828 | 1064 1361
44 0.275 | 4.2801401 | 0.2205343 0.9977828 | 1064 1361
48 0.300 | 4.3460198 | 0.2202766 0.9977828 | 1064 1361
52 0.325 | 4.3485616 | 0.2199651 0.9977828 | 1064 1361
56 0.350 | 4.4054911 | 0.2195938 0.9977828 | 1065 1361
60 0.375 | 4.4398502 | 0.2191577 0.9977828 | 1065 1361
64 0.400 | 4.3724216 | 0.2186524 0.9977828 | 1065 1361
(...)
INFO (QuasiLocalMeasures): Weinberg angular momentum z: 0.405200
384 2.400 | 4.2719649 | 0.3525426 0.9977828 | 1068 1379
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (Carpet): Grid structure (superregions, grid points):
[0][0][0] exterior: [0,0,0] : [165,324,165] ([166,325,166] + PADDING) 8955700
[1][0][0] exterior: [3,155,3] : [179,493,175] ([177,339,173] + PADDING) 10380519
[2][0][0] exterior: [9,558,9] : [108,738,101] ([100,181,93] + PADDING) 1683300
[3][0][0] exterior: [21,1206,21] : [128,1386,113] ([108,181,93] + PADDING) 1817964
[4][0][0] exterior: [45,2521,45] : [147,2663,117] ([103,143,73] + PADDING) 1075217
[5][0][0] exterior: [93,5111,93] : [225,5257,165] ([133,147,73] + PADDING) 1427223
[6][0][0] exterior: [242,10307,189] : [380,10445,261] ([139,139,73] + PADDING) 1410433
[7][0][0] exterior: [554,20684,381] : [692,20822,453] ([139,139,73] + PADDING) 1410433
INFO (Carpet): Grid structure (superregions, coordinates):
[0][0][0] exterior: [-4.800000000000001,-259.199999999999989,-4.800000000000001] :
[259.199999999999989,259.199999999999989,259.199999999999989] :
[1.600000000000000,1.600000000000000,1.600000000000000]
[1][0][0] exterior: [-2.400000000000000,-135.199999999999989,-2.400000000000000] :
[138.400000000000006,135.199999999999989,135.199999999999989] :
[0.800000000000000,0.800000000000000,0.800000000000000]
[2][0][0] exterior: [-1.200000000000001,-36.000000000000000,-1.200000000000001] :
[38.400000000000006,36.000000000000000,35.600000000000009] :
[0.400000000000000,0.400000000000000,0.400000000000000]
[3][0][0] exterior: [-0.600000000000001,-18.000000000000000,-0.600000000000001] :
[20.800000000000001,18.000000000000000,17.800000000000001] :
[0.200000000000000,0.200000000000000,0.200000000000000]
[4][0][0] exterior: [-0.300000000000001,-7.100000000000023,-0.300000000000001] :
[9.900000000000000,7.099999999999966,6.900000000000000] :
[0.100000000000000,0.100000000000000,0.100000000000000]
[5][0][0] exterior: [-0.150000000000000,-3.650000000000006,-0.150000000000000] :
[6.449999999999999,3.649999999999977,3.449999999999999] :
[0.050000000000000,0.050000000000000,0.050000000000000]
[6][0][0] exterior: [1.250000000000000,-1.525000000000034,-0.075000000000000] :
[4.699999999999999,1.925000000000011,1.725000000000000] :
[0.025000000000000,0.025000000000000,0.025000000000000]
[7][0][0] exterior: [2.125000000000000,-0.650000000000034,-0.037500000000001] :
[3.850000000000000,1.074999999999989,0.862500000000000] :
[0.012500000000000,0.012500000000000,0.012500000000000]
INFO (Carpet): Global grid structure statistics:
INFO (Carpet): GF: rhs: 2551k active, 3661k owned (+44%), 6190k total (+69%), 320 steps/time
INFO (Carpet): GF: vars: 161, pts: 3631M active, 4271M owned (+18%), 6261M total (+47%), 1.0 comp/proc
INFO (Carpet): GA: vars: 1044, pts: 82M active, 82M total (+0%)
INFO (Carpet): Total required memory: 50.821 GByte (for GAs and currently active GFs)
INFO (Carpet): Load balance: min avg max sdv max/avg-1
INFO (Carpet): Level 0: 25M 27M 31M 1M owned 12%
INFO (Carpet): Level 1: 31M 34M 36M 1M owned 6%
INFO (Carpet): Level 2: 5M 5M 6M 0M owned 8%
INFO (Carpet): Level 3: 5M 6M 6M 0M owned 8%
INFO (Carpet): Level 4: 3M 3M 4M 0M owned 8%
INFO (Carpet): Level 5: 4M 4M 5M 0M owned 9%
INFO (Carpet): Level 6: 4M 5M 5M 0M owned 4%
INFO (Carpet): Level 7: 4M 5M 5M 0M owned 4%
388 2.425 | 4.1930426 | 0.3508349 0.9977828 | 1149 1450
392 2.450 | 4.2022566 | 0.3491189 0.9977828 | 1149 1450
396 2.475 | 4.2091891 | 0.3471720 0.9977828 | 1149 1450
------------------------------------------------------------------------------------
Iteration Time | *me_per_hour | LEANBSSNMOL::conf_fac | *TISTICS::maxrss_mb
| | minimum maximum | minimum maximum
------------------------------------------------------------------------------------
400 2.500 | 4.2177082 | 0.3450791 0.9977828 | 1149 1450
404 2.525 | 4.2195973 | 0.3429667 0.9977828 | 1149 1450
(...)
1344 8.400 | 4.4943523 | 0.3648129 0.9977828 | 1156 1455
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 1
INFO (Carpet): Grid structure (superregions, grid points):
[0][0][0] exterior: [0,0,0] : [165,324,165] ([166,325,166] + PADDING) 8955700
[1][0][0] exterior: [3,154,3] : [178,494,175] ([176,341,173] + PADDING) 10382768
[2][0][0] exterior: [9,556,9] : [108,740,101] ([100,185,93] + PADDING) 1720500
[3][0][0] exterior: [21,1202,21] : [127,1390,113] ([107,189,93] + PADDING) 1880739
[4][0][0] exterior: [45,2512,45] : [145,2672,117] ([101,161,73] + PADDING) 1187053
[5][0][0] exterior: [93,5094,93] : [110,5135,165] ([18,42,73] + PADDING) 55188
[5][0][1] exterior: [93,5136,93] : [220,5274,165] ([128,139,73] + PADDING) 1298816
[6][0][0] exterior: [234,10342,189] : [372,10480,261] ([139,139,73] + PADDING) 1410433
[7][0][0] exterior: [537,20752,381] : [675,20890,453] ([139,139,73] + PADDING) 1410433
INFO (Carpet): Grid structure (superregions, coordinates):
[0][0][0] exterior: [-4.800000000000001,-259.199999999999989,-4.800000000000001] :
[259.199999999999989,259.199999999999989,259.199999999999989] :
[1.600000000000000,1.600000000000000,1.600000000000000]
[1][0][0] exterior: [-2.400000000000000,-136.000000000000000,-2.400000000000000] :
[137.599999999999994,136.000000000000000,135.199999999999989] :
[0.800000000000000,0.800000000000000,0.800000000000000]
[2][0][0] exterior: [-1.200000000000001,-36.800000000000011,-1.200000000000001] :
[38.400000000000006,36.800000000000011,35.600000000000009] :
[0.400000000000000,0.400000000000000,0.400000000000000]
[3][0][0] exterior: [-0.600000000000001,-18.800000000000011,-0.600000000000001] :
[20.600000000000001,18.800000000000011,17.800000000000001] :
[0.200000000000000,0.200000000000000,0.200000000000000]
[4][0][0] exterior: [-0.300000000000001,-8.000000000000000,-0.300000000000001] :
[9.699999999999999,8.000000000000000,6.900000000000000] :
[0.100000000000000,0.100000000000000,0.100000000000000]
[5][0][0] exterior: [-0.150000000000000,-4.500000000000000,-0.150000000000000] :
[0.699999999999999,-2.449999999999989,3.449999999999999] :
[0.050000000000000,0.050000000000000,0.050000000000000]
[5][0][1] exterior: [-0.150000000000000,-2.400000000000034,-0.150000000000000] :
[6.199999999999999,4.500000000000000,3.449999999999999] :
[0.050000000000000,0.050000000000000,0.050000000000000]
[6][0][0] exterior: [1.050000000000000,-0.650000000000034,-0.075000000000000] :
[4.500000000000000,2.800000000000011,1.725000000000000] :
[0.025000000000000,0.025000000000000,0.025000000000000]
[7][0][0] exterior: [1.912500000000000,0.199999999999989,-0.037500000000001] :
[3.637499999999999,1.925000000000011,0.862500000000000] :
[0.012500000000000,0.012500000000000,0.012500000000000]
INFO (Carpet): Global grid structure statistics:
INFO (Carpet): GF: rhs: 2544k active, 3659k owned (+44%), 6187k total (+69%), 320 steps/time
INFO (Carpet): GF: vars: 161, pts: 3644M active, 4290M owned (+18%), 6288M total (+47%), 1.0 comp/proc
INFO (Carpet): GA: vars: 1044, pts: 82M active, 82M total (+0%)
INFO (Carpet): Total required memory: 51.040 GByte (for GAs and currently active GFs)
INFO (Carpet): Load balance: min avg max sdv max/avg-1
INFO (Carpet): Level 0: 25M 27M 31M 1M owned 12%
INFO (Carpet): Level 1: 31M 34M 35M 1M owned 4%
INFO (Carpet): Level 2: 5M 5M 6M 0M owned 6%
INFO (Carpet): Level 3: 5M 6M 6M 0M owned 8%
INFO (Carpet): Level 4: 3M 4M 4M 0M owned 10%
INFO (Carpet): Level 5: 3M 4M 5M 0M owned 11%
INFO (Carpet): Level 6: 4M 5M 5M 0M owned 4%
INFO (Carpet): Level 7: 4M 5M 5M 0M owned 4%
1348 8.425 | 4.4749366 | 0.3620783 0.9977828 | 1417 1726
1352 8.450 | 4.4793418 | 0.3593239 0.9977828 | 1417 1727
1356 8.475 | 4.4831132 | 0.3565493 0.9977828 | 1417 1727
------------------------------------------------------------------------------------
Iteration Time | *me_per_hour | LEANBSSNMOL::conf_fac | *TISTICS::maxrss_mb
| | minimum maximum | minimum maximum
------------------------------------------------------------------------------------
1360 8.500 | 4.4873776 | 0.3537546 0.9977828 | 1417 1727
1364 8.525 | 4.4896835 | 0.3509394 0.9977828 | 1417 1727
and the pattern continues, such that at iteration 9000 we're at maximum(maxrss_mb) = 2722... is this
normal, or expected? it becomes very inconvenient since, for some high-resolution runs, i
inevitably run out of memory.
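
to make the jumps easier to spot in a long stdout file, here is a minimal sketch (plain Python, not part of the ET) that pulls the maxrss_mb columns out of rows like the ones above and reports where the maximum grows between consecutive rows. the function names, the regular expression and the 50 MB threshold are all assumptions made for illustration; the sample rows are copied from the output quoted above.

```python
import re

# Matches a scalar-output row such as
#   "388 2.425 | 4.1930426 | 0.3508349 0.9977828 | 1149 1450"
# (iteration, time | time_per_hour | conf_fac min max | maxrss_mb min max).
# The pattern is an assumption based on the rows quoted in this message.
ROW = re.compile(
    r"^\s*(\d+)\s+([\d.]+)\s*\|\s*[\d.]+\s*\|"
    r"\s*[\d.eE+-]+\s+[\d.eE+-]+\s*\|\s*(\d+)\s+(\d+)\s*$"
)


def parse_maxrss(lines):
    """Return (iteration, min_maxrss_mb, max_maxrss_mb) for every data row."""
    rows = []
    for line in lines:
        m = ROW.match(line)
        if m:  # INFO lines and table headers do not match and are skipped
            rows.append((int(m.group(1)), int(m.group(3)), int(m.group(4))))
    return rows


def jumps(rows, threshold_mb=50):
    """Return (prev_iter, iter, delta_mb) wherever the maximum maxrss_mb
    grows by more than threshold_mb between consecutive rows.
    The 50 MB default is an arbitrary, hypothetical cutoff."""
    out = []
    for (i0, _, hi0), (i1, _, hi1) in zip(rows, rows[1:]):
        if hi1 - hi0 > threshold_mb:
            out.append((i0, i1, hi1 - hi0))
    return out


# A few rows copied from the stdout excerpt above.
sample = """\
384 2.400 | 4.2719649 | 0.3525426 0.9977828 | 1068 1379
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
388 2.425 | 4.1930426 | 0.3508349 0.9977828 | 1149 1450
1344 8.400 | 4.4943523 | 0.3648129 0.9977828 | 1156 1455
INFO (CarpetRegrid2): Enforcing grid structure properties, iteration 0
1348 8.425 | 4.4749366 | 0.3620783 0.9977828 | 1417 1726
""".splitlines()

rows = parse_maxrss(sample)
for prev_it, it, delta in jumps(rows):
    print(f"iterations {prev_it} -> {it}: max maxrss_mb grew by {delta} MB")
```

on the rows above this flags the two regridding steps (+71 MB between iterations 384 and 388, +271 MB between 1344 and 1348), while the long stretch in between only adds 5 MB.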
thanks,
Miguel