GC page profiler #52567

d-netto · 2023-12-17T18:13:23Z

Piggybacks in the sweeping phase of the GC to pretty-print a JSON representation of every page in the pool allocator.

Usage:

julia> using Profile

julia> Profile.take_page_profile("/tmp/test-profile.out")
"/tmp/test-profile.out"

Output (truncated to one page & after pretty printing):

{
	"address": "0x109dd0000",
	"object_size": 400,
	"objects": [
		"Task",
		"Task",
		"Task",
		"garbage",
		"GenericMemory",
		"garbage",
		"GenericMemory",
		"GenericMemory",
		"garbage",
		"GenericMemory",
		"GenericMemory",
		"Task",
		"Task",
		"Task",
		"garbage",
		"garbage",
		"garbage",
		"String",
		"garbage",
		"garbage",
		"String",
		"GenericMemory",
		"GenericMemory",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
		"garbage",
	]
}

This proved particularly useful for us when studying fragmentation in some of our workloads, though this profiler may possibly have some repeated functionality compared to the other profilers we already have in stdlib.

d-netto · 2023-12-17T22:46:46Z

In particular, we've been using this profile to compute page utilization per object type, as done in https://github.com/d-netto/page-profile-parser:

profile = PageProfile(947 pages, 14.80 MiB)
  SimpleVector: 4.74% utilization 395 pages (6.17 MiB)
  GenericMemory: 35.51% utilization 303 pages (4.73 MiB)
  DataType: 27.53% utilization 94 pages (1.47 MiB)
  String: 16.89% utilization 75 pages (1.17 MiB)
  MethodInstance: 40.96% utilization 67 pages (1.05 MiB)
  TypeMapEntry: 77.40% utilization 56 pages (0.88 MiB)
  Array: 34.06% utilization 52 pages (0.81 MiB)
  TypeofVararg: 0.27% utilization 21 pages (0.33 MiB)
  TypeMapLevel: 0.99% utilization 19 pages (0.30 MiB)
  IntrusiveLinkedList: 1.18% utilization 15 pages (0.23 MiB)
...

d-netto · 2023-12-21T23:43:42Z

Instrumentation above doesn't seem to introduce significant regressions in GC time:

before:

category = "TimeZones"
bench = "TimeZones.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │        572 │     111 │       100 │         11 │          111 │                 9 │     4645 │         19 │
│  median │        585 │     118 │       105 │         13 │          118 │                10 │     4648 │         20 │
│ maximum │        601 │     128 │       111 │         20 │          128 │                12 │     4651 │         21 │
│   stdev │         11 │       7 │         4 │          3 │            7 │                 1 │        2 │          1 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "append"
bench = "append.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │        866 │     246 │       158 │         88 │           57 │                84 │     4031 │         28 │
│  median │        872 │     247 │       158 │         90 │           58 │                92 │     4031 │         28 │
│ maximum │        906 │     262 │       165 │         97 │           61 │               103 │     4031 │         29 │
│   stdev │         16 │       7 │         3 │          4 │            2 │                 8 │        0 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "bigint"
bench = "pollard.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │        490 │     159 │       142 │         17 │           50 │                42 │      134 │         32 │
│  median │        511 │     170 │       151 │         18 │           52 │                58 │      135 │         33 │
│ maximum │        558 │     186 │       166 │         20 │           58 │                65 │      141 │         33 │
│   stdev │         27 │      11 │        10 │          1 │            3 │                10 │        3 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "linked"
bench = "list.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │       2808 │    2153 │      1955 │        199 │          906 │                47 │     2962 │         76 │
│  median │       2834 │    2164 │      1965 │        200 │          911 │                50 │     2962 │         77 │
│ maximum │       2923 │    2234 │      2011 │        223 │          931 │                82 │     2964 │         77 │
│   stdev │         49 │      36 │        25 │         11 │           10 │                16 │        1 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
bench = "tree.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │       2181 │     113 │       109 │          3 │           74 │                12 │      100 │          5 │
│  median │       2808 │     142 │       139 │          3 │          102 │                14 │      101 │          5 │
│ maximum │       2919 │     167 │       163 │          4 │          128 │                16 │      103 │          6 │
│   stdev │        346 │      25 │        25 │          0 │           25 │                 2 │        1 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "obj_arrays"
bench = "many_refs.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │       1173 │     844 │       787 │         57 │          277 │                 9 │      869 │         72 │
│  median │       1193 │     855 │       795 │         57 │          282 │                10 │      870 │         72 │
│ maximum │       1196 │     863 │       805 │         60 │          286 │                25 │      872 │         72 │
│   stdev │         10 │       7 │         7 │          1 │            4 │                 7 │        1 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
bench = "single_ref.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │        465 │     335 │       325 │         10 │          119 │                11 │      870 │         72 │
│  median │        466 │     338 │       328 │         10 │          122 │                16 │      873 │         73 │
│ maximum │        483 │     352 │       341 │         11 │          129 │                21 │      873 │         73 │
│   stdev │          8 │       8 │         7 │          0 │            4 │                 4 │        1 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "strings"
bench = "strings.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │      14304 │    1880 │      1681 │        199 │          587 │                75 │      782 │         13 │
│  median │      14966 │    2008 │      1796 │        212 │          595 │                84 │      801 │         13 │
│ maximum │      15996 │    2119 │      1894 │        225 │          647 │                91 │      814 │         14 │
│   stdev │        701 │      95 │        85 │         11 │           29 │                 6 │       13 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘

after (edited to include the latest commit):

category = "TimeZones"
bench = "TimeZones.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │        471 │       0 │         0 │          0 │           51 │                 0 │     4647 │          0 │
│  median │        490 │       0 │         0 │          0 │           54 │                 0 │     4649 │          0 │
│ maximum │        594 │     128 │       116 │         12 │          128 │                10 │     4649 │         22 │
│   stdev │         62 │      68 │        62 │          7 │           40 │                 5 │        1 │         12 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "append"
bench = "append.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │        878 │     244 │       156 │         85 │           56 │                74 │     4031 │         28 │
│  median │        889 │     255 │       162 │         89 │           57 │                95 │     4031 │         28 │
│ maximum │        927 │     262 │       170 │        100 │           61 │               109 │     4031 │         29 │
│   stdev │         21 │       7 │         5 │          6 │            2 │                13 │        0 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "bigint"
bench = "pollard.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │        482 │     156 │       139 │         16 │           47 │                37 │      132 │         32 │
│  median │        485 │     157 │       141 │         16 │           49 │                42 │      140 │         32 │
│ maximum │        515 │     163 │       146 │         18 │           50 │                65 │      140 │         33 │
│   stdev │         14 │       3 │         2 │          1 │            2 │                11 │        4 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "linked"
bench = "list.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │       2862 │    2187 │      1981 │        206 │          910 │                56 │     2962 │         76 │
│  median │       2895 │    2219 │      2000 │        210 │          921 │                61 │     2966 │         77 │
│ maximum │       2911 │    2232 │      2021 │        218 │          924 │                69 │     2967 │         77 │
│   stdev │         23 │      20 │        19 │          5 │            6 │                 5 │        2 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
bench = "tree.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │       2342 │     113 │       109 │          3 │           74 │                10 │      101 │          5 │
│  median │       2480 │     131 │       127 │          3 │           92 │                15 │      102 │          5 │
│ maximum │       2617 │     158 │       154 │          4 │          119 │                18 │      104 │          6 │
│   stdev │        100 │      18 │        18 │          0 │           20 │                 3 │        1 │          1 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "obj_arrays"
bench = "many_refs.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │       1189 │     854 │       797 │         57 │          278 │                11 │      870 │         71 │
│  median │       1200 │     860 │       802 │         57 │          279 │                14 │      871 │         72 │
│ maximum │       1221 │     876 │       818 │         58 │          292 │                22 │      874 │         72 │
│   stdev │         12 │      10 │        10 │          0 │            6 │                 4 │        2 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
bench = "single_ref.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │        469 │     332 │       322 │         10 │          118 │                13 │      869 │         70 │
│  median │        476 │     338 │       328 │         10 │          121 │                16 │      873 │         71 │
│ maximum │        481 │     340 │       330 │         11 │          121 │                19 │      873 │         71 │
│   stdev │          4 │       3 │         3 │          0 │            1 │                 3 │        2 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘
category = "strings"
bench = "strings.jl"
┌─────────┬────────────┬─────────┬───────────┬────────────┬──────────────┬───────────────────┬──────────┬────────────┐
│         │ total time │ gc time │ mark time │ sweep time │ max GC pause │ time to safepoint │ max heap │ percent gc │
│         │         ms │      ms │        ms │         ms │           ms │                us │       MB │          % │
├─────────┼────────────┼─────────┼───────────┼────────────┼──────────────┼───────────────────┼──────────┼────────────┤
│ minimum │      14860 │    1971 │      1758 │        210 │          600 │                70 │      794 │         13 │
│  median │      14900 │    1984 │      1774 │        213 │          621 │                88 │      799 │         13 │
│ maximum │      15195 │    2099 │      1878 │        221 │          623 │                93 │      814 │         14 │
│   stdev │        151 │      55 │        51 │          5 │           10 │                 9 │        8 │          0 │
└─────────┴────────────┴─────────┴───────────┴────────────┴──────────────┴───────────────────┴──────────┴────────────┘

vchuravy

I think this is fine generally, but I do wonder if we really need to run GC to gather these.
In some sense we could just scan the pages without running GC first.

d-netto · 2023-12-22T16:46:18Z

I think this is fine generally, but I do wonder if we really need to run GC to gather these.
In some sense we could just scan the pages without running GC first.

Examining the pages while the mutators are running is likely not thread-safe.

In principle we could use an alternative stop-the-world mechanism to make sure all threads are in a safe state while we're examining their pages, but piggybacking on the GC stop-the-world phase to do so is more convenient from an implementation perspective.

kpamnany

Looks fine to me and the benchmarks suggest there's no problem, but I'd still prefer to have all this under a build-time setting.

vtjnash · 2023-12-22T19:22:29Z

I already added a stop-a-thread mechanism recently (piggy backing on the safepoint code as you guessed), so you can do that part in a simple round-robin style if you wanted

vchuravy · 2024-02-09T15:22:45Z

Would have been great to also add docs

NHDaly · 2024-04-16T04:03:59Z

@d-netto can you please come back and add documentation for this? I agree w/ Valentin: A feature is not done until it is well tested and well documented.

(IMO The best approach is to merge in smaller changes at a time, with documentation and tests coming along with each part, so that even if you run out of time and have to set it down, at least what you've merged is high-quality.)

d-netto · 2024-04-22T13:10:53Z

Will add some docs once #53256 gets merged.

d-netto added GC Garbage collector profiler labels Dec 17, 2023

d-netto force-pushed the dcn-page-profile branch 2 times, most recently from 7b153c0 to 01465a9 Compare December 17, 2023 18:26

d-netto requested review from NHDaly, kpamnany and vchuravy December 17, 2023 18:32

d-netto force-pushed the dcn-page-profile branch 8 times, most recently from 5214188 to 4c78f4f Compare December 17, 2023 22:03

d-netto mentioned this pull request Dec 17, 2023

page profile RelationalAI/julia#123

Merged

3 tasks

d-netto force-pushed the dcn-page-profile branch 6 times, most recently from 195f86b to 6393ca1 Compare December 21, 2023 23:39

d-netto force-pushed the dcn-page-profile branch 6 times, most recently from 417ee2b to 81bdfa5 Compare December 22, 2023 01:52

d-netto force-pushed the dcn-page-profile branch 2 times, most recently from deab1fa to 2deb6cf Compare December 22, 2023 02:18

vchuravy reviewed Dec 22, 2023

View reviewed changes

d-netto force-pushed the dcn-page-profile branch from 2deb6cf to 81917c9 Compare December 22, 2023 15:10

d-netto force-pushed the dcn-page-profile branch from 81917c9 to aca69a7 Compare December 22, 2023 18:20

kpamnany approved these changes Dec 22, 2023

View reviewed changes

page profile

dd2ebef

d-netto force-pushed the dcn-page-profile branch from aca69a7 to dd2ebef Compare December 22, 2023 20:22

d-netto merged commit 4975a78 into master Dec 23, 2023

d-netto deleted the dcn-page-profile branch December 23, 2023 00:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

GC page profiler #52567

GC page profiler #52567

Uh oh!

d-netto commented Dec 17, 2023 •

edited

Loading

Uh oh!

d-netto commented Dec 17, 2023

Uh oh!

d-netto commented Dec 21, 2023 •

edited

Loading

Uh oh!

vchuravy left a comment

Uh oh!

d-netto commented Dec 22, 2023

Uh oh!

kpamnany left a comment

Uh oh!

vtjnash commented Dec 22, 2023

Uh oh!

vchuravy commented Feb 9, 2024

Uh oh!

NHDaly commented Apr 16, 2024 •

edited

Loading

Uh oh!

d-netto commented Apr 22, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Uh oh!

GC page profiler #52567

GC page profiler #52567

Uh oh!

Conversation

d-netto commented Dec 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

d-netto commented Dec 17, 2023

Uh oh!

d-netto commented Dec 21, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vchuravy left a comment

Choose a reason for hiding this comment

Uh oh!

d-netto commented Dec 22, 2023

Uh oh!

kpamnany left a comment

Choose a reason for hiding this comment

Uh oh!

vtjnash commented Dec 22, 2023

Uh oh!

vchuravy commented Feb 9, 2024

Uh oh!

NHDaly commented Apr 16, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

d-netto commented Apr 22, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

d-netto commented Dec 17, 2023 •

edited

Loading

d-netto commented Dec 21, 2023 •

edited

Loading

NHDaly commented Apr 16, 2024 •

edited

Loading