|
| 1 | +# Instrumenting Julia with DTrace, and bpftrace |
| 2 | + |
| 3 | +DTrace and bpftrace are tools that enable lightweight instrumentation of processes. |
| 4 | +You can turn the instrumentation on and off while the process is running, |
| 5 | +and with instrumentation off the overhead is minimal. |
| 6 | + |
| 7 | +!!! compat "Julia 1.8" |
| 8 | + Support for probes was added in Julia 1.8 |
| 9 | + |
| 10 | +!!! note |
| 11 | + This documentation has been written from a Linux perspective, most of this |
| 12 | + should hold on Mac OS/Darwin and FreeBSD. |
| 13 | + |
| 14 | +## Enabling support |
| 15 | + |
| 16 | +On Linux install the `systemtap` package that has a version of `dtrace`. |
| 17 | + |
| 18 | +``` |
| 19 | +WITH_DTRACE=1 |
| 20 | +``` |
| 21 | + |
| 22 | +### Verifying |
| 23 | + |
| 24 | +``` |
| 25 | +> readelf -n usr/lib/libjulia-internal.so.1 |
| 26 | +
|
| 27 | +Displaying notes found in: .note.gnu.build-id |
| 28 | + Owner Data size Description |
| 29 | + GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring) |
| 30 | + Build ID: 57161002f35548772a87418d2385c284ceb3ead8 |
| 31 | +
|
| 32 | +Displaying notes found in: .note.stapsdt |
| 33 | + Owner Data size Description |
| 34 | + stapsdt 0x00000029 NT_STAPSDT (SystemTap probe descriptors) |
| 35 | + Provider: julia |
| 36 | + Name: gc__begin |
| 37 | + Location: 0x000000000013213e, Base: 0x00000000002bb4da, Semaphore: 0x0000000000346cac |
| 38 | + Arguments: |
| 39 | + stapsdt 0x00000032 NT_STAPSDT (SystemTap probe descriptors) |
| 40 | + Provider: julia |
| 41 | + Name: gc__stop_the_world |
| 42 | + Location: 0x0000000000132144, Base: 0x00000000002bb4da, Semaphore: 0x0000000000346cae |
| 43 | + Arguments: |
| 44 | + stapsdt 0x00000027 NT_STAPSDT (SystemTap probe descriptors) |
| 45 | + Provider: julia |
| 46 | + Name: gc__end |
| 47 | + Location: 0x000000000013214a, Base: 0x00000000002bb4da, Semaphore: 0x0000000000346cb0 |
| 48 | + Arguments: |
| 49 | + stapsdt 0x0000002d NT_STAPSDT (SystemTap probe descriptors) |
| 50 | + Provider: julia |
| 51 | + Name: gc__finalizer |
| 52 | + Location: 0x0000000000132150, Base: 0x00000000002bb4da, Semaphore: 0x0000000000346cb2 |
| 53 | + Arguments: |
| 54 | +``` |
| 55 | + |
| 56 | +## Adding probes in libjulia |
| 57 | + |
| 58 | +Probes are declared in dtraces format in the file `src/uprobes.d`. The generated |
| 59 | +header file is included in `src/julia_internal.h` and if you add probes you should |
| 60 | +provide a noop implementation there. |
| 61 | + |
| 62 | +The header will contain a semaphore `*_ENABLED` and the actual call to the probe. |
| 63 | +If the probe arguments are expensive to compute you should first check if the |
| 64 | +probe is enabled and then compute the arguments and call the probe. |
| 65 | + |
| 66 | +```c |
| 67 | + if (JL_PROBE_{PROBE}_ENABLED()) |
| 68 | + auto expensive_arg = ...; |
| 69 | + JL_PROBE_{PROBE}(expensive_arg); |
| 70 | +``` |
| 71 | +
|
| 72 | +If your probe has no arguments it is preferred to not include the semaphore check. |
| 73 | +With USDT probes enabled the cost of a semaphore is a memory load, irrespective of |
| 74 | +the fact that the probe is enabled or not. |
| 75 | +
|
| 76 | +```c |
| 77 | +#define JL_PROBE_GC_BEGIN_ENABLED() __builtin_expect (julia_gc__begin_semaphore, 0) |
| 78 | +__extension__ extern unsigned short julia_gc__begin_semaphore __attribute__ ((unused)) __attribute__ ((section (".probes"))); |
| 79 | +``` |
| 80 | + |
| 81 | +Whereas the probe itself is a noop sled that will be patched to a trampoline to |
| 82 | +the probe handler. |
| 83 | + |
| 84 | +## Available probes |
| 85 | + |
| 86 | +### GC probes |
| 87 | + |
| 88 | +1. `julia:gc__begin`: GC begins running on one thread and triggers stop-the-world. |
| 89 | +2. `julia:gc__stop_the_world`: All threads have reached a safepoint and GC runs. |
| 90 | +3. `julia:gc__mark__begin`: Beginning the mark phase |
| 91 | +4. `julia:gc__mark_end(scanned_bytes, perm_scanned)`: Mark phase ended |
| 92 | +5. `julia:gc__sweep_begin(full)`: Starting sweep |
| 93 | +6. `julia:gc__sweep_end()`: Sweep phase finished |
| 94 | +7. `julia:gc__end`: GC is finished, other threads continue work |
| 95 | +8. `julia:gc__finalizer`: Initial GC thread has finished running finalizers |
| 96 | + |
| 97 | +#### GC stop-the-world latency |
| 98 | + |
| 99 | +An example `bpftrace` script is given in `contrib/gc_stop_the_world_latency.bt` |
| 100 | +and it creates a histogram of the latency for all threads to reach a safepoint. |
| 101 | + |
| 102 | +Running this Julia code, with `julia -t 2` |
| 103 | + |
| 104 | +``` |
| 105 | +using Base.Threads |
| 106 | +
|
| 107 | +fib(x) = x <= 1 ? 1 : fib(x-1) + fib(x-2) |
| 108 | +
|
| 109 | +beaver = @spawn begin |
| 110 | + while true |
| 111 | + fib(30) |
| 112 | + # This safepoint is necessary until #41616, since otherwise this |
| 113 | + # loop will never yield to GC. |
| 114 | + GC.safepoint() |
| 115 | + end |
| 116 | +end |
| 117 | +
|
| 118 | +allocator = @spawn begin |
| 119 | + while true |
| 120 | + zeros(1024) |
| 121 | + end |
| 122 | +end |
| 123 | +
|
| 124 | +wait(allocator) |
| 125 | +``` |
| 126 | + |
| 127 | +and in a second terminal |
| 128 | + |
| 129 | +``` |
| 130 | +> sudo contrib/bpftrace/gc_stop_the_world_latency.bt |
| 131 | +Attaching 4 probes... |
| 132 | +Tracing Julia GC Stop-The-World Latency... Hit Ctrl-C to end. |
| 133 | +^C |
| 134 | +
|
| 135 | +
|
| 136 | +@usecs[1743412]: |
| 137 | +[4, 8) 971 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| |
| 138 | +[8, 16) 837 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | |
| 139 | +[16, 32) 129 |@@@@@@ | |
| 140 | +[32, 64) 10 | | |
| 141 | +[64, 128) 1 | | |
| 142 | +``` |
| 143 | + |
| 144 | +We can see the latency distribution of the stop-the-world phase in the executed Julia process. |
| 145 | + |
| 146 | +## Notes on using `bpftrace` |
| 147 | + |
| 148 | +An example probe in the bpftrace format looks like: |
| 149 | + |
| 150 | +``` |
| 151 | +usdt:usr/lib/libjulia-internal.so:julia:gc__begin |
| 152 | +{ |
| 153 | + @start[pid] = nsecs; |
| 154 | +} |
| 155 | +``` |
| 156 | + |
| 157 | +The probe declaration takes the kind `usdt`, then either the |
| 158 | +path to the library or the PID, the provider name `julia` |
| 159 | +and the probe name `gc__begin`. Note that I am using a |
| 160 | +relative path to the `libjulia-internal.so`, but this might |
| 161 | +need to be an absolute path on a production system. |
| 162 | + |
| 163 | +## Useful references: |
| 164 | + |
| 165 | +- [Julia Evans blog on Linux tracing systems](https://jvns.ca/blog/2017/07/05/linux-tracing-systems) |
| 166 | +- [LWN article on USDT and BPF](https://lwn.net/Articles/753601/) |
| 167 | +- [GDB support for probes](https://sourceware.org/gdb/onlinedocs/gdb/Static-Probe-Points.html) |
| 168 | +- [Brendan Gregg -- Linux Performance](https://www.brendangregg.com/linuxperf.html) |
0 commit comments