Extract SparkPlan product and append to trace #9783

charlesmyu · 2025-10-16T19:37:48Z

What Does This Do

Instruments the fromSparkPlan function to:
- Parse the plan parameter into a map of String properties
- Update the meta field of returned SparkPlanInfo with those properties
Creates a Spark21XPlanUtils class with a extractPlanProduct method that parses a SparkPlan object and returns the properties as a <String, String> map
Creates a AbstractSparkPlanUtils class with a parsePlanProduct method that parses the various Objects extracted by extractPlanProduct to return a comprehensible string representation
- This is extended by Spark21XPlanUtils
- Updates the toJson function in SparkSQLUtils to write a JSON object if possible, otherwise just write a string
Updates tests to reflect these changes & additions

Motivation

The SparkPlan houses additional details about its execution that is useful to visualize for operators to use. Extract these into spans so they can be ingested.

Additional Notes

This PR leverages the existing meta field in the SparkPlanInfo class. This should be safe as we don't overwrite the field if any data exists, and it is currently only used for ScanExec node details. Furthermore since this class appears to be primarily intended as an abstraction for informational purposes, any faulty updates to the object shouldn't result in any breaking issues.

Also note that we use the Product API to obtain the key names (using productElementName), however this was only made available in Scala 2.13. As a result the Scala 2.12 instrumentation uses arbitrary _dd.unknown_key.X names for the keys, so the values can at least be extracted.

Worth mentioning that this PR does not introduce traversal of the physical plan itself into the tracer - this is left to Spark itself. This is because the recursive fromSparkPlan method is instrumented, meaning as each node is built the tracer is invoked to parse it, and we expressly filter out any potential QueryPlan nodes when performing the parsing.

Contributor Checklist

Format the title according the contribution guidelines
Assign the type: and (comp: or inst:) labels in addition to any useful labels
Don't use close, fix or any linking keywords when referencing an issue.
Use solves instead, and assign the PR milestone to the issue
Update the CODEOWNERS file on source file addition, move, or deletion
Update the public documentation in case of new configuration flag or behavior

Jira ticket: DJM-974

…ort more types and use JSON arrays

datadog-official · 2025-10-16T19:46:24Z

🎯 Code Coverage
• Patch Coverage: 0.00%
• Total Coverage: 59.89% (+0.12%)

View detailed report

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 3619b77 | Docs | Was this helpful? Give us feedback!}

pr-commenter · 2025-10-16T20:01:50Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	charles.yu/djm-974/extract-spark-plan-product
git_commit_date	1761174241	1761189637
git_commit_sha	`f42a6e9`	`3619b77`
release_version	1.55.0-SNAPSHOT~f42a6e9909	1.55.0-SNAPSHOT~3619b77e1b

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1761191611	1761191611
ci_job_id	1193103858	1193103858
ci_pipeline_id	80044964	80044964
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-f0hiff2a 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-f0hiff2a 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 1 performance improvements and 3 performance regressions! Performance is the same for 51 metrics, 10 unstable metrics.

scenario	Δ mean execution_time	candidate mean execution_time	baseline mean execution_time
scenario:startup:insecure-bank:iast:Flare Poller	worse [+133.708µs; +296.406µs] or [+3.240%; +7.182%]	4.342ms	4.127ms
scenario:startup:petclinic:iast:IAST	better [-9.568ms; -6.188ms] or [-27.892%; -18.037%]	26.426ms	34.304ms
scenario:startup:petclinic:iast:Telemetry	worse [+169.044µs; +468.848µs] or [+2.028%; +5.623%]	8.656ms	8.337ms
scenario:startup:petclinic:profiling:Telemetry	worse [+7.645ms; +8.240ms] or [+92.876%; +100.105%]	16.174ms	8.231ms

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.55.0-SNAPSHOT~3619b77e1b, baseline=1.55.0-SNAPSHOT~f42a6e9909

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.031 s) : 0, 1031329
Total [baseline] (8.685 s) : 0, 8685047
Agent [candidate] (1.016 s) : 0, 1015991
Total [candidate] (8.684 s) : 0, 8684116
section iast
Agent [baseline] (1.166 s) : 0, 1165936
Total [baseline] (9.347 s) : 0, 9346832
Agent [candidate] (1.159 s) : 0, 1159400
Total [candidate] (9.351 s) : 0, 9350953

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.031 s	-
Agent	iast	1.166 s	134.606 ms (13.1%)
Total	tracing	8.685 s	-
Total	iast	9.347 s	661.785 ms (7.6%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.016 s	-
Agent	iast	1.159 s	143.408 ms (14.1%)
Total	tracing	8.684 s	-
Total	iast	9.351 s	666.838 ms (7.7%)

gantt
    title insecure-bank - break down per module: candidate=1.55.0-SNAPSHOT~3619b77e1b, baseline=1.55.0-SNAPSHOT~f42a6e9909

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.462 ms) : 0, 1462
crashtracking [candidate] (1.451 ms) : 0, 1451
BytebuddyAgent [baseline] (703.77 ms) : 0, 703770
BytebuddyAgent [candidate] (693.636 ms) : 0, 693636
GlobalTracer [baseline] (244.777 ms) : 0, 244777
GlobalTracer [candidate] (241.736 ms) : 0, 241736
AppSec [baseline] (32.338 ms) : 0, 32338
AppSec [candidate] (32.179 ms) : 0, 32179
Debugger [baseline] (6.355 ms) : 0, 6355
Debugger [candidate] (6.413 ms) : 0, 6413
Remote Config [baseline] (679.085 µs) : 0, 679
Remote Config [candidate] (709.569 µs) : 0, 710
Telemetry [baseline] (15.713 ms) : 0, 15713
Telemetry [candidate] (9.271 ms) : 0, 9271
Flare Poller [baseline] (5.02 ms) : 0, 5020
Flare Poller [candidate] (9.426 ms) : 0, 9426
section iast
crashtracking [baseline] (1.489 ms) : 0, 1489
crashtracking [candidate] (1.491 ms) : 0, 1491
BytebuddyAgent [baseline] (828.654 ms) : 0, 828654
BytebuddyAgent [candidate] (822.226 ms) : 0, 822226
GlobalTracer [baseline] (233.55 ms) : 0, 233550
GlobalTracer [candidate] (232.785 ms) : 0, 232785
IAST [baseline] (30.964 ms) : 0, 30964
IAST [candidate] (26.696 ms) : 0, 26696
AppSec [baseline] (30.697 ms) : 0, 30697
AppSec [candidate] (34.822 ms) : 0, 34822
Debugger [baseline] (6.06 ms) : 0, 6060
Debugger [candidate] (6.124 ms) : 0, 6124
Remote Config [baseline] (613.287 µs) : 0, 613
Remote Config [candidate] (607.359 µs) : 0, 607
Telemetry [baseline] (8.396 ms) : 0, 8396
Telemetry [candidate] (8.644 ms) : 0, 8644
Flare Poller [baseline] (4.127 ms) : 0, 4127
Flare Poller [candidate] (4.342 ms) : 0, 4342

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.55.0-SNAPSHOT~3619b77e1b, baseline=1.55.0-SNAPSHOT~f42a6e9909

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.04 s) : 0, 1039502
Total [baseline] (10.925 s) : 0, 10925125
Agent [candidate] (1.017 s) : 0, 1017157
Total [candidate] (10.673 s) : 0, 10673226
section appsec
Agent [baseline] (1.21 s) : 0, 1209602
Total [baseline] (11.027 s) : 0, 11027076
Agent [candidate] (1.196 s) : 0, 1195832
Total [candidate] (11.067 s) : 0, 11067278
section iast
Agent [baseline] (1.163 s) : 0, 1163406
Total [baseline] (11.123 s) : 0, 11123026
Agent [candidate] (1.151 s) : 0, 1150827
Total [candidate] (10.967 s) : 0, 10966845
section profiling
Agent [baseline] (1.182 s) : 0, 1182351
Total [baseline] (10.908 s) : 0, 10908413
Agent [candidate] (1.161 s) : 0, 1161389
Total [candidate] (11.093 s) : 0, 11092681

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.04 s	-
Agent	appsec	1.21 s	170.099 ms (16.4%)
Agent	iast	1.163 s	123.904 ms (11.9%)
Agent	profiling	1.182 s	142.848 ms (13.7%)
Total	tracing	10.925 s	-
Total	appsec	11.027 s	101.951 ms (0.9%)
Total	iast	11.123 s	197.901 ms (1.8%)
Total	profiling	10.908 s	-16.712 ms (-0.2%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.017 s	-
Agent	appsec	1.196 s	178.675 ms (17.6%)
Agent	iast	1.151 s	133.67 ms (13.1%)
Agent	profiling	1.161 s	144.232 ms (14.2%)
Total	tracing	10.673 s	-
Total	appsec	11.067 s	394.052 ms (3.7%)
Total	iast	10.967 s	293.619 ms (2.8%)
Total	profiling	11.093 s	419.455 ms (3.9%)

gantt
    title petclinic - break down per module: candidate=1.55.0-SNAPSHOT~3619b77e1b, baseline=1.55.0-SNAPSHOT~f42a6e9909

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.499 ms) : 0, 1499
crashtracking [candidate] (1.454 ms) : 0, 1454
BytebuddyAgent [baseline] (709.023 ms) : 0, 709023
BytebuddyAgent [candidate] (694.349 ms) : 0, 694349
GlobalTracer [baseline] (246.837 ms) : 0, 246837
GlobalTracer [candidate] (241.917 ms) : 0, 241917
AppSec [baseline] (32.656 ms) : 0, 32656
AppSec [candidate] (32.326 ms) : 0, 32326
Debugger [baseline] (6.433 ms) : 0, 6433
Debugger [candidate] (6.431 ms) : 0, 6431
Remote Config [baseline] (692.873 µs) : 0, 693
Remote Config [candidate] (707.406 µs) : 0, 707
Telemetry [baseline] (13.145 ms) : 0, 13145
Telemetry [candidate] (9.378 ms) : 0, 9378
Flare Poller [baseline] (7.901 ms) : 0, 7901
Flare Poller [candidate] (9.351 ms) : 0, 9351
section appsec
crashtracking [baseline] (1.496 ms) : 0, 1496
crashtracking [candidate] (1.453 ms) : 0, 1453
BytebuddyAgent [baseline] (730.078 ms) : 0, 730078
BytebuddyAgent [candidate] (719.667 ms) : 0, 719667
GlobalTracer [baseline] (237.478 ms) : 0, 237478
GlobalTracer [candidate] (234.565 ms) : 0, 234565
IAST [baseline] (25.143 ms) : 0, 25143
IAST [candidate] (24.898 ms) : 0, 24898
AppSec [baseline] (175.208 ms) : 0, 175208
AppSec [candidate] (175.002 ms) : 0, 175002
Debugger [baseline] (5.954 ms) : 0, 5954
Debugger [candidate] (6.142 ms) : 0, 6142
Remote Config [baseline] (638.798 µs) : 0, 639
Remote Config [candidate] (628.454 µs) : 0, 628
Telemetry [baseline] (8.496 ms) : 0, 8496
Telemetry [candidate] (8.438 ms) : 0, 8438
Flare Poller [baseline] (3.916 ms) : 0, 3916
Flare Poller [candidate] (3.932 ms) : 0, 3932
section iast
crashtracking [baseline] (1.465 ms) : 0, 1465
crashtracking [candidate] (1.455 ms) : 0, 1455
BytebuddyAgent [baseline] (825.578 ms) : 0, 825578
BytebuddyAgent [candidate] (815.506 ms) : 0, 815506
GlobalTracer [baseline] (234.457 ms) : 0, 234457
GlobalTracer [candidate] (231.245 ms) : 0, 231245
IAST [baseline] (34.304 ms) : 0, 34304
IAST [candidate] (26.426 ms) : 0, 26426
AppSec [baseline] (27.057 ms) : 0, 27057
AppSec [candidate] (34.994 ms) : 0, 34994
Debugger [baseline] (6.129 ms) : 0, 6129
Debugger [candidate] (6.111 ms) : 0, 6111
Remote Config [baseline] (593.674 µs) : 0, 594
Remote Config [candidate] (610.057 µs) : 0, 610
Telemetry [baseline] (8.337 ms) : 0, 8337
Telemetry [candidate] (8.656 ms) : 0, 8656
Flare Poller [baseline] (4.156 ms) : 0, 4156
Flare Poller [candidate] (4.219 ms) : 0, 4219
section profiling
crashtracking [baseline] (1.463 ms) : 0, 1463
crashtracking [candidate] (1.424 ms) : 0, 1424
BytebuddyAgent [baseline] (732.04 ms) : 0, 732040
BytebuddyAgent [candidate] (721.012 ms) : 0, 721012
GlobalTracer [baseline] (221.871 ms) : 0, 221871
GlobalTracer [candidate] (217.694 ms) : 0, 217694
AppSec [baseline] (33.427 ms) : 0, 33427
AppSec [candidate] (32.253 ms) : 0, 32253
Debugger [baseline] (13.907 ms) : 0, 13907
Debugger [candidate] (6.433 ms) : 0, 6433
Remote Config [baseline] (705.226 µs) : 0, 705
Remote Config [candidate] (720.404 µs) : 0, 720
Telemetry [baseline] (8.231 ms) : 0, 8231
Telemetry [candidate] (16.174 ms) : 0, 16174
Flare Poller [baseline] (4.163 ms) : 0, 4163
Flare Poller [candidate] (4.091 ms) : 0, 4091
ProfilingAgent [baseline] (110.712 ms) : 0, 110712
ProfilingAgent [candidate] (108.027 ms) : 0, 108027
Profiling [baseline] (111.325 ms) : 0, 111325
Profiling [candidate] (109.481 ms) : 0, 109481

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	charles.yu/djm-974/extract-spark-plan-product
git_commit_date	1761174241	1761189637
git_commit_sha	`f42a6e9`	`3619b77`
release_version	1.55.0-SNAPSHOT~f42a6e9909	1.55.0-SNAPSHOT~3619b77e1b

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1761191263	1761191263
ci_job_id	1193103859	1193103859
ci_pipeline_id	80044964	80044964
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-1-i9h8uurg 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-1-i9h8uurg 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 2 performance improvements and 4 performance regressions! Performance is the same for 6 metrics, 12 unstable metrics.

scenario	Δ mean http_req_duration	Δ mean throughput	candidate mean http_req_duration	candidate mean throughput	baseline mean http_req_duration	baseline mean throughput
scenario:load:insecure-bank:iast:high_load	better [-1.300ms; -0.955ms] or [-12.893%; -9.467%]	unstable [-6.147op/s; +121.084op/s] or [-1.334%; +26.283%]	8.955ms	518.156op/s	10.083ms	460.688op/s
scenario:load:insecure-bank:profiling:high_load	better [-860.951µs; -540.774µs] or [-9.060%; -5.691%]	unstable [-33.875op/s; +111.313op/s] or [-6.934%; +22.785%]	8.802ms	527.250op/s	9.503ms	488.531op/s
scenario:load:insecure-bank:iast_GLOBAL:high_load	worse [+773.539µs; +1182.866µs] or [+7.588%; +11.603%]	unstable [-95.466op/s; +16.341op/s] or [-20.944%; +3.585%]	11.173ms	416.250op/s	10.195ms	455.812op/s
scenario:load:petclinic:iast:high_load	worse [+0.945ms; +1.756ms] or [+2.180%; +4.052%]	unstable [-10.284op/s; +6.376op/s] or [-9.640%; +5.977%]	44.684ms	104.725op/s	43.334ms	106.679op/s
scenario:load:petclinic:tracing:high_load	worse [+1.803ms; +2.615ms] or [+4.100%; +5.945%]	unstable [-13.344op/s; +0.576op/s] or [-12.390%; +0.535%]	46.195ms	101.312op/s	43.986ms	107.696op/s
scenario:load:petclinic:profiling:high_load	worse [+2.300ms; +3.305ms] or [+4.885%; +7.021%]	unstable [-13.631op/s; +0.014op/s] or [-13.537%; +0.014%]	49.877ms	93.888op/s	47.075ms	100.696op/s

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.55.0-SNAPSHOT~3619b77e1b, baseline=1.55.0-SNAPSHOT~f42a6e9909
    dateFormat X
    axisFormat %s
section baseline
no_agent (36.628 ms) : 36334, 36923
.   : milestone, 36628,
appsec (48.039 ms) : 47615, 48464
.   : milestone, 48039,
code_origins (43.38 ms) : 43021, 43739
.   : milestone, 43380,
iast (43.334 ms) : 42967, 43701
.   : milestone, 43334,
profiling (47.075 ms) : 46647, 47502
.   : milestone, 47075,
tracing (43.986 ms) : 43628, 44345
.   : milestone, 43986,
section candidate
no_agent (35.98 ms) : 35693, 36266
.   : milestone, 35980,
appsec (49.449 ms) : 49013, 49885
.   : milestone, 49449,
code_origins (43.018 ms) : 42651, 43384
.   : milestone, 43018,
iast (44.684 ms) : 44298, 45071
.   : milestone, 44684,
profiling (49.877 ms) : 49374, 50381
.   : milestone, 49877,
tracing (46.195 ms) : 45801, 46590
.   : milestone, 46195,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	36.628 ms [36.334 ms, 36.923 ms]	-
appsec	48.039 ms [47.615 ms, 48.464 ms]	11.411 ms (31.2%)
code_origins	43.38 ms [43.021 ms, 43.739 ms]	6.751 ms (18.4%)
iast	43.334 ms [42.967 ms, 43.701 ms]	6.706 ms (18.3%)
profiling	47.075 ms [46.647 ms, 47.502 ms]	10.447 ms (28.5%)
tracing	43.986 ms [43.628 ms, 44.345 ms]	7.358 ms (20.1%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	35.98 ms [35.693 ms, 36.266 ms]	-
appsec	49.449 ms [49.013 ms, 49.885 ms]	13.47 ms (37.4%)
code_origins	43.018 ms [42.651 ms, 43.384 ms]	7.038 ms (19.6%)
iast	44.684 ms [44.298 ms, 45.071 ms]	8.705 ms (24.2%)
profiling	49.877 ms [49.374 ms, 50.381 ms]	13.898 ms (38.6%)
tracing	46.195 ms [45.801 ms, 46.59 ms]	10.216 ms (28.4%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.55.0-SNAPSHOT~3619b77e1b, baseline=1.55.0-SNAPSHOT~f42a6e9909
    dateFormat X
    axisFormat %s
section baseline
no_agent (4.392 ms) : 4338, 4446
.   : milestone, 4392,
iast (10.083 ms) : 9909, 10256
.   : milestone, 10083,
iast_FULL (14.516 ms) : 14220, 14812
.   : milestone, 14516,
iast_GLOBAL (10.195 ms) : 10015, 10375
.   : milestone, 10195,
profiling (9.503 ms) : 9343, 9662
.   : milestone, 9503,
tracing (7.979 ms) : 7854, 8104
.   : milestone, 7979,
section candidate
no_agent (4.279 ms) : 4231, 4326
.   : milestone, 4279,
iast (8.955 ms) : 8809, 9102
.   : milestone, 8955,
iast_FULL (14.243 ms) : 13956, 14530
.   : milestone, 14243,
iast_GLOBAL (11.173 ms) : 10973, 11373
.   : milestone, 11173,
profiling (8.802 ms) : 8665, 8939
.   : milestone, 8802,
tracing (8.18 ms) : 8054, 8305
.   : milestone, 8180,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	4.392 ms [4.338 ms, 4.446 ms]	-
iast	10.083 ms [9.909 ms, 10.256 ms]	5.691 ms (129.6%)
iast_FULL	14.516 ms [14.22 ms, 14.812 ms]	10.124 ms (230.5%)
iast_GLOBAL	10.195 ms [10.015 ms, 10.375 ms]	5.803 ms (132.1%)
profiling	9.503 ms [9.343 ms, 9.662 ms]	5.111 ms (116.4%)
tracing	7.979 ms [7.854 ms, 8.104 ms]	3.587 ms (81.7%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	4.279 ms [4.231 ms, 4.326 ms]	-
iast	8.955 ms [8.809 ms, 9.102 ms]	4.677 ms (109.3%)
iast_FULL	14.243 ms [13.956 ms, 14.53 ms]	9.965 ms (232.9%)
iast_GLOBAL	11.173 ms [10.973 ms, 11.373 ms]	6.894 ms (161.1%)
profiling	8.802 ms [8.665 ms, 8.939 ms]	4.523 ms (105.7%)
tracing	8.18 ms [8.054 ms, 8.305 ms]	3.901 ms (91.2%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	charles.yu/djm-974/extract-spark-plan-product
git_commit_date	1761174241	1761189637
git_commit_sha	`f42a6e9`	`3619b77`
release_version	1.55.0-SNAPSHOT~f42a6e9909	1.55.0-SNAPSHOT~3619b77e1b

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1761191834	1761191834
ci_job_id	1193103860	1193103860
ci_pipeline_id	80044964	80044964
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-2-7sb214wi 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-2-7sb214wi 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 1 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 1 unstable metrics.

scenario	Δ mean execution_time	candidate mean execution_time	baseline mean execution_time
scenario:dacapo:tomcat:appsec	better [-1.462ms; -1.118ms] or [-38.925%; -29.767%]	2.466ms	3.756ms

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.55.0-SNAPSHOT~3619b77e1b, baseline=1.55.0-SNAPSHOT~f42a6e9909
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.482 ms) : 1470, 1493
.   : milestone, 1482,
appsec (3.756 ms) : 3536, 3976
.   : milestone, 3756,
iast (2.223 ms) : 2159, 2286
.   : milestone, 2223,
iast_GLOBAL (2.259 ms) : 2195, 2322
.   : milestone, 2259,
profiling (2.078 ms) : 2025, 2130
.   : milestone, 2078,
tracing (2.042 ms) : 1992, 2092
.   : milestone, 2042,
section candidate
no_agent (1.481 ms) : 1469, 1492
.   : milestone, 1481,
appsec (2.466 ms) : 2415, 2517
.   : milestone, 2466,
iast (2.212 ms) : 2149, 2275
.   : milestone, 2212,
iast_GLOBAL (2.262 ms) : 2198, 2326
.   : milestone, 2262,
profiling (2.504 ms) : 2335, 2673
.   : milestone, 2504,
tracing (2.034 ms) : 1985, 2084
.   : milestone, 2034,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.482 ms [1.47 ms, 1.493 ms]	-
appsec	3.756 ms [3.536 ms, 3.976 ms]	2.274 ms (153.5%)
iast	2.223 ms [2.159 ms, 2.286 ms]	740.893 µs (50.0%)
iast_GLOBAL	2.259 ms [2.195 ms, 2.322 ms]	776.818 µs (52.4%)
profiling	2.078 ms [2.025 ms, 2.13 ms]	595.836 µs (40.2%)
tracing	2.042 ms [1.992 ms, 2.092 ms]	560.319 µs (37.8%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.481 ms [1.469 ms, 1.492 ms]	-
appsec	2.466 ms [2.415 ms, 2.517 ms]	985.232 µs (66.5%)
iast	2.212 ms [2.149 ms, 2.275 ms]	731.694 µs (49.4%)
iast_GLOBAL	2.262 ms [2.198 ms, 2.326 ms]	781.352 µs (52.8%)
profiling	2.504 ms [2.335 ms, 2.673 ms]	1.023 ms (69.1%)
tracing	2.034 ms [1.985 ms, 2.084 ms]	553.785 µs (37.4%)

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.55.0-SNAPSHOT~3619b77e1b, baseline=1.55.0-SNAPSHOT~f42a6e9909
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.346 s) : 15346000, 15346000
.   : milestone, 15346000,
appsec (14.833 s) : 14833000, 14833000
.   : milestone, 14833000,
iast (18.631 s) : 18631000, 18631000
.   : milestone, 18631000,
iast_GLOBAL (18.096 s) : 18096000, 18096000
.   : milestone, 18096000,
profiling (15.335 s) : 15335000, 15335000
.   : milestone, 15335000,
tracing (15.313 s) : 15313000, 15313000
.   : milestone, 15313000,
section candidate
no_agent (15.258 s) : 15258000, 15258000
.   : milestone, 15258000,
appsec (15.248 s) : 15248000, 15248000
.   : milestone, 15248000,
iast (18.587 s) : 18587000, 18587000
.   : milestone, 18587000,
iast_GLOBAL (17.583 s) : 17583000, 17583000
.   : milestone, 17583000,
profiling (15.244 s) : 15244000, 15244000
.   : milestone, 15244000,
tracing (15.342 s) : 15342000, 15342000
.   : milestone, 15342000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.346 s [15.346 s, 15.346 s]	-
appsec	14.833 s [14.833 s, 14.833 s]	-513.0 ms (-3.3%)
iast	18.631 s [18.631 s, 18.631 s]	3.285 s (21.4%)
iast_GLOBAL	18.096 s [18.096 s, 18.096 s]	2.75 s (17.9%)
profiling	15.335 s [15.335 s, 15.335 s]	-11.0 ms (-0.1%)
tracing	15.313 s [15.313 s, 15.313 s]	-33.0 ms (-0.2%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.258 s [15.258 s, 15.258 s]	-
appsec	15.248 s [15.248 s, 15.248 s]	-10.0 ms (-0.1%)
iast	18.587 s [18.587 s, 18.587 s]	3.329 s (21.8%)
iast_GLOBAL	17.583 s [17.583 s, 17.583 s]	2.325 s (15.2%)
profiling	15.244 s [15.244 s, 15.244 s]	-14.0 ms (-0.1%)
tracing	15.342 s [15.342 s, 15.342 s]	84.0 ms (0.6%)

charlesmyu · 2025-10-16T22:44:47Z

...entation/spark/src/main/java/datadog/trace/instrumentation/spark/AbstractSparkPlanUtils.java

+  // Should really only return valid JSON types (Array, Map, String, Boolean, Number, null)
+  public Object parsePlanProduct(Object value) {


I don't love that this method returns an Object instead of something definite like a JSON node (or even just a String). The end goal is to allow any JSON object (other than null, which we filter out) to be serialized into a string using writeObjectToString, and this seemed like the most straightforwards way to achieve that. There's probably some more idiomatic way I'm missing - happy to hear about it if anyone has ideas!

charlesmyu · 2025-10-17T13:08:51Z

...rk/spark_2.13/src/main/java/datadog/trace/instrumentation/spark/Spark213Instrumentation.java

+    public static void exit(
+        @Advice.Return(readOnly = false) SparkPlanInfo planInfo,
+        @Advice.Argument(0) SparkPlan plan) {
+      if (planInfo.metadata().size() == 0) {


By using the existing metadata on the DataSourceScanExec nodes, we open ourselves to a bit of inconsistency in the JSON parsing:

"meta": { "Format": "Parquet", "Batched": true, ..., "DataFilters": "[CASE WHEN PULocationID#28 IN (236,132,161) THEN true ELSE isnotnull(PULocationID#28) END]" },

Specifically the lists are not quoted & escaped, which means when we read out the field it's treated as a string rather than a JSON native array. Ideally we would parse this ourselves and upsert it so we can control that formatting, but obviously there's a risk of the parsing going wrong and impacting something that actually uses the field. Leaning slightly towards keeping the formatting as-is in favour of not touching existing fields but happy to hear any other thoughts on this...

pawel-big-lebowski · 2025-10-17T13:26:42Z

...on/spark/spark_2.13/src/main/java/datadog/trace/instrumentation/spark/Spark213PlanUtils.java

+// An extension of how Spark translates `SparkPlan`s to `SparkPlanInfo`, see here:
+// https://github.com/apache/spark/blob/v3.5.0/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanInfo.scala#L54
+public class Spark213PlanUtils extends AbstractSparkPlanUtils {
+  public Map<String, String> extractPlanProduct(TreeNode plan) {


In the OpenLineage connector we had a special facet for storing serialized LogicalPlan of the query. This was the most problematic feature we ever had. Because the plan can contain everything. For example, if a user creates in memory few gigabyte dataframe, then this becomes a node in a logical plan. And OpenLineage connector tried to serliaze it and failed the whole Spark driver.

This PR seems to be doing same thing for the physical plan. I think we shouldn't serialize the object when we don't know what's inside.

Chatted about this over a call, summarizing for posterity:

Worth clarifying that this function does not traverse the tree itself; we leave that up to Spark because we instrument the recursive fromSparkPlan method

We should avoid serializing anything we don't know about arbitrarily, especially using toString(). Since we are taking the full product of the TreeNode we could get some enormous structure (e.g. improbable, but maybe an array of all the data) and toString() would then attempt to serialize all of that data

Instead we should lean solely on simpleString() which is safe by default and default to not serializing otherwise. We could then only serialize other TreeNodes and leave out any unknown or unexpected data structures

With this change it would even be safe to parse the child QueryPlan nodes because it would no longer output the long physical plan, and instead print the one line string

51835fa (#9783)

…pth and array length

charlesmyu · 2025-10-20T21:49:19Z

...rk/src/testFixtures/groovy/datadog/trace/instrumentation/spark/AbstractSpark24SqlTest.groovy

+  // Parse any nested objects to a standard form that ignores the keys
+  // Right now we do this by just asserting the key set and none of the values
+  static Object parseNestedMetaObject(Object value) {
+    if (value instanceof Map) {
+      return value.keySet()
+    } else {
+      return value
+    }
+  }


This was driving me nuts - there must be a better way to accomplish this without a ton of additional code... The issue is that for the Spark32 suite of tests, the expectations for the meta fields use named keys, but when we run the tests using Scala 2.12 we expect those to all show up as _dd.unknown_key.*. I added a (not great) way around that in assertSQLPlanEquals, which worked fine until we started getting nested maps that can have unknown keys. e.g.:

"meta": { "_dd.unparsed" : "any", "outputPartitioning" : { "HashPartitioning" : { "numPartitions" : 2, "expressions" : [ "string_col#28" ] } }, "shuffleOrigin" : "ENSURE_REQUIREMENTS" },

Where the numPartitions and expressions keys would show up as _dd.unknown_key.* in Scala 2.12. Initially I went for a recursive approach but that ended up feeling very bloated, so I abandoned it in favour of a subpar keyset check (i.e. only check that HashPartitioning exists in the map).

No false impressions that this is any good - let me know if there's a better way I'm missing, if just the key check is okay (only applies to the test suite running Scala 2.12/Spark 3.2.0, the other two suites compare everything as expected), or if we just have to put up with the recursive approach...

Changed the approach to compare lists of values instead of whatever I had put before - a bit cleaner and simpler to follow. Has its own downsides (e.g. not perfect comparisons as some stable keys are eliminated, and the containsAll comparison can be fooled) but at least it attempts to compare values and is much easier to maintain. Given it's on an older version of Scala that will no longer be supported for new Spark versions, I think this should probably be fine.

c68b356 (#9783)

pawel-big-lebowski

Went through the first round of reading and left some comments.
Pls let me know do you think about it.

pawel-big-lebowski · 2025-10-21T07:07:32Z

...entation/spark/src/main/java/datadog/trace/instrumentation/spark/AbstractSparkPlanUtils.java

+public abstract class AbstractSparkPlanUtils {
+  private final int MAX_DEPTH = 4;
+  private final int MAX_LENGTH = 50;
+  private final ObjectMapper mapper =


AbstractDatadogSparkListener already contains ObjectMapper. Pls try reusing the existing one.

I was thinking the same thing and was looking at it briefly. From my understanding because the two classes are instantiated in separate advice classes it's fairly difficult to pass references to one another in any way.

I was even hoping to just find a potential way to share an ObjectMapper across all instantiations of AbstractSparkPlanUtils (maybe using Context Stores?) but it didn't seem super feasible... Would be very happy to be corrected on this, though

Just kidding, I did not realize you can just call static fields/methods directly from an abstract class...

8bf8488

pawel-big-lebowski · 2025-10-21T07:10:47Z

...ion/spark/src/main/java/datadog/trace/instrumentation/spark/AbstractSparkPlanSerializer.java

+  // Use reflection rather than native `instanceof` for classes added in later Spark versions
+  private boolean instanceOf(Object value, Class[] classes) {
+    for (Class cls : classes) {
+      if (cls != null && cls.isInstance(value)) {


To avoid any class loading issue, it may make sense to compare class names only

I agree, but am only concerned about how we would deal with inheritance, since we compare parent classes fairly frequently (e.g. BuildSide, Attribute, BroadcastMode, etc). I'm not sure it makes the most sense (especially as Spark evolves) to enumerate all the child classes explicitly, but maybe we could use reflection to traverse the tree? Not sure if that negates the benefits :/

Ended up settling on String comparison with traversal of interfaces & parent classes to avoid class loading errors. Added a negative cache to potentially help consistently traversing classes we know we won't find anything from - let me know if that makes any sense! 047880a (#9783)

pawel-big-lebowski · 2025-10-21T07:20:51Z

...rk/spark_2.12/src/main/java/datadog/trace/instrumentation/spark/Spark212Instrumentation.java

+
+  public static class SparkPlanInfoAdvice {
+    @Advice.OnMethodExit(suppress = Throwable.class, onThrowable = Throwable.class)
+    public static void exit(


Thinking loudly: would it make sense to hide this behind feature flague in config? Reasons to do so:

we can let it bake on dd internal jobs for some time,

users can turn it off in case of unexpected issues.

Agreed, added! 55b917b

pawel-big-lebowski · 2025-10-21T07:22:33Z

...ion/spark/src/main/java/datadog/trace/instrumentation/spark/AbstractSparkPlanSerializer.java

+
+    if (unparsed.size() > 0) {
+      // For now, place what we can't parse here with the types so we're aware of them
+      args.put("_dd.unparsed", unparsed);


Would it make sense to create separate SafeSparkPlanSerializer class? Seems like this is what this all is about.

Just to clarify, do we mean to use SafeSparkPlanSerializer to store the list of parsed and unparsed orgs to then serialize? Or for it to handle the actual serialization (e.g. parsePlanProduct) itself?

This did make me realize some of the function names are outdated, though. Updated them (extractPlanProduct -> safeParseTreeNode, parsePlanProduct -> safeParseObjectToJson)

Clarified over a quick call; renamed class to AbstractSparkPlanSerializer as well to address this!

8bf8488

charlesmyu added 5 commits October 10, 2025 16:37

Extract Spark Plan product with keys for Scala 2.13

be7ada4

Extract Spark Plan product values for Spark 2.12

9c19e11

Update tests for meta field in Spark SQL plans

3f9c26b

Remove unused logic to parse children, enrich product parsing to supp…

94f3139

…ort more types and use JSON arrays

Update tests to assert on meta values

beb4d5f

charlesmyu added type: enhancement Enhancements and improvements inst: apache spark Apache Spark instrumentation labels Oct 16, 2025

Use Abstract class for common functions

750c68f

charlesmyu force-pushed the charles.yu/djm-974/extract-spark-plan-product branch 3 times, most recently from dc41615 to d9d6213 Compare October 16, 2025 22:43

charlesmyu commented Oct 16, 2025

View reviewed changes

charlesmyu force-pushed the charles.yu/djm-974/extract-spark-plan-product branch from d9d6213 to 54ab1ad Compare October 16, 2025 22:47

Use Jackson JSON parser instead of rolling own parsing

0279fff

charlesmyu force-pushed the charles.yu/djm-974/extract-spark-plan-product branch from 54ab1ad to 0279fff Compare October 17, 2025 04:53

charlesmyu commented Oct 17, 2025

View reviewed changes

pawel-big-lebowski reviewed Oct 17, 2025

View reviewed changes

charlesmyu added 2 commits October 17, 2025 10:03

Refactor AbstractSparkPlanUtils to only require key generation on impl

50fa41a

Default to returning null if class not recognized, limit recursion de…

51835fa

…pth and array length

charlesmyu commented Oct 20, 2025

View reviewed changes

Improve testing scheme for Spark32 on Scala 212 with unknown keys

c68b356

pawel-big-lebowski reviewed Oct 21, 2025

View reviewed changes

charlesmyu added 3 commits October 21, 2025 11:22

Improve method & class naming, reuse ObjectMapper from listener

8bf8488

Gate Spark Plan parsing with flag

55b917b

Match classes by string comparison, add negative cache

047880a

charlesmyu requested a review from pawel-big-lebowski October 21, 2025 21:20

charlesmyu marked this pull request as ready for review October 21, 2025 21:20

charlesmyu requested a review from a team as a code owner October 21, 2025 21:20

charlesmyu requested a review from mhlidd October 21, 2025 21:20

Add unit tests for AbstractSparkPlanSerializer

3619b77

		// Should really only return valid JSON types (Array, Map, String, Boolean, Number, null)
		public Object parsePlanProduct(Object value) {

Extract SparkPlan product and append to trace #9783

Are you sure you want to change the base?

Extract SparkPlan product and append to trace #9783

Conversation

charlesmyu commented Oct 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What Does This Do

Motivation

Additional Notes

Contributor Checklist

Uh oh!

datadog-official bot commented Oct 16, 2025 • edited by datadog-datadog-prod-us1 bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pr-commenter bot commented Oct 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmarks

Startup

Parameters

Summary

Load

Parameters

Summary

Dacapo

Parameters

Summary

Uh oh!

charlesmyu Oct 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

charlesmyu Oct 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

charlesmyu Oct 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pawel-big-lebowski left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

charlesmyu Oct 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

charlesmyu commented Oct 16, 2025 •

edited

Loading

datadog-official bot commented Oct 16, 2025 •

edited by datadog-datadog-prod-us1 bot

Loading

pr-commenter bot commented Oct 16, 2025 •

edited

Loading

charlesmyu Oct 16, 2025 •

edited

Loading

charlesmyu Oct 17, 2025 •

edited

Loading

charlesmyu Oct 20, 2025 •

edited

Loading

charlesmyu Oct 21, 2025 •

edited

Loading