Extract SparkPlan product and append to trace #9783
Benchmarks

Startup Parameters (see matching parameters)

Summary: found 1 performance improvement and 3 performance regressions. Performance is the same for 51 metrics, 10 unstable metrics.

[Startup time charts: global startup overhead and per-module breakdown for insecure-bank and petclinic, candidate=1.55.0-SNAPSHOT~3619b77e1b vs baseline=1.55.0-SNAPSHOT~f42a6e9909. Notable module deltas: Telemetry improved from 15.713 ms to 9.271 ms while Flare Poller regressed from 5.02 ms to 9.426 ms (insecure-bank, tracing); Debugger improved from 13.907 ms to 6.433 ms while Telemetry regressed from 8.231 ms to 16.174 ms (petclinic, profiling).]
Load Parameters (see matching parameters)

Summary: found 2 performance improvements and 4 performance regressions. Performance is the same for 6 metrics, 12 unstable metrics.

[Request duration charts (CI 0.99) for petclinic and insecure-bank, candidate=1.55.0-SNAPSHOT~3619b77e1b vs baseline=1.55.0-SNAPSHOT~f42a6e9909. Notable deltas: petclinic tracing regressed from 43.986 ms to 46.195 ms; insecure-bank iast improved from 10.083 ms to 8.955 ms.]
Dacapo Parameters (see matching parameters)

Summary: found 1 performance improvement and 0 performance regressions. Performance is the same for 10 metrics, 1 unstable metric.

[Execution time charts (CI 0.99) for tomcat and biojava, candidate=1.55.0-SNAPSHOT~3619b77e1b vs baseline=1.55.0-SNAPSHOT~f42a6e9909. Notable delta: tomcat appsec improved from 3.756 ms to 2.466 ms.]
// Should really only return valid JSON types (Array, Map, String, Boolean, Number, null)
public Object parsePlanProduct(Object value) {
I don't love that this method returns an Object instead of something definite like a JSON node (or even just a String). The end goal is to allow any JSON object (other than null, which we filter out) to be serialized into a string using writeObjectToString, and this seemed like the most straightforward way to achieve that. There's probably some more idiomatic way I'm missing - happy to hear about it if anyone has ideas!
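To illustrate why returning Object is workable here, a single write helper can dispatch on the runtime type and still emit valid JSON for each supported shape. This is a hedged sketch, not the tracer's actual writeObjectToString (the class name and escaping behaviour are assumptions):

```java
import java.util.List;

// Sketch: one helper accepts Object and emits JSON by runtime-type dispatch.
public class JsonWriteSketch {
  public static String writeObjectToString(Object value) {
    if (value == null) {
      return "null";
    }
    if (value instanceof Number || value instanceof Boolean) {
      return value.toString();
    }
    if (value instanceof List) {
      StringBuilder sb = new StringBuilder("[");
      List<?> list = (List<?>) value;
      for (int i = 0; i < list.size(); i++) {
        if (i > 0) {
          sb.append(",");
        }
        sb.append(writeObjectToString(list.get(i)));
      }
      return sb.append("]").toString();
    }
    // Fall back to a quoted string for everything else (escaping elided)
    return "\"" + value + "\"";
  }
}
```

A caller never needs to know which concrete type came back from parsing; the dispatch happens at write time.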
public static void exit(
    @Advice.Return(readOnly = false) SparkPlanInfo planInfo,
    @Advice.Argument(0) SparkPlan plan) {
  if (planInfo.metadata().size() == 0) {
By using the existing metadata on the DataSourceScanExec nodes, we open ourselves to a bit of inconsistency in the JSON parsing:
"meta": {
"Format": "Parquet",
"Batched": true,
...,
"DataFilters": "[CASE WHEN PULocationID#28 IN (236,132,161) THEN true ELSE isnotnull(PULocationID#28) END]"
},
Specifically, the lists are not quoted & escaped, which means when we read the field out it's treated as a string rather than a native JSON array. Ideally we would parse this ourselves and upsert it so we can control the formatting, but obviously there's a risk of the parsing going wrong and impacting something that actually uses the field. Leaning slightly towards keeping the formatting as-is in favour of not touching existing fields, but happy to hear any other thoughts on this...
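For context on the quoting concern, if we did control the formatting ourselves, each list element could be quoted and escaped so the field reads back as a native JSON array rather than one long string. A hedged sketch (toJsonArray is a hypothetical helper, not existing tracer code):

```java
import java.util.List;

// Sketch: quote and escape each element so the result is a real JSON array.
public class MetaArraySketch {
  public static String toJsonArray(List<String> elements) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < elements.size(); i++) {
      if (i > 0) {
        sb.append(",");
      }
      // Escape backslashes first, then double quotes
      String escaped = elements.get(i).replace("\\", "\\\\").replace("\"", "\\\"");
      sb.append('"').append(escaped).append('"');
    }
    return sb.append("]").toString();
  }
}
```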
// An extension of how Spark translates `SparkPlan`s to `SparkPlanInfo`, see here:
// https://github.com/apache/spark/blob/v3.5.0/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanInfo.scala#L54
public class Spark213PlanUtils extends AbstractSparkPlanUtils {
  public Map<String, String> extractPlanProduct(TreeNode plan) {
In the OpenLineage connector we had a special facet for storing the serialized LogicalPlan of the query. This was the most problematic feature we ever had, because the plan can contain anything. For example, if a user creates a few-gigabyte in-memory dataframe, it becomes a node in the logical plan, and the OpenLineage connector tried to serialize it and took down the whole Spark driver.
This PR seems to be doing the same thing for the physical plan. I think we shouldn't serialize an object when we don't know what's inside.
Chatted about this over a call, summarizing for posterity:
- Worth clarifying that this function does not traverse the tree itself; we leave that up to Spark, because we instrument the recursive fromSparkPlan method
- We should avoid arbitrarily serializing anything we don't know about, especially using toString(). Since we are taking the full product of the TreeNode we could get some enormous structure (improbable, but e.g. an array of all the data) and toString() would then attempt to serialize all of that data
- Instead we should lean solely on simpleString(), which is safe by default, and default to not serializing otherwise. We could then serialize only other TreeNodes and leave out any unknown or unexpected data structures
- With this change it would even be safe to parse the child QueryPlan nodes, because it would no longer output the long physical plan and would instead print the one-line string
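The "safe by default" rule from that summary can be sketched as an allowlist: only values whose type we positively recognize get through, and everything else is dropped rather than risking a toString() over an arbitrarily large structure. Names here are illustrative, not the PR's actual code:

```java
// Sketch: serialize only known-safe scalar types; drop anything unknown.
public class SafeParseSketch {
  public static Object safeParse(Object value) {
    if (value instanceof String || value instanceof Number || value instanceof Boolean) {
      return value;
    }
    // Unknown structure: skip it instead of calling toString() on it
    return null;
  }
}
```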
// Parse any nested objects to a standard form that ignores the values
// Right now we do this by just asserting the key set and none of the values
static Object parseNestedMetaObject(Object value) {
  if (value instanceof Map) {
    return ((Map<?, ?>) value).keySet();
  } else {
    return value;
  }
}
This was driving me nuts - there must be a better way to accomplish this without a ton of additional code... The issue is that for the Spark32 suite of tests, the expectations for the meta fields use named keys, but when we run the tests using Scala 2.12 we expect those to all show up as _dd.unknown_key.*. I added a (not great) way around that in assertSQLPlanEquals, which worked fine until we started getting nested maps that can have unknown keys, e.g.:
"meta": {
"_dd.unparsed" : "any",
"outputPartitioning" : {
"HashPartitioning" : {
"numPartitions" : 2,
"expressions" : [ "string_col#28" ]
}
},
"shuffleOrigin" : "ENSURE_REQUIREMENTS"
},
Here the numPartitions and expressions keys would show up as _dd.unknown_key.* in Scala 2.12. Initially I went for a recursive approach, but that ended up feeling very bloated, so I abandoned it in favour of a subpar keyset check (i.e. only checking that HashPartitioning exists in the map).
No false impressions that this is any good - let me know if there's a better way I'm missing, if just the key check is okay (it only applies to the test suite running Scala 2.12/Spark 3.2.0; the other two suites compare everything as expected), or if we just have to put up with the recursive approach...
Changed the approach to compare lists of values instead of whatever I had put before - a bit cleaner and simpler to follow. It has its own downsides (e.g. not perfect comparisons, as some stable keys are eliminated, and the containsAll comparison can be fooled), but at least it attempts to compare values and is much easier to maintain. Given it's on an older version of Scala that will no longer be supported for new Spark versions, I think this should probably be fine.
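The value-based comparison described here amounts to ignoring key names (which differ between Scala 2.12 and 2.13) and checking that every expected value appears somewhere in the actual map. A hedged sketch of the idea, not the test suite's actual helper:

```java
import java.util.Map;

// Sketch: compare meta maps by values only, since key names vary by Scala version.
// As noted above, containsAll can be fooled (e.g. by duplicate values) - that is
// the accepted trade-off for simplicity.
public class MetaValueComparison {
  public static boolean metaValuesMatch(Map<String, Object> expected, Map<String, Object> actual) {
    return actual.values().containsAll(expected.values());
  }
}
```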
Went through the first round of reading and left some comments.
Please let me know what you think about it.
public abstract class AbstractSparkPlanUtils {
  private final int MAX_DEPTH = 4;
  private final int MAX_LENGTH = 50;
  private final ObjectMapper mapper =
AbstractDatadogSparkListener already contains an ObjectMapper. Please try reusing the existing one.
I was thinking the same thing and looked at it briefly. From my understanding, because the two classes are instantiated in separate advice classes, it's fairly difficult to pass references between them in any way.
I was even hoping to find a way to share an ObjectMapper across all instantiations of AbstractSparkPlanUtils (maybe using Context Stores?) but it didn't seem feasible... Would be very happy to be corrected on this, though.
Just kidding, I did not realize you can just call static fields/methods directly from an abstract class...
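The realization above - that advice classes instantiated independently can still reach a static member directly - can be shown with a minimal stand-in (the classes below are invented for illustration; a StringBuilder stands in for the listener's ObjectMapper, which this snippet can't depend on):

```java
// Sketch: no reference wiring is needed between independently-instantiated
// classes when the shared object lives in a static field.
public class SharedMapperSketch {
  static class ListenerStandIn {
    // one shared instance, initialized once
    static final StringBuilder SHARED = new StringBuilder();
  }

  static class PlanUtilsStandIn {
    String render(String s) {
      // Touch the static field directly instead of holding a reference
      ListenerStandIn.SHARED.setLength(0);
      return ListenerStandIn.SHARED.append("[").append(s).append("]").toString();
    }
  }
}
```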
// Use reflection rather than native `instanceof` for classes added in later Spark versions
private boolean instanceOf(Object value, Class[] classes) {
  for (Class cls : classes) {
    if (cls != null && cls.isInstance(value)) {
To avoid any class loading issue, it may make sense to compare class names only
I agree, but I'm only concerned about how we would deal with inheritance, since we compare parent classes fairly frequently (e.g. BuildSide, Attribute, BroadcastMode, etc.). I'm not sure it makes the most sense (especially as Spark evolves) to enumerate all the child classes explicitly, but maybe we could use reflection to traverse the tree? Not sure if that negates the benefits :/
Ended up settling on String comparison with traversal of interfaces & parent classes to avoid class loading errors. Added a negative cache to avoid repeatedly traversing classes we know we won't find anything in - let me know if that makes sense! 047880a
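The approach settled on here can be sketched as follows: compare class *names* while walking superclasses and interfaces, so the target classes never need to be loaded, and remember classes that matched nothing so they are not re-traversed. All names are assumptions, not the PR's code, and the cache assumes a fixed target set:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: name-based instanceOf with hierarchy traversal and a negative cache.
public class NameBasedInstanceOf {
  private final Set<Class<?>> negativeCache = new HashSet<>();

  public boolean isInstanceByName(Object value, Set<String> targetClassNames) {
    return value != null && matches(value.getClass(), targetClassNames);
  }

  private boolean matches(Class<?> cls, Set<String> targetClassNames) {
    if (cls == null || negativeCache.contains(cls)) {
      return false;
    }
    if (targetClassNames.contains(cls.getName())) {
      return true;
    }
    for (Class<?> iface : cls.getInterfaces()) {
      if (matches(iface, targetClassNames)) {
        return true;
      }
    }
    if (matches(cls.getSuperclass(), targetClassNames)) {
      return true;
    }
    negativeCache.add(cls); // known miss: skip this subtree next time
    return false;
  }
}
```

Because the check works on names alone, it behaves the same whether or not the target Spark classes are present on the classpath.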
public static class SparkPlanInfoAdvice {
  @Advice.OnMethodExit(suppress = Throwable.class, onThrowable = Throwable.class)
  public static void exit(
Thinking out loud: would it make sense to hide this behind a feature flag in config? Reasons to do so:
- we can let it bake on Datadog-internal jobs for some time,
- users can turn it off in case of unexpected issues.
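The suggested kill switch could look something like the sketch below: gate the new parsing work behind a toggle that defaults to off. The property name is invented for illustration; the actual PR would wire this through the tracer's Config machinery:

```java
// Sketch: a config-driven feature flag, default off, read from a system property.
public class PlanParsingFlag {
  public static boolean planParsingEnabled() {
    return Boolean.parseBoolean(System.getProperty("dd.spark.plan.parsing.enabled", "false"));
  }
}
```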
Agreed, added! 55b917b
if (unparsed.size() > 0) {
  // For now, place what we can't parse here with the types so we're aware of them
  args.put("_dd.unparsed", unparsed);
Would it make sense to create a separate SafeSparkPlanSerializer class? Seems like this is what it's all about.
Just to clarify, do we mean to use SafeSparkPlanSerializer to store the list of parsed and unparsed objects to then serialize? Or for it to handle the actual serialization (e.g. parsePlanProduct) itself?
This did make me realize some of the function names are outdated, though. Updated them (extractPlanProduct -> safeParseTreeNode, parsePlanProduct -> safeParseObjectToJson).
Clarified over a quick call; renamed the class to AbstractSparkPlanSerializer as well to address this!
What Does This Do
- Instruments the fromSparkPlan function to:
  - parse the plan parameter into a map of String properties
  - fill the meta field of the returned SparkPlanInfo with those properties
- Adds a Spark21XPlanUtils class with an extractPlanProduct method that parses a SparkPlan object and returns the properties as a <String, String> map
- Adds an AbstractSparkPlanUtils class with a parsePlanProduct method that parses the various Objects extracted by extractPlanProduct in Spark21XPlanUtils to return a comprehensible string representation
- Updates the toJson function in SparkSQLUtils to write a JSON object if possible, otherwise just write a string

Motivation
The SparkPlan houses additional details about its execution that are useful for operators to visualize. Extract these into spans so they can be ingested.
Additional Notes
This PR leverages the existing meta field in the SparkPlanInfo class. This should be safe, as we don't overwrite the field if any data exists, and it is currently only used for ScanExec node details. Furthermore, since this class appears to be intended primarily as an abstraction for informational purposes, any faulty updates to the object shouldn't result in any breaking issues.
Also note that we use the Product API to obtain the key names (using productElementName); however, this was only made available in Scala 2.13. As a result, the Scala 2.12 instrumentation uses arbitrary _dd.unknown_key.X names for the keys, so the values can at least be extracted.
Worth mentioning that this PR does not introduce traversal of the physical plan itself into the tracer - this is left to Spark itself. This is because the recursive fromSparkPlan method is instrumented, meaning as each node is built the tracer is invoked to parse it, and we expressly filter out any potential QueryPlan nodes when performing the parsing.
Contributor Checklist
- Assign the type: and (comp: or inst:) labels in addition to any useful labels
- Avoid using close, fix or any linking keywords when referencing an issue. Use solves instead, and assign the PR milestone to the issue

Jira ticket: DJM-974