-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-3377] [SPARK-3610] Metrics can be accidentally aggregated / History server log name should not be based on user input #2432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…d and driver/executor-id
… BlockManagerSource" This reverts commit 71609f5.
…cture-improvement
…ause the instance of SparkContext is no longer used
…turn null when correspondin entry is absent
…cture-improvement
…cture-improvement
…cture-improvement
…cture-improvement
…ark into metrics-structure-improvement
…cture-improvement
…cture-improvement
…cture-improvement
This reverts commit e4a4593.
…cture-improvement
…cture-improvement
…cture-improvement
|
QA tests have started for PR 2432 at commit
|
|
QA tests have finished for PR 2432 at commit
|
|
QA tests have started for PR 2432 at commit
|
|
QA tests have finished for PR 2432 at commit
|
|
retest this please. |
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is > 100 chars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I know, but scalastyle for Spark except for import statements. In fact, lots of code in Spark have 100+ columns import statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm aware... I don't know if that's a good thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
O.K. For now, I indent the line with 2 spaces.
|
Hey @sarutak I left a few more minor comments. Otherwise this LGTM. |
|
QA tests have started for PR 2432 at commit
|
|
Thanks @andrewor14 I'm waiting for tests. |
|
QA tests have finished for PR 2432 at commit
|
|
Test FAILed. |
Replaced getApplicationId with applicationId in SparkContext Replaced == with === in MetricsSystemSuite
…cture-improvement2
2cc09aa to
3288b2b
Compare
|
QA tests have started for PR 2432 at commit
|
|
Ok, this is ready to go from my perspective. Merging once the tests pass. Thanks @sarutak. |
|
QA tests have finished for PR 2432 at commit
|
|
Test PASSed. |
|
Alright, this went into master. There were too many merge conflicts for this to also go into 1.1. @sarutak if you feel inclined feel free to open another one against that branch. |
|
@andrewor14 O.K, I'll open another PR for 1.1. Thanks. |
|
This broken the yarn-alpha build. Please make sure to update YarnAllocationHandler for it also if you do any other prs |
|
Oops... I'll fix soon. |
|
@tgravescs Sorry for having you waiting. I've fixed the issue at #2715 . |
yarn alpha build was broken by #2432 as it added an argument to YarnAllocator but not to yarn/alpha YarnAllocationHandler commit 79e45c9 Author: Kousuke Saruta <[email protected]> Closes #2715 from sarutak/SPARK-3848 and squashes the following commits: bafb8d1 [Kousuke Saruta] Fixed parameters for the default constructor of alpha/YarnAllocatorHandler.
This PR is another solution for #2250
I'm using codahale base MetricsSystem of Spark with JMX or Graphite, and I saw following 2 problems.
(1) When applications which have same spark.app.name run on cluster at the same time, some metrics names are mixed. For instance, if 2+ application is running on the cluster at the same time, each application emits the same named metric like "SparkPi.DAGScheduler.stage.failedStages" and Graphite cannot distinguish the metrics is for which application.
(2) When 2+ executors run on the same machine, JVM metrics of each executors are mixed. For instance, 2+ executors running on the same node can emit the same named metric "jvm.memory" and Graphite cannot distinguish the metrics is from which application.
And there is an similar issue. The directory for event logs is named using application name.
Application name is defined by user and the name can includes illegal character for path names.
Further more, the directory name consists of application name and System.currentTimeMillis even though each application has unique Application ID so if we run jobs which have same name, it's difficult to identify which directory is for which application.
Closes #2250
Closes #1067