
Conversation

@XuTingjun
Contributor

For long-running Spark applications (e.g. running for days / weeks), the Spark event log may grow to be very large.

I think grouping the event log by job is an acceptable resolution.

1. To group the event log, each application has two kinds of files: one meta file and many part files. We put the StageSubmitted / StageCompleted / TaskResubmit / TaskStart / TaskEnd / TaskGettingResult / JobStart / JobEnd events into the meta file, and all other events into the part files (a minimal sketch of this routing follows after this list). The event log then looks like this:
application_1439246697595_0001-meta
application_1439246697595_0001-part1
application_1439246697595_0001-part2

2. On the HistoryServer, every part file is treated as an application, and the meta file is replayed after each part file is replayed. Below is how a grouped application is displayed on the HistoryServer web UI:
[screenshot: grouped application listing in the HistoryServer web UI]
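
Here is a minimal sketch of the routing idea in Scala. It is a hypothetical illustration, not the actual patch: the `GroupedEventLogWriter` name, the file naming, and the byte-count rollover trigger are all invented, a real implementation would hook into Spark's `EventLoggingListener`, and `TaskResubmit` is omitted since its routing would be identical.

```scala
import java.io.{File, PrintWriter}

import org.apache.spark.scheduler._

// Hypothetical sketch: the lifecycle events named in the proposal go to a
// single "-meta" file; everything else goes to rolling "-part" files.
class GroupedEventLogWriter(appId: String, logDir: File, maxPartBytes: Long) {

  private val metaWriter = new PrintWriter(new File(logDir, s"$appId-meta"))
  private var partIndex = 1
  private var partBytes = 0L
  private var partWriter = newPartWriter()

  /** Route one already-JSON-serialized event to the meta or part file. */
  def log(event: SparkListenerEvent, json: String): Unit = event match {
    case _: SparkListenerStageSubmitted | _: SparkListenerStageCompleted |
         _: SparkListenerTaskStart | _: SparkListenerTaskEnd |
         _: SparkListenerTaskGettingResult |
         _: SparkListenerJobStart | _: SparkListenerJobEnd =>
      metaWriter.println(json)
    case _ =>
      partWriter.println(json)
      partBytes += json.length + 1  // account for the trailing newline
      if (partBytes >= maxPartBytes) {
        // Start a new part file once the current one is large enough.
        partWriter.close()
        partIndex += 1
        partBytes = 0L
        partWriter = newPartWriter()
      }
  }

  private def newPartWriter(): PrintWriter =
    new PrintWriter(new File(logDir, s"$appId-part$partIndex"))

  def close(): Unit = { metaWriter.close(); partWriter.close() }
}
```

With this layout, the HistoryServer can replay each part file followed by the meta file to reconstruct a group, as point 2 describes.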

@SparkQA

SparkQA commented Oct 23, 2015

Test build #44216 has finished for PR 9246 at commit 62c982b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 23, 2015

Test build #44219 has finished for PR 9246 at commit b8f2b3c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@markhamstra
Contributor

@XuTingjun you speak of an "acceptable resolution", but you haven't adequately described the problem you are trying to resolve. Yes, the event log can get long, but I'm not seeing why that is inherently a problem or why you can't "acceptably resolve" your problem by post-processing the event log while leaving that log as a single, unified stream of events.

@andrewor14
Contributor

@XuTingjun I think this is something good to fix. I've noticed that uncompressed event logs can amount to as much as 15GB for a 5-minute application. However, I think a lot of this functionality is already implemented, so we should reuse existing code where possible. In particular, have you looked at RollingFileAppender? We can use that and specify a RollingPolicy based on the number of bytes written. That seems to achieve what we want.
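
To illustrate, here is a minimal sketch of the size-based rolling idea in Scala. It is hypothetical, not the actual `RollingFileAppender` (which lives in `org.apache.spark.util.logging`, is private to Spark, and is driven by an input stream); the `RollingEventWriter` name and its rollover scheme are invented for illustration.

```scala
import java.io.{File, PrintWriter}

// Hypothetical sketch of size-based rolling, in the spirit of
// RollingFileAppender with a size-based RollingPolicy: once the active
// event log exceeds maxBytes, rename it with a timestamp suffix and
// start a fresh file.
class RollingEventWriter(activeFile: File, maxBytes: Long) {

  private var bytesWritten = 0L
  private var writer = new PrintWriter(activeFile)

  def writeEvent(json: String): Unit = {
    if (bytesWritten + json.length > maxBytes) rollover()
    writer.println(json)
    bytesWritten += json.length + 1  // include the trailing newline
  }

  private def rollover(): Unit = {
    writer.close()
    // Rolled-over files keep a timestamp suffix; old ones could then be
    // compressed or deleted to bound total disk usage.
    val rolledOver = new File(activeFile.getParentFile,
      s"${activeFile.getName}.${System.currentTimeMillis()}")
    activeFile.renameTo(rolledOver)
    writer = new PrintWriter(activeFile)
    bytesWritten = 0L
  }

  def close(): Unit = writer.close()
}
```

The HistoryServer would then replay the rolled-over files in order followed by the active file, which keeps each individual file small without changing the event stream itself.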

@andrewor14
Contributor

By the way, since this patch was opened many months ago, it is now mostly stale. If you plan to work on this, would you mind closing this patch and re-opening one that uses the RollingFileAppender? Hopefully the size of the diff will be much smaller then.

@XuTingjun closed this Dec 15, 2015
@satlal

satlal commented May 13, 2016

@XuTingjun Revisiting this thread. Since this patch seems to be abandoned, were you able to work around the large log file issue for long-running streaming jobs?

@eladamitpxi

+1. Large / infinitely growing event log files are quite a problem for long-running streaming jobs.
