Conversation

guowei2
Contributor

@guowei2 guowei2 commented Aug 29, 2014

With this PR, no job is committed when there are no inputs.

@AmplabJenkins

Can one of the admins verify this patch?

@guowei2
Contributor Author

guowei2 commented Aug 29, 2014

I am recommitting this because the last PR got complicated by mistake.

@SparkQA

SparkQA commented Sep 5, 2014

Can one of the admins verify this patch?

@rxin
Contributor

rxin commented Sep 27, 2014

@tdas can you take a look at this?

@tdas
Contributor

tdas commented Oct 1, 2014

This is not a good idea. Not returning an RDD can mess up a lot of the logic and semantics. For example, if there is a transform() followed by updateStateByKey(), the result will be unpredictable. updateStateByKey expects the previous batch to have a state RDD. If it does not find any state RDD, it will assume that this is the start of the streaming computation and effectively initialize again, forgetting the previous states from 2 batches ago. So this change is incorrect.
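To illustrate the concern, here is a minimal toy model in plain Python. It is not Spark code and not how updateStateByKey is implemented. The names `update_state` and `run`, the running-sum update, and the `skip_empty` flag are all hypothetical, and the sketch assumes that a batch with no upstream RDD produces no state RDD either.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, int]
Batch = List[Tuple[str, int]]


def update_state(prev_state: Optional[State], batch: Optional[Batch]) -> Optional[State]:
    """Hypothetical stateful step: keeps a running sum per key.

    prev_state is None when no state exists for the previous batch,
    in which case the computation starts from scratch.
    batch is None when the upstream step returned no RDD at all.
    """
    if batch is None:
        # Assumption: without an upstream RDD, no state is produced for this batch.
        return None
    # A missing previous state is treated as the start of the computation.
    state = dict(prev_state) if prev_state is not None else {}
    for key, value in batch:
        state[key] = state.get(key, 0) + value
    return state


def run(batches: List[Batch], skip_empty: bool) -> List[Optional[State]]:
    """Feed batches through update_state and record the state after each one.

    With skip_empty=True, an empty batch is modelled as "no RDD returned".
    """
    prev: Optional[State] = None
    history: List[Optional[State]] = []
    for batch in batches:
        upstream = None if (skip_empty and not batch) else batch
        prev = update_state(prev, upstream)
        history.append(prev)
    return history


batches = [[("a", 1)], [], [("a", 2)]]
with_empty_rdd = run(batches, skip_empty=False)
without_rdd = run(batches, skip_empty=True)
```

In this model, keeping an empty batch carries the state forward, so the third batch ends with `a` at 3. Dropping the empty batch loses the state, so the third batch starts over and ends with `a` at 2.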

@tdas
Contributor

tdas commented Nov 7, 2014

@guowei2 As I explained, this is not a good idea because it breaks the semantics of a state DStream. Mind closing this PR?

@asfgit asfgit closed this in f73b56f Nov 10, 2014

5 participants