Skip to content

Conversation

@kayousterhout
Copy link
Contributor

The pendingPartitions instance variable should be moved to ShuffleMapStage,
because it is only used by ShuffleMapStages. This change is purely refactoring
and does not change functionality.

I fixed this in an attempt to clarify some of the discussion around #16620, which I was having trouble reasoning about. I stole the helpful comment Imran wrote for pendingPartitions and used it here.

cc @squito @markhamstra @jinxing64

The pendingPartitions instance variable should be moved to ShuffleMapStage,
because it is only used by ShuffleMapStages. This change is purely refactoring
and does not change functionality.
@markhamstra
Copy link
Contributor

Makes good sense to me.

@kayousterhout
Copy link
Contributor Author

Thanks for the quick look @markhamstra !

@markhamstra
Copy link
Contributor

You're welcome -- but do be aware that I'm going to be extremely busy with non-Spark stuff for at least the next week, so for awhile my Spark code reviews are likely to be more cursory than they should be.

@jinxing64
Copy link

It's great to have pendingPartitions in ShuffleMapStage.

@SparkQA
Copy link

SparkQA commented Feb 10, 2017

Test build #72662 has finished for PR 16876 at commit 3e9af42.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

stage.pendingPartitions ++= tasks.map(_.partitionId)
logDebug("New pending partitions: " + stage.pendingPartitions)
logInfo(s"Submitting ${tasks.size} missing tasks (for partitions " +
s"${tasks.map(_.partitionId)}) from $stage (${stage.rdd})")
Copy link
Contributor

@squito squito Feb 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think including all the partitions could make this log line a little too long for an info message (what if there are 1k or 10k tasks). I understand not wanting to stick it in a debug message (you often dont' realize you want debug logging on until after something has failed), but I think this will be too much for regular runs.

Would it still be useful if you truncate to the first 25 tasks? tasks.take(25).map(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good point. 25 seemed like maybe too many for an info-level message so I split the difference with your original comment and did 15. I could also just leave the original logging if you think that's better.

Copy link
Contributor

@squito squito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other than the logging issue, looks good. definitely a useful cleanup.

@squito
Copy link
Contributor

squito commented Feb 10, 2017

lgtm when tests pass

@SparkQA
Copy link

SparkQA commented Feb 10, 2017

Test build #72684 has finished for PR 16876 at commit 8fdecda.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jinxing64
Copy link

jinxing64 commented Feb 10, 2017

@kayousterhout
It's great to give a definition of pendingPartitions in ShuffleMapStage.
May I ask few questions and make my understanding about pendingPartitions clear ? It means:

  1. PartitionId gets removed from pendingPartitions because of real success(not including ones from failed executors);
  2. pendingPartitions is the exact mirror(in reverse) of ShuffleMapStage 's output locations. Thus when pendingPartition is empty, all output locations are available;
  3. When executor lost, we should check if there is need to add partitionId back to pendingPartitions accordingly.
  4. No need to consider resubmit stage when pendingPartitions is empty.

Is my understanding correct ? It'd great if you can help rectify my wrong points.

@kayousterhout
Copy link
Contributor Author

@jinxing64 I wrote a long comment on your other PR to attempt to explain (my current understanding of) these issues!

@kayousterhout
Copy link
Contributor Author

Thanks all for the review; merged this into master

@asfgit asfgit closed this in 0fbecc7 Feb 11, 2017
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 15, 2017
The pendingPartitions instance variable should be moved to ShuffleMapStage,
because it is only used by ShuffleMapStages. This change is purely refactoring
and does not change functionality.

I fixed this in an attempt to clarify some of the discussion around apache#16620, which I was having trouble reasoning about.  I stole the helpful comment Imran wrote for pendingPartitions and used it here.

cc squito markhamstra jinxing64

Author: Kay Ousterhout <[email protected]>

Closes apache#16876 from kayousterhout/SPARK-19537.
@kayousterhout kayousterhout deleted the SPARK-19537 branch April 11, 2017 22:03
xuanyuanking pushed a commit to xuanyuanking/spark that referenced this pull request Oct 19, 2017
The pendingPartitions instance variable should be moved to ShuffleMapStage,
because it is only used by ShuffleMapStages. This change is purely refactoring
and does not change functionality.

I fixed this in an attempt to clarify some of the discussion around apache#16620, which I was having trouble reasoning about.  I stole the helpful comment Imran wrote for pendingPartitions and used it here.

cc squito markhamstra jinxing64

Author: Kay Ousterhout <[email protected]>

Closes apache#16876 from kayousterhout/SPARK-19537.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants