Skip to content

Conversation

liuzqt
Copy link
Contributor

@liuzqt liuzqt commented Oct 7, 2024

What changes were proposed in this pull request?

Replace TreeNode.allChildren with identity map (which was Set previously) to avoid hashcode lazy evalution.

Why are the changes needed?

We hit a deadlock between shuffle dependency initialization and explain string generation, in which both code paths trigger lazy variable instantiation while visiting some tree nodes in different orders and thus acquiring object locks in a reversed order.

The hashcode of plan node is implemented as lazy val. So the fix is to remove hash code computation from explain string generation so to break the chain of lazy variable instantiation.

TreeNode.allChildren is only used in explain string generation and only require identity equality. This should be also a small perforamance improvement BTW.

Does this PR introduce any user-facing change?

NO

How was this patch tested?

Current UTs

Was this patch authored or co-authored using generative AI tooling?

NO

@github-actions github-actions bot added the SQL label Oct 7, 2024
@cloud-fan
Copy link
Contributor

the pyspark failure is unrelated, thanks, merging to master!

@cloud-fan cloud-fan closed this in d8aca18 Oct 8, 2024
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
### What changes were proposed in this pull request?

Replace `TreeNode.allChildren` with identity map (which was `Set` previously) to avoid hashcode lazy evalution.

### Why are the changes needed?

We hit a deadlock between shuffle dependency initialization and explain string generation, in which both code paths trigger lazy variable instantiation while visiting some tree nodes in different orders and thus acquiring object locks in a reversed order.

The hashcode of plan node is implemented as lazy val. So the fix is to remove hash code computation from explain string generation so to break the chain of lazy variable instantiation.

`TreeNode.allChildren` is only used in explain string generation and only require identity equality. This should be also a small perforamance improvement BTW.

### Does this PR introduce _any_ user-facing change?
NO

### How was this patch tested?
Current UTs

### Was this patch authored or co-authored using generative AI tooling?
NO

Closes apache#48375 from liuzqt/SPARK-49852.

Authored-by: Ziqi Liu <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants