Skip to content

Conversation

sunchao
Copy link
Member

@sunchao sunchao commented Sep 1, 2020

What changes were proposed in this pull request?

This is a follow-up on SPARK-32721 and PR #29567. In the previous PR we missed two more cases that can be optimized:

if(p, false, null) ==> and(not(p), null)
if(p, true, null) ==> or(p, null)

Why are the changes needed?

By transforming if to boolean conjunctions or disjunctions, we can enable more filter pushdown to datasources.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added unit tests.

@sunchao
Copy link
Member Author

sunchao commented Sep 1, 2020

cc @dbtsai

case If(cond, l @ Literal(null, _), FalseLiteral) if !cond.nullable => And(cond, l)
case If(cond, l @ Literal(null, _), TrueLiteral) if !cond.nullable => Or(Not(cond), l)
case If(cond, FalseLiteral, l @ Literal(null, _)) if !cond.nullable => And(Not(cond), l)
case If(cond, TrueLiteral, l @ Literal(null, _)) if !cond.nullable => Or(cond, l)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor, can we call l as nullLiteral?

      case If(cond, nullLiteral @ Literal(null, _), FalseLiteral) if !cond.nullable => And(cond, nullLiteral)
      case If(cond, nullLiteral @ Literal(null, _), TrueLiteral) if !cond.nullable => Or(Not(cond), nullLiteral)
      case If(cond, FalseLiteral, nullLiteral @ Literal(null, _)) if !cond.nullable => And(Not(cond), nullLiteral)
      case If(cond, TrueLiteral, nullLiteral @ Literal(null, _)) if !cond.nullable => Or(cond, nullLiteral)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that will break each of these into two lines which may affect readability, also I think given these are already Literal(null, _), naming the variable to nullLiteral doesn't add much value maybe?

assertEquivalent(If(p, nullLiteral, FalseLiteral), And(p, nullLiteral))
assertEquivalent(If(p, nullLiteral, TrueLiteral), Or(IsNotNull('a), nullLiteral))
assertEquivalent(If(p, FalseLiteral, nullLiteral), And(IsNotNull('a), nullLiteral))
assertEquivalent(If(p, TrueLiteral, nullLiteral), Or(IsNull('a), nullLiteral))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor, make it consistent with the first test. Either we use p for IsNull('a) or in first test, we use IsNull('a).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Member

@dbtsai dbtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just minor comment.

@SparkQA
Copy link

SparkQA commented Sep 1, 2020

Test build #128124 has finished for PR 29603 at commit 978bad9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dbtsai
Copy link
Member

dbtsai commented Sep 1, 2020

Merged into master. Thanks @sunchao @cloud-fan @viirya

@dbtsai dbtsai closed this in 94d313b Sep 1, 2020
@SparkQA
Copy link

SparkQA commented Sep 1, 2020

Test build #128136 has finished for PR 29603 at commit 34e3d38.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

holdenk pushed a commit to holdenk/spark that referenced this pull request Oct 27, 2020
### What changes were proposed in this pull request?

This is a follow-up on SPARK-32721 and PR apache#29567. In the previous PR we missed two more cases that can be optimized:
```
if(p, false, null) ==> and(not(p), null)
if(p, true, null) ==> or(p, null)
```

### Why are the changes needed?

By transforming if to boolean conjunctions or disjunctions, we can enable more filter pushdown to datasources.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added unit tests.

Closes apache#29603 from sunchao/SPARK-32721-2.

Authored-by: Chao Sun <[email protected]>
Signed-off-by: DB Tsai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants