@stanzhai (Contributor) commented Feb 9, 2017

What changes were proposed in this pull request?

If a column of a table contains only null values, the following SQL throws an NPE: `select count(1) from test group by e grouping sets(e)`.

The reason is that when ResolveGroupingAnalytics rewrites a GroupingSets node (via transformUp), it uses a nullBitmask to set the nullability of each attribute, so the nullability of an attribute may be changed from its original value.

This PR fixes the problem by setting the nullability of every attribute in the group by expressions to true.
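
The failure mode described above can be illustrated with a small toy model in plain Python. This is only a sketch and not Spark's code: `Attribute`, `bitmask_only`, `all_nullable`, `keep_original_or_bitmask`, and `read_length` are hypothetical names, and the bit convention (bit `i` set means column `i` is nulled out by some grouping set) is an assumption chosen for the illustration and may not match Spark's. It shows how deriving nullability from a bitmask alone can mark a nullable column as non-nullable, so that code which skips the null check then fails on a null value.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for a column reference with a nullability flag."""
    name: str
    nullable: bool


def bitmask_only(attrs: List[Attribute], null_bitmask: int) -> List[Attribute]:
    # Problematic variant: nullability comes from the bitmask alone,
    # so the column's own nullability is overwritten.
    return [replace(a, nullable=bool((null_bitmask >> i) & 1))
            for i, a in enumerate(attrs)]


def all_nullable(attrs: List[Attribute]) -> List[Attribute]:
    # The approach this PR describes: mark every grouping attribute nullable.
    return [replace(a, nullable=True) for a in attrs]


def keep_original_or_bitmask(attrs: List[Attribute], null_bitmask: int) -> List[Attribute]:
    # Alternative: also take the column's own nullability into account.
    return [replace(a, nullable=a.nullable or bool((null_bitmask >> i) & 1))
            for i, a in enumerate(attrs)]


def read_length(value: Optional[str], attr: Attribute) -> Optional[int]:
    # Mimics code that omits the null check for non-nullable attributes.
    if attr.nullable and value is None:
        return None
    return len(value)  # raises TypeError if value is None


# Column e is nullable and appears in every grouping set, so its bit is 0.
e = Attribute("e", nullable=True)
broken = bitmask_only([e], null_bitmask=0)[0]
fixed = all_nullable([e])[0]

try:
    read_length(None, broken)
    failed = False
except TypeError:  # Python's analogue of the NPE in this toy model
    failed = True

print(broken.nullable, fixed.nullable, failed)
```

Marking everything nullable is the simpler of the two fixes; the `keep_original_or_bitmask` variant keeps non-nullable columns non-nullable when no grouping set nulls them out.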

PR #15484 has already fixed this problem in the master branch.

We also need to fix this problem in branch-2.1.

How was this patch tested?

Tested with Hive in my environment.

@stanzhai changed the title from "[SPARK-19509][SQL][branch-2.1]Fix a NPE problem in grouping sets when using an empty column" to "[SPARK-19509][SQL]Fix a NPE problem in grouping sets when using an empty column" on Feb 9, 2017
@AmplabJenkins

Can one of the admins verify this patch?

asfgit pushed a commit that referenced this pull request Feb 9, 2017
…umns

## What changes were proposed in this pull request?
The analyzer currently does not check whether a column used in grouping sets is itself nullable. This can make the column's nullability incorrect, which can cause null pointer exceptions down the line. This PR fixes that by also considering the nullability of the column.

This is only a problem for Spark 2.1 and below. The latest master uses a different approach.

Closes #16874

## How was this patch tested?
Added a regression test to `SQLQueryTestSuite.grouping_set`.

Author: Herman van Hovell <[email protected]>

Closes #16873 from hvanhovell/SPARK-19509.
asfgit pushed a commit that referenced this pull request Feb 9, 2017
…umns

(cherry picked from commit a3d5300)
Signed-off-by: Herman van Hovell <[email protected]>
@hvanhovell
Contributor

@stanzhai I have merged my PR, and assigned the PR to your name. Could you close this?

@stanzhai stanzhai closed this Feb 10, 2017