## What changes were proposed in this pull request?
This PR is a follow-up to apache#24286. As gatorsmile pointed out, the estimate is also inaccurate for columns that contain null values.
```
spark-sql> select key from test;
2
NULL
1
spark-sql> desc extended test key;
col_name key
data_type int
comment NULL
min 1
max 2
num_nulls 1
distinct_count 2
```
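When a column contains null values, the distinct count used for aggregate estimation should be distinct_count + 1, because NULL is an additional value that distinct_count does not include (above, distinct_count is 2 while the column holds 1, 2, and NULL).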
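The adjustment can be sketched as follows. This is a minimal illustration, not the code in this patch: `ColStats`, `GroupCountSketch`, and `groupCount` are hypothetical names, and the sketch assumes the estimator counts one group per distinct non-null value plus one more group when any row is null.

```scala
// Hypothetical stand-in for a column's statistics; this is NOT Spark's ColumnStat class.
final case class ColStats(distinctCount: BigInt, nullCount: BigInt)

object GroupCountSketch {
  // Estimated number of groups when grouping by this single column:
  // one per distinct non-null value, plus one for NULL if the column has nulls.
  def groupCount(stats: ColStats): BigInt =
    if (stats.nullCount > 0) stats.distinctCount + 1
    else stats.distinctCount
}
```

With the statistics from the example above (distinct_count 2, num_nulls 1), `GroupCountSketch.groupCount(ColStats(2, 1))` evaluates to 3, while a column without nulls keeps its distinct count unchanged.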
## How was this patch tested?
Existing tests, plus a newly added unit test.
Closes apache#24436 from pengbo/aggregation_estimation.
Authored-by: pengbo <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/AggregateEstimation.scala