Closed
Labels
Enhancement, Groupby, NA - MaskedArrays (related to pd.NA and nullable extension arrays)
Description
As with normal reductions (e.g. #30982), we should investigate adding masked-array-specific support to the groupby algorithms.
Currently, a nullable extension array is converted to a NumPy array before being passed to the Cython algorithm (e.g. integers with missing values are typically cast to float with NaN).
Supporting a mask argument in the Cython algos can improve groupby support for nullable dtypes.
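To illustrate why passing a mask can help, here is a minimal pure-NumPy sketch, not the actual Cython implementation. The helper names `masked_group_sum` and `float_group_sum` are hypothetical, and the inputs (`values`, `mask`, `codes`) only mimic the data/mask pair of a nullable integer array plus group codes. It compares a mask-aware group sum with the float/NaN round trip described above, which can lose precision for large integers.

```python
import numpy as np


def masked_group_sum(values, mask, codes, ngroups):
    """Hypothetical helper: per-group sum that skips masked (missing) entries.

    values : int64 data; entries under the mask hold arbitrary filler
    mask   : bool array, True where the value is missing
    codes  : group code per row, -1 meaning "no group"
    """
    out = np.zeros(ngroups, dtype=values.dtype)
    valid = ~mask & (codes >= 0)
    # Unbuffered add so repeated group codes accumulate correctly.
    np.add.at(out, codes[valid], values[valid])
    return out


def float_group_sum(values, mask, codes, ngroups):
    """Hypothetical helper: same sum via the cast-to-float-with-NaN route."""
    as_float = values.astype(np.float64)
    as_float[mask] = np.nan
    out = np.zeros(ngroups, dtype=np.float64)
    valid = ~np.isnan(as_float) & (codes >= 0)
    np.add.at(out, codes[valid], as_float[valid])
    return out


big = 2**53 + 1  # not exactly representable as float64
values = np.array([big, 1, 5], dtype=np.int64)
mask = np.array([False, True, False])
codes = np.array([0, 0, 1])

masked_result = masked_group_sum(values, mask, codes, ngroups=2)
float_result = float_group_sum(values, mask, codes, ngroups=2)

print(masked_result[0] == big)      # integer path keeps the exact value
print(int(float_result[0]) == big)  # float path has rounded it
```

In this sketch the mask-aware path stays in int64 throughout, so the dtype and the exact value are preserved; the float path is only shown for comparison.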
- any/all (correctly pass mask + add Kleene logic) -> ENH/BUG: Use Kleene logic for groupby any/all #40819
- cummin/cummax -> PERF/BUG: use masked algo in groupby cummin and cummax #40651
- cumsum/cumprod -> ENH: Support mask in GroupBy.cumsum #48070, ENH: Support mask in groupby cumprod #48138
- sum/prod/var/mean
- min/max -> BUG: Groupby min/max with nullable dtypes #42567
- median
- ohlc (ENH: Add support for groupby.ohlc for ea dtypes #48081)
- quantile (correctly pass mask)
- last/nth (last: PERF: support mask in group_last #46107)
- rank (ENH: support mask in libalgos.rank #46932)
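The any/all item above mentions Kleene logic. As a rough sketch of what that means for a grouped `any` when missing values are not skipped: a known `True` decides the group, otherwise any missing value makes the result unknown, otherwise the result is `False`. The function name `kleene_group_any` and the `(result, result_mask)` return convention are assumptions made for illustration, not the pandas internals.

```python
import numpy as np


def kleene_group_any(values, mask, codes, ngroups):
    """Hypothetical helper: grouped `any` with Kleene (three-valued) logic.

    Returns (result, result_mask); result_mask[g] is True where the
    answer for group g is unknown (NA).
    """
    result = np.zeros(ngroups, dtype=bool)
    result_mask = np.zeros(ngroups, dtype=bool)
    for val, is_na, g in zip(values, mask, codes):
        if g < 0:
            continue  # row does not belong to any group
        if is_na:
            result_mask[g] = True  # provisionally unknown
        elif val:
            result[g] = True
    # True | NA is True, so a known True overrides the unknown state.
    result_mask &= ~result
    return result, result_mask


values = np.array([True, False, False, False])
mask = np.array([False, True, True, False])
codes = np.array([0, 0, 1, 2])

any_result, any_mask = kleene_group_any(values, mask, codes, ngroups=3)
print(any_result)  # group 0 is True; groups 1 and 2 are False in the data
print(any_mask)    # only group 1 (a lone missing value) is unknown
```

A grouped `all` would follow the mirrored rule: a known `False` decides the group, and a missing value only makes the result unknown when no `False` was seen.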