-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
BUG: Add limit_area to EA ffill/bfill #56616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
doc/source/whatsnew/v2.2.0.rst
Outdated
| - :meth:`ExtensionArray._explode` interface method added to allow extension type implementations of the ``explode`` method (:issue:`54833`) | ||
| - :meth:`ExtensionArray.duplicated` added to allow extension type implementations of the ``duplicated`` method (:issue:`55255`) | ||
| - :meth:`Series.ffill`, :meth:`Series.bfill`, :meth:`DataFrame.ffill`, and :meth:`DataFrame.bfill` have gained the argument ``limit_area`` (:issue:`56492`) | ||
| - :meth:`Series.ffill`, :meth:`Series.bfill`, :meth:`DataFrame.ffill`, and :meth:`DataFrame.bfill` have gained the argument ``limit_area``; 3rd part :class:`.ExtensionArray` authors need to add this argument to the method ``_pad_or_backfill`` (:issue:`56492`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3rd party instead of 3rd part?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this into the ExtensionArray section as well
| stacklevel=find_stack_level(), | ||
| ) | ||
| if limit_area is not None: | ||
| raise NotImplementedError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only get here if 3rd party authors didn't implement this themselves, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right - this is only hit when EA author overrides .fillna but not ._pad_or_backfill. Currently pandas .ffill will call the EA's .fillna in such a case, which can only be correctly done when limit_area is None.
|
Friendly ping @phofl @jbrockmendel |
| npvalues = npvalues.copy() | ||
| new_mask = new_mask.copy() | ||
| elif limit_area is not None: | ||
| mask = mask.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking, but why do we need this copy here?
|
thx @rhshadrach |
…fill) (#56720) Backport PR #56616: BUG: Add limit_area to EA ffill/bfill Co-authored-by: Richard Shadrach <[email protected]>
|
Opened #56729 to track the deprecation |
limit_areapassed. #41813 (Replace xxxx with the GitHub issue number)doc/source/whatsnew/vX.X.X.rstfile if fixing a bug or adding a new feature.Follow up to #56531.
This is slightly complicated by the
EA.fillnadeprecation.EA.fillnais implemented, we raise wheneverlimit_areais not NoneEA._pad_or_backfillis implemented and does not have alimit_areaarg, we raise whenlimit_areais not NoneMy plan is to introduce a deprecation in 3.1 whenever
_pad_or_backfillgets called (regardless of whetherlimit_areaisNone) informing EA authors they need to add this argument. Assuming this is a good way to do, I'll open a tracking issue for this.For dtypes where the corresponding
maskis not used after filling (e.g.NDArrayBackedExtensionArray), this takes a shortcut by modifying mask prior to filling. Compared to implementinglimit_areaafter filling, this saves 1 or 2 copies (depending on the EA) and extra computation, at the cost of makingmaskno longer necessarily represent NA values.