@jorisvandenbossche commented on Sep 9, 2024

Removes all remaining usages of "string[pyarrow_numpy]", the to-be-deprecated string alias for the future default string dtype. The alias itself will be deprecated in a follow-up PR.

xref #54792

@jorisvandenbossche added the Strings label (String extension data type and string data) on Sep 9, 2024
- obj = Series(["foo", "foo", None, "foo"], dtype=dtype)
+ obj = Series(["foo", "foo", None, "foo"], dtype=string_dtype_no_object)
result = obj.rank(method="first")
exp_dtype = "Int64" if string_dtype_no_object.na_value is pd.NA else "float64"
Member

Hmm, why are these Int64 for some dtypes now? I think the return type has to be float so that tie-breaking methods can produce fractional ranks.
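
To illustrate the tie-breaking concern, here is a minimal sketch using plain numeric and object-dtype Series (not the pyarrow-backed string dtypes from the diff, which is an assumption made to keep the example dependency-free). The variable names are illustrative only.

```python
import pandas as pd

# Under the default method="average", tied values share the mean of the
# positions they occupy, so ranks can be fractional.
numbers = pd.Series([10, 20, 20, 30])
average_ranks = numbers.rank()  # method="average" is the default

# method="first" breaks ties by order of appearance, so the ranks are whole
# numbers. For object-dtype strings the result is still float64, and the
# missing value stays missing (NaN).
strings = pd.Series(["foo", "foo", None, "foo"], dtype=object)
first_ranks = strings.rank(method="first")

print(average_ranks.tolist())
print(first_ranks.dtype)
```

With `method="first"` the values happen to be integral, which is why an integer result dtype can go unnoticed in a test like this one, while other methods would need a float result.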

Member Author

Ah, good point! That is then an existing bug in the ArrowExtensionArray implementation, because this test was already asserting Int64. I only rewrote it a bit; the removed parametrization a few lines above already used this dtype.
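
For reference, the expectation the test encodes can be sketched as follows. `expected_rank_dtype` is a hypothetical helper, not part of pandas or of this PR; it only repeats the conditional from the `exp_dtype` line in the diff, which the reviewer questions above.

```python
import numpy as np
import pandas as pd


def expected_rank_dtype(na_value):
    # Hypothetical helper: the same conditional as the exp_dtype line in the
    # diff. It describes what the test asserts, not what rank should return.
    return "Int64" if na_value is pd.NA else "float64"


# String dtypes that use pd.NA as their missing value are expected to give Int64.
na_backed = expected_rank_dtype(pd.StringDtype("python").na_value)

# String dtypes that use NaN as their missing value are expected to give float64.
nan_backed = expected_rank_dtype(np.nan)

print(na_backed, nan_backed)
```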

Member

Ah ok...interesting!

Member Author

Opened a separate PR to address this: #59768

Labels

backported, Strings (String extension data type and string data)
