Boolean operations on Series (object dtype) with NaNs retain NaNs: breaks boolean indexing #982

@lbeltrame

Case in point:

In [14]: example = pandas.Series(["Activated", "Inhibited", np.nan], dtype="object")

In [15]: example == "Inhibited"
Out[15]: 
0    False
1     True
2      NaN

This breaks boolean indexing:

example[example == "Inhibited"]
/usr/lib64/python2.7/site-packages/pandas/core/series.pyc in __getitem__(self, key)
    392         # special handling of boolean data with NAs stored in object
    393         # arrays. Since we can't represent NA with dtype=bool
--> 394         if _is_bool_indexer(key):
    395             key = self._check_bool_indexer(key)
    396             key = np.asarray(key, dtype=bool)

/usr/lib64/python2.7/site-packages/pandas/core/common.pyc in _is_bool_indexer(key)
    321         if not lib.is_bool_array(key):
    322             if isnull(key).any():
--> 323                 raise ValueError('cannot index with vector containing '
    324                                  'NA / NaN values')
    325             return False

ValueError: cannot index with vector containing NA / NaN values
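
A possible workaround, sketched under assumptions rather than tested on the pandas version in the traceback above: build the mask so that missing entries are explicitly False before indexing. The sketch assumes a recent pandas where `Series.notna` exists; the names `mask` and `selected` are only illustrative.

```python
import numpy as np
import pandas as pd

example = pd.Series(["Activated", "Inhibited", np.nan], dtype="object")

# notna() yields a plain boolean Series, so combining it with the
# comparison forces missing entries to False instead of leaving a NaN.
mask = example.notna() & (example == "Inhibited")

# Cast defensively so the indexer is guaranteed to be dtype=bool.
mask = mask.astype(bool)

# Boolean indexing now sees no NA / NaN values in the indexer.
selected = example[mask]
print(selected)
```

Whether a missing value should count as "not equal" is a design choice; here it is assumed that rows with NaN should simply be excluded from the selection.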


Labels: Bug, Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)
