@@ -44,16 +44,6 @@ The categorical data type is useful in the following cases:
4444* As a signal to other Python libraries that this column should be treated as a categorical
4545 variable (e.g. to use suitable statistical methods or plot types).
4646
47- .. note::
48-
49-     In contrast to R's `factor` function, categorical data is not converting input values to
50-     strings and categories will end up the same data type as the original values.
51-
52- .. note::
53-
54-     In contrast to R's `factor` function, there is currently no way to assign/change labels at
55-     creation time. Use `categories` to change the categories after creation time.
56-
5747See also the :ref:`API docs on categoricals<api.categorical>`.
5848
5949.. _categorical.objectcreation:
@@ -113,19 +103,18 @@ Categorical data has a specific ``category`` :ref:`dtype <basics.dtypes>`:
113103 DataFrame Creation
114104~~~~~~~~~~~~~~~~~~
115105
116- Columns in a ``DataFrame`` can be batch converted to categorical, either at the time of construction
117- or after construction. The conversion to categorical is done on a column by column basis; labels present
118- in a one column will not be carried over and used as categories in another column.
106+ Similar to the previous section where a single column was converted to categorical, all columns in a
107+ ``DataFrame`` can be batch converted to categorical either during or after construction.
119108
120- Columns can be batch converted by specifying ``dtype="category"`` when constructing a ``DataFrame``:
109+ This can be done during construction by specifying ``dtype="category"`` in the ``DataFrame`` constructor:
121110
122111.. ipython:: python
123112
124113    df = pd.DataFrame({'A': list('abca'), 'B': list('bccd')}, dtype="category")
125114    df.dtypes
126115
127- Note that the categories present in each column differ; since the conversion is done on a column by column
128- basis, only labels present in a given column are categories:
116+ Note that the categories present in each column differ; the conversion is done column by column, so
117+ only labels present in a given column are categories:
129118
130119.. ipython:: python
131120
@@ -135,15 +124,15 @@ basis, only labels present in a given column are categories:
135124
136125 .. versionadded:: 0.23.0
137126
138- Similarly, columns in an existing ``DataFrame`` can be batch converted using :meth:`DataFrame.astype`:
127+ Analogously, all columns in an existing ``DataFrame`` can be batch converted using :meth:`DataFrame.astype`:
139128
140129.. ipython:: python
141130
142131    df = pd.DataFrame({'A': list('abca'), 'B': list('bccd')})
143132    df_cat = df.astype('category')
144133    df_cat.dtypes
145134
146- This conversion is likewise done on a column by column basis:
135+ This conversion is likewise done column by column:
147136
148137.. ipython:: python
149138
@@ -191,7 +180,7 @@ are consistent among all columns.
191180 categories for each column, the ``categories`` parameter can be determined programmatically by
192181 ``categories = pd.unique(df.values.ravel())``.
193182
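The ``pd.unique(df.values.ravel())`` idiom above can be combined with a single ``CategoricalDtype`` so that every column ends up with the same categories. The following is a minimal sketch, assuming a small all-string frame like the one used earlier in this section; the names ``all_labels``, ``shared_dtype`` and ``df_shared`` are illustrative only.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

df = pd.DataFrame({"A": list("abca"), "B": list("bccd")})

# Collect every label that occurs anywhere in the frame,
# in order of first appearance (row by row).
all_labels = pd.unique(df.values.ravel())

# One dtype shared by all columns, so each column gets identical categories.
shared_dtype = CategoricalDtype(categories=all_labels)
df_shared = df.astype(shared_dtype)

print(df_shared["A"].cat.categories)
print(df_shared["B"].cat.categories)
```

Compared with ``df.astype('category')``, which infers categories per column, this keeps labels such as ``'d'`` available as a category in column ``A`` even though it never occurs there.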
194- If you already have `codes` and `categories`, you can use the
183+ If you already have ``codes`` and ``categories``, you can use the
195184:func:`~pandas.Categorical.from_codes` constructor to save the factorize step
196185during normal constructor mode:
197186
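As a rough illustration of the ``from_codes`` path, the sketch below builds a categorical directly from integer codes and a list of categories. The sample ``codes`` and ``categories`` values are made up for this example; a code of ``-1`` is used to mark a missing value.

```python
import pandas as pd

# Hypothetical inputs: integer positions into `categories`; -1 means missing.
codes = [0, 1, 0, 2, -1]
categories = ["a", "b", "c"]

# No factorization is needed because the codes are supplied directly.
cat = pd.Categorical.from_codes(codes, categories=categories)
s = pd.Series(cat)

print(s)
```

Each code is looked up in ``categories``, so the values come out as ``a, b, a, c`` followed by a missing entry.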
@@ -216,6 +205,16 @@ To get back to the original ``Series`` or NumPy array, use
216205    s2.astype(str)
217206    np.asarray(s2)
218207
208+ .. note::
209+
210+     In contrast to R's `factor` function, categorical data does not convert input values to
211+     strings; categories will end up the same data type as the original values.
212+
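A quick way to see that input values are not converted to strings is to inspect the dtype of the resulting categories. This is a small sketch with made-up integer and float inputs; the variable names are illustrative.

```python
import pandas as pd

# Integer input: the categories stay integers instead of becoming strings.
int_cat = pd.Categorical([3, 1, 2, 3])
print(int_cat.categories.dtype)

# Float input: the categories stay floats.
float_cat = pd.Categorical([0.5, 1.5, 0.5])
print(float_cat.categories.dtype)
```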
213+ .. note::
214+
215+     In contrast to R's `factor` function, there is currently no way to assign/change labels at
216+     creation time. Use `categories` to change the categories after creation time.
217+
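Changing labels after creation might look like the sketch below. Note that it uses ``Series.cat.rename_categories`` rather than assigning to the ``categories`` attribute mentioned in the note; that substitution is an assumption about which API is available in the installed pandas version, and the label names are invented for the example.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", "c"], dtype="category")

# Map old labels to new ones after creation; the result is a new Series
# and the original is left unchanged.
renamed = s.cat.rename_categories({"a": "low", "b": "mid", "c": "high"})

print(renamed)
```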
219218.. _categorical.categoricaldtype:
220219
221220CategoricalDtype