@@ -702,7 +702,8 @@ on an entire ``DataFrame`` or ``Series``, row- or column-wise, or elementwise.
702702
7037031. `Tablewise Function Application`_: :meth:`~DataFrame.pipe`
7047042. `Row or Column-wise Function Application`_: :meth:`~DataFrame.apply`
705- 3. Elementwise_ function application: :meth:`~DataFrame.applymap`
705+ 3. `Aggregation API`_: :meth:`~DataFrame.agg` and :meth:`~DataFrame.transform`
706+ 4. `Applying Elementwise Functions`_: :meth:`~DataFrame.applymap`
706707
707708.. _basics.pipe:
708709
@@ -778,6 +779,13 @@ statistics methods, take an optional ``axis`` argument:
778779 df.apply(np.cumsum)
779780 df.apply(np.exp)
780781
782+ ``.apply()`` will also dispatch on a string method name.
783+
784+ .. ipython:: python
785+
786+ df.apply('mean')
787+ df.apply('mean', axis=1)
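The string dispatch described above can be sketched as follows. This is a minimal illustration, assuming a small hypothetical frame `df` (not the one used in the surrounding docs); the names `col_means` and `row_means` are introduced only for this example.

```python
import pandas as pd

# Hypothetical two-column frame, used only for this illustration.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

# A string is looked up as a method name, so this matches df.mean().
col_means = df.apply("mean")

# With axis=1 the same method is applied across each row instead.
row_means = df.apply("mean", axis=1)

print(col_means)
print(row_means)
```

Passing the name as a string is mainly a convenience when the function to apply is chosen at runtime, for example from configuration.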
788+
781789 Depending on the return type of the function passed to :meth:`~DataFrame.apply`,
782790the result will either be of lower dimension or the same dimension.
783791
@@ -827,16 +835,234 @@ set to True, the passed function will instead receive an ndarray object, which
827835has positive performance implications if you do not need the indexing
828836functionality.
829837
830- .. seealso ::
838+ .. _basics.aggregate:
839+
840+ Aggregation API
841+ ~~~~~~~~~~~~~~~
842+
843+ .. versionadded:: 0.20.0
844+
845+ The aggregation API lets you express one or more aggregation operations in a single, concise call.
846+ This API is similar across pandas objects; see :ref:`groupby aggregates <groupby.aggregate>`,
847+ :ref:`window functions <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
848+
849+ We will use a frame similar to the one above.
850+
851+ .. ipython:: python
852+
853+ tsdf = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],
854+                     index=pd.date_range('1/1/2000', periods=10))
855+ tsdf.iloc[3:7] = np.nan
856+ tsdf
857+
858+ Using a single function is equivalent to ``.apply``; you can also pass named methods as strings.
859+ Either form returns a Series of the aggregated output.
860+
861+ .. ipython:: python
862+
863+ tsdf.agg(np.sum)
864+
865+ tsdf.agg('sum')
866+
867+ # these are equivalent to a ``.sum()`` because we are aggregating on a single function
868+ tsdf.sum()
869+
870+ On a Series, this will result in a scalar value.
871+
872+ .. ipython:: python
873+
874+ tsdf.A.agg('sum')
875+
876+
877+ Aggregating multiple functions at once
878+ ++++++++++++++++++++++++++++++++++++++
879+
880+ You can pass multiple aggregation functions as a list. The result of each passed function will be a row in the resulting DataFrame.
881+ The rows are named after the aggregation functions.
882+
883+ .. ipython:: python
884+
885+ tsdf.agg(['sum'])
886+
887+ Multiple functions yield multiple rows.
831888
832- The section on :ref:`GroupBy <groupby>` demonstrates related, flexible
833- functionality for grouping by some criterion, applying, and combining the
834- results into a Series, DataFrame, etc.
889+ .. ipython:: python
890+
891+ tsdf.agg(['sum', 'mean'])
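As a small worked example of the list form, the sketch below uses a hypothetical frame `frame` with one missing value, so the numbers can be checked by hand. The variable names are assumptions made for this illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; column A contains one missing value.
frame = pd.DataFrame({"A": [1.0, 2.0, np.nan], "B": [4.0, 5.0, 6.0]})

# Each function in the list becomes one row, labelled by the function name.
summary = frame.agg(["sum", "mean"])

# Missing values are skipped, so A sums to 3.0 and averages 1.5.
print(summary)
```

Rows of the result can then be selected by function name, for example `summary.loc["mean"]`.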
892+
893+ On a Series, multiple functions return a Series, indexed by the function names.
894+
895+ .. ipython:: python
896+
897+ tsdf.A.agg(['sum', 'mean'])
898+
899+
900+ Aggregating with a dict of functions
901+ ++++++++++++++++++++++++++++++++++++
902+
903+ Passing a dictionary that maps column names to a function or a list of functions to ``DataFrame.agg``
904+ allows you to customize which functions are applied to which columns.
905+
906+ .. ipython:: python
907+
908+ tsdf.agg({'A': 'mean', 'B': 'sum'})
909+
910+ Passing a list-like as a dictionary value will generate a DataFrame output. You will get a matrix-like output
911+ of all of the aggregators; column and function pairs that were not requested will hold missing values.
912+
913+ .. ipython:: python
914+
915+ tsdf.agg({'A': ['mean', 'min'], 'B': 'sum'})
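The two dict forms above can be compared side by side. This is a minimal sketch, assuming a hypothetical frame `frame` with no missing data; `per_column` and `mixed` are names chosen for this example.

```python
import pandas as pd

# Hypothetical frame used only for this illustration.
frame = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

# One function per column: the result is a Series indexed by column name.
per_column = frame.agg({"A": "mean", "B": "sum"})

# A list for at least one column: the result is a DataFrame with one row per
# function name; column/function pairs that were not requested are NaN.
mixed = frame.agg({"A": ["mean", "min"], "B": "sum"})

print(per_column)
print(mixed)
```

Here `mixed.loc["sum", "A"]` is missing because `sum` was requested only for column `B`.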
835916
836- .. _Elementwise :
917+ For a Series, you can pass a dict. You will get back a MultiIndex Series; the outer level will
918+ be the keys, the inner the names of the functions.
919+
920+ .. ipython:: python
921+
922+ tsdf.A.agg({'foo': ['sum', 'mean']})
923+
924+ Alternatively, using multiple dictionary keys, you can rename the elements of the aggregation.
925+
926+ .. ipython:: python
927+
928+ tsdf.A.agg({'foo': 'sum', 'bar': 'mean'})
929+
930+ Multiple keys will yield a MultiIndex Series. The outer level will be the keys, the inner
931+ the names of the functions.
932+
933+ .. ipython:: python
934+
935+ tsdf.A.agg({'foo': ['sum', 'mean'], 'bar': ['min', 'max', lambda x: x.sum() + 1]})
936+
937+ .. _basics.aggregation.mixed_dtypes:
938+
939+ Mixed Dtypes
940+ ++++++++++++
941+
942+ When presented with mixed dtypes that cannot aggregate, ``.agg`` will only take the valid
943+ aggregations. This is similar to how groupby ``.agg`` works.
944+
945+ .. ipython:: python
837946
838- Applying elementwise Python functions
839- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
947+ mdf = pd.DataFrame({'A': [1, 2, 3],
948+                     'B': [1., 2., 3.],
949+                     'C': ['foo', 'bar', 'baz'],
950+                     'D': pd.date_range('20130101', periods=3)})
951+ mdf.dtypes
952+
953+ .. ipython:: python
954+
955+ mdf.agg(['min', 'sum'])
956+
957+ .. _basics.aggregation.custom_describe:
958+
959+ Custom describe
960+ +++++++++++++++
961+
962+ With ``.agg()`` it is possible to easily create a custom describe function, similar
963+ to the built-in :ref:`describe function <basics.describe>`.
964+
965+ .. ipython:: python
966+
967+ from functools import partial
968+
969+ q_25 = partial(pd.Series.quantile, q=0.25)
970+ q_25.__name__ = '25%'
971+ q_75 = partial(pd.Series.quantile, q=0.75)
972+ q_75.__name__ = '75%'
973+
974+ tsdf.agg(['count', 'mean', 'std', 'min', q_25, 'median', q_75, 'max'])
975+
976+ .. _basics.transform:
977+
978+ Transform API
979+ ~~~~~~~~~~~~~
980+
981+ .. versionadded:: 0.20.0
982+
983+ The ``transform`` method returns an object that is indexed the same (same size)
984+ as the original. This API allows you to provide *multiple* operations at the same
985+ time rather than one-by-one. Its API is quite similar to the ``.agg`` API.
986+
987+ We will use a frame similar to the one in the sections above.
988+
989+ .. ipython:: python
990+
991+ tsdf = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],
992+                     index=pd.date_range('1/1/2000', periods=10))
993+ tsdf.iloc[3:7] = np.nan
994+ tsdf
995+
996+ Transform the entire frame. ``.transform()`` accepts a NumPy function, a string
997+ function name, or a user-defined function.
998+
999+ .. ipython:: python
1000+
1001+ tsdf.transform(np.abs)
1002+ tsdf.transform('abs')
1003+ tsdf.transform(lambda x: x.abs())
1004+
1005+ Since only a single function is passed here, this is equivalent to a ufunc application.
1006+
1007+ .. ipython:: python
1008+
1009+ np.abs(tsdf)
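The equivalence between the three input styles and the plain ufunc call can be checked directly. This is a minimal sketch, assuming a small hypothetical frame `frame` with known values rather than the random `tsdf` used in the docs.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a mix of negative and positive values.
frame = pd.DataFrame({"A": [-1.0, 2.0, -3.0], "B": [0.5, -0.5, 1.5]})

by_ufunc = frame.transform(np.abs)               # NumPy function
by_name = frame.transform("abs")                 # string method name
by_lambda = frame.transform(lambda x: x.abs())   # user-defined function

# All three keep the original index and columns, and match np.abs(frame).
print(by_ufunc.equals(by_name) and by_name.equals(by_lambda))
print(by_ufunc.equals(np.abs(frame)))
```

The result has the same shape and index as the input, which is what distinguishes `.transform()` from `.agg()`.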
1010+
1011+ Passing a single function to ``.transform()`` with a Series will yield a single Series in return.
1012+
1013+ .. ipython:: python
1014+
1015+ tsdf.A.transform(np.abs)
1016+
1017+
1018+ Transform with multiple functions
1019+ +++++++++++++++++++++++++++++++++
1020+
1021+ Passing multiple functions will yield a column multi-indexed DataFrame.
1022+ The first level will be the original frame column names; the second level
1023+ will be the names of the transforming functions.
1024+
1025+ .. ipython:: python
1026+
1027+ tsdf.transform([np.abs, lambda x: x + 1])
1028+
1029+ Passing multiple functions to a Series will yield a DataFrame. The
1030+ resulting column names will be the transforming functions.
1031+
1032+ .. ipython:: python
1033+
1034+ tsdf.A.transform([np.abs, lambda x: x + 1])
1035+
1036+
1037+ Transforming with a dict of functions
1038+ +++++++++++++++++++++++++++++++++++++
1039+
1040+
1041+ Passing a dict of functions will allow selective transforming per column.
1042+
1043+ .. ipython:: python
1044+
1045+ tsdf.transform({'A': np.abs, 'B': lambda x: x + 1})
1046+
1047+ Passing a dict of lists will generate a multi-indexed DataFrame with these
1048+ selective transforms.
1049+
1050+ .. ipython:: python
1051+
1052+ tsdf.transform({'A': np.abs, 'B': [lambda x: x + 1, 'sqrt']})
1053+
1054+ On a Series, passing a dict allows renaming, as in ``.agg()``.
1055+
1056+ .. ipython:: python
1057+
1058+ tsdf.A.transform({'foo': np.abs})
1059+ tsdf.A.transform({'foo': np.abs, 'bar': [lambda x: x + 1, 'sqrt']})
1060+
1061+
1062+ .. _basics.elementwise:
1063+
1064+ Applying Elementwise Functions
1065+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
8401066
8411067Since not all functions can be vectorized (accept NumPy arrays and return
8421068another array or value), the methods :meth:`~DataFrame.applymap` on DataFrame