Pivot tables: pd.pivot_table()

Signature: pd.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')
 

Docstring:
Create a pivot table (a table with hierarchical indexes), aggregating the values.

Parameters
----------
data : DataFrame
values : column to aggregate, optional
index : column, Grouper, array, or list of the previous
    If an array is passed, it must be the same length as the data. The
    list can contain any of the other types (except list).
    Keys to group by on the pivot table index.  If an array is passed,
    it is used in the same manner as column values.
columns : column, Grouper, array, or list of the previous
    If an array is passed, it must be the same length as the data. The
    list can contain any of the other types (except list).
    Keys to group by on the pivot table column.  If an array is passed,
    it is used in the same manner as column values.
aggfunc : function, list of functions, dict, default 'mean'
    If list of functions passed, the resulting pivot table will have
    hierarchical columns whose top level are the function names
    (inferred from the function objects themselves)
    If dict is passed, the key is column to aggregate and value
    is function or list of functions
fill_value : scalar, default None
    Value to replace missing values with
margins : boolean, default False
    Add all row / columns (e.g. for subtotal / grand totals)
dropna : boolean, default True
    Do not include columns whose entries are all NaN
margins_name : string, default 'All'
    Name of the row / column that will contain the totals
    when margins is True.
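
As a minimal sketch of how the dict and list forms of aggfunc described above behave, the snippet below builds a small hypothetical table named sales (its column names and values are invented for illustration and do not come from the original post). It passes string function names such as "sum", which pandas accepts in place of function objects.

```python
import pandas as pd

# Hypothetical data, invented for this illustration only.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "units": [1, 2, 3, 4, 5],
    "price": [10.0, 20.0, 30.0, 50.0, 40.0],
})

# dict: each key is a column to aggregate, each value the function to apply.
by_dict = pd.pivot_table(
    sales,
    index="region",
    values=["units", "price"],
    aggfunc={"units": "sum", "price": "mean"},
)

# list: the result gets hierarchical columns with the function names on top.
by_list = pd.pivot_table(
    sales,
    index="region",
    values=["units"],
    aggfunc=["sum", "max"],
)

print(by_dict)
print(by_list)
```

With the list form, a single result column is addressed by a (function name, column name) pair, for example by_list[("sum", "units")].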

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
                          "bar", "bar", "bar", "bar"],
                    "B": ["one", "one", "one", "two", "two",
                          "one", "one", "two", "two"],
                    "C": ["small", "large", "large", "small",
                          "small", "large", "small", "small",
                          "large"],
                    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7]})


df
Out[21]: 
     A    B      C  D
0  foo  one  small  1
1  foo  one  large  2
2  foo  one  large  2
3  foo  two  small  3
4  foo  two  small  3
5  bar  one  large  4
6  bar  one  small  5
7  bar  two  small  6
8  bar  two  large  7


pivot_table = pd.pivot_table(df, values='D', index=['A', 'B'], columns='C', aggfunc=np.sum, fill_value=0)
pivot_table
Out[29]: 
C        large  small
A   B                
bar one      4      5
    two      7      6
foo one      4      1
    two      0      6
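
The margins and margins_name parameters are not exercised by the example above. The following sketch rebuilds the same df so that it runs on its own, then adds a totals row and a totals column. The label "Total" is an arbitrary choice made here, and the string "sum" is used in place of np.sum.

```python
import pandas as pd

# Same data as in the example above, repeated so the snippet is self-contained.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# margins=True appends a totals row and a totals column;
# margins_name sets the label used for both.
with_totals = pd.pivot_table(
    df,
    values="D",
    index=["A", "B"],
    columns="C",
    aggfunc="sum",
    fill_value=0,
    margins=True,
    margins_name="Total",
)

print(with_totals)
```

The last row and the last column then hold the totals, and the corner cell equals the sum of column D, which is 33 for this data.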


Reposted from blog.csdn.net/zs15321583801/article/details/81334710