Python data analysis tips: How to implement pivot table in Pandas?

A pivot table is a useful data analysis tool: it summarizes a dataset so that its structure, relationships, and trends are easier to see. In Pandas, the pivot_table() function builds one. Suppose we have a sales dataset that records the product, the sale date, and the sales amount. Let's start by creating a simple pivot table.

In the example below, we have a DataFrame with three columns: Product, Date, and Sales. We want a pivot table that shows the total sales for each product on each date. We specify the rows, columns, and values of the pivot table and aggregate the sales with the sum function. Running the code shows the sales of each product on each date at a glance. The pivot_table() call here uses four parameters (the function accepts others as well):

  • index: the column to use as row labels in the pivot table (in this case, Product)
  • columns: the column to use as column labels in the pivot table (in this case, Date)
  • values: the column whose values fill the pivot table (in this case, Sales)
  • aggfunc: the aggregate function applied to the values (in this case, sum; the default is mean)
# Pivot table
import pandas as pd

# Sample sales data: one row per product per date
df = pd.DataFrame({
    'Product': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Date': ['2019-01-01', '2019-01-01', '2019-01-01', '2019-01-02', '2019-01-02', '2019-01-02'],
    'Sales': [100, 200, 300, 150, 250, 350]
})

print(df)

# Products as rows, dates as columns, total sales in each cell
pivot_table = df.pivot_table(index='Product', columns='Date', values='Sales', aggfunc='sum')

print(pivot_table)
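
In the sample data above, each product appears only once per date, so the choice of aggfunc makes no visible difference. The following is a minimal sketch of when it does matter. The dup_df data is hypothetical and not part of the original example: product A has two rows for the same date, so sum and mean give different results.

```python
import pandas as pd

# Hypothetical data (not from the original example):
# product A has two sales rows on 2019-01-01.
dup_df = pd.DataFrame({
    'Product': ['A', 'A', 'B'],
    'Date': ['2019-01-01', '2019-01-01', '2019-01-01'],
    'Sales': [100, 50, 200],
})

# aggfunc decides how rows that land in the same cell are combined.
total = dup_df.pivot_table(index='Product', columns='Date', values='Sales', aggfunc='sum')
average = dup_df.pivot_table(index='Product', columns='Date', values='Sales', aggfunc='mean')

print(total)    # cell for A holds 100 + 50
print(average)  # cell for A holds the mean of 100 and 50
```

If aggfunc is omitted, pivot_table() uses the mean, so it is worth passing it explicitly whenever totals are wanted.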

Python data analysis: implementing a pivot table with groupby

Besides pivot_table(), we can also build a pivot table with the groupby() and unstack() methods.

In this example, we first use groupby() to group the sales data by product and date and compute the total sales of each group. Next, we use unstack() to rearrange the result so that dates become columns and products stay as rows. The result is a table like the one pivot_table() produces, which makes the sales data easier to analyze.

Specifically, we can explain this code step by step:

  1. sales_data.groupby(['Product', 'Date']): Use the groupby() method to group sales_data by the Product and Date columns.
  2. ['Sales'].sum(): Sum the Sales column within each group to get the total sales of each product on each date.
  3. .unstack(): Use the unstack() method to move Date from the row index to the columns, leaving products as rows, which gives a result similar to a pivot table.
# Implementing a pivot table with groupby
import pandas as pd

# Same sample sales data as in the pivot_table() example
sales_data = pd.DataFrame({
    'Product': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Date': ['2019-01-01', '2019-01-01', '2019-01-01', '2019-01-02', '2019-01-02', '2019-01-02'],
    'Sales': [100, 200, 300, 150, 250, 350]
})
print(sales_data)

# Group by product and date, sum the sales, then move Date into the columns
pivot_table = sales_data.groupby(['Product', 'Date'])['Sales'].sum().unstack()
print(pivot_table)
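
To make the three steps easier to follow, the chained call can be split into intermediate variables and the result compared with pivot_table(). This is only a sketch: the names grouped, totals, wide, and via_pivot are introduced here for illustration and do not appear in the original code.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Product': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Date': ['2019-01-01', '2019-01-01', '2019-01-01', '2019-01-02', '2019-01-02', '2019-01-02'],
    'Sales': [100, 200, 300, 150, 250, 350]
})

# Step 1: group the rows by product and date.
grouped = sales_data.groupby(['Product', 'Date'])

# Step 2: sum Sales in each group. The result is a Series whose
# index has two levels, Product and Date.
totals = grouped['Sales'].sum()
print(totals)

# Step 3: unstack() moves the innermost index level (Date) into the columns.
wide = totals.unstack()
print(wide)

# For comparison: the same table built directly with pivot_table().
via_pivot = sales_data.pivot_table(index='Product', columns='Date', values='Sales', aggfunc='sum')
print(wide.equals(via_pivot))
```

The groupby() route exposes the intermediate long-format totals, which can be handy when they are needed for other calculations. pivot_table() is more compact when only the final table matters.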

Origin blog.csdn.net/xili1342/article/details/130085320