Basic usage of the pandas.DataFrame groupby() method

The groupby() method of pandas.DataFrame is a very common and useful method. Let's quickly master its basic usage and add another powerful tool to our data analysis toolkit.

First, import the packages:

import pandas as pd
import numpy as np

The most basic groupby operations

df = pd.DataFrame({'A':[1,2,3,1],'B':[2,3,3,6],'C':[3,1,5,7]})
df

 

 

Group by column A (that is, rows that share the same value in column A are merged into one group, and the distinct values of A become the index of the result)

df.groupby('A').mean()  # mean() averages each group

df.groupby('A').sum()  # sum() adds up each group

df.groupby(['A']).first()  # first() takes the first non-null value of each column in each group

df.groupby(['A']).last()  # last() takes the last non-null value of each column in each group
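To make the results of the four calls above concrete, here is a small sketch that reuses the same df. The variable names (grouped, means, sums, firsts, lasts) are only chosen for this illustration; the values in the comments follow directly from the sample data, where the two rows with A equal to 1 fall into the same group.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 1], 'B': [2, 3, 3, 6], 'C': [3, 1, 5, 7]})

grouped = df.groupby('A')

# Iterating over the GroupBy object yields (key, sub-DataFrame) pairs.
# The rows with A == 1 (original index 0 and 3) end up in one group.
for key, group in grouped:
    print(key, group.index.tolist())

means = grouped.mean()    # A=1 -> B=(2+6)/2=4.0, C=(3+7)/2=5.0
sums = grouped.sum()      # A=1 -> B=8, C=10
firsts = grouped.first()  # A=1 -> B=2, C=3
lasts = grouped.last()    # A=1 -> B=6, C=7

print(means)
```

The groups for A equal to 2 and 3 each contain a single row, so all four methods simply return that row's values for them.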

You can also group by multiple columns

df.groupby(['A','B']).sum()
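When grouping by several columns, the result is indexed by every combination of A and B that occurs in the data. The sketch below, using the same df, shows one way to look up a single group and one way to keep A and B as ordinary columns with as_index=False. The names by_ab and flat are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 1], 'B': [2, 3, 3, 6], 'C': [3, 1, 5, 7]})

# The result has a two-level index whose levels are named A and B.
by_ab = df.groupby(['A', 'B']).sum()
print(by_ab.index.names)

# Look up one group by its (A, B) pair.
print(by_ab.loc[(1, 6), 'C'])

# as_index=False keeps A and B as regular columns instead of index levels.
flat = df.groupby(['A', 'B'], as_index=False).sum()
print(flat.columns.tolist())
```

In this sample every (A, B) pair is unique, so each group holds exactly one row and the sums are just the original C values.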

 

Counting the data

The difference between size and count: size includes NaN values, whereas count does not

df = pd.DataFrame({'A':[1,2,3,1],'B':[2,3,3,6],'C':[3,np.nan,5,7]})
df

df.groupby(['A']).count()

df.groupby(['A']).size()
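The difference is easiest to see in the group where A equals 2, whose only C value is NaN. The following sketch uses the same data as above; the names counts and sizes are chosen only for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 1], 'B': [2, 3, 3, 6], 'C': [3, np.nan, 5, 7]})

counts = df.groupby('A').count()  # DataFrame: number of non-NaN values per column
sizes = df.groupby('A').size()    # Series: number of rows per group, NaN or not

# The group A == 2 has one row, and its C value is NaN.
print(counts.loc[2, 'C'])  # 0
print(sizes.loc[2])        # 1
```

Note also the different shapes: count() returns one number per column for each group, while size() returns a single number per group.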


Origin www.cnblogs.com/nsw0419/p/11620904.html