Series.value_counts and pd.value_counts: computing value frequencies for Series and DataFrame data

In pandas, value_counts computes how often each distinct value appears in the data.

First, usage with a Series

ss = Series.value_counts()
Note that the return value is itself a Series.

In[2]: import numpy as np
  ...: import pandas as pd
  ...: from pandas import DataFrame
  ...: from pandas import Series
  ...: ss = Series(['Tokyo', 'Nagoya', 'Nagoya', 'Osaka', 'Tokyo', 'Tokyo'])   
  ...: ss.value_counts()   # value_counts directly counts how often each value occurs in the Series
Out[2]: 
Tokyo     3
Nagoya    2
Osaka     1
dtype: int64
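
Because the result is an ordinary Series indexed by the distinct values, it can be queried like any other Series. The following is a minimal sketch using the same data as above; the variable name counts is only illustrative.

```python
import pandas as pd

ss = pd.Series(['Tokyo', 'Nagoya', 'Nagoya', 'Osaka', 'Tokyo', 'Tokyo'])
counts = ss.value_counts()

# The result is a Series: the index holds the distinct values,
# the data holds how many times each one occurred.
print(type(counts).__name__)   # Series
print(counts.index[0])         # Tokyo (most frequent value comes first)
print(counts['Nagoya'])        # 2
print(counts.to_dict())        # plain dict of value -> count
```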

Second, usage with a DataFrame

df = DataFrame.apply(pd.value_counts): the apply method is used here, and the result assigned to df is a DataFrame.
series = DataFrame[colName].value_counts(): this operates on one specific column, and the result assigned to series is a Series.

In[2]: import numpy as np
  ...: import pandas as pd
  ...: from pandas import DataFrame
  ...: from pandas import Series
  ...: # Build a DataFrame with two columns; value_counts will count how often each value occurs in each column
  ...: df = DataFrame({'a': ['Tokyo', 'Osaka', 'Nagoya', 'Osaka', 'Tokyo', 'Tokyo'], 'b': ['Osaka', 'Osaka', 'Osaka', 'Tokyo', 'Tokyo', 'Tokyo']})
  ...: print(df)
        a      b
0   Tokyo  Osaka
1   Osaka  Osaka
2  Nagoya  Osaka
3   Osaka  Tokyo
4   Tokyo  Tokyo
5   Tokyo  Tokyo

In[3]: df.apply(pd.value_counts)
Out[3]: 
        a    b
Nagoya  1  NaN
Osaka   2  3.0
Tokyo   3  3.0

In[4]: type(df.apply(pd.value_counts))
Out[4]: pandas.core.frame.DataFrame
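
The single-column form mentioned above is not shown in the session, so here is a small sketch of both forms side by side. It assumes the same df as above. The lambda is used instead of pd.value_counts because, as far as I know, recent pandas releases deprecate the top-level pd.value_counts function; the variable names col_counts and all_counts are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['Tokyo', 'Osaka', 'Nagoya', 'Osaka', 'Tokyo', 'Tokyo'],
    'b': ['Osaka', 'Osaka', 'Osaka', 'Tokyo', 'Tokyo', 'Tokyo'],
})

# Counting one column returns a Series
col_counts = df['a'].value_counts()

# Counting every column via apply returns a DataFrame,
# with one column of counts per original column
all_counts = df.apply(lambda col: col.value_counts())

print(type(col_counts).__name__)   # Series
print(type(all_counts).__name__)   # DataFrame
```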

Third, to sort in ascending order, pass ascending=True (the default is False, i.e. descending order)

1. Series

In[5]: ss.value_counts(ascending=True)
Out[5]: 
Osaka     1
Nagoya    2
Tokyo     3
dtype: int64

2. DataFrame

In[6]: df.apply(pd.value_counts, ascending=True)
Out[6]: 
        a    b
Nagoya  1  NaN
Osaka   2  3.0
Tokyo   3  3.0
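
The NaN in column b appears because Nagoya never occurs in that column, and the missing entry also forces the whole column to float. If a zero count is preferred, one possible approach (a sketch, not something the original post does) is to fill the gaps and cast back to integers:

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['Tokyo', 'Osaka', 'Nagoya', 'Osaka', 'Tokyo', 'Tokyo'],
    'b': ['Osaka', 'Osaka', 'Osaka', 'Tokyo', 'Tokyo', 'Tokyo'],
})

counts = df.apply(lambda col: col.value_counts())

# Values absent from a column show up as NaN after alignment;
# replace them with 0 and restore an integer dtype.
filled = counts.fillna(0).astype(int)
print(filled)
```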

Fourth, to normalize, i.e. compute each value's share of the total, pass normalize=True (the default is False)

1. Series

In[7]: ss.value_counts(ascending=True, normalize=True)
Out[7]: 
Osaka     0.166667
Nagoya    0.333333
Tokyo     0.500000
dtype: float64

Alternatively, the proportions can be computed directly by dividing the counts by the total number of values; see also "Pandas.Series mathematical operations"

In[12]: ss.value_counts(ascending=True) / 6
Out[12]: 
Osaka     0.166667
Nagoya    0.333333
Tokyo     0.500000
dtype: float64
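
The divisor 6 above is simply the number of values in ss. A slightly more general sketch divides by the sum of the counts instead of a hard-coded number, and checks that this matches normalize=True; the names manual and builtin are illustrative.

```python
import pandas as pd

ss = pd.Series(['Tokyo', 'Nagoya', 'Nagoya', 'Osaka', 'Tokyo', 'Tokyo'])

counts = ss.value_counts(ascending=True)

# Divide by the total number of counted values rather than a literal 6
manual = counts / counts.sum()

# Let pandas do the same computation
builtin = ss.value_counts(ascending=True, normalize=True)

# Largest absolute difference between the two results
print(abs(manual - builtin).max())
```

Using counts.sum() rather than len(ss) keeps the two results consistent when the data contains missing values, since value_counts skips them by default.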

2. DataFrame

In[8]: df.apply(pd.value_counts, ascending=True, normalize=True)
Out[8]: 
               a    b
Nagoya  0.166667  NaN
Osaka   0.333333  0.5
Tokyo   0.500000  0.5

Fifth, there are other parameters (to be continued)
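
For reference, the remaining parameters listed in the pandas documentation for Series.value_counts are sort, bins, and dropna. The sketch below illustrates two of them on small made-up data (the cities and ages Series are hypothetical and not from the examples above).

```python
import pandas as pd

# dropna=False counts missing values as their own category
cities = pd.Series(['Tokyo', 'Nagoya', None, 'Tokyo'])
without_missing = cities.value_counts()            # missing value ignored
with_missing = cities.value_counts(dropna=False)   # missing value counted

# bins=N groups numeric data into N equal-width intervals before counting
ages = pd.Series([3, 17, 25, 42, 68])
binned = ages.value_counts(bins=2)

print(without_missing.sum())   # 3
print(with_missing.sum())      # 4
print(binned)
```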




Origin blog.csdn.net/weixin_43469047/article/details/104159595