Data analysis tool Pandas (1): Pandas data structure
Data analysis tool Pandas (2): Pandas indexing operation
Data analysis tool Pandas (3): Pandas alignment operation
Data analysis tool Pandas (4): Pandas application function
Data analysis tool Pandas (5): Pandas hierarchical indexing
Data analysis tool Pandas (6): Pandas statistical calculations and description
Pandas statistical calculations and description
import numpy as np
import pandas as pd
df_obj = pd.DataFrame(np.random.randn(5,4), columns = ['a', 'b', 'c', 'd'])
print(df_obj)
Output:
          a         b         c         d
0  1.469682  1.948965  1.373124 -0.564129
1 -1.466670 -0.494591  0.467787 -2.007771
2  1.368750  0.532142  0.487862 -1.130825
3 -0.758540 -0.479684  1.239135  1.073077
4 -0.007470  0.997034  2.669219  0.742070
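Because np.random.randn draws new values on every run, the numbers printed above will differ each time. A minimal sketch of one way to make such an example repeatable, assuming a fixed seed is acceptable (the seed value 42 and the names rng, df_seeded and df_again are illustrative, not part of the original post):

```python
import numpy as np
import pandas as pd

# Assumption: an arbitrary fixed seed so that reruns produce identical data.
rng = np.random.default_rng(42)
df_seeded = pd.DataFrame(rng.standard_normal((5, 4)), columns=['a', 'b', 'c', 'd'])

# A second generator with the same seed yields the same values.
rng_again = np.random.default_rng(42)
df_again = pd.DataFrame(rng_again.standard_normal((5, 4)), columns=['a', 'b', 'c', 'd'])

print(df_seeded.equals(df_again))
```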
Common statistical computations
sum, mean, max, min…
axis=0 (the default) computes the statistic down each column; axis=1 computes it across each row.
skipna controls whether missing values (NaN) are excluded; the default is True.
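print(df_obj.sum())                      # column sums (axis=0 is the default)
print(df_obj.max())                      # column maxima
print(df_obj.min(axis=1, skipna=False))  # row minima; NaN values are not skipped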
Output:
a    0.605751
b    2.503866
c    6.237127
d   -1.887578
dtype: float64

a    1.469682
b    1.948965
c    2.669219
d    1.073077
dtype: float64

0   -0.564129
1   -2.007771
2   -1.130825
3   -0.758540
4   -0.007470
dtype: float64
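The random data above contains no missing values, so the effect of skipna is not visible in that output. A small sketch with a hand-made DataFrame (df_small and its values are hypothetical, chosen only for illustration) shows how axis and skipna change the result:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column 'a'.
df_small = pd.DataFrame({'a': [1.0, 2.0, np.nan], 'b': [4.0, 5.0, 6.0]})

col_sums = df_small.sum()                             # per column, NaN skipped
row_sums = df_small.sum(axis=1)                       # per row, NaN skipped
row_min_strict = df_small.min(axis=1, skipna=False)   # per row, NaN propagates
col_mean_strict = df_small.mean(skipna=False)         # per column, NaN propagates

print(col_sums)
print(row_sums)
print(row_min_strict)
print(col_mean_strict)
```

With skipna=False, any row or column that contains a NaN yields NaN for that statistic, which can be useful when a missing value should not be silently ignored.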
Common statistical descriptions
describe() produces several summary statistics at once (count, mean, standard deviation, minimum, quartiles and maximum).
print(df_obj.describe())
Output (these values do not match the DataFrame printed earlier; they appear to come from a separate run with different random data):
              a         b         c         d
count  5.000000  5.000000  5.000000  5.000000
mean   0.180305  0.106488  0.244978  0.178046
std    0.641945  0.454340  1.064356  1.144416
min   -0.677175 -0.490278 -1.164928 -1.574556
25%   -0.064069 -0.182920 -0.464013 -0.089962
50%    0.231722  0.127846  0.355859  0.190482
75%    0.318854  0.463377  1.169750  0.983663
max    1.092195  0.614413  1.328220  1.380601
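To make the rows of the describe() table easier to interpret, the following sketch runs it on a small fixed DataFrame (df_fixed is a hypothetical name with hand-picked values) and compares two rows against the corresponding single-statistic methods:

```python
import pandas as pd

# Hypothetical fixed data so the summary is easy to check by hand.
df_fixed = pd.DataFrame({'a': [1.0, 2.0, 3.0, 4.0, 5.0],
                         'b': [10.0, 20.0, 30.0, 40.0, 50.0]})

summary = df_fixed.describe()
print(summary)

# The 'mean' and '50%' rows agree with mean() and median() for each column.
mean_matches = summary.loc['mean', 'a'] == df_fixed['a'].mean()
median_matches = summary.loc['50%', 'a'] == df_fixed['a'].median()
print(mean_matches, median_matches)
```

The result of describe() is itself a DataFrame, so individual statistics can be looked up with .loc by row label and column name.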
Commonly used descriptive statistics methods:
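A minimal sketch of a few such methods that pandas provides, run on a small hand-made DataFrame. The selection shown here (count, median, var, std, idxmax, idxmin, cumsum) is an illustrative assumption and not necessarily the original list; df_demo is a hypothetical name:

```python
import pandas as pd

# Hypothetical data, small enough to verify by hand.
df_demo = pd.DataFrame({'a': [3.0, 1.0, 2.0], 'b': [9.0, 7.0, 8.0]})

counts = df_demo.count()      # number of non-missing values per column
medians = df_demo.median()    # median per column
variances = df_demo.var()     # sample variance per column (ddof=1 by default)
stds = df_demo.std()          # sample standard deviation per column
idx_max = df_demo.idxmax()    # index label of the maximum per column
idx_min = df_demo.idxmin()    # index label of the minimum per column
running = df_demo.cumsum()    # cumulative sum down each column

for name, result in [('count', counts), ('median', medians), ('var', variances),
                     ('std', stds), ('idxmax', idx_max), ('idxmin', idx_min),
                     ('cumsum', running)]:
    print(name)
    print(result)
```

Like sum and min, these methods work column by column by default and most of them accept the axis argument.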