Use Python to calculate the Pearson correlation coefficient and display it with a heat map

Study Notes ☞ Learn to calculate the Pearson correlation coefficient and display it as a heat map.

Since this is a self-practice note, we first generate some random time series data with Pandas, then call the corr() function to calculate the Pearson correlation coefficients and print the results, and finally display them as a heat map.

The specific steps are as follows:

1. First, import the required packages

import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

2. Generate data (because this is an exercise, the data here is randomly generated time series data, with time as the index)

dates = pd.date_range('20220101', periods=15)  # generate a time series of 15 consecutive days
df = pd.DataFrame(np.random.randn(15,4), index=dates, columns=list('ABCD'))
print(df)

output:
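Because np.random.randn draws new values on every run, the printed table differs each time. A minimal sketch of a reproducible variant follows, assuming a seeded NumPy generator is acceptable for this exercise; the seed value 0 and the use of default_rng are illustrative choices, not part of the original note.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed (0) is used only so that repeated runs print the same data.
rng = np.random.default_rng(0)

dates = pd.date_range('20220101', periods=15)  # 15 consecutive days
df = pd.DataFrame(rng.standard_normal((15, 4)), index=dates, columns=list('ABCD'))

print(df.shape)
print(df.head())
```

With a fixed seed, the correlation values in the later steps also stay the same between runs, which makes the results easier to compare.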

3. Use the corr() function to calculate the Pearson correlation coefficient between each pair of columns in df

a = df.corr()  # the default method is 'pearson'
print('Pearson correlation coefficients')
print(a)

The calculation results are as follows: 
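To make clear what corr() computes, the sketch below recomputes one coefficient by hand and compares it with the pandas result. It uses the standard definition: the sum of products of the centered values, divided by the square root of the product of the sums of squared centered values. The helper name pearson_r and the seeded data are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

def pearson_r(x, y):
    """Pearson r computed directly from the centered values of x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Assumption: seeded random data with the same shape as in the note.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((15, 4)),
                  index=pd.date_range('20220101', periods=15),
                  columns=list('ABCD'))

corr = df.corr()  # default method is 'pearson'
manual = pearson_r(df['A'], df['B'])

# The hand-computed value should agree with the matrix entry for columns A and B.
print(np.isclose(manual, corr.loc['A', 'B']))  # True
```

The resulting matrix is symmetric with ones on the diagonal, since every column is perfectly correlated with itself.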

 

4. Draw a heat map from the calculated Pearson correlation coefficients to visualize the correlations.

sns.heatmap(df.corr(method='pearson'), linewidths=0.1, vmax=1.0, square=True, linecolor='white', annot=True)
plt.title('Pearson correlation heat map')
plt.show()

output:
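Pearson coefficients range from -1 to 1, so a color scale centered on 0 can make positive and negative correlations easier to tell apart. The following is one possible variant, not the note's original call: the vmin, center, cmap and fmt arguments, the Agg backend, and the output file name pearson_heatmap.png are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend, so the figure is saved instead of shown
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Assumption: seeded random data with the same shape as in the note.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((15, 4)),
                  index=pd.date_range('20220101', periods=15),
                  columns=list('ABCD'))
corr = df.corr(method='pearson')

# Symmetric color scale: -1 and 1 map to the two ends of the diverging colormap, 0 to its middle.
ax = sns.heatmap(corr, vmin=-1.0, vmax=1.0, center=0, cmap='coolwarm',
                 square=True, linewidths=0.1, linecolor='white',
                 annot=True, fmt='.2f')
ax.set_title('Pearson correlation heat map')

# Hypothetical output file name.
plt.savefig('pearson_heatmap.png', dpi=150, bbox_inches='tight')
```

In an interactive session, the matplotlib.use('Agg') line can be dropped and plt.show() called instead of plt.savefig().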

The full version of the code is as follows (tested locally and runs correctly):

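# coding=utf-8
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Generate a time series of 15 consecutive days
dates = pd.date_range('20220101', periods=15)
# Randomly generate the data
df = pd.DataFrame(np.random.randn(15, 4), index=dates, columns=list('ABCD'))
print(df)

# Pairwise Pearson correlation coefficients between the columns
a = df.corr()
print('Pearson correlation coefficients')
print(a)

# Font settings for rendering Chinese characters; only needed if the labels contain Chinese text
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
sns.heatmap(df.corr(method='pearson'), linewidths=0.1, vmax=1.0, square=True, linecolor='white', annot=True)
plt.title('Pearson correlation heat map')
plt.show()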

Origin blog.csdn.net/weixin_43155435/article/details/126598058