A detailed explanation of least squares fitting in Python with matplotlib

This article explains how to implement least squares fitting in Python with matplotlib, walking through sample code for fitting both a straight line and a curve. Readers who need it can use it as a reference; let's take a look below.

Foreword

The least squares method, a foundation for classification and regression algorithms, has a long history (it was first published by Adrien-Marie Legendre in 1805). It finds the best-fitting function for a set of data by minimizing the sum of squared errors: the unknown coefficients are chosen so that the sum of the squared differences between the fitted values and the actual data is as small as possible. Least squares can also be used for curve fitting, and some other optimization problems, such as minimizing an energy or maximizing an entropy, can likewise be cast as least squares problems.

The following walks through the implementation of least squares fitting with matplotlib in Python. Without further ado, let's get into the details:

1. Fitting a straight line with least squares

Generate sample points

First, we generate random points, normally distributed around the line y = 3 + 5x, to serve as the sample points for fitting the line.

import numpy as np
import matplotlib.pyplot as plt

# Generate random points around the line y = 3 + 5x
X = np.arange(0, 5, 0.1)
Z = [3 + 5 * x for x in X]
Y = [np.random.normal(z, 0.5) for z in Z]

plt.plot(X, Y, 'ro')
plt.show()

The sample points are shown in the figure:

Fit the straight line

Let y = a0 + a1*x. We use the least squares normal equations to solve for the unknown coefficients a0 and a1.
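For reference (the original article implies this derivation without spelling it out), minimizing the sum of squared errors S = sum((y_i - a0 - a1*x_i)^2) and setting the partial derivatives of S with respect to a0 and a1 to zero gives the normal equations:

N*a0 + sum(x)*a1 = sum(y)
sum(x)*a0 + sum(x^2)*a1 = sum(x*y)

These two equations are exactly the rows of the matrix A and the vector b constructed in the code below.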

NumPy's linalg module contains a solve function that, given the coefficient matrix of a system of linear equations and the vector formed by the right-hand sides, solves for the unknowns.
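As a minimal, self-contained sketch of how solve is called (the 2x2 system here is a made-up example, not from the original article):

import numpy as np

# Solve the hypothetical system:
#   1*u + 2*v = 5
#   3*u + 4*v = 6
M = np.array([[1, 2], [3, 4]])
rhs = np.array([5, 6])
print(np.linalg.solve(M, rhs))  # [-4.   4.5]

With solve in hand, the fitting function below builds A and b from the sample points: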

def linear_regression(x, y):
    # Build the 2x2 normal equation system A * [a0, a1] = b and solve it
    N = len(x)
    sumx = sum(x)
    sumy = sum(y)
    sumx2 = sum(x ** 2)
    sumxy = sum(x * y)

    A = np.array([[N, sumx], [sumx, sumx2]])
    b = np.array([sumy, sumxy])

    return np.linalg.solve(A, b)

a0, a1 = linear_regression(X, Y)
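As an optional sanity check (this comparison is my addition, not part of the original article), NumPy's np.polyfit solves the same least squares problem directly; note that it returns coefficients highest degree first:

# Cross-check the normal-equation solution against np.polyfit;
# polyfit returns [a1, a0] for a degree-1 fit
a1_np, a0_np = np.polyfit(X, Y, 1)
print(a0, a1)        # our solution
print(a0_np, a1_np)  # should match closely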

Draw the fitted line

At this point we have the coefficients a0 and a1 of the fitted line. Next, we draw the line and compare it with the sample points.

# Generate the endpoints for drawing the fitted line
_X = [0, 5]
_Y = [a0 + a1 * x for x in _X]

plt.plot(X, Y, 'ro', _X, _Y, 'b', linewidth=2)
plt.title("y = {} + {}x".format(a0, a1))
plt.show()

The fitting effect is as follows:

2. Fitting a curve with least squares

Generate sample points

Just as for the straight line, we generate random points, normally distributed around the curve y = 2 + 3x + 4x^2, as the sample points for fitting the curve.

import numpy as np
import matplotlib.pyplot as plt

# Generate random points around the curve y = 2 + 3x + 4x^2
X = np.arange(0, 5, 0.1)
Z = [2 + 3 * x + 4 * x ** 2 for x in X]
Y = np.array([np.random.normal(z, 3) for z in Z])

plt.plot(X, Y, 'ro')
plt.show()

The sample points are shown in the figure:

Curve fitting

Let the equation of the curve be y = a0 + a1*x + a2*x^2. Once again, we solve for the unknowns a0, a1, and a2 through the normal equations.
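Spelled out (again, implied rather than stated in the original), the normal equations for a degree-2 polynomial are:

N*a0 + sum(x)*a1 + sum(x^2)*a2 = sum(y)
sum(x)*a0 + sum(x^2)*a1 + sum(x^3)*a2 = sum(x*y)
sum(x^2)*a0 + sum(x^3)*a1 + sum(x^4)*a2 = sum(x^2*y)

In general, entry (i, j) of the coefficient matrix is sum(x^(i+j)) and entry i of the right-hand vector is sum(x^i * y), which is exactly what the two helper functions below compute.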

# Generate the coefficient matrix A of the normal equations
def gen_coefficient_matrix(X, Y):
    m = 3  # number of unknown coefficients: a0, a1, a2
    A = []
    # Compute the coefficients of each equation
    for i in range(m):
        a = []
        # Compute each coefficient of the current equation
        for j in range(m):
            a.append(sum(X ** (i + j)))
        A.append(a)
    return A

# Compute the right-hand side vector b of the system
def gen_right_vector(X, Y):
    m = 3
    b = []
    for i in range(m):
        b.append(sum(X ** i * Y))
    return b

A = gen_coefficient_matrix(X, Y)
b = gen_right_vector(X, Y)

a0, a1, a2 = np.linalg.solve(A, b)
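As before, an optional cross-check (my addition, assuming the X and Y arrays defined above) is to compare against np.polyfit with degree 2:

# polyfit returns [a2, a1, a0] for a degree-2 fit
a2_np, a1_np, a0_np = np.polyfit(X, Y, 2)
print(a0, a1, a2)           # normal-equation solution
print(a0_np, a1_np, a2_np)  # should match closely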

Draw the fitted curve

We plot the curve using the equation we just obtained and compare it with the sample points.

# Generate points for drawing the fitted curve
_X = np.arange(0, 5, 0.1)
_Y = np.array([a0 + a1 * x + a2 * x ** 2 for x in _X])

plt.plot(X, Y, 'ro', _X, _Y, 'b', linewidth=2)
plt.title("y = {} + {}x + {}$x^2$".format(a0, a1, a2))
plt.show()

The fitting effect is as follows:

Summary

That's all for this article. I hope its content helps with your study or work. If you have any questions, feel free to leave a comment; thank you for supporting 脚本之家.

 

Original article: http://www.codebelief.com/article/2017/04/matplotlib-demonstrate-least-square-regression-process/
