Foreword
The least squares method, a basis for classification and regression algorithms, has a long history (it was proposed by Adrien-Marie Legendre in 1805). It finds the best-fitting function for the data by minimizing the sum of squared errors: the fitted model produces predictions for unknown inputs while keeping the sum of squared errors between predicted and observed values as small as possible. The method also applies to curve fitting, and some other optimization problems, such as minimizing energy or maximizing entropy, can likewise be formulated as least squares problems.
This article walks through implementing least squares fitting with NumPy and matplotlib in Python. Let's take a look at the details:
First, fitting a straight line with least squares
Generate sample points
First, we generate normally distributed random points around the line y = 3 + 5x to serve as sample points for the line fit.
```python
import numpy as np
import matplotlib.pyplot as plt

# Generate random points around the line y = 3 + 5x
X = np.arange(0, 5, 0.1)
Z = [3 + 5 * x for x in X]
Y = [np.random.normal(z, 0.5) for z in Z]

plt.plot(X, Y, 'ro')
plt.show()
```
The sample points are shown in the figure:
Fitting the line
Let y = a0 + a1*x. We solve for the unknown coefficients a0 and a1 using the least squares normal equations.
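For reference (this derivation is my addition, not in the original article): minimizing the sum of squared residuals

$$E(a_0, a_1) = \sum_{i=1}^{N} (a_0 + a_1 x_i - y_i)^2$$

by setting $\partial E / \partial a_0 = \partial E / \partial a_1 = 0$ yields the normal equations

$$\begin{aligned} N\,a_0 + \Big(\sum_i x_i\Big)\,a_1 &= \sum_i y_i \\ \Big(\sum_i x_i\Big)\,a_0 + \Big(\sum_i x_i^2\Big)\,a_1 &= \sum_i x_i y_i \end{aligned}$$

which is exactly the 2×2 system the code assembles from N, sumx, sumx2, sumy, and sumxy.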
NumPy's linalg module provides a solve function that solves a linear system given its coefficient matrix and the vector formed by the right-hand sides of the equations.
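As a minimal illustration of how np.linalg.solve works (my example, not from the original article):

```python
import numpy as np

# Solve the 2x2 system
#   1*x + 2*y = 5
#   3*x + 4*y = 6
A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([5.0, 6.0])
sol = np.linalg.solve(A, b)  # sol = [x, y] = [-4.0, 4.5]
```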
```python
def linear_regression(x, y):
    N = len(x)
    sumx = sum(x)
    sumy = sum(y)
    sumx2 = sum(x ** 2)
    sumxy = sum(x * y)

    A = np.array([[N, sumx], [sumx, sumx2]])
    b = np.array([sumy, sumxy])

    return np.linalg.solve(A, b)

a0, a1 = linear_regression(X, Y)
```
Drawing the line
We now have the coefficients a0 and a1 of the fitted line. Next, we draw the line and compare it with the sample points.
```python
# Generate the two endpoints used to draw the fitted line
_X = [0, 5]
_Y = [a0 + a1 * x for x in _X]

plt.plot(X, Y, 'ro', _X, _Y, 'b', linewidth=2)
plt.title("y = {} + {}x".format(a0, a1))
plt.show()
```
The fitting effect is as follows:
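As a cross-check (a sketch I'm adding, not part of the original article), NumPy's built-in np.polyfit performs the same least-squares line fit; on data generated the same way, its coefficients land close to the true values 3 and 5:

```python
import numpy as np

# Regenerate line data as in the article, seeded for reproducibility
rng = np.random.default_rng(0)
X = np.arange(0, 5, 0.1)
Y = 3 + 5 * X + rng.normal(0, 0.5, size=X.size)

# np.polyfit returns coefficients highest degree first: [a1, a0]
a1, a0 = np.polyfit(X, Y, 1)
```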
Second, fitting a curve with least squares
Generate sample points
Just as for the straight line, we generate normally distributed random points around the curve y = 2 + 3x + 4x^2 as sample points for the curve fit.
```python
import numpy as np
import matplotlib.pyplot as plt

# Generate random points around the curve y = 2 + 3x + 4x^2
X = np.arange(0, 5, 0.1)
Z = [2 + 3 * x + 4 * x ** 2 for x in X]
Y = np.array([np.random.normal(z, 3) for z in Z])

plt.plot(X, Y, 'ro')
plt.show()
```
The sample points are shown in the figure:
Curve fitting
Let the equation of the curve be y = a0 + a1*x + a2*x^2. Again, we solve for the unknowns a0, a1, and a2 via the normal equations.
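In general (my note, not in the original article), for the quadratic model the normal equations read

$$\sum_{j=0}^{2} \Big(\sum_k x_k^{\,i+j}\Big)\, a_j = \sum_k x_k^{\,i}\, y_k, \qquad i = 0, 1, 2,$$

so the coefficient matrix has entries $A_{ij} = \sum_k x_k^{\,i+j}$ and the right-hand side is $b_i = \sum_k x_k^{\,i} y_k$, which is exactly what the code constructs.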
```python
# Build the coefficient matrix A of the normal equations
def gen_coefficient_matrix(X, Y):
    N = len(X)
    m = 3
    A = []
    # Build one equation (row) at a time
    for i in range(m):
        a = []
        # Compute each coefficient in the current equation
        for j in range(m):
            a.append(sum(X ** (i + j)))
        A.append(a)
    return A

# Build the right-hand-side vector b
def gen_right_vector(X, Y):
    N = len(X)
    m = 3
    b = []
    for i in range(m):
        b.append(sum(X ** i * Y))
    return b

A = gen_coefficient_matrix(X, Y)
b = gen_right_vector(X, Y)

a0, a1, a2 = np.linalg.solve(A, b)
```
Drawing the curve
We draw the fitted curve from the coefficients just obtained and compare it with the sample points.
```python
# Generate plotting points for the fitted curve
_X = np.arange(0, 5, 0.1)
_Y = np.array([a0 + a1 * x + a2 * x ** 2 for x in _X])

plt.plot(X, Y, 'ro', _X, _Y, 'b', linewidth=2)
plt.title("y = {} + {}x + {}$x^2$".format(a0, a1, a2))
plt.show()
```
The fitting effect is as follows:
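As with the line fit, the quadratic result can be cross-checked (a sketch I'm adding, not from the original article) against np.polyfit, which solves the same normal equations; its coefficients should land near the true values 2, 3, and 4:

```python
import numpy as np

# Regenerate quadratic data as in the article, seeded for reproducibility
rng = np.random.default_rng(0)
X = np.arange(0, 5, 0.1)
Y = 2 + 3 * X + 4 * X ** 2 + rng.normal(0, 3, size=X.size)

# Coefficients come back highest degree first: [a2, a1, a0]
a2, a1, a0 = np.polyfit(X, Y, 2)
```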
Summary
That's all for this article. I hope its content helps you in your study or work. If you have any questions, feel free to leave a comment; thank you for your support of 脚本之家.
Original link: http://www.codebelief.com/article/2017/04/matplotlib-demonstrate-least-square-regression-process/