Simplest Linear Regression Model - Scalar

First consider the case where $y$ is a scalar and $w$ is a scalar, so the linear function is $y = wx + b$. The batch size of each input batch is 1; each input $x$ is a scalar, denoted $x^*$, and the label $y$ is also a scalar, denoted $y^*$. The loss function $L$ is therefore defined as:

$$\begin{aligned} L &= \left(y - y^*\right)^2 \\ &= \left(wx^* + b - y^*\right)^2 \end{aligned}$$
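To make the definition concrete, here is a minimal Python sketch of this squared-error loss (the function name `loss` is just for illustration, not from the original):

```python
# A minimal sketch of the scalar squared-error loss defined above
def loss(w, b, x_star, y_star):
    # prediction w*x* + b compared against the label y*
    return (w * x_star + b - y_star) ** 2
```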
After each training step, the parameters $w$ and $b$ need to be updated. If we use gradient descent to update these two parameters, we need the gradients of the loss with respect to each of them, that is, $\frac{\partial L}{\partial w}$ and $\frac{\partial L}{\partial b}$. The results are as follows:

$$\frac{\partial L}{\partial w} = 2\left(wx^* + b - y^*\right)x^*$$

$$\frac{\partial L}{\partial b} = 2\left(wx^* + b - y^*\right)$$
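As a sanity check, the analytic gradients can be compared against finite-difference approximations. The values of `w`, `b`, `x_star`, and `y_star` below are arbitrary assumptions chosen for illustration:

```python
# Compare the analytic gradients with central finite differences (illustrative values)
w, b, x_star, y_star = 0.5, 0.1, 2.0, 3.0
eps = 1e-6

def loss(w, b):
    return (w * x_star + b - y_star) ** 2

# numerical approximations via central differences
num_grad_w = (loss(w + eps, b) - loss(w - eps, b)) / (2 * eps)
num_grad_b = (loss(w, b + eps) - loss(w, b - eps)) / (2 * eps)
# analytic gradients from the formulas above
ana_grad_w = 2 * (w * x_star + b - y_star) * x_star
ana_grad_b = 2 * (w * x_star + b - y_star)
print(num_grad_w, ana_grad_w)  # both approximately -7.6
print(num_grad_b, ana_grad_b)  # both approximately -3.8
```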
Before training, $w$ and $b$ are given initial values and a step size $step$ is set. In each round, $w$ and $b$ are then updated as:

$$w_{new} = w - step \cdot 2\left(wx^* + b - y^*\right)x^*$$

$$b_{new} = b - step \cdot 2\left(wx^* + b - y^*\right)$$
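Written out in Python, a single update step under the same illustrative values as above might look like this (`step = 0.001` is an assumption here, matching the value used later):

```python
# One gradient-descent update step (illustrative values)
w, b, x_star, y_star = 0.5, 0.1, 2.0, 3.0
step = 0.001
err = w * x_star + b - y_star      # wx* + b - y* = -1.9
w = w - step * 2 * err * x_star    # 0.5 - 0.001 * (-7.6) = 0.5076
b = b - step * 2 * err             # 0.1 - 0.001 * (-3.8) = 0.1038
print(w, b)
```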
Next, consider the case where $y$ and $w$ are still scalars and the linear function is still $y = wx + b$, but the batch size of each input batch is $N$: each input batch is a vector, denoted $\boldsymbol{x}^*$, and the label vector is denoted $\boldsymbol{y}^*$. The loss function can then be expressed as:

$$\begin{aligned} L &= \sum_{n=1}^{N}\left(y_n - y_n^*\right)^2 \\ &= \sum_{n=1}^{N}\left(wx_n^* + b - y_n^*\right)^2 \end{aligned}$$
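Although this article only implements the batch-size-1 case, the batch-$N$ loss above can be sketched directly in NumPy (the function name `batch_loss` is just for illustration):

```python
import numpy as np

# A minimal sketch of the batch-N squared-error loss defined above
def batch_loss(w, b, x_batch, y_batch):
    # elementwise predictions w*x_n + b; sum of squared errors over the batch
    return np.sum((w * x_batch + b - y_batch) ** 2)
```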
Let's implement this simplest linear regression model in Python:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([0.1, 1.2, 2.1, 3.8, 4.1, 5.4, 6.2, 7.1, 8.2, 9.3,
              10.4, 11.2, 12.3, 13.8, 14.9, 15.5, 16.2, 17.1, 18.5, 19.2])
y = np.array([5.7, 8.8, 10.8, 11.4, 13.1, 16.6, 17.3, 19.4, 21.8, 23.1,
              25.1, 29.2, 29.9, 31.8, 32.3, 36.5, 39.1, 38.4, 44.2, 43.4])
print(x, y)
plt.scatter(x, y)
plt.show()
```

[Figure: scatter plot of the training data]
The regression process is as follows:

```python
# Set the step size
step = 0.001
# Array that stores the loss of each round
loss_list = []
# Define the number of iterations
epoch = 30
# Define and initialize the parameters w and b
w = 0.0
b = 0.0
# Gradient-descent regression
for i in range(epoch):
    # Index of the current input x and label y; since x and y have the same
    # length, taking i modulo len(x) gives the current index
    index = i % len(x)
    # x value for the current round
    cx = x[index]
    # y value for the current round
    cy = y[index]
    # Compute the current loss
    loss_list.append((w*cx + b - cy)**2)
    # Compute the gradients of the parameters w and b
    grad_w = 2*(w*cx + b - cy)*cx
    grad_b = 2*(w*cx + b - cy)
    # Update w and b
    w -= step*grad_w
    b -= step*grad_b
```
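Note that with `epoch=30` and 20 data points, this loop visits one sample per iteration, so the first ten points are seen twice and the rest only once. Sweeping the whole dataset once per epoch is a common alternative; the sketch below is one such variant under that assumption, not the author's code:

```python
# Sketch: sweep the whole dataset once per epoch instead of one sample per iteration
w, b = 0.0, 0.0
loss_list = []
for e in range(epoch):
    for cx, cy in zip(x, y):
        err = w * cx + b - cy
        loss_list.append(err ** 2)
        # same gradient-descent update formulas as above
        w -= step * 2 * err * cx
        b -= step * 2 * err
```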

The recorded loss can be plotted as follows:

```python
plt.plot(loss_list)
plt.show()
```

[Figure: loss curve over the 30 iterations]
Print the result of the fitted function:

print("y=%.2fx+%.2f" %(w,b))
y=2.46x+0.39
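The fitted-line figure below can be reproduced with a plot along these lines (a sketch; the exact plotting code is not shown in the original):

```python
# Sketch: draw the fitted line over the training points
plt.scatter(x, y, label="training data")
plt.plot(x, w * x + b, color="red", label="fitted line")
plt.legend()
plt.show()
```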

The relationship between the fitted function image and the points in the training data is as follows:
[Figure: fitted line y = 2.46x + 0.39 over the training data scatter]
The figure above shows the fitted function after 30 iterations. Increasing the number of iterations to 3000 gives the following fit:

[Figure: fitted line after 3000 iterations over the training data]
The loss is as follows:
[Figure: loss curve over the 3000 iterations]

When the batch size is 1, the loss fluctuates greatly, so it is necessary to increase the batch size. In the next article, we will increase the batch size on this basis and look at the resulting linear regression.

Origin blog.csdn.net/zhuzheqing/article/details/129354364