## Simplest Linear Regression Model - Scalar

First consider the case where $y$ is a scalar and $w$ is a scalar, so our linear function is $y = wx + b$. The batch size of each input batch is $1$: each input $x$ is a scalar, denoted $x^*$, and the label $y$ is also a scalar, denoted $y^*$. The loss function $L$ is therefore defined as:
$$\begin{aligned} L&=\left(y-y^*\right)^2\\ &=\left(wx^*+b-y^*\right)^2\end{aligned}$$
After each training step, the parameters $w$ and $b$ need to be updated. If we use gradient descent to update these two parameters, we need the gradient of the loss with respect to each of them, that is, $\frac{\partial{L}}{\partial{w}}$ and $\frac{\partial{L}}{\partial{b}}$. The results are as follows:
$$\frac{\partial{L}}{\partial{w}}=2(wx^*+b-y^*)x^*$$
$$\frac{\partial{L}}{\partial{b}}=2(wx^*+b-y^*)$$
Before training, $w$ and $b$ are initialized and a step size $step$ is set. Each round, $w$ and $b$ are updated as:
$$w_{new}=w-step\cdot 2(wx^*+b-y^*)x^*$$
$$b_{new}=b-step\cdot 2(wx^*+b-y^*)$$
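The update rule above can be sketched in Python for a single sample. The values of `x_star`, `y_star`, and `step` here are illustrative, not from the article's dataset:

```python
# One gradient-descent step for the scalar model y = w*x + b,
# using a single sample (x_star, y_star). Values are illustrative.
w, b = 0.0, 0.0
step = 0.001
x_star, y_star = 2.0, 5.0

# residual of the current prediction
residual = w * x_star + b - y_star
# gradients dL/dw and dL/db
grad_w = 2 * residual * x_star
grad_b = 2 * residual
# parameter update
w = w - step * grad_w
b = b - step * grad_b
```

Repeating this step over many samples drives $w$ and $b$ toward values that minimize the loss.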
Now consider the same linear function $y = wx + b$, but with a batch size of $N$: each input batch $x$ is a vector, denoted $\boldsymbol{x}^*$, and the label $y$ is also a vector, denoted $\boldsymbol{y}^*$. The loss function can then be expressed as:
$$\begin{aligned} L&=\sum_{n=1}^{N}\left(y_n-y_n^*\right)^2\\ &=\sum_{n=1}^{N}\left(wx_n^*+b-y_n^*\right)^2 \end{aligned}$$
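With a batch of $N$ samples, the loss and the gradients simply sum over the batch. A minimal NumPy sketch (the arrays here are illustrative):

```python
import numpy as np

# A batch of N illustrative samples
x_batch = np.array([1.0, 2.0, 3.0])
y_batch = np.array([3.0, 5.0, 7.0])
w, b = 0.0, 0.0

# residuals w*x_n + b - y_n for every sample in the batch
residuals = w * x_batch + b - y_batch
# summed squared-error loss over the batch
loss = np.sum(residuals ** 2)
# gradients are the sums of the per-sample gradients
grad_w = np.sum(2 * residuals * x_batch)
grad_b = np.sum(2 * residuals)
```

Averaging the residuals over the batch smooths out the noise that a single sample would introduce into each update.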
Let's implement this simplest linear regression model in Python:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([0.1,1.2,2.1,3.8,4.1,5.4,6.2,7.1,8.2,9.3,10.4,11.2,12.3,13.8,14.9,15.5,16.2,17.1,18.5,19.2])
y = np.array([5.7,8.8,10.8,11.4,13.1,16.6,17.3,19.4,21.8,23.1,25.1,29.2,29.9,31.8,32.3,36.5,39.1,38.4,44.2,43.4])
print(x,y)
plt.scatter(x,y)
plt.show()


The regression process is as follows:

# set the step size
step = 0.001
# list storing the loss of each round
loss_list = []
# number of epochs
epoch = 30
# define and initialize the parameters w and b
w = 0.0
b = 0.0
# gradient-descent regression
for i in range(epoch):
    # index of the current input x and label y; since the x and y arrays have
    # the same length, take i modulo len(x) to get the current index
    index = i % len(x)
    # x value of the current round
    cx = x[index]
    # y value of the current round
    cy = y[index]
    # record the current loss
    loss_list.append((w * cx + b - cy) ** 2)
    # gradients of the parameters w and b
    grad_w = 2 * (w * cx + b - cy) * cx
    grad_b = 2 * (w * cx + b - cy)
    # update w and b
    w = w - step * grad_w
    b = b - step * grad_b


The output loss is as follows:

plt.plot(loss_list)
plt.show()


Print the result of the fitted function:

print("y=%.2fx+%.2f" %(w,b))

y=2.46x+0.39


The relationship between the fitted function image and the points in the training data is as follows:
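Assuming the fitted values $w \approx 2.46$ and $b \approx 0.39$ printed above (the exact numbers depend on the run), the fitted line can be drawn over the training points like this:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove when running locally
import matplotlib.pyplot as plt

# training data from above
x = np.array([0.1,1.2,2.1,3.8,4.1,5.4,6.2,7.1,8.2,9.3,10.4,11.2,12.3,13.8,14.9,15.5,16.2,17.1,18.5,19.2])
y = np.array([5.7,8.8,10.8,11.4,13.1,16.6,17.3,19.4,21.8,23.1,25.1,29.2,29.9,31.8,32.3,36.5,39.1,38.4,44.2,43.4])

# fitted parameters (illustrative; use the w and b from your own run)
w, b = 2.46, 0.39

plt.scatter(x, y)             # training points
plt.plot(x, w * x + b, "r-")  # fitted line y = wx + b
plt.show()
```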

The image above shows the fitted function after 30 iterations. Increasing the number of iterations to 3000 gives the following fitting results:

The loss is as follows:

When the batch size is 1, the loss fluctuates greatly, so it is necessary to increase the batch size. In the next article, we will increase the batch size on this basis and see the resulting linear regression.

Origin blog.csdn.net/zhuzheqing/article/details/129354364