Lecture 2: Linear Model


1 The training process of the model

1.1 Introduction of Classroom Cases

Let's look at a picture first:

Figure 1 Classroom case

When x=1, y=2; when x=2, y=4; when x=3, y=6. What is y when x=4?

A human can see the pattern immediately: $y = 2x$, so when x=4, y=8. For a computer this is harder. What process must a computer go through to reach the same conclusion? The flow chart below shows it:

Put simply, we run a training algorithm on the data set to obtain a model, and then use that model to predict results for future inputs.

Figure 2 Relationship Diagram

In Figure 1, the orange part corresponds to the training set and the light blue part to the test set. The whole process is called supervised learning, a machine learning approach that learns to predict outcomes for unseen data by training on sample data with known correct answers. The model is adjusted repeatedly based on the difference between the value it computes and the correct value, until it predicts the results accurately enough.

For the case in Figure 1, in short: train the model on the orange part, then feed in the test input x=4 and let the model predict the corresponding y.

1.2 The relationship between the data set, training set, validation set, and test set

Figure 3 Note: the development set is the validation set

Data set: The data set comprises the training set, the validation set, and the test set.

Validation set: To check how well the model generalizes during development, we usually split the training data into two parts: one used for training (the training set) and one used for evaluating the model (the validation set).

Test set: The test set is used to evaluate the generalization ability of the final model, that is, its predictive ability on unseen data.

Usually, we train the model on the training set and then use the test set to measure its generalization ability. However, training on the training set alone is prone to overfitting. Overfitting means the model adapts too closely to the training data and therefore generalizes poorly to new data. Put simply, the model learns features we do not want, such as noise in images.

To mitigate overfitting, we use a validation set. By estimating the model's generalization ability, the validation set helps developers tune the model's hyperparameters and reduce the risk of overfitting.

Why evaluate on the validation set rather than directly on the test set? Because if the test set is used to tune the model, the model's settings end up fitted to the test data. In other words, the model overfits the test set, and the test set no longer gives an honest estimate of performance on unseen data.
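The three-way split described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_dataset`, its parameters `val_fraction`, `test_fraction`, and `seed`, and the 60/20/20 proportions are all assumptions made here, not something given in the lecture.

```python
import random


def split_dataset(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle a copy of `samples` and cut it into train/validation/test lists.

    Hypothetical helper for illustration; the fractions are assumptions.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    test_set = shuffled[:n_test]
    val_set = shuffled[n_test:n_test + n_val]
    train_set = shuffled[n_test + n_val:]
    return train_set, val_set, test_set


# Toy data following the classroom rule y = 2x
samples = [(float(x), 2.0 * x) for x in range(1, 11)]
train_set, val_set, test_set = split_dataset(samples)
print(len(train_set), len(val_set), len(test_set))  # 6 2 2
```

The key point is that the three lists do not overlap: hyperparameters are tuned against `val_set`, and `test_set` is touched only once at the end.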

For more details, please refer to this article: http://t.csdn.cn/EAm93

2 Model settings

2.1 Linear Model

The linear model given in class is $\hat{y} = x \cdot w + b$:

Figure 4 Linear model

Here $\hat{y}$ is the predicted value, $w$ is the weight, and $b$ is the bias (offset). The bias lets the model fit the data better and has an important impact on the model's performance.

The problem we now have to solve is to determine the values of $w$ and $b$ so that the trained model generalizes well.
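As a small sketch of the model above, the prediction can be written as a Python function. Passing `w` and `b` as explicit arguments is a choice made here for illustration, and the values used in the calls are arbitrary examples rather than trained parameters.

```python
def forward(x, w, b=0.0):
    """Linear model: predicted value y_hat = x * w + b."""
    return x * w + b


# With w = 2 and no bias, the model reproduces the classroom case.
print(forward(4.0, w=2.0))          # 8.0
# A non-zero bias shifts every prediction by the same amount.
print(forward(4.0, w=2.0, b=1.0))   # 9.0
```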

Let us first simplify the model by dropping the bias, which gives $\hat{y} = x \cdot w$:

Figure 5 Simplified linear model

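We can now plot the original data:

Figure 6 The correct result plotted from the original data

Next, we need to keep adjusting the value of $w$ so that the line $\hat{y} = x \cdot w$ coincides with the 'True Line' in the figure.

Figure 7 Different weights give different lines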

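2.2 Loss Function

So how do we determine $w$? In machine learning, we usually start from a random value of $w$ and then use an evaluation function to compute the error between the prediction and the true value for different values of $w$, finally settling on the $w$ that fits best (for example, deciding which line in Figure 7 best matches the True Line). This evaluation function is called the loss function. The loss function used in class is $loss = (\hat{y} - y)^2 = (x \cdot w - y)^2$:

Figure 8 Loss function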
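To make the loss concrete, here is a minimal sketch that evaluates the squared error of each training sample for one candidate weight. The choice `w = 3.0` is an arbitrary example picked here, and passing `w` as a function argument is an assumption of this sketch.

```python
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]


def loss(x, y, w):
    """Squared error of the simplified model y_hat = x * w on one sample."""
    y_pred = x * w
    return (y_pred - y) ** 2


w = 3.0  # arbitrary candidate weight, for illustration only
losses = [loss(x, y, w) for x, y in zip(x_data, y_data)]
print(losses)  # [1.0, 4.0, 9.0]
```

With $w = 3$ the predictions are 3, 6, and 9, so the errors grow with $x$, and squaring them penalizes the larger misses more heavily.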

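Using the loss function, we compute the loss for different values of $w$. The goal is to find the value of $w$ that minimizes the mean loss:

Figure 9 Case 1

Figure 10 Case 2

Figure 11 Case 3

In the evaluation above, the mean loss is smallest when $w=2$, so we take $w=2$ as the optimal value.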

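2.3 Mean Square Error

The loss function applies to a single sample only. To estimate the error over the entire training set, we introduce a new measure: the mean square error (MSE). The smaller the MSE, the closer the model's predictions are to the actual values and the more accurate the model. The MSE formula sums the squared differences between the predicted and true values over all samples and then divides by the number of samples: $MSE = \frac{1}{N} \sum_{n=1}^{N} (\hat{y}_n - y_n)^2$.

Figure 12 Relationship between MSE and loss

This gives the following table:

Figure 13 MSE values for each case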
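A table of this kind can be reproduced with a short sketch. The candidate weights 0 through 4 are chosen here for illustration and may not match the cases shown in the figure. The helper `mse` is likewise a name introduced for this example.

```python
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]


def mse(w, xs, ys):
    """Mean square error of y_hat = x * w over all samples."""
    total = sum((x * w - y) ** 2 for x, y in zip(xs, ys))
    return total / len(xs)


# Candidate weights picked for illustration only
for w in [0.0, 1.0, 2.0, 3.0, 4.0]:
    print(f"w={w:.1f}  MSE={mse(w, x_data, y_data):.2f}")
# w=0.0  MSE=18.67
# w=1.0  MSE=4.67
# w=2.0  MSE=0.00
# w=3.0  MSE=4.67
# w=4.0  MSE=18.67
```

The values are symmetric around $w = 2$, where the MSE drops to zero because the model matches every training sample exactly.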

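2.4 Exhaustive search and code implementation

The method used in the evaluation above is called exhaustive search: sample values of $w$ over a certain range, compute the loss for each candidate value, plot the resulting curve, and read the optimal value off the plot. For example, the figure below shows that the lowest point of the curve is the optimal value of $w$.

Figure 14 Curve plotted by exhaustive search to determine the optimal value

Code implementation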

import numpy as np
import matplotlib.pyplot as plt

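# Training set
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]


# Linear model (reads the current weight from the global variable w)
def forward(x):
    return x * w


# Loss function: squared error for a single sample
def loss(x, y):
    y_pred = forward(x)  # compute y_hat
    return (y_pred - y) * (y_pred - y)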


w_list = []  # stores each candidate weight
mse_list = []  # stores the MSE for each candidate weight

# Candidate weights from 0.0 up to 4.0 in steps of 0.1 (the stop value 4.1 is excluded)
for w in np.arange(0.0, 4.1, 0.1):
    print('w=', w)
    l_sum = 0
    # zip(x_data, y_data) pairs each x with its y so both lists can be iterated together
    for x_val, y_val in zip(x_data, y_data):
        y_pred_val = forward(x_val)  # predicted value (only needed for printing)
        loss_val = loss(x_val, y_val)  # loss for this sample
        l_sum += loss_val  # accumulate the losses
        print('\t', '%.2f' % x_val, '%.2f' % y_val, '%.2f' % y_pred_val, '%.2f' % loss_val)
    mse = l_sum / len(x_data)  # sum divided by the number of samples gives the MSE
    print('MSE=', mse)
    w_list.append(w)
    mse_list.append(mse)

# Plot the MSE for each candidate weight
plt.plot(w_list, mse_list)
plt.ylabel('Loss')
plt.xlabel('w')
plt.show()

Figure 15 Output

Figure 16 Generated plot

Figure 17 Generated data
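The plot is read by eye, but the minimum can also be picked out programmatically. The following is a rough vectorized variant of the same exhaustive search. It is not the code from class; the use of `np.outer` and `np.argmin`, and the names `w_candidates`, `predictions`, and `mse_values`, are choices made for this sketch.

```python
import numpy as np

x_data = np.array([1.0, 2.0, 3.0])
y_data = np.array([2.0, 4.0, 6.0])

# Same candidate grid as the loop version
w_candidates = np.arange(0.0, 4.1, 0.1)

# predictions[i, j] = w_candidates[i] * x_data[j]
predictions = np.outer(w_candidates, x_data)

# Mean of the squared errors across samples, one MSE per candidate weight
mse_values = ((predictions - y_data) ** 2).mean(axis=1)

# Index of the smallest MSE gives the best weight on this grid
best_index = int(np.argmin(mse_values))
best_w = float(w_candidates[best_index])
print(f"best w = {best_w:.1f}, MSE = {mse_values[best_index]:.4f}")
```

Note that exhaustive search can only return a value that lies on the sampled grid, so the step size limits how precisely the optimum is located.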


Origin blog.csdn.net/m0_56494923/article/details/128904287