This post covers the relationship between the loss function and regularization in one place. What is the point of adding regularization to the loss function, and how do you add it? Here is a detailed explanation with easy-to-understand examples! (Series 1)

1. In BP neural network prediction, the common loss function is the mean squared error (MSE) loss.

In BP neural network prediction, the common loss function is the mean squared error loss (Mean Squared Error, MSE). In addition, L2 regularization is often used to prevent the model from overfitting.

The formula for the mean squared error (MSE) loss function is as follows:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$

where:

  • n is the number of samples
  • $y_i$ is the true value of the i-th sample
  • $\hat{y}_i$ is the predicted value of the i-th sample
  • MSE is the average squared difference between the predicted and true values, which reflects the accuracy of the model's predictions (see the sketch below).
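To make the formula concrete, here is a minimal NumPy sketch of the MSE computation; the arrays `y_true` and `y_pred` are hypothetical example data, not from the original post:

```python
import numpy as np

# Hypothetical true values and predictions for n = 4 samples
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

# MSE = (1/n) * sum_i (y_i - y_hat_i)^2
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.375
```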

2. The formula for L2 regularization is as follows:

$$L2 = \lambda \sum_{i=1}^{n} \sum_{j=1}^{k} w_{ij}^2$$
where:

  • n is the number of inputs to a layer and k is the number of neurons
  • $w_{ij}$ is the weight connecting the i-th input to the j-th neuron; in practice the double sum runs over every weight in the network
  • The L2 regularization term limits the size of the weights in the model, thereby preventing overfitting (a numeric sketch follows this list).
  • $\lambda$ is the regularization strength. It is a hyperparameter that needs to be tuned for the problem at hand: a larger $\lambda$ strengthens the regularization effect and helps prevent overfitting;
  • a smaller $\lambda$ weakens the regularization effect, allowing the model to fit the training data more closely, at the risk of overfitting.
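Here is a minimal sketch of the L2 term for a single weight matrix; the matrix `W` and the value of `lam` are hypothetical examples:

```python
import numpy as np

# Hypothetical 3x2 weight matrix of one layer
W = np.array([[ 0.5, -1.0],
              [ 2.0,  0.1],
              [-0.3,  0.7]])

lam = 0.01  # regularization strength lambda (hyperparameter)

# L2 = lambda * sum over all weights of w_ij^2
l2_penalty = lam * np.sum(W ** 2)
print(l2_penalty)  # 0.01 * 5.84 = 0.0584
```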

3. The formula for the total loss function combining MSE and L2 regularization is as follows:

$$\mathrm{TotalLoss} = \mathrm{MSE} + L2$$
The total loss function TotalLoss is the sum of the MSE and the L2 regularization term, with $\lambda$ controlling their relative weighting: MSE measures the model's prediction accuracy, while the L2 term guards against overfitting. By minimizing the total loss, the model optimizes prediction accuracy and generalization ability simultaneously during training, as the sketch below illustrates.
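Combining the two pieces above, here is a minimal sketch of the total loss; the function name `total_loss` and the example values are assumptions for illustration:

```python
import numpy as np

def total_loss(y_true, y_pred, weights, lam):
    """Total loss = MSE + L2 penalty summed over all weight matrices."""
    mse = np.mean((y_true - y_pred) ** 2)
    l2 = lam * sum(np.sum(W ** 2) for W in weights)
    return mse + l2

# Hypothetical data: reuses the example arrays and weight matrix from above
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])
weights = [np.array([[0.5, -1.0], [2.0, 0.1], [-0.3, 0.7]])]
print(total_loss(y_true, y_pred, weights, lam=0.01))  # 0.375 + 0.0584 = 0.4334
```

Minimizing this quantity during training pushes the weights toward zero while still fitting the targets, which is exactly the complexity penalty described above.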


Summary

A smaller L2 value means smaller model weights and hence lower model complexity. L2 regularization therefore constrains the model's complexity and helps avoid overfitting. Combining MSE with L2 regularization yields a loss function that optimizes prediction error and model complexity at the same time, making the model generalize better in regression prediction tasks.

Origin: blog.csdn.net/qlkaicx/article/details/134841886