PyTorch Tutorial Introduction Series 9----The Meaning of Loss Functions

Foreword

As we continue learning PyTorch, we have arrived at model evaluation. In this installment, we give a basic introduction to the loss function.


1. What is a loss function?

A loss function measures the distance between the model's predictions and the true values. By quantifying this gap mathematically, we can evaluate the model's predictive accuracy and optimize it. When the error between the model's predictions and the true values is small, the value of the loss function is also small.
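For instance, with two hypothetical samples, the gap can be quantified directly on tensors (a minimal sketch, not tied to any particular model):

import torch

y_true = torch.tensor([3.0, 5.0])   # ground-truth values
y_pred = torch.tensor([2.5, 5.5])   # model predictions
# mean squared gap: (0.5^2 + 0.5^2) / 2 = 0.25
print(torch.mean((y_pred - y_true) ** 2))  # tensor(0.2500)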

2. The meaning of the loss function

With a loss function, the whole model has a cornerstone for training. More importantly, the loss function not only sets the goal the model optimizes toward, but also shapes the design of the model itself.
Its essential meaning is to guide the network's predictions toward the real situation by minimizing the loss.
In other words: only once the way the loss is computed has been defined is there a plan for building the model.
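To make this concrete, here is a minimal training-loop sketch, assuming a toy linear model, MSE loss, and SGD chosen purely for illustration; minimizing the loss is what pulls the parameters toward the true relationship:

import torch
from torch import nn

# toy data following y = 2x
x = torch.tensor([[1.0], [2.0], [3.0]])
y = torch.tensor([[2.0], [4.0], [6.0]])

model = nn.Linear(1, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)  # quantify the gap to the true values
    loss.backward()              # gradients of the loss w.r.t. the parameters
    optimizer.step()             # update the parameters to reduce the loss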

3. Common loss functions

1. Squared Loss Function:

Its advantage is that it is differentiable, so the minimum of the loss function can be found conveniently with gradient descent.
For a given training data set, the loss is defined as the average of the squared differences between the predicted values and the true values over all training samples. The per-sample formula is $\frac{1}{2}(y - \hat{y})^2$, and it is often used in regression problems.

API in PyTorch: nn.MSELoss
Its implementation is very simple, mainly calling the torch.pow and torch.mean functions in PyTorch.
Here is a simplified sketch of nn.MSELoss (example, not the literal source):

class MSELoss(_Loss):
    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        # average of the squared differences between predictions and targets
        return torch.mean(torch.pow(input - target, 2))

In this code, MSELoss inherits from the _Loss class and defines a forward method that computes the value of the loss function. The method takes two arguments: input, the model's predictions, and target, the target values. The implementation is straightforward: it squares the differences between predictions and targets, then averages them to obtain the loss value.
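A quick usage check of nn.MSELoss, with values chosen arbitrarily for illustration:

import torch
from torch import nn

mse = nn.MSELoss()  # default reduction='mean'
pred = torch.tensor([1.0, 2.0, 3.0])
true = torch.tensor([1.5, 2.0, 2.0])
# (0.25 + 0.0 + 1.0) / 3 ≈ 0.4167
print(mse(pred, true))  # tensor(0.4167)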

2. Absolute Loss Function:

Its advantage is that it is less sensitive to outliers, which helps prevent outliers from making the model's predictions inaccurate.
Similar to the squared loss function, for a given training data set the loss is defined as the average of the absolute differences between the predicted values and the true values over all training samples. The per-sample formula is $|y - \hat{y}|$, and it is also commonly used in regression problems.

API in PyTorch: nn.L1Loss
Its implementation is similar to nn.MSELoss, mainly calling the torch.abs and torch.mean functions in PyTorch.
Here is a simplified sketch of nn.L1Loss (example, not the literal source):

class L1Loss(_Loss):
    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        # average of the absolute differences between predictions and targets
        return torch.mean(torch.abs(input - target))

In this code, the L1Loss class also inherits from the _Loss class and defines a forward method that computes the value of the loss function. The method takes two arguments: input, the model's predictions, and target, the target values. The implementation is straightforward: it takes the absolute differences between predictions and targets, then averages them to obtain the loss value.
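Using the same arbitrary values as in the MSE check above makes the difference between the two losses easy to see:

import torch
from torch import nn

l1 = nn.L1Loss()
pred = torch.tensor([1.0, 2.0, 3.0])
true = torch.tensor([1.5, 2.0, 2.0])
# (0.5 + 0.0 + 1.0) / 3 = 0.5
print(l1(pred, true))  # tensor(0.5000)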

3. Cross-Entropy Loss Function:

Its advantage is that it represents the gap between the model's predicted distribution and the true distribution well, and it is convenient to minimize.
It is used to measure the difference between two probability distributions. The formula is $-\sum_{i}^{n} y_i \log(\hat{y_i})$, and it is commonly used in classification problems.

API in PyTorch: nn.CrossEntropyLoss
Its implementation is slightly more involved than the previous two classes, mainly calling the torch.nn.functional.log_softmax and torch.nn.functional.nll_loss functions in PyTorch.
Here is a simplified sketch of nn.CrossEntropyLoss (example, not the literal source):

class CrossEntropyLoss(_WeightedLoss):
    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        # turn raw logits into log-probabilities, then take the negative log-likelihood
        log_probs = torch.nn.functional.log_softmax(input, dim=1)
        return torch.nn.functional.nll_loss(log_probs, target, weight=self.weight, reduction=self.reduction)

In this code, the CrossEntropyLoss class inherits from the _WeightedLoss class and defines a forward method that computes the value of the loss function. The method takes two arguments: input, the model's raw scores (logits), and target, the target class indices. The implementation first applies log_softmax to turn the logits into log-probabilities, then calls torch.nn.functional.nll_loss; together the two steps compute the cross-entropy loss between the model's predictions and the targets.
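A short usage check, with arbitrary logits for a single three-class sample, also confirms that nn.CrossEntropyLoss matches the explicit log_softmax + nll_loss composition:

import torch
from torch import nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, 0.1]])  # raw scores for 3 classes
target = torch.tensor([0])                # the correct class index

ce = nn.CrossEntropyLoss()
print(ce(logits, target))

# the same value via explicit log_softmax + nll_loss
print(F.nll_loss(F.log_softmax(logits, dim=1), target))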

4. Use of related APIs

APIs in PyTorch: nn.MSELoss, nn.L1Loss, nn.CrossEntropyLoss

import torch
from torch import nn

# Define the model
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear1 = nn.Linear(1, 1)  # regression head: one value
        self.linear2 = nn.Linear(1, 2)  # classification head: two-class logits

    def forward(self, x):
        return self.linear1(x), self.linear2(x)

# Instantiate the model
model = Model()

# Define the loss functions
mse_loss = nn.MSELoss()
l1_loss = nn.L1Loss()
ce_loss = nn.CrossEntropyLoss()

# Compute the losses
input = torch.tensor([[1.0]])
target1 = torch.tensor([[2.0]])
target2 = torch.tensor([1])  # class index with shape (N,), as CrossEntropyLoss expects
output1, output2 = model(input)

loss1 = mse_loss(output1, target1)
loss2 = l1_loss(output1, target1)
loss3 = ce_loss(output2, target2)

total_loss = loss1 + loss2 + loss3
print(total_loss)

In this example, we define a model with two heads: one outputs a single value (regression) and the other outputs logits for a binary classification. We then define three loss functions: nn.MSELoss computes the squared loss on the value, nn.L1Loss computes the absolute loss on the value, and nn.CrossEntropyLoss computes the cross-entropy loss on the classification logits (note that its target is a class index, not a one-hot vector). Finally, we sum the three losses to obtain the total loss.
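If the tasks matter differently, the individual losses can also be weighted before summing; the coefficients below are arbitrary and purely for illustration:

# weight the individual losses before summing (hypothetical weights)
weighted_loss = 1.0 * loss1 + 0.5 * loss2 + 2.0 * loss3
print(weighted_loss)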


Summary

That is all for today. These three loss functions are the most commonly used ones. The choice of loss function should be determined by the type and requirements of the actual problem: for regression problems, choose the squared loss or the absolute loss; for classification problems, choose the cross-entropy loss.

Origin blog.csdn.net/weixin_46417939/article/details/128227993