Loss function: Logarithmic Loss (Log Loss)

Logarithmic Loss (Log Loss) is a loss function used to evaluate classification models. It is most commonly applied to binary classification problems, but it extends naturally to multi-class classification as well.

In binary classification problems, Log Loss measures the logarithmic error between the predicted probability and the actual label. For a sample $i$, suppose its actual label is $y_i$ (taking the value 0 or 1) and the probability predicted by the model is $\hat{y}_i$ ($0 \le \hat{y}_i \le 1$). The log loss over the dataset is then:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right]$$

where $N$ is the total number of samples. When the predicted probability is close to the actual label, the log loss is close to 0; as the predicted probability deviates from the actual label, the log loss increases.
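
As a quick illustration, here is a minimal sketch in plain Python (the probabilities are made up for the example) that computes this formula directly:

import math

def binary_log_loss(y_true, y_prob):
    """Average binary log loss over labels and predicted probabilities."""
    n = len(y_true)
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(y_true, y_prob)
    ) / n

# A confident, correct prediction gives a small loss ...
print(binary_log_loss([1], [0.9]))   # ≈ 0.105
# ... while a confident, wrong prediction is penalized heavily.
print(binary_log_loss([1], [0.1]))   # ≈ 2.303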

For multi-class classification problems, Log Loss is defined slightly differently. Suppose there are $K$ classes. Let $y_{i,j}$ be the actual label of sample $i$ (0 or 1, indicating whether sample $i$ belongs to class $j$), and let $\hat{y}_{i,j}$ ($0 \le \hat{y}_{i,j} \le 1$) be the probability predicted by the model that sample $i$ belongs to class $j$. The log loss is then:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{K} y_{i,j} \log \hat{y}_{i,j}$$

where $N$ is the total number of samples. The calculation mirrors the binary case, except that the log errors are summed over all classes.
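
Again as a minimal sketch (with made-up, one-hot labels and probabilities), this formula can be computed directly:

import math

def multiclass_log_loss(y_true, y_prob):
    """Average multi-class log loss; y_true rows are one-hot, y_prob rows sum to 1."""
    n = len(y_true)
    return -sum(
        y * math.log(p)
        for true_row, prob_row in zip(y_true, y_prob)
        for y, p in zip(true_row, prob_row)
        if y > 0  # only the true class contributes to the sum
    ) / n

# One sample, three classes; the true class (index 1) got probability 0.7.
print(multiclass_log_loss([[0, 1, 0]], [[0.2, 0.7, 0.1]]))  # ≈ 0.357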

Log loss is a commonly used loss function for training classification models. The closer the model's predicted probability is to the actual label, the smaller the log loss, so minimizing it drives the model to fit the data better and improves classification accuracy.

In PyTorch, you can use nn.BCELoss() to calculate the log loss for binary classification problems and nn.CrossEntropyLoss() for multi-class classification problems. The following describes how to use each of them.

For binary classification problems, the log loss can be calculated as follows:

import torch
import torch.nn as nn

# Define a binary classification model
model = nn.Sequential(
    nn.Linear(10, 1),
    nn.Sigmoid()
)

# Define the loss function
criterion = nn.BCELoss()

# Suppose there are 100 samples, each with 10 features
x = torch.randn(100, 10)
y = torch.randint(0, 2, (100, 1)).float()

# Forward pass to compute the loss
y_pred = model(x)
loss = criterion(y_pred, y)

# Backward pass to compute the gradients
loss.backward()

In the code above, we first define a binary classification model consisting of a linear layer and a sigmoid activation function, then define the loss function with nn.BCELoss(). In the forward pass, we compute the model's predicted probabilities for all samples and pass them, together with the actual labels, to the loss function to obtain the log loss. Finally, loss.backward() computes the gradients via backpropagation; an optimizer would then use them to update the model parameters.
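
To connect the built-in loss back to the formula above, the following check (a sketch with hand-picked probabilities) shows that nn.BCELoss() agrees with the manual computation:

import torch
import torch.nn as nn

y_pred = torch.tensor([[0.9], [0.2], [0.6]])
y_true = torch.tensor([[1.0], [0.0], [1.0]])

builtin = nn.BCELoss()(y_pred, y_true)
manual = -(y_true * torch.log(y_pred)
           + (1 - y_true) * torch.log(1 - y_pred)).mean()
print(builtin.item(), manual.item())  # both ≈ 0.2798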

For multi-class classification problems, the log loss can be calculated as follows:

import torch
import torch.nn as nn

# Define a multi-class classification model.
# Note: there is no Softmax layer here, because nn.CrossEntropyLoss()
# expects raw logits and applies log-softmax internally.
model = nn.Linear(10, 5)

# Define the loss function
criterion = nn.CrossEntropyLoss()

# Suppose there are 100 samples, each with 10 features, and 5 classes
x = torch.randn(100, 10)
y = torch.randint(0, 5, (100,))  # class indices, not one-hot vectors

# Forward pass to compute the loss
y_pred = model(x)  # raw logits with shape (100, 5)
loss = criterion(y_pred, y)

# Backward pass to compute the gradients
loss.backward()

In the code above, we define a multi-class classification model as a single linear layer that outputs a raw score (logit) for each class. We deliberately omit a Softmax layer: nn.CrossEntropyLoss() combines log-softmax and negative log-likelihood internally, so adding Softmax to the model would apply it twice and distort the loss. In the forward pass, we compute the logits for all samples and pass them, together with the class-index labels, to the loss function to obtain the log loss. Finally, loss.backward() computes the gradients for the parameter update.
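
To make the internal softmax explicit, the following sketch verifies that nn.CrossEntropyLoss() is equivalent to applying log-softmax followed by nn.NLLLoss():

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(4, 5)           # 4 samples, 5 classes
targets = torch.randint(0, 5, (4,))  # class indices

ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(ce, nll))  # True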

To use logarithmic loss as the training objective, call the loss function during each training step in PyTorch and backpropagate the resulting loss to update the model parameters. Here is an example of training a binary classification model with the log loss function:

import torch
import torch.nn as nn
import torch.optim as optim

# Define a binary classification model
class BinaryClassifier(nn.Module):
    def __init__(self, input_size):
        super(BinaryClassifier, self).__init__()
        self.linear = nn.Linear(input_size, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.linear(x)
        x = self.sigmoid(x)
        return x

# Define the training function
def train(model, train_loader, criterion, optimizer):
    model.train()
    train_loss = 0
    correct = 0
    total = 0
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
        pred = (output >= 0.5).float()  # threshold probabilities at 0.5
        correct += pred.eq(target).sum().item()
        total += target.size(0)
    accuracy = 100. * correct / total
    train_loss /= len(train_loader)
    print('Train set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)'.format(
        train_loss, correct, total, accuracy))

# Define the training data
train_data = torch.randn(1000, 10)
train_target = torch.randint(0, 2, (1000, 1)).float()
train_dataset = torch.utils.data.TensorDataset(train_data, train_target)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)

# Define the model, loss function, and optimizer
model = BinaryClassifier(input_size=10)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Start training
for epoch in range(10):
    train(model, train_loader, criterion, optimizer)

In the above code, we first define a binary classification model and a training function train(). In train(), we call the model's train() method so that layers such as dropout and batch normalization (if present) behave correctly during training. We then iterate over the training dataset, compute the predicted output and log loss for each mini-batch, backpropagate to compute the gradients, and update the model parameters. Finally, we compute and print the average loss and accuracy over the training set.

In the main program, we define the training dataset and wrap it in a DataLoader for batched training. We then define the model, loss function, and optimizer, and call train() for 10 epochs.
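
Once trained, the model can be used for prediction in a similar way. A minimal sketch (new_data here is just random placeholder input):

# Switch to evaluation mode and predict on new data
model.eval()
with torch.no_grad():
    new_data = torch.randn(5, 10)   # 5 hypothetical new samples
    probs = model(new_data)         # predicted probabilities in [0, 1]
    preds = (probs >= 0.5).float()  # hard class labels
print(probs.squeeze(), preds.squeeze())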

It should be noted that the loss function used here, nn.BCELoss(), expects the model output to be probabilities in [0, 1], which is why the model ends with a Sigmoid layer. If the model outputs raw logits instead, use nn.BCEWithLogitsLoss(), which combines the sigmoid and the log loss in a single, more numerically stable operation.
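
For example, a minimal sketch of the logits-based variant (the logit_model name is just for illustration):

import torch
import torch.nn as nn

# No Sigmoid in the model: BCEWithLogitsLoss applies it internally
logit_model = nn.Linear(10, 1)
criterion = nn.BCEWithLogitsLoss()

x = torch.randn(100, 10)
y = torch.randint(0, 2, (100, 1)).float()

loss = criterion(logit_model(x), y)
loss.backward()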
