PyTorch: Softmax function and cross entropy function

Binary classification and multi-class classification problems

  • Binary classification problems:

    A classification task with two categories. For example, the perceptron from before recognizes bananas or apples: we train a classifier that takes an image as input and outputs the probability p that the image is an apple. Rounding p gives an output of 0 or 1. This is the classic binary classification problem.

  • Multi-class problems:

    Basically the same as the binary task, except that the final output has multiple labels (more than two), and the classifier has to recognize several kinds of fruit.

  • In the binary classification problem we can use the max function, because there are only two labels: the answer is either one or the other. In a multi-class problem with many labels that simple judgment no longer works, so we use the Softmax function instead.
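
    Below is a minimal sketch of the two cases (the scores and class names are made up for illustration): a single sigmoid probability rounded to 0/1 for the binary case, and softmax followed by argmax for the multi-class case.

    import torch

    # Binary case: one probability, rounded to 0 or 1
    p = torch.sigmoid(torch.tensor(0.8))     # probability that the image is an apple
    print(torch.round(p))                    # tensor(1.)

    # Multi-class case: softmax over several scores, then take the most probable class
    scores = torch.tensor([1.2, 0.3, 2.5])   # e.g. scores for banana / apple / orange
    probs = torch.softmax(scores, dim=0)     # probabilities that sum to 1
    print(torch.argmax(probs))               # tensor(2) -> third class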

Softmax function

  • The Softmax function maps its inputs to real numbers between 0 and 1 and, through normalization, ensures that these numbers sum to 1; it is a kind of activation function.

  • In this way, a multi-class task can assign a probability to each class.

  • Softmax layers are often used in conjunction with the cross-entropy loss function.
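
Concretely, for an input vector $x$ the $i$-th output of softmax is

$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}$$

so every output lies between 0 and 1 and the outputs sum to 1.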

Writing Softmax with NumPy

import numpy as np

def softmax(x):
    # exponentiate each entry, then normalize so the outputs sum to 1
    return np.exp(x) / np.sum(np.exp(x))


x = np.array([2.0, 1.0, 0.1])

outputs = softmax(x)
print("Input:", x)
print("Output:", outputs)
print("Sum of outputs:", outputs.sum())

It can be seen that the largest input, 2.0, receives the largest output probability, and the outputs sum to 1.
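
One caveat about this direct implementation (not mentioned in the original code): np.exp overflows for large inputs. A common, numerically equivalent variant subtracts the maximum first; a small sketch:

import numpy as np

def stable_softmax(x):
    # subtracting the max does not change the result, but keeps np.exp from overflowing
    shifted = x - np.max(x)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

print(stable_softmax(np.array([1000.0, 1001.0, 1002.0])))  # the naive version would return nan here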

The softmax() function provided by PyTorch

The dim parameter (the dimension along which the values are normalized so that they sum to 1; a 2-D example follows the code below):

  1. For a 1-D tensor such as the one below, dim=0 computes softmax over all of the data
  2. For a 2-D tensor, dim=0 computes softmax down each column and dim=1 computes softmax across each row
  3. dim=-1 refers to the last dimension (the same as dim=1 for a 2-D tensor, or dim=2 for a 3-D one)

import torch

x = torch.tensor([2.0, 1.0, 0.1])
if torch.cuda.is_available():
    x = x.cuda()  # cuda() returns a copy, so it has to be assigned back to x
outputs = torch.softmax(x, dim=0)
print("Input:", x)
print("Output:", outputs)
print("Sum of outputs:", outputs.sum())


Cross entropy function

An introductory example:

  • Suppose we want to predict the origin of a wine from its alcohol concentration, malic acid concentration, and ash concentration. Assume the output labels are UK, France, and US.
  • The multi-class task uses the Softmax function to normalize a vector of scores into a probability distribution vector whose entries sum to 1.
  1. Model 1:

| Sample | Predicted probabilities | True label (one-hot) | Correct? |
| --- | --- | --- | --- |
| 1 | 0.3, 0.3, 0.4 | 0 0 1 (US) | correct |
| 2 | 0.3, 0.4, 0.3 | 0 1 0 (France) | correct |
| 3 | 0.1, 0.2, 0.7 | 1 0 0 (UK) | wrong |
  • It can be seen that when sample 1 is predicted correctly, 0.4 is only 0.1 higher than 0.3, so the prediction is only barely correct
  • When sample 3 is predicted wrongly, the gap is as large as 0.6, a completely wrong prediction
  2. Model 2 (using the cross-entropy loss function):

| Sample | Predicted probabilities | True label (one-hot) | Correct? |
| --- | --- | --- | --- |
| 1 | 0.1, 0.2, 0.7 | 0 0 1 (US) | correct |
| 2 | 0.1, 0.7, 0.2 | 0 1 0 (France) | correct |
| 3 | 0.3, 0.4, 0.3 | 1 0 0 (UK) | wrong |
  • It can be seen that when Model 2 predicts correctly, the gap is as large as 0.6, so the prediction is confidently correct

  • When it predicts wrongly, the gap is only 0.1
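
As a quick numeric check (not in the original post), the cross entropy of a sample with a one-hot label reduces to -log of the probability the model assigns to the true class; averaging it over the three samples shows that Model 2 gets a much lower loss:

import numpy as np

# probability assigned to the TRUE class for each of the three samples
model_1 = np.array([0.4, 0.4, 0.1])   # read off the first table
model_2 = np.array([0.7, 0.7, 0.3])   # read off the second table

# with a one-hot label, cross entropy is -log(p_true_class)
print((-np.log(model_1)).mean())      # roughly 1.38
print((-np.log(model_2)).mean())      # roughly 0.64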

  • The cross entropy function is a loss function

  • The loss function reflects the gap between the predicted result and the actual result

  • The cross entropy function comes in a binary classification form (torch.nn.BCELoss()) and a multi-class form (torch.nn.CrossEntropyLoss())

loss = torch.nn.BCELoss()
l = loss(prediction, target)   # called as loss(predicted value, true value)

# binary classification loss function
loss = torch.nn.BCELoss()
l = loss(pred, real)

# multi-class loss function
loss = torch.nn.CrossEntropyLoss()
l = loss(pred, real)
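
A minimal runnable sketch of both losses (the tensor values are made up). Note that torch.nn.BCELoss expects probabilities between 0 and 1, while torch.nn.CrossEntropyLoss expects raw, un-softmaxed scores plus class indices, because it applies log-softmax internally:

import torch

# binary classification: probabilities (e.g. after a sigmoid) and 0/1 targets
bce = torch.nn.BCELoss()
pred = torch.tensor([0.8, 0.2, 0.6])      # predicted probabilities
real = torch.tensor([1.0, 0.0, 1.0])      # true labels as floats
print(bce(pred, real))

# multi-class classification: raw scores (logits) and integer class indices
ce = torch.nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 2.5, 1.0]])  # two samples, three classes
targets = torch.tensor([0, 1])            # true class index of each sample
print(ce(logits, targets))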

Origin blog.csdn.net/bjsyc123456/article/details/124890272