What about the loss function

1. Description

  Non-original, source: Panchuang AI public account (slightly modified).

2. Content

The loss function, also known as the cost function, measures the degree of inconsistency between the model's predicted values and the actual values, and it is also the objective function optimized during neural network training. Training (or optimizing) a neural network is the process of minimizing this loss: the smaller the loss, the closer the model's predictions are to the true values, and the more robust the model.

Common loss functions are as follows:

(1) 0-1 loss function (0-1 loss function):

The 0-1 loss function is the simplest loss function and is mostly used in classification problems. If the predicted value is not equal to the target value, the prediction is wrong and the loss is 1; if the predicted value equals the target value, the prediction is correct and the loss is 0, meaning there is no loss. Its mathematical formula can be expressed as:

$$L(Y, f(X)) = \begin{cases} 1, & Y \neq f(X) \\ 0, & Y = f(X) \end{cases}$$

Because the 0-1 loss function is too idealized and strict, and its mathematical properties are poor (it is non-convex and non-differentiable, which makes it hard to optimize directly), in practical problems we often use the following loss functions instead.
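As a quick illustration, here is a minimal NumPy sketch of the 0-1 loss (the function name and data are my own, for illustration only):

```python
import numpy as np

def zero_one_loss(y_true, y_pred):
    """0-1 loss: 1 where the prediction differs from the target, 0 where it matches."""
    return np.where(y_true != y_pred, 1, 0)

y_true = np.array([1, 0, 1, 1])
y_pred = np.array([1, 1, 1, 0])
print(zero_one_loss(y_true, y_pred))         # [0 1 0 1]
print(zero_one_loss(y_true, y_pred).mean())  # 0.5, i.e. half the predictions are wrong
```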

(2) Perceptron loss function (perceptron loss):

The perceptron loss function is an improvement of the 0-1 loss function and is not as strict. Under the 0-1 loss, even a predicted value of 0.99 against a true value of 1 is counted as wrong; the perceptron loss instead allows an error interval, and any prediction that falls within that interval is considered correct. Its mathematical formula can be expressed as:

$$L(Y, f(X)) = \begin{cases} 1, & |Y - f(X)| > t \\ 0, & |Y - f(X)| \leq t \end{cases}$$

where t is the width of the allowed error interval.
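A minimal sketch of this interval version of the loss, assuming a tolerance t chosen by hand (names and values here are illustrative):

```python
import numpy as np

def perceptron_interval_loss(y_true, y_pred, t=0.5):
    """Relaxed 0-1 loss: predictions within the error interval t count as correct."""
    return np.where(np.abs(y_true - y_pred) > t, 1, 0)

y_true = np.array([1.0, 1.0, 0.0])
y_pred = np.array([0.99, 0.40, 0.10])
print(perceptron_interval_loss(y_true, y_pred))  # [0 1 0]
```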

(3) Square loss function (quadratic loss function):

As the name suggests, the squared loss function is the square of the difference between the predicted value and the true value: the larger the loss, the larger the gap between the prediction and the truth. The squared loss function is mostly used in linear regression tasks, and its mathematical formula is:

$$L(Y, f(X)) = (Y - f(X))^2$$

Next, we extend to the case where the number of samples is N; the squared loss function then becomes:

$$L(Y, f(X)) = \sum_{i=1}^{N} \left( Y_i - f(X_i) \right)^2$$
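A short NumPy sketch of the N-sample squared loss (function name and sample values are my own):

```python
import numpy as np

def squared_loss(y_true, y_pred):
    """Sum of squared differences between predictions and true values over N samples."""
    return np.sum((y_true - y_pred) ** 2)

y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5, 0.0, 2.0])
print(squared_loss(y_true, y_pred))  # 0.5 = 0.5**2 + 0.5**2 + 0.0**2
```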

(4) Hinge loss function:

The Hinge loss function is usually used in binary classification scenarios; it solves the margin-maximization problem and is the loss used in the famous SVM algorithm. Its mathematical formula is:

$$L(y) = \max(0,\; 1 - t \cdot y)$$

In the above formula, t is the target value, taking values in {-1, +1}, and y is the raw output of the classifier (for an SVM, wx + b) rather than the thresholded class label.

More about Hinge Loss: https://blog.csdn.net/luo123n/article/details/48878759
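A minimal sketch of the hinge loss in NumPy (illustrative names; t holds the ±1 labels and y the raw scores):

```python
import numpy as np

def hinge_loss(t, y):
    """Hinge loss: zero for scores beyond the margin, linear penalty otherwise."""
    return np.maximum(0.0, 1.0 - t * y)

t = np.array([1, 1, -1])        # target labels in {-1, +1}
y = np.array([2.3, 0.4, -0.1])  # raw classifier scores, e.g. w.x + b
print(hinge_loss(t, y))  # [0.  0.6 0.9]
```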

(5) Log loss function (Log Loss):

The logarithmic loss function is also a common loss function and is often used in logistic regression problems. Its standard form is:

$$L(Y, P(Y|X)) = -\log P(Y|X)$$

In the above formula, y is the known classification category and x is the sample. We want the probability P(Y|X) to be as large as possible; that is, we seek the parameter values that maximize the probability of observing the current set of data. Because the probability P(Y|X) takes values in [0, 1], and the logarithm of a value in that interval is less than or equal to zero, a minus sign must be placed in front of the log to ensure that the loss value is non-negative.
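A small sketch of the log loss in a binary logistic-regression setting (names and data are illustrative; p is the model's predicted probability of the positive class):

```python
import numpy as np

def log_loss(y_true, p):
    """Negative log-likelihood of the true labels under predicted probabilities p."""
    return -np.where(y_true == 1, np.log(p), np.log(1.0 - p))

y_true = np.array([1, 0, 1])
p = np.array([0.9, 0.2, 0.6])
print(log_loss(y_true, p))         # [0.105 0.223 0.511] (rounded)
print(log_loss(y_true, p).mean())  # average loss over the samples
```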

(6) Cross-entropy loss function:

The cross-entropy loss function is also essentially a logarithmic loss function and is often used in classification problems. For a single sigmoid output (the binary case), its mathematical formula is:

$$C = -\frac{1}{n} \sum_{x} \left[ y \ln a + (1 - y) \ln (1 - a) \right]$$

Note: in the formula, x represents a sample, y represents the true label, a represents the model's actual output for that sample, and n represents the total number of samples. The cross-entropy loss function is often used when sigmoid is the activation function, because its gradient does not contain the sigmoid derivative factor that makes weight updates under the squared loss too slow when the neuron saturates.
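A minimal NumPy sketch of this binary cross-entropy, using the notation above (function name and data are my own):

```python
import numpy as np

def cross_entropy_loss(y, a):
    """Binary cross-entropy: y are true labels, a are sigmoid outputs in (0, 1)."""
    n = len(y)
    return -np.sum(y * np.log(a) + (1 - y) * np.log(1 - a)) / n

y = np.array([1, 0, 1, 0])
a = np.array([0.8, 0.3, 0.9, 0.1])
print(cross_entropy_loss(y, a))  # ~0.197
```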
