Logistic regression and the derivation of its loss function

Derivation of the sigmoid function:

https://blog.csdn.net/u012421852/article/details/79614417
From the shape of the logistic curve we can observe that:

1) The sigmoid function is a threshold function: whatever the value of x, the corresponding sigmoid value always satisfies 0 < sigmoid(x) < 1.

2) The sigmoid function is strictly monotonically increasing, and its inverse function is also strictly monotonically increasing.

3) The sigmoid function is continuous.

4) The sigmoid function is smooth.

5) The sigmoid function is symmetric about the point (0, 0.5).

6) The derivative of the sigmoid function can be expressed in terms of the function value itself, i.e. f'(x) = F(f(x)); concretely, f'(x) = f(x) · (1 - f(x)).

The sigmoid function actually originated in the study of biological phenomena, where it describes the **S-shaped growth curve**. In information science, because the sigmoid function and its inverse are both strictly monotonically increasing, the sigmoid is often used as the threshold function in neural networks, mapping a variable into the interval (0, 1).
The derivation process:

    f(x) = 1 / (1 + e^(-x))

    f'(x) = e^(-x) / (1 + e^(-x))^2
          = [1 / (1 + e^(-x))] · [e^(-x) / (1 + e^(-x))]
          = f(x) · (1 - f(x))

Loss function:

Common loss functions:
https://blog.csdn.net/bitcarmanlee/article/details/51165444

In the derivation of logistic regression, we assume the samples follow a Bernoulli (0-1) distribution, write down the likelihood function for that distribution, and then find the parameter values that maximize it. **The whole idea is maximum likelihood: maximizing the likelihood function is equivalent to minimizing the error function.**
Taking the logarithm is purely a mathematical convenience in the process of finding the MLE (Maximum Likelihood Estimate).
With p_i = sigmoid(θᵀ·x_i) denoting the predicted probability that sample i belongs to class 1, the likelihood and log-likelihood are:

    L(θ) = ∏_i p_i^(y_i) · (1 - p_i)^(1 - y_i)
    log L(θ) = ∑_i [ y_i · log(p_i) + (1 - y_i) · log(1 - p_i) ]

Maximizing log L(θ) is equivalent to minimizing the loss J(θ) = -(1/m) · log L(θ).
Logistic regression therefore uses the logarithmic loss function; the smaller the loss, the better the model.
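A minimal sketch of the log loss in plain Python; the helper name `log_loss` and the `eps` clipping are illustrative choices, not part of the original derivation:

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Average negative log-likelihood for binary labels.

    y_true: list of 0/1 labels; p_pred: predicted probabilities.
    eps clips probabilities away from 0 and 1 to avoid log(0).
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Better-calibrated predictions give a smaller loss
confident = log_loss([1, 0, 1], [0.9, 0.1, 0.8])
poor = log_loss([1, 0, 1], [0.6, 0.5, 0.4])
assert confident < poor
```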

The loss is minimized using gradient descent. Differentiating J(θ) gives the update rule:

    θ_j := θ_j - α · (1/m) · ∑_i (p_i - y_i) · x_ij

where α is the learning rate.
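Putting the pieces together, here is a sketch of solving logistic regression by batch gradient descent in plain Python; the function names, hyperparameters, and the toy 1-D dataset are invented for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the average log loss.

    X: list of feature vectors (a constant 1.0 is appended as the bias),
    y: list of 0/1 labels. Returns the weight vector theta.
    """
    X = [row + [1.0] for row in X]          # append bias feature
    n, d = len(X), len(X[0])
    theta = [0.0] * d
    for _ in range(epochs):
        # gradient of the average log loss: (1/n) * sum_i (p_i - y_i) * x_i
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            p = sigmoid(sum(t * x for t, x in zip(theta, xi)))
            for j in range(d):
                grad[j] += (p - yi) * xi[j]
        theta = [t - lr * g / n for t, g in zip(theta, grad)]
    return theta

# Toy 1-D data: the label is 1 exactly when the feature is positive
X = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
y = [0, 0, 0, 1, 1, 1]
theta = fit_logistic(X, y)
p_pos = sigmoid(theta[0] * 1.5 + theta[1])   # should be near 1
p_neg = sigmoid(theta[0] * -1.5 + theta[1])  # should be near 0
assert p_pos > 0.8 and p_neg < 0.2
```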

Why use the cross-entropy loss function?

Intuitively, a better model should simply make more accurate predictions.
But here we do not use least squares (squared error) as the loss function. Why not?
Because linear regression is a regression problem, while logistic regression, despite the word "regression" in its name, actually performs classification. Categorical variables are unlike continuous variables: the value of a continuous variable can directly measure how far a prediction is from the truth, but a categorical value merely labels a class, and the numeric value itself is meaningless. For example, classes represented by {0, 1} could just as well be represented by {1, 2}.
As we have learned, for classification problems the gap between the predicted and true classes should instead be measured from the angle of probability distributions, which is what cross-entropy does.
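There is also a complementary gradient-based argument, added here as an illustration rather than taken from the original text: when paired with a sigmoid output, the squared-error gradient with respect to the raw score nearly vanishes for a confidently wrong prediction, while the cross-entropy gradient stays large:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# A confidently wrong prediction: true label 1, but the raw score z is very negative
y, z = 1.0, -6.0
p = sigmoid(z)                      # roughly 0.0025

# Gradient of each loss with respect to the raw score z (by the chain rule):
#   squared error (p - y)^2  ->  2 * (p - y) * p * (1 - p)
#   cross-entropy -log(p)    ->  p - y
grad_mse = 2 * (p - y) * p * (1 - p)
grad_ce = p - y

# The sigmoid derivative p*(1-p) is nearly zero here, crushing the MSE
# gradient, so learning stalls exactly when the model is most wrong;
# the cross-entropy gradient stays close to -1.
assert abs(grad_mse) < 0.01
assert abs(grad_ce) > 0.99
```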

https://blog.csdn.net/huwenxing0801/article/details/82791879

Origin: blog.csdn.net/wj1298250240/article/details/104039909