Machine learning logistic regression

Basic definition

Binomial logistic regression, usually referred to simply as logistic regression (also known as log-odds regression), is a binary classification model. It uses probability to study the relationship between categories and features, and is a kind of nonlinear regression.

Introduction of the log-odds function

The output label of a binary classification task is $y \in \{0, 1\}$, while the predicted value of linear regression, $z = w^Tx + b$, is a real value.

We want to convert the real value z into a 0/1 output. The ideal choice would be the unit step function, but because it is not continuous it is not differentiable, so we look for a surrogate function that approximates it while being monotonic and differentiable. This leads to the log-odds (logistic) function:

$y = \frac{1}{1 + e^{-z}}$
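As a minimal sketch, the logistic function above can be written in plain Python (the name `sigmoid` is a common convention, not from the original post):

```python
import math

def sigmoid(z):
    """Logistic (log-odds) function: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0.0))  # 0.5, the midpoint of the curve
```

Large positive z pushes the output toward 1, large negative z toward 0, which is why it serves as a smooth stand-in for the step function.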

If the predicted value z is greater than zero, the sample is classified as positive; if it is less than zero, as negative; at the critical value of exactly zero, it may be assigned to either class.
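The threshold rule can be sketched as follows (the helper names `sigmoid` and `classify` are illustrative; ties at z = 0 are arbitrarily assigned to the negative class here, one of the two allowed choices):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classify(z):
    """Positive (1) when z > 0, equivalently sigmoid(z) > 0.5; else negative (0)."""
    return 1 if sigmoid(z) > 0.5 else 0
```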

Derivation of the log-odds function

Let y be the probability that sample x belongs to the positive class; then 1 − y is the probability that it belongs to the negative class.

Their ratio $\frac{y}{1-y}$ is called the odds, expressing the relative likelihood of the positive class.

Taking the logarithm gives the log-odds: $\ln\frac{y}{1-y}$.

Setting $z = \ln\frac{y}{1-y}$ and exponentiating gives $e^z = \frac{y}{1-y}$, from which $y = \frac{1}{1+e^{-z}}$.
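The inverse relationship can be checked numerically; this is a sketch in plain Python (the name `log_odds` is illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(y):
    """ln(y / (1 - y)): the inverse of the sigmoid for y in (0, 1)."""
    return math.log(y / (1.0 - y))

# Round trip: applying sigmoid to the log-odds recovers the probability.
for y in (0.1, 0.5, 0.9):
    assert abs(sigmoid(log_odds(y)) - y) < 1e-12
```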

The significance of the log-odds function: its output y is the predicted probability that sample x belongs to the positive class.

Logistic regression model

Substituting $z = w^Tx + b$ into the log-odds function yields the logistic regression model:

$h = \frac{1}{1+e^{-(w^Tx+b)}}$
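The model can be sketched as a single function computing $w^Tx + b$ and passing it through the sigmoid (the name `predict_proba` is borrowed from common convention, not from the original post):

```python
import math

def predict_proba(w, b, x):
    """h = 1 / (1 + exp(-(w^T x + b))): estimated probability that x is positive."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))
```

For example, with zero weights and bias the model is maximally uncertain and returns 0.5 for any input.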

Estimation of model parameters

Cost function (cross-entropy):

$J = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \ln h(x_i) + (1-y_i)\ln\left(1-h(x_i)\right)\right]$
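A direct translation of this cost into Python might look like the following sketch (the name `cross_entropy_cost` is illustrative; `h_vals` are the model outputs h(x_i), assumed strictly inside (0, 1)):

```python
import math

def cross_entropy_cost(h_vals, y_vals):
    """J = -(1/n) * sum_i [ y_i ln h(x_i) + (1 - y_i) ln(1 - h(x_i)) ]."""
    n = len(y_vals)
    return -sum(y * math.log(h) + (1.0 - y) * math.log(1.0 - h)
                for h, y in zip(h_vals, y_vals)) / n
```

Confident correct predictions (h near the true label) give a cost near zero, while uninformative predictions (h = 0.5 everywhere) give a cost of ln 2.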

The parameters w and b are obtained by minimizing this cost function.

Solution method: mini-batch gradient descent.
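A minimal mini-batch gradient descent sketch, assuming the standard gradient of the cross-entropy cost, (h − y)·x for w and (h − y) for b, averaged over each batch (the function name `minibatch_gd` and all hyperparameter defaults are illustrative choices, not from the original post):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def minibatch_gd(X, Y, lr=0.1, batch_size=2, epochs=500, seed=0):
    """Fit w, b by mini-batch gradient descent on the cross-entropy cost."""
    rng = random.Random(seed)
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)                       # new random batches each epoch
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            gw, gb = [0.0] * d, 0.0
            for i in batch:
                h = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
                err = h - Y[i]                 # gradient factor (h - y)
                for j in range(d):
                    gw[j] += err * X[i][j]
                gb += err
            m = len(batch)
            for j in range(d):                 # average over the batch, then step
                w[j] -= lr * gw[j] / m
            b -= lr * gb / m
    return w, b
```

On a small linearly separable toy set, the learned parameters should place the decision boundary between the two classes.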


Origin blog.csdn.net/weixin_43772166/article/details/109578296