Implementing gradient descent for logistic regression

Reference: the Jianshu article "Logistic regression and Python code implementation".

The partial derivative of the logistic regression loss function with respect to $\theta_j$ is $\frac{1}{m}\sum_{i=1}^{m}(h_\theta(x_i)-y_i)x_i^j$, so the update of $\theta$ can be written as $\theta_j = \theta_j - \alpha\frac{1}{m}\sum_{i=1}^{m}(h_\theta(x_i)-y_i)x_i^j$. The derivation uses maximum likelihood estimation; see the original article for the full process.
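A minimal vectorized sketch of this update rule (assuming NumPy, a design matrix `X` of shape `(m, n)` whose rows are the samples $x_i$, a label vector `y`, and a learning rate `alpha`; the names are illustrative, not taken from the referenced article):

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha):
    """One update: theta_j <- theta_j - alpha/m * sum_i (h_theta(x_i) - y_i) * x_i^j, for all j at once."""
    m = X.shape[0]
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))    # h_theta(x_i) = g(theta^T x_i) for every sample
    return theta - (alpha / m) * (X.T @ (h - y))
```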

The derivation is as follows:

1) The logistic function is as follows

$$g(z)=\frac{1}{1+e^{-z}}$$

where

$$z=\theta_0 x_0+\theta_1 x_1+\cdots+\theta_n x_n$$

In vector form:

$$Z={\Theta}^{T}X$$
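As an illustration, a small sketch of the logistic function and the vectorized computation of $z$ (NumPy assumed; the parameter and sample values are made up for the example):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z)); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# z = theta_0*x_0 + theta_1*x_1 + ... + theta_n*x_n for each row of X
theta = np.array([0.5, -1.2, 2.0])        # example parameters (x_0 = 1 is the bias term)
X = np.array([[1.0, 0.3, 0.7],
              [1.0, 1.5, -0.2]])          # two samples, each with a leading 1 for x_0
z = X @ theta                             # vector form Z = Theta^T X, one z per sample
print(sigmoid(z))                         # values strictly between 0 and 1
```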

2) Constructing the prediction function

The prediction function is

$$h_\theta(x)=g(\theta^Tx)=\frac{1}{1+e^{-\theta^Tx}}$$

The value of the function $h_\theta(x)$ has a special meaning: it represents the probability that the result is 1. Therefore, for an input $x$, the probabilities that the classification result is class 1 and class 0 are, respectively:

$$P(y=1|x;\theta)=h_\theta(x)\\ P(y=0|x;\theta)=1-h_\theta(x)$$

The two equations above can be written together as
$$P(y|x;\theta)=(h_\theta(x))^y(1-h_\theta(x))^{1-y}$$
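A quick sketch checking that the combined expression reduces to the two cases above (hypothetical values; `sigmoid` is the same helper as in the earlier snippets):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def prob_y_given_x(y, x, theta):
    """P(y | x; theta) = h^y * (1 - h)^(1 - y), where h = h_theta(x)."""
    h = sigmoid(x @ theta)
    return (h ** y) * ((1.0 - h) ** (1 - y))

x = np.array([1.0, 2.0, -0.5])      # one sample (leading 1 for the bias term)
theta = np.array([0.1, 0.8, 0.3])   # example parameters
print(prob_y_given_x(1, x, theta))  # equals h_theta(x)
print(prob_y_given_x(0, x, theta))  # equals 1 - h_theta(x)
```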

3) Taking the likelihood function

$$L(\theta)=\prod_{i=1}^mP(y_i|x_i;\theta)=\prod_{i=1}^m(h_\theta(x_i))^{y_i}(1-h_\theta(x_i))^{1-y_i}$$

This is the probability of the observed set of values; taking the logarithm gives

$$l(\theta)=\log L(\theta)=\sum_{i=1}^m\left(y_i\log h_\theta(x_i)+(1-y_i)\log (1-h_\theta(x_i))\right)$$

Maximum likelihood estimation seeks the value of $\theta$ that maximizes $l(\theta)$. Gradient descent instead minimizes $J(\theta)=-\frac{1}{m}l(\theta)$, whose partial derivative is exactly the expression given at the top, which yields the update rule for $\theta_j$.
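Putting the pieces together, here is a minimal sketch of the full procedure: the log-likelihood $l(\theta)$ and a gradient-descent loop using the update rule from the top of the post. The synthetic data and all names are illustrative assumptions, not the referenced article's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(theta, X, y):
    """l(theta) = sum_i [ y_i*log h_theta(x_i) + (1 - y_i)*log(1 - h_theta(x_i)) ]."""
    h = sigmoid(X @ theta)
    eps = 1e-12                                # avoid log(0)
    return np.sum(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))

def fit(X, y, alpha=0.1, n_iters=1000):
    """Gradient descent on the average negative log-likelihood, i.e. the update rule above."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        h = sigmoid(X @ theta)
        theta -= (alpha / m) * (X.T @ (h - y))
    return theta

# Synthetic example: two Gaussian blobs, with a leading column of ones for x_0.
rng = np.random.default_rng(0)
X_raw = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.r_[np.zeros(50), np.ones(50)]
X = np.hstack([np.ones((100, 1)), X_raw])

theta = fit(X, y)
print("theta:", theta)
print("l(theta):", log_likelihood(theta, X, y))  # larger than at theta = 0 if training worked
```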

Source: www.cnblogs.com/guesswhy/p/11285753.html