Topic study 7 (Logistic Regression)

  • Classification algorithms: k-nearest neighbors, decision trees, naive Bayes, logistic regression, support vector machines, the AdaBoost algorithm.
  • How each is used
    • k-nearest neighbors: classification is achieved by computing distances to known samples
    • Decision trees: build an intuitive classification tree
    • Naive Bayes: use probability theory to build a classifier
    • Logistic regression: mainly classifies raw data correctly by finding the optimal parameters
  • Logistic Regression (LR): although the name contains the word "regression", it is well suited to classification problems. LR classifiers fit a broad range of classification tasks, for example: sentiment analysis of review data (positive/negative), user click-through rate prediction, predicting user defaults, disease prediction, spam detection, and user-level classification (binary).
  • Logistic regression and linear regression both essentially produce a straight line. The difference is that the linear regression line tries to fit the distribution of the input x, so that every sample point in the training set is as close to the line as possible, whereas the logistic regression line fits a decision boundary, so that the sample points of the training set are separated as far apart as possible. The two serve different purposes.
  • For binary classification: the unit step function (Heaviside step function) could be used, but the sigmoid function is easier to work with.
    • sigmoid function equation

      $ f(x) = \frac{1}{1 + e^{-x}} $
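
A minimal Python sketch of the sigmoid (the function name and test values are my own, not from the post):

```python
import numpy as np

def sigmoid(z):
    """Squash any real-valued input into (0, 1): f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))                          # 0.5 at the midpoint
print(sigmoid(np.array([-5.0, 0.0, 5.0])))   # values near 0, exactly 0.5, near 1
```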

  • Logistic regression: combining the linear term with the sigmoid function gives the logistic regression equation:

    $ y = \frac{1}{1 + e^{-(\omega x + b)}} $

    y takes values in (0, 1).
    Rearranging and taking logarithms gives

    $ \log \frac{y}{1-y} = \omega x + b $

    The left-hand side is the log odds (logit).
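
A small illustration (the weights ω, bias b, and input x below are made-up numbers) showing that the sigmoid of the linear term gives a probability strictly between 0 and 1, and that the log odds of that probability recovers ωx + b:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and a single two-feature input
omega = np.array([0.8, -0.3])
b = 0.1
x = np.array([1.5, 2.0])

z = omega @ x + b      # linear part: omega·x + b
y = sigmoid(z)         # predicted probability in (0, 1)

# log(y / (1 - y)) recovers the linear part, i.e. the log odds
print(y, np.log(y / (1 - y)), z)
```
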
  • Binomial Logistic Regression

    $ P(Y=0 \mid x) = \frac{1}{1 + e^{\omega x}} $

    $ P(Y=1 \mid x) = \frac{e^{\omega x}}{1 + e^{\omega x}} $
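
A quick consistency check (my own derivation, using the sigmoid f defined earlier): dividing the numerator and denominator of P(Y=1|x) by e^{ωx} recovers the sigmoid form, and the two class probabilities sum to 1:

    $ P(Y=1 \mid x) = \frac{e^{\omega x}}{1 + e^{\omega x}} = \frac{1}{1 + e^{-\omega x}} = f(\omega x), \qquad P(Y=0 \mid x) + P(Y=1 \mid x) = \frac{1 + e^{\omega x}}{1 + e^{\omega x}} = 1 $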

  • Multinomial logistic regression

    $ P(Y=k \mid x) = \frac{e^{\omega_{k} x}}{1 + \sum_{k=1}^{K-1} e^{\omega_{k} x}}, \quad k = 1, \ldots, K-1 $

    $ P(Y=K \mid x) = \frac{1}{1 + \sum_{k=1}^{K-1} e^{\omega_{k} x}} $
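
A hedged Python sketch of the two multinomial formulas above (the weight matrix and input vector are invented for illustration; class K acts as the reference class with an implicit zero weight vector):

```python
import numpy as np

def multinomial_lr_probs(omegas, x):
    """Class probabilities for multinomial logistic regression with K classes.

    omegas has shape (K-1, d): one weight vector per class 1..K-1;
    class K is the reference class and has no explicit weights.
    """
    scores = np.exp(omegas @ x)          # e^{omega_k x} for k = 1..K-1
    denom = 1.0 + scores.sum()           # 1 + sum of the exponentials
    return np.append(scores / denom,     # P(Y=k|x) for k = 1..K-1
                     1.0 / denom)        # P(Y=K|x), the reference class

# Hypothetical weights for K = 3 classes and a 2-dimensional input
omegas = np.array([[0.5, -1.0],
                   [1.2,  0.3]])
x = np.array([1.0, 2.0])
p = multinomial_lr_probs(omegas, x)
print(p, p.sum())   # probabilities for classes 1..K, summing to 1
```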

  • Differences between LR and linear regression
    • Logistic regression and linear regression are two different types of model: logistic regression is a classification model, while linear regression is a regression model.
  • LR loss function: the loss function measures the gap between a model's predicted values and the true values; the smaller the loss, the better the model, with a minimum loss of 0. For a single sample with predicted probability p(x):

    $ -\log(p(x)), \quad y = 1 $

    $ -\log(1 - p(x)), \quad y = 0 $

  • Combining the two cases into a single loss function:

    $ -[y \log(p(x)) + (1-y) \log(1 - p(x))] $

    y is the label, taking the value 0 or 1. Summing over m samples, the total loss function is:

    $ J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} [y_{i} \log(p(x_{i})) + (1 - y_{i}) \log(1 - p(x_{i}))] $

    In this formula, m is the number of samples, y is the label (taking the value 0 or 1), i denotes the i-th sample, and p(x) is the predicted output.
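
A short Python version of J(θ) above (the helper name, the eps clipping, and the toy labels and predictions are my own additions):

```python
import numpy as np

def logistic_loss(y_true, p_pred, eps=1e-12):
    """Average cross-entropy loss over m samples.

    y_true holds labels in {0, 1}; p_pred holds predicted probabilities p(x_i).
    eps guards against taking log(0).
    """
    p = np.clip(p_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Toy example with made-up labels and predictions
y = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.2, 0.7, 0.6])
print(logistic_loss(y, p))   # small value because predictions match the labels
```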

  • When the loss is very small, the model fits most of the data, but at that point it is prone to over-fitting. Regularization is introduced to prevent over-fitting (a sketch follows below).
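
For illustration only, one way regularization might be added is an L2 penalty on the weights (the penalty form and the lambda value are assumptions, not from the post; an L1 penalty is another common choice):

```python
import numpy as np

def regularized_loss(y_true, p_pred, omega, lam=0.1, eps=1e-12):
    """Cross-entropy loss plus an L2 penalty lam * ||omega||^2.

    The penalty discourages large weights, which helps curb over-fitting;
    lam is a hypothetical regularization strength that would be tuned.
    """
    p = np.clip(p_pred, eps, 1 - eps)
    data_loss = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
    return data_loss + lam * np.sum(omega ** 2)
```
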
  • Gradient descent: to minimize the loss function, gradient descent can be applied iteratively to find the model parameter values at which the loss is smallest.
  • Types of gradient descent (contrasted in the sketch after this list)
    • Batch gradient descent (BGD)
    • Stochastic gradient descent (SGD)
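
A rough NumPy sketch (synthetic data, made-up learning rates and epoch counts) contrasting the two variants: BGD computes the gradient over all m samples before each update, while SGD updates after every individual sample. For the loss above, the gradient with respect to the weights is X^T (p - y) / m.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def batch_gradient_descent(X, y, lr=0.1, epochs=1000):
    """BGD: every update uses the gradient averaged over all m samples."""
    m, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        p = sigmoid(X @ w)            # predictions for all samples
        grad = X.T @ (p - y) / m      # gradient of the average loss
        w -= lr * grad
    return w

def stochastic_gradient_descent(X, y, lr=0.1, epochs=100):
    """SGD: each update uses the gradient of a single (shuffled) sample."""
    m, d = X.shape
    w = np.zeros(d)
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        for i in rng.permutation(m):
            p_i = sigmoid(X[i] @ w)
            w -= lr * (p_i - y[i]) * X[i]
    return w

# Synthetic data: two features plus a constant column acting as the bias term
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(200, 2)), np.ones((200, 1))])
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
print(batch_gradient_descent(X, y))
print(stochastic_gradient_descent(X, y))
```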

Origin www.cnblogs.com/zaw-315/p/11257719.html