DataWhale

LOGISTIC REGRESSION

Linear regression & Logistic regression

  • The purpose of linear regression is to predict a continuous variable y from the input x.
  • Logistic regression is mainly used to classify the input x into discrete categories.

The principle of logistic regression

Like linear regression, logistic regression also has a predictive function h_\theta(x) (called the classification function; it can be linear or non-linear).

  • first, compute the predictive function h_\theta(x)

  • second, apply the sigmoid function g(z)=\frac{1}{1+e^{-z}} to map the output into (0, 1)

  • finally, build the loss function and optimize the parameters (see the NumPy sketch after this list)

    loss function

    J(\theta)=\frac{1}{N}\sum_{i=1}^N \mathrm{Cost}(h_\theta(x^{(i)}),y^{(i)})

    where, for logistic regression, \mathrm{Cost}(h_\theta(x),y)=-y\log h_\theta(x)-(1-y)\log(1-h_\theta(x))

    optimization

    1. batch gradient descent
      \theta:=\theta+\frac{\alpha}{N}X^T(y-g(X\theta))
      (repeat until convergence)

    2. stochastic gradient descent
      for i = 1 to N:
      \theta_j:=\theta_j+\alpha(y^{(i)}-h_\theta(x^{(i)}))x_j^{(i)}
      (repeat until convergence)
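
A minimal NumPy sketch of the pipeline above, using batch gradient descent (the function and variable names are illustrative, not from any library):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, alpha=0.1, n_iters=1000):
    # X: (N, d) feature matrix, y: (N,) labels in {0, 1}
    N, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iters):
        h = sigmoid(X @ theta)                # steps 1 + 2: predictive function through sigmoid
        theta += alpha / N * (X.T @ (y - h))  # step 3: batch update from the formula above
    return theta

# usage: the first column of X acts as the intercept term
X = np.array([[1.0, 2.0], [1.0, 0.5], [1.0, -1.0], [1.0, -2.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
theta = train_logistic_regression(X, y)
predictions = (sigmoid(X @ theta) >= 0.5).astype(int)
```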

Regularization

J(\theta)=-\frac{1}{N}\sum_{i=1}^N\left[y^{(i)}\log h_\theta(x^{(i)})+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right]+\frac{\lambda}{2N}\sum_{j=1}^n\theta_j^2

where n is the number of features; the intercept \theta_0 is conventionally left out of the penalty term.
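
A short NumPy sketch of this regularized loss (the function name and the choice to skip the intercept are illustrative conventions, not from the original post); note that scikit-learn specifies the strength as C, the inverse of λ:

```python
import numpy as np

def regularized_loss(theta, X, y, lam):
    # cross-entropy term of the formula above
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    N = len(y)
    cross_entropy = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    # L2 penalty; theta[0] (the intercept) is left unregularized
    penalty = lam / (2 * N) * np.sum(theta[1:] ** 2)
    return cross_entropy + penalty
```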

Model evaluation index

  • receiver operating characteristic curve (ROC)
    horizontal axis: false positive rate (FPR)
    vertical axis: true positive rate (TPR)
    every point on the curve corresponds to a classification threshold

  • area under the curve (AUC)
    AUC is the area under the ROC curve; the closer it is to 1, the better the classifier ranks positive samples above negative ones
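
Both metrics are available in scikit-learn; a short sketch with toy data:

```python
from sklearn.metrics import roc_curve, roc_auc_score

# toy labels and predicted positive-class probabilities
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one (FPR, TPR) point per threshold
auc = roc_auc_score(y_true, y_score)
print(auc)  # 0.75 for this toy data
```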


Advantages

  1. easy to compute, understand and implement
  2. efficient in time and memory
  3. robust to small amounts of noise in the data

Disadvantages

  1. prone to underfitting
  2. classification accuracy is often not high
  3. does not work well with a large feature space

Sample imbalance issue

When classes are imbalanced, the logistic regression model tends to ignore the features of the minority class.

  • solutions
  1. oversampling, undersampling and combined sampling
  2. weight adjustment (e.g. via class weights, as sketched below)
  3. kernel function correction
  4. model correction
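
For solution 2, scikit-learn's class_weight parameter reweights the loss so that errors on the minority class cost more; a sketch on illustrative imbalanced data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# illustrative imbalanced data: 90 negatives, 10 positives
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (90, 2)), rng.normal(1.5, 1.0, (10, 2))])
y = np.array([0] * 90 + [1] * 10)

# 'balanced' weights each class by N / (n_classes * class_count)
clf = LogisticRegression(class_weight='balanced').fit(X, y)
```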

sklearn parameters

sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='liblinear', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1)
  1. regularization parameter: penalty ('l1' or 'l2')
  2. optimization: solver ('liblinear', 'lbfgs', 'newton-cg', 'sag')
  3. classification: multi_class ('ovr' or 'multinomial')
  4. class weight: class_weight
  5. sample weight: sample_weight (passed to fit(), not to the constructor)
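
A usage sketch tying these parameters together (the data is illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [3.0, 0.5]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression(
    penalty='l2',        # regularization type; C is the inverse strength, roughly C = 1/lambda
    C=1.0,
    solver='liblinear',  # liblinear supports both 'l1' and 'l2' penalties
    class_weight=None,
)
# sample_weight is an argument of fit(), not of the constructor
clf.fit(X, y, sample_weight=np.ones(len(y)))
print(clf.predict([[1.5, 0.5]]))
```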

