Logistic regression and linear regression analysis

1. Linear regression

2. Logistic regression

Principle

A logistic regression model is used when the dependent variable y takes only the two values 0 and 1. Assume that, under the action of the p independent variables x1, x2, ..., xp, the probability that y takes the value 1 is p = P(y = 1 | X), so the probability that it takes the value 0 is 1 - p. The ratio of the probability of 1 to the probability of 0 is p / (1 - p), known as the odds of the event. Taking the natural logarithm of the odds gives the Logistic transformation: Logit(p) = ln(p / (1 - p)).

When p varies over (0, 1), the odds range from 0 to positive infinity, and ln(p / (1 - p)) ranges from negative infinity to positive infinity.
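The ranges above can be checked numerically. The following is a minimal sketch in plain Python; the function names odds, logit and inverse_logit are chosen here for illustration and do not come from the original post.

```python
import math

def odds(p):
    """Odds of an event with probability p, for 0 < p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Natural logarithm of the odds (the Logistic transformation)."""
    return math.log(odds(p))

def inverse_logit(z):
    """Map any real number z back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# As p moves from near 0 to near 1, the odds grow from near 0 towards
# infinity and the logit moves from large negative to large positive.
for p in (0.01, 0.5, 0.99):
    print(p, odds(p), logit(p), inverse_logit(logit(p)))
```

Because the logit can take any real value, it can be modeled as a linear combination of x1, x2, ..., xp, and inverse_logit converts that linear score back into a probability.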

Modeling steps

  • Based on the purpose of the mining task, set up the variables and screen the features: y; x1, x2, ..., xp
  • Write out the regression equation
  • Estimate the regression coefficients
  • Check the model (accuracy, confusion matrix, ROC curve, KS value)
  • Use the model for prediction and control
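The estimation and model-checking steps above could be sketched roughly as follows. This is only an illustration: it assumes synthetic data from scikit-learn's make_classification in place of a real data set, and the split ratio and variable names are arbitrary choices. The KS value is computed here as the largest gap between the true positive rate and the false positive rate along the ROC curve.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption): 8 features and a binary target
X, y = make_classification(n_samples=500, n_features=8, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Estimate the regression coefficients
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("coefficients:", model.coef_[0])
print("intercept:", model.intercept_[0])

# Model checking on held-out data
pred = model.predict(X_test)               # predicted class labels (0 or 1)
prob = model.predict_proba(X_test)[:, 1]   # predicted probability of y = 1
acc = accuracy_score(y_test, pred)
cm = confusion_matrix(y_test, pred)
auc = roc_auc_score(y_test, prob)
fpr, tpr, _ = roc_curve(y_test, prob)
ks = float(np.max(tpr - fpr))              # KS value: largest TPR - FPR gap

print("accuracy:", acc)
print("confusion matrix:\n", cm)
print("ROC AUC:", auc)
print("KS value:", ks)
```

Evaluating on a held-out split, rather than on the training data, gives a less optimistic view of how the model would behave when used for prediction.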

Code

import pandas as pd
# Note: RandomizedLogisticRegression was removed in scikit-learn 0.21,
# so this script requires an older scikit-learn release.
from sklearn.linear_model import LogisticRegression as LR
from sklearn.linear_model import RandomizedLogisticRegression as RLR

# Initialize parameters
filename = '../data/bankloan.xls'
data = pd.read_excel(filename)
x = data.iloc[:, :8].values  # the first 8 columns are the features
y = data.iloc[:, 8].values   # the 9th column is the target

rlr = RLR()  # build a randomized logistic regression model to screen the variables
rlr.fit(x, y)  # train the model
support = rlr.get_support()  # boolean mask of the selected features; per-feature scores are in the .scores_ attribute
selected = data.columns[:8][support]
print('Effective features: %s' % ', '.join(selected))
x = data[selected].values  # keep only the selected features

lr = LR()  # build the logistic regression model
lr.fit(x, y)  # train the model on the selected features
print('Average model accuracy: %s' % lr.score(x, y))  # mean accuracy on the training data

The main idea of recursive feature elimination is to build a model repeatedly, pick out the best (or worst) feature according to that model, set the picked feature aside, and repeat the process on the remaining features until all features have been traversed. The order in which the features are eliminated is the feature ranking.
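Recursive feature elimination is available in scikit-learn as sklearn.feature_selection.RFE, which can be used for feature screening in releases where RandomizedLogisticRegression no longer exists. The following is a minimal sketch, assuming synthetic data instead of bankloan.xls and an arbitrary choice of keeping 4 features; neither assumption comes from the original post.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption): 8 features and a binary target
X, y = make_classification(n_samples=500, n_features=8, n_informative=4, random_state=0)

# Repeatedly fit the model and drop the weakest feature until 4 remain
selector = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=4)
selector.fit(X, y)

print("selected mask:", selector.support_)          # True for the kept features
print("ranking (1 = kept):", selector.ranking_)     # larger rank = eliminated earlier

# Train the final logistic regression model on the kept features only
X_selected = selector.transform(X)
model = LogisticRegression(max_iter=1000).fit(X_selected, y)
print("training accuracy:", model.score(X_selected, y))
```

The ranking_ attribute corresponds to the elimination order described above: kept features have rank 1, and the feature dropped first has the largest rank.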

Origin www.cnblogs.com/Iceredtea/p/12052323.html