Article Directory
1. Import the necessary modules
2. Generate data
3. Model building
4. Model training
5. Model prediction
6. View the coefficient w and intercept b of the logistic regression model
7. Draw the prediction curve
8. Calculate the evaluation index Accuracy
Text content:
1. Import the necessary modules
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
2. Generate data
2.1 Define the data generation function
def create_data(data_num=100):
    # Two 1-D Gaussian clusters: class 0 centered at 1, class 1 centered at 2
    np.random.seed(21)
    x1 = np.random.normal(1, 0.2, data_num)
    x2 = np.random.normal(2, 0.2, data_num)
    x = np.append(x1, x2)
    y = np.array([0] * data_num + [1] * data_num)
    return x, y
2.2 Generate data
X, y = create_data(1000)
X
array([0.98960715, 0.97776079, 1.20835936, ..., 1.84049108, 2.14936146,
       1.90338769])
y
array([0, 0, 0, ..., 1, 1, 1])
2.3 Divide training set and test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=16)
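As a quick sanity check on the split, the sketch below reproduces the same call on stand-in arrays (hypothetical placeholders with the same length and label layout as the output of create_data(1000)) and prints the resulting sizes. With 2000 samples and test_size=0.3, 600 samples go to the test set and 1400 to the training set.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): same layout as create_data(1000), 2000 samples, two classes
X = np.arange(2000, dtype=float)
y = np.array([0] * 1000 + [1] * 1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=16)

print(X_train.shape, X_test.shape)  # (1400,) (600,)
# Class counts per part; the split is random, so the classes are only roughly balanced
print(np.bincount(y_train), np.bincount(y_test))
```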
2.4 Draw a scatter plot of the training set data
plt.scatter(X_train, y_train, color='blue', s=20)
plt.show()
2.5 Draw a scatter plot of the test set data
plt.scatter(X_test, y_test, color='g', s=20)
plt.show()
3. Model building
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
4. Model training
Logistic regression model training: sklearn.linear_model.LogisticRegression.fit
Parameters used: - X: input features; a np.array input must have shape (n_samples, n_features). - y: the class labels.
X_train = X_train.reshape(-1, 1)
model.fit(X=X_train, y=y_train)
LogisticRegression()
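The reshape matters because fit expects a 2-D array of shape (n_samples, n_features). The sketch below uses a tiny hypothetical data set (six made-up values, not the article's data) to show that a 1-D array is rejected with a ValueError, while the same values reshaped into a single column are accepted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one feature, six samples
x = np.array([0.9, 1.1, 1.0, 1.9, 2.1, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
try:
    model.fit(x, y)              # 1-D input: scikit-learn expects a 2-D array
    fit_1d_failed = False
except ValueError:
    fit_1d_failed = True

X_2d = x.reshape(-1, 1)          # shape (6, 1): six samples, one feature
model.fit(X_2d, y)
print(fit_1d_failed, X_2d.shape)
```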
5. Model prediction
Make predictions on the test set
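Logistic regression model prediction: sklearn.linear_model.LogisticRegression.predict
Parameters used: - X: input features; a np.array input must have shape (n_samples, n_features). Returns: - C: the predicted class labels.
predict_proba takes the same input and returns the class probabilities with shape (n_samples, n_classes); here column 1 holds the probability of class 1.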
X_test = X_test.reshape(-1, 1)
y_test_pred = model.predict(X=X_test)
y_test_pred_proba = model.predict_proba(X=X_test)
Apply a custom threshold to the predicted probability of class 1 to obtain the binary classification result
def thes_func(x):
    # Label a sample 1 only if its class-1 probability exceeds the threshold
    thes = 0.6
    return 1 if x > thes else 0
y_test_pred_thes = list(map(thes_func, y_test_pred_proba[:, 1]))
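The same thresholding can also be written as a single vectorized NumPy comparison. The sketch below uses a small made-up probability array in place of y_test_pred_proba (the values are hypothetical) and checks that both forms give the same labels. Note that the second sample, with probability 0.55, would be labeled 1 at a 0.5 cutoff but is labeled 0 at the 0.6 threshold.

```python
import numpy as np

# Hypothetical probabilities standing in for model.predict_proba(X_test)
proba = np.array([[0.9, 0.1], [0.45, 0.55], [0.3, 0.7], [0.4, 0.6]])

thes = 0.6
# Element-by-element version, as in the tutorial
labels_map = list(map(lambda p: 1 if p > thes else 0, proba[:, 1]))
# Vectorized version: boolean mask converted to 0/1 integers
labels_vec = (proba[:, 1] > thes).astype(int)

print(labels_map)           # [0, 0, 1, 0]
print(labels_vec.tolist())  # [0, 0, 1, 0]
```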
6. View the coefficient w and intercept b of the Logistic regression model
Regression coefficient: sklearn.linear_model.LogisticRegression.coef_
Intercept term: sklearn.linear_model.LogisticRegression.intercept_
w, b = model.coef_[0], model.intercept_
print('Weight={0}bias={1}'.format(w, b))
Weight=[9.53805539]bias=[-14.3705638]
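One way to read these two numbers: the model predicts a probability of exactly 0.5 where w * x + b = 0, that is at x = -b / w. The short calculation below plugs in the values printed above. The result, about 1.507, lies roughly midway between the two cluster centers 1 and 2 used to generate the data.

```python
w = 9.53805539     # coefficient printed above
b = -14.3705638    # intercept printed above

# The predicted probability is 0.5 where the linear score w * x + b is zero
boundary = -b / w
print(round(boundary, 3))  # 1.507
```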
7. Draw the prediction curve
The scipy.special.expit function, also known as the logistic sigmoid function, is defined as: expit(x) = 1/(1 + exp(-x))
Parameters: - x: the input of the sigmoid function, a np.array. Returns: - out: the output of the sigmoid function, a np.array with the same shape as the input x.
from scipy.special import expit
X_train = X_train.reshape(-1)
X_test = X_test.reshape(-1)
sigmoid = expit(np.sort(X_test) * model.coef_[0] + model.intercept_)
plt.plot(np.sort(X_test), sigmoid, color='g')
plt.scatter(X_test, y_test, color='r', label='test dataset')
plt.legend()
plt.show()
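The plotted curve is the model's class-1 probability, computed by hand from the coefficient and intercept. The sketch below checks this on a small synthetic data set (generated here with an assumed seed and sample size, not the article's data): for a binary LogisticRegression with default settings, expit(w * x + b) should match the second column of predict_proba. It also confirms that expit agrees with the explicit formula.

```python
import numpy as np
from scipy.special import expit
from sklearn.linear_model import LogisticRegression

# Synthetic two-cluster data (assumption: seed 0, 50 samples per class)
rng = np.random.default_rng(0)
x = np.append(rng.normal(1, 0.2, 50), rng.normal(2, 0.2, 50))
y = np.array([0] * 50 + [1] * 50)

model = LogisticRegression().fit(x.reshape(-1, 1), y)

# Manual sigmoid of the linear score w * x + b
manual = expit(x * model.coef_[0] + model.intercept_)
# Probability of class 1 as reported by scikit-learn
from_model = model.predict_proba(x.reshape(-1, 1))[:, 1]
print(np.allclose(manual, from_model))

# expit compared with the explicit formula 1 / (1 + exp(-z))
z = np.array([-2.0, 0.0, 2.0])
print(np.allclose(expit(z), 1 / (1 + np.exp(-z))))
```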
8. Calculate the evaluation index Accuracy
Accuracy: sklearn.metrics.accuracy_score
Parameters used: - y_true: ground-truth labels. - y_pred: predicted labels. Returns: - score: the fraction of correctly classified samples.
from sklearn.metrics import accuracy_score
acc = accuracy_score(y_true=y_test, y_pred=y_test_pred)
print('Accuracy:{}'.format(acc))
Accuracy:0.9916666666666667
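Accuracy is simply the fraction of predictions that match the true labels, so the reported 0.9916666666666667 corresponds to 595 correct predictions out of the 600 test samples. The sketch below uses six made-up labels (hypothetical, for illustration only) to show that accuracy_score agrees with a direct NumPy computation.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical labels: 4 of the 6 predictions match the ground truth
y_true = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

acc_sklearn = accuracy_score(y_true=y_true, y_pred=y_pred)
acc_manual = np.mean(y_true == y_pred)   # fraction of matching labels
print(acc_sklearn, acc_manual)
```

The same call could be applied to y_test_pred_thes to see how the 0.6 threshold changes the score compared with the default predictions.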