The sklearn implementation of logistic regression

Article Directory

1. Import the necessary modules

2. Generate data

3. Model building

4. Model training

5. Model prediction

6. View the coefficient w and intercept b of the logistic regression model

7. Draw the prediction curve

8. Calculate the evaluation index accuracy

Text content:

1. Import the necessary modules

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

2. Generate data

2.1 Define the data generation function

def create_data(data_num=100):
    np.random.seed(21)
    x1=np.random.normal(1,0.2,data_num)
    x2=np.random.normal(2,0.2,data_num)
    x=np.append(x1,x2)
    y=np.array([0]*data_num+[1]*data_num)
    return x,y
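As a quick sanity check on the generator, the sketch below repeats the function and inspects what it returns. The names `x_demo` and `y_demo` are illustrative and not part of the original article.

```python
import numpy as np

def create_data(data_num=100):
    # Same generator as in the article: two 1-D Gaussian clusters, one per class.
    np.random.seed(21)
    x1 = np.random.normal(1, 0.2, data_num)  # class 0, centred at 1
    x2 = np.random.normal(2, 0.2, data_num)  # class 1, centred at 2
    x = np.append(x1, x2)
    y = np.array([0] * data_num + [1] * data_num)
    return x, y

x_demo, y_demo = create_data(1000)
print(x_demo.shape, y_demo.shape)   # each array holds 2 * data_num values
print(np.bincount(y_demo))          # number of samples per class
mean0 = x_demo[y_demo == 0].mean()  # should lie close to 1
mean1 = x_demo[y_demo == 1].mean()  # should lie close to 2
print(mean0, mean1)
```

Because the two clusters are one unit apart with a standard deviation of 0.2, they overlap only slightly, which is why a single feature is enough for this example.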

2.2 Generate data

X,y=create_data(1000)

X  # inspect the data in X
array([0.98960715, 0.97776079, 1.20835936, ..., 1.84049108, 2.14936146,
       1.90338769])
y  # inspect the data in y
array([0, 0, 0, ..., 1, 1, 1])

2.3 Divide training set and test set

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(
    X,y,test_size=0.3,random_state=16)
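To make the effect of `test_size=0.3` concrete, here is a small sketch on placeholder arrays of the same length as the generated data. The `stratify` argument in the second call is an optional addition, not something the original article uses; it keeps the class proportions identical in both subsets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same size and class layout as create_data(1000).
X_demo = np.arange(2000, dtype=float)
y_demo = np.array([0] * 1000 + [1] * 1000)

# Same call as in the article: 30% of the samples go to the test set.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=16)
print(len(X_tr), len(X_te))

# Optional variant (an assumption, not in the article): a stratified split.
_, _, y_tr_s, y_te_s = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=16, stratify=y_demo)
print(np.bincount(y_tr_s), np.bincount(y_te_s))
```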

2.4 Draw a scatter plot of the training set data

plt.scatter(X_train,y_train,color='blue',s=20)
plt.show()

Training set scatter plot

2.5 Draw a scatter plot of the test set data

plt.scatter(X_test,y_test,color='g',s=20)
plt.show()

Scatter plot of test set data

3. Model building

from sklearn.linear_model import LogisticRegression
model=LogisticRegression()

4. Model training

Logistic regression model training: sklearn.linear_model.LogisticRegression.fit
Parameters used:
- X: input features. If passed as an np.array, its shape must be (n_samples, n_features).
- y: target labels, one per sample.

X_train=X_train.reshape(-1,1)
model.fit(X=X_train,y=y_train)
LogisticRegression()  # output of running the two lines above
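The reshape matters because scikit-learn estimators expect a 2-D feature matrix. The following sketch, on a tiny made-up dataset (`x_1d`, `labels`, and `demo_model` are illustrative names), shows that fitting on a 1-D array is rejected while the reshaped array is accepted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny illustrative dataset: one feature, six samples.
x_1d = np.array([0.9, 1.1, 1.0, 1.9, 2.1, 2.0])
labels = np.array([0, 0, 0, 1, 1, 1])

demo_model = LogisticRegression()
try:
    demo_model.fit(x_1d, labels)  # 1-D input: scikit-learn raises ValueError
    raised = False
except ValueError:
    raised = True

x_2d = x_1d.reshape(-1, 1)        # shape (6,) becomes (6, 1)
demo_model.fit(x_2d, labels)      # accepted: (n_samples, n_features)
print(x_1d.shape, x_2d.shape, raised)
```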

5. Model prediction

Make predictions on the test set.
Logistic regression prediction: sklearn.linear_model.LogisticRegression.predict
Parameters used:
- X: input features. If passed as an np.array, its shape must be (n_samples, n_features).
Returns:
- C: the predicted class label for each sample.

X_test=X_test.reshape(-1,1)
y_test_pred=model.predict(X=X_test)  # the default threshold is 0.5
y_test_pred_proba=model.predict_proba(X=X_test)  # class probabilities, so a custom threshold such as 0.6 can be applied

Apply a custom threshold to the predicted probabilities to obtain the binary class labels

def thes_func(x):
    thes=0.6
    return 1 if x>thes else 0
y_test_pred_thes=list(map(thes_func,y_test_pred_proba[:,1]))
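The same thresholding can also be written as a vectorized NumPy comparison instead of `map`. This is a sketch on a hand-written probability array (`proba_demo` is a made-up stand-in for `y_test_pred_proba`), showing how raising the threshold from 0.5 to 0.6 changes a borderline prediction.

```python
import numpy as np

# Stand-in for predict_proba output: column 0 is P(class 0), column 1 is P(class 1).
proba_demo = np.array([
    [0.90, 0.10],
    [0.45, 0.55],
    [0.35, 0.65],
    [0.05, 0.95],
])

threshold = 0.6
# Compare the class-1 probability against the threshold for all rows at once.
pred_default = (proba_demo[:, 1] > 0.5).astype(int)
pred_custom = (proba_demo[:, 1] > threshold).astype(int)
print(pred_default)  # the second sample (0.55) counts as class 1 here
print(pred_custom)   # with threshold 0.6 the same sample becomes class 0
```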

6. View the coefficient w and intercept b of the Logistic regression model

Regression coefficient: sklearn.linear_model.LogisticRegression.coef_
Intercept term: sklearn.linear_model.LogisticRegression.intercept_

w,b=model.coef_[0],model.intercept_
print('Weight={0}bias={1}'.format(w,b))
Weight=[9.53805539]bias=[-14.3705638]  # output of the print call
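One way to read these two numbers: the model predicts class 1 when w*x + b > 0, so the decision boundary for the default 0.5 threshold sits at x = -b/w. The sketch below plugs in the values printed above; the variable names are illustrative. For a threshold p other than 0.5, the boundary moves to the point where w*x + b equals log(p/(1-p)).

```python
import numpy as np

# Values copied from the printed output above.
w = np.array([9.53805539])
b = np.array([-14.3705638])

# Threshold 0.5: solve w*x + b = 0 for x.
boundary = (-b / w)[0]
print(boundary)  # roughly 1.5, about halfway between the cluster centres 1 and 2

# Threshold 0.6: solve w*x + b = log(0.6 / 0.4) for x.
p = 0.6
boundary_p = ((np.log(p / (1 - p)) - b) / w)[0]
print(boundary_p)  # slightly to the right of the 0.5 boundary
```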

7. Draw the prediction curve

The scipy.special.expit function, also known as the logistic sigmoid function, is defined as: expit(x) = 1/(1+e^{-x})
Parameters:
- x: the input to the sigmoid function, given as an np.array.
Returns:
- out: the sigmoid of x, returned as an np.array with the same shape as x.

from scipy.special import expit
X_train=X_train.reshape(-1)
X_test=X_test.reshape(-1)
sigmoid=expit(np.sort(X_test)*model.coef_[0]+model.intercept_)
plt.plot(np.sort(X_test),sigmoid,color='g')
plt.scatter(X_test,y_test,color='r',label='test dataset')
plt.legend()
plt.show()
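The curve drawn this way is the model's own class-1 probability. As a hedged check, the sketch below fits a separate small model on freshly generated data (all names here are illustrative, and the data is not the article's dataset) and compares `expit(w*x + b)` with the second column of `predict_proba` for the default binary `LogisticRegression`.

```python
import numpy as np
from scipy.special import expit
from sklearn.linear_model import LogisticRegression

# Small illustrative dataset with the same two-cluster layout.
rng = np.random.RandomState(0)
x_demo = np.append(rng.normal(1, 0.2, 50), rng.normal(2, 0.2, 50)).reshape(-1, 1)
y_demo = np.array([0] * 50 + [1] * 50)

demo_model = LogisticRegression().fit(x_demo, y_demo)

# Linear score z = w*x + b, then the sigmoid of z.
z = x_demo[:, 0] * demo_model.coef_[0, 0] + demo_model.intercept_[0]
manual = expit(z)
by_formula = 1.0 / (1.0 + np.exp(-z))            # the definition written out
builtin = demo_model.predict_proba(x_demo)[:, 1]  # P(class 1) from scikit-learn

print(np.allclose(manual, builtin), np.allclose(manual, by_formula))
```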


8. Calculate the evaluation index Accuracy

Accuracy: sklearn.metrics.accuracy_score
Parameters used:
- y_true: the ground-truth labels.
- y_pred: the predicted labels.
Returns:
- score: the fraction of samples classified correctly.

from sklearn.metrics import accuracy_score
acc=accuracy_score(y_true=y_test,y_pred=y_test_pred)
print('Accuracy:{}'.format(acc))
Accuracy:0.9916666666666667  # output of the print call
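Accuracy is simply the share of predictions that match the true labels; with 600 test samples, the value above corresponds to 595 correct predictions. A minimal sketch on made-up labels (`y_true_demo` and `y_pred_demo` are illustrative) shows that a manual computation agrees with `accuracy_score`.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Made-up labels: four of the six predictions are correct.
y_true_demo = np.array([0, 0, 1, 1, 1, 0])
y_pred_demo = np.array([0, 1, 1, 1, 0, 0])

# Element-wise comparison gives booleans; their mean is the accuracy.
manual_acc = np.mean(y_true_demo == y_pred_demo)
sk_acc = accuracy_score(y_true=y_true_demo, y_pred=y_pred_demo)
print(manual_acc, sk_acc)
```

The same call can be applied to the thresholded predictions from section 5 to compare the 0.5 and 0.6 thresholds.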
