Machine learning 1: Using linear and logistic regression for Boston housing price prediction and graduate admission prediction

Boston house price prediction

Import library

from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston  # note: removed in scikit-learn 1.2, so an older release is required
import matplotlib.pyplot as plt
%matplotlib inline

Load the dataset

bosten = load_boston()

Linear regression

  • Model training
x = bosten.data[:, 5:6]  # column 5 is RM, the average number of rooms per dwelling
clf = LinearRegression()
clf.fit(x, bosten.target)  # train the model
  • Regression coefficients
clf.coef_  
array([9.10210898])
  • Predicted values
y_pre = clf.predict(bosten.data[:,5:6])  # model output (predicted prices)
  • Visualization
plt.scatter(x,bosten.target)
plt.plot(x,y_pre)
plt.show()
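
To make the link between the fitted coefficient and the plotted line explicit, here is a minimal sketch on synthetic data. The slope 9.0, the intercept -34.0, the noise level, and the names x_demo, y_demo, and demo are made-up for illustration and are not taken from the Boston data. It checks that predict returns coef_ * x + intercept_ and reports the R² score.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic one-feature data (hypothetical values, not the Boston dataset)
rng = np.random.default_rng(0)
x_demo = rng.uniform(4.0, 9.0, size=(100, 1))
y_demo = 9.0 * x_demo[:, 0] - 34.0 + rng.normal(scale=2.0, size=100)

demo = LinearRegression()
demo.fit(x_demo, y_demo)

# The prediction is just the fitted line: slope * x + intercept
manual = demo.coef_[0] * x_demo[:, 0] + demo.intercept_
y_hat = demo.predict(x_demo)

# R^2 on the training data: fraction of variance explained by the line
r2 = demo.score(x_demo, y_demo)
print(demo.coef_, demo.intercept_, r2)
```

The same check should work on the model above by replacing x_demo with x and demo with clf.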

Graduate admission prediction

Import library

import pandas as pd
from sklearn.linear_model import LogisticRegression  # logistic regression
from sklearn.model_selection import train_test_split  # train/test split
from sklearn.metrics import classification_report

Import data

data = pd.read_csv(r"LogisticRegression.csv")
# features: every column after the first; label: the "admit" column; 20% held out for testing
data_tr,data_te,label_tr,label_te = train_test_split(data.iloc[:,1:],data["admit"],test_size = 0.2)
data.iloc[:,1:]
     gre   gpa  rank
0    380  3.61     3
1    660  3.67     3
2    800  4.00     1
3    640  3.19     4
4    520  2.93     4
...  ...   ...   ...
395  620  4.00     2
396  560  3.04     3
397  460  2.63     2
398  700  3.65     2
399  600  3.89     3

400 rows × 3 columns

data_tr.head()
     gre   gpa  rank
252  520  4.00     2
94   660  3.44     2
41   580  3.32     2
2    800  4.00     1
207  640  3.63     1
data_te.head()
     gre   gpa  rank
45   460  3.45     3
311  660  3.67     2
391  660  3.88     2
357  720  3.31     1
117  700  3.72     2
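
Because train_test_split shuffles randomly, the rows shown above change on every run. As a hedged sketch, assuming a synthetic table with the same column layout (admit, gre, gpa, rank; all values below are made up), passing random_state makes the split reproducible and stratify keeps the admit ratio similar in both parts:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for LogisticRegression.csv: 400 rows, 30% admitted
rng = np.random.default_rng(0)
demo_df = pd.DataFrame({
    "admit": np.r_[np.ones(120, dtype=int), np.zeros(280, dtype=int)],
    "gre": rng.integers(22, 41, size=400) * 20,
    "gpa": rng.uniform(2.2, 4.0, size=400).round(2),
    "rank": rng.integers(1, 5, size=400),
})

features = demo_df.iloc[:, 1:]   # gre, gpa, rank
labels = demo_df["admit"]

# random_state fixes the shuffle; stratify preserves the class balance
X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.2, random_state=42, stratify=labels
)
print(X_tr.shape, X_te.shape, y_tr.mean(), y_te.mean())
```

With an imbalanced label such as admit, a stratified split avoids a test set that happens to contain very few admitted students.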

Model training

clf = LogisticRegression()
clf.fit(data_tr,label_tr)  # train the model
pre = clf.predict(data_te)  # predict labels for the test set
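
For a binary problem, predict returns hard 0/1 labels, while predict_proba exposes the estimated probabilities behind them. The sketch below uses synthetic data (the features, the label rule, and the names X_demo, y_demo, and demo are illustrative assumptions) to show that the predicted label corresponds to thresholding the class-1 probability at 0.5.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (hypothetical, not the admissions file)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
noise = rng.normal(scale=0.5, size=200)
y_demo = (X_demo[:, 0] + 0.5 * X_demo[:, 1] + noise > 0).astype(int)

demo = LogisticRegression()
demo.fit(X_demo, y_demo)

labels = demo.predict(X_demo)              # hard 0/1 predictions
proba_1 = demo.predict_proba(X_demo)[:, 1]  # estimated P(class 1) per row

# Thresholding the probability at 0.5 reproduces the hard labels
thresholded = (proba_1 > 0.5).astype(int)
print(labels[:10], proba_1[:10].round(2))
```

Inspecting the probabilities can help when one class is rarely predicted, since a different threshold trades precision against recall.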
  • Predicted labels, to be compared with the actual values in label_te
pre
array([0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int64)
  • The classification_report function in sklearn prints a text report of the main classification metrics: the precision, recall, F1 score, and support of each class, plus the overall accuracy.
res = classification_report(label_te,pre)
print(res)
              precision    recall  f1-score   support

           0       0.71      0.89      0.79        56
           1       0.40      0.17      0.24        24

    accuracy                           0.68        80
   macro avg       0.56      0.53      0.51        80
weighted avg       0.62      0.68      0.63        80
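
To show where the numbers in the report come from, the sketch below recomputes the class-1 metrics from a confusion matrix. The counts (50 and 6 for true class 0, 20 and 4 for true class 1) are inferred from the rounded figures in the report, so treat them as an assumption rather than the original predictions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Counts inferred from the rounded report (an assumption, not the raw data):
# 56 true class-0 samples: 50 predicted 0, 6 predicted 1
# 24 true class-1 samples: 20 predicted 0, 4 predicted 1
y_true = np.array([0] * 56 + [1] * 24)
y_pred = np.array([0] * 50 + [1] * 6 + [0] * 20 + [1] * 4)

cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted class
tn, fp, fn, tp = cm.ravel()

precision_1 = tp / (tp + fp)   # of the samples predicted 1, how many are truly 1
recall_1 = tp / (tp + fn)      # of the samples that are truly 1, how many were found
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)
accuracy = (tp + tn) / cm.sum()
print(round(precision_1, 2), round(recall_1, 2), round(f1_1, 2), round(accuracy, 2))
```

Under these counts, only 4 of the 24 admitted students are recovered, which is why the recall for class 1 is so low even though the overall accuracy is 0.68.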


Origin blog.csdn.net/qq_45176548/article/details/112060382