1. Experimental Purpose
We have a data set of used BMW car prices. We will analyze it and build a model that predicts a car's price from its mileage and age. We will use sklearn's train_test_split function to split the data into training and test sets.
Data link
Password: n3dp
2. Import the necessary modules and read the data
import pandas as pd
import matplotlib.pyplot as plt
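from sklearn.model_selection import train_test_split  # data-splitting utility
from sklearn.linear_model import LinearRegression
clf = LinearRegression()  # instantiate (the scatter plot suggests a linear fit is appropriate)
clf.fit(X_train, y_train)  # train on the training set
clf.score(X_test, y_test)  # compute the score (R^2 on the test set)
clf.coef_  # show the coefficients
clf.intercept_  # show the intercept
clf.predict(X_test)  # predict prices for the test set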
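The fitted model has the form price ≈ intercept + coef[0] × mileage + coef[1] × age. As a rough end-to-end sketch of the steps above, the following uses synthetic data in place of the BMW file. The column names mileage, age and price, and the numbers used to generate the data, are assumptions for illustration only and are not taken from the real data set.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the BMW data; column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "mileage": rng.uniform(5_000, 150_000, n),
    "age": rng.uniform(1, 12, n),
})
# Made-up linear relationship plus noise, only so there is something to fit.
df["price"] = (40_000 - 0.1 * df["mileage"] - 1_500 * df["age"]
               + rng.normal(0, 1_000, n))

X = df[["mileage", "age"]]  # features
y = df["price"]             # target

# Hold out 30% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=10
)

clf = LinearRegression()
clf.fit(X_train, y_train)

r2 = clf.score(X_test, y_test)  # R^2 on the held-out rows
print("R^2:", r2)
print("coefficients (mileage, age):", clf.coef_)
print("intercept:", clf.intercept_)

# Predict the price of one hypothetical car.
new_car = pd.DataFrame({"mileage": [60_000], "age": [4]})
predicted_price = clf.predict(new_car)
print("predicted price:", predicted_price[0])
```

With the real data, only the part that builds df would change: load the file with pandas and select the actual mileage, age and price columns.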
To get the same split on every run, pass the random_state parameter to train_test_split:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=10)
X_test
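Here 10 is the seed for the random shuffle that is applied before the data is split. The same integer always reproduces the same split, and a different integer gives a different shuffle.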
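The effect of random_state can be checked directly. This is a small sketch on toy arrays (not the BMW data) that splits the same input three times and compares the resulting test sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Two splits with the same seed, one with a different seed.
_, X_test_a, _, y_test_a = train_test_split(X, y, test_size=0.3, random_state=10)
_, X_test_b, _, y_test_b = train_test_split(X, y, test_size=0.3, random_state=10)
_, X_test_c, _, y_test_c = train_test_split(X, y, test_size=0.3, random_state=11)

# Same seed: the test sets are identical.
same_seed_matches = np.array_equal(X_test_a, X_test_b)
print("same seed gives same split:", same_seed_matches)

# Different seed: print both to compare the selected rows.
print("seed 10 test targets:", y_test_a)
print("seed 11 test targets:", y_test_c)
```

Leaving random_state unset makes each call draw a fresh shuffle, so scores can vary slightly from run to run.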