[Predicting Boston house prices with the normal equation]

Data preparation

We first need to load the Boston house price dataset. This dataset contains house feature information and corresponding house price labels.

import pandas as pd
import numpy as np

data_url = "http://lib.stat.cmu.edu/datasets/boston"
# Raw string avoids an invalid-escape warning for the regex separator
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]

print("Dataset size: {}".format(data.shape))
print("Label size: {}".format(target.shape))
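The slicing on raw_df is easier to follow on a toy array. In the Boston file each record spans two lines, so after parsing, even rows hold the first group of features and odd rows hold the remaining features followed by the price. The sketch below reproduces that layout with a made-up 4x3 array (raw, features and labels are illustrative names, not part of the original code):

```python
import numpy as np

# Hypothetical toy array: each record occupies two consecutive rows.
# Even rows hold the first 3 features; odd rows hold 1 more feature,
# then the label, then NaN padding (as in the parsed Boston file).
raw = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 10.0, np.nan],
    [5.0, 6.0, 7.0],
    [8.0, 20.0, np.nan],
])

# Even rows, plus the first column of the odd rows, form the feature matrix
features = np.hstack([raw[::2, :], raw[1::2, :1]])
# The second column of the odd rows is the label
labels = raw[1::2, 1]

print(features)
print(labels)
```

The real code does the same thing with wider rows: all columns of the even rows plus the first two columns of the odd rows become the features, and the third column of the odd rows is the price.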

Data splitting

Next, we split the dataset into training and testing sets.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(data, target, random_state=8)
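When no test_size is given, train_test_split holds out 25% of the samples for testing. The following NumPy-only sketch shows the idea behind such a split; simple_split is a hypothetical helper written for illustration, and it does not reproduce scikit-learn's exact shuffling, so the rows it selects will differ from those chosen with random_state=8.

```python
import numpy as np

def simple_split(X, y, test_fraction=0.25, seed=8):
    """Hypothetical helper: shuffle the row indices, then cut off a test portion."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(np.ceil(test_fraction * len(X)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Placeholder arrays with the same shape as the Boston data (506 samples, 13 features)
X = np.arange(506 * 13, dtype=float).reshape(506, 13)
y = np.arange(506, dtype=float)

X_tr, X_te, y_tr, y_te = simple_split(X, y)
print(X_tr.shape, X_te.shape)  # (379, 13) (127, 13)
```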

Normal equation method

The normal equation is the closed-form solution to the linear regression (least-squares) problem: it computes the optimal parameters directly, without iteration. We use scikit-learn's LinearRegression class, which also solves the least-squares problem directly rather than by gradient descent, to train the model, then print the R² score on the training and test sets along with the learned coefficients and intercept.
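For reference, the closed form can be derived in a few steps. The notation here is an assumption, since the post does not define symbols: X is the feature matrix with a leading column of ones, y is the vector of prices, and θ collects the intercept and coefficients. Setting the gradient of the squared error to zero gives:

```latex
J(\theta) = \lVert X\theta - y \rVert_2^2, \qquad
\nabla_\theta J = 2X^\top (X\theta - y) = 0
\;\Longrightarrow\; X^\top X\,\theta = X^\top y
\;\Longrightarrow\; \theta = (X^\top X)^{-1} X^\top y
```

The last step requires X^T X to be invertible, which holds when the columns of X are linearly independent.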

from sklearn.linear_model import LinearRegression

lr = LinearRegression()
lr.fit(X_train, y_train)

print("Normal equation training set score: {:.3f}".format(lr.score(X_train, y_train)))
print("Normal equation test set score: {:.3f}".format(lr.score(X_test, y_test)))
print("Normal equation coefficients: {}".format(lr.coef_))
print("Normal equation intercept: {:.3f}".format(lr.intercept_))
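To make the closed form concrete, here is a minimal NumPy sketch that solves the normal equation by hand. It runs on synthetic data rather than the Boston set so that it is self-contained; true_coef, the intercept of 4.0 and the noise level are made-up values, and Xb and theta are illustrative names.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                  # synthetic features (not the Boston data)
true_coef = np.array([2.0, -1.0, 0.5])         # made-up ground-truth coefficients
y = X @ true_coef + 4.0 + rng.normal(scale=0.1, size=200)

# Prepend a column of ones so the first parameter acts as the intercept
Xb = np.hstack([np.ones((X.shape[0], 1)), X])

# Solve (Xb^T Xb) theta = Xb^T y rather than forming the inverse explicitly
theta = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)
intercept, coef = theta[0], theta[1:]

# Cross-check against NumPy's least-squares solver
theta_lstsq, *_ = np.linalg.lstsq(Xb, y, rcond=None)

print("intercept:", round(float(intercept), 3))
print("coefficients:", np.round(coef, 3))
```

Substituting X_train and y_train for X and y should give values that match lr.intercept_ and lr.coef_ up to floating-point differences, assuming X^T X is well conditioned. Solving the linear system is generally preferred over computing the inverse because it is cheaper and numerically more stable.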

Model evaluation

We evaluate the model on the test set with the mean squared error (MSE) and the root mean squared error (RMSE).

from sklearn.metrics import mean_squared_error

y_pred = lr.predict(X_test)

print("Normal equation MSE: {:.3f}".format(mean_squared_error(y_test, y_pred)))
print("Normal equation RMSE: {:.3f}".format(np.sqrt(mean_squared_error(y_test, y_pred))))
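These metrics are simple enough to compute by hand, which helps when interpreting them. The sketch below uses four made-up values (y_true and y_hat are illustrative, not model output) and also computes R², the quantity that lr.score returns.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0])  # made-up targets for illustration
y_hat = np.array([2.0, 5.0, 8.0, 9.0])   # made-up predictions

residuals = y_true - y_hat
mse = np.mean(residuals ** 2)             # mean squared error
rmse = np.sqrt(mse)                       # same units as the target

ss_res = np.sum(residuals ** 2)                     # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)      # total sum of squares
r2 = 1.0 - ss_res / ss_tot                          # coefficient of determination

print(mse, rmse, r2)
```

Here the MSE is 0.5 and R² is 0.9. Because the RMSE is in the same units as the target, it is usually the easier of the two error measures to read against actual house prices.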

Visualization

Finally, we plot the true and predicted values together to get a more intuitive sense of how well the model fits.

import matplotlib.pyplot as plt

plt.figure(figsize=(10, 6))
plt.plot(range(len(y_test)), y_test, "r", label="y_test")
plt.plot(range(len(y_pred)), y_pred, "g--", label="y_pred")
plt.legend()
plt.show()

Origin blog.csdn.net/qq_66726657/article/details/131968499