【sklearn】Using LinearRegression

1 Parameter

sklearn's LinearRegression has a normalize parameter that scales the features before training:

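from sklearn.linear_model import LinearRegression
# Note: normalize was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this call only works on older versions.
model = LinearRegression(normalize=True)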

The documentation describes it as follows:
normalize : bool, default=False

This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use StandardScaler before calling fit on an estimator with normalize=False.
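The documentation's advice to use StandardScaler before fitting can be sketched with a pipeline. This is a minimal illustration, not the author's code: the data and the names X_demo, y_demo and pipe are made up for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, made up for illustration: two features on very different scales.
rng = np.random.default_rng(0)
X_demo = np.column_stack([rng.normal(0, 1, 200), rng.normal(0, 1000, 200)])
y_demo = 3.0 * X_demo[:, 0] + 0.002 * X_demo[:, 1] + 5.0

# StandardScaler centers each column and divides by its standard deviation;
# the pipeline applies it before LinearRegression in both fit and predict.
pipe = make_pipeline(StandardScaler(), LinearRegression())
pipe.fit(X_demo, y_demo)

# make_pipeline names each step after its lowercased class name.
scaled_coef = pipe.named_steps["linearregression"].coef_
print(scaled_coef)
```

Keeping the scaler inside the pipeline means the same scaling learned from the training data is reused when predicting on new data.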

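The interesting thing is that normalizing and standardizing are closely related: both subtract the mean, and they differ only in the divisor. normalize divides by the l2 norm of the centered column, while StandardScaler divides by the standard deviation, so the two results differ by a constant factor of sqrt(n_samples).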
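The relationship between the two scalings can be checked numerically. A small sketch with NumPy only, using a made-up feature column x; the population standard deviation (ddof=0) is used because that is what StandardScaler computes.

```python
import numpy as np

# Made-up feature column, for illustration only.
x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
centered = x - x.mean()

# "normalize": subtract the mean, divide by the l2 norm of the centered column.
normalized = centered / np.linalg.norm(centered)

# "standardize": subtract the mean, divide by the population standard deviation.
standardized = centered / x.std()

# The two results differ only by the constant factor sqrt(n_samples).
ratio = standardized / normalized
print(ratio)
print(np.sqrt(len(x)))
```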

2 Coefficient

After training, the coefficients of the linear regression model can indicate the importance of each feature, provided the features are on comparable scales:

sorted(dict(zip(continuous_feature_names, model.coef_)).items(), key=lambda x:x[1], reverse=True)
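Raw coefficients depend on each feature's units, so a ranking is only meaningful when the features share a scale, and a large negative coefficient matters as much as a large positive one. A minimal sketch, assuming made-up data and hypothetical feature names (feat_a, feat_b, feat_c), that standardizes first and ranks by absolute value:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Made-up data and feature names, for illustration only.
rng = np.random.default_rng(42)
demo_names = ["feat_a", "feat_b", "feat_c"]
X_demo = rng.normal(size=(300, 3)) * np.array([1.0, 100.0, 0.01])
y_demo = 2.0 * X_demo[:, 0] - 0.05 * X_demo[:, 1] + 10.0 * X_demo[:, 2]

# Put all features on the same scale so the coefficients are comparable.
X_scaled = StandardScaler().fit_transform(X_demo)
demo_model = LinearRegression().fit(X_scaled, y_demo)

# Rank by magnitude rather than by signed value.
ranking = sorted(zip(demo_names, demo_model.coef_), key=lambda kv: abs(kv[1]), reverse=True)
for name, coef in ranking:
    print(f"{name}: {coef:+.3f}")
```

On the unscaled data feat_c has the largest raw coefficient (10.0), yet after scaling it contributes the least, which is why the scaling step comes first.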

The coefficients can also be plotted:

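import seaborn as sns
# Fit on the raw label; train_y_ln is only defined in section 3.
model = LinearRegression().fit(train_X, train_y)
print('intercept: ' + str(model.intercept_))
# Passing x and y by keyword works across seaborn versions.
sns.barplot(x=abs(model.coef_), y=continuous_feature_names)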

3 Check the model

After training the model, compare the real values with the predicted values to judge whether the model is usable:

import numpy as np
import matplotlib.pyplot as plt

# Draw 50 random rows (assumes train_X and train_y share a default 0..n-1 index).
subsample_index = np.random.randint(low=0, high=len(train_y), size=50)
plt.scatter(train_X['v_9'][subsample_index], train_y[subsample_index], color='black')
plt.scatter(train_X['v_9'][subsample_index], model.predict(train_X.loc[subsample_index]), color='blue')
plt.xlabel('v_9')
plt.ylabel('price')
plt.legend(['True Price', 'Predicted Price'], loc='upper right')
print('The predicted price is obviously different from the true price')
plt.show()
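A scatter plot over one feature shows the gap visually; a single error number makes it easier to compare models. A sketch using mean_absolute_error from sklearn.metrics, with made-up arrays standing in for train_y and model.predict(train_X):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Made-up values standing in for the real labels and the model's predictions.
y_true_demo = np.array([100.0, 250.0, 400.0, 1200.0])
y_pred_demo = np.array([120.0, 230.0, 450.0, 900.0])

# Average absolute gap between real and predicted values, in the label's own units.
mae = mean_absolute_error(y_true_demo, y_pred_demo)
print(mae)
```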

If the deviation is large, there is a problem with the model. One possible cause is the label: if the label has a long-tailed distribution, it does not fit the assumptions of the model and should be transformed to be closer to a normal distribution.

train_y_ln = np.log(train_y + 1)

After training again on the transformed label, the fit is much better.
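One detail when training on log(y + 1): the model then predicts on the log scale, so predictions have to be mapped back with exp(pred) - 1 before comparing them with real prices. A minimal sketch on made-up long-tailed data (X_demo, y_demo and the model names are assumptions for illustration, not the article's dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Made-up long-tailed label: log(y + 1) is linear in the features plus small noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 3))
log_y = 1.0 + X_demo @ np.array([0.8, -0.5, 0.3]) + rng.normal(0, 0.05, 500)
y_demo = np.expm1(log_y)

# Model fitted directly on the long-tailed label.
raw_model = LinearRegression().fit(X_demo, y_demo)
mae_raw = mean_absolute_error(y_demo, raw_model.predict(X_demo))

# Model fitted on log(y + 1); predictions are mapped back with exp(pred) - 1.
log_model = LinearRegression().fit(X_demo, np.log1p(y_demo))
pred_back = np.expm1(log_model.predict(X_demo))
mae_log = mean_absolute_error(y_demo, pred_back)

print(mae_raw, mae_log)
```

np.log1p and np.expm1 compute log(1 + x) and exp(x) - 1 and are exact inverses of each other, which makes the round trip less error-prone than writing the two formulas by hand.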

Origin blog.csdn.net/qq_40860934/article/details/114288682