A regression counterexample: feature scaling does not always improve prediction accuracy.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score, mean_squared_error
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler

# Training features: [height in cm, gender flag (1 = male, 0 = female)]
X_train = np.array([
    [158, 1],
    [170, 1],
    [183, 1],
    [191, 1],
    [155, 0],
    [163, 0],
    [180, 0],
    [158, 0],
    [170, 0]
])
# Standardized copy of the features (zero mean, unit variance)
ss = StandardScaler()
X_trainss = ss.fit_transform(X_train)
# Log-scaled copy (+1 so the 0 gender values stay finite)
X_train_log = np.log(X_train + 1)

# Target: weights in kg
y_train = [64, 86, 84, 80, 49, 59, 67, 54, 67]

# Test features in the same [height, gender] format
X_test = np.array([
    [160, 1],
    [196, 1],
    [168, 0],
    [177, 0]
])
X_testss = ss.transform(X_test)
X_test_log = np.log(X_test + 1)
y_test = [66, 87, 68, 74]

K = 5
clf = KNeighborsRegressor(n_neighbors=K)   # KNN on raw features
clf.fit(X_train, y_train)
clf1 = KNeighborsRegressor(n_neighbors=K)  # KNN on standardized features
clf1.fit(X_trainss, y_train)

clf2 = Ridge().fit(X_train_log, y_train)   # Ridge regression on log-scaled features
predictions = clf.predict(X_test)
predictions1 = clf1.predict(X_testss)
predictions2 = clf2.predict(X_test_log)
print('Actual weights: %s' % y_test)
print('Predicted weights: %s' % predictions)
print('Predicted weights by StandardScaler: %s' % predictions1)
print('Predicted weights by Log: %s' % predictions2)
print('mean_squared_error: %s' % mean_squared_error(y_test, predictions))
print('mean_squared_error by StandardScaler: %s' % mean_squared_error(y_test, predictions1))
print('mean_squared_error by Log: %s' % mean_squared_error(y_test, predictions2))
print('r2_score: %s' % r2_score(y_test, predictions))
print('r2_score by StandardScaler: %s' % r2_score(y_test, predictions1))
print('r2_score by Log: %s' % r2_score(y_test, predictions2))

The results are:

Actual weights: [66, 87, 68, 74]
Predicted weights: [62.4 76.8 66. 72.6]
Predicted weights by StandardScaler: [69.4 76.8 59.2 59.2]
Predicted weights by Log: [72.98731557 73.88528401 63.37281696 63.60369452]
mean_squared_error: 30.740000000000023
mean_squared_error by StandardScaler: 103.02
mean_squared_error by Log: 87.57808624078896
r2_score: 0.5424744186046508
r2_score by StandardScaler: -0.5333209302325581
r2_score by Log: -0.30348779521174274

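One plausible explanation (my reading, not from the original post): with raw features, height spans roughly 36 cm while the gender flag spans only 1, so Euclidean distance is driven almost entirely by height and gender is effectively ignored; after StandardScaler both columns contribute comparably, which reshuffles the K = 5 neighborhoods. The snippet below is a small check of that idea, reusing clf, clf1, X_test and X_testss from the script above.

# Compare each test point's K=5 neighbor set before and after scaling.
_, idx_raw = clf.kneighbors(X_test)    # neighbor indices, raw features
_, idx_ss = clf1.kneighbors(X_testss)  # neighbor indices, standardized features
for i, x in enumerate(X_test):
    print('test point %s: raw neighbors %s, scaled neighbors %s'
          % (x, np.sort(idx_raw[i]), np.sort(idx_ss[i])))

If the two index sets differ for a test point, the prediction (the mean weight of the neighbors) changes, which is exactly what the MSE gap above reflects.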

We find that after standardizing and log-scaling the feature values, the prediction mean squared error actually gets worse. This extends the last example in Chapter 3 of "scikit-learn Machine Learning", and it suggests that the book's example is shaky, because the sample (nine training points) is far too small.
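To probe the small-sample claim, here is a minimal synthetic sketch; everything in it (the sample size, the distributions, the coefficients) is made up for illustration, not taken from the book. It draws a few hundred height/gender/weight samples from a noisy linear rule and compares 5-fold cross-validated MSE of KNN with and without StandardScaler.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
n = 500
gender = rng.randint(0, 2, n)              # hypothetical 1 = male, 0 = female
height = rng.normal(170 + 8 * gender, 6)   # cm; males taller on average
# Weight in kg from a noisy linear rule where gender carries real signal
weight = -90 + 0.9 * height + 8 * gender + rng.normal(0, 5, n)
X = np.column_stack([height, gender])

knn_raw = KNeighborsRegressor(n_neighbors=5)
knn_scaled = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))
mse_raw = -cross_val_score(knn_raw, X, weight, cv=5,
                           scoring='neg_mean_squared_error').mean()
mse_scaled = -cross_val_score(knn_scaled, X, weight, cv=5,
                              scoring='neg_mean_squared_error').mean()
print('KNN MSE, raw features: %.2f' % mse_raw)
print('KNN MSE, standardized: %.2f' % mse_scaled)

On data like this, where the binary gender feature genuinely carries signal, standardization lets KNN actually use it; the exact numbers depend on the random draw, but the broader point stands either way: nine samples are far too few to judge whether a preprocessing step helps.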

Origin www.cnblogs.com/starcrm/p/11712559.html