Kaggle Tutorial 9: Cross-Validation

Cross-Validation

Cross-Validation and Train-Test Split

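When you have plenty of data, use a train-test split: the model is fit only once, so it runs quickly.

When data is limited, use cross-validation: every row is used for validation exactly once, so the model's score is a more reliable estimate.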
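
To make the trade-off concrete, here is a minimal sketch that scores the same model both ways. It does not use the Melbourne data: the synthetic dataset from make_regression, the LinearRegression model, and the names X_demo, y_demo, holdout_mae and cv_mae are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic regression data (an assumption, standing in for a small real dataset)
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Train-test split: one fit, one score, based on a single 20% validation set
X_train, X_val, y_train, y_val = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train, y_train)
holdout_mae = mean_absolute_error(y_val, model.predict(X_val))

# 5-fold cross-validation: five fits, five scores, every row validated once
cv_scores = cross_val_score(LinearRegression(), X_demo, y_demo,
                            cv=5, scoring='neg_mean_absolute_error')
cv_mae = -cv_scores.mean()

print('Hold-out MAE: %.2f' % holdout_mae)
print('Cross-validated MAE: %.2f' % cv_mae)
```

The hold-out number depends on which rows happen to land in the validation set, while the cross-validated number averages over five different validation sets at roughly five times the training cost.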

Example:

import pandas as pd
# Load the Melbourne housing data and keep a few numeric predictors
data = pd.read_csv('../input/melb_data.csv')
cols_to_use = ['Rooms', 'Distance', 'Landsize', 'BuildingArea', 'YearBuilt']
X = data[cols_to_use]
# Target: the sale price
y = data.Price

from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
# Fill missing values with the column mean, then fit a random forest
my_pipeline = make_pipeline(SimpleImputer(), RandomForestRegressor())

from sklearn.model_selection import cross_val_score
# scikit-learn scorers follow "higher is better", so MAE is reported as a negative number
scores = cross_val_score(my_pipeline, X, y, scoring='neg_mean_absolute_error')
print(scores)

print('Mean Absolute Error %.2f' % (-1 * scores.mean()))
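
For readers who want to see what cross_val_score does with the pipeline, the following sketch repeats the same steps by hand on a small synthetic dataset. The data, the injected missing values, the fixed random_state, and the names X_toy, y_toy, manual_scores and library_scores are assumptions made for illustration, not part of the original example.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic data with some missing values, so the imputer has work to do
X_toy, y_toy = make_regression(n_samples=120, n_features=5, noise=5.0, random_state=0)
X_toy[::10, 2] = np.nan

# random_state is fixed so both approaches train identical forests
pipeline = make_pipeline(SimpleImputer(),
                         RandomForestRegressor(n_estimators=20, random_state=0))
folds = KFold(n_splits=5)

# Manual loop: fit a fresh copy of the pipeline on each training part,
# then score it on the held-out part
manual_scores = []
for train_idx, test_idx in folds.split(X_toy):
    fold_model = clone(pipeline)
    fold_model.fit(X_toy[train_idx], y_toy[train_idx])
    preds = fold_model.predict(X_toy[test_idx])
    manual_scores.append(-mean_absolute_error(y_toy[test_idx], preds))
manual_scores = np.array(manual_scores)

# The same folds handed to cross_val_score
library_scores = cross_val_score(pipeline, X_toy, y_toy, cv=folds,
                                 scoring='neg_mean_absolute_error')

print(manual_scores)
print(library_scores)
```

Because the imputer sits inside the pipeline, its fill values are learned only from each fold's training rows, so no information from the held-out rows leaks into the fit.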


Reposted from www.cnblogs.com/wangzhonghan/p/10515196.html