sklearn cross-validation

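When evaluating different settings ("hyperparameters") of an estimator, such as the C parameter that must be set manually for an SVM, there is still a risk of overfitting to the test set, because the parameters can be tuned until the estimator performs best on it. In that case, information about the test set leaks into the model, and the evaluation metrics no longer reflect generalization performance. To address this, another part of the dataset should be held out as a "validation set": the model is trained on the training set and then evaluated on the validation set. Only when the results on the validation set look satisfactory is the final evaluation done on the test set.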
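The three-way split described above can be sketched as follows. This is only a minimal illustration: the 60/20/20 ratio, the candidate values of C, the random_state and the choice to refit on training plus validation data are all assumptions made for the example, not something the original text prescribes.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# Hold out 20% of the data as the final test set (illustrative ratio).
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Split the remainder into training and validation sets
# (0.25 of the remaining 80% is 20% of the full dataset).
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0, stratify=y_rest)

# Tune C using the validation set only; the test set stays untouched.
best_C, best_val_score = None, -1.0
for C in (0.01, 0.1, 1, 10):  # hypothetical candidate values
    clf = svm.SVC(kernel="linear", C=C).fit(X_train, y_train)
    val_score = clf.score(X_val, y_val)
    if val_score > best_val_score:
        best_C, best_val_score = C, val_score

# Refit with the chosen C on training + validation data (a common choice,
# assumed here), then evaluate once on the test set.
final_clf = svm.SVC(kernel="linear", C=best_C).fit(X_rest, y_rest)
test_score = final_clf.score(X_test, y_test)
print(best_C, best_val_score, test_score)
```

The test set is consulted exactly once, after C has been fixed, so its score is not influenced by the tuning loop.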

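Cross-validation avoids setting aside a separate validation set by repeatedly splitting the training data into folds. The following example shows how to estimate the accuracy of a linear kernel support vector machine on the iris dataset by splitting the data, fitting a model and computing the score 5 consecutive times (with a different split each time):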

from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)
# Fit and score the classifier on 5 different splits of the data
scores = cross_val_score(clf, iris.data, iris.target, cv=5)
scores
# array([ 0.96...,  1.  ...,  0.96...,  0.96...,  1.        ])

The signature of cross_val_score is:

cross_val_score(estimator, X, y=None, groups=None, scoring=None, cv=None,
                n_jobs=1, verbose=0, fit_params=None,
                pre_dispatch='2*n_jobs')

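estimator: the estimator to fit (for example, a classifier)

X: the data to fit (the feature matrix)

y: the target values

scoring: the model evaluation criterion; the accepted string names are listed below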

Scoring                         Function                               Comment
Classification
'accuracy'                      metrics.accuracy_score
'average_precision'             metrics.average_precision_score
'f1'                            metrics.f1_score                       for binary targets
'f1_micro'                      metrics.f1_score                       micro-averaged
'f1_macro'                      metrics.f1_score                       macro-averaged
'f1_weighted'                   metrics.f1_score                       weighted average
'f1_samples'                    metrics.f1_score                       by multilabel sample
'neg_log_loss'                  metrics.log_loss                       requires predict_proba support
'precision' etc.                metrics.precision_score                suffixes apply as with 'f1'
'recall' etc.                   metrics.recall_score                   suffixes apply as with 'f1'
'roc_auc'                       metrics.roc_auc_score
Clustering
'adjusted_mutual_info_score'    metrics.adjusted_mutual_info_score
'adjusted_rand_score'           metrics.adjusted_rand_score
'completeness_score'            metrics.completeness_score
'fowlkes_mallows_score'         metrics.fowlkes_mallows_score
'homogeneity_score'             metrics.homogeneity_score
'mutual_info_score'             metrics.mutual_info_score
'normalized_mutual_info_score'  metrics.normalized_mutual_info_score
'v_measure_score'               metrics.v_measure_score
Regression
'explained_variance'            metrics.explained_variance_score
'neg_mean_absolute_error'       metrics.mean_absolute_error
'neg_mean_squared_error'        metrics.mean_squared_error
'neg_mean_squared_log_error'    metrics.mean_squared_log_error
'neg_median_absolute_error'     metrics.median_absolute_error
'r2'                            metrics.r2_score

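Note: 'roc_auc' is meant for binary classification and does not accept multiclass targets. The default value of scoring is None, in which case the estimator's own score method is used, so the returned values are clf.score computed on each fold.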
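As a rough illustration of the scoring parameter, the sketch below picks one metric from the table by name ('f1_macro') and then checks the default behaviour by recomputing clf.score on each fold by hand. The manual loop assumes that an integer cv with a classifier is treated as an unshuffled StratifiedKFold; the variable names are made up for the example.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.base import clone
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = datasets.load_iris(return_X_y=True)
clf = svm.SVC(kernel="linear", C=1)

# Select a metric from the table by its string name.
f1_scores = cross_val_score(clf, X, y, cv=5, scoring="f1_macro")

# With scoring left as None, the estimator's own score method is used.
default_scores = cross_val_score(clf, X, y, cv=5)

# Recompute the default scores by hand, one fold at a time.
# Assumption: cv=5 on a classifier behaves like StratifiedKFold(n_splits=5).
manual_scores = []
for train_idx, test_idx in StratifiedKFold(n_splits=5).split(X, y):
    fold_clf = clone(clf).fit(X[train_idx], y[train_idx])
    manual_scores.append(fold_clf.score(X[test_idx], y[test_idx]))

print(f1_scores)
print(np.allclose(default_scores, manual_scores))
```

For SVC, the score method is mean accuracy, so the default result here is the per-fold accuracy.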

The cross_validate function differs from cross_val_score in two ways:

  • It allows specifying multiple metrics for evaluation.
  • In addition to the test scores, it returns a dict containing fit times, score times and (optionally) training scores.
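A short sketch of both differences, assuming the same iris data and linear SVC as in the earlier example. The two metric names are picked from the scoring table for illustration, and return_train_score=True is passed explicitly so that training scores are included.

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_validate

X, y = datasets.load_iris(return_X_y=True)
clf = svm.SVC(kernel="linear", C=1)

# Evaluate two metrics at once and also request the training scores.
results = cross_validate(
    clf, X, y, cv=5,
    scoring=["precision_macro", "recall_macro"],
    return_train_score=True,
)

# The result is a dict with one array per key, one entry per fold.
print(sorted(results.keys()))
print(results["test_recall_macro"])
```

Each metric name is prefixed with test_ or train_ in the returned dict, alongside the fit_time and score_time entries.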

Reposted from blog.csdn.net/qiufengily/article/details/80536628