Cross-validation in sklearn

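When evaluating different settings ("hyperparameters") of an estimator, such as the C parameter that must be set manually for an SVM, there is still a risk of overfitting to the test set, because the settings can be tuned until the estimator performs best on it. Information about the test set then leaks into the model, and the evaluation metrics no longer reflect its generalization performance. To avoid this, another part of the data should be held out as a "validation set": the model is trained on the training set and then evaluated on the validation set, and only once the validation results look good is the final evaluation run on the test set.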
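The three-way split described above can be sketched roughly as follows. This is only an illustration: the split proportions, the random_state value, and the candidate C values are assumptions, not taken from the original text.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# Hold out 20% as the final test set (proportion is an illustrative assumption).
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)
# Split the remainder into a training set and a validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0, stratify=y_rest)

# Choose C by looking at the validation set only.
best_C, best_val_score = None, -1.0
for C in (0.01, 0.1, 1, 10):  # hypothetical candidate values
    clf = svm.SVC(kernel='linear', C=C).fit(X_train, y_train)
    val_score = clf.score(X_val, y_val)
    if val_score > best_val_score:
        best_C, best_val_score = C, val_score

# Use the test set once, after the hyperparameter has been fixed.
final_clf = svm.SVC(kernel='linear', C=best_C).fit(X_train, y_train)
test_score = final_clf.score(X_test, y_test)
print(best_C, best_val_score, test_score)
```

The drawback of this approach is that every held-out set shrinks the amount of data left for training, which is what cross-validation is meant to address.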

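The example below shows how to estimate the accuracy of a linear-kernel support vector machine on the iris dataset by splitting the data, fitting a model, and computing the score 5 consecutive times (with a different split each time):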

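from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score
iris = datasets.load_iris()  # the iris dataset used in this example
clf = svm.SVC(kernel='linear', C=1)
# cv=5: five folds, each used once as the held-out split
scores = cross_val_score(clf, iris.data, iris.target, cv=5)
scores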

array([ 0.96...,  1.  ...,  0.96...,  0.96...,  1.        ])
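The five fold scores are usually condensed into a mean and a spread. A minimal sketch, assuming the same iris data and linear SVC as in the example above; reporting twice the standard deviation is a common convention rather than something the original text specifies.

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)
scores = cross_val_score(clf, iris.data, iris.target, cv=5)

# Report the mean score and its spread across the five folds.
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))
```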

    cross_val_score(estimator, X, y=None, groups=None, scoring=None, cv=None, n_jobs=1, verbose=0, fit_params=None, pre_dispatch='2*n_jobs')

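estimator: the estimator to fit (a classifier in the example above)

X: the data to fit, which is split into training and test folds

y: the target values

scoring: the model evaluation metric; the predefined values are listed in the table below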

Scoring	Function	Comment
Classification
'accuracy'	`metrics.accuracy_score`
'average_precision'	`metrics.average_precision_score`
'f1'	`metrics.f1_score`	for binary targets
'f1_micro'	`metrics.f1_score`	micro-averaged
'f1_macro'	`metrics.f1_score`	macro-averaged
'f1_weighted'	`metrics.f1_score`	weighted average
'f1_samples'	`metrics.f1_score`	by multilabel sample
'neg_log_loss'	`metrics.log_loss`	requires `predict_proba` support
'precision' etc.	`metrics.precision_score`	suffixes apply as with 'f1'
'recall' etc.	`metrics.recall_score`	suffixes apply as with 'f1'
'roc_auc'	`metrics.roc_auc_score`
Clustering
'adjusted_mutual_info_score'	`metrics.adjusted_mutual_info_score`
'adjusted_rand_score'	`metrics.adjusted_rand_score`
'completeness_score'	`metrics.completeness_score`
'fowlkes_mallows_score'	`metrics.fowlkes_mallows_score`
'homogeneity_score'	`metrics.homogeneity_score`
'mutual_info_score'	`metrics.mutual_info_score`
'normalized_mutual_info_score'	`metrics.normalized_mutual_info_score`
'v_measure_score'	`metrics.v_measure_score`
Regression
'explained_variance'	`metrics.explained_variance_score`
'neg_mean_absolute_error'	`metrics.mean_absolute_error`
'neg_mean_squared_error'	`metrics.mean_squared_error`
'neg_mean_squared_log_error'	`metrics.mean_squared_log_error`
'neg_median_absolute_error'	`metrics.median_absolute_error`
'r2'	`metrics.r2_score`

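Note: the 'roc_auc' scorer is meant for binary classification. The default value of scoring is None, in which case the estimator's own score method (here clf.score) is used, giving one value per fold.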
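To make the scoring parameter concrete, here is a small sketch that compares the default behaviour with a named scorer from the table. It assumes the same iris data and linear SVC as the earlier example; 'f1_macro' is picked only for illustration.

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)

# scoring=None (the default) falls back to clf.score, which is accuracy for SVC.
default_scores = cross_val_score(clf, iris.data, iris.target, cv=5)
# A string from the table selects a predefined scorer instead.
f1_scores = cross_val_score(clf, iris.data, iris.target, cv=5,
                            scoring='f1_macro')
print(default_scores)
print(f1_scores)
```

Scorers whose names start with neg_ return negated losses, so that a higher value is always better.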

The cross_validate function differs from cross_val_score in two ways:

It allows specifying multiple metrics for evaluation.
In addition to the test score, it returns a dict containing fit times and score times, plus training scores (which may need return_train_score=True, depending on the scikit-learn version).
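A minimal sketch of cross_validate with two metrics, again assuming the iris data and linear SVC from the earlier example. The choice of 'precision_macro' and 'recall_macro' is an illustrative assumption, and return_train_score=True is passed explicitly so that training scores are included.

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_validate

iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)

# Evaluate two metrics at once; each one yields an array with one value per fold.
results = cross_validate(
    clf, iris.data, iris.target, cv=5,
    scoring=['precision_macro', 'recall_macro'],
    return_train_score=True)

# The dict holds timings plus test_<metric> and train_<metric> entries.
for key in sorted(results):
    print(key, results[key])
```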
