Model Evaluation, Selection, and Validation: Dataset Splitting

train_test_split

Prototype

sklearn.model_selection.train_test_split(*arrays, test_size=None, train_size=None, random_state=None, shuffle=True, stratify=None)
Parameters

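  • *arrays : one or more datasets to split; all must have the same number of samples
  • test_size : the size of the test set
    • float: the fraction of the original dataset to put in the test set
    • int: the number of samples in the test set
    • None: test set size = original dataset size - training set size (0.25 if train_size is also None)
  • train_size : the size of the training set
    • float: the fraction of the original dataset to put in the training set
    • int: the number of samples in the training set
    • None: training set size = original dataset size - test set size
  • random_state : seed for the shuffle applied before splitting; pass an int to get a reproducible split
  • shuffle : whether to shuffle the data before splitting (default True)
  • stratify : array of class labels used for stratified sampling, so that the training and test sets keep the class proportions of this array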
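The parameters above can be illustrated with a small sketch. The toy data below (8 samples, two balanced classes) is made up for illustration and is not from the original post; the split sizes are chosen so the arithmetic is exact.

```python
from sklearn.model_selection import train_test_split

# Toy data, assumed for illustration only: 8 samples, 4 per class.
X = list(range(8))
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Float test_size: a fraction of the dataset (0.25 * 8 = 2 test samples).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
float_sizes = (len(X_tr), len(X_te))
print(float_sizes)

# Int test_size: an absolute number of test samples.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=3, random_state=0)
int_sizes = (len(X_tr), len(X_te))
print(int_sizes)

# stratify=y: both parts keep the 50/50 class ratio of y.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y)
strat_test_labels = sorted(y_te)
print(strat_test_labels)
```

Without stratify, the class balance of each part depends on the shuffle, which is why the stratified variant is usually preferred for small or imbalanced classification datasets.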

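Return value

  • A list of length 2 * len(arrays) holding the split of each input dataset in order; each dataset is split into two parts: a training set followed by a test set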
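As a minimal sketch of the return order, the snippet below passes three arrays and unpacks six results. The weight array `w` is a hypothetical extra input added only to show that any number of arrays can be split together.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(16).reshape(8, 2)          # row i is [2*i, 2*i + 1]
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])   # y[i] = i % 2
w = np.linspace(0.1, 0.8, 8)             # hypothetical per-sample weights

parts = train_test_split(X, y, w, test_size=0.25, random_state=0)
n_parts = len(parts)                     # two results per input array
print(n_parts)

# Results come in (train, test) pairs, in the same order as the inputs.
X_train, X_test, y_train, y_test, w_train, w_test = parts

# The same shuffled indices are applied to every array, so rows stay aligned:
# each label can still be recomputed from its own row of X.
rows_aligned = bool(np.array_equal(y_train, (X_train[:, 0] // 2) % 2))
print(rows_aligned)
```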

Example

from sklearn.model_selection import train_test_split
X=[
    [1,2,3,4],
    [11,12,13,14],
    [21,22,23,24],
    [31,32,33,34],
    [41,42,43,44],
    [51,52,53,54],
    [61,62,63,64],
    [71,72,73,74]
]
y=[1,1,0,0,1,1,0,0]
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.4,random_state=0)
print('X_train:%s\nX_test:%s\ny_train:%s\ny_test:%s'%(X_train,X_test,y_train,y_test))
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.4,random_state=0,stratify=y)
print('\nStratify:\nX_train:%s\nX_test:%s\ny_train:%s\ny_test:%s'%(X_train,X_test,y_train,y_test))

KFold

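Prototype

class sklearn.model_selection.KFold(n_splits=5, shuffle=False, random_state=None)
Parameters

  • n_splits : the number of folds, at least 2 (the default was 3 before scikit-learn 0.22 and is 5 since)
  • shuffle : whether to shuffle the data before splitting it into folds
  • random_state : seed for the shuffle; it only has an effect when shuffle=True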

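Methods

  • get_n_splits([X,y,groups]) : returns the number of splitting iterations, i.e. n_splits
  • split(X[,y,groups]) : yields one (train indices, test indices) pair of arrays per fold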
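The two methods can be sketched as follows, assuming a toy array of 9 samples. The check at the end illustrates the defining property of k-fold splitting: every sample lands in the test set exactly once.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(18).reshape(9, 2)  # toy data, 9 samples

kf = KFold(n_splits=3)
n_iterations = kf.get_n_splits(X)
print(n_iterations)

# Collect the test indices produced by each fold.
test_seen = []
for train_idx, test_idx in kf.split(X):
    # Within a fold, train and test indices never overlap.
    assert set(train_idx).isdisjoint(test_idx)
    test_seen.extend(test_idx.tolist())

# Across all folds, each sample is used for testing exactly once.
covers_all_once = sorted(test_seen) == list(range(len(X)))
print(covers_all_once)
```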

Example

from sklearn.model_selection import KFold
import numpy as np

X=np.array([
    [1,2,3,4],
    [11,12,13,14],
    [21,22,23,24],
    [31,32,33,34],
    [41,42,43,44],
    [51,52,53,54],
    [61,62,63,64],
    [71,72,73,74],
    [81,82,83,84]
])
y=np.array([1,1,0,0,1,1,0,0,1])

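# random_state is omitted here: recent scikit-learn versions raise a
# ValueError when it is set while shuffle=False, since it has no effect.
folder=KFold(n_splits=3,shuffle=False)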
for train_index,test_index in folder.split(X,y):
    print('Train Index:%s\nTest Index:%s\nX_train:\n%s\nX_test:\n%s\n'%
        (train_index,test_index,X[train_index],X[test_index]))

shuffle_folder=KFold(n_splits=3,random_state=0,shuffle=True)
for train_index,test_index in shuffle_folder.split(X,y):
    print('Shuffled\nTrain Index:%s\nTest Index:%s\nX_train:\n%s\nX_test:\n%s\n'%
        (train_index,test_index,X[train_index],X[test_index]))

Reposted from blog.csdn.net/weixin_39777626/article/details/79936334