TypeError: __init__() got multiple values for argument 'shuffle'

The following code comes from a bank card fraud detection project:

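from sklearn.cross_validation import KFold  # old module, removed in scikit-learn 0.20
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

def printing_Kfold_scores(x_train_data, y_train_data):
    # Old API: KFold(n, n_folds, shuffle); the object itself is iterable
    fold = KFold(len(y_train_data), 5, shuffle=False)
    recall_accs = []
    for iteration, indices in enumerate(fold, start=1):
        # indices[0] holds the training rows, indices[1] the validation rows
        # c_param (regularization strength) is defined elsewhere in the project
        lr = LogisticRegression(C=c_param, penalty='l1')
        lr.fit(x_train_data.iloc[indices[0], :], y_train_data.iloc[indices[0], :].values.ravel())
        y_pred_undersample = lr.predict(x_train_data.iloc[indices[1], :].values)
        recall_acc = recall_score(y_train_data.iloc[indices[1], :].values, y_pred_undersample)
        recall_accs.append(recall_acc)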

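There is nothing wrong with this code in itself, but because of library version differences (sklearn.cross_validation was removed in scikit-learn 0.20), some people get the following error when they run it: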

ModuleNotFoundError: No module named 'sklearn.cross_validation'

To fix this, they change from sklearn.cross_validation import KFold to from sklearn.model_selection import KFold, run the code again, and hit a new problem:

TypeError: __init__() got multiple values for argument 'shuffle'

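Why does this happen? The two modules define KFold with different constructor arguments. In the model_selection version that raises this error, the first positional argument is the number of folds and the second is shuffle, so in KFold(len(y_train_data), 5, shuffle=False) the value 5 is bound to shuffle and shuffle=False then supplies it a second time. If you import it with from sklearn.cross_validation import KFold, then: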

KFold(n, 5, shuffle=False)  # n is the total number of samples; three arguments are passed

But if you import it with from sklearn.model_selection import KFold, then:

fold = KFold(5, shuffle=False)  # no need to pass n
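To see why Python reports "multiple values" here, the clash can be reproduced without scikit-learn at all. The sketch below uses two hypothetical stand-in classes, OldStyleKFold and NewStyleKFold, which only mimic the parameter order of the two constructors and are not part of any library. The helper try_call is also made up for this illustration.

```python
# Hypothetical stand-ins that only mimic the two constructor signatures;
# they are NOT scikit-learn classes.
class OldStyleKFold:
    """Mimics the sklearn.cross_validation order: (n, n_folds, shuffle)."""

    def __init__(self, n, n_folds=3, shuffle=False):
        self.n, self.n_folds, self.shuffle = n, n_folds, shuffle


class NewStyleKFold:
    """Mimics the early sklearn.model_selection order: (n_splits, shuffle)."""

    def __init__(self, n_splits=3, shuffle=False):
        self.n_splits, self.shuffle = n_splits, shuffle


def try_call(cls, *args, **kwargs):
    """Return 'ok' if the constructor accepts the arguments, else the TypeError text."""
    try:
        cls(*args, **kwargs)
        return "ok"
    except TypeError as exc:
        return str(exc)


# Three arguments fit the old parameter order.
old_result = try_call(OldStyleKFold, 100, 5, shuffle=False)
# With the new order, 5 lands on shuffle, and shuffle=False collides with it.
new_result = try_call(NewStyleKFold, 100, 5, shuffle=False)
# Dropping the sample count resolves the clash.
fixed_result = try_call(NewStyleKFold, 5, shuffle=False)

print(old_result)
print(new_result)
print(fixed_result)
```

The second call fails for the same reason as the real one: a positional value and a keyword both target the shuffle parameter.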

The corrected code is as follows:

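from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import KFold

def printing_Kfold_scores(x_train_data, y_train_data):
    fold = KFold(5, shuffle=False)  # new API: number of folds only, no sample count
    recall_accs = []
    # split() yields (train_indices, test_indices) for each fold
    for iteration, indices in enumerate(fold.split(x_train_data)):
        # c_param is defined elsewhere in the project; solver='liblinear' supports
        # the 'l1' penalty (the default solver in newer releases does not)
        lr = LogisticRegression(C=c_param, penalty='l1', solver='liblinear')
        lr.fit(x_train_data.iloc[indices[0], :], y_train_data.iloc[indices[0], :].values.ravel())
        y_pred_undersample = lr.predict(x_train_data.iloc[indices[1], :].values)
        recall_acc = recall_score(y_train_data.iloc[indices[1], :].values, y_pred_undersample)
        recall_accs.append(recall_acc)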

So the arguments you pass depend on which module KFold is imported from. Be sure to check them whenever you change the import.
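Besides the constructor arguments, the way folds are iterated also changes: the model_selection version is not iterated directly but through split(), which yields a pair of index arrays per fold. A minimal sketch on toy data, assuming NumPy and a scikit-learn release that provides sklearn.model_selection (the array X and the list test_folds are made up for this illustration):

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data: 10 samples with 2 features each (illustrative only).
X = np.arange(20).reshape(10, 2)

# Only the number of folds is given; the sample count comes from split(X).
fold = KFold(n_splits=5, shuffle=False)

test_folds = []
for iteration, (train_idx, test_idx) in enumerate(fold.split(X), start=1):
    # train_idx and test_idx are integer position arrays for this fold.
    test_folds.append(test_idx.tolist())
    print(iteration, train_idx.tolist(), test_idx.tolist())
```

With shuffle=False, the 10 samples are cut into 5 consecutive validation blocks of 2 rows each, and the remaining 8 rows form the training set for that fold.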


Reposted from blog.csdn.net/weixin_40283816/article/details/83242777