How to perform feature selection (rfecv) in cross validation in sklearn

EmJ :

I want to perform recursive feature elimination with cross validation (rfecv) in 10-fold cross validation (i.e. cross_val_predict or cross_validate) in sklearn.

Since rfecv itself has a cross validation part in its name, I am not clear how to do it. My current code is as follows.

from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target

from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(random_state = 0, class_weight="balanced")

k_fold = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

rfecv = RFECV(estimator=clf, step=1, cv=k_fold)

Please let me know how I can use the data X and y with rfecv in 10-fold cross validation.

I am happy to provide more details if needed.

desertnaut :

To use recursive feature elimination in conjunction with a pre-defined k_fold, you should use RFE and not RFECV:

from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target

k_fold = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
clf = RandomForestClassifier(random_state = 0, class_weight="balanced")
selector = RFE(clf, 5, step=1)

cv_acc = []

for train_index, val_index in k_fold.split(X, y):
    selector.fit(X[train_index], y[train_index])
    pred = selector.predict(X[val_index])
    acc = accuracy_score(y[val_index], pred)
    cv_acc.append(acc)

cv_acc
# result:
[1.0,
 0.9333333333333333,
 0.9333333333333333,
 1.0,
 0.9333333333333333,
 0.9333333333333333,
 0.8666666666666667,
 1.0,
 0.8666666666666667,
 0.9333333333333333]

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=368393&siteId=1
Recommended