sklearn.feature_selection explained

class sklearn.feature_selection.SelectKBest(score_func=f_classif, k=10)
Purpose: select features according to the k highest scores.
In other words, only the k highest-scoring features are kept.
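
To make "the k highest scores" concrete, here is a minimal pure-Python sketch of the selection step alone. The helper name select_k_best_columns and the toy scores are hypothetical, and this is not how scikit-learn implements it internally. It only illustrates the idea of ranking columns by score and keeping the top k in their original order.

```python
def select_k_best_columns(rows, scores, k):
    """Keep the k columns of `rows` with the largest scores (illustration only)."""
    # Rank column indices by score, highest first, and take the first k.
    top = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
    # Restore the original column order so the output stays aligned with the input.
    keep = sorted(top)
    return keep, [[row[j] for j in keep] for row in rows]


# Toy data: 2 samples, 4 features, with made-up per-feature scores.
rows = [[1, 10, 100, 1000],
        [2, 20, 200, 2000]]
scores = [0.3, 2.5, 0.1, 1.7]

keep, reduced = select_k_best_columns(rows, scores, k=2)
print(keep)     # [1, 3]
print(reduced)  # [[10, 1000], [20, 2000]]
```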

Parameters:
score_func : callable
Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues) or a single array with scores. Default is f_classif, which only works with classification tasks.
k : int or "all", optional, default=10
Number of top features to select. The "all" option bypasses selection, for use in a parameter search.
That is, the K highest-scoring features are returned.
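
As a rough illustration of the two parameters, the sketch below passes a custom score_func that returns a single array of scores, and then uses k="all" to bypass selection. The scorer variance_score is a hypothetical example written for this note, not part of scikit-learn, and it ignores y. A real scorer would normally measure the relationship between each feature and the target.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest


def variance_score(X, y):
    # Hypothetical scorer: one score per feature (its variance); y is unused.
    # Returning a single array means no p-values are available.
    return np.var(X, axis=0)


X, y = load_digits(return_X_y=True)

# Keep the 20 features with the highest custom score.
selector = SelectKBest(score_func=variance_score, k=20)
X_top = selector.fit_transform(X, y)
print(X_top.shape)        # (1797, 20)
print(selector.pvalues_)  # None

# k="all" keeps every feature, so the output has the same shape as X.
X_all = SelectKBest(score_func=variance_score, k="all").fit_transform(X, y)
print(X_all.shape)        # (1797, 64)
```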

Methods:
fit(X, y) Run score function on (X, y) and get the appropriate features.
Scores each feature of X against the target y.
fit_transform(X[, y]) Fit to data, then transform it.
Returns X with only the K highest-scoring features kept.
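
The difference between the two methods can be sketched as follows, assuming the digits dataset and the chi2 scorer. After fit, the per-feature scores are available in the scores_ attribute and get_support reports which columns were kept. Calling transform on the fitted selector should give the same result as a single fit_transform call.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_digits(return_X_y=True)

# Step 1: fit only scores the features; X itself is not changed.
selector = SelectKBest(chi2, k=20)
selector.fit(X, y)
print(selector.scores_.shape)  # (64,), one score per input feature

# Indices of the 20 columns that were selected, in ascending order.
kept = selector.get_support(indices=True)

# Step 2: transform drops every column that was not selected.
X_a = selector.transform(X)

# fit_transform performs both steps in one call.
X_b = SelectKBest(chi2, k=20).fit_transform(X, y)

print(np.array_equal(X_a, X_b))         # True
print(np.array_equal(X_a, X[:, kept]))  # True
```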

Examples:

>>> from sklearn.datasets import load_digits
>>> from sklearn.feature_selection import SelectKBest, chi2
>>> X, y = load_digits(return_X_y=True)
>>> X.shape
(1797, 64)
>>> X_new = SelectKBest(chi2, k=20).fit_transform(X, y)
>>> X_new.shape
(1797, 20)
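
The parameter description says that k="all" exists for use in a parameter search. One way this might look is sketched below. The pipeline step names "select" and "clf", the GaussianNB classifier, and the candidate values of k are all assumptions chosen for illustration. The search treats k as a hyperparameter, and "all" lets "no selection" compete with the other candidates.

```python
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)

# Feature selection followed by a simple classifier (an arbitrary choice).
pipe = Pipeline([
    ("select", SelectKBest(chi2)),
    ("clf", GaussianNB()),
])

# Search over k; "all" means the selector passes every feature through.
param_grid = {"select__k": [10, 20, "all"]}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)  # whichever k gave the best cross-validated score
```

Putting the selector inside the pipeline means it is refit on each training fold, so the held-out fold does not influence which features are chosen.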

Reposted from blog.csdn.net/Du_Shuang/article/details/84338642