sklearn Modules in Detail

Copyright notice: This is the author's original article, licensed under CC 4.0 BY-SA. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/weixin_42297855/article/details/97917976

import sklearn

linear_model

Generalized linear models in detail

  1. .LinearRegression(fit_intercept=True, normalize=False, copy_X=True, n_jobs=None)
  2. .Ridge(alpha=1.0, fit_intercept=True, normalize=False, copy_X=True, max_iter=None, tol=0.001, solver='auto', random_state=None)
  3. .RidgeCV(alphas=(0.1, 1.0, 10.0), fit_intercept=True, normalize=False, scoring=None, cv=None, gcv_mode=None, store_cv_values=False)
  4. .Lasso
  5. .MultiTaskLasso
  6. .ElasticNet
  7. .MultiTaskElasticNet
  8. .LassoLars
  9. .OrthogonalMatchingPursuit.orthogonal_mp
  10. .BayesianRidge
  11. .ARDRegression
  12. .LogisticRegression
  13. .SGDClassifier(loss='hinge', penalty='l2', alpha=0.0001, l1_ratio=0.15, fit_intercept=True, max_iter=1000, tol=0.001, shuffle=True, verbose=0, epsilon=0.1, n_jobs=None, random_state=None, learning_rate='optimal', eta0=0.0, power_t=0.5, early_stopping=False, validation_fraction=0.1, n_iter_no_change=5, class_weight=None, warm_start=False, average=False)

loss: classification loss functions: 'hinge', 'log', 'modified_huber', 'squared_hinge', 'perceptron'; regression loss functions: 'squared_loss', 'huber', 'epsilon_insensitive', or 'squared_epsilon_insensitive'
penalty: the penalty (regularization) term; default 'l2'
max_iter: the maximum number of iterations.

  14. .SGDRegressor(loss='squared_loss', penalty='l2', alpha=0.0001, l1_ratio=0.15, fit_intercept=True, max_iter=1000, tol=0.001, shuffle=True, verbose=0, epsilon=0.1, random_state=None, learning_rate='invscaling', eta0=0.01, power_t=0.25, early_stopping=False, validation_fraction=0.1, n_iter_no_change=5, warm_start=False, average=False)

loss: same as above.

  15. .Perceptron
  16. .PassiveAggressiveClassifier
  17. .HuberRegressor

Methods

Below, clf refers to any of the classifiers or regressors above.

  1. clf.fit(X_train,y_train)
  2. clf.predict(X_test)

Attributes

  1. clf.coef_: the non-constant (feature) coefficients.
  2. clf.intercept_: the constant term (intercept).
  3. clf.decision_function: the decision function.
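
As a minimal, self-contained sketch of the fit/predict workflow above (the toy data is purely illustrative):

from sklearn.linear_model import LinearRegression
import numpy as np

# Toy data: y ≈ 2x + 1 with a little noise
rng = np.random.RandomState(0)
X_train = rng.rand(50, 1)
y_train = 2 * X_train.ravel() + 1 + 0.01 * rng.randn(50)

clf = LinearRegression(fit_intercept=True)
clf.fit(X_train, y_train)
print(clf.coef_, clf.intercept_)  # roughly [2.] and 1.0
clf.predict([[0.5]])              # roughly [2.]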

discriminant_analysis

Discriminant analysis
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA: linear discriminant analysis
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA: quadratic discriminant analysis

  1. LDA(solver='svd', shrinkage=None, priors=None, n_components=None, store_covariance=False, tol=0.0001)
  • Parameters

solver: 'svd' (singular value decomposition) is the default solver; it does not compute the covariance matrix, so it is recommended for data with many features. 'lsqr': least-squares solution. 'eigen': eigenvalue decomposition.
shrinkage: the shrinkage parameter; it can improve the estimate of the covariance matrix when the number of training samples is small compared with the number of features. It may be set to 'auto' or a number in [0, 1]; 'auto' requires solver to be 'lsqr' or 'eigen'.

[Figure from the sklearn documentation showing how shrinkage performs at different sample sizes.]

priors
n_components: the number of components kept for dimensionality reduction (at most n_classes - 1).
store_covariance
tol: the threshold used for rank estimation in the SVD solver.

  • Attributes

coef_ :
Weight vector(s).

intercept_ :
Intercept term.

covariance_ :
Covariance matrix (shared by all classes).

explained_variance_ratio_ :
Percentage of variance explained by each of the selected components. If n_components is not set then all components are stored and the sum of explained variances is equal to 1.0. Only available when eigen or svd solver is used.

means_ : array-like, shape (n_classes, n_features)
Class means.

priors_ : array-like, shape (n_classes,)
Class priors (sum to 1).

scalings_ : array-like, shape (rank, n_classes - 1)
Scaling of the features in the space spanned by the class centroids.

xbar_ : array-like, shape (n_features,)
Overall mean.

classes_ : array-like, shape (n_classes,)
Unique class labels.

  • Methods

decision_function(self, X) Predict confidence scores for samples.
fit(self, X, y) Fit LinearDiscriminantAnalysis model according to the given training data and parameters.
fit_transform(self, X[, y]) Fit to data, then transform it.
get_params(self[, deep]) Get parameters for this estimator.
predict(self, X) Predict class labels for samples in X.
predict_log_proba(self, X) Estimate log probability.
predict_proba(self, X) Estimate probability.
score(self, X, y[, sample_weight]) Returns the mean accuracy on the given test data and labels.
set_params(self, **params) Set the parameters of this estimator.
transform(self, X) Project data to maximize class separation.
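
A brief sketch of LDA used both as a classifier and for dimensionality reduction (the built-in iris data is used only for illustration):

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
lda = LDA(solver='svd', n_components=2)
X_proj = lda.fit_transform(X, y)      # project onto the 2 discriminant axes
lda.predict(X[:5])                    # class labels for the first 5 samples
print(lda.explained_variance_ratio_)  # available for the 'svd' and 'eigen' solvers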

  2. QDA(priors=None, reg_param=0.0, store_covariance=False, tol=0.0001)


.kernel_ridge

from sklearn.kernel_ridge import KernelRidge

  1. KernelRidge(alpha=1, kernel='linear', gamma=None, degree=3, coef0=1, kernel_params=None)
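
A minimal sketch of kernel ridge regression on synthetic data (the rbf kernel and gamma value are arbitrary illustrative choices):

from sklearn.kernel_ridge import KernelRidge
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(100, 1)
y = np.sin(2 * np.pi * X).ravel() + 0.1 * rng.randn(100)

model = KernelRidge(alpha=1.0, kernel='rbf', gamma=5.0)
model.fit(X, y)
model.predict(X[:3])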


.svm

from sklearn import svm

Classification

  1. svm.SVC(C=1.0, kernel='rbf', degree=3, gamma='auto_deprecated', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', random_state=None)

C: the regularization parameter
kernel: the kernel function: 'linear', 'poly' (polynomial), 'rbf', 'sigmoid', or a custom kernel.
degree: when kernel='poly', the degree of the polynomial.
gamma: for non-linear kernels, the value of the kernel coefficient γ.
coef0: when kernel is 'poly' or 'sigmoid', the independent term r.
shrinking: whether to use the shrinking heuristic
probability: whether to enable probability estimates
tol: the stopping tolerance
cache_size: the kernel cache size (MB)
class_weight: used for class-imbalance problems; per-sample weights can instead be passed to fit via sample_weight.
verbose
max_iter
decision_function_shape: 'ovo' means one-vs-one; 'ovr' means one-vs-rest.
random_state: omitted
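
A usage sketch with the parameters above (gamma='scale' is assumed here as the modern replacement for the deprecated 'auto' default):

from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = svm.SVC(C=1.0, kernel='rbf', gamma='scale', decision_function_shape='ovr')
clf.fit(X_train, y_train)
clf.score(X_test, y_test)  # mean accuracy on the test set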


  2. svm.NuSVC(nu=0.5, kernel='rbf', degree=3, gamma='auto_deprecated', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', random_state=None)

  3. svm.LinearSVC(penalty='l2', loss='squared_hinge', dual=True, tol=0.0001, C=1.0, multi_class='ovr', fit_intercept=True, intercept_scaling=1, class_weight=None, verbose=0, random_state=None, max_iter=1000)

Regression

  1. svm.SVR(kernel='rbf', degree=3, gamma='auto_deprecated', coef0=0.0, tol=0.001, C=1.0, epsilon=0.1, shrinking=True, cache_size=200, verbose=False, max_iter=-1)

  2. svm.NuSVR(nu=0.5, C=1.0, kernel='rbf', degree=3, gamma='auto_deprecated', coef0=0.0, shrinking=True, tol=0.001, cache_size=200, verbose=False, max_iter=-1)

  3. svm.LinearSVR(epsilon=0.0, tol=0.0001, C=1.0, loss='epsilon_insensitive', fit_intercept=True, intercept_scaling=1.0, dual=True, verbose=0, random_state=None, max_iter=1000)

Attributes:
support_vectors_: the support vectors
support_: the indices of the support vectors
n_support_: the number of support vectors for each class
.decision_function
dual_coef_: the dual coefficients: y_i * alpha_i for classification, alpha_i - alpha_i^* for regression.
intercept_: omitted



.neighbors

  1. .NearestNeighbors(n_neighbors=5, radius=1.0, algorithm='auto', leaf_size=30, metric='minkowski', p=2, metric_params=None, n_jobs=None, **kwargs)

n_neighbors: the number of neighbors, i.e., how many of the nearest samples are used.
radius
algorithm: the search algorithm: 'auto', 'ball_tree', 'kd_tree', 'brute'


  2. .KDTree

  3. .BallTree

Methods: kneighbors(X) returns the distances to and the indices of the nearest neighbors; radius_neighbors(X) returns the neighbors within a given radius.
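
A small sketch of the kneighbors query (toy points for illustration):

from sklearn.neighbors import NearestNeighbors
import numpy as np

X = np.array([[0., 0.], [1., 1.], [2., 2.], [3., 3.]])
nn = NearestNeighbors(n_neighbors=2, algorithm='auto').fit(X)
distances, indices = nn.kneighbors([[1.1, 1.0]])
print(indices)    # the two training points closest to the query
print(distances)  # their distances to the query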



.tree

  1. .DecisionTreeClassifier(criterion='gini', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, class_weight=None, presort=False)

  2. .DecisionTreeRegressor(criterion='mse', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, presort=False)

Methods:
clf.predict_proba([[2., 2.]]): predicts the probability of each class, i.e., the fraction of training samples of the same class in the leaf the sample falls into.
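
A short sketch of the behavior described above (toy data only):

from sklearn.tree import DecisionTreeClassifier

X = [[0., 0.], [1., 1.], [2., 2.], [3., 3.]]
y = [0, 0, 1, 1]
clf = DecisionTreeClassifier(criterion='gini').fit(X, y)
clf.predict_proba([[2., 2.]])  # fraction of same-class training samples in the leaf
clf.predict([[2., 2.]])        # array([1])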



.ensemble

Ensemble learning

  1. .AdaBoostRegressor(base_estimator=None, n_estimators=50, learning_rate=1.0, loss='linear', random_state=None)

n_estimators: the number of base learners
base_estimator: the type of base learner; defaults to .tree.DecisionTreeRegressor(max_depth=3)


  2. .AdaBoostClassifier(base_estimator=None, n_estimators=50, learning_rate=1.0, algorithm='SAMME.R', random_state=None)
  3. .BaggingClassifier(base_estimator=None, n_estimators=10, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, warm_start=False, n_jobs=None, random_state=None, verbose=0)
  4. .BaggingRegressor(base_estimator=None, n_estimators=10, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, warm_start=False, n_jobs=None, random_state=None, verbose=0)
  5. .RandomForestClassifier(n_estimators='warn', criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, class_weight=None)
  6. .RandomForestRegressor(n_estimators='warn', criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False)
  7. .ExtraTreesClassifier(n_estimators='warn', criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=False, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, class_weight=None)
  8. .ExtraTreesRegressor(n_estimators='warn', criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=False, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False)
  9. .VotingClassifier(estimators, voting='hard', weights=None, n_jobs=None, flatten_transform=True)

voting: the default 'hard' uses majority voting on the predicted class labels; 'soft' averages the predicted class probabilities and picks the class with the largest average (every estimator must support predict_proba). See the sketch after this list.

  10. .VotingRegressor(estimators, weights=None, n_jobs=None)
  11. .GradientBoostingClassifier(loss='deviance', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, min_impurity_split=None, init=None, random_state=None, max_features=None, verbose=0, max_leaf_nodes=None, warm_start=False, presort='auto', validation_fraction=0.1, n_iter_no_change=None, tol=0.0001)
  12. .GradientBoostingRegressor(loss='ls', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, min_impurity_split=None, init=None, random_state=None, max_features=None, alpha=0.9, verbose=0, max_leaf_nodes=None, warm_start=False, presort='auto', validation_fraction=0.1, n_iter_no_change=None, tol=0.0001)
  13. .HistGradientBoostingClassifier(loss='auto', learning_rate=0.1, max_iter=100, max_leaf_nodes=31, max_depth=None, min_samples_leaf=20, l2_regularization=0.0, max_bins=256, scoring=None, validation_fraction=0.1, n_iter_no_change=None, tol=1e-07, verbose=0, random_state=None): performs much better than GradientBoostingClassifier when the dataset is large.
  14. .HistGradientBoostingRegressor(loss='least_squares', learning_rate=0.1, max_iter=100, max_leaf_nodes=31, max_depth=None, min_samples_leaf=20, l2_regularization=0.0, max_bins=256, scoring=None, validation_fraction=0.1, n_iter_no_change=None, tol=1e-07, verbose=0, random_state=None)
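
A sketch of the soft-voting behavior described above (the choice of base estimators is an arbitrary illustration):

from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
eclf = VotingClassifier(
    estimators=[('lr', LogisticRegression(max_iter=1000)),
                ('rf', RandomForestClassifier(n_estimators=100))],
    voting='soft')  # average predict_proba outputs, then take the argmax
eclf.fit(X, y)
eclf.predict(X[:3])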


.feature_selection

.VarianceThreshold(threshold=0.0)
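
A minimal sketch: with the default threshold of 0.0, only zero-variance (constant) features are removed:

from sklearn.feature_selection import VarianceThreshold

X = [[0, 2, 0], [0, 1, 4], [0, 1, 1]]
sel = VarianceThreshold(threshold=0.0)
sel.fit_transform(X)  # the first (constant) column is dropped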



.preprocessing

from sklearn import preprocessing

Data standardization

  1. preprocessing.StandardScaler()
    The scaler stores the mean and standard deviation of the training data, so the same standardization can later be applied to test data.
    Example:
scaler = preprocessing.StandardScaler().fit(X_train) 
X_test_transformed = scaler.transform(X_test)
  2. preprocessing.scale(X_train): standardizes the data to zero mean and unit variance.
  3. min_max_scaler = preprocessing.MinMaxScaler(): scales the data to the [0, 1] range.
    Example:
min_max_scaler = preprocessing.MinMaxScaler()
X_train_minmax = min_max_scaler.fit_transform(X_train)
X_test_minmax = min_max_scaler.transform(X_test)
  4. max_abs_scaler = preprocessing.MaxAbsScaler(): same usage as MinMaxScaler, but the range becomes [-1, 1].

Feature encoding

  1. preprocessing.OneHotEncoder
  2. preprocessing.OrdinalEncoder
  3. preprocessing.LabelEncoder
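
A short sketch of the three encoders (the category values are illustrative):

from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, LabelEncoder

X = [['red'], ['green'], ['red']]
OneHotEncoder().fit_transform(X).toarray()     # one binary column per category
OrdinalEncoder().fit_transform(X)              # integer codes, one per feature column
LabelEncoder().fit_transform(['a', 'b', 'a'])  # for target labels: array([0, 1, 0])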

model_selection

from sklearn import model_selection

Dataset splitting

  1. X_train, X_test, y_train, y_test = model_selection.train_test_split(data, target, test_size=0.4, random_state=0, stratify=None)

test_size: the proportion of the dataset to allocate to the test set
stratify: the variable used for stratified sampling; the split preserves this variable's class distribution.

  2. ShuffleSplit

Shuffles the data, then splits it.
n_splits: the k value, i.e., how many times to re-shuffle and split the data.

Cross-validation

  1. model_selection.KFold(n_splits='warn', shuffle=False, random_state=None)

Example: kf = KFold(n_splits=2)
kf.split(X_train, y_train)

n_splits
shuffle: whether to shuffle the data before splitting; default False.

  2. RepeatedKFold

n_splits
n_repeats: the number of repetitions

  3. LeaveOneOut
  4. LeavePOut

p: the value of p

  5. StratifiedKFold
  6. GroupKFold
  7. LeaveOneGroupOut
  8. LeavePGroupsOut
  9. GroupShuffleSplit
  10. TimeSeriesSplit
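
A sketch of iterating over the KFold splits (synthetic data for illustration):

from sklearn.model_selection import KFold
import numpy as np

X = np.arange(10).reshape(5, 2)
y = np.array([0, 1, 0, 1, 0])
kf = KFold(n_splits=5)
for train_idx, test_idx in kf.split(X):
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    # fit and evaluate a model on each fold here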

Hyperparameter search

1. model_selection.GridSearchCV(estimator, param_grid, scoring=None, n_jobs=None, iid='warn', refit=True, cv='warn', verbose=0, pre_dispatch='2*n_jobs', error_score='raise-deprecating', return_train_score=False): grid search
Parameters:

estimator: the learner
param_grid: the parameter space, given as a dict; to search several parameter spaces, wrap them in a list.
scoring: the evaluation criterion; if not specified, the learner's own scoring method is used
n_jobs: the number of jobs to run in parallel; the default None means 1; -1 uses all CPUs.
iid
refit
cv: see below under cross_val_score
verbose
pre_dispatch
error_score
return_train_score

Attributes:

cv_results_: the results of the grid search
best_estimator_: the best learner
best_params_: the best parameters
best_score_: the best score
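
A sketch of a grid search over SVC hyperparameters (the grid values are arbitrary illustrations):

from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
search = GridSearchCV(SVC(), param_grid, cv=5, n_jobs=-1)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
search.best_estimator_.score(X_test, y_test)  # refit=True retrains on the whole training set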

2. model_selection.RandomizedSearchCV(estimator, param_distributions, n_iter=10, scoring=None, n_jobs=None, iid='warn', refit=True, cv='warn', verbose=0, pre_dispatch='2*n_jobs', random_state=None, error_score='raise-deprecating', return_train_score=False): random search
Parameters:

estimator: same as above
param_distributions: the parameter distributions, written like param_grid above, except that each dict value is a random distribution; if a list is given instead, it is sampled uniformly.
n_iter
scoring: same as above
n_jobs: same as above
iid
refit
cv: same as above
verbose
pre_dispatch
random_state: same as above
error_score
return_train_score

The attributes are the same as for GridSearchCV.

Evaluation

1. cross_val_score
Example: scores = cross_val_score(clf, data, target, cv=5)
clf: the classifier
data
target
cv: when cv is an integer, the KFold or stratified-fold strategy is used by default; the latter is used when the estimator derives from ClassifierMixin. Other cross-validation iterators, or a custom iterator, can also be supplied.
scoring: the scoring method; see the sklearn scoring documentation for details
2. cross_validate

.cross_validation

from sklearn.cross_validation import KFold

Note: the old sklearn.cross_validation module is deprecated and was removed in scikit-learn 0.20; use sklearn.model_selection instead.

.metrics

Evaluation metrics in detail
Classifier metrics:

  1. .accuracy_score(y_true, y_pred, normalize=True, sample_weight=None)

normalize: by default, returns the accuracy; if False, returns the number of correctly classified samples.

  2. .balanced_accuracy_score(y_true, y_pred, sample_weight=None, adjusted=False)

adjusted

  3. .average_precision_score(y_true, y_score, average='macro', pos_label=1, sample_weight=None)
  4. .recall_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)
  5. .precision_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)
  6. .f1_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)
  7. .log_loss(y_true, y_pred, eps=1e-15, normalize=True, sample_weight=None, labels=None)
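
A small sketch of the basic classification metrics (hand-made labels for illustration):

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
accuracy_score(y_true, y_pred)   # 0.75
precision_score(y_true, y_pred)  # 1.0
recall_score(y_true, y_pred)     # 0.5
f1_score(y_true, y_pred)         # 0.666...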

scorer.make_scorer
mean_squared_error
average_precision_score
brier_score_loss
log_loss
jaccard_score
roc_auc_score

Clustering

adjusted_mutual_info_score
adjusted_rand_score
completeness_score
fowlkes_mallows_score
homogeneity_score
mutual_info_score
normalized_mutual_info_score
v_measure_score

Regression

explained_variance_score
max_error
mean_absolute_error
mean_squared_error
mean_squared_log_error
median_absolute_error
r2_score

pipeline

1. make_pipeline
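
A sketch of chaining preprocessing and a model with make_pipeline (the step choices are illustrative):

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
pipe = make_pipeline(StandardScaler(), SVC(C=1.0))  # the scaler is fit on the data passed to fit
pipe.fit(X, y)
pipe.predict(X[:3])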

datasets

1. load_iris
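
A one-line sketch:

from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)  # features of shape (150, 4) and labels of shape (150,)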

