scikit-learn usage notes

1. Classifiers

1.1. Logistic regression

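  Logistic regression in sklearn can be fit with several solvers: solver {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}, default='lbfgs'. When solver is 'sag', 'saga' or 'liblinear', the data is shuffled during fitting, so random_state should be set to get reproducible results (the documentation describes it as "The seed of the pseudo random number generator to use when shuffling the data").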

Changed in version 0.22: The default solver changed from ‘liblinear’ to ‘lbfgs’ in 0.22.

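  So with scikit-learn versions below 0.22, the default solver is 'liblinear', and a random seed must be set via random_state to get reproducible results with the default parameters.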
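
As a minimal sketch of the point about seeds, the snippet below prints the installed version and its default solver, then fits with solver='saga' twice using the same random_state and checks that the coefficients match. The helper name fit_coef and the StandardScaler step are assumptions added for illustration; they are not part of the original notes.

```python
import numpy as np
import sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Report the installed version and the default solver it ships with.
print(sklearn.__version__, LogisticRegression().solver)

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)  # scaling helps the stochastic solvers converge


def fit_coef(seed):
    # Hypothetical helper: 'saga' shuffles the data, so random_state matters here.
    clf = LogisticRegression(solver="saga", random_state=seed, max_iter=1000)
    return clf.fit(X, y).coef_


coef_a = fit_coef(0)
coef_b = fit_coef(0)
print(np.allclose(coef_a, coef_b))  # same seed, same shuffling order
```
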
  If fitting a logistic regression produces the following warning:

ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)

  In that case, increasing max_iter is enough. Scaling the data, as the warning itself suggests, is another option.

Before:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
lr = LogisticRegression(random_state=0)
lr.fit(X, y)
print(lr.score(X, y))

After:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
lr = LogisticRegression(random_state=0, max_iter=5000)
lr.fit(X, y)
print(lr.score(X, y))
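
The warning also points to feature scaling as a remedy. The following is a rough sketch of that route, assuming a StandardScaler in front of the classifier is acceptable for the data at hand: it keeps the default max_iter, records any ConvergenceWarning raised during fit, and prints the iteration count. The variable names are illustrative only.

```python
import warnings

from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Standardize the features, then fit with the default max_iter.
model = make_pipeline(StandardScaler(), LogisticRegression(random_state=0))

# Record warnings so we can check whether the solver complained.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model.fit(X, y)

converged = not any(issubclass(w.category, ConvergenceWarning) for w in caught)
n_iter = model.named_steps["logisticregression"].n_iter_
print("converged:", converged, "iterations:", n_iter)
print(model.score(X, y))
```

Whether scaling or a larger max_iter is the better fix depends on the data; the two can also be combined.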

  Each call to fit re-initializes coef_ and intercept_, as this excerpt from fit shows:

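def fit(self, X, y, sample_weight=None):
    # ... (excerpt; n_classes is computed earlier in fit)
    self.coef_ = list()                    # coefficients are reset on every call
    self.intercept_ = np.zeros(n_classes)  # intercepts restart from zero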

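  If we need to call fit several times and want each call to continue from where the previous one stopped, it is enough to pass warm_start=True.
warm_start: bool, default False. When True, the solution of the previous call to fit is reused as the initialization; otherwise the previous solution is erased. It has no effect with the liblinear solver.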

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
import sklearn

X, y = load_iris(return_X_y=True)
lr = LogisticRegression(random_state=0, max_iter=5000, warm_start=True)
lr.fit(X, y)
print(lr.coef_)
print(lr.intercept_)

[Figure: coef_ and intercept_ after the first fit]

lr.fit(X, y)
print(lr.coef_)
print(lr.intercept_)

[Figure: coef_ and intercept_ after the second fit]
  The coef_ and intercept_ values from the two fits differ slightly, which shows that the warm_start parameter takes effect.
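
To make the effect easier to see, here is a small sketch that contrasts warm_start=False with warm_start=True. It deliberately uses a very small max_iter so that a single fit does not converge, which is an assumption made only for this illustration, and it names the lbfgs solver explicitly. The helper fit_twice is hypothetical.

```python
import warnings

import numpy as np
from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

# max_iter is tiny on purpose, so silence the expected convergence warnings.
warnings.simplefilter("ignore", ConvergenceWarning)

X, y = load_iris(return_X_y=True)


def fit_twice(warm_start):
    # Hypothetical helper: fit the same estimator twice and keep both coef_ arrays.
    clf = LogisticRegression(solver="lbfgs", max_iter=5, warm_start=warm_start)
    first = clf.fit(X, y).coef_.copy()
    second = clf.fit(X, y).coef_.copy()
    return first, second


cold_first, cold_second = fit_twice(warm_start=False)
warm_first, warm_second = fit_twice(warm_start=True)

# Without warm start, every fit restarts from scratch and repeats itself.
print(np.allclose(cold_first, cold_second))
# With warm start, the second fit continues from the first one's solution.
print(np.allclose(warm_first, warm_second))
```

With a large max_iter, as in the example above, the first fit is already close to the optimum, so the difference between the two warm-started fits is much smaller.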

Reposted from blog.csdn.net/herosunly/article/details/105763460