sklearn AdaBoost source code analysis


sklearn.ensemble._weight_boosting.BaseWeightBoosting.fit

for iboost in range(self.n_estimators):
    # Boosting step
    sample_weight, estimator_weight, estimator_error = self._boost(
        iboost,
        X, y,
        sample_weight,
        random_state)

    sample_weight_sum = np.sum(sample_weight)

    if iboost < self.n_estimators - 1:
        # Normalize
        sample_weight /= sample_weight_sum
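
For reference, here is a minimal runnable sketch that exercises this fit loop. The dataset and hyperparameter values are illustrative assumptions, not from the original post; note that `estimator` was named `base_estimator` before sklearn 1.2.

# Minimal sketch (assumed setup): fit an AdaBoostClassifier so the
# boosting loop above runs once per weak learner.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)

clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # decision stumps
    n_estimators=50,
    learning_rate=1.0,
    algorithm="SAMME",  # the discrete variant analyzed below; default or
                        # deprecated depending on the sklearn version
    random_state=0,
)
clf.fit(X, y)                      # runs the loop shown above
print(clf.estimator_weights_[:5])  # the alpha_m values discussed below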

sklearn.ensemble._weight_boosting.AdaBoostClassifier._boost_discrete

estimator.fit(X, y, sample_weight=sample_weight)

Here `sample_weight` is $w_{mi}$: the base estimator is fitted according to these per-sample weights.

# Instances incorrectly classified
incorrect = y_predict != y

# Error fraction
estimator_error = np.mean(
    np.average(incorrect, weights=sample_weight, axis=0))

The misclassification rate is the sum of the weights of the misclassified samples (the weights are normalized to sum to 1, so the weighted average above equals this sum): $e_m = \sum_{i=1}^{N} w_{mi} \, I(G_m(x_i) \neq y_i)$
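
A tiny numeric sketch of this computation (the weights and mask are made up for illustration):

import numpy as np

# Illustrative values: 4 samples with normalized weights.
sample_weight = np.array([0.1, 0.2, 0.3, 0.4])
incorrect = np.array([True, False, True, False])  # G_m(x_i) != y_i

# With normalized weights, the weighted average of the 0/1 mask
# equals the sum of the weights of the misclassified samples.
estimator_error = np.average(incorrect, weights=sample_weight)
print(estimator_error)  # 0.1 + 0.3 = 0.4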


If the weak learner does no better than random guessing (error $\geq 1 - 1/K$ for $K$ classes), boosting stops early.
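
A self-contained sketch of that check, paraphrasing the logic in `_boost_discrete` (the function name is mine, for illustration):

def should_stop_early(estimator_error: float, n_classes: int) -> bool:
    # With K classes, random guessing has error 1 - 1/K; a weak learner
    # that is no better than that cannot improve the ensemble.
    return estimator_error >= 1.0 - (1.0 / n_classes)

print(should_stop_early(0.55, n_classes=2))  # True: worse than a coin flip
print(should_stop_early(0.30, n_classes=2))  # False: better than random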


Calculate the coefficient of $G_m$: $\alpha_m = \frac{1}{2}\log\frac{1-e_m}{e_m}$

The SAMME algorithm extends this to the multi-class case, and multiplies the coefficient by `learning_rate` as a shrinkage factor to achieve regularization, as shown in the sketch below.
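
A standalone sketch of that coefficient computation, mirroring `_boost_discrete` (the function name is mine; for $K = 2$ the $\log(K-1)$ term vanishes, leaving the binary formula up to the constant factor $\frac{1}{2}$):

import numpy as np

def samme_estimator_weight(estimator_error, n_classes, learning_rate=1.0):
    # The extra log(K - 1) term is SAMME's multi-class correction;
    # learning_rate shrinks each step as regularization.
    return learning_rate * (
        np.log((1.0 - estimator_error) / estimator_error)
        + np.log(n_classes - 1.0)
    )

print(samme_estimator_weight(0.3, n_classes=2))  # binary: log(0.7/0.3)
print(samme_estimator_weight(0.3, n_classes=4))  # plus log(3) for 4 classes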


$w_{mi} = \exp(-y_i \, \alpha_{m-1} \, G_{m-1}(x_i))$

Consider $y \times \hat{y}$: in binary classification (with labels $\pm 1$) it equals $-1$ exactly when the two differ, so the exponent above is positive only for misclassified samples, and their weights grow.

In fact, this form also covers the multi-class case: for multiple classes one only needs to modify $\alpha_m$, i.e. the `estimator_weight` calculation in the code.
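
Putting it together, a sketch of the weight update as applied in the discrete boost step (the values are illustrative; multiplying by $e^{\alpha_m}$ only where `incorrect` is 1 is equivalent, after the normalization in `fit`, to the $\exp(-y_i \alpha_m G_m(x_i))$ form above):

import numpy as np

# Illustrative values: weights from the previous round and the
# misclassification mask from the current weak learner.
sample_weight = np.array([0.25, 0.25, 0.25, 0.25])
incorrect = np.array([1, 0, 1, 0])   # I(G_m(x_i) != y_i)
estimator_weight = 0.8               # alpha_m, made up for illustration

# Boost the weights of misclassified samples only; after normalization
# the correctly classified samples shrink in relative weight.
sample_weight = sample_weight * np.exp(estimator_weight * incorrect)
sample_weight /= sample_weight.sum()  # normalization done in fit()
print(sample_weight)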


Origin: blog.csdn.net/TQCAI666/article/details/113248906