AdaBoost algorithm introduction and code implementation

Algorithm principle

The core idea of the AdaBoost algorithm is to combine weak classifiers into a strong classifier. In each round of iteration, AdaBoost trains a new weak classifier and adjusts the weight of each sample, so that previously misclassified samples receive more attention in the next round. Finally, AdaBoost takes a weighted sum of the predictions of all weak classifiers to obtain the final classification result.

Specifically, the steps of the AdaBoost algorithm are as follows:

  1. Initialize sample weights: assign equal weight to each sample, i.e. $w_i = 1/N$, where $N$ is the sample size.

  2. Iteratively train a base classifier: in each iteration, a base classifier is trained using the current sample weights, and its error rate is calculated. The error rate is the weighted proportion of misclassified samples, i.e. the sum of the weights of the misclassified samples (the weights sum to 1).

  3. Calculate the weight of the base classifier: compute its weight from its error rate; the lower the error rate, the higher the weight. The formula is $w_j = \frac{1}{2}\ln\left(\frac{1-\epsilon_j}{\epsilon_j}\right)$, where $\epsilon_j$ is the error rate of the $j$-th classifier.

  4. Update sample weights: for each sample, if it is correctly classified its weight is decreased; if it is misclassified its weight is increased. The update formula is $w_i^{(t+1)} = \frac{w_i^{(t)} \exp(-\alpha_t y_i h_t(x_i))}{Z_t}$, where $\alpha_t$ is the classifier weight from step 3 for the $t$-th iteration, $y_i$ is the true label of sample $i$, $h_t(x_i)$ is the prediction of the $t$-th classifier on sample $i$, and $Z_t$ is a normalization factor chosen so that the new sample weights sum to 1.

  5. Combine base classifiers: combine all base classifiers into one strong classifier, where each base classifier contributes in proportion to the weight computed in step 3, i.e. $H(x) = \operatorname{sign}\left(\sum_t \alpha_t h_t(x)\right)$. A from-scratch sketch of these steps is shown below.
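
To make the steps concrete, here is a minimal from-scratch sketch for binary labels $y \in \{-1, +1\}$, using depth-1 decision trees (stumps) as the weak learners. The helper names adaboost_train and adaboost_predict are illustrative, not part of any library:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_train(X, y, n_rounds=50):
    n = len(y)
    w = np.full(n, 1.0 / n)                      # step 1: uniform sample weights
    learners, alphas = [], []
    for t in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)         # step 2: train on weighted samples
        pred = stump.predict(X)
        eps = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)  # weighted error rate
        if eps >= 0.5:                           # no better than chance: stop early
            break
        alpha = 0.5 * np.log((1 - eps) / eps)    # step 3: classifier weight
        w = w * np.exp(-alpha * y * pred)        # step 4: re-weight the samples
        w /= w.sum()                             # normalize (the Z_t factor)
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def adaboost_predict(learners, alphas, X):
    # step 5: sign of the weighted vote of all weak classifiers
    votes = sum(a * h.predict(X) for a, h in zip(alphas, learners))
    return np.sign(votes)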

Code

Here is an example implementation of the AdaBoost algorithm using Python and the scikit-learn library:

# Import the required libraries and the dataset
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the dataset
data = load_iris()
X, y = data.data, data.target

# Split the dataset into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Set up the AdaBoost hyperparameters and two candidate base classifiers
base_estimators = [
    DecisionTreeClassifier(max_depth=1),  # a depth-1 decision tree (stump) as the first base classifier
    SVC(kernel='linear', C=1.0)           # a linear SVM as the second base classifier
]
n_estimators = 50                         # number of boosting iterations
learning_rate = 1.0                       # learning rate

# Create the AdaBoost classifier object clf; the estimator argument specifies
# the base classifier, here the first element of base_estimators (the depth-1 tree)
clf = AdaBoostClassifier(estimator=base_estimators[0], n_estimators=n_estimators, learning_rate=learning_rate)

# Fit the model on the training set
clf.fit(X_train, y_train)

# Evaluate the model on the test set
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy: %.2f%%" % (accuracy * 100.0))

In the above example, we first load the iris dataset using the load_iris function and then split it into a training set and a test set. Next, we initialize an AdaBoost classifier whose base classifier is a decision tree of depth 1, with 50 iterations and a learning rate of 1.0. Finally, we fit the model on the training set and evaluate its accuracy on the test set.

In the AdaBoost algorithm, each base classifier is assigned a weight that represents its importance. During training, the AdaBoost algorithm gradually builds a strong classifier as the weighted combination of all base classifiers.

In each iteration, the AdaBoost algorithm trains a new base classifier on the weighted data and re-weights the samples according to its predictions, so that the next iteration can better handle misclassified data. Specifically, for each misclassified sample, AdaBoost increases its weight so that the next iteration pays more attention to it; for correctly classified samples, the weight is reduced, concentrating subsequent iterations on the samples that are hard to classify.

During training, the weight of each base classifier is determined by its classification performance on the weighted training data: the better the performance (the lower the weighted error rate), the greater its weight, and vice versa. After each iteration, the sample weights are updated, which in turn determines the error rates, and therefore the weights, of the base classifiers trained in later iterations.
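
With scikit-learn, these per-classifier weights and weighted error rates can be read off the fitted model via its estimator_weights_ and estimator_errors_ attributes (using the clf fitted above; the exact values depend on the data split):

# Inspect the weight and weighted error of each trained base classifier
for i, (w, e) in enumerate(zip(clf.estimator_weights_, clf.estimator_errors_)):
    print(f"classifier {i}: weight={w:.3f}, error={e:.3f}")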

Finally, when the specified number of iterations is reached or the error rate meets the requirements, the AdaBoost algorithm returns a strong classifier, which is a weighted combination of all basic classifiers. This strong classifier can be used to classify new samples to achieve predictions on unknown data.
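
scikit-learn also exposes this incremental construction directly: staged_score evaluates the partial ensemble after each boosting round without refitting, which is a quick way to watch the strong classifier take shape (again using the clf fitted above):

# Test accuracy of the partial ensemble after each boosting round
for i, score in enumerate(clf.staged_score(X_test, y_test), start=1):
    print(f"after {i} classifiers: test accuracy = {score:.3f}")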

Tuning hyperparameters

The hyperparameters can be tuned and the learning curve plotted using the following code:

import matplotlib.pyplot as plt

# Sweep the number of boosting iterations
n_estimators_range = range(1, 101, 10)
train_scores = []
test_scores = []
for n_estimators in n_estimators_range:
    clf = AdaBoostClassifier(estimator=base_estimators[0], n_estimators=n_estimators, learning_rate=learning_rate)
    clf.fit(X_train, y_train)
    train_scores.append(clf.score(X_train, y_train))
    test_scores.append(clf.score(X_test, y_test))
plt.plot(n_estimators_range, train_scores, label="Train")
plt.plot(n_estimators_range, test_scores, label="Test")
plt.xlabel("n_estimators")
plt.ylabel("Accuracy")
plt.legend()
plt.show()

# Sweep the learning rate
n_estimators = 50  # reset: the loop above leaves n_estimators at its last value
learning_rate_range = [0.1, 0.5, 1, 2, 5]
train_scores = []
test_scores = []
for learning_rate in learning_rate_range:
    clf = AdaBoostClassifier(estimator=base_estimators[0], n_estimators=n_estimators, learning_rate=learning_rate)
    clf.fit(X_train, y_train)
    train_scores.append(clf.score(X_train, y_train))
    test_scores.append(clf.score(X_test, y_test))
plt.plot(learning_rate_range, train_scores, label="Train")
plt.plot(learning_rate_range, test_scores, label="Test")
plt.xlabel("learning_rate")
plt.ylabel("Accuracy")
plt.legend()
plt.show()

[Figures: learning curves of train/test accuracy versus n_estimators and versus learning_rate]

Here, we tuned two hyperparameters, the number of iterations and the learning rate, and plotted the corresponding learning curves. The curves show that as the number of iterations increases, the accuracy of the model gradually improves and then levels off. As the learning rate increases, the accuracy first rises and then falls, so appropriate tuning is required. Note that other hyperparameters can also be tuned in practice, such as the depth of the base classifier or the kernel function of the support vector machine, to obtain better performance; a grid-search sketch follows.
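
As one way to search several of these hyperparameters together, here is a sketch of a grid search over the number of iterations, the learning rate, and the depth of the decision-tree base classifier. It assumes scikit-learn >= 1.2, where the base classifier argument is named estimator and its nested parameters take the estimator__ prefix, and it reuses the imports and data split from the earlier code:

from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [25, 50, 100],
    "learning_rate": [0.1, 0.5, 1.0],
    "estimator__max_depth": [1, 2, 3],  # depth of the decision-tree base classifier
}
search = GridSearchCV(
    AdaBoostClassifier(estimator=DecisionTreeClassifier()),
    param_grid, cv=5, scoring="accuracy",
)
search.fit(X_train, y_train)
print("Best parameters:", search.best_params_)
print("Best CV accuracy: %.2f%%" % (search.best_score_ * 100.0))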

Summary

In conclusion, the AdaBoost algorithm is a powerful ensemble learning method that can combine multiple weak classifiers into a strong classifier, thereby improving the accuracy and generalization ability of the model. In practical applications, we can choose different basic classifiers and adjust different hyperparameters according to specific problems to obtain the best performance.

It should be noted that AdaBoost places relatively loose requirements on the base classifier: almost any classifier can be used, such as a decision tree or a support vector machine (see the sketch below). At the same time, hyperparameters such as the number of iterations and the learning rate affect model performance and require appropriate tuning.
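
For example, boosting the linear SVM defined earlier only requires changing the estimator argument. One caveat, stated as an assumption about the installed version: SVC has no predict_proba unless constructed with probability=True, so on scikit-learn versions whose default boosting variant needs class probabilities, pass algorithm="SAMME", which only calls predict (on the newest versions SAMME is already the default and the argument may emit a deprecation warning):

# A sketch: boost the linear SVM from base_estimators instead of the stump
svm_boost = AdaBoostClassifier(
    estimator=base_estimators[1],  # the SVC(kernel='linear', C=1.0) defined above
    n_estimators=10,               # fewer rounds, since SVM training is comparatively slow
    algorithm="SAMME",             # use plain predictions; SVC lacks predict_proba by default
)
svm_boost.fit(X_train, y_train)
print("SVM-based AdaBoost accuracy: %.2f%%" % (svm_boost.score(X_test, y_test) * 100.0))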

Origin blog.csdn.net/qq_36693723/article/details/130497075