[Paper Reading] Active Class Incremental Learning for Imbalanced Datasets

Paper address: https://arxiv.org/abs/2008.10968
Published in: ECCV 2020 Workshop

Abstract

Incremental learning (IL) enables AI systems to adapt to streaming data. Most existing algorithms make two strong assumptions that make the incremental scenario less realistic: (1) new data is assumed to be readily annotated when it is streamed; (2) tests are run on balanced datasets, while most real-life datasets are imbalanced. These assumptions are discarded, and the resulting challenges are addressed with a combination of active and imbalanced learning. We introduce a sample acquisition function that tackles imbalance and is compatible with incremental learning constraints. We also treat incremental learning as an imbalanced learning problem, instead of the established use of knowledge distillation against catastrophic forgetting; here, the imbalance effect is reduced by rescaling class predictions during inference. Evaluations are performed on four vision datasets, comparing existing and proposed sample acquisition functions. The results show that the proposed contributions have a positive effect and reduce the gap between active and standard incremental learning performance.
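The abstract mentions rescaling class predictions at inference but this summary does not spell out the exact rule, so the following is only a minimal numpy sketch of one plausible inverse-frequency rescaling; `rescale_probs` and `class_counts` are illustrative names, not the paper's notation.

```python
import numpy as np

def rescale_probs(probs, class_counts, eps=1e-8):
    """Rescale softmax outputs so that classes with few labeled samples
    are not systematically under-predicted (inverse-frequency weighting)."""
    w = 1.0 / (np.asarray(class_counts, dtype=float) + eps)
    w = w / w.mean()                         # keep the average weight at 1
    scaled = probs * w
    return scaled / scaled.sum(axis=1, keepdims=True)

probs = np.array([[0.60, 0.30, 0.10]])       # raw prediction favors the frequent class
counts = [500, 100, 20]                      # labeled samples per class
print(rescale_probs(probs, counts).argmax(axis=1))   # the rare class can now win
```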

I. Introduction

This paper is the first work to combine class incremental learning with active learning. Existing class incremental learning methods rest on two assumptions: (1) labeling new data is easy; (2) the dataset is balanced. In practical applications these two conditions are not always met. The goal of active learning is to pick out the most valuable samples for labeling, which makes it well suited both to reducing annotation cost while preserving performance and to mitigating dataset imbalance. Active learning can therefore naturally be combined with incremental learning.

The algorithm flow of this paper is as follows:
[Figure: overall algorithm flow of the proposed method]
Since active learning is added on top of a class incremental learning method, model initialization follows the standard class incremental setup: all samples of an initial subset of classes (e.g. 50% of the classes) are labeled, and a fully supervised model is trained on them to obtain the initial model (M_0 in the figure). Under standard class incremental learning, all samples of each batch of new classes (e.g. 10% of the classes) would then be labeled and the model fine-tuned on them while trying to preserve performance on both old and new classes. With active learning, however, only a subset of the samples of the new classes is selected for labeling.

The selection of these partial samples relies on active learning. Given a labeling budget B for the current batch of new data, B/5 of the samples are acquired in each of five iterations and used to fine-tune the model together with the exemplar samples of the old classes, rather than retraining from scratch as is common in active learning. From this perspective, class incremental learning also alleviates a classic dilemma of active learning (the need to retrain after every acquisition round); a sketch of the loop is given below.
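As a rough illustration (not the authors' code), the overall protocol might look like the following skeleton, where `acquire`, `annotate`, and `finetune` are placeholder callbacks for the acquisition function, the human oracle, and model fine-tuning:

```python
def incremental_active_learning(model, exemplars, states, budget,
                                acquire, annotate, finetune, n_rounds=5):
    """Each incremental state brings unlabeled data of new classes; the labeling
    budget is spent over several acquisition rounds, and the model is fine-tuned
    on the newly labeled samples plus old-class exemplars (no full retraining)."""
    for pool in states:
        pool, labeled = list(pool), []
        for _ in range(n_rounds):
            picked = acquire(model, pool, budget // n_rounds)
            labeled += annotate(picked)                      # oracle labels the batch
            pool = [x for x in pool if x not in picked]
            model = finetune(model, labeled + exemplars)     # fine-tune, not retrain
        exemplars = exemplars + labeled   # in practice a bounded exemplar memory
    return model

# Toy run with stand-in callbacks, just to show the control flow.
incremental_active_learning(
    model=None, exemplars=[], states=[[1, 2, 3, 4, 5]], budget=5,
    acquire=lambda m, pool, k: pool[:k],
    annotate=lambda xs: [(x, "label") for x in xs],
    finetune=lambda m, data: m,
)
```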

II. Classical Sample Acquisition Phase

Active learning in this paper follows a two-stage strategy. In the first stage, classical active learning acquisition functions are used; the paper considers four of them: core-set, random, entropy, and margin sampling (note that plain random sampling is included). However, these methods do not account for class imbalance, which is exactly the setting this paper assumes, and this motivates the second stage.
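For concreteness, here is a minimal numpy sketch of the two uncertainty-based criteria (entropy and margin), assuming `probs` holds the model's softmax outputs on the unlabeled pool; core-set selection additionally needs pairwise feature distances and is omitted.

```python
import numpy as np

def entropy_scores(probs):
    """Higher entropy = more uncertain prediction = more informative to label."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def margin_scores(probs):
    """Gap between the two most likely classes; a SMALL margin means uncertainty."""
    top2 = np.sort(probs, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

probs = np.array([[0.90, 0.05, 0.05],    # confident sample
                  [0.40, 0.35, 0.25]])   # uncertain sample
print(np.argsort(entropy_scores(probs))[::-1])   # most uncertain first: [1 0]
print(np.argsort(margin_scores(probs)))          # most uncertain first: [1 0]
```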

III. Balancing-Driven Sample Acquisition

The second stage addresses the class imbalance problem. The solution is quite simple, essentially a classic oversampling-style heuristic: classes that currently have fewer labeled samples are given priority in subsequent acquisition rounds. The paper calls this heuristic "poorest class first".
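A minimal sketch of how such a "poorest class first" rule could be implemented (illustrative only; the paper's exact ordering may differ): pseudo-label the unlabeled pool with the current model and prefer samples predicted to belong to the classes with the fewest labeled examples.

```python
import numpy as np

def poorest_class_first(probs, labeled_counts, k):
    """Pick k unlabeled samples whose predicted (pseudo-label) class currently
    has the fewest labeled examples, so labeling gradually rebalances classes."""
    predicted = probs.argmax(axis=1)                      # pseudo-labels
    priority = -np.asarray(labeled_counts)[predicted]     # rarer class = higher priority
    return np.argsort(priority)[::-1][:k]

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.1, 0.7]])
labeled_counts = [50, 5, 20]                  # class 1 is the "poorest"
print(poorest_class_first(probs, labeled_counts, k=2))    # -> [1 2]
```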

IV. Experiment

[Table: results of the compared acquisition functions; sIL is the fully supervised reference]
Note that the reference point of this paper is fully supervised class incremental learning (sIL, in the penultimate column of the table). The experimental results are hard to get excited about: the gap to sIL remains large, and the best-performing acquisition function is in most cases plain random sampling, which suggests that the active selection adds little over not doing it at all.

Original post: https://blog.csdn.net/qq_40714949/article/details/124312917