Machine Learning Notes - Multiple Instance Learning (MIL), a Form of Weakly Supervised Learning

1. Overview of multiple instance learning

        Multiple instance learning (MIL) is a form of weakly supervised learning in which training instances are grouped into sets called bags, and labels are provided only for the bag as a whole. MIL has received increasing attention because it fits many problems naturally and allows weakly labeled data to be exploited; it is therefore used in application domains such as computer vision and document classification.

        In MIL, individual labels for the instances contained in a bag are not provided. This formulation has drawn much attention from the research community, especially in recent years as the amount of data required for large-scale problems has grown rapidly: more data means more labeling work, and weakly supervised methods can reduce that burden. MIL is now used in many application domains, such as image and video classification, document classification, and sound classification.

        MIL is a variant of supervised learning that, in much of the published literature, is applied in pathology. The technique assigns a single class label to an input collection, called a bag of instances in this context. Labels are assumed to exist for every instance in the bag, but they are not accessible and remain unknown during training. Under the standard MIL assumption, a bag is labeled negative if all of its instances are negative, and positive if it contains at least one positive instance. The diagram below shows a simple example: we only know whether a keychain contains a key that can open a given door, yet by comparing keychains we can deduce that the green key opens the door.
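The standard MIL assumption described above can be sketched in a few lines of Python. The bags and instance labels below are toy data invented for illustration; in a real MIL setting the instance labels would be hidden during training, and only the bag-level labels would be observed.

```python
def bag_label(instance_labels):
    """Standard MIL assumption: a bag is positive (1) if at least one
    of its instances is positive, and negative (0) otherwise."""
    return int(any(label == 1 for label in instance_labels))

# Toy bags: instance labels exist but would be hidden during training.
bags = {
    "bag_a": [0, 0, 1],  # one positive instance -> positive bag
    "bag_b": [0, 0, 0],  # all instances negative -> negative bag
}

# Only these bag-level labels would be available to the learner.
labels = {name: bag_label(instances) for name, instances in bags.items()}
print(labels)  # {'bag_a': 1, 'bag_b': 0}
```

This is equivalent to taking the maximum over the (hidden) instance labels in each bag, which is why max-pooling over instance scores is a common building block in MIL models.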

2. The application of multi-instance learning

        MIL finds applications in various domains where training data can be naturally organized into bags. Some examples include:

        - Image and video classification, where a bag is an image or clip and the instances are its regions or frames
        - Document classification, where a bag is a document and the instances are its passages
        - Sound classification, where a bag is a recording and the instances are short audio segments

Origin blog.csdn.net/bashendixie5/article/details/131161522