Machine Learning: Anomaly Detection

Problem definition

Different approaches use different terms for this type of problem: anomaly, outlier, novelty, or exception.

Applications

Binary classification

Treating anomaly detection as binary classification is difficult. Often only normal data is available, and abnormal data covers such a wide range that it cannot be enumerated. In addition, abnormal data is not easy to collect.

Classification

Each picture is labeled, so you can train a classifier that recognizes members of the Simpson family.
Anomaly detection based on a classifier.
Anomalies are detected from the classifier's confidence score: if the score is above a threshold, the input is normal; if it is below the threshold, it is abnormal. The maximum class score is used as the confidence score.
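
The thresholding rule can be sketched in a few lines of Python. This is only an illustration under assumptions: the logits, the four-class setup, and the threshold of 0.5 are made up, and the confidence is taken to be the largest softmax probability.

```python
import math

def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence(logits):
    """Confidence score: the largest class probability."""
    return max(softmax(logits))

def is_anomaly(logits, threshold=0.5):
    """Flag the input as abnormal when the confidence is below the threshold."""
    return confidence(logits) < threshold

# Hypothetical logits from a 4-class character classifier.
peaked = [8.0, 0.5, 0.2, 0.1]  # one class dominates, so confidence is high
flat = [1.0, 1.1, 0.9, 1.0]    # no class stands out, so confidence is low

print(confidence(peaked), is_anomaly(peaked))
print(confidence(flat), is_anomaly(flat))
```

The threshold is a free parameter here; in practice it would be tuned on a dev set.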

Confidence Score Estimation

Train the network to output the confidence score directly: besides the classification output C, the network also outputs a confidence score P.

Train and Eval

For example: 100 pictures of the Simpsons and 5 anomalous pictures.

  • Normal pictures (shown in blue) can be misclassified as abnormal
  • Abnormal pictures (shown in red) can be misclassified as normal

The dev set is then used to evaluate the system, which is a binary classification problem.
The normal and abnormal classes are highly imbalanced. A system that does nothing, labeling every input as normal, can still reach a high accuracy, so accuracy (ACC) is not a meaningful metric here.
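
A quick check of this point, using the 100 normal and 5 anomalous pictures from the example above. The `always_normal` detector is a hypothetical stand-in for a system that does nothing.

```python
# 100 normal pictures and 5 anomalous ones.
labels = ["normal"] * 100 + ["anomaly"] * 5

def always_normal(_picture):
    """A 'detector' that ignores its input and never raises an alarm."""
    return "normal"

predictions = [always_normal(None) for _ in labels]

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)
detected = sum(p == "anomaly" and y == "anomaly"
               for p, y in zip(predictions, labels))

print(f"accuracy = {accuracy:.3f}")        # 100 correct out of 105
print(f"anomalies detected = {detected}")  # none of the 5 are found
```

The accuracy is above 95% even though not a single anomaly is detected.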

Use a confusion matrix instead.
A cost table assigns a cost to each kind of wrong decision, and a total score is calculated from it.
Set the cost table according to your own task. There are also other measures, such as AUC (the area under the ROC curve).
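
As a rough sketch of how a cost table turns error counts into a single score, and of how AUC can be computed without choosing a threshold. The scores, labels, threshold, and cost values below are all invented for illustration.

```python
def confusion_counts(scores, is_anomaly, threshold):
    """Count false alarms (normal flagged) and misses (anomaly not flagged)."""
    false_alarms = sum(s < threshold and not a for s, a in zip(scores, is_anomaly))
    misses = sum(s >= threshold and a for s, a in zip(scores, is_anomaly))
    return false_alarms, misses

def total_cost(false_alarms, misses, cost_false_alarm, cost_miss):
    """Weight each kind of error by its entry in the cost table."""
    return false_alarms * cost_false_alarm + misses * cost_miss

def auc(scores, is_anomaly):
    """Chance that a random normal input gets a higher confidence score
    than a random anomaly (ties count as half)."""
    normal = [s for s, a in zip(scores, is_anomaly) if not a]
    anomalous = [s for s, a in zip(scores, is_anomaly) if a]
    wins = sum((n > x) + 0.5 * (n == x) for n in normal for x in anomalous)
    return wins / (len(normal) * len(anomalous))

# Hypothetical confidence scores: five normal inputs, then three anomalies.
scores = [0.99, 0.95, 0.90, 0.45, 0.40, 0.85, 0.30, 0.20]
is_anomaly = [False] * 5 + [True] * 3

fa, miss = confusion_counts(scores, is_anomaly, threshold=0.5)
print(fa, miss)
print(total_cost(fa, miss, cost_false_alarm=1, cost_miss=100))  # misses are expensive
print(total_cost(fa, miss, cost_false_alarm=100, cost_miss=1))  # false alarms are expensive
print(auc(scores, is_anomaly))
```

The same detector gets a very different score under the two cost tables, which is why the table has to reflect the task at hand.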

Problems

If the face is yellow, the system gives a higher score. This means that the classifier has not learned to recognize the characters, only whether the face is yellow.

If some abnormal data can be collected, the network can learn to classify and to give anomaly scores at the same time. This kind of data is not easy to collect, however, so consider using a GAN to generate anomalous data.

The case without labels

Normal players and abnormal players ("Xiaobai", that is, trolls)

Problem definition

A numerical method is needed to give each player a score: a probability density estimate f(sta).
A Gaussian distribution can serve as the density model.
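
A minimal sketch of this idea, assuming each player is described by two numeric features and that the Gaussian is fitted by maximum likelihood (sample mean and covariance). The feature values and the threshold `lam` are hypothetical.

```python
import math

def fit_gaussian_2d(points):
    """Maximum-likelihood mean and covariance of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)

def density(point, mean, cov):
    """Gaussian probability density at `point`."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form d^T Sigma^{-1} d, using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

# Hypothetical feature vectors for six players.
players = [(0.80, 0.70), (0.90, 0.80), (0.70, 0.75),
           (0.85, 0.65), (0.75, 0.80), (0.80, 0.90)]
mean, cov = fit_gaussian_2d(players)

lam = 1.0  # assumed threshold: a density below lam counts as abnormal
typical, odd = (0.80, 0.75), (0.10, 0.10)
print(density(typical, mean, cov) >= lam)  # close to the others
print(density(odd, mean, cov) < lam)       # far from the others
```

A player whose features lie far from the bulk of the data gets a very small density and falls below the threshold.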

Origin blog.csdn.net/uncle_ll/article/details/132007146