Anomalous data point detection

One function of the automated operations and maintenance system is raising alarms when its indicators behave abnormally. At present the system uses constant thresholds (that is, a few people picked the numbers off the top of their heads), which causes two main problems. First, the system has many indicators and parameters, and guessing a value for each one is exhausting long before all the thresholds are set. Second, thresholds chosen this way either fire too often or stay silent when there is a real problem; they cannot adapt.

For these reasons, we plan to add anomaly detection to the alarm subsystem. Because the project is still in progress, all charts involving the data set are omitted for confidentiality, and only the core algorithm ideas and conclusions are described.

=========================== I am a cute dividing line ===========================

1. Algorithm selection

The feature only raises alarms, and an engineer still decides whether to act on them, so a very sophisticated algorithm is not needed for now. After weighing the options, two relatively simple algorithms were selected as candidates.

1. Box-plot method: sort the data (all one-dimensional here) in ascending order, compute the first quartile q1 and the third quartile q3, and derive the maximum estimate from them (the minimum estimate is ignored because the system does not care about it). In the usual form of the box-plot rule, the maximum estimate is q3 + k * (q3 - q1), with k commonly set to 1.5.

Historical data serve as the sample set from which the maximum estimate is computed. At test time, the data in a continuous time window are compared with the maximum estimate; if more than 90% of the points in the window exceed that value, an anomaly alarm is raised.

2. Pauta criterion (the 3-sigma rule, sometimes rendered as Rheinda's rule): first compute the mean avg and the standard deviation std of the sample set, then check |xi - avg| > 3 * std for each point xi in the test set. As with the box-plot method, if more than 90% of the points in a continuous time window violate the rule, the window is judged abnormal and an alarm is raised.
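The two candidate rules and the 90% window check can be sketched as follows. This is a minimal illustration, not the system's actual code: the function names are hypothetical, the fence form q3 + k * (q3 - q1) with k = 1.5 is the conventional box-plot choice and is assumed here, and the quartiles come from Python's statistics.quantiles with its default method, which is only one of several quartile conventions.

```python
import statistics

def box_detector(sample, k=1.5):
    """Build a point test from the box-plot upper fence q3 + k * (q3 - q1).

    k = 1.5 is the conventional value and is an assumption here.
    """
    q1, _, q3 = statistics.quantiles(sample, n=4)
    upper = q3 + k * (q3 - q1)
    return lambda x: x > upper

def sigma_detector(sample, k=3.0):
    """Build a point test from the rule |x - avg| > k * std."""
    avg = statistics.mean(sample)
    std = statistics.stdev(sample)  # sample standard deviation
    return lambda x: abs(x - avg) > k * std

def window_alarms(window, is_outlier, ratio=0.9):
    """Raise an alarm when more than `ratio` of the window's points are outliers."""
    hits = sum(1 for x in window if is_outlier(x))
    return hits > ratio * len(window)
```

Both detectors are fitted once on the sample set and then applied point by point to each test window, so swapping one rule for the other only changes which function is passed to window_alarms.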

2. Algorithm test results

1. Take the system's August data as the sample set and compute q1, q3, avg, and std for a set of indicators.

2. Take the system's September data as the test set and compute the number and time periods of the alarms raised by each of the two algorithms.

3. The box-plot method raised 12 alarms in total, in different time periods across three days. The Pauta criterion (3-sigma rule) raised no alarms at three standard deviations; after the multiple was lowered to 1.5 standard deviations, it raised 5 alarms within one day (coinciding with alarms from the box-plot method). The previous constant threshold raised no alarms during the whole of September.
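To make the effect of the standard-deviation multiple concrete, the sketch below counts alarm windows on a small synthetic series. The data are invented for illustration and are not the system's measurements; the helper name alarm_windows and the non-overlapping 10-point windows are also assumptions, since the window length is not stated above.

```python
import statistics

def alarm_windows(series, is_outlier, window=10, ratio=0.9):
    """Return the start indices of non-overlapping windows that raise an alarm."""
    starts = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        hits = sum(1 for x in chunk if is_outlier(x))
        if hits > ratio * len(chunk):
            starts.append(start)
    return starts

# Synthetic data, for illustration only.
train = [10, 11, 9, 10, 12, 10, 11, 9, 10, 10] * 3
test = [10] * 10 + [12] * 10 + [20] * 10

avg = statistics.mean(train)
std = statistics.stdev(train)

def sigma_rule(k):
    """Point test |x - avg| > k * std for a chosen multiple k."""
    return lambda x: abs(x - avg) > k * std

strict = alarm_windows(test, sigma_rule(3.0))  # three standard deviations
loose = alarm_windows(test, sigma_rule(1.5))   # 1.5 standard deviations
```

On this toy series the looser multiple flags the mildly elevated window in addition to the strongly elevated one, which mirrors the direction of the change reported above without reproducing its numbers.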

3. Conclusion

Using the maximum estimate from the box-plot method as the threshold makes the system more sensitive than the Pauta criterion (3-sigma rule), with correspondingly more alarms, so this method suits the more important indicators in the system. Sensitivity can be tuned to actual needs by adjusting the k value or, with the Pauta criterion, the multiple of the standard deviation. In addition, because both algorithms are very simple, the q1, q3, avg, and std that the system needs can be recalculated at a fixed time every night, using the 30 days preceding the current time as the sample set, which can reduce the false alarm rate. Follow-up work also needs to introduce an algorithm selector (that is, an algorithm that automatically picks the method used for each computed column) based on the importance and nature of each indicator.
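The nightly recalculation could look roughly like the sketch below. It assumes readings are available as (timestamp, value) pairs; the function name refit_parameters and the returned dictionary layout are hypothetical, and scheduling the call each night is left to whatever job runner the system already uses.

```python
import statistics
from datetime import timedelta

def refit_parameters(readings, now, days=30):
    """Recompute q1, q3, avg, and std from the trailing `days` of readings.

    readings: iterable of (timestamp, value) pairs (an assumed layout).
    now: the datetime at which the nightly job runs.
    """
    cutoff = now - timedelta(days=days)
    recent = [value for ts, value in readings if cutoff <= ts < now]
    if len(recent) < 2:
        raise ValueError("need at least two readings in the window")
    q1, _, q3 = statistics.quantiles(recent, n=4)
    return {
        "q1": q1,
        "q3": q3,
        "avg": statistics.mean(recent),
        "std": statistics.stdev(recent),
    }
```

Because the window slides forward every night, gradual shifts in an indicator's normal level are absorbed into the parameters instead of triggering alarms indefinitely.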
