Detecting Outliers

Most statistical approaches to outlier detection are based on building a probability distribution model and considering how likely objects are under that model.

Probabilistic Definition of an Outlier: An outlier is an object that has a low probability with respect to a probability distribution model of the data.

The Gaussian (normal) distribution is one of the most frequently used distributions in statistics. There is little chance that an object (value) drawn from an N(0,1) distribution will occur in the tails of the distribution. For instance, the probability that an object lies outside the central region between -3 and +3 standard deviations is only 0.0027.
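The 0.0027 figure can be checked numerically. The sketch below is only an illustration: the function names `two_sided_tail_probability` and `flag_outliers`, the sample values, and the cutoff of 3 standard deviations are assumptions introduced here, not part of the original text. It uses the identity P(|Z| > z) = erfc(z / sqrt(2)) for Z drawn from N(0,1).

```python
import math


def two_sided_tail_probability(z: float) -> float:
    """Return P(|Z| > z) for Z ~ N(0, 1), using the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def flag_outliers(values, mean, std, threshold=3.0):
    """Return the values lying more than `threshold` standard deviations from `mean`.

    `mean` and `std` are assumed to describe the Gaussian model of the data;
    the default threshold of 3.0 is an illustrative choice.
    """
    return [x for x in values if abs(x - mean) / std > threshold]


# Probability of falling outside the central region between -3 and +3.
print(round(two_sided_tail_probability(3.0), 4))  # 0.0027

# Hypothetical sample scored against N(0, 1): only 3.5 exceeds 3 standard deviations.
print(flag_outliers([0.1, -1.2, 3.5, 2.9], mean=0.0, std=1.0))  # [3.5]
```

Under this kind of rule, a lower threshold flags more objects as outliers and a higher one flags fewer, so the cutoff should reflect how rare an object must be under the model before it is treated as anomalous.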

Reposted from www.cnblogs.com/donggongdechen/p/10837562.html