Time Series | Change Point Detection

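Change point detection is defined as follows: in a sequence or process, when some statistical property (the distribution type or the distribution parameters) changes at a certain time point because of systematic rather than random factors, that time point is called a change point. Change point identification means estimating the location of that change point with a test statistic, a statistical method, or a machine learning method.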

Types of Change Point Detection

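online
A random process is observed continuously and the test stops as soon as a change point is detected. No future data is used. Mainly used for event alerting.
offline
Past change point locations are detected in a time series that has already been collected. Mainly used for historical analysis.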

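1. Control Chart Method: CUSUM

Principle and Introduction

CUSUM accumulates small deviations in the data to detect whether the data distribution has changed. It is also the oldest and most basic of these methods, and it is widely used in industry for quality inspection, automated monitoring, and finance. For example, Alibaba has used it in the monitoring solution for its panoramic business platform.

Basic Model Flow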

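param: threshold
input: time series
for i = 0 ... n:
   compute the cumulative deviation
   if the cumulative deviation exceeds threshold: treat this stretch of the time series as clearly different from the preceding stretch and split it into two windows; the change point is inferred from the point where the deviation first appeared
output: windows, change points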
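The flow above can be sketched in Python as a two-sided CUSUM. This is a minimal sketch under stated assumptions: the names `cusum_detect`, `target`, and `drift` do not come from the original text, the reference level `target` is assumed to be known for the first segment, and after each alarm it is re-estimated from the samples since the deviation began.

```python
def cusum_detect(series, target, drift=0.5, threshold=4.0):
    """Two-sided CUSUM; returns (alarm_index, estimated_change_index) pairs."""
    upper = lower = 0.0              # cumulative upward / downward deviation
    upper_start = lower_start = 0    # index where each sum last left zero
    change_points = []
    for i, x in enumerate(series):
        new_upper = max(0.0, upper + (x - target) - drift)
        new_lower = max(0.0, lower - (x - target) - drift)
        if upper == 0.0 and new_upper > 0.0:
            upper_start = i
        if lower == 0.0 and new_lower > 0.0:
            lower_start = i
        upper, lower = new_upper, new_lower
        if upper > threshold or lower > threshold:
            start = upper_start if upper > threshold else lower_start
            change_points.append((i, start))
            # Assumption: the new segment's level is the mean since the deviation began.
            segment = series[start:i + 1]
            target = sum(segment) / len(segment)
            upper = lower = 0.0
    return change_points


print(cusum_detect([0.0] * 20 + [2.0] * 20, target=0.0))  # [(22, 20)]
```

The alarm fires at index 22, and the estimated change point is index 20, where the cumulative sum first left zero. A smaller `threshold` shortens this delay but makes false alarms more likely.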

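Pros and Cons

Cons: the threshold has to be tuned. If it is too small the model is too sensitive; if it is too large the model is too coarse.
Workaround: dynamic threshold prediction: sample selection -> anomalous sample removal -> sample truncation -> baseline prediction (the state value of the first derivative).
Pros: simple and easy to implement. It has been studied and applied extensively in industry, so plenty of related material and optimization techniques are available.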

2. Probability Density Estimation

Idea and Principle

For a given time series, the probability density before a change point differs from the probability density after it.

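Basic Model Flow

Fit a probability density model to the previous n points to estimate the probability density function.
Then compute a score that measures how much the estimated density changes once the new point is added.
The higher the score, the more likely the point is a change point.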
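One simple way to realize this flow is sketched below. It is an illustration under assumptions, not the specific model the text has in mind: the density is a Gaussian kernel density estimate over the previous `window` points, and the score is the negative log-density of the new point under that estimate, used here as a stand-in for how much the new point would change the density. The names `kde_logpdf`, `density_scores`, `window`, and `bandwidth` are hypothetical.

```python
import math


def kde_logpdf(x, sample, bandwidth):
    """Log of a Gaussian kernel density estimate built from `sample`, evaluated at x."""
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
    density = total / (len(sample) * bandwidth * math.sqrt(2 * math.pi))
    return math.log(max(density, 1e-300))  # guard against log(0)


def density_scores(series, window, bandwidth):
    """Score each point by how unlikely it is under the density of the previous `window` points."""
    scores = [0.0] * len(series)  # the first `window` points have no history, so they score 0
    for t in range(window, len(series)):
        history = series[t - window:t]
        scores[t] = -kde_logpdf(series[t], history, bandwidth)
    return scores


series = [0.0, 0.1, -0.1, 0.05, -0.05] * 4 + [5.0] * 5
scores = density_scores(series, window=10, bandwidth=0.5)
best = max(range(len(scores)), key=scores.__getitem__)
print(best)  # 20, the first point after the level shift
```

Both `window` and `bandwidth` are tuning parameters, which is the drawback of this family: the density model itself has parameters to set.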

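Pros and Cons

Cons: a probability density model has to be chosen, and it comes with parameters. In addition, the probability density function is hard to estimate accurately and requires a large amount of data.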

3. Direct Compute

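Idea

Because the probability density is hard to estimate accurately, a family of methods skips density estimation altogether and directly measures how the distribution differs before and after a point. Given the data before a point and the data after it, a model or algorithm quantifies the difference between the two distributions.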

Several kinds of difference can be compared:

Mean
Variance
Mean and variance
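As a concrete illustration of comparing mean and variance around a candidate point, the sketch below computes both gaps between the `window` samples before index `t` and the `window` samples from `t` onward. The function name `mean_var_gap` and its parameters are assumptions made for this example.

```python
import statistics


def mean_var_gap(series, t, window):
    """Absolute difference in mean and in variance between the windows before and after index t."""
    left = series[t - window:t]
    right = series[t:t + window]
    mean_gap = abs(statistics.fmean(right) - statistics.fmean(left))
    var_gap = abs(statistics.pvariance(right) - statistics.pvariance(left))
    return mean_gap, var_gap


series = [1.0] * 10 + [4.0] * 10
print(mean_var_gap(series, t=10, window=5))  # (3.0, 0.0): the mean shifts, the variance does not
```

Scanning `t` over the series and looking for peaks in either gap gives a simple detector for mean shifts, variance shifts, or both.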

Related models, which quantify the difference between two probability distributions P and Q:

Kernel Mean
non-parametric Gaussian kernel model
Kullback-Leibler Importance Estimation Procedure
Kernel FDA
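To make the idea of quantifying the difference between P and Q concrete, here is a minimal sketch of one kernel-mean style measure: the biased estimate of the squared maximum mean discrepancy (MMD) with a Gaussian kernel. It illustrates the general approach rather than any specific method listed above, and the names `gaussian_kernel`, `mmd_squared`, and `sigma` are assumptions.

```python
import math


def gaussian_kernel(a, b, sigma):
    return math.exp(-((a - b) ** 2) / (2 * sigma ** 2))


def mmd_squared(p_sample, q_sample, sigma=1.0):
    """Biased estimate of squared MMD between two 1-D samples."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(p_sample, p_sample)
            + mean_kernel(q_sample, q_sample)
            - 2 * mean_kernel(p_sample, q_sample))


before = [0.0, 0.5, 1.0]
after = [5.0, 5.5, 6.0]
print(mmd_squared(before, before))  # 0.0 up to rounding: identical samples
print(mmd_squared(before, after))   # clearly positive: the samples differ
```

Applied to the data before and after a candidate point, a large value suggests the two segments come from different distributions. The kernel width `sigma` still has to be chosen.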

How is the time series segmented?

Binary segmentation (BinSeg)
Bottom-up
Window-based
Segment Neighbourhood
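Of the strategies listed, binary segmentation is the easiest to sketch. The version below is a minimal illustration under assumptions: the cost of a segment is the sum of squared deviations from its mean (so it only targets mean shifts), a split is accepted only if it lowers the total cost by more than `penalty`, and the names `sse`, `binary_segmentation`, `min_size`, and `penalty` are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean: the cost of treating `values` as one segment."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def binary_segmentation(series, min_size=2, penalty=1.0, offset=0):
    """Recursively split at the point that reduces the cost most; returns sorted split indices."""
    total = sse(series)
    best_gain, best_split = 0.0, None
    for k in range(min_size, len(series) - min_size + 1):
        gain = total - sse(series[:k]) - sse(series[k:])
        if gain > best_gain:
            best_gain, best_split = gain, k
    if best_split is None or best_gain <= penalty:
        return []  # no split is worth its penalty
    left = binary_segmentation(series[:best_split], min_size, penalty, offset)
    right = binary_segmentation(series[best_split:], min_size, penalty, offset + best_split)
    return left + [offset + best_split] + right


series = [0.0] * 10 + [5.0] * 10 + [1.0] * 10
print(binary_segmentation(series))  # [10, 20]
```

Each returned index is the first sample of a new segment. Bottom-up segmentation works in the opposite direction: it starts from many small segments and merges the most similar neighbours.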

Pros and Cons
This approach is easily affected by noise in the data, and it may perform poorly in high dimensions.

4. Probability Method

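Rather than comparing the distributions before and after a change point, this family focuses on directly predicting whether a given point is a change point.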

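(1) Gaussian Process

Idea: use the N points before t to build a time series prediction model (here, a GP model). If the value at t deviates strongly from the predicted value, record a potential alarm. When several consecutive values deviate strongly and the number of potential alarms exceeds a threshold, raise an alarm, which indicates that a new distribution has appeared.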
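The idea above can be sketched with a small GP regressor written in plain Python. Everything specific here is an assumption made for illustration: the RBF kernel and its hyperparameters (`length_scale`, `signal_var`, `noise_var`), the z-score threshold `z_thresh`, the `patience` count of consecutive potential alarms, and the choice to freeze the reference window while potential alarms are accumulating.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def rbf(a, b, length_scale, signal_var):
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def gp_predict(times, values, t_new, length_scale=3.0, signal_var=1.0, noise_var=0.1):
    """GP regression on (times, values); returns the predictive mean and std at t_new."""
    offset = sum(values) / len(values)          # centre the data for the zero-mean prior
    centered = [v - offset for v in values]
    gram = [[rbf(a, b, length_scale, signal_var) + (noise_var if i == j else 0.0)
             for j, b in enumerate(times)] for i, a in enumerate(times)]
    k_star = [rbf(a, t_new, length_scale, signal_var) for a in times]
    alpha = solve(gram, centered)
    v = solve(gram, k_star)
    mean = offset + sum(k * a for k, a in zip(k_star, alpha))
    var = signal_var + noise_var - sum(k * w for k, w in zip(k_star, v))
    return mean, math.sqrt(max(var, noise_var))


def gp_alarm(series, window=10, z_thresh=3.0, patience=3):
    """Raise an alarm after `patience` consecutive points that the GP fails to predict."""
    ref_t, ref_y = [], []   # reference window of accepted points
    streak = []             # consecutive potential alarms
    alarms = []
    for t, y in enumerate(series):
        if len(ref_t) < window:
            ref_t.append(t)
            ref_y.append(y)
            continue
        mean, std = gp_predict(ref_t, ref_y, t)
        if abs(y - mean) / std > z_thresh:
            streak.append((t, y))               # potential alarm
            if len(streak) >= patience:
                alarms.append(t)                # raise alarm: a new distribution
                ref_t = [p[0] for p in streak]  # restart the reference from the new regime
                ref_y = [p[1] for p in streak]
                streak = []
        else:
            streak = []                         # an isolated deviation is forgotten
            ref_t = ref_t[1:] + [t]
            ref_y = ref_y[1:] + [y]
    return alarms


print(gp_alarm([0.0] * 20 + [10.0] * 10))            # [22]
print(gp_alarm([0.0] * 15 + [10.0] + [0.0] * 14))    # []
```

In the second call a single spike produces one potential alarm that never reaches `patience`, so no alarm is raised.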

Pros and Cons

This method is insensitive to one-day events.

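(2) Bayesian

The starting point is much the same as for GP.

Idea: for a given point, use the information observed since the previous change point to estimate the probability that this point is a change point.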
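One common way to formalize this idea is to track a posterior over the run length, that is, the number of points since the last change point. The sketch below does this under strong assumptions that are not from the original text: observations are Gaussian with known variance `obs_var`, the segment mean has a Gaussian prior (`prior_mean`, `prior_var`), and a change occurs at each step with constant probability `hazard`. All names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def map_run_lengths(series, hazard=0.02, prior_mean=0.0, prior_var=100.0, obs_var=1.0):
    """After each observation, return the most probable run length since the last change."""
    probs = [1.0]              # probs[r]: probability that the current run has length r
    means = [prior_mean]       # posterior mean of the segment level, per run length
    variances = [prior_var]    # posterior variance of the segment level, per run length
    result = []
    for x in series:
        # Predictive probability of x under each run-length hypothesis.
        pred = [normal_pdf(x, m, v + obs_var) for m, v in zip(means, variances)]
        joint = [p * q for p, q in zip(probs, pred)]
        change = hazard * sum(joint)                    # the run resets to length 0
        growth = [(1.0 - hazard) * j for j in joint]    # the run grows by one
        probs = [change] + growth
        total = sum(probs)
        probs = [p / total for p in probs]
        # Conjugate update of the segment level for every surviving run.
        new_vars = [1.0 / (1.0 / v + 1.0 / obs_var) for v in variances]
        new_means = [nv * (m / v + x / obs_var)
                     for nv, m, v in zip(new_vars, means, variances)]
        means = [prior_mean] + new_means
        variances = [prior_var] + new_vars
        result.append(max(range(len(probs)), key=probs.__getitem__))
    return result


runs = map_run_lengths([0.0] * 30 + [10.0] * 10)
print(runs[29], runs[-1])  # 30 10
```

Before the shift the most probable run covers all 30 points. At the end it covers only the 10 points since the shift, which places the change at index 30. The result depends on the prior and on `hazard`.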

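Pros and Cons
The prior is hard to specify, and a poorly chosen prior gives inaccurate results.

Generally speaking, offline methods are more accurate than online ones, and GP tends to be more accurate than the Bayesian approach.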

5. Clustering Method

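Treat change point detection as the problem of partitioning a time series into a number of clusters.

(1) Hierarchical clustering: cluster many time series. Within a cluster, if one time series behaves very differently from the other members, a change is considered to have occurred.
(2) Graph-based clustering:
(3) Clustering and segmentation based on local graphs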

6. Others

Methods based on graph theory, control theory, system identification, and so on.

References

[1]. Aminikhanghahi, Samaneh and Diane J. Cook. “A survey of methods for time series change point detection.” Knowledge and Information Systems 51 (2016): 339-367.

[2]. Liu, Song, Makoto Yamada, Nigel Collier and Masashi Sugiyama. “Change-Point Detection in Time-Series Data by Relative Density-Ratio Estimation.” Neural networks : the official journal of the International Neural Network Society 43 (2012): 72-83.

[3]. Itoh, Naoki and Juergen Kurths. “Change-Point Detection of Climate Time Series by Nonparametric Method.” (2010).

[4]. Cho, Haeran and Piotr Fryzlewicz. “Multiple change-point detection for high-dimensional time series via Sparsified Binary Segmentation.” (2013).

[5]. Saatci, Yunus, Ryan D. Turner and Carl E. Rasmussen. “Gaussian Process Change Point Models.” ICML (2010).

[6]. Lacasa, Lucas, Bartolo Luque, Fernando J Ballesteros, Jordi Luque and Juan Carlos Nuño. “From time series to complex networks: the visibility graph.” Proceedings of the National Academy of Sciences of the United States of America 105 13 (2008): 4972-5.

[7]. Harchaoui, Zaïd, Francis R. Bach and Eric Moulines. “Kernel Change-point Analysis.” NIPS (2008).

[8]. Jeske, Daniel R., Veronica Montes De Oca, Wolfgang Bischoff and Mazda Marvasti. “Cusum techniques for timeslot sequences with applications to network surveillance.” Computational Statistics & Data Analysis 53 (2009): 4332-4344.

[9]. Chib, Siddhartha. “Estimation and comparison of multiple change-point models.” (1997).