时间序列方面的笔记

#下面这一段用一个txt来保存input的信息来模拟input.最后提交代码时候删除这一段即可.

_fin=open('1.txt','r')#只打开一次;如果把open写在函数体里,每次调用都会重新打开文件,readline永远只能读到第一行.用下划线前缀的模块级变量也能尽量减少变量污染.
def input():
    return _fin.readline().rstrip('\n')

#输入数据的一些模板
# n,a=map(int,input().split())

# arr=[int(i)for i in input().split()]

# 格式化输出
# print('%.2f'%(maxi/2))
'''
#
最大公约数用
import math
math.gcd(a,b)   # Python 3.5+ 用 math.gcd;fractions.gcd 在3.5被弃用,3.9已移除

'''
'''
爱奇艺2018秋招
[编程题] 括号匹配深度
时间限制:1秒
空间限制:32768K
一个合法的括号匹配序列有以下定义:
1、空串""是一个合法的括号匹配序列
2、如果"X""Y"都是合法的括号匹配序列,"XY"也是一个合法的括号匹配序列
3、如果"X"是一个合法的括号匹配序列,那么"(X)"也是一个合法的括号匹配序列
4、每个合法的括号序列都可以由以上规则生成。
例如: "","()","()()","((()))"都是合法的括号序列
对于一个合法的括号序列我们又有以下定义它的深度:
1、空串""的深度是0
2、如果字符串"X"的深度是x,字符串"Y"的深度是y,那么字符串"XY"的深度为max(x,y)
3、如果"X"的深度是x,那么字符串"(X)"的深度是x+1
例如: "()()()"的深度是1,"((()))"的深度是3。牛牛现在给你一个合法的括号序列,需要你计算出其深度。 
输入描述:
输入包括一个合法的括号序列s,s长度length(2 ≤ length ≤ 50),序列中只包含'('')'。


输出描述:
输出一个正整数,即这个序列的深度。

输入例子1:
(())

输出例子1:
2






a=input()
memo={}
def valid(a):#这个是最经典的匹配括号算法,要达到秒写的熟练度
    cnt=0
    for ch in a:
        if ch=='(':
            cnt+=1
        else:
            cnt-=1
        if cnt<0:#任何前缀里')'都不能多于'(',一旦小于0直接判不合法
            return False
    return cnt==0


def main(a):#分类递归即可
    if a in memo:
        return memo[a]
    if a=='':
        return 0
    case1=0
    if a[0]=='(' and a[-1]==')' and valid(a[1:-1])==True:#注意这里要判定valid
        case1=main(a[1:-1])+1

    case2=0
    for i in range(1,len(a)-1):
        now=max(main(a[:i]),main(a[i:]))
        case2=max(case2,now)
    memo[a]=max(case1,case2)
    return memo[a]
print(main(a))
#ac 了!
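
其实这题还有一个更简单的O(n)思路:合法括号序列的深度,就是从左到右扫描时括号计数器达到过的最大值.下面是一个小草图(示意,未提交验证):

a=input()
cnt=0
deep=0
for ch in a:
    if ch=='(':
        cnt+=1
        deep=max(deep,cnt)#深度就是计数器的历史最大值
    else:
        cnt-=1
print(deep)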


'''




'''
[编程题] 奶牛编号
时间限制:1秒
空间限制:32768K
牛牛养了n只奶牛,牛牛想给每只奶牛编号,这样就可以轻而易举地分辨它们了。 每个奶牛对于数字都有自己的喜好,第i只奶牛想要一个1和x[i]之间的整数(其中包含1和x[i])。
牛牛需要满足所有奶牛的喜好,请帮助牛牛计算牛牛有多少种给奶牛编号的方法,输出符合要求的编号方法总数。 
输入描述:
输入包括两行,第一行一个整数n(1 ≤ n ≤ 50),表示奶牛的数量 第二行为n个整数x[i](1 ≤ x[i] ≤ 1000)


输出描述:
输出一个整数,表示牛牛在满足所有奶牛的喜好上编号的方法数。因为答案可能很大,输出方法数对1,000,000,007的模。

输入例子1:
4
4 4 4 4

输出例子1:
24

这个题先排序:编号方案数跟奶牛的输入顺序无关,所以可以先把x升序排序.排序后第i只(从0计)奶牛有(x[i]-i)种选择(前面i只已各占用一个更小的号),把这些乘起来就是答案.
解题思路总结:1.预处理(排序) 2.递推/递归式的构造 3.必要时考虑二分加速




num=int(input())
arr=[int(i)for i in input().split()]
arr.sort()

out=arr[0]
for i in range(1,len(arr)):
    out*=(arr[i]-i)
    out=int(out%(1e9+7))#必须每一步都取模:如果等到最后再取模,out会是几百位的大整数,而1e9+7是浮点数,对浮点数取模会先把大整数转成float,精度丢失,答案就错了.更稳妥的写法是用整数模10**9+7(见下面的草图).
print(int(out%(1e9+7)))
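
如果想避开浮点取模的坑,也可以直接用整数模10**9+7,下面是一个小草图(示意,未提交验证):

MOD=10**9+7
num=int(input())
arr=sorted(int(i)for i in input().split())
out=1
for i,x in enumerate(arr):
    out=out*(x-i)%MOD#排序后第i只(从0计)奶牛还剩x-i个可选编号,每步取整数模
print(out)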
'''




'''
[编程题] 平方串
时间限制:1秒
空间限制:32768K
如果一个字符串S是由两个字符串T连接而成,即S = T + T, 我们就称S叫做平方串,例如"","aabaab","xxxx"都是平方串.
牛牛现在有一个字符串s,请你帮助牛牛从s中移除尽量少的字符,让剩下的字符串是一个平方串。换句话说,就是找出s的最长子序列并且这个子序列构成一个平方串。 
输入描述:
输入一个字符串s,字符串长度length(1 ≤ length ≤ 50),字符串只包括小写字符。


输出描述:
输出一个正整数,即满足要求的平方串的长度。

输入例子1:
frankfurt

输出例子1:
4

最长公共子序列(LCS)问题.比如例子frankfurt

枚举切分点i,从1到len(a)-1:
切分到第一个点时,前面是f,后面是rankfurt,它们的最长公共子序列是f;
第二个切分点时,前面是fr,后面是ankfurt,最长公共子序列是fr.
答案就是所有切分点上2*LCS长度的最大值.
公共子序列就是LCS算法,显然是二维动态规划.
设main(a,b)表示两个前缀长度分别为a,b时的LCS长度,转移方程:
如果a,b位置的最后字母相同,那么main(a,b)=main(a-1,b-1)+1;
不同时,main(a,b)=max(main(a,b-1), main(a-1,b)).
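
按上面的思路写一个小草图(示意,未提交验证):LCS用二维DP,答案取所有切分点上2*LCS长度的最大值.

def lcs(x,y):#经典二维DP求最长公共子序列长度
    f=[[0]*(len(y)+1) for _ in range(len(x)+1)]
    for i in range(1,len(x)+1):
        for j in range(1,len(y)+1):
            if x[i-1]==y[j-1]:
                f[i][j]=f[i-1][j-1]+1
            else:
                f[i][j]=max(f[i-1][j],f[i][j-1])
    return f[len(x)][len(y)]

s=input()
ans=0
for i in range(1,len(s)):#枚举切分点
    ans=max(ans,2*lcs(s[:i],s[i:]))
print(ans)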


'''








'''


首先还是针对时间序列的预测问题学习网上的资料


2018-10-24,8点36 开始学习这个网站
https://machinelearningmastery.com/
全部文章在:
https://machinelearningmastery.com/blog/

实习时候全是看这个网站做的,还是要系统地从头实现一下里面的代码.当时做得太匆忙,很多都是跳着看的,有机会找美国同学把里面的nlp和lstm这2本书买了.

这哥们太猛了,写了600多篇博客.......,这就是我深度学习的第一个老师了.
'''



'''
How to Grid Search SARIMA Model Hyperparameters for Time Series Forecasting in Python





An alternative approach to configuring the model that makes use of fast and parallel modern hardware is to grid search a suite of hyperparameter configurations in order to discover what works best. Often, this process can reveal non-intuitive model configurations that result in lower forecast error than those configurations specified through careful analysis.


Grid Search:一种调参手段;穷举搜索:在所有候选的参数选择中,通过循环遍历,尝试每一种可能性,表现最好的参数就是最终的结果。其原理就像是在数组里找最大值。(为什么叫网格搜索?以有两个参数的模型为例,参数a有3种可能,参数b有4种可能,把所有可能性列出来,可以表示成一个3*4的表格,其中每个cell就是一个网格,循环过程就像是在每个网格里遍历、搜索,所以叫grid search)








'''




'''
Simple Grid Search:简单的网格搜索
以2个参数的调优过程为例:




from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train,X_test,y_train,y_test = train_test_split(iris.data,iris.target,random_state=0)
print("Size of training set:{} size of testing set:{}".format(X_train.shape[0],X_test.shape[0]))

####   grid search start
best_score = 0
for gamma in [0.001,0.01,0.1,1,10,100]:
    for C in [0.001,0.01,0.1,1,10,100]:
        svm = SVC(gamma=gamma,C=C)#对于每种参数可能的组合,进行一次训练;
        svm.fit(X_train,y_train)
        score = svm.score(X_test,y_test)
        if score > best_score:#找到表现最好的参数
            best_score = score
            best_parameters = {'gamma':gamma,'C':C}
####   grid search end

print("Best score:{:.2f}".format(best_score))
print("Best parameters:{}".format(best_parameters))
'''





'''
Grid Search with Cross Validation



from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score

iris = load_iris()
X_trainval,X_test,y_trainval,y_test = train_test_split(iris.data,iris.target,random_state=0)
X_train,X_val,y_train,y_val = train_test_split(X_trainval,y_trainval,random_state=1)
best_score = 0.0
for gamma in [0.001,0.01,0.1,1,10,100]:
    for C in [0.001,0.01,0.1,1,10,100]:
        svm = SVC(gamma=gamma,C=C)

        #就是下面这一行把之前的直接用score改成cross_val_score而已.
        scores = cross_val_score(svm,X_trainval,y_trainval,cv=5) #5折交叉验证
        score = scores.mean() #取平均数
        
        if score > best_score:
            best_score = score
            best_parameters = {"gamma":gamma,"C":C}
svm = SVC(**best_parameters)#字典参数用**来传进去
svm.fit(X_trainval,y_trainval)
test_score = svm.score(X_test,y_test)
print("Best score on validation set:{:.2f}".format(best_score))
print("Best parameters:{}".format(best_parameters))
print("Score on testing set:{:.2f}".format(test_score))
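
sklearn本身就封装了带交叉验证的网格搜索GridSearchCV,下面是等价写法的一个小草图(参数网格沿用上面的设置):

from sklearn.model_selection import GridSearchCV

param_grid = {'gamma':[0.001,0.01,0.1,1,10,100],
              'C':[0.001,0.01,0.1,1,10,100]}
grid = GridSearchCV(SVC(), param_grid, cv=5)#内部自动遍历所有参数组合并做5折交叉验证
grid.fit(X_trainval, y_trainval)
print("Best parameters:{}".format(grid.best_params_))
print("Best cv score:{:.2f}".format(grid.best_score_))
print("Score on testing set:{:.2f}".format(grid.score(X_test, y_test)))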
'''






'''
算法:这里面有很多算法或者智力题,很有趣
https://mp.weixin.qq.com/s/8SJj8-rbdJULdF5gfDj6ng
'''

'''
2018-10-28,13点20今天收到joy老师的邮件,offer应该没问题了,可以整深度学习了.


MA 这里指的是 Moving Average(移动平均)模型,不是矩阵变换(见后文 Moving average models 一节)



SARIMA例子:


# grid search sarima hyperparameters
from math import sqrt
from multiprocessing import cpu_count
from joblib import Parallel
from joblib import delayed
from warnings import catch_warnings
from warnings import filterwarnings
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.metrics import mean_squared_error

# one-step sarima forecast
def sarima_forecast(history, config):
    order, sorder, trend = config
    # define model
    model = SARIMAX(history, order=order, seasonal_order=sorder, trend=trend, enforce_stationarity=False, enforce_invertibility=False)
    # fit 
    model_fit = model.fit(disp=False)
    # make one step forecast
    yhat = model_fit.predict(len(history), len(history))
    return yhat[0]

# root mean squared error or rmse
def measure_rmse(actual, predicted):
    return sqrt(mean_squared_error(actual, predicted))

# split a univariate dataset into train/test sets
def train_test_split(data, n_test):
    return data[:-n_test], data[-n_test:]

# walk-forward validation for univariate data
def walk_forward_validation(data, n_test, cfg):
    predictions = list()
    # split dataset
    train, test = train_test_split(data, n_test)
    # seed history with training dataset
    history = [x for x in train]
    # step over each time-step in the test set
    for i in range(len(test)):
        # fit model and make forecast for history
        yhat = sarima_forecast(history, cfg)
        # store forecast in list of predictions
        predictions.append(yhat)
        # add actual observation to history for the next loop
        history.append(test[i])
    # estimate prediction error
    error = measure_rmse(test, predictions)
    return error

# score a model, return None on failure
def score_model(data, n_test, cfg, debug=False):
    result = None
    # convert config to a key
    key = str(cfg)
    # show all warnings and fail on exception if debugging
    if debug:
        result = walk_forward_validation(data, n_test, cfg)
    else:
        # one failure during model validation suggests an unstable config
        try:
            # never show warnings when grid searching, too noisy
            with catch_warnings():
                filterwarnings("ignore")
                result = walk_forward_validation(data, n_test, cfg)
        except:
            error = None
    # check for an interesting result
    if result is not None:
        print(' > Model[%s] %.3f' % (key, result))
    return (key, result)

# grid search configs
def grid_search(data, cfg_list, n_test, parallel=True):
    scores = None
    if parallel:
        # execute configs in parallel
        executor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')
        tasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)
        scores = executor(tasks)
    else:
        scores = [score_model(data, n_test, cfg) for cfg in cfg_list]
    # remove empty results
    scores = [r for r in scores if r[1] != None]
    # sort configs by error, asc
    scores.sort(key=lambda tup: tup[1])
    return scores

# create a set of sarima configs to try
def sarima_configs(seasonal=[0]):
    models = list()
    # define config lists
    p_params = [0, 1, 2]
    d_params = [0, 1]
    q_params = [0, 1, 2]
    t_params = ['n','c','t','ct']
    P_params = [0, 1, 2]
    D_params = [0, 1]
    Q_params = [0, 1, 2]
    m_params = seasonal
    # create config instances
    for p in p_params:
        for d in d_params:
            for q in q_params:
                for t in t_params:
                    for P in P_params:
                        for D in D_params:
                            for Q in Q_params:
                                for m in m_params:
                                    cfg = [(p,d,q), (P,D,Q,m), t]
                                    models.append(cfg)
    return models

if __name__ == '__main__':
    # define dataset
    data = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
    print(data)
    # data split
    n_test = 4
    # model configs
    cfg_list = sarima_configs()
    # grid search
    scores = grid_search(data, cfg_list, n_test)
    print('done')
    # list top 3 configs
    for cfg, error in scores[:3]:
        print(cfg, error)



'''


'''


https://machinelearningmastery.com/how-to-grid-search-naive-methods-for-univariate-time-series-forecasting/


There are two main themes to simple forecast strategies; they are:

Naive, or using observations values directly.
Average, or using a statistic calculated on previous observations.


# one-step naive forecast
def naive_forecast(history, n):
    return history[-n]




# define dataset
data = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
print(data)
# test naive forecast
for i in range(1, len(data)+1):
    print(naive_forecast(data, i))
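
文中说的第二类简单策略是Average(用历史观测的统计量做预测),下面补一个小草图(示意,不是原文代码):

from numpy import mean, median

# average forecast: 用最近n个观测的均值(或中位数)作为下一步预测
def average_forecast(history, config):
    n, avg_type = config
    if avg_type == 'mean':
        return mean(history[-n:])
    return median(history[-n:])

# test average forecast
print(average_forecast(data, (3, 'mean')))   # 最近3个值的均值 -> 90.0
print(average_forecast(data, (5, 'median'))) # 最近5个值的中位数 -> 80.0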
'''





'''
https://machinelearningmastery.com/how-to-develop-deep-learning-models-for-univariate-time-series-forecasting/

这个实用
'''


from math import sqrt
from numpy import mean
from numpy import std
from pandas import DataFrame
from pandas import concat
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from matplotlib import pyplot
series = read_csv('monthly-car-sales.csv', header=0, index_col=0)






# transform list into supervised learning format
def series_to_supervised(data, n_in=1, n_out=1):
    df = DataFrame(data)
    cols = list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
    # put it all together
    agg = concat(cols, axis=1)
    # drop rows with NaN values
    agg.dropna(inplace=True)
    return agg.values
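

# 用一个小例子验证series_to_supervised的输出(示例数据,与原文无关):n_in=2时每一行是[t-2, t-1, t]
# series_to_supervised([10, 20, 30, 40, 50], n_in=2) 会得到
# [[10. 20. 30.]
#  [20. 30. 40.]
#  [30. 40. 50.]]
print(series_to_supervised([10, 20, 30, 40, 50], n_in=2))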







# walk-forward validation for univariate data
def walk_forward_validation(data, n_test, cfg):
    predictions = list()
    # split dataset
    train, test = train_test_split(data, n_test)
    # fit model
    model = model_fit(train, cfg)
    # seed history with training dataset
    history = [x for x in train]
    # step over each time-step in the test set
    for i in range(len(test)):
        # fit model and make forecast for history
        yhat = model_predict(model, history, cfg)
        # store forecast in list of predictions
        predictions.append(yhat)
        # add actual observation to history for the next loop
        history.append(test[i])
    # estimate prediction error
    error = measure_rmse(test, predictions)
    print(' > %.3f' % error)
    return error


# stochastic 随机的,石头cast,所以是随机扔骰子


# 箱须图(Box-whisker Plot),它是用一组数据中的最小值、第一四分位数、中位数、第三四分位数和最大值来反映数据分布的中心位置和散布范围







# summarize model performance
def summarize_scores(name, scores):
    # print a summary
    scores_m, score_std = mean(scores), std(scores)
    print('%s: %.3f RMSE (+/- %.3f)' % (name, scores_m, score_std))
    # box and whisker plot
    pyplot.boxplot(scores)
    pyplot.show()


# 周德东老师的书还是那么给力,从来没让我失望过





# The standard deviation is a little more than 134 sales, meaning a worse case model run that is 2 or 3 standard deviations in error from the mean error may be worse than the naive model. 意思是:这个模型的平均误差约1526,标准差约134,跑得最差的一次离均值有2~3个标准差,也就是1794~1928左右,已经接近甚至差于naive model的1800.



# Convolutional Neural Networks, or CNNs, are a type of neural network developed for two-dimensional image data, although they can be used for one-dimensional data such as sequences of text and time series. 说cnn也能跑一维数据用Conv1D





# One difference is that the CNN can support multiple features or types of observations at each time step, which are interpreted as channels of an image. 所以多特征就可以看做是图像channel,所以对于多维时间序列cnn一样用.
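
'''
一个小草图(示意,不是原文代码):多元时间序列喂给Conv1D时,把每个时间步的特征数当作通道数放在最后一维即可.

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D

n_steps, n_features = 36, 3  # 假设:窗口长36,每个时间步有3个特征(多元序列)
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
'''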


















#总结用cnn 做时间序列分析
'''
# evaluate cnn
from math import sqrt
from numpy import array
from numpy import mean
from numpy import std
from pandas import DataFrame
from pandas import concat
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D
from matplotlib import pyplot

# split a univariate dataset into train/test sets
def train_test_split(data, n_test):
    return data[:-n_test], data[-n_test:]

# transform list into supervised learning format
def series_to_supervised(data, n_in=1, n_out=1):
    df = DataFrame(data)
    cols = list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
    # put it all together
    agg = concat(cols, axis=1)
    # drop rows with NaN values
    agg.dropna(inplace=True)
    return agg.values

# root mean squared error or rmse
def measure_rmse(actual, predicted):
    return sqrt(mean_squared_error(actual, predicted))

# fit a model
def model_fit(train, config):
    # unpack config
    n_input, n_filters, n_kernel, n_epochs, n_batch = config
    # prepare data
    data = series_to_supervised(train, n_in=n_input)
    train_x, train_y = data[:, :-1], data[:, -1]
    train_x = train_x.reshape((train_x.shape[0], train_x.shape[1], 1))
    # define model
    model = Sequential()
    model.add(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu', input_shape=(n_input, 1)))
    model.add(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')
    # fit
    model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)
    return model

# forecast with a pre-fit model
def model_predict(model, history, config):
    # unpack config
    n_input, _, _, _, _ = config
    # prepare data
    x_input = array(history[-n_input:]).reshape((1, n_input, 1))
    # forecast
    yhat = model.predict(x_input, verbose=0)
    return yhat[0]

# walk-forward validation for univariate data
def walk_forward_validation(data, n_test, cfg):
    predictions = list()
    # split dataset
    train, test = train_test_split(data, n_test)
    # fit model
    model = model_fit(train, cfg)
    # seed history with training dataset
    history = [x for x in train]
    # step over each time-step in the test set
    for i in range(len(test)):
        # fit model and make forecast for history
        yhat = model_predict(model, history, cfg)
        # store forecast in list of predictions
        predictions.append(yhat)
        # add actual observation to history for the next loop
        history.append(test[i])
    # estimate prediction error
    error = measure_rmse(test, predictions)
    print(' > %.3f' % error)
    return error

# repeat evaluation of a config
def repeat_evaluate(data, config, n_test, n_repeats=30):
    # fit and evaluate the model n times
    scores = [walk_forward_validation(data, n_test, config) for _ in range(n_repeats)]
    return scores

# summarize model performance
def summarize_scores(name, scores):
    # print a summary
    scores_m, score_std = mean(scores), std(scores)
    print('%s: %.3f RMSE (+/- %.3f)' % (name, scores_m, score_std))
    # box and whisker plot
    pyplot.boxplot(scores)
    pyplot.show()

series = read_csv('monthly-car-sales.csv', header=0, index_col=0)
data = series.values
# data split
n_test = 12
# define config
config = [36, 256, 3, 100, 100]
# grid search
scores = repeat_evaluate(data, config, n_test)
# summarize scores
summarize_scores('cnn', scores)
'''




#先打印每一个config变量组合的最后误差,最后输出综合结果.

'''
这两个模型是干嘛用的?(模型数学形式建议去wiki里查一下,这篇文章只是直观地去讲这两个模型是干嘛用的)

它们分别描述了一个时间序列的两种特性。(时间序列是什么?其实它就是以时间为自变量的一段信号以某个采样间隔做采样之后得到的序列。描述股价变化的“K线”实质就是一个时间序列。)

1)AR模型描述的是时间序列的这种特性:

一个时间序列在某一时刻的值,与之前几个时刻的值是有关系的。有什么关系呢?线性加权叠加的关系。举个例子,某只股票前天的价格是20,昨天的价格是20.5,如果不发生什么突发事件,今天的价格很可能还是在20左右。

2)MA模型描述的是时间序列的这种特性:

一个时间序列当前时刻与前一时刻的差值,与前几个时刻与各自的前一个时刻的差值有关系。有什么关系呢?线性加权叠加的关系。举个例子,某只股票前天跌了5块,昨天跌了6块,那么今天,我也不知道要跌还是要涨,要不也不会因为炒股赔钱了。。。
--------------------- 
作者:TY_Yang 
来源:CSDN 
原文:https://blog.csdn.net/hnyzyty/article/details/75608257 
版权声明:本文为博主原创文章,转载请附上博文链接!
'''








'''
module compiled against API version 0xb but this version of numpy is 0xa
这个报错说明numpy版本太旧(扩展模块是用更新的numpy API编译的):先升级pip,再升级numpy即可.
'''




'''
Although developed for sequence data, LSTMs have not proven effective on time series forecasting problems where the output is a function of recent observations, e.g. an autoregressive type forecasting problem, such as the car sales dataset.



At the end of the sequence, each node in a layer of hidden LSTM units will output a single value. This vector of values summarizes what the LSTM learned or extracted from the input sequence. This can be interpreted 解释 by a fully connected layer before a final prediction is made.




Unlike the MLP and CNN that do not read the sequence data one-step at a time, the LSTM does perform better if the data is stationary. This means that difference operations are performed to remove the trend and seasonal structure.

If the difference operation was performed, we must add back the value that was subtracted after the model has made a forecast. We must also difference the historical data prior to formulating the single input used to make a prediction.


开创人即是开国皇帝,董事长即是如今皇帝 ,左丞相是CEO,右丞相是总裁 。
'''




'''
引入了差分的lstm时间序列经典例子.非常经典!!!!



#LSTM
# evaluate lstm
from math import sqrt
from numpy import array
from numpy import mean
from numpy import std
from pandas import DataFrame
from pandas import concat
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from matplotlib import pyplot

# split a univariate dataset into train/test sets
def train_test_split(data, n_test):
    return data[:-n_test], data[-n_test:]

# transform list into supervised learning format
def series_to_supervised(data, n_in=1, n_out=1):
    df = DataFrame(data)
    cols = list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
    # put it all together
    agg = concat(cols, axis=1)
    # drop rows with NaN values
    agg.dropna(inplace=True)
    return agg.values

# root mean squared error or rmse
def measure_rmse(actual, predicted):
    return sqrt(mean_squared_error(actual, predicted))

# difference dataset
def difference(data, interval):#interval就是12个月
    return [data[i] - data[i - interval] for i in range(interval, len(data))]#
    #for从12开始计算,因为data[12]-data[0]才是第一个.更早的直接扔掉
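
# 差分/反差分的小例子(示意):设interval=2, data=[1,2,4,7,11]
# difference(data,2) -> [3, 5, 7]  即 [4-1, 7-2, 11-4]
# 还原时把预测出来的差分值加回history[-interval],见下面model_predict里的correction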

# fit a model
def model_fit(train, config):#config = [36, 50, 100, 100, 12]
    # unpack config
    n_input, n_nodes, n_epochs, n_batch, n_diff = config
    # prepare data
    if n_diff > 0:#先做间隔为12的差分,因为一年有12个月.12是一个周期,周期就是差分的间隔!
        train = difference(train, n_diff)#train改成差分后的形式
    data = series_to_supervised(train, n_in=n_input)#必用的转化时间序列函数
    train_x, train_y = data[:, :-1], data[:, -1]
    print(train_x.shape) 
    train_x = train_x.reshape((train_x.shape[0], train_x.shape[1], 1))
    print(train_x.shape)
    print(n_input)#这个参数表示的是一个点是由他前面36个点来决定的.也就是最近3年来决定当前时间的取值
    print(n_nodes)#是hidden layer的神经元个数.从程序运行开始看懂每一步的全部参数含义,作用即可

    # define model
    model = Sequential()
    model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, 1)))#应该用tanh效果更好
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')
    # fit
    model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)#verbose就是不显示细节日志信息
    return model

# forecast with a pre-fit model
def model_predict(model, history, config):
    # unpack config
    n_input, _, _, _, n_diff = config
    # prepare data
    correction = 0.0
    if n_diff > 0:
        correction = history[-n_diff]
        history = difference(history, n_diff)
    x_input = array(history[-n_input:]).reshape((1, n_input, 1))
    # forecast
    yhat = model.predict(x_input, verbose=0)
    return correction + yhat[0]
    # 因为model训练用的是差分之后的数据,所以predict得到的yhat也是差分值.
    # 要还原到原始尺度需要做反差分:加回当前预测点往前n_diff(12)个位置的真实值,
    # 也就是correction = history[-n_diff].到此整个代码分析完毕,完美地引入了差分运算来更好地刻画时间序列.

# walk-forward validation for univariate data
def walk_forward_validation(data, n_test, cfg):
    predictions = list()
    # split dataset
    train, test = train_test_split(data, n_test)
    # fit model
    model = model_fit(train, cfg)
    # seed history with training dataset
    history = [x for x in train]
    # step over each time-step in the test set
    for i in range(len(test)):
        # fit model and make forecast for history
        yhat = model_predict(model, history, cfg)
        # store forecast in list of predictions
        predictions.append(yhat)
        # add actual observation to history for the next loop
        history.append(test[i])
    # estimate prediction error
    error = measure_rmse(test, predictions)
    print(' > %.3f' % error)
    return error

# repeat evaluation of a config
def repeat_evaluate(data, config, n_test, n_repeats=1):
    # fit and evaluate the model n times
    #下面开始利用test集合进行评估模型的好坏,利用n_repeats参数反复评估取平均.
    scores = [walk_forward_validation(data, n_test, config) for _ in range(n_repeats)]
    return scores

# summarize model performance
def summarize_scores(name, scores):
    # print a summary
    scores_m, score_std = mean(scores), std(scores)
    print('%s: %.3f RMSE (+/- %.3f)' % (name, scores_m, score_std))
    # box and whisker plot
    pyplot.boxplot(scores)
    pyplot.show()

series = read_csv('monthly-car-sales.csv', header=0, index_col=0)
data = series.values
# data split
n_test = 12  #表示最后12个来做测试用
# define config
config = [36, 50, 100, 100, 12]
# grid search
scores = repeat_evaluate(data, config, n_test)
# summarize scores
summarize_scores('lstm', scores)


'''






'''
We have seen that the CNN model is capable of automatically learning and extracting features from the raw sequence data without scaling or differencing.









#更牛逼的cnn-lstm网络
# evaluate cnn lstm
from math import sqrt
from numpy import array
from numpy import mean
from numpy import std
from pandas import DataFrame
from pandas import concat
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import TimeDistributed
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D
from matplotlib import pyplot

# split a univariate dataset into train/test sets
def train_test_split(data, n_test):
    return data[:-n_test], data[-n_test:]

# transform list into supervised learning format
def series_to_supervised(data, n_in=1, n_out=1):
    df = DataFrame(data)
    cols = list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
    # put it all together
    agg = concat(cols, axis=1)
    # drop rows with NaN values
    agg.dropna(inplace=True)
    return agg.values

# root mean squared error or rmse
def measure_rmse(actual, predicted):
    return sqrt(mean_squared_error(actual, predicted))

# fit a model
def model_fit(train, config):
    # unpack config
    n_seq, n_steps, n_filters, n_kernel, n_nodes, n_epochs, n_batch = config
    n_input = n_seq * n_steps
    print('n_nodes:'+str(n_nodes))
    print(n_input)
    print(n_steps)
    print('n_kernel:'+str(n_kernel))
    print('n_filters:'+str(n_filters))
    # prepare data
    data = series_to_supervised(train, n_in=n_input)
    print(data.shape)
    train_x, train_y = data[:, :-1], data[:, -1]
    print(train_x.shape)
    #就是把数据进行折叠,来多一维.
    train_x = train_x.reshape((train_x.shape[0], n_seq, n_steps, 1))
    print(train_x.shape)#(60, 3, 12, 1)            60,n_seq, n_steps, 1
    # define model
    model = Sequential()
    print(n_kernel)
    #filters表示的是卷积核的数量
#TimeDistributed表示的是对每一个index是1的分别进行计算.对于这个问题就是算了3次.
#只要是TimeDistributed,那么他的index是1的维度是始终保持的.
#filter:64 kernel_size=3 n_steps=12
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu', input_shape=(None,n_steps,1))))#第一个分量写-1写None都行.

# Output Shape (None, 3, 10, 64) 这个结果是怎么来的?输入是(None, 3, n_steps, 1)
#n_steps从12变成10,道理跟二维CNN一样:长度12的序列上,每个长度为3(kernel_size)的小窗口做一次卷积,
#没有padding且stride取默认的1,所以只能滑出12-3+1=10个输出位置.
#TimeDistributed的作用是锁住下标为1的那个维度(这里是3),对其中每个编号(0,1,2)分别套用括号里的
#Conv1D(filters=n_filters, kernel_size=n_kernel, ...):每次卷积的输入形状是(n_steps,1),输出形状是(10,64).
#64(filters的数量,即输出通道数)按keras的约定放在最后一维.




    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))# Output Shape (None, 3, 8, 64)   
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))# (None, 3, 4, 64)  
    model.add(TimeDistributed(Flatten()))# (None, 3, 256) 

#上面跑完得到的张量是:(None, 3, 256)
#为什么要把3这个维度固定住再往后算?这类似于按周期分组处理:每12个元素(一年)作为一个整体先做卷积提特征,
#最后接lstm把这3组(3年)的特征串起来.下面的lstm输入是多特征序列:特征维是256,time_step是3,n_nodes=100,
#所以lstm输出形状是(None, 100).(这里time_step维和特征维一起,构成了对数据更高维的刻画)
    model.add(LSTM(n_nodes, activation='relu'))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')
    # fit
    model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)
    print(model)
    print(model.summary())#利用这个方法来查看网络里面的设置情况.和输出的tensor的shape

    return model

# forecast with a pre-fit model
def model_predict(model, history, config):
    # unpack config
    n_seq, n_steps, _, _, _, _, _ = config
    n_input = n_seq * n_steps
    # prepare data
    x_input = array(history[-n_input:]).reshape((1, n_seq, n_steps, 1))
    # forecast
    yhat = model.predict(x_input, verbose=0)
    return yhat[0]

# walk-forward validation for univariate data
def walk_forward_validation(data, n_test, cfg):
    predictions = list()
    # split dataset
    train, test = train_test_split(data, n_test)
    # fit model
    model = model_fit(train, cfg)
    # seed history with training dataset
    history = [x for x in train]
    # step over each time-step in the test set
    for i in range(len(test)):
        # fit model and make forecast for history
        yhat = model_predict(model, history, cfg)
        # store forecast in list of predictions
        predictions.append(yhat)
        # add actual observation to history for the next loop
        history.append(test[i])
    # estimate prediction error
    error = measure_rmse(test, predictions)
    print(' > %.3f' % error)
    return error

# repeat evaluation of a config
def repeat_evaluate(data, config, n_test, n_repeats=1):
    # fit and evaluate the model n times
    scores = [walk_forward_validation(data, n_test, config) for _ in range(n_repeats)]
    return scores

# summarize model performance
def summarize_scores(name, scores):
    # print a summary
    scores_m, score_std = mean(scores), std(scores)
    print('%s: %.3f RMSE (+/- %.3f)' % (name, scores_m, score_std))
    # box and whisker plot
    pyplot.boxplot(scores)
    # pyplot.show()

series = read_csv('monthly-car-sales.csv', header=0, index_col=0)
data = series.values
# data split
n_test = 12
# define config
config = [3, 12, 64, 3, 100, 200, 100]
# grid search
scores = repeat_evaluate(data, config, n_test)
# summarize scores
summarize_scores('cnn-lstm', scores)
'''





'''
下面是最后的convolution-lstm模型.

This means, rather than reading a sequence one step at a time, the LSTM would read a block or subsequence of observations at a time using a convolutional process, like a CNN.

就是lstm的一种高维变形:lstm每一步接受的不再是一维数据,而是二维数据,计算上相当于把门里的乘法换成卷积再走lstm的流程.
因为卷积直接揉进了带记忆的lstm单元里,所以通常比上面先CNN再LSTM的堆叠结构更好.







# evaluate convlstm
from math import sqrt
from numpy import array
from numpy import mean
from numpy import std
from pandas import DataFrame
from pandas import concat
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import ConvLSTM2D
from matplotlib import pyplot

# split a univariate dataset into train/test sets
def train_test_split(data, n_test):
    return data[:-n_test], data[-n_test:]

# transform list into supervised learning format
def series_to_supervised(data, n_in=1, n_out=1):
    df = DataFrame(data)
    cols = list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
    # put it all together
    agg = concat(cols, axis=1)
    # drop rows with NaN values
    agg.dropna(inplace=True)
    return agg.values

# root mean squared error or rmse
def measure_rmse(actual, predicted):
    return sqrt(mean_squared_error(actual, predicted))

# difference dataset
def difference(data, interval):
    return [data[i] - data[i - interval] for i in range(interval, len(data))]

# fit a model
def model_fit(train, config):
    # unpack config
    n_seq, n_steps, n_filters, n_kernel, n_nodes, n_epochs, n_batch = config
    # config = [3, 12, 256, 3, 200, 200, 100]
    n_input = n_seq * n_steps
    # prepare data
    data = series_to_supervised(train, n_in=n_input)
    train_x, train_y = data[:, :-1], data[:, -1]
    train_x = train_x.reshape((train_x.shape[0], n_seq, 1, n_steps, 1))
    # define model
    print(train_x.shape)
    model = Sequential()
    model.add(ConvLSTM2D(filters=n_filters, kernel_size=(1,n_kernel), activation='relu', input_shape=(n_seq, 1, n_steps, 1)))
#strides=(1, 1)默认的
    #输入(n_samples , 3, 1, 12, 1) filters=256,n_kernel=3,n_seq=3,n_steps=12,n_samples=60.所以
#convlstm2d 里面的计算:把普通LSTM每个门里的矩阵乘法换成卷积(输入到状态、状态到状态都用卷积核),门结构本身和LSTM一样.

    model.add(Flatten())
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')
    # fit
    model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)
    print(model)
    print(model.summary())#利用这个方法来查看网络里面的设置情况.和输出的tensor的shape
    return model

# forecast with a pre-fit model
def model_predict(model, history, config):
    # unpack config
    n_seq, n_steps, _, _, _, _, _ = config
    n_input = n_seq * n_steps
    # prepare data
    x_input = array(history[-n_input:]).reshape((1, n_seq, 1, n_steps, 1))
    # forecast
    yhat = model.predict(x_input, verbose=0)
    return yhat[0]

# walk-forward validation for univariate data
def walk_forward_validation(data, n_test, cfg):
    predictions = list()
    # split dataset
    train, test = train_test_split(data, n_test)
    # fit model
    model = model_fit(train, cfg)
    # seed history with training dataset
    history = [x for x in train]
    # step over each time-step in the test set
    for i in range(len(test)):
        # fit model and make forecast for history
        yhat = model_predict(model, history, cfg)
        # store forecast in list of predictions
        predictions.append(yhat)
        # add actual observation to history for the next loop
        history.append(test[i])
    # estimate prediction error
    error = measure_rmse(test, predictions)
    print(' > %.3f' % error)
    return error

# repeat evaluation of a config
def repeat_evaluate(data, config, n_test, n_repeats=1):
    # fit and evaluate the model n times
    scores = [walk_forward_validation(data, n_test, config) for _ in range(n_repeats)]
    return scores

# summarize model performance
def summarize_scores(name, scores):
    # print a summary
    scores_m, score_std = mean(scores), std(scores)
    print('%s: %.3f RMSE (+/- %.3f)' % (name, scores_m, score_std))
    # box and whisker plot
    pyplot.boxplot(scores)
    pyplot.show()

series = read_csv('monthly-car-sales.csv', header=0, index_col=0)
data = series.values
# data split
n_test = 12
# define config
config = [3, 12, 256, 3, 200, 200, 100]
# grid search
scores = repeat_evaluate(data, config, n_test)
# summarize scores
summarize_scores('convlstm', scores)
'''




























'''
2018-11-02,23点07
亲爱的同学,你好!

恭喜您通过了我们2018校招的面试,根据与你的沟通,我们将给你办理校招生入职手续。

以下相关信息,请仔细阅读。

一、 入职时间:11月8日

二、员工基本信息采集表,并一同反馈至此邮箱

注意:请使用adobe reader最新版本+windows系统打开附件并填写完整【签名处无需签名】

命名要求:姓名+员工基本信息采集表(如不能编辑请下载福昕阅读器)

三、请反馈白底证件照电子版一张,命名要求:姓名.jpg

以上信息请在(11月4日)晚5:00前反馈,谢谢!【若不能按时入职请发邮件说明原因】

同时,请于11月8日上午9:00前来办理入职(务必准时到)

入职地点:北京市亦庄经济开发区京东集团总部C座1层前台处

办理入职当天请携带以下资料:

1.身份证原件及复印件3张(正反面为同面同方向);

2.校招正式入职携带学历、学位证书&毕业证书原件及复印件1张;

3.提前实习&实习生携带学生证原件以及复印件(复印件要求:首页加注册页);如无学生证请登录学信网打印在籍证明页

4.近期一寸免冠白底冲印彩色照片1张,用于存档,需由专业照相馆拍摄或加洗,符合第二代身份证照片要求;

5.原公司离职证明【重要】(实习&校招生可忽略此项);

6.黑色签字笔一支

入职相关问题联系方式:[email protected]

一起开启京东生活的多彩之旅吧!
'''



# The 3003 series of the M3-Competition were selected on a quota basis to include various types of time series data (micro, industry, macro, etc.) and different time intervals between successive observations (yearly, quarterly, etc.).    发现直接把build system设置为python比设置为automatic要快. macro 这里指宏观(经济)数据




'''
In that paper, each time series was adjusted using a power transform, deseasonalized and detrended.


Comparing the performance of all methods, it was found that the machine learning methods were all out-performed by simple classical methods, where ETS and ARIMA models performed the best overall.

所以学习ets和arima才是重要的


Multi-step forecasting involves predicting multiple steps ahead of the last known observation.

The classical methods were found to outperform the machine learning methods again.


These findings strongly encourage the use of classical methods, such as ETS, ARIMA, and others as a first step before more elaborate methods are explored, and requires that the results from these simpler methods be used as a baseline in performance that more elaborate methods must clear in order to justify their usage.
'''



'''
systematic trend 有规律的趋势

crud 数据库的增、删、改、查 (CRUD: Create/Read/Update/Delete)
'''


'''
2018-11-04,4点16一个无比恐怖的梦
#梦里面开始在医院里面抬一个桌子,上面分层都是玻璃杯,容易碎,想到的方法是用胶带贴上后再运.
#玻璃杯结构非常复杂,是很多个玻璃杯嵌套的,如果弄下来,抬完桌子就不会放上去了.
然后又莫名其妙跟中一个个子很高的美女,比我高半个头,长得脸型像范冰冰,眼睛没有那么大,就是脸的上半部比较宽的类型.突然想到可能是海贼王里面的小菊,但是眼睛没那么大,比较写实.突然穿越到那里了,我还真能想.然后就跟踪她一直走,又发现什么反抗联盟啥的.最恐怖的是发现周围人莫名其妙动得不对,会说话,比如他们出现在不应该出现的位置.发现自己应该是幻视,绝对疯了.开始思考为什么,就发现应该是在梦里,但是该如何醒:开始想直接通过大脑信号让自己醒,没成功,就开始移动到这个世界的各个位置试,随后想到应该想真实世界自己的身份和状态就能醒,但是什么都想不起来.就走到一个马路上,看到一个救护车但是上不去,也怀疑它是不是真的,就继续乱走到一个下水道.路上也想到是不是死了就行了,但是如果没成功会不会一直困在梦中.下水道中路线很复杂带循环,走2个门后发现循环了,也知道在梦中,也没惊奇.途中看到几个人也不知道是不是幻视.走到一个圆圈类似塔型的地方,想一定要出去,这你妈死在下水道太脏了.走到圆圈型的路上,一直向上走.到了出口,几个破铁丝网挡住一个门的上面,就想手撕铁丝网,撕了一半,发现外面有一个男的用刀在杀另外一个女的和另外一个男的,顶到这个门上了,也能透过铁丝网看到我.我继续撕也不管,一刀扎过来扎到了我额头偏左的地方,扎得不深,也不疼.我就拿下了刀,发现右边有口居然能直接出去,纳闷刚才怎么没看到.出去之后又发现刚才的救护车,这尼玛我就生气,让他停下.救护车就是一个电动三轮,后面一个白色箱子装病人,我用刀让司机停下,让他开门,让我上去去医院,但是后门刚开,刚想上去,他就开跑了没上去.到这里就醒了.

# 人脑还真他妈乱想,知道是梦,但是失忆了,也出不来,受到刺激才出来,以前也遇到过,很多是死了才出来,这次遇到幻视,更是周围都没法相信,更麻烦.
'''







'''
Time series forecasting with LSTMs directly has shown little success.

This is surprising as neural networks are known to be able to learn complex non-linear relationships and the LSTM is perhaps the most successful type of recurrent neural network that is capable of directly supporting multivariate sequence prediction problems.

所以用lstm最好用在多元时间序列上:单变量序列上LSTM已经被实验证明不如经典统计方法

rare event 小概率事件


端到端指的是输入是原始数据,输出是最后的结果,原来输入端不是直接的原始数据,而是在原始数据
中提取的特征,这一点在图像问题上尤为突出,因为图像像素数太多,数据维度高,会产生维度灾难,
所以原来一个思路是手工提取图像的一些关键特征,这实际就是就一个降维的过程。那么问题来了,特
征怎么提?特征提取的好坏异常关键,甚至比学习算法还重要,举个例子,对一系列人的数据分类,分
类结果是性别,如果你提取的特征是头发的颜色,无论分类算法如何,分类效果都不会好,如果你提取
的特征是头发的长短,这个特征就会好很多,但是还是会有错误,如果你提取了一个超强特征,比如染
色体的数据,那你的分类基本就不会错了。这就意味着,特征需要足够的经验去设计,这在数据量越来
越大的情况下也越来越困难。于是就出现了端到端网络,特征可以自己去学习,所以特征提取这一步也
就融入到算法当中,不需要人来干预了。

作者:张旭
链接:https://www.zhihu.com/question/51435499/answer/129379006
来源:知乎
著作权归作者所有。商业转载请联系作者获得授权,非商业转载请注明出处。




multi-step time series forecasting 就是预测未来很多步的数据,而不止1步.


The intent of the model was to forecast driver demand at Uber for ride sharing, specifically to forecast demand on challenging days such as holidays where the uncertainty for classical models was high.

特别是对于特殊日期的预测更好

Generally, this type of demand forecasting for holidays belongs to an area of study called extreme event prediction.

这叫极端事件预测(extreme event prediction)


Extreme event prediction has become a popular topic for estimating peak electricity demand, traffic jam severity and surge pricing for ride sharing and other applications. In fact there is a branch of statistics known as extreme value theory (EVT) that deals directly with this challenge.

极端预测:用电高峰,交通堵塞,价格的爆发
简称EVT extreme value theory



A multivariate time series as input to the autoencoder will result in multiple encoded vectors (one for each series) that could be concatenated. It is not clear what role averaging may take at this point, although we may guess that it is an averaging of multiple models performing the autoencoding process.

Exponential smoothing is a time series forecasting method for univariate data.


Time series methods like the Box-Jenkins ARIMA family of methods develop a model where the prediction is a weighted linear sum of recent past observations or lags.


Exponential smoothing forecasting methods are similar in that a prediction is a weighted sum of past observations, but the model explicitly uses an exponentially decreasing weight for past observations.
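
在Python里可以用statsmodels的ExponentialSmoothing(Holt-Winters)来做指数平滑,下面是一个小草图(假设数据还是前面的monthly-car-sales,trend/seasonal等参数仅作示意):

from pandas import read_csv
from statsmodels.tsa.holtwinters import ExponentialSmoothing

series = read_csv('monthly-car-sales.csv', header=0, index_col=0)
# trend/seasonal取加法形式,季节周期12个月,参数仅作示意
model = ExponentialSmoothing(series.values.flatten(), trend='add', seasonal='add', seasonal_periods=12)
model_fit = model.fit()
yhat = model_fit.forecast(12)  # 往后预测12步
print(yhat)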


最重要的2个方法:
arima和ets
先看arima
https://otexts.org/fpp2/arima.html

Before we introduce ARIMA models, we must first discuss the concept of stationarity and the technique of differencing time series.


A stationary time series is one whose properties do not depend on the time at which the series is observed.14 Thus, time series with trends, or with seasonality, are not stationary — the trend and seasonality will affect the value of the time series at different times. On the other hand, a white noise series is stationary — it does not matter when you observe it, it should look much the same at any point in time.

说的是时间序列的平稳性(stationarity)问题:平稳的定义是不管从哪个时间点去观察,序列的统计性质都不变(指概率分布意义上的性质,数值本身当然会变;严格的数学定义在时间序列的书上有).所以带trend和季节性的序列一定不平稳,白噪声一定平稳(由定义显然).




Collectively全部的

peaks and troughs 波峰波谷

This shows one way to make a non-stationary time series stationary — compute the differences between consecutive observations. This is known as differencing.

When the differenced series is white noise, the model for the original series can be written as $y_t - y_{t-1} = \varepsilon_t$, where $\varepsilon_t$ denotes white noise. Rearranging this leads to the "random walk" model $y_t = y_{t-1} + \varepsilon_t$.


In practice, it is almost never necessary to go beyond second-order differences.


Seasonal differencing

A seasonal difference is the difference between an observation and the previous observation from the same season. So $y'_t = y_t - y_{t-m}$, where $m$ is the number of seasons. These are also called "lag-$m$ differences", as we subtract the observation after a lag of $m$ periods.


To distinguish seasonal differences from ordinary differences, we sometimes refer to ordinary differences as “first differences”, meaning differences at lag 1.

Sometimes it is necessary to take both a seasonal difference and a first difference to obtain stationary data, as is shown in Figure 8.4. Here, the data are first transformed using logarithms (second panel), then seasonal differences are calculated (third panel). The data still seem somewhat non-stationary, and so a further lot of first differences are computed (bottom panel).


这一段说的是:实际上通过画图来判断是否stationary,如果差分后的序列看起来像白噪声(均值和波动不随时间变化,没有趋势和季节性)就算平稳了.我们的目标是通过diff运算把序列变平稳.常用做法是先做季节差分diff(周期),不够再做一次diff(1),也就是说diff可以不止一次.



There is a degree of subjectivity in selecting which differences to apply. 选择做哪些差分(阶数、是否季节差分)带有一定主观性.


When both seasonal and first differences are applied, it makes no difference which is done first—the result will be the same. However, if the data have a strong seasonal pattern, we recommend that seasonal differencing be done first, because the resulting series will sometimes be stationary and there will be no need for a further first difference. If first differencing is done first, there will still be seasonality present.
一般先做seasonal差分,做完还不平稳就再做一次一阶差分.


One way to determine more objectively whether differencing is required is to use a unit root test. 
单位根测试方法来测试是否需要用差分


null hypothesis 原假设,统计学术语



In this test, the null hypothesis is that the data are stationary, and we look for evidence that the null hypothesis is false. Consequently, small p-values (e.g., less than 0.05) suggest that differencing is required. The test can be computed using the ur.kpss() function from the urca package.

利用KPSS单位根检验:原假设是序列平稳,如果p值很小(小于0.05)就拒绝原假设,说明需要做差分.
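
原文用的是R的ur.kpss(),Python里statsmodels也有对应的kpss检验,小草图如下(示意,series沿用前面读入的monthly-car-sales数据):

from statsmodels.tsa.stattools import kpss

stat, p_value, lags, crit = kpss(series.values.flatten(), regression='c', nlags='auto')
# KPSS的原假设是序列平稳:p值很小(如<0.05)时拒绝原假设,说明需要差分
print(stat, p_value)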



Backshift notation is particularly useful when combining differences, as the operator can be treated using ordinary algebraic rules. In particular, terms involving $B$ can be multiplied together.

For example, a seasonal difference followed by a first difference can be written as
$(1-B)(1-B^m)y_t = (1 - B - B^m + B^{m+1})y_t = y_t - y_{t-1} - y_{t-m} + y_{t-m-1}$,
the same result we obtained earlier.

Backshift记号,非常有用.写起来方便.





8.3 Autoregressive models

#现在发现sublime这种不能填图片的真是烦,对于数学公式简直就是无解的bug!!!!!!!!!!!!
这个模型就是:当前时刻的值是之前若干个时刻的值的线性组合.






 The term autoregression indicates that it is a regression of the variable against itself.

 破手爪子没事别揉眼睛,都他妈进细菌了,耳朵也别老扣,容易中耳炎.





 Thus, an autoregressive model of order $p$ can be written as
$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$,
where $\varepsilon_t$ is white noise. This is like a multiple regression but with lagged values of $y_t$ as predictors. We refer to this as an AR($p$) model, an autoregressive model of order $p$.



这段说的是ar模型有一个阶数p:p阶模型表示当前时刻的值是前p个时刻的值的线性函数(再加白噪声).


结论:


For an AR(1) model:

when $\phi_1 = 0$, $y_t$ is equivalent to white noise;
when $\phi_1 = 1$ and $c = 0$, $y_t$ is equivalent to a random walk;
when $\phi_1 = 1$ and $c \neq 0$, $y_t$ is equivalent to a random walk with drift;
when $\phi_1 < 0$, $y_t$ tends to oscillate between positive and negative values;




8.4 Moving average models

Rather than using past values of the forecast variable in a regression, a moving average model uses past forecast errors in a regression-like model.




It is possible to write any stationary AR($p$) model as an MA($\infty$) model.




acronym 首字母缩写 ac尖,nym :name


ARIMA is an acronym for AutoRegressive Integrated Moving Average 


If we combine differencing with autoregression and a moving average model, we obtain a non-seasonal ARIMA model. 

https://otexts.org/fpp2/non-seasonal-arima.html  有些公式很难写,还是贴上网站查询用






 We call this an ARIMA($p,d,q$) model, where

p = order of the autoregressive part;
d = degree of first differencing involved;
q = order of the moving average part.




Once we start combining components in this way to form more complicated models, it is much easier to work with the backshift notation. 




Akaike’s Information Criterion (AIC), which was useful in selecting predictors for regression, is also useful for determining the order of an ARIMA model. 


用AIC准则来选择ARIMA模型的阶数,同时作为比较模型好坏的评价标准



R语言的arima使用:
The auto.arima() function in R uses a variation of the Hyndman-Khandakar algorithm (Hyndman & Khandakar, 2008), which combines unit root tests, minimisation of the AICc and MLE to obtain an ARIMA model. 
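
R的auto.arima在Python里有一个类似的第三方实现pmdarima(需要单独pip安装),下面是一个小草图(示意,series沿用前面的数据):

import pmdarima as pm

# 自动搜索(p,d,q)(P,D,Q,m),内部同样结合了单位根检验和AICc最小化
model = pm.auto_arima(series.values.flatten(), seasonal=True, m=12, suppress_warnings=True)
print(model.summary())
yhat = model.predict(n_periods=12)  # 往后预测12步
print(yhat)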




下面是seasonal的arima

So far, we have restricted our attention to non-seasonal data and non-seasonal ARIMA models. However, ARIMA models are also capable of modelling a wide range of seasonal data.

A seasonal ARIMA model is formed by including additional seasonal terms in the ARIMA models we have seen so far. 





看得差不多了,但是感觉这些计算机方向的书讲统计学还是太浅,还是要重新看时间序列的书,理解参数和算法如何实现.














'''