Time Series Analysis: Stationary Time Series Prediction and the Autoregressive Model (AR), Explained in Detail with a Python Implementation

Table of contents

Foreword

I. Stationary time series forecasting algorithm

1. Distribution, mean and covariance functions of time series

Probability distribution

Mean function

Autocovariance function

Autocorrelation function

2. Autocovariance and autocorrelation functions of stationary series

Autocovariance function

Autocorrelation function

Properties of the autocovariance and autocorrelation sequences of a stationary series

3. Strictly stationary time series

4. Wide-sense stationary time series

5. Connections and differences

Connections

Differences

6. White noise sequence

Definition

7. IID sequences

Definition

II. The autoregressive model (AR)

Definition

Modeling steps

Step 1

Step 2

Step 3

Step 4

Model identification

Case implementation

Follow me to avoid getting lost; if there are any mistakes, please leave a message, thank you very much



Foreword

The smoothing methods took nearly a month to explain and simulate. In my view the explanations are quite detailed, the code and principles are relatively easy to understand, and the implementations are not difficult. If you want a systematic path to understanding and mastering time series analysis algorithms, you can subscribe to my column: One Text Quick Learning - Common Models for Mathematical Modeling. It covers the analysis and prediction models for essentially every scenario, and all of the smoothing methods are included. Next, we will study and derive the stationary time series forecasting algorithm. Its underlying theory is fairly involved, however, and a certain grasp of that theory makes the algorithm much easier to master. So in the opening chapter we first go through the necessary theoretical background.

I hope readers will point out mistakes or share opinions in the comment area; I will maintain this blog over the long term and update it in time.


I. Stationary time series forecasting algorithm

Stationarity here refers to wide-sense stationarity, which means that the statistical properties of the series do not change under time shifts: the mean and covariance are invariant to translation in time. A time series is a sequence formed by arranging the values of some statistical indicator of a phenomenon at different times in chronological order. Roughly speaking, a time series is stationary if it has no systematic change in the mean (no trend), no systematic change in the variance, and no strictly periodic variation.

Stationary time series are the most important special class in time series analysis. To date, time series analysis is largely built on stationary series; the methods and theory available for the statistical analysis of non-stationary series are limited.

Here we need to understand the basic properties of a time series and the distributions of its values:

1. Distribution, mean and covariance functions of time series

Probability distribution

Since it is generally impossible to determine the distribution function of a time series, attention focuses instead on various characteristic quantities of the series, such as the mean function, covariance function, autocorrelation function, and partial autocorrelation function. These quantities usually capture the main characteristics of the random variables involved.

Mean function

For a time series \left \{ X_{t},\ t=0,\pm 1,\pm 2,\ldots \right \}, the mean function is

\mu _{t}=EX_{t}

\mu _{t} is the mean function of \left \{ X_{t} \right \}; it is essentially a sequence of real numbers, and the mean represents the center around which the random process swings at each moment.

Autocovariance function

The autocovariance function of the time series \left \{ X_{t} \right \} is

\gamma (t,s)=E\left [ (X_{t}-\mu _{t})(X_{s}-\mu _{s}) \right ]

It is symmetric: \gamma (t,s)=\gamma (s,t).

Autocorrelation function

The autocorrelation function is the normalized autocovariance,

\rho (t,s)=\gamma (t,s)/\sqrt{\gamma (t,t)\gamma (s,s)}

It describes the correlation structure of the time series \left \{ X_{t} \right \} with itself; it is symmetric and satisfies \rho (t,t)=1.
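As a concrete illustration, these characteristic quantities can be estimated from a single realization with the usual sample estimators. The sketch below (plain NumPy; the variable names are mine) computes the sample mean, sample autocovariance and sample autocorrelation of a simulated series:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(500)  # one realization of a time series

def sample_autocov(x, k):
    """Sample autocovariance at lag k: (1/n) * sum (x_t - xbar)(x_{t+k} - xbar)."""
    n = len(x)
    xbar = x.mean()
    return np.sum((x[:n - k] - xbar) * (x[k:] - xbar)) / n

xbar = x.mean()                 # sample mean
gamma0 = sample_autocov(x, 0)   # sample variance (lag 0)
rho = [sample_autocov(x, k) / gamma0 for k in range(5)]  # sample autocorrelations
```

For this white-noise input, rho[0] is exactly 1 and the higher-lag sample autocorrelations stay close to 0.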

2. Autocovariance and autocorrelation functions of stationary series

Autocovariance function

If \left \{ X_{t} \right \} is a stationary series, assuming EX_{t}=0, the autocovariance depends only on the lag k=s-t, so the autocovariance function of a stationary series can be written as

\gamma _{k}=E\left [ X_{t}X_{t+k} \right ],\ k=0,\pm 1,\pm 2,\ldots

Autocorrelation function

Correspondingly, the autocorrelation function of a stationary series is written as

\rho _{k}=\gamma _{k}/\gamma _{0}

Properties of the autocovariance and autocorrelation sequences of a stationary series

For a stationary series: \gamma _{0}=\mathrm{Var}(X_{t})\geq 0; symmetry, \gamma _{-k}=\gamma _{k} and \rho _{-k}=\rho _{k}; boundedness, \left | \gamma _{k} \right |\leq \gamma _{0} and \left | \rho _{k} \right |\leq \rho _{0}=1; and both sequences are nonnegative definite.

Stationary time series can be divided into strictly stationary time series and wide-sense stationary time series.

3. Strictly stationary time series

If for any n time points t_{1}<t_{2}<\ldots <t_{n} and any integer s, the joint distribution of (X_{t_{1}},X_{t_{2}},\ldots ,X_{t_{n}}) is the same as that of (X_{t_{1}+s},X_{t_{2}+s},\ldots ,X_{t_{n}+s}), then X_{t} is called a strictly (narrowly, strongly) stationary time series. The probability distribution of a strictly stationary series is independent of time.

4. Wide-sense stationary time series

If the time series X_{t} has finite second-order moments and satisfies the following two conditions:

  • EX_{t}=\mu is a constant independent of t;
  • \mathrm{Cov}(X_{t},X_{s}) depends only on the time difference t-s;

then it is called a wide-sense (weakly) stationary process. The mean of every random variable in such a process is constant, and the covariance of any two variables depends only on the time interval t-s.

5. Connections and differences

Connections

  • If a series is strictly stationary and has finite second-order moments, then it is also wide-sense stationary.
  • If the series is Gaussian (that is, all of its finite-dimensional distributions are normal), then strict stationarity and wide-sense stationarity are equivalent.

Differences

  • For a strictly stationary series the entire probability distribution is invariant over time; for a wide-sense stationary series only the mean and autocovariance are invariant over time.
  • A strictly stationary series is not necessarily wide-sense stationary (its second moments may not exist), and a wide-sense stationary series is not necessarily strictly stationary.

The stationary time series discussed in practice are wide-sense stationary: at any time the mean and variance of the series exist and are constant, and the autocovariance function and autocorrelation coefficients depend only on the time interval k. Only stationary time series can be analyzed statistically, because stationarity guarantees that the observations all come from the same distribution. A unit root test can be used to check the stationarity of a time series.

If there is no correlation between the values of a stationary time series, there is no regularity linking earlier and later data, and no useful information can be extracted; such a series is called a purely random series. Among purely random series, a series that is random with the same variance in every period is called a white noise series. Stationary time series analysis aims to fully extract the relationships within the series; once those relationships have been extracted, the remaining series should be white noise.

6. White noise sequence

Definition

If the time series \left \{ X_{t} \right \} satisfies

EX_{t}=0,\quad \mathrm{Var}(X_{t})=\sigma ^{2},\quad \gamma _{k}=0\ (k\neq 0)

then it is called a white noise sequence. A white noise sequence is a special kind of wide-sense stationary sequence, and the simplest stationary sequence.

7. IID sequences

Definition

If the random variables X_{t} in the time series \left \{ X_{t} \right \} are mutually independent and identically distributed (when the first moment exists, it is usually assumed that EX_{t}=0), then \left \{ X_{t} \right \} is called an independent and identically distributed (IID) sequence.

An IID sequence \left \{ X_{t} \right \} is strictly stationary. In general, white noise sequences and IID sequences are two different things, but when a white noise sequence is Gaussian it is also an IID sequence; such a sequence is called a normal (Gaussian) white noise sequence.

II. The autoregressive model (AR)

Definition

An autoregressive model is a statistical method for time series that uses earlier values of the same variable, x_{1} through x_{t-1}, to predict the current value x_{t}, assuming a linear relationship between them. It developed out of linear regression in regression analysis; but instead of predicting y from x, it predicts x from x itself, hence the name autoregression.

For a p-th order autoregressive model, the autocorrelation function tails off and the partial autocorrelation function cuts off after lag p.

The AR(p) model is

X_{t}=c+\varphi _{1}X_{t-1}+\varphi _{2}X_{t-2}+\ldots +\varphi _{p}X_{t-p}+\varepsilon _{t}

where c is a constant term, and \varepsilon _{t} is a random error assumed to have mean 0 and standard deviation \sigma, with \sigma assumed constant for any t.

In words: the expected value of X_{t} equals a linear combination of one or more lagged values of X, plus a constant term, plus a random error.
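A stationary AR series of this form can be simulated directly; a sketch using ArmaProcess from statsmodels (note its lag-polynomial convention, in which the AR coefficients enter with opposite sign after a leading 1; the coefficient values below are my example):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

# AR(2): X_t = 0.5*X_{t-1} + 0.1*X_{t-2} + eps_t
# Lag-polynomial convention: 1 - 0.5*L - 0.1*L^2 on the AR side
ar = np.array([1, -0.5, -0.1])
ma = np.array([1])
process = ArmaProcess(ar, ma)

# True for these coefficients: the roots of the AR polynomial lie outside the unit circle
stationary = process.isstationary
sample = process.generate_sample(nsample=300)
```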

Modeling steps

Step 1:

Perform a white noise test on the time series. If the series is judged to be white noise, there is no structure linking earlier and later data and no useful information can be extracted. Continue only if the series is not white noise.

Step 2:

Test the stationarity of the series. If the test judges it non-stationary, transform the series to make it stationary and go back to step 1; otherwise go to step 3.

Step 3:

Identify the model and estimate its parameters, then go to step 4.

Step 4:

Test the adequacy of the model. If the test passes, the fitted model is obtained and can be used to forecast the series; otherwise go back to step 3.

Model identification

For an observed time series that the white noise test judges to be non-white-noise and the stationarity test judges to be stationary, the model is usually identified from the autocorrelation and partial autocorrelation coefficients: decide whether the problem suits an AR model and roughly determine the order p.

This is done by computing the autocorrelation function ACF and the partial autocorrelation function PACF.

If a time series satisfies the following two conditions:

  • the ACF tails off, that is, ACF(k) does not become 0 once k exceeds some constant;
  • the PACF cuts off, that is, PACF(k) becomes 0 for k > p (this is used to determine the order; the PACF is not strictly 0 beyond lag p, but fluctuates in a small range around 0);

then an AR model is worth trying.

Case implementation

Autoregression is implemented in Python with AutoReg from the statsmodels.tsa.ar_model module.

from statsmodels.tsa.ar_model import AutoReg

 The parameters of this function are:

ar_model = AutoReg(endog,
                   lags,
                   trend='c',
                   seasonal=False,
                   exog=None,
                   hold_back=None,
                   period=None,
                   missing='none',
                   *,
                   deterministic=None,
                   old_names=False)

I will not explain them one by one here. If you want to know more about it, you can go to the official explanation:

statsmodels.tsa.ar_model.AutoReg — statsmodels

Usually only the sequence endog and order lags need to be entered:

endog: accepts an array-like, one-dimensional time series.

lags: the model order. If an integer, it is the number of lags to include in the model; if a list, it gives the lag indices to include. For example, lags=[1, 4] includes only lags 1 and 4, while lags=4 includes lags 1, 2, 3 and 4.

from statsmodels.tsa.ar_model import AutoReg
import numpy as np

# Generate N(0,1) random normal noise (white noise)
noise = np.random.randn(200)
wnoise = (noise - np.mean(noise)) / np.std(noise)

# Generate an AR(2) series: X(t) = 0.5*X(t-1) + 0.1*X(t-2) + noise
X = [20, 10]  # initial values
for i in range(200):
    x2 = 0.5*X[i+1] + 0.1*X[i] + wnoise[i]
    X.append(x2)

# Fit an AR(2) model (estimated by OLS)
AR2_model = AutoReg(X, 2).fit()
# In-sample predictions for time points 3 through 202
predict = AR2_model.predict(2, 201)
# Residuals (convert the list to an array before subtracting)
residual = np.array(X[2:]) - predict

After simulating the AR linear series, we fit an AR(2) model to obtain the predicted values, and then display them with matplotlib:

import matplotlib.pyplot as plt
plt.figure()
plt.subplot(311)
plt.plot(X[2:])      # original series
plt.subplot(312)
plt.plot(predict)    # AR(2) in-sample predictions
plt.subplot(313)
plt.plot(residual)   # residuals
plt.show()

 


Follow me to avoid getting lost; if there are any mistakes, please leave a message, thank you very much

That's all for this issue. I'm fanstuck. If you have any questions, feel free to leave a message to discuss. See you in the next issue.


Origin blog.csdn.net/master_hunter/article/details/126619423