Article link:
SPSS software practice - ARIMA time series forecasting model https://blog.csdn.net/beiye_/article/details/125435166?spm=1001.2014.3001.5501
ARIMA model
ARIMA(p, i, q) stands for Autoregressive Integrated Moving Average model, sometimes called the differenced autoregressive moving average model.
Basic principle: convert a non-stationary time series into a stationary one by differencing, then regress the dependent variable only on its own lagged values and on the current and lagged values of the random error term.
The ARIMA model consists of three parts: AR (autoregressive model of order p) + I (differencing of order i) + MA (moving average model of order q).
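The formula for AR is:
$$y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \epsilon_t$$
The formula for MA is:
$$y_t = \mu + \epsilon_t + \sum_{i=1}^{q} \theta_i \epsilon_{t-i}$$
where $y_t$ is the current value, $\mu$ is a constant term, $\gamma_i$ and $\theta_i$ are the model coefficients, and $\epsilon_t$ is white noise with mean 0 and variance 1.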
To build an ARIMA model, first determine the difference order i so that the data are stationary after differencing i times; then determine whether the differenced series is AR (q = 0), MA (p = 0), or ARMA (p and q both nonzero).
The parameters p and q can be selected from the ACF and PACF plots.
1. Stationarity
Stationarity means that the curve fitted to the sample time series can be expected to continue along its existing shape, by "inertia", for some period into the future.
It requires that the mean and variance of the series do not change significantly over time.
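As a rough illustration of "mean and variance do not change significantly", the sketch below splits a series into consecutive segments and compares their means and variances. The function name `segment_stats` and the two toy series are assumptions made for this example; this is only an informal check, not a formal stationarity test.

```python
from statistics import mean, pvariance

def segment_stats(series, n_segments=2):
    """Split the series into equal consecutive segments; return (mean, variance) for each."""
    size = len(series) // n_segments
    stats = []
    for k in range(n_segments):
        chunk = series[k * size:(k + 1) * size]
        stats.append((mean(chunk), pvariance(chunk)))
    return stats

# Toy data (illustrative only)
trend = [0.5 * t for t in range(100)]    # rises steadily, so the mean drifts
flat = [(-1) ** t for t in range(100)]   # alternates around 0, mean and variance stay put

print(segment_stats(trend))  # segment means differ clearly: not stationary
print(segment_stats(flat))   # segment means and variances match: looks stationary
```

If the segment statistics differ markedly, as they do for the trending series, the series likely needs differencing before an AR or MA model is fitted.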
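2. Differencing (finding i)
Differencing takes the difference between the series values at times t and t-1, that is, y(t) - y(t-1). If the result is still not stationary, the differenced series is differenced again; the number of passes needed is the order i.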
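A minimal sketch of differencing, assuming the series is a plain Python list; the helper name `difference` is chosen for this example only. Applying it repeatedly gives higher-order differences.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: each pass maps y[t] to y[t] - y[t-1]."""
    result = list(series)
    for _ in range(order):
        result = [result[t] - result[t - 1] for t in range(1, len(result))]
    return result

quadratic = [t * t for t in range(6)]   # 0, 1, 4, 9, 16, 25
print(difference(quadratic, 1))         # [1, 3, 5, 7, 9]  still trending
print(difference(quadratic, 2))         # [2, 2, 2, 2]     constant after two passes
```

Each pass shortens the series by one value. In this toy case two passes remove the trend, which would correspond to i = 2.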
3. Autoregressive model (AR)
The AR model describes the relationship between the current value and historical values: it uses the variable's own past data to predict itself. Autoregressive models require the series to be stationary.
The p-order autoregressive process is defined as:
$$y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \epsilon_t$$
where $y_t$ is the current value, $\mu$ is a constant term, $p$ is the order, $\gamma_i$ is the autocorrelation coefficient, and $\epsilon_t$ is the error term.
Limitations of autoregressive models:
- Autoregressive models use only the series' own data to make predictions
- The series must be stationary
- The series must be autocorrelated; if the autocorrelation coefficient ($\gamma_i$) is less than 0.5, the model should not be used.
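To make the autoregressive idea concrete, here is a small sketch that simulates an AR process and produces a one-step prediction from the series' own history. The names `simulate_ar` and `predict_next`, the zero start-up values, and the coefficient values are assumptions for illustration; the coefficients are taken as known rather than estimated.

```python
import random

def simulate_ar(mu, gammas, n, seed=0):
    """Generate n values of y_t = mu + sum_i gammas[i-1] * y_{t-i} + eps_t, eps_t ~ N(0, 1)."""
    rng = random.Random(seed)
    p = len(gammas)
    y = [0.0] * p                      # zero start-up values (an assumption)
    for _ in range(n):
        past = sum(g * y[-i] for i, g in enumerate(gammas, start=1))
        y.append(mu + past + rng.gauss(0.0, 1.0))
    return y[p:]                       # drop the artificial start-up values

def predict_next(mu, gammas, history):
    """One-step forecast: the AR equation with the unknown future error set to its mean, 0."""
    return mu + sum(g * history[-i] for i, g in enumerate(gammas, start=1))

series = simulate_ar(mu=1.0, gammas=[0.5, 0.2], n=200)
print(predict_next(1.0, [0.5, 0.2], series))
```

The forecast depends only on the last p observed values, which is why the model is of little use when the series shows weak autocorrelation.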
4. Moving average model (MA)
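Moving average models focus on the accumulation of the error terms of the autoregressive model, which lets them smooth out random fluctuations in forecasts.
The q-order moving average process is defined as:
$$y_t = \mu + \epsilon_t + \sum_{i=1}^{q} \theta_i \epsilon_{t-i}$$
where $\mu$ is a constant term, $\theta_i$ are the moving average coefficients, and $\epsilon_t$ is the error term.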
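A moving average process can be simulated in the same spirit: each value is a constant plus the current error plus a weighted sum of the previous q errors. The sketch below assumes standard normal errors; the name `simulate_ma` and the parameter values are illustrative only.

```python
import random

def simulate_ma(mu, thetas, n, seed=0):
    """Generate n values of y_t = mu + eps_t + sum_i thetas[i-1] * eps_{t-i}, eps_t ~ N(0, 1)."""
    rng = random.Random(seed)
    q = len(thetas)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + q)]   # q extra errors to start the recursion
    y = []
    for t in range(q, n + q):
        lagged = sum(th * eps[t - i] for i, th in enumerate(thetas, start=1))
        y.append(mu + eps[t] + lagged)
    return y

series = simulate_ma(mu=10.0, thetas=[0.5], n=5000)
print(sum(series) / len(series))   # sample mean, should be close to mu
```

Because only the last q errors enter each value, observations more than q steps apart share no error terms, which is what makes the autocorrelation of an MA(q) process vanish beyond lag q.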
5. Autocorrelation function (ACF)
The ACF compares an ordered sequence of random variables with itself: it measures the correlation between values of the same series at different points in time.
Formula:
$$\rho(k) = \frac{\mathrm{Cov}(y_t, y_{t-k})}{\mathrm{Var}(y_t)}$$
The value range is [-1, 1]
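A minimal sketch of the sample autocorrelation, assuming the usual estimator that divides the lag-k autocovariance by the lag-0 autocovariance (both computed around the overall sample mean). The function name `acf` is chosen for this example.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (lag-k autocovariance over lag-0 autocovariance)."""
    n = len(series)
    m = sum(series) / n
    dev = [x - m for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

print(acf([1, 2, 3, 4, 5], 2))   # approximately [1.0, 0.4, -0.1]
```

Lag 0 always equals 1, since a series is perfectly correlated with itself.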
6. Partial autocorrelation function (PACF)
For a stationary AR(p) model, the lag-k autocorrelation coefficient ρ(k) is not simply the correlation between x(t) and x(t-k). x(t) is also affected by the k-1 intermediate random variables x(t-1), x(t-2), ..., x(t-k+1), and these k-1 variables are all correlated with x(t-k), so ρ(k) is mixed with the influence of other variables on x(t) and x(t-k).
The partial autocorrelation function (PACF) removes the interference of these k-1 intermediate variables. The ACF includes the influence of the other variables, whereas the PACF is strictly the correlation between x(t) and x(t-k) alone.
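One common way to compute partial autocorrelations from the autocorrelations is the Durbin-Levinson recursion. The sketch below uses that recursion as an assumption about method (the text above does not say how the PACF is computed), and the function names are chosen for this example.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    m = sum(series) / n
    dev = [x - m for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    """Partial autocorrelation for lags 0..max_lag via the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    result = [1.0]
    phi_prev = []                       # coefficients of the order k-1 fit
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den              # correlation left after removing lags 1..k-1
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result

print(pacf([1, 2, 3, 4, 5], 2))
```

At lag 1 there are no intermediate variables to remove, so the PACF equals the ACF there; from lag 2 onward the two generally differ.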
7. ARIMA(p, i, q) order determination
| Model | ACF | PACF |
| --- | --- | --- |
| AR(p) | Decays to zero (geometrically or with oscillation) | Cuts off after lag p |
| MA(q) | Cuts off after lag q | Decays to zero (geometrically or with oscillation) |
| ARMA(p,q) | Decays to zero after lag q (geometrically or with oscillation) | Decays to zero after lag p (geometrically or with oscillation) |
Cuts off: beyond the stated lag the coefficients fall within the confidence interval (95% of the points meet this rule).
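The rule for reading the plots can be sketched as a small check against an approximate 95% band of ±1.96/√n, which is a common large-sample approximation assumed here. The helper names are hypothetical, and the inputs are the theoretical ACF and PACF of an AR(1) process with coefficient 0.7 rather than measured data.

```python
import math

def confidence_band(n):
    """Approximate 95% band for a sample (partial) autocorrelation of a series of length n."""
    return 1.96 / math.sqrt(n)

def leading_significant_lags(coeffs, n):
    """Count consecutive lags, starting at lag 1, whose coefficient lies outside the band."""
    band = confidence_band(n)
    count = 0
    for c in coeffs[1:]:          # coeffs[0] is lag 0 and always equals 1
        if abs(c) <= band:
            break
        count += 1
    return count

n = 100
theoretical_acf = [0.7 ** k for k in range(11)]    # geometric decay
theoretical_pacf = [1.0, 0.7] + [0.0] * 9          # nonzero only at lag 1

print(leading_significant_lags(theoretical_pacf, n))  # 1
print(leading_significant_lags(theoretical_acf, n))   # 4
```

Here the PACF drops inside the band right after lag 1 while the ACF fades gradually, the pattern the table associates with AR(p) with p = 1. With real data roughly 5% of coefficients can fall outside the band by chance, so the plots still need to be inspected by eye.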
ARIMA modeling process summary:
- Data collection
- Plot the time series and test for stationarity
- Make the non-stationary series stationary and determine the differencing order d (written i above)
- Build the differenced series using d differences
- Model identification and order determination: choose p and q from the ACF and PACF
- Estimate the model parameters
- Check model adequacy (goodness of fit)
- Forecasting with an ARIMA(p,d,q) model
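The steps above can be sketched end to end for the simplest non-trivial case, ARIMA(1,1,0): difference once, fit an AR(1) model to the differences by least squares, forecast the differences, then add them back onto the last observed level. Fixing p = 1, d = 1, q = 0 in advance, using least squares as the estimator, and the function names are all assumptions made for this illustration; the stationarity testing and model checking steps are omitted.

```python
def fit_ar1(z):
    """Least-squares estimates (mu, gamma) for z_t = mu + gamma * z_{t-1} + eps_t."""
    x, y = z[:-1], z[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    gamma = sxy / sxx
    return my - gamma * mx, gamma

def forecast_arima_110(series, steps):
    """Forecast with ARIMA(1,1,0): difference once, fit AR(1), forecast, undo the difference."""
    z = [series[t] - series[t - 1] for t in range(1, len(series))]
    mu, gamma = fit_ar1(z)
    level, last_diff = series[-1], z[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = mu + gamma * last_diff   # next predicted difference
        level += last_diff                   # integrate back to the original scale
        forecasts.append(level)
    return forecasts

# Toy series whose first differences follow z_t = 1 + 0.5 * z_{t-1} exactly
toy = [0.0, 0.0, 1.0, 2.5, 4.25, 6.125]
print(forecast_arima_110(toy, 2))
```

In practice the orders would come from the ACF and PACF of the differenced series, and the fitted model would be checked before its forecasts are used.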