Mathematical Modeling Study Notes (15) Time Series Analysis

Time series analysis overview and data preprocessing

The concept of time series : A time series, also called a dynamic series, is a sequence of values of some indicator of a phenomenon arranged in time order.

The components of time series : time elements and numerical elements.

Classification of time series :

  • Period time series : the numerical elements reflect the result accumulated by a phenomenon over a certain period (for example, monthly sales);
  • Time point time series : the numerical elements reflect the instantaneous level of a phenomenon at a certain point in time (for example, month-end inventory).

Note : Period time series can be added up across periods, but time point series cannot. The gray prediction model, which works on cumulative sums of the data, can therefore only be used for period time series.

Contents of time series analysis : Time series analysis can be divided into three parts: describing the past, analyzing patterns, and predicting the future.

Data preprocessing (handling missing values) : Handling missing values is the basic preprocessing step for time series models.

  • Missing value processing method : If the missing value occurs at the head or tail of the time series, the direct deletion method is used; if the missing value occurs in the middle of the sequence, it cannot be deleted, and the missing value replacement method can be used.
  • SPSS provides five methods to replace missing values : series mean; mean of nearby points; median of nearby points; linear interpolation; linear trend at point.
  • Replacing missing values in SPSS :

1. Open SPSS and import the data, then click: Transform → Replace Missing Values

2. Select the variable whose missing values should be replaced, then specify the name of the new variable and the replacement method.
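
The handling rule above (delete missing values at the head or tail, replace those in the middle) can be sketched outside SPSS. The following is a minimal Python sketch of the linear interpolation option only; the function name `fill_missing_linear` and the sample data are hypothetical, and SPSS's own implementation may differ in detail.

```python
from typing import List, Optional


def fill_missing_linear(series: List[Optional[float]]) -> List[float]:
    """Drop missing values at the head/tail, linearly interpolate interior gaps."""
    # Locate the first and last observed values; anything outside is deleted.
    observed = [i for i, v in enumerate(series) if v is not None]
    if not observed:
        return []
    trimmed = series[observed[0]: observed[-1] + 1]

    filled = list(trimmed)
    last_known = 0  # index of the most recent observed value
    for i, value in enumerate(trimmed):
        if value is None:
            continue
        gap = i - last_known
        if gap > 1:
            # Spread the change evenly over the missing positions.
            start, end = trimmed[last_known], value
            for k in range(1, gap):
                filled[last_known + k] = start + (end - start) * k / gap
        last_known = i
    return filled


# Hypothetical series: missing at both ends and twice in the middle.
data = [None, 10.0, None, None, 16.0, 18.0, None]
print(fill_missing_linear(data))
```

The head and tail gaps are dropped, and the two interior gaps are filled with evenly spaced values between 10 and 16.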

Data preprocessing (defining time variables) : Specify which variable is the time variable so that later procedures do not produce errors.

  1. With the data open in SPSS, click: Data → Define Date and Time
  2. Select a case type and specify a start time.

Drawing a time series plot in SPSS :

  1. Click in sequence: Analyze → Forecasting → Sequence Charts
  2. Select the time variable and the dependent variable.
    The finished chart can be double-clicked to edit and refine it.

Time series decomposition model

The premise of time series decomposition : time series decomposition can only be used when the data has a period shorter than a year (for example monthly or quarterly data). In other words, the decomposition model cannot be used for annual data.

Decomposition elements of time series :

  • Long-term trend (T) : Statistical indicators show a continuous upward or downward trend over a long period of time, affected by long-term trend factors.
  • Seasonal variation (S) : The indicator value changes periodically with the seasons. "Season" is used here in a broad sense: the time unit is generally the month, quarter, or week, and cannot be the year.
  • Cyclic variation (C) : Unlike seasonal variation, the cycle often spans several years and appears on the graph as a wave-like, recurring movement.
  • Irregular variation (I) : Numerical changes caused by random factors. The effects of these factors are unpredictable and irregular. They can be regarded as the combined impact of many accidental factors on the time series, that is, the disturbance term in regression.

The above four changes are the decomposition results of the numerical changes in the time series. Sometimes these changes appear simultaneously in a time series, but sometimes only one or a few types may appear.

Additive (superposition) model and multiplicative (product) model :

  • Usage : If the four components are independent of each other, the additive model should be used; if they influence each other, the multiplicative model should be used.
  • Recommended use : If the seasonal fluctuations on the time series plot grow larger and larger as time goes by, the multiplicative model is recommended; if the seasonal fluctuations remain stable, the additive model is recommended. When there are no seasonal fluctuations, either decomposition can be used.

SPSS performs time series decomposition :

  1. Click in sequence: Analyze → Forecasting → Seasonal Decomposition
  2. Select the variables to decompose and specify whether the model is additive or multiplicative. For the moving average weight, choose "All points equal" if the period length is odd, and "Endpoints weighted by 0.5" if the period length is even.
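
To see what the two moving average weightings do, here is a minimal Python sketch of a classical additive decomposition. It is an illustration under stated assumptions, not SPSS's exact procedure: the function names and the quarterly data are hypothetical, the series is assumed to start at season 0, and it must cover at least two full periods.

```python
from typing import List, Optional


def centered_moving_average(y: List[float], period: int) -> List[Optional[float]]:
    """Trend estimate by moving average; None where the window does not fit."""
    half = period // 2
    out: List[Optional[float]] = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half: t + half + 1]
        if period % 2 == 1:
            out[t] = sum(window) / period  # odd period: all points equal
        else:
            # even period: period + 1 points, the two endpoints weighted by 0.5
            out[t] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return out


def additive_seasonal_factors(y: List[float], period: int) -> List[float]:
    """Average detrended value per season, centred so the factors sum to zero."""
    trend = centered_moving_average(y, period)
    sums = [0.0] * period
    counts = [0] * period
    for t, (value, tr) in enumerate(zip(y, trend)):
        if tr is not None:
            sums[t % period] += value - tr
            counts[t % period] += 1
    raw = [s / c for s, c in zip(sums, counts)]
    mean_raw = sum(raw) / period
    return [r - mean_raw for r in raw]


# Hypothetical quarterly data: linear trend plus a fixed seasonal pattern.
pattern = [5.0, -2.0, -4.0, 1.0]
series = [10.0 + 2.0 * t + pattern[t % 4] for t in range(12)]
factors = additive_seasonal_factors(series, period=4)
adjusted = [v - factors[t % 4] for t, v in enumerate(series)]  # seasonally adjusted
print([round(f, 6) for f in factors])
```

Because the example data is an exact trend plus a fixed pattern, the estimated factors recover the pattern and the seasonally adjusted series is the straight line 10 + 2t. For a multiplicative model the detrending step would divide instead of subtract.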

SPSS time series analysis results :

Interpretation of results :

  • Additive model : The seasonal factor of each season is the amount by which that season's indicator exceeds the annual average. A factor greater than zero means the season is above the annual average, and a factor less than zero means it is below.
  • Multiplicative model : The seasonal factor of each season is the ratio of that season's indicator to the annual average. A factor greater than one means the season is above the annual average, and a factor less than one means it is below.

How to use the results for prediction : Among the saved result variables, fit a function to the seasonally adjusted series and extrapolate it, then add the seasonal factor back (additive model) or multiply by it (multiplicative model) to obtain the forecast.

SPSS Expert Modeler : Finds the best-fitting model among the exponential smoothing and ARIMA models.

Exponential smoothing model

Simple exponential smoothing model :

  • Applicable situations : Suitable for time series without trend and seasonal components.
  • Prediction principle : Each smoothed value is a weighted sum of past observations. The closer an observation is to the current period, the greater its weight.
  • Model shortcomings : The simple exponential smoothing model can only forecast one period ahead; every later forecast simply repeats that value.
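
The weighting principle can be written as a short recursion. The sketch below assumes the smoothing parameter is called `alpha` and that the level is initialised with the first observation; both are illustrative choices, and SPSS estimates its own parameter and starting value.

```python
from typing import List


def simple_exponential_smoothing(y: List[float], alpha: float) -> List[float]:
    """Return the smoothed level after each observation."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = y[0]  # assumption: initialise with the first value
    levels = [level]
    for value in y[1:]:
        # New level = weighted mix of the newest observation and the old level.
        level = alpha * value + (1.0 - alpha) * level
        levels.append(level)
    return levels


observations = [10.0, 12.0, 11.0, 13.0]  # hypothetical data
levels = simple_exponential_smoothing(observations, alpha=0.5)
forecast = levels[-1]  # the same value is used for every future period
print(levels, forecast)
```

Expanding the recursion shows why recent data weighs more: an observation k periods back carries weight alpha * (1 - alpha)^k.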

Holt linear trend model :

  • Applicable conditions : linear trend, does not contain seasonal components.
  • Forecasting principle : There are two smoothing equations (level smoothing equation and trend smoothing equation) and one forecasting equation.
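
The two smoothing equations and the forecasting equation can be sketched as follows. The parameter names `alpha` and `beta` and the initial values (first observation as level, first difference as trend) are assumptions for illustration only.

```python
from typing import List, Tuple


def holt_linear(y: List[float], alpha: float, beta: float) -> Tuple[float, float]:
    """Return the final (level, trend) of Holt's linear trend method."""
    level = y[0]         # assumed initial level
    trend = y[1] - y[0]  # assumed initial trend
    for value in y[1:]:
        previous_level = level
        # Level smoothing equation
        level = alpha * value + (1.0 - alpha) * (previous_level + trend)
        # Trend smoothing equation
        trend = beta * (level - previous_level) + (1.0 - beta) * trend
    return level, trend


def holt_forecast(level: float, trend: float, steps: int) -> List[float]:
    """Forecasting equation: level + h * trend for h = 1..steps."""
    return [level + h * trend for h in range(1, steps + 1)]


history = [3.0, 5.0, 7.0, 9.0]  # hypothetical, perfectly linear data
level, trend = holt_linear(history, alpha=0.8, beta=0.2)
print(holt_forecast(level, trend, steps=3))
```

On perfectly linear data the method keeps the slope of 2, so the forecasts continue the line (approximately 11, 13, 15, up to floating-point rounding).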

Brown's linear trend model : A special case of Holt's linear trend model in which the level smoothing parameter and the trend smoothing parameter are assumed to be equal.

Damped trend model :

  • Applicable situation : The linear trend gradually weakens, and the series does not contain seasonal components.
  • Model principle : An extension of Holt's linear trend model. Holt's model tends to overestimate long-term forecasts because it extrapolates the trend indefinitely; the damped trend model flattens the trend as the forecast horizon grows.
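
A minimal sketch of the damping idea, assuming the commonly used damping parameter `phi` between 0 and 1 (the name is an assumption, not taken from these notes). Only the forecasting step is shown; the level and trend would come from the smoothing equations.

```python
from typing import List


def damped_trend_forecast(level: float, trend: float, phi: float, steps: int) -> List[float]:
    """Forecasts level + (phi + phi^2 + ... + phi^h) * trend for h = 1..steps."""
    if not 0.0 < phi <= 1.0:
        raise ValueError("phi must be in (0, 1]")
    forecasts = []
    damped_sum = 0.0
    for h in range(1, steps + 1):
        damped_sum += phi ** h  # each extra step adds a smaller increment
        forecasts.append(level + damped_sum * trend)
    return forecasts


# Hypothetical final level and trend.
undamped = damped_trend_forecast(level=100.0, trend=10.0, phi=1.0, steps=3)
damped = damped_trend_forecast(level=100.0, trend=10.0, phi=0.5, steps=3)
print(undamped)
print(damped)
```

With `phi = 1` the forecasts reduce to Holt's straight line; with `phi < 1` each step adds less than the one before, so long-term forecasts level off instead of growing without bound.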

Simple seasonal model : suitable for series with no trend and a stable seasonal component.

Winters' additive model : suitable for series with a linear trend and a stable seasonal component.

Winters' multiplicative model : suitable for series with a linear trend and a seasonal component that is not stable, that is, one whose amplitude changes with the level of the series.

ARIMA model

Stationary time series :

  • Advantages of stationary time series : Stationary time series are the easiest time series to process.
  • Three conditions must be met for a stationary time series : the mean is a fixed constant; the variance exists and is constant; the autocovariance depends only on the lag between two time points, not on the time points themselves;
  • Stationarity test : Generally, whether the time series is stationary can be judged by observing the time series diagram, or by hypothesis testing.

Note : The above requirements define covariance stationarity, also known as weak stationarity. There is also strict stationarity, but its requirements are too demanding in practice. Unless stated otherwise, "stationary" in time series analysis means weakly stationary.

  • White noise sequence : A weakly stationary time series with zero mean, constant variance, and no correlation between values at different time points is called a white noise sequence, so a white noise sequence is a special stationary time series. The disturbance term of a time series model is generally regarded as white noise.

Difference equation :

  • Definition : An equation that expresses a time series variable as a function of its own lag terms, time, and other variables is called a difference equation.
  • Homogeneous part of the difference equation : the part of the equation that contains only the variable itself and its lag terms.
  • Lag operator : a shorthand notation, usually written L, with L x_t = x_{t-1} and L^k x_t = x_{t-k}.

The homogeneous part of the difference equation can be converted into a characteristic equation, which has p roots. The moduli of these p roots determine whether the dependent variable sequence of the ARMA(p,q) model is stationary.
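
As a concrete case, take p = 2. For x_t = a1 x_{t-1} + a2 x_{t-2} + e_t the characteristic equation is λ² − a1 λ − a2 = 0, and the sequence is stationary when both roots have modulus less than 1. The sketch below assumes this form of the characteristic equation (if the equation is instead written with the lag polynomial, the condition becomes roots outside the unit circle). The coefficient names and values are hypothetical.

```python
import cmath
from typing import List


def ar2_characteristic_roots(a1: float, a2: float) -> List[complex]:
    """Roots of lambda^2 - a1*lambda - a2 = 0 for x_t = a1*x_{t-1} + a2*x_{t-2} + e_t."""
    discriminant = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return [(a1 + discriminant) / 2.0, (a1 - discriminant) / 2.0]


def is_stationary_ar2(a1: float, a2: float) -> bool:
    """Stationary when every characteristic root has modulus below 1."""
    return all(abs(root) < 1.0 for root in ar2_characteristic_roots(a1, a2))


print(is_stationary_ar2(0.5, 0.3))   # both roots inside the unit circle
print(is_stationary_ar2(0.5, 0.6))   # one root outside the unit circle
print(is_stationary_ar2(1.0, -0.5))  # complex roots, modulus about 0.707
```

Using `cmath.sqrt` keeps the check valid when the roots are complex, which is why the modulus rather than the absolute size of a real number is compared with 1.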

p-th order autoregressive model (AR(p) model) :

  • Model structure : Use its own lag terms as independent variables for regression analysis.
  • Applicability : Autoregression can only be used to predict economic phenomena related to its own previous period, that is, economic phenomena that are greatly affected by historical factors. Autoregression is not suitable for economic phenomena that are greatly affected by social factors.
  • Note : The AR model discussed here must be a stationary time series model. If the original sequence is not stationary, it must be converted into a stationary sequence before modeling can be performed.

Note : Whether an AR model is stationary can be checked with a specific criterion (the moduli of its characteristic roots). Some non-stationary series can be transformed into stationary series by differencing.
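
To make "regression on its own lag terms" concrete, here is a minimal sketch that fits an AR(1) model by ordinary least squares and iterates it forward. The function names and the noise-free sample series are hypothetical; real software typically uses maximum likelihood and handles higher orders.

```python
from typing import List, Tuple


def fit_ar1(y: List[float]) -> Tuple[float, float]:
    """Least-squares fit of y_t = c + phi * y_{t-1} + e_t; returns (c, phi)."""
    x = y[:-1]      # lagged values (independent variable)
    target = y[1:]  # current values (dependent variable)
    n = len(x)
    mean_x = sum(x) / n
    mean_t = sum(target) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (ti - mean_t) for xi, ti in zip(x, target))
    phi = sxy / sxx
    c = mean_t - phi * mean_x
    return c, phi


def forecast_ar1(last_value: float, c: float, phi: float, steps: int) -> List[float]:
    """Iterate the fitted equation forward, feeding each forecast back in."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


# Hypothetical noise-free series generated by y_t = 2 + 0.5 * y_{t-1}.
series = [10.0]
for _ in range(7):
    series.append(2.0 + 0.5 * series[-1])

c, phi = fit_ar1(series)
print(round(c, 6), round(phi, 6))
print(forecast_ar1(series[-1], c, phi, steps=2))
```

Since the sample data follows the equation exactly, the fit recovers c = 2 and phi = 0.5 up to rounding; |phi| < 1 is also the stationarity condition for this AR(1) model.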

Moving average model (MA model) :

  • Model stationarity : It can be proved that the MA(q) model is always stationary as long as q is finite.
  • Relationship with the AR model : An MA component can be introduced to reduce the parameter-estimation workload of a high-order AR model, so that the model has as few parameters as possible.

Autoregressive moving average model (ARMA model) :

  • Model principle : Combines the autoregressive process and the moving average process in one model.
  • Model stationarity : Stationarity is only related to the autoregressive AR part.
  • Model difficulty : It is difficult to correctly identify the order of the ARMA model.
  • Model parameter solution : The most commonly used parameter estimation method for the ARMA model is the maximum likelihood estimation method.

Model completeness test :
After a time series model has been estimated, its residuals need to be tested for white noise. If the residuals are white noise, the selected model has fully identified the patterns in the time series data, so the model is acceptable. If the residuals are not white noise, some information has not yet been identified and used, and the model needs to be modified to capture it.
SPSS calculates the p-value automatically. The null hypothesis is that the residuals are white noise; if the p-value is less than 0.05, the null hypothesis is rejected, which means the model has not fully identified the data and needs to be corrected.
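
One common white noise test is the Ljung-Box Q test. The sketch below assumes that statistic and computes only Q; the comparison uses the 5% chi-square critical value for 5 degrees of freedom (11.07) as an illustration. For residuals of a fitted model the degrees of freedom are usually reduced by the number of estimated parameters, which this sketch ignores. The sample residuals are hypothetical.

```python
from typing import List


def autocorrelation(x: List[float], lag: int) -> float:
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return numer / denom


def ljung_box_q(residuals: List[float], max_lag: int) -> float:
    """Q = n(n+2) * sum over k = 1..max_lag of r_k^2 / (n - k)."""
    n = len(residuals)
    total = sum(autocorrelation(residuals, k) ** 2 / (n - k)
                for k in range(1, max_lag + 1))
    return n * (n + 2) * total


CHI2_CRITICAL_5DF = 11.07  # 5% critical value, chi-square with 5 degrees of freedom

trending = [float(t) for t in range(20)]  # hypothetical residuals, clearly not white noise
q = ljung_box_q(trending, max_lag=5)
# A Q above the critical value corresponds to p < 0.05: reject "residuals are white noise".
print(q > CHI2_CRITICAL_5DF)
```

Residuals that still contain a trend have large autocorrelations at every lag, so Q is far above the critical value and the model would need to be corrected.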

ARIMA model : Autoregressive integrated moving average model, also called the differenced ARMA model. The original time series is first differenced to make it stationary, and an ARMA model is then fitted to the differenced series.
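
The differencing step can be sketched in a few lines. The function names and the quadratic sample series are hypothetical; the point is only to show how repeated differencing removes a trend and how forecasts on the differenced scale are converted back.

```python
from typing import List


def difference(y: List[float], order: int = 1) -> List[float]:
    """Apply first differencing `order` times: d_t = y_t - y_{t-1}."""
    result = list(y)
    for _ in range(order):
        result = [b - a for a, b in zip(result, result[1:])]
    return result


def undo_first_difference(first_value: float, diffs: List[float]) -> List[float]:
    """Rebuild the original series by cumulatively summing the differences."""
    series = [first_value]
    for d in diffs:
        series.append(series[-1] + d)
    return series


quadratic = [float(t * t) for t in range(6)]  # 0, 1, 4, 9, 16, 25
print(difference(quadratic, order=1))  # still trending upward
print(difference(quadratic, order=2))  # constant, so the trend is gone
```

Each differencing pass shortens the series by one value. The number of passes needed to reach stationarity is the d in ARIMA(p,d,q).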

SARIMA model : A model generated by including additional seasonal terms in the ARIMA model.

Steps to use SPSS Expert Modeler

Expert modeler principle : The Expert Modeler in SPSS automatically finds the best-fitting model for each dependent series. If independent variables are specified, the Expert Modeler selects, for inclusion in ARIMA models, those that have a statistically significant relationship with the dependent series. Model variables are transformed where appropriate using differencing, a square root transformation, or a natural log transformation. By default the Expert Modeler considers both exponential smoothing and ARIMA models, but you can limit the search to ARIMA models only or to exponential smoothing models only, and you can also request automatic detection of outliers.

1. Click in sequence: Analyze → Forecasting → Create Traditional Models

2. Set the dependent variable of the time series. (You can also manually restrict the search to exponential smoothing models only or ARIMA models only.)

3. Set up automatic detection and handling of outliers. Click: Criteria → Outliers → Detect outliers automatically, and check all outlier types.

4. Click Statistics and check Parameter Estimates.

5. Click Plots and check Fit values, Residual autocorrelation function (ACF), and Residual partial autocorrelation function (PACF).

6. Click Save and check Predicted Values.

7. Click Options, select "First case after end of estimation period through a specified date", and enter the end date.

Remarks :
① The difference between predicted values and fitted values : Fitted values are the model's reproduction of the existing time series, while predicted values are forecasts of future development.
② Purpose of keeping the ACF and PACF : to determine whether the residuals are white noise. If they are, the time series model can be considered to have fully identified the data.
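
The residual ACF check can be sketched as follows. The band of ±1.96/√n is a common large-sample approximation for 95% limits and is an assumption here; the limits drawn by SPSS may be computed differently. The residuals are hypothetical.

```python
import math
from typing import List


def acf(x: List[float], max_lag: int) -> List[float]:
    """Sample autocorrelations r_1..r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)
    return [sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


def lags_outside_bounds(x: List[float], max_lag: int) -> List[int]:
    """Lags whose autocorrelation falls outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(x))  # assumed band: +/- 1.96 / sqrt(n)
    return [k for k, r in enumerate(acf(x, max_lag), start=1) if abs(r) > bound]


# Hypothetical residuals with a leftover period-4 pattern.
residuals = [1.0, 0.0, -1.0, 0.0] * 6
print(lags_outside_bounds(residuals, max_lag=4))
```

An empty list would suggest white noise residuals. Here lags 2 and 4 are flagged, which points to a seasonal pattern the model has not captured.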

Expert modeler running results :

  • Model type :
    The table shows that the best model is Winters' additive model. The time series therefore contains a linear trend and a stable seasonal component, so additive time series decomposition can also be used.

  • Model fitting degree :
    Model evaluation indicators :
    ① Stationary R square and normalized BIC : Either stationary R² or normalized BIC (the BIC criterion) can be used. Stationary R² compares the model with a simple baseline model, which makes it preferable to ordinary R² when the series has a trend or seasonal pattern (larger is better). Normalized BIC takes into account both the quality of the fit and the complexity of the model (smaller is better).
    ② R square : R² can also be used to evaluate the quality of the fit. The closer it is to 1, the better the fit.

  • Model Statistics :

    Significance : the p-value of the white noise test on the residuals. If it is less than 0.05, the residual sequence is not white noise; if it is greater than 0.05, the residual sequence can be regarded as white noise.
    Number of outliers : the number of outliers detected in the series.

  • Model parameters : (the parameters of Winters' additive model are shown here)

  • ACF and PACF plots :
    If all the blue bars lie between the two black lines in both the ACF and PACF plots, the residuals can be regarded as white noise and the model has fully identified the series.

  • Fitted plot : the fitted and predicted values plotted together with the observed series.
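
The fit measures listed under model fitting degree can be computed by hand as a sanity check. In the sketch below, R² is the ordinary definition; the normalized BIC formula ln(MSE) + k·ln(n)/n is an assumed form, and MSE is divided by n here for simplicity, whereas statistical software may divide by n − k. The data and function names are hypothetical.

```python
import math
from typing import List


def r_squared(actual: List[float], fitted: List[float]) -> float:
    """1 - SSE / SST; closer to 1 means a better fit."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst


def normalized_bic(actual: List[float], fitted: List[float], n_params: int) -> float:
    """Assumed form: ln(MSE) + n_params * ln(n) / n; smaller is better."""
    n = len(actual)
    mse = sum((a - f) ** 2 for a, f in zip(actual, fitted)) / n
    return math.log(mse) + n_params * math.log(n) / n


actual = [2.0, 4.0, 6.0, 8.0]  # hypothetical observations
fitted = [2.5, 3.5, 6.5, 7.5]  # hypothetical fitted values
print(round(r_squared(actual, fitted), 4))
print(round(normalized_bic(actual, fitted, n_params=2), 4))
```

The second term of the BIC grows with the number of parameters, so of two models with the same fit error the simpler one gets the smaller (better) value.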

Origin blog.csdn.net/hanmo22357/article/details/128799905