An article explaining the stationarity of time series

This article is reprinted from Zhihu; author: StormBlafe; editor: Time Series Man

Original link: http://zhuanlan.zhihu.com/p/652229829

1

Why should we care about the stationarity of time series?

Reason one: the data structure of time series data differs from that of traditional statistical data. The biggest difference is that a traditional random variable can yield many observations (dice rolls, for example, can be repeated to obtain many observations, ignoring any difference in time), whereas in time series data each random variable has only one observation (if the daily closing price is the random variable under study, there is only one closing price per day, and prices on different days follow different distributions, precisely because time matters). Each distribution thus yields only a single observation, far too few to study the distribution's properties. Stationarity, however, establishes intrinsic connections between the distributions on different dates, which alleviates the poor estimation accuracy caused by the tiny sample size.

Reason two: the ultimate purpose of studying time series is to predict the future. But the future is unknowable; all the data we have are historical, so we can only use historical data to predict the future. If there were no "similarity" between past data and future data, such prediction would be meaningless. Stationarity is what guarantees the similarity between past and future: if the data are stationary, properties displayed by past data can be expected to appear in the future as well.

Future data are therefore easier to model when they resemble the present. Stationarity captures the idea that the statistical characteristics of a time series do not change over time. Some time series forecasting models, such as autoregressive models, require a stationary series precisely because constant statistical properties make it easier to model. So if a time series is not stationary, you should try to make it stationary.
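As a standard illustration (a textbook fact, not from the original article): the first-order autoregressive model AR(1) has a causal stationary solution precisely when its coefficient is smaller than 1 in absolute value; at φ = 1 it degenerates into the random walk discussed later in this article:

$$X_t = \phi X_{t-1} + \varepsilon_t, \qquad |\phi| < 1 \;\Leftrightarrow\; \text{a (causal) stationary solution exists.}$$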

2

What is strict stationarity?

For a time series {X_t}, each data point X_t is a random variable with its own distribution (as shown in the figure).

[Figure: each X_t in the series follows its own distribution]

Taking m consecutive data points, X_1 through X_m, we can form an m-dimensional random vector, (X_1, X_2, ..., X_m).

Since each individual random variable has its own distribution, this m-dimensional random vector also has its own joint distribution.

The essence of strict stationarity is that this joint distribution does not change over time.

That is to say, whichever m consecutive data points are fetched (whether X_1 through X_m, or X_t through X_{t+m−1}), the joint distribution of the m-dimensional vector they form is the same.

Now relax one more condition and let the value of m be arbitrary.

That is, no matter how wide the data window is set, as long as the same number of consecutive points are taken, the joint distributions they form are the same.

For example, (X1, X2, X3) and (X6, X7, X8) have the same 3-dimensional joint distribution, (X1, X2, X3, X4) and (X6, X7, X8, X9) have the same 4-dimensional joint distribution, and so on.

In summary, a time series that meets the above properties is strictly stationary.
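In symbols, the strict stationarity requirement is the standard textbook formulation: for every m, every set of time points t_1, ..., t_m, and every shift τ, the joint distribution function is shift-invariant:

$$F_{X_{t_1},\dots,X_{t_m}}(x_1,\dots,x_m) \;=\; F_{X_{t_1+\tau},\dots,X_{t_m+\tau}}(x_1,\dots,x_m).$$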

3

Given strict stationarity, why do we still need wide stationarity?

In many cases, we have no way of knowing what the distribution of these random variables looks like.

The data we observe are just single realizations of random variables that follow unknown distributions.

If even the distribution of a single random variable is hard to determine, the joint distribution of a multi-dimensional vector composed of many random variables is harder still.

Therefore, although strict stationarity is a great way to ensure that past and future data are "similar", it is too idealistic: in practice it is difficult to test whether a time series is strictly stationary.

So the conditions had to be relaxed, giving rise to the concept of "wide stationarity" (also called weak stationarity).

4

What is "k-order moment"

"Moment" is a characteristic number of random distribution. The number of features, as the name suggests, reflects some characteristics of a random distribution. For example, "mathematical expectation" reflects that the value of a random variable that conforms to a certain distribution always fluctuates around a certain value; while "variance" reflects the magnitude of this fluctuation.

Moments are divided into origin moments and central moments. The first-order origin moment is the mathematical expectation, and the second-order central moment is the variance.

Usually, moments of order two and below are called low-order moments, and those above order two are called high-order moments.

However, origin moments and central moments can be derived from one another; knowing one, you can compute the other, so they are often simply called "moments".

The k-th order origin moment of a random variable is defined as the mathematical expectation of its k-th power, that is, E(X^k). What we usually call "the k-th moment exists" means that this expectation is finite (not infinite), in the same sense as "the limit exists".
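For completeness, the k-th central moment is defined analogously around the mean, and at order two the two kinds of moments are linked by a familiar identity:

$$\mu_k = E\big[(X - EX)^k\big], \qquad \mathrm{Var}(X) = E(X^2) - [E(X)]^2.$$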

It is worth noting that if some higher-order moment of a random variable exists, then every lower-order moment must also exist, because |X|^(k−1) ≤ |X|^k + 1 implies E|X|^(k−1) ≤ E|X|^k + 1 < ∞.

Since under strict stationarity the joint distributions are identical, the moments of every order (when they exist) are identical as well.

5

What is wide stationarity?

Wide stationarity is defined through the characteristic statistics of a series, on the view that the statistical properties of a series are mainly determined by its low-order moments.

When the time series meets the following three conditions:

The first condition: the second-order moment exists (is finite) at every time point.

The second condition: the expectation (first-order moment) of the random variable does not change over time. To put it bluntly, the mean μ does not vary with time t.

The third condition: the covariance between the random variables at two time points depends only on the time difference between them, not on where the window sits in time. To put it bluntly, as long as the window width (the lag between the two points) is fixed, the autocovariance (and hence the autocorrelation coefficient) is the same.

it is called widely stationary.
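In symbols, the three conditions read (γ denotes the autocovariance function; note that the third condition with k = 0 gives Var(X_t) = γ(0), which is used in section 6):

$$E(X_t^2) < \infty, \qquad E(X_t) = \mu \ \ \text{for all } t, \qquad \mathrm{Cov}(X_t,\, X_{t+k}) = \gamma(k) \ \ \text{for all } t, k.$$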

Because of the conditions involved in the definition, wide stationarity is also called covariance stationarity or second-order stationarity.

6

Some conclusions on stationarity

If a time series is stationary, then:

The mean is a constant independent of t. That is, the random variables in the distributions at different time points fluctuate around the same value. In the time series plot (horizontal axis the time axis, vertical axis the variable's value), the whole graph fluctuates around a horizontal line (similar to the chart in political economy where prices fluctuate around value).

The variance is a constant independent of t. This is not explicit in the definition, but the definition says the autocovariance depends only on the window width (the lag) and not on the window position, i.e. not on time t. So simply take a window of width 0: the covariance between the data at two time points one window apart becomes the covariance of the data at a single time point with itself, which is exactly the variance. Since the lag-0 autocovariance is the same at every t, the variance is constant (and, of course, the correlation coefficient of a variable with itself is 1).

The autocovariance between any two time points is a constant that depends only on the lag between them, not on t.

What is the relationship between strict and wide stationarity?

The essence of strict stationarity is to constrain the distributions of the series, while the essence of wide stationarity is to constrain its low-order moments.

Since the conditions for wide stationarity are looser than those for strict stationarity, strict stationarity generally implies wide stationarity, while wide stationarity does not imply strict stationarity. But there are exceptions.

Wide stationarity requires the existence of second-order moments, whereas strict stationarity does not.

Therefore, a strictly stationary series without second-order moments cannot satisfy wide stationarity. For example, a strictly stationary sequence of i.i.d. Cauchy random variables is not widely stationary (its first and second moments do not exist, so wide stationarity cannot even be checked).
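A quick numerical illustration (a sketch only: an i.i.d. Cauchy sequence is strictly stationary, yet because it has no finite first moment, its running sample mean never settles down the way a finite-mean series' would):

import numpy as np

np.random.seed(0)
x = np.random.standard_cauchy(size=100_000)    # i.i.d. Cauchy draws: strictly stationary
running_mean = np.cumsum(x) / np.arange(1, len(x) + 1)
print(running_mean[[99, 9_999, 99_999]])       # does not converge; occasional huge jumps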

Only when second-order moments exist does a strictly stationary series also satisfy wide stationarity.

Special case: when the series follows a multivariate normal distribution, wide stationarity does imply strict stationarity.

The reason is that for a normal time series, stationarity of the second-order moments is equivalent to stationarity of the distribution: the density function shows that an n-dimensional normal distribution is completely determined by its mean vector and autocovariance matrix.

Normal time series

If any n (finite) random variables taken from a time series form an n-dimensional random vector that follows an n-dimensional normal distribution, the series is called a normal time series. That is the special case above.

7

How to check stationarity?

You can test the stationarity of a time series in two ways:

  • Intuitive method: visual assessment

  • Statistical method: unit root test

We will create several examples that illustrate the visual assessment of stationarity, using the methods described in Hyndman and Athanasopoulos' time series textbook Forecasting: Principles and Practice, and then explain stationarity testing with unit root tests. The data come from R's fma package.

[Figure: the example time series used in this article]

Method 1: Intuitively assess stationarity

The simplest method is to split the time series in half and compare the mean, amplitude, and cycle length of the first half with those of the second half:

  • Constant mean - the mean of the first half of the series should be similar to the mean of the second half.

  • Constant variance - the amplitude of the first half of the series should be similar to that of the second half.

  • Covariance independent of time - the cycle length in the first half of the series should be similar to that in the second half, and cycles should not be tied to fixed calendar intervals (e.g. weekly or monthly). A minimal code sketch of the split-half check follows this list.
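A minimal sketch of the split-half comparison, assuming df["example"] holds the series under study (the same hypothetical dataframe used in the test snippets later):

import numpy as np

series = df["example"].values
half = len(series) // 2
first, second = series[:half], series[half:]

print("means:    ", np.mean(first), np.mean(second))   # similar if the mean is constant
print("variances:", np.var(first), np.var(second))     # similar if the variance is constant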

[Figure: visual assessment of constant mean, constant variance, and time-independent covariance]

For our example, the evaluation results are shown below:

[Figure: visual assessment results for the example datasets]

Method 2: Statistical assessment of stationarity—unit root test

A unit root introduces a stochastic trend, known as a "random walk with drift". Since randomness cannot be predicted, this means:

  • unit root exists: non-stationary (unpredictable)

  • unit root does not exist: stationary

To test stationarity via unit roots, either of these two hypotheses can serve as the null hypothesis, depending on the test:

  • Null Hypothesis (H0) - The time series is stationary (no unit root exists)

  • Alternative Hypothesis (H1) - The time series is not stationary (unit root exists)

Rejection of the null hypothesis is then assessed using one of the following two methods:

p-value method:

If the p-value > 0.05, the null hypothesis cannot be rejected. If the p-value ≤ 0.05, reject the null hypothesis.

Critical value method:

If the test statistic is less extreme than the critical value, the null hypothesis cannot be rejected; if it is more extreme than the critical value, the null hypothesis is rejected. The critical value method should be used when the p-value is close to 0.05.

There are several unit root tests that can be used to check stationarity. This article will focus on the 2 most popular ones:

the Augmented Dickey-Fuller test and the Kwiatkowski-Phillips-Schmidt-Shin test.

Method 3: Augmented Dickey-Fuller test


The hypotheses of the Augmented Dickey-Fuller test are:

H0: The time series is not stationary because there is a unit root (if p-value > 0.05)

H1: The time series is stationary because there is no unit root (if p-value ≤ 0.05)

In Python, we can directly use the adfuller function from the statsmodels.tsa.stattools module.

from statsmodels.tsa.stattools import adfuller

result = adfuller(df["example"].values)
print(f"ADF statistic: {result[0]:.3f}, p-value: {result[1]:.3f}")

If we can reject the null hypothesis of the ADF test, the time series is stationary:

[Figure: ADF test decision rule]

The following are the ADF test results of the sample data set:

[Figure: ADF test results for the example datasets]

Method 4: Kwiatkowski-Phillips-Schmidt-Shin test

The hypotheses of the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test are [4]:

H0: The time series is stationary because there is no unit root (if p-value > 0.05)

H1: The time series is not stationary because there is a unit root (if p-value ≤ 0.05)

For the kpss method in the statsmodels.tsa.stattools library, we need to pass the parameter regression="ct" to specify that the null hypothesis of the test is that the data are trend stationary.

from statsmodels.tsa.stattools import kpss

result = kpss(df["example"].values, regression="ct")
print(f"KPSS statistic: {result[0]:.3f}, p-value: {result[1]:.3f}")

If we cannot reject the null hypothesis of the KPSS test, the time series is stationary:

[Figure: KPSS test decision rule]

The following are the KPSS test results of the sample data set:

[Figure: KPSS test results for the example datasets]

8

Non-stationary time series data processing

We can apply different transformations to a non-stationary time series to bring it closer to stationarity. Because there are several types of stationarity, we can combine the ADF and KPSS tests to decide which transformation to apply:

  • If the ADF test says stationary and the KPSS test says non-stationary, the series is difference stationary: apply differencing and check stationarity again.

  • If the ADF test says non-stationary and the KPSS test says stationary, the series is trend stationary: remove the trend and check stationarity again.

A combined check might look like the sketch below.
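A minimal sketch of this decision logic, assuming df["example"] holds the series as in the earlier snippets (both tests thresholded at 0.05; kpss here uses its default null of level stationarity):

from statsmodels.tsa.stattools import adfuller, kpss

values = df["example"].values
adf_p = adfuller(values)[1]               # ADF: H0 = unit root (non-stationary)
kpss_p = kpss(values, regression="c")[1]  # KPSS: H0 = (level) stationary

adf_stationary = adf_p <= 0.05            # reject the unit root
kpss_stationary = kpss_p > 0.05           # fail to reject stationarity

if adf_stationary and not kpss_stationary:
    print("Difference stationary: try differencing")
elif not adf_stationary and kpss_stationary:
    print("Trend stationary: try detrending")
elif adf_stationary and kpss_stationary:
    print("Stationary")
else:
    print("Non-stationary: try differencing and/or transformations")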


Differencing

Differencing computes the difference between two consecutive observations. It stabilizes the mean of a time series and thereby reduces trend.

df["example_diff"] = df["example"].diff()

[Figure: the example series after differencing]

Detrending through model fitting

One way to remove the trend from a non-stationary time series is to fit a simple model (e.g., linear regression) to the data and then work with the residuals of that fit.

import numpy as np
from sklearn.linear_model import LinearRegression

# Fit a simple model (e.g., a linear model) of the series against time
X = np.arange(len(df)).reshape(-1, 1)  # time index as the single feature
y = df["example"].values
model = LinearRegression()
model.fit(X, y)

# Calculate the fitted trend
trend = model.predict(X)

# Detrend: subtract the fitted trend from the series
df["example_detrend"] = df["example"].values - trend

The result is as follows:

[Figure: the example series after detrending]

As can be seen, detrending does not by itself turn the series from non-stationary to stationary, but it does bring it one step closer.


Logarithmic transformation

The logarithmic transformation stabilizes the variance of the time series.

df["example_diff"] = np.log(df["example"].value)

The results are as follows:

[Figure: the example series after the log transformation]

As you can see, neither model-fit detrending nor the logarithmic transformation alone makes our example time series stationary. It is therefore often necessary to combine techniques: for example, first apply the log transformation to stabilize the variance, then detrend to stabilize the mean, and so finally arrive at a stationary series (see the sketch below).
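A sketch of that combination, assuming df["example"] is strictly positive (a prerequisite for the log transform) and reusing the linear detrending idea from above:

import numpy as np
from sklearn.linear_model import LinearRegression

# Step 1: log transform to stabilize the variance
log_y = np.log(df["example"].values)

# Step 2: fit and remove a linear trend to stabilize the mean
X = np.arange(len(log_y)).reshape(-1, 1)
trend = LinearRegression().fit(X, log_y).predict(X)
df["example_log_detrend"] = log_y - trend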

9

Several common and special stationary and non-stationary sequences

stationary time series

  • White noise

One of the simplest stationary time series is white noise: an independent and identically distributed series with zero mean and constant variance, denoted {ε_t} ~ WN(0, σ²).

When ε_t furthermore follows a normal distribution with mean 0 and variance σ², i.e. ε_t ~ N(0, σ²), the series is called Gaussian white noise or normal white noise.

Any two terms ε_s and ε_t have the same mean and the same variance and are independent, so their covariance is 0; hence a white noise series is stationary.

When a series is white noise, there is no correlation between its past and its future: past behavior has no influence on future development. From the standpoint of statistical analysis there is nothing worth modeling, and future trends cannot be predicted, because the values of white noise are completely random.

In that case, forecasting every future value with the mean is the choice with the smallest residual error.
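In symbols, this is the standard decomposition of mean squared error, which is minimized by choosing the mean as the constant forecast:

$$E\big[(X - c)^2\big] = \mathrm{Var}(X) + \big(E[X] - c\big)^2, \qquad \text{minimized at } c = E[X].$$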

Only when a series is stationary and not white noise does it make sense to apply analysis methods such as ARMA.

Usually, after modeling a time series, we also run a white noise test on the residual series. If the residuals are white noise, the model has extracted all the valuable information in the original series; if they are not white noise, the adequacy of the model should be re-examined.

A white noise series can be generated with a single line of code:
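For example, with NumPy (the original article's snippet is not shown, so this is a minimal stand-in; the generation itself is the single line after the import):

import numpy as np

white_noise = np.random.standard_normal(size=1000)  # i.i.d. N(0, 1): Gaussian white noise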

  • Non-white noise

Stationary time series are more than just white noise. Stationary non-white-noise series do exist in real life, though they are rare. Don't be discouraged: many series can be transformed into stationary non-white-noise series after simple processing. Intuitively, if the overall mean and variance of the data do not change much, the series can generally be considered stationary. If that judgment feels too arbitrary, stationarity can indeed be tested with statistical quantities, such as the ADF and KPSS tests.

After seasonal differencing, China's quarterly GDP data since 2006 can be considered a stationary time series.

import pandas as pd
import akshare as ak
from matplotlib import pyplot as plt

df = ak.macro_china_gdp()               # quarterly GDP data for China
df = df.set_index('季度')                # index by quarter ('季度')
df.index = pd.to_datetime(df.index)

# GDP level ('国内生产总值-绝对值'), reversed into chronological order
gdp = df['国内生产总值-绝对值'][::-1].astype('float')

gdp_diff = gdp.diff(4)                  # seasonal (lag-4) difference

plt.figure(figsize=(12, 6))
gdp_diff.plot()
plt.show()

[Figure: seasonally differenced quarterly GDP of China]

Non-stationary time series

Most time series are non-stationary, and they can generally be converted into stationary series through differencing, log transformation, and similar methods. If that cannot be achieved, stationary time series analysis methods cannot be used. Various non-stationary time series analysis methods do exist, and the quality of their predictions depends heavily on the analyst, but they are not as worry-free as stationary time series analysis.

For example, some stocks' closing price data are non-stationary. The picture below shows the daily closing prices of Lai Yifen (stock code 603777) from 2019 to 2021. The overall trend shows no obvious pattern, and the fluctuations differ from period to period, so it can be considered a non-stationary time series. If only it were stationary and not white noise.

import pandas as pd
import akshare as ak
from matplotlib import pyplot as plt

# Daily bars for Lai Yifen (stock code 603777)
df = ak.stock_zh_a_hist(symbol="603777", start_date="20190101", end_date='20210616')
df = df.set_index('日期')               # index by date ('日期')
df.index = pd.to_datetime(df.index)

close = df['收盘'].astype(float)         # closing price ('收盘')
close = close[::-1]                     # reverse the row order

plt.figure(figsize=(12, 6))
plt.plot(close)
plt.show()

[Figure: daily closing prices of Lai Yifen (603777), 2019-2021]

  • Random walk

There is a special type of non-stationary time series called the random walk, which is simple and interesting.

A simple random walk process is defined as:

X_t = X_{t-1} + ε_t

where ε_t is white noise with zero mean.

Use two lines of code to simulate the random walk process:

import numpy as np
from matplotlib import pyplot as plt

y = np.random.standard_normal(size=1000)  # white-noise steps
y = np.cumsum(y)                          # cumulative sum of the steps: a random walk

plt.figure(figsize=(12, 6))
plt.plot(y)
plt.show()

[Figure: a simulated random walk]

There is a concept in finance called the efficient market hypothesis, which holds that stock prices follow a random walk. In other words, a price series like the closing-price example we gave earlier behaves as a random walk. Compare the random walk chart above with the earlier stock price chart and see whether it rings true.

A random walk has perfect memory of everything that happened in the past. Like a drunkard's walk, each step is taken at random from wherever the previous step ended, so the walk accumulates every step taken since the starting point: it leaves home, occasionally wanders back, and nobody knows where it will end up. Its mean is zero, but its variance grows without bound over time.
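Writing the walk as a cumulative sum makes both claims explicit (assuming X_0 = 0 and steps of variance σ²):

$$X_t = \sum_{i=1}^{t} \varepsilon_i, \qquad E(X_t) = 0, \qquad \mathrm{Var}(X_t) = t\,\sigma^2 \xrightarrow{\,t \to \infty\,} \infty.$$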

import numpy as np
from matplotlib import pyplot as plt

np.random.seed(5)

def random_walk():
    steps = np.random.standard_normal(size=1000)  # white-noise steps
    steps[0] = 0                                  # start the walk at the origin
    walk = np.cumsum(steps)                       # accumulate the steps
    return walk

plt.figure(figsize=(12, 6))
plt.plot(random_walk())
plt.plot(random_walk())
plt.show()

[Figure: two simulated random walks]

There is another random walk phenomenon in everyday life, summed up as "gamble long enough and you are bound to lose". The outcome of each bet is uncertain (assume a 50% win rate, an even-odds game), so each win or loss is a random variable and can be viewed as a step; the money in hand changes with every bet, Y_t = Y_{t-1} + X_t, so a gambler's bankroll follows a random walk model.

The bankroll wanders like the curves in the chart above, but it cannot wander forever. When the curve touches the lower bound (all the money in hand is lost), the game is over; if the curve touches the upper bound (the banker is cleaned out because the gambler was too greedy to stop winning), the game also ends. If the banker's money were infinite, the gambler would be certain to lose everything eventually, and however little the banker has, it is still far more than the gambler's principal. So the upper bound is nowhere in sight while the lower bound is very close, and the probability of the curve wandering up to the upper bound first is almost zero. In the end it is the gambler who loses everything, all the more so because the win rate is usually below 50%.

  • Random walk with drift term

A random walk with a drift term is just a random walk with a constant added, nothing more:

X_t = μ + X_{t-1} + ε_t

where μ is a constant, called the displacement term or drift term.

The drift term gives the random walk a long-term trend whose slope corresponds to the drift: a positive drift produces an upward trend, a negative drift a downward trend.
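Unrolling the recursion shows exactly where the trend comes from (again assuming X_0 = 0):

$$X_t = \mu t + \sum_{i=1}^{t} \varepsilon_i, \qquad E(X_t) = \mu t.$$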

The simulation code is similar:

import numpy as np
from matplotlib import pyplot as plt

np.random.seed(123)

y = np.random.standard_normal(size=100)  # white-noise steps
y1 = np.cumsum(0.2 + y)                  # random walk with drift +0.2
y2 = np.cumsum(-0.2 + y)                 # random walk with drift -0.2

l1 = np.cumsum(0.2 * np.ones(len(y1)))   # deterministic trend line, slope +0.2
l2 = np.cumsum(-0.2 * np.ones(len(y2)))  # deterministic trend line, slope -0.2

plt.figure(figsize=(12, 6))
plt.plot(y1)
plt.plot(y2)
plt.plot(l1)
plt.plot(l2)
plt.show()

[Figure: random walks with positive and negative drift, with their trend lines]

Whether it is a simple random walk or a random walk with a drift term, differencing converts it into a purely random stationary series: white noise.

The first difference of a random walk is white noise:

∇X_t = X_t − X_{t-1} = ε_t

The first difference of a random walk with a drift term is white noise plus the constant μ:

∇X_t = X_t − X_{t-1} = μ + ε_t

10

Summary

In time series forecasting, a series whose statistical properties (mean, variance, and covariance) are constant over time is described as stationary. Thanks to these stable statistical characteristics, stationary time series are easier to model than non-stationary ones, which is why many forecasting models assume stationarity.

Stationarity can be checked visually or statistically. The statistical approach checks for unit roots; the two most popular unit root tests are ADF and KPSS, both available in Python's statsmodels.tsa.stattools module.

If the time series is non-stationary, you can try to make it closer to stationary by differencing, log transformation, or detrending.
