Time series forecasting (Long Time Series Forecasting): predictions are made sequentially with a sliding window
Short-term prediction: Prophet (good at capturing trends), ARIMA (accurate on short sequences, but weak at capturing trends)
Long-term prediction: Informer
Basic definitions:
Regular components: long-term trend, seasonal variation, cyclical variation, irregular variation
Additive model: the amplitude of fluctuations in the time series plot stays roughly constant
Multiplicative model: a series whose seasonal fluctuations grow with the level of the series
Four basic models: the exponential smoothing model, the autoregressive model (AR), the moving average model (MA), and the autoregressive moving average model (ARMA)
Traditional methods
Basic steps of each model
1. Classic
1> Time series decomposition
Decompose the series into three parts: a trend component, a seasonal component, and a residual, combined either additively or multiplicatively; a multiplicative model can be turned into an additive one by taking logarithms
2> Common decomposition methods
Moving average method: estimate the cycle length; average the data over windows of one cycle to obtain the trend; subtract the trend from the original data; average the detrended values at corresponding positions across cycles to obtain the seasonal component; what remains is the residual
X11: this method is robust and can handle outliers and abrupt shifts in the time series
2. ARIMA
Its essence is AR + I (differencing) + MA
First determine the order of differencing d so that the series is stationary, then decide whether an AR, MA, or ARMA model fits best
The optimal p and q can be found by grid search, generally using the AIC/BIC information criteria as the selection metric
3. Holt model
Data without an obvious trend or seasonality can be handled with simple exponential smoothing (first-order Holt)
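Simple (first-order) exponential smoothing can be written out by hand: s_t = alpha * x_t + (1 - alpha) * s_{t-1}, and the flat forecast for every future step is the last smoothed value. The alpha of 0.3 and the toy series here are hypothetical choices:

```python
def simple_exp_smoothing(x, alpha=0.3):
    """First-order exponential smoothing: s_t = alpha*x_t + (1-alpha)*s_{t-1}."""
    s = [x[0]]  # initialize the level with the first observation
    for value in x[1:]:
        s.append(alpha * value + (1 - alpha) * s[-1])
    return s

series = [10, 12, 11, 13, 12, 14]
smoothed = simple_exp_smoothing(series)
forecast = smoothed[-1]  # flat forecast: repeat the last smoothed level
```

Adding a second smoothed equation for the trend gives Holt's linear method; statsmodels also ships ready-made implementations in `statsmodels.tsa.holtwinters`.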
Transformer
(the input is reconstructed through the QKV attention computation)
(attention mechanism: the attention itself should be efficient, the decoder should produce its output in a single pass, and the stacked encoders should stay efficient)
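The QKV computation at the heart of the Transformer is scaled dot-product attention: the input is re-expressed as a weighted sum of the values V, with weights softmax(QK^T / sqrt(d_k)). A NumPy sketch with illustrative shapes:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) @ V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (seq_q, seq_k) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = attention(Q, K, V)
# each row of w is a probability distribution over the 4 key positions
```

In practice Q, K, and V are linear projections of the same input sequence (self-attention), and multiple such heads run in parallel.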
RNN (Recurrent Neural Network)
The input data are no longer independent of each other; earlier data can influence later data
(Not limited to time series; text and video data also apply)
Still a sliding window (Markov assumption)
LSTM
Basic structure of the model
import torch
import torch.nn as nn

class LSTM(nn.Module):
    def __init__(self, input_size=1, hidden_layer_size=100, output_size=1):
        super().__init__()
        self.hidden_layer_size = hidden_layer_size
        self.lstm = nn.LSTM(input_size, hidden_layer_size)
        self.linear = nn.Linear(hidden_layer_size, output_size)
        # (h_0, c_0), each of shape (num_layers * num_directions, batch_size, hidden_size)
        self.hidden_cell = (torch.zeros(1, 1, self.hidden_layer_size),
                            torch.zeros(1, 1, self.hidden_layer_size))

    def forward(self, input_seq):
        # reshape to (seq_len, batch=1, input_size), the layout nn.LSTM expects by default
        lstm_out, self.hidden_cell = self.lstm(
            input_seq.view(len(input_seq), 1, -1), self.hidden_cell)
        predictions = self.linear(lstm_out.view(len(input_seq), -1))
        return predictions[-1]  # single-step prediction: only the last time step
- Scaling the data with MinMaxScaler during preprocessing generally gives better results
- There are generally two input forms: treat each element of a vector as the input of one time step, or feed an n-dimensional vector such as (year, month, day, hour, minute, second) at each step
- A fully connected layer is attached to the last LSTM unit
- Generally supports only single-step prediction (sliding window)
- Combine teacher-forcing and non-teacher-forcing training (details unclear)
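The scaling and sliding-window preprocessing above can be sketched in plain NumPy (min-max scaling written out by hand; the window size of 4 is a hypothetical choice):

```python
import numpy as np

def make_windows(series, window=4):
    """Min-max scale a series to [0, 1], then build (window, next value) pairs."""
    x = np.asarray(series, dtype=float)
    x = (x - x.min()) / (x.max() - x.min())  # min-max scaling, as MinMaxScaler would do
    X, y = [], []
    for i in range(len(x) - window):
        X.append(x[i:i + window])  # the input window
        y.append(x[i + window])    # the single next step to predict
    return np.array(X), np.array(y)

X, y = make_windows(range(10), window=4)
# X.shape == (6, 4), y.shape == (6,)
```

Each row of X would then be reshaped to (seq_len, 1, input_size) before being fed to the LSTM above; predictions must be inverse-scaled back to the original units afterwards.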
Personal project summary
Stationarity test of the series (determining the d value)
Plot the series, run the ADF test, then apply first-order and, if necessary, second-order differencing
If the test statistic t is greater than all three critical values (1%, 5%, 10%), the series is non-stationary
Determining the p and q values
1. Read p from the lag at which the partial autocorrelation (PACF) plot cuts off, and q from the lag at which the autocorrelation (ACF) plot cuts off
2. Grid-search for the parameter combination that minimizes AIC/BIC
3. Fit the ARIMA(p, d, q) model
Model diagnostics
Residual checks
The residuals should be white noise (no remaining autocorrelation) and approximately normally distributed