Interval prediction | QRCNN-LSTM in MATLAB: quantile regression with a convolutional long short-term memory neural network for time series interval prediction

Basic introduction

1. A time series interval prediction model based on QRCNN-LSTM (quantile regression with a convolutional long short-term memory neural network), implemented in MATLAB.
2. Multiple plots and multiple metrics (MAE, RMSE, MSE, R2); multiple inputs and a single output; includes plots of prediction intervals at different confidence levels and probability density plots.
3. data is the data set (a power data set). Variables from a past time window are used to predict the target, which is the last column. The same approach can also be applied to load forecasting and wind speed forecasting. MainQRCNN_LSTMTS is the main program; the remaining files are functions and do not need to be run directly.
4. The code is clearly commented and includes data preprocessing with missing-value handling: a NaN entry is replaced with the value from the previous row. Kernel density estimation is also included.
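
The preprocessing described in items 3 and 4 can be sketched as follows. This is a minimal illustration, not the author's code: the toy matrix data, the window length lag, and the variable names are assumptions; only the conventions (target in the last column, NaN replaced by the previous row) come from the description above.

```matlab
% Toy data set (assumption): rows are time steps, last column is the target.
data = [1.0 10; 2.0 NaN; NaN 30; 4.0 40; 5.0 50; 6.0 60];

% Replace each NaN with the value from the previous row of the same column.
data = fillmissing(data, 'previous');

lag = 3;                              % number of past steps used as input (assumption)
numSamples = size(data, 1) - lag;     % one sample per predictable target
X = cell(numSamples, 1);              % each cell: features-by-time window
Y = zeros(numSamples, 1);             % target at the step after the window
for k = 1:numSamples
    X{k} = data(k:k+lag-1, :)';       % all variables over the past lag steps
    Y(k) = data(k+lag, end);          % next value of the last column
end
```

Note that fillmissing with 'previous' leaves a NaN in the first row unchanged, because there is no earlier value to copy; such rows would need separate handling.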

Model description

QRCNN-LSTM is a deep learning model for time series forecasting that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network. The core idea is to use the CNN to extract local features of the time series and then feed these features into the LSTM to model the sequence as a whole. On top of this, QRCNN-LSTM adds quantile regression, which predicts several quantiles of the target at each time step, so the model yields prediction intervals rather than a single point forecast.
Specifically, the QRCNN-LSTM model divides the time series into windows, each containing several consecutive time steps. For each window, the CNN extracts local features, and the LSTM then models these features to produce the prediction for that window. To predict several quantiles, the output layer after the LSTM has multiple output nodes, one per quantile, so that all quantiles are predicted at the same time.
QRCNN-LSTM can be applied to time series forecasting in fields such as finance, meteorology, and energy.
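
To make the role of the per-quantile outputs concrete, the sketch below turns three predicted quantiles into a prediction interval and checks how often the observations fall inside it. All numbers are made up for illustration, and the quantile levels 0.05, 0.5 and 0.95 are an assumption; the source does not state which levels it uses.

```matlab
tau   = [0.05 0.50 0.95];             % quantile levels (assumption)
% Hypothetical model outputs: one row per time step, one column per quantile.
YPred = [ 8 10 12;
         11 13 16;
          9 12 14;
         10 11 13];
YTrue = [10; 17; 12; 10.5];           % hypothetical observations

YPred = sort(YPred, 2);               % guard against crossing quantiles
lower = YPred(:, 1);                  % lowest quantile  -> lower bound
upper = YPred(:, end);                % highest quantile -> upper bound
nominal = tau(end) - tau(1);          % nominal coverage of the interval

inside   = YTrue >= lower & YTrue <= upper;
coverage = mean(inside);              % fraction of observations inside the interval
avgWidth = mean(upper - lower);       % average interval width
```

Comparing coverage with nominal shows whether the interval is too narrow or too wide on the evaluated data; avgWidth indicates how informative the interval is.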

  • The formulas of the QRCNN-LSTM model are as follows.

  • First, for the input time series $x_t$, the CNN extracts local features $v_t^{(i)}$:

$$v_t^{(i)} = f\left(W^{(i)} * x_{t:t+h-1} + b^{(i)}\right)$$

  • Here $W^{(i)}$ and $b^{(i)}$ are the $i$-th convolution kernel and bias term of the CNN, $h$ is the size of the convolution kernel, $f$ is the activation function, $*$ denotes the convolution operation, and $x_{t:t+h-1}$ is the data from time step $t$ to time step $t+h-1$. $v_t^{(i)}$ is the local feature extracted by the $i$-th convolution kernel.

  • Next, all local features $v_t^{(1)}, v_t^{(2)}, \ldots, v_t^{(n)}$ are concatenated and mapped to a low-dimensional hidden state $h_t$ by a fully connected layer:

$$h_t = g\left(W_h [v_t^{(1)}; v_t^{(2)}; \ldots; v_t^{(n)}] + b_h\right)$$

  • Here $W_h$ and $b_h$ are the weight and bias term of the fully connected layer, $[v_t^{(1)}; v_t^{(2)}; \ldots; v_t^{(n)}]$ denotes the concatenation of all local features, and $g$ is the activation function. $h_t$ is the hidden state at time step $t$.
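
The convolution and fully connected formulas can be traced on a toy series. The sketch below is an illustration under assumptions: the series x, the two kernels in W, the biases, and the choice of ReLU for $f$ and tanh for $g$ are not taken from the source. As is usual in deep learning libraries, the kernel is applied to each window without flipping.

```matlab
x = [1 3 2 5 4];                      % toy univariate series (assumption)
h = 3;                                % kernel size
W = [-1 0 1;                          % kernel 1: last minus first value of the window
      1 1 1];                         % kernel 2: sum over the window
b = [0; -8];                          % one bias per kernel
relu = @(z) max(z, 0);                % activation f (assumption)

T = numel(x) - h + 1;                 % number of window positions
n = size(W, 1);                       % number of kernels
V = zeros(n, T);                      % column t holds [v_t^(1); ...; v_t^(n)]
for t = 1:T
    window = x(t:t+h-1)';             % x_{t:t+h-1} as a column vector
    V(:, t) = relu(W * window + b);   % v_t^(i) for every kernel i
end

Wh = [0.5 0.5];                       % fully connected weights (assumption)
bh = 0;                               % fully connected bias
H  = tanh(Wh * V + bh);               % h_t for every t, with g = tanh (assumption)
```

Here V evaluates to [1 2 2; 0 2 3]: each column stacks the features from both kernels for one window position, which is the concatenation that the fully connected layer receives.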

  • Finally, an LSTM models the hidden states $h_t$, and multiple output nodes in the output layer perform quantile regression, giving predictions for the different quantiles. The LSTM is computed as:

$$\begin{aligned}
i_t &= \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) \\
f_t &= \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) \\
o_t &= \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_{ic} x_t + b_{ic} + W_{hc} h_{t-1} + b_{hc}) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}$$

  • Here $i_t$, $f_t$ and $o_t$ are the activations of the input gate, forget gate and output gate respectively, $\sigma$ is the sigmoid function, $\odot$ is the element-wise product, $c_t$ is the cell state, and $\tanh$ is the hyperbolic tangent function. In these equations $x_t$ denotes the input to the LSTM at step $t$, which is the feature vector produced by the fully connected layer, and $h_t$ denotes the hidden state of the LSTM itself. The output layer has $K$ output nodes, one per quantile. If the output of the $k$-th node is $y_{k,t}$, it is computed as:

$$y_{k,t} = W_{out,k} h_t + b_{out,k}$$

  • Here $W_{out,k}$ and $b_{out,k}$ are the weight and bias term of the $k$-th node. During training, the model is optimized with the quantile loss function:

$$L_{\tau} = \begin{cases} \tau (y_t - \hat{y}_{\tau,t}) & \text{if } y_t - \hat{y}_{\tau,t} \geq 0 \\ (\tau - 1)(y_t - \hat{y}_{\tau,t}) & \text{otherwise} \end{cases}$$

  • Here $\tau$ is the quantile level, $y_t$ is the true value, and $\hat{y}_{\tau,t}$ is the $\tau$-quantile predicted by the model.
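
The quantile loss above (often called the pinball loss) can be evaluated directly. The sketch below is an illustration with made-up numbers; the function handle name pinball and the use of the mean over samples are assumptions. It uses the identity that the two cases of the formula equal the larger of $\tau e$ and $(\tau - 1)e$, where $e = y_t - \hat{y}_{\tau,t}$.

```matlab
% Pinball (quantile) loss for one quantile level tau, averaged over samples.
pinball = @(y, yhat, tau) mean(max(tau .* (y - yhat), (tau - 1) .* (y - yhat)));

y    = [10; 12; 9];                   % hypothetical observations
yhat = [11; 10; 9];                   % hypothetical predictions

lossLow  = pinball(y, yhat, 0.1);     % over-prediction is penalized more heavily
lossMed  = pinball(y, yhat, 0.5);     % half of the mean absolute error
lossHigh = pinball(y, yhat, 0.9);     % under-prediction is penalized more heavily
```

For small $\tau$ the loss pushes the prediction below most observations, and for large $\tau$ above them, which is why training one output per quantile level produces the lower and upper ends of an interval.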

In summary, QRCNN-LSTM combines two different types of neural network: the CNN extracts local features of the time series, and the LSTM then models the sequence as a whole, so that the model captures both local features and long-term dependencies. Quantile regression lets the model predict the value of the time series at different quantile levels, which gives a prediction interval in addition to a point forecast.
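
A single LSTM time step followed by the quantile output nodes can be written out directly from the gate and output equations given earlier. This is a sketch under assumptions, not the toolbox implementation: the sizes, the random weights, and the storage of the four gates along the third array dimension (order i, f, o, c) are illustrative choices.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

numIn = 2;  numHidden = 3;            % toy sizes (assumption)
rng(0);                               % reproducible random weights
% Input weights, recurrent weights and biases; third index selects gate i, f, o, c.
Wi = randn(numHidden, numIn, 4) * 0.1;
Wh = randn(numHidden, numHidden, 4) * 0.1;
bi = zeros(numHidden, 4);
bh = zeros(numHidden, 4);

xt    = [0.5; -1.0];                  % input at step t
hPrev = zeros(numHidden, 1);          % h_{t-1}
cPrev = zeros(numHidden, 1);          % c_{t-1}

pre = zeros(numHidden, 4);            % pre-activations of the four gates
for g = 1:4
    pre(:, g) = Wi(:, :, g) * xt + bi(:, g) + Wh(:, :, g) * hPrev + bh(:, g);
end
it = sigmoid(pre(:, 1));              % input gate
ft = sigmoid(pre(:, 2));              % forget gate
ot = sigmoid(pre(:, 3));              % output gate
gt = tanh(pre(:, 4));                 % candidate cell state
ct = ft .* cPrev + it .* gt;          % new cell state
ht = ot .* tanh(ct);                  % new hidden state

% Quantile output nodes: y_{k,t} = W_out(k,:) * h_t + b_out(k), with K = 3.
Wout = randn(3, numHidden) * 0.1;
bout = zeros(3, 1);
yt   = Wout * ht + bout;              % one value per quantile
```

In a full model the step is repeated over the sequence, passing ht and ct forward as hPrev and cPrev, and only the final ht feeds the quantile outputs when a single value per window is predicted.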

Programming

  • The complete program and data are available from the blog author by private message.
%----------------------------------------------------------------------
% Load data
% Assumption: data.mat provides XTrain and XTest (cell arrays of
% features-by-time sequences) and YTrain and YTest (numeric targets).
load data.mat
inputSize = size(XTrain{1},1); % number of input features (assumes the layout above)
%----------------------------------------------------------------------
% Model parameters
numFilters = 128; % number of convolution filters
filterSize = 5; % filter (kernel) size
numHiddenUnits = 256; % number of LSTM hidden units
dropoutRate = 0.2; % dropout rate
numQuantiles = 3; % number of quantiles
learningRate = 0.001; % learning rate
numEpochs = 50; % number of training epochs
%----------------------------------------------------------------------
% Build the QRCNN-LSTM model (convolution1dLayer requires R2021b or later)
% Variable names differ from the layer functions so that lstmLayer and
% dropoutLayer are not shadowed.
inputLayer = sequenceInputLayer(inputSize);
convLayer = convolution1dLayer(filterSize,numFilters,'Padding','same');
lstm = lstmLayer(numHiddenUnits,'OutputMode','last');
drop = dropoutLayer(dropoutRate);
fcLayer = fullyConnectedLayer(numQuantiles);
% Note: regressionLayer minimizes half-mean-squared error, so YTrain needs
% one column per output. Training with the quantile loss requires a custom
% output layer or a custom training loop in its place.
outputLayer = regressionLayer();
%----------------------------------------------------------------------
layers = [inputLayer
          convLayer
          lstm
          drop
          fcLayer
          outputLayer];
%----------------------------------------------------------------------
% Training options
options = trainingOptions('adam', ...
    'InitialLearnRate',learningRate, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropFactor',0.1, ...
    'LearnRateDropPeriod',5, ...
    'MaxEpochs',numEpochs, ...
    'MiniBatchSize',32, ...
    'GradientThreshold',1, ...
    'Shuffle','every-epoch', ...
    'Verbose',1, ...
    'Plots','training-progress');
%----------------------------------------------------------------------
% Train the model
net = trainNetwork(XTrain,YTrain,layers,options);

% Predict (one column per quantile)
YPred = predict(net,XTest);
%----------------------------------------------------------------------
% Show results
figure
plot(YTest)
hold on
plot(YPred)
legend('True','Predicted')
xlabel('Time')
ylabel('Data')
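
The point metrics listed in the introduction (MAE, MSE, RMSE, R2) can be computed from the test targets and one prediction column. The sketch below uses made-up numbers; evaluating the metrics on the median ($\tau = 0.5$) prediction and the name YMed are assumptions, since the source does not say which output it scores.

```matlab
YTest = [10; 12; 9; 11];              % hypothetical observations
YMed  = [10.5; 12; 8; 11];            % hypothetical median (tau = 0.5) predictions

err  = YTest - YMed;                  % prediction errors
MAE  = mean(abs(err));                % mean absolute error
MSE  = mean(err .^ 2);                % mean squared error
RMSE = sqrt(MSE);                     % root mean squared error
R2   = 1 - sum(err .^ 2) / sum((YTest - mean(YTest)) .^ 2);  % coefficient of determination
```

These metrics score only the point forecast; the quality of the interval itself is better judged by its coverage and width.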

Origin blog.csdn.net/kjm13182345320/article/details/130613835