Interval prediction | MATLAB implements multivariate time series interval prediction based on QRCNN-BiLSTM-Multihead-Attention: a convolutional neural network combined with a bidirectional long short-term memory network and multi-head attention

List of effects

(Seven effect figures in the original post: point prediction plot, interval prediction plots at different confidence levels, error analysis plot, and kernel density estimate probability density plot.)

Basic introduction

1. MATLAB realizes multivariate time series interval prediction with multi-head attention, based on QRCNN-BiLSTM-Multihead-Attention: a convolutional neural network combined with a bidirectional long short-term memory network.
2. Multi-figure output. Point prediction reports multiple metrics (MAE, MAPE, RMSE, MSE, R2); interval prediction reports the interval coverage rate (PICP) and the interval average width percentage (PINAW). The model is multiple-input single-output, and the output includes a point prediction plot, interval prediction plots at different confidence levels, an error analysis plot, and a kernel density estimate probability density plot. Example run: interval coverage rate 0.44848, interval average width percentage 5.7921. (A sketch of the point metrics follows this list.)
3. data is the data set (a power data set); multiple associated variables are used to predict the power values in the last column. The same setup also applies to load forecasting and wind speed forecasting. MainQRCNN_BiLSTM_MATTNTS is the main program; the remaining files are function files and do not need to be run.
4. The code is of high quality, with clear comments. It includes data preprocessing and missing-value handling (rows containing NaN are deleted), as well as kernel density estimation.
5. Operating environment: MATLAB 2021 and above.
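The post does not show the metric code itself; as a reference for the point-prediction metrics named in item 2, here is a minimal MATLAB sketch. The variable names T_true and T_pred are assumptions, not names from the original program:

    % Point-prediction metrics (sketch; T_true / T_pred are assumed names)
    e    = T_true - T_pred;                       % prediction errors
    MAE  = mean(abs(e));                          % mean absolute error
    MAPE = mean(abs(e ./ T_true)) * 100;          % mean absolute percentage error (%)
    MSE  = mean(e.^2);                            % mean squared error
    RMSE = sqrt(MSE);                             % root mean squared error
    R2   = 1 - sum(e.^2) / sum((T_true - mean(T_true)).^2);  % coefficient of determination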

Model description

  • QRCNN-BiLSTM-Multihead-Attention is a neural network architecture for processing sequential data, such as text or time series. The structure consists of four main components:
  1. CNN : This layer is a convolutional network for sequential data with a quasi-recurrent structure. It combines the advantages of CNNs and RNNs, achieving efficient parallel feature extraction while the recurrent layers below capture long-term dependencies.

  2. BiLSTM (Bidirectional Long Short-Term Memory) : This layer is a recurrent network that processes the sequence in both directions, allowing the network to capture dependencies both forward and backward in time.

  • At time step t, the output of the BiLSTM layer, denoted h_t, is computed as:

  • h_t = [h_t^f; h_t^b] = [LSTM^f(x_t); LSTM^b(x_t)]

  • where LSTM^f and LSTM^b are the forward and backward LSTM units, respectively, and [;] denotes concatenation (see the one-line sketch below).
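  • In MATLAB's Deep Learning Toolbox this forward/backward concatenation is what bilstmLayer produces; a minimal sketch (the hidden size is an assumption):

    % With 'OutputMode','sequence', each time step outputs 2*numHiddenUnits
    % channels: the concatenated forward and backward hidden states.
    numHiddenUnits = 64;   % assumed size, not from the original program
    layer = bilstmLayer(numHiddenUnits, 'OutputMode', 'sequence');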

  3. Multihead Attention : This layer is an attention mechanism that allows the network to selectively focus on different parts of the input sequence. It does this by computing multiple attention heads in parallel, each attending to a different part of the sequence.
  • At time step t, the output of the Multihead Attention layer, denoted h_t^*, is computed as:

  • h_t^* = Concat(head_1, head_2, …, head_k) W^o

  • where k is the number of attention heads, head_i is the i-th attention head, and W^o is a learnable parameter matrix. Each attention head is computed as:

  • head_i = Attention(q_i, K, V) = softmax(\frac{q_i K^T}{\sqrt{d_k}}) V

  • where q_i is the i-th query vector, K and V are the key and value matrices, and d_k is the dimension of the key vectors (a worked MATLAB sketch of one head follows).
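  • As a concrete illustration, here is a minimal MATLAB sketch of one scaled dot-product attention head as defined above, using plain matrix algebra; all names and shapes are illustrative:

    function H = attentionHead(Q, K, V)
    % Scaled dot-product attention: softmax(Q*K'/sqrt(d_k)) * V.
    % Q: [n x d_k] queries, K: [m x d_k] keys, V: [m x d_v] values.
    d_k    = size(K, 2);
    scores = Q * K.' / sqrt(d_k);               % [n x m] similarity scores
    E      = exp(scores - max(scores, [], 2));  % numerically stable softmax
    A      = E ./ sum(E, 2);                    % attention weights, rows sum to 1
    H      = A * V;                             % [n x d_v] weighted values
    end

  • A multi-head layer evaluates k such heads in parallel, concatenates the results, and multiplies by W^o.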

  4. Output layer : This layer takes the output of the previous layer as input and produces the final prediction, such as a classification or regression result.
  • The output of the model, denoted y, is computed as:

  • y = softmax(W^h h_T^* + b)

  • where h_T^* is the output of the Multihead Attention layer at the last time step, and W^h and b are learnable parameters. The softmax function yields a probability distribution over the possible output classes; for regression outputs, a linear output would typically be used instead.
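  • For concreteness, the output-layer computation written as plain MATLAB (the shapes are assumptions):

    % h: [d x 1] last-step attention output; W: [c x d]; b: [c x 1] (assumed shapes)
    z = W * h + b;                                % logits for c output classes
    y = exp(z - max(z)) / sum(exp(z - max(z)));   % softmax probabilities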

Programming

  • Complete program and data acquisition: private message the blogger.
ntrain = round(nwhole * num_size);
ntest  = nwhole - ntrain;
% Prepare input and output training data
input_train  = input(:,  temp(1:ntrain));
output_train = output(:, temp(1:ntrain));
% Prepare test data
input_test  = input(:,  temp(ntrain+1:ntrain+ntest));
output_test = output(:, temp(ntrain+1:ntrain+ntest));
%% Data normalization
method = @mapminmax;
[inputn_train, inputps]   = method(input_train);
inputn_test  = method('apply', input_test,  inputps);
[outputn_train, outputps] = method(output_train);
outputn_test = method('apply', output_test, outputps);
% Build a cell array / vector whose length equals the training-set size
XrTrain = cell(size(inputn_train, 2), 1);
YrTrain = zeros(size(outputn_train, 2), 1);
for i = 1:size(inputn_train, 2)
    XrTrain{i, 1} = inputn_train(:, i);
    YrTrain(i, 1) = outputn_train(:, i);
end
% Build a cell array / vector whose length equals the test-set size
XrTest = cell(size(inputn_test, 2), 1);
YrTest = zeros(size(outputn_test, 2), 1);
for i = 1:size(inputn_test, 2)
    XrTest{i, 1} = inputn_test(:, i);
    YrTest(i, 1) = outputn_test(:, i);
end

%% Build the hybrid network architecture
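The excerpt stops at this comment. Purely as a structural sketch (not the original program), a hybrid stack of this kind could be declared in Deep Learning Toolbox as follows; the layer sizes are assumptions, selfAttentionLayer requires R2023a or later, and genuine quantile regression would replace the MSE loss of regressionLayer with a pinball loss:

numFeatures  = size(inputn_train, 1);   % number of input variables
numQuantiles = 9;                       % e.g. quantiles 0.1:0.1:0.9 (assumed)
layers = [
    sequenceInputLayer(numFeatures)
    convolution1dLayer(3, 16, 'Padding', 'same')  % CNN feature extraction
    reluLayer
    bilstmLayer(64, 'OutputMode', 'sequence')     % bidirectional LSTM
    selfAttentionLayer(4, 64)                     % multi-head attention (R2023a+)
    bilstmLayer(32, 'OutputMode', 'last')
    fullyConnectedLayer(numQuantiles)             % one output per quantile
    regressionLayer];                             % placeholder loss (MSE)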
%% Interval coverage rate (PICP)
RangeForm = [T_sim(:, 1), T_sim(:, end)];   % lower and upper bounds of each interval
Num = 0;

for i = 1 : length(T_train)
    % Count samples whose true value falls inside the predicted interval
    Num = Num + (T_train(i) >= RangeForm(i, 1) && T_train(i) <= RangeForm(i, 2));
end

picp = Num / length(T_train);
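PINAW, reported together with PICP in the introduction, is not part of this excerpt; below is a minimal sketch consistent with the variables above. Whether the original normalizes by the target range and reports a percentage is an assumption:

%% Interval average width percentage (PINAW) -- sketch, not from the excerpt
R     = max(T_train) - min(T_train);                        % range of true values
pinaw = mean(RangeForm(:, 2) - RangeForm(:, 1)) / R * 100;  % mean width, percent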


    % Locate the quantile Q at which the KDE-based cumulative distribution reaches m
    S = cumtrapz(X, Y);                 % cumulative integral of the density
    Index = find(abs(m - S) <= 1e-2);   % grid points where the CDF is close to m
    Q = X(max(Index));                  % take the largest matching grid point
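For context, one way the inputs of this fragment might be produced (ksdensity is the Statistics and Machine Learning Toolbox kernel density estimator; the error vector and probability level here are assumptions):

    % Illustrative setup for the quantile lookup above (assumed names)
    err    = T_train - T_sim(:, ceil(end/2));  % e.g. errors of the median prediction (assumed)
    [Y, X] = ksdensity(err);                   % KDE: density Y on grid X
    m      = 0.95;                             % target cumulative probability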


