Regression prediction | MATLAB implements CNN-BiGRU-Attention multi-input single-output regression prediction

Prediction results


Basic introduction

MATLAB implements CNN-BiGRU-Attention multi-input single-output regression prediction: a convolutional neural network combined with a bidirectional GRU and an attention mechanism for multi-input single-output regression.

Model description

MATLAB implements CNN-BiGRU-Attention multivariate regression prediction.
1. data is the data set, in Excel format, with 7 input features and 1 output feature;
2. MainCNN_BiGRU_Attention.m is the main program file; just run it;
3. The command window outputs R2, MAE, MAPE, MSE, and MBE; the data and program can be obtained in the download area;

Note that the program and data must be placed in the same folder; the required environment is MATLAB 2020b or above.
4. Attention mechanism module:
SEBlock (Squeeze-and-Excitation Block) is a structural unit that adds a channel attention mechanism to the model along the channel dimension. The mechanism learns a weight for the importance of each feature channel and uses it to enhance or suppress the corresponding channels for different tasks, so that useful features are extracted. The module operates in three steps. First, the Squeeze operation compresses the features of the spatial dimensions while keeping the number of feature channels unchanged: global pooling fuses global information, converting each two-dimensional feature channel into one real number. For channel k, that real number is the sum of the channel's features divided by the spatial size H*W, i.e. z_k = (1/(H*W)) * sum_{i,j} u_k(i, j). Second, the Excitation operation consists of two fully connected layers and a sigmoid function: s = σ(W2 δ(W1 z)), where s is the output of the excitation operation, σ is the sigmoid activation, δ is the ReLU activation, and W1 and W2 are the parameters of the two fully connected layers, which first reduce the feature dimension and then restore it. Last is the Reweight operation, which weights the input features channel by channel with s, completing the redistribution of the original features across the channels. A minimal sketch of these three steps follows.
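For intuition, the three steps above can be written as a minimal MATLAB sketch, assuming a feature map U of size H-by-W-by-C and hypothetical weight matrices W1 and W2 for the two fully connected layers (the function name seBlock and all variable names are illustrative, not code from the downloadable program):

function Uw = seBlock(U, W1, W2)
    % Hypothetical SE block: U is an H-by-W-by-C feature map; W1 ([C/r, C])
    % and W2 ([C, C/r]) are assumed weights of the two fully connected layers.
    [H, W, C] = size(U);
    z = reshape(sum(sum(U, 1), 2), C, 1) / (H * W);   % Squeeze: global average pooling -> C-by-1
    relu    = @(x) max(x, 0);                         % delta: ReLU activation
    sigmoid = @(x) 1 ./ (1 + exp(-x));                % sigma: sigmoid activation
    s  = sigmoid(W2 * relu(W1 * z));                  % Excitation: s = sigma(W2 * delta(W1 * z))
    Uw = U .* reshape(s, 1, 1, C);                    % Reweight: scale each channel by its weight
end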


Programming

  • Complete program and data, acquisition method 1: exchange for a program of equal value;
  • Complete program and data, acquisition method 2: private-message the blogger and reply "CNN-BiGRU-Attention multi-input single-output regression prediction" to obtain them.
%%  Split into training and test sets
P_train = res(1: num_train_s, 1: f_)';             % training inputs: first f_ columns
T_train = res(1: num_train_s, f_ + 1: end)';       % training targets: remaining columns
M = size(P_train, 2);                              % number of training samples

P_test = res(num_train_s + 1: end, 1: f_)';        % test inputs
T_test = res(num_train_s + 1: end, f_ + 1: end)';  % test targets
N = size(P_test, 2);                               % number of test samples

%%  Normalize the data to [0, 1]
[p_train, ps_input] = mapminmax(P_train, 0, 1);    % fit the mapping on training inputs
p_test = mapminmax('apply', P_test, ps_input);     % apply the same mapping to test inputs

[t_train, ps_output] = mapminmax(T_train, 0, 1);   % fit the mapping on training targets
t_test = mapminmax('apply', T_test, ps_output);    % apply the same mapping to test targets
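After prediction, the same stored settings are used to map the network outputs back to the original scale; a one-line sketch, where t_sim is an assumed name for the normalized predictions:

T_sim = mapminmax('reverse', t_sim, ps_output);    % undo the output normalization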

%%  Reshape the data
%   Flattening the data into 1-D is just one possible treatment;
%   it can also be laid out as 2-D or 3-D data, which requires changing the model structure accordingly,
%   but it must always stay consistent with the structure of the input layer.
p_train =  double(reshape(p_train, f_, 1, 1, M));
p_test  =  double(reshape(p_test , f_, 1, 1, N));
t_train =  double(t_train)';
t_test  =  double(t_test )';

%%  Convert to cell arrays of sequences
for i = 1 : M
    Lp_train{i, 1} = p_train(:, :, 1, i);
end

for i = 1 : N
    Lp_test{i, 1} = p_test(:, :, 1, i);
end
    
%%  Build the model
lgraph = layerGraph();                                                 % create an empty layer graph

tempLayers = [
    sequenceInputLayer([f_, 1, 1], "Name", "sequence")                 % input layer, input structure [f_, 1, 1]
    sequenceFoldingLayer("Name", "seqfold")];                          % sequence folding layer
lgraph = addLayers(lgraph, tempLayers);                                % add these layers to the empty graph

tempLayers = convolution2dLayer([3, 1], 32, "Name", "conv_1");         % convolution layer: kernel [3, 1], stride [1, 1], 32 channels
lgraph = addLayers(lgraph, tempLayers);                                % add this layer to the graph
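The listing stops after the first convolution; the remaining layers and connections are in the downloadable program. Purely as a hedged sketch of how such a folded-sequence graph is usually completed, with assumed layer names and sizes (MATLAB has no built-in bidirectional GRU layer, so a plain gruLayer stands in here for the BiGRU-Attention part):

tempLayers = [
    reluLayer("Name", "relu_1")                          % activation after the convolution
    sequenceUnfoldingLayer("Name", "sequnfold")          % restore the sequence structure
    flattenLayer("Name", "flatten")                      % flatten to a feature vector per time step
    gruLayer(32, "Name", "gru", "OutputMode", "last")    % stand-in for the BiGRU-Attention block
    fullyConnectedLayer(1, "Name", "fc")                 % single regression output
    regressionLayer("Name", "regression")];
lgraph = addLayers(lgraph, tempLayers);

lgraph = connectLayers(lgraph, "seqfold/out", "conv_1");                            % fold -> conv
lgraph = connectLayers(lgraph, "conv_1", "relu_1");                                 % conv -> relu
lgraph = connectLayers(lgraph, "seqfold/miniBatchSize", "sequnfold/miniBatchSize"); % pass batch size to unfold

Training would then presumably use the cell-array data prepared above, e.g. net = trainNetwork(Lp_train, t_train, lgraph, options);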
%%  Assign hyperparameters from the optimization variables
L2Regularization = abs(optVars(1));         % regularization coefficient
InitialLearnRate = abs(optVars(2));         % initial learning rate
NumOfUnits       = abs(round(optVars(3)));  % number of hidden units

%%  Numbers of input and output features
inputSize    = size(input_train, 1);    % feature dimension of the input x
numResponses = size(output_train, 1);   % dimension of the output y

%%  Define the network structure
opt.layers = [ ...
    sequenceInputLayer(inputSize)        % input layer; the argument is the input feature dimension
    bilstmLayer(NumOfUnits)              % learning layer (BiLSTM)
    dropoutLayer(0.2)                    % dropout rate
    fullyConnectedLayer(numResponses)    % fully connected layer; its size is the output dimension
    regressionLayer];                    % regression layer: this is a regression problem, not classification

%%  Set the training options
opt.options = trainingOptions('adam', ...            % Adam optimizer
    'MaxEpochs', 100, ...                            % maximum number of training epochs
    'GradientThreshold', 1, ...                      % gradient threshold, to prevent exploding gradients
    'ExecutionEnvironment', 'cpu', ...               % for large data sets, long sequences, or large networks, prediction is usually faster on a GPU; otherwise it is usually faster on a CPU
    'InitialLearnRate', InitialLearnRate, ...        % initial learning rate
    'LearnRateSchedule', 'piecewise', ...            % piecewise learning-rate schedule
    'LearnRateDropPeriod', 120, ...                  % drop the learning rate after 120 epochs
    'LearnRateDropFactor', 0.2, ...                  % multiply the learning rate by 0.2 at each drop
    'L2Regularization', L2Regularization, ...        % regularization coefficient
    'Verbose', 0, ...                                % suppress training progress output
    'Plots', 'none');                                % do not plot training curves
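For reference, a minimal sketch of the five metrics the main program reports, computed on the test set; T_sim is an assumed name for the de-normalized predictions, and T_test is the ground truth from the split above:

R2   = 1 - sum((T_test - T_sim).^2) / sum((T_test - mean(T_test)).^2);  % coefficient of determination
MAE  = mean(abs(T_test - T_sim));                                       % mean absolute error
MAPE = mean(abs((T_test - T_sim) ./ T_test));                           % mean absolute percentage error (as a fraction)
MSE  = mean((T_test - T_sim).^2);                                       % mean squared error
MBE  = mean(T_sim - T_test);                                            % mean bias error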

