Multi-input multi-output long short-term memory (LSTM) neural network regression analysis based on MATLAB programming

Table of Contents

Background
Abstract
Basic definition of LSTM
Training method
Steps of LSTM
Multi-input multi-output LSTM neural network regression analysis based on MATLAB programming
MATLAB code
Result graph
Result analysis
Outlook

Background

Futures price forecasting is a difficult prediction problem. Many models have been studied, but they all have limitations and their accuracy is not high. In this article, an LSTM network is used to forecast soybean futures prices and is implemented as a deep learning program; no explicit fitting formula is required, and the approximation quality is good.
Abstract
This article introduces the LSTM principle and implements, in MATLAB, a long short-term memory (LSTM) network for soybean futures price prediction.

Basic definition of LSTM

LSTM is a type of recurrent neural network built from LSTM blocks. In the literature, an LSTM block is sometimes described as an intelligent network unit because it can remember a value for an arbitrary length of time. Gates inside the block determine whether an input is important enough to be remembered and whether the stored value may be output.
There are four activation units at the bottom of Figure 1. The leftmost unit produces the candidate input of the block; the three units to its right act as gates that decide whether information may pass. The second from the left is the input gate: if its output is close to zero, the candidate value is blocked and does not enter the next layer. The third from the left is the forget gate: when it produces a value close to zero, the value remembered in the block is forgotten. The rightmost unit is the output gate, which decides whether the value stored in the block memory may be output.
Figure 1 LSTM model
LSTM has many variants; one important variant is the GRU (Gated Recurrent Unit). According to Google's tests, the most important gate in an LSTM is the forget gate, followed by the input gate, and finally the output gate.

Training method

To minimize the training error, gradient descent can be used, for example via backpropagation through time, modifying each weight in proportion to its contribution to the error. A major problem with gradient descent in ordinary recurrent neural networks (RNNs), first identified in 1991, is that the error gradient vanishes exponentially with the length of time between events. With LSTM blocks, the error is also propagated backwards from the output and affects every gate of the block until it is filtered out, so ordinary backpropagation is an effective way to train LSTM blocks to remember values over long periods.
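As a minimal illustrative sketch (not the exact update performed by the fun_weight helper in the program below), a single gradient-descent step for one weight matrix in MATLAB could look like the following; W, dE_dW and yita are hypothetical names for a weight matrix, its error gradient, and the learning rate:

% one gradient-descent step for a single weight matrix
yita = 0.1;                 % learning rate
W = rand(5,12)/15;          % example weight matrix (same scale as the initialization used below)
dE_dW = randn(5,12);        % placeholder gradient of the squared error with respect to W
W = W - yita*dE_dW;         % move the weights against the gradient to reduce the error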

Steps of LSTM

The first step of the LSTM is to decide what information to discard from the cell state. This decision is made by a sigmoid layer called the forget gate. It looks at h_{t-1} and x_t and outputs a number between 0 and 1 for each entry of the cell state C_{t-1}: 1 means "keep this completely", 0 means "forget this completely".

Let's return to our language-model example, which tries to predict the next word from all previous words. In this problem, the cell state may include the gender of the current subject, so that the correct pronoun can be predicted. When we see a new subject, we want to forget the gender of the old subject.

f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
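A minimal MATLAB sketch of this step; h_prev, x_t, Wf and bf are hypothetical names, and the sizes (2 outputs, 5 inputs, 12 cells) are chosen only to match the example program later in this article:

h_prev = rand(2,1);  x_t = rand(5,1);     % previous hidden state h_{t-1} and current input x_t
Wf = rand(12,7)/15;  bf = rand(12,1);     % forget-gate weights and bias (12 memory cells)
z = [h_prev; x_t];                        % concatenation [h_{t-1}, x_t]
f_t = 1./(1 + exp(-(Wf*z + bf)));         % sigmoid: forget gate, each entry in (0,1)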

The next step is to decide what new information to store in the cell state. This has two parts: first, a sigmoid layer called the input gate decides which values to update; second, a tanh layer creates a vector of new candidate values, C̃_t, that could be added to the state. In the next step we combine the two to update the state.

In our language model example, we want to add the gender of the new subject to the cell state, replacing the old subject we forgot about

i_t = σ(W_i · [h_{t-1}, x_t] + b_i)

C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
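Continuing the sketch above, with hypothetical Wi, bi, Wc and bc of the same sizes:

Wi = rand(12,7)/15;  bi = rand(12,1);     % input-gate weights and bias
Wc = rand(12,7)/15;  bc = rand(12,1);     % candidate-value weights and bias
i_t = 1./(1 + exp(-(Wi*z + bi)));         % sigmoid: input gate
c_tilde = tanh(Wc*z + bc);                % tanh: candidate values that could be added to the state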

Now we update the old cell state C_{t-1} to the new cell state C_t. The previous steps have already decided what to do; we just need to actually do it. We multiply the old state by f_t, forgetting the things we decided to forget, and then add i_t * C̃_t, the new candidate values scaled by how much we decided to update each state value.

In the case of a language model, we actually discard information about the gender of the old subject and add new information, just as we did in the previous steps.

C_t = f_t * C_{t-1} + i_t * C̃_t
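Continuing the sketch, with a hypothetical previous cell state c_prev:

c_prev = rand(12,1);                      % previous cell state C_{t-1}
c_t = f_t.*c_prev + i_t.*c_tilde;         % forget part of the old state and add the scaled candidates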

Finally, we decide what to output. This output is based on the cell state, but is a filtered version of it. First we run a sigmoid layer that decides which parts of the cell state to output; then we push the cell state through tanh (squashing the values to between -1 and 1) and multiply it by the output of the sigmoid gate, so that we only output the parts we decided on.

For the language-model example, since the network has just seen a subject, it might want to output information relevant to a verb, in case that is what comes next. For example, it might output whether the subject is singular or plural, so that we know which form the verb should take if a verb follows.

o_t = σ(W_o · [h_{t-1}, x_t] + b_o)

h_t = o_t * tanh(C_t)
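Completing the sketch with hypothetical Wo and bo; the MATLAB program in the next section implements a variant of these same per-step equations for training and testing:

Wo = rand(12,7)/15;  bo = rand(12,1);     % output-gate weights and bias
o_t = 1./(1 + exp(-(Wo*z + bo)));         % sigmoid: output gate
h_t = o_t.*tanh(c_t);                     % new hidden state h_t, a filtered view of the cell state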

Multi-input multi-output LSTM neural network regression analysis based on MATLAB programming

MATLAB code

clc
clear
close all
num = xlsread('test data.xlsx',2);       % test data (sheet 2 of the workbook)
num1 = xlsread('training data.xlsx');    % training data
num2 = xlsread('test data 1.xlsx');      % additional test data (not used below)

train_data1 = num1(:,1:5)';              % input data of the training set (5 features)
train_out1 = num1(:,7:8)';               % output data of the training set (2 targets)
[train_data,inputps]=mapminmax(train_data1,0,1);   % normalize the training inputs to [0,1]
[trainout,outputps]=mapminmax(train_out1,0,1);     % normalize the training outputs to [0,1]

data_length=size(train_data,1);
data_num=size(train_data,2);
%% network parameter initialization
% node number setting
input_num=5;
cell_num=12;
output_num=2;

% biases of the gates in the network
bias_input_gate=rand(1,cell_num);
bias_forget_gate=rand(1,cell_num);
bias_output_gate=rand(1,cell_num);
% ab=1.2;
% bias_input_gate=ones(1,cell_num)/ab;
% bias_forget_gate=ones(1,cell_num)/ab;
% bias_output_gate=ones(1,cell_num)/ab;
% network weight initialization
ab=15;
weight_input_x=rand(input_num,cell_num)/ab;
weight_input_h=rand(output_num,cell_num)/ab;
weight_inputgate_x=rand(input_num,cell_num)/ab;
weight_inputgate_c=rand(cell_num,cell_num)/ab;
weight_forgetgate_x=rand(input_num,cell_num)/ab;
weight_forgetgate_c=rand(cell_num,cell_num)/ab;
weight_outputgate_x=rand(input_num,cell_num)/ab;
weight_outputgate_c=rand(cell_num,cell_num)/ab;

% hidden-to-output weights
weight_preh_h=rand(cell_num,output_num);

%Network state initialization
cost_gate=1e-10;    % error threshold for stopping training
h_state=rand(output_num,data_num);
cell_state=rand(cell_num,data_num);
%% Network training
for iter=1:3000
iter            % display the current iteration number
yita=0.1;       % learning rate (scale of each weight adjustment)
for m=1:data_num

    % feedforward part
    if(m==1)
        gate=tanh(train_data(:,m)'*weight_input_x);
        input_gate_input=train_data(:,m)'*weight_inputgate_x+bias_input_gate;
        output_gate_input=train_data(:,m)'*weight_outputgate_x+bias_output_gate;
        for n=1:cell_num
            input_gate(1,n)=1/(1+exp(-input_gate_input(1,n)));
            output_gate(1,n)=1/(1+exp(-output_gate_input(1,n)));
        end
        forget_gate=zeros(1,cell_num);
        forget_gate_input=zeros(1,cell_num);
        cell_state(:,m)=(input_gate.*gate)';
    else
        gate=tanh(train_data(:,m)'*weight_input_x+h_state(:,m-1)'*weight_input_h);
        input_gate_input=train_data(:,m)'*weight_inputgate_x+cell_state(:,m-1)'*weight_inputgate_c+bias_input_gate;
        forget_gate_input=train_data(:,m)'*weight_forgetgate_x+cell_state(:,m-1)'*weight_forgetgate_c+bias_forget_gate;
        output_gate_input=train_data(:,m)'*weight_outputgate_x+cell_state(:,m-1)'*weight_outputgate_c+bias_output_gate;
        for n=1:cell_num
            input_gate(1,n)=1/(1+exp(-input_gate_input(1,n)));
            forget_gate(1,n)=1/(1+exp(-forget_gate_input(1,n)));
            output_gate(1,n)=1/(1+exp(-output_gate_input(1,n)));
        end
        cell_state(:,m)=(input_gate.*gate+cell_state(:,m-1)'.*forget_gate)';   
    end
    pre_h_state=tanh(cell_state(:,m)').*output_gate;
    h_state(:,m)=(pre_h_state*weight_preh_h)';
    % error calculation
    Error=h_state(:,m)-trainout(:,m);
    Error_Cost(1,iter)=sum(Error.^2);
    if(Error_Cost(1,iter)<cost_gate)
        flag=1;
        break;
    else
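        % fun_weight is a user-supplied helper function (not listed here) that
        % performs the gradient-based weight update for the current sample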
        [   weight_input_x,...
            weight_input_h,...
            weight_inputgate_x,...
            weight_inputgate_c,...
            weight_forgetgate_x,...
            weight_forgetgate_c,...
            weight_outputgate_x,...
            weight_outputgate_c,...
            weight_preh_h ]=fun_weight(m,yita,Error,...
                                               weight_input_x,...
                                               weight_input_h,...
                                               weight_inputgate_x,...
                                               weight_inputgate_c,...
                                               weight_forgetgate_x,...
                                               weight_forgetgate_c,...
                                               weight_outputgate_x,...
                                               weight_outputgate_c,...
                                               weight_preh_h,...
                                               cell_state,h_state,...
                                               input_gate,forget_gate,...
                                               output_gate,gate,...
                                               train_data,pre_h_state,...
                                               input_gate_input,...
                                               output_gate_input,...
                                               forget_gate_input);

    end
end
if(Error_Cost(1,iter)<cost_gate)
    break;
end

end
plot(Error_Cost,'k-*');
title('Error graph');

%% Test on 121 samples (rows 601 to 721)
% data loading
test_final11=num(601:721,1:5)';                        % test inputs (the column range must supply the same 5 features as input_num)
test_final1=mapminmax('apply',test_final11,inputps);   % apply the normalization learned on the training inputs
test_output=num(601:721,11)';                          % actual output of the test data (spread)
% feedforward through the trained network
m=601:721;       % indices into the stored training states (data_num must be at least 720)
lstmout=[];
for ii = 1:121
test_final = test_final1(:,ii);
gate=tanh(test_final'*weight_input_x+h_state(:,m(ii)-1)'*weight_input_h);
input_gate_input=test_final'*weight_inputgate_x+cell_state(:,m(ii)-1)'*weight_inputgate_c+bias_input_gate;
forget_gate_input=test_final'*weight_forgetgate_x+cell_state(:,m(ii)-1)'*weight_forgetgate_c+bias_forget_gate;
output_gate_input=test_final'*weight_outputgate_x+cell_state(:,m(ii)-1)'*weight_outputgate_c+bias_output_gate;
for n=1:cell_num
input_gate(1,n)=1/(1+exp(-input_gate_input(1,n)));
forget_gate(1,n)=1/(1+exp(-forget_gate_input(1,n)));
output_gate(1,n)=1/(1+exp(-output_gate_input(1,n)));
end
cell_state_test=(input_gate.*gate+cell_state(:,m(ii)-1)'.*forget_gate)';
pre_h_state=tanh(cell_state_test').*output_gate;
h_state_test=(pre_h_state*weight_preh_h)';
lstmout = [lstmout h_state_test];

end
% test_output
% lstmout
% test_output1 = mapminmax('reverse',test_output,outputps);
lstmout1 = mapminmax('reverse',lstmout,outputps);      % de-normalize the network output
figure (2)

plot(lstmout1,'r-*')
hold on
plot(test_output,'b-o')
xlabel('test set sample number')
ylabel('spread')
title('long-short term memory neural network')
legend( 'Output Spread','Actual Spread')

% xlim([1 12])

Result graph

[Figure: training error curve ('Error graph')]
[Figure: LSTM prediction vs. actual spread on the test set]

Result analysis

As the figures show, the regression based on the long short-term memory (LSTM) network tracks the actual test values closely; the prediction is accurate and the model generalizes well.

Outlook

Long short-term memory networks have unique advantages for problems with temporal dependence: the predictions are smoother and more stable, and the model can be tuned. LSTM can also be combined with other algorithms, for example using particle swarm optimization to tune the LSTM parameters, or combining DBN with LSTM, and so on.


Origin blog.csdn.net/abc991835105/article/details/129848031