Matlab Simulation of LSTM Network Data Prediction Based on PSO Optimization

Table of contents

1. Algorithm simulation results

2. Overview of the theoretical knowledge involved in the algorithms

3. MATLAB core program

4. Complete algorithm code file


1. Algorithm simulation results

The MATLAB 2022a simulation results are as follows:

(simulation result figures)

2. Overview of the theoretical knowledge involved in the algorithms

In PSO, each particle in the swarm is represented as a vector. In a portfolio-optimization setting, for example, this would be a vector of weights describing how capital is allocated to each asset. Each vector is interpreted as a position in a multidimensional search space, and each particle also remembers the best position it has visited so far. In every iteration of PSO, the global best position, i.e. the best position found by any particle in the swarm, is determined. Each particle then moves closer to both its own best position and the global best position. Repeated over many iterations, this process yields a good solution to the problem, as the particles converge on a near-optimal solution.

Ray et al. combined the PSO algorithm with a Pareto ranking mechanism: Pareto ranking selects a group of elite solutions, and the global best particle is chosen from this group by roulette-wheel selection. In practice, only a small number of individuals have a high selection probability, so population diversity is not well maintained. Coello et al. selected the swarm's best position by introducing a Pareto dominance mechanism and a particle repository (knowledge base). The repository stores the flight experience of the particles after each flight cycle and is updated using a geography-based system defined in terms of each particle's objective-function values; the particles use this repository to identify a leader that guides the search. At the same time, non-dominated solutions are determined by comparing candidate individuals against a comparison set drawn randomly from the population, so the size of the comparison set has a crucial impact on the algorithm's success: if it is too large, premature convergence is likely; if it is too small, too few non-dominated solutions may be selected from the population.

PSO simulates the foraging behavior of a flock of birds. Imagine the scenario: a flock of birds is searching randomly for food, and there is only one piece of food in the area. None of the birds knows where the food is, but each bird knows how far it is from the food. What, then, is the best strategy for finding it? The simplest and most effective strategy is to search the area around the bird that is currently closest to the food.

Throughout the search, the birds let each other know their current positions by exchanging information. Through this cooperation they can judge whether the best solution has been found and pass that information on to the whole flock, so that eventually the entire flock gathers around the food source; in other words, the optimal solution has been found.

In PSO, each candidate solution of the optimization problem is a bird in the search space, which we call a "particle". Every particle has a fitness value determined by the objective function, and every particle also has a velocity that determines the direction and distance of its flight. The particles then follow the current best particle as they search the solution space.

PSO is initialized with a group of random particles (random solutions) and then searches for the optimum iteratively. In each iteration, a particle updates itself by tracking two "extrema". The first is the best solution found by the particle itself, called the individual extremum (pbest). The second is the best solution found so far by the entire population, called the global extremum (gbest). Alternatively, if only part of the population is used as a particle's neighborhood, the best value among all neighbors is the local extremum. After finding these two best values, the particle updates its velocity and position with the following formulas (the form used in the core program in Section 3):

v_i = v_i + c1 · rand() · (pbest_i − x_i) + c2 · rand() · (gbest − x_i)    (1)
x_i = x_i + v_i    (2)

where x_i and v_i are the position and velocity of particle i, pbest_i is its own best position, gbest is the population's best position, c1 and c2 are learning factors, and rand() is a random number uniformly distributed in (0, 1).

The first term in formula (1) is the memory term, representing the influence of the magnitude and direction of the previous velocity. The second term is the self-cognition term, a vector pointing from the particle's current position to its own best position, indicating that the particle's movement draws on its own experience. The third term is the social-cognition term, a vector pointing from the particle's current position to the population's best position, reflecting cooperation and knowledge sharing among particles. A particle thus decides its next move based on both its own experience and the best experience of its companions.
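As a concrete illustration of update rules (1) and (2), the following is a minimal, self-contained MATLAB sketch that applies them to a simple test problem (minimizing the sphere function). All names and parameter values are chosen purely for this illustration and are independent of the LSTM program in Section 3; a velocity limit is added as a common safeguard that is not part of formula (1).

% Minimal PSO sketch for minimizing f(x) = sum(x.^2) with rules (1) and (2)
f     = @(x) sum(x.^2, 2);          % fitness function (sphere), evaluated row-wise
Npeop = 20;                         % number of particles
Iter  = 100;                        % number of iterations
dim   = 2;                          % dimension of the search space
c1 = 2;  c2 = 2;                    % learning factors
vmax  = 0.5;                        % velocity limit (safeguard against divergence)
x = 4*rand(Npeop, dim) - 2;         % random initial positions in [-2, 2]
v = zeros(Npeop, dim);              % initial velocities
p     = x;                          % individual best positions (pbest)
pbest = f(x);                       % individual best fitness values
[gbest, idx] = min(pbest);          % global best fitness so far
g = p(idx, :);                      % global best position (gbest)
for it = 1:Iter
    for j = 1:Npeop
        % formula (1): velocity update, formula (2): position update
        v(j,:) = v(j,:) + c1*rand*(p(j,:) - x(j,:)) + c2*rand*(g - x(j,:));
        v(j,:) = max(min(v(j,:), vmax), -vmax);     % clamp the velocity
        x(j,:) = x(j,:) + v(j,:);
        fj = f(x(j,:));
        if fj < pbest(j)                            % update individual extremum
            pbest(j) = fj;   p(j,:) = x(j,:);
        end
        if pbest(j) < gbest                         % update global extremum
            gbest = pbest(j);   g = p(j,:);
        end
    end
end
disp(gbest);                        % best fitness found (close to 0)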



Long Short-Term Memory (LSTM) is a recurrent neural network architecture specifically designed to overcome the long-term dependency problem of ordinary RNNs (recurrent neural networks). All RNNs take the form of a chain of repeating neural network modules. In a standard RNN, this repeating module has a very simple structure, such as a single tanh layer.

The original Long Short-Term Memory (LSTM) paper was published in 1997. Thanks to its unique design, LSTM is well suited to processing and predicting events with very long intervals and delays in a time series. LSTMs usually perform better than ordinary recurrent neural networks and Hidden Markov Models (HMMs), for example in unsegmented continuous handwriting recognition. In 2009, an artificial neural network model built with LSTM won the ICDAR handwriting recognition competition. LSTM is also widely used in automatic speech recognition; in 2013 it achieved a record 17.7% error rate on the TIMIT natural speech database. As a nonlinear model, LSTM can serve as a complex nonlinear unit for constructing larger deep neural networks.

LSTM is a type of neural network built from LSTM blocks (among other components). In the literature, an LSTM block is sometimes described as an intelligent network unit because it can remember a value for an arbitrary length of time; gates decide whether an input is important enough to be remembered and whether it may be output. A typical block diagram (Figure 1) shows four S-shaped function units at the bottom. The leftmost unit may, depending on the situation, become the input of the block, while the three on the right act as gates that decide whether information may pass. The second unit from the left is the input gate: if its output is close to zero, the candidate value is blocked and does not enter the next layer. The third from the left is the forget gate: when it produces a value close to zero, the value remembered in the block is forgotten. The fourth, rightmost unit is the output gate, which decides whether the value stored in the block's memory may be output. There are many variants of LSTM; one important variant is the GRU (Gated Recurrent Unit). According to tests at Google, the most important gate in the LSTM is the forget gate, followed by the input gate, with the output gate last.
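To make the roles of the gates concrete, here is a minimal MATLAB sketch of a single LSTM time step. The weight matrices, biases, sizes and variable names are hypothetical and used purely for illustration; in the program of Section 3 these computations are handled internally by lstmLayer.

% One LSTM time step (illustrative only)
h = 8;  d = 3;                              % hidden size and input size (example values)
rng(0);                                     % reproducible random weights
Wf = randn(h,h+d); Wi = randn(h,h+d); Wo = randn(h,h+d); Wc = randn(h,h+d);
bf = zeros(h,1); bi = zeros(h,1); bo = zeros(h,1); bc = zeros(h,1);
xt = randn(d,1); h_prev = zeros(h,1); c_prev = zeros(h,1);
sigmoid = @(z) 1./(1 + exp(-z));            % S-shaped activation
z  = [h_prev; xt];                          % previous hidden state concatenated with input
ft = sigmoid(Wf*z + bf);                    % forget gate: near 0 -> stored value is forgotten
it = sigmoid(Wi*z + bi);                    % input gate: near 0 -> new value is blocked
gt = tanh(Wc*z + bc);                       % candidate value to be written to the cell
ct = ft .* c_prev + it .* gt;               % updated cell state (the block's memory)
ot = sigmoid(Wo*z + bo);                    % output gate: decides whether the memory is output
ht = ot .* tanh(ct);                        % hidden state / output of the block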

3. MATLAB core program

.........................................................
for i=1:Iter
    disp(i);                              % show current iteration
    for j=1:Npeop
        if fitness(x1(j,:))<pbest1(j)     % update individual best (pbest)
           p1(j,:)   = x1(j,:);           % best position found by particle j
           pbest1(j) = fitness(x1(j,:));
        end
        if pbest1(j)<gbest1               % update global best (gbest)
           g1     = p1(j,:);              % best position found by the swarm
           gbest1 = pbest1(j);
        end

        % velocity and position update, formulas (1) and (2)
        v1(j,:) = v1(j,:)+c1*rand*(p1(j,:)-x1(j,:))+c2*rand*(g1-x1(j,:));
        x1(j,:) = x1(j,:)+v1(j,:);
    end
    gb1(i)=gbest1;                        % record best fitness of this iteration
end
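% The fitness() function called above is not shown in this excerpt. One possible
% form (a hypothetical sketch, saved as fitness.m): the candidate vector x encodes
% [number of hidden units, learning rate], an LSTM is trained with these values,
% and the validation error is returned as the fitness to be minimized. The data
% variables and training options below are assumptions, not the original code.
%
% function err = fitness(x)
%     zhongjian1_num = round(x(1));               % candidate number of hidden units
%     xue            = x(2);                      % candidate learning rate
%     layers  = [sequenceInputLayer(shuru_num)
%                lstmLayer(zhongjian1_num)
%                fullyConnectedLayer(shuchu_num)
%                regressionLayer];
%     options = trainingOptions('adam', ...
%         'MaxEpochs',50, 'InitialLearnRate',xue, 'Verbose',false);
%     net  = trainNetwork(input_train, output_train, layers, options);
%     pred = predict(net, input_valid);           % input_valid: hypothetical validation input
%     err  = sqrt(mean((pred - YValidationy).^2, 'all'));   % validation RMSE
% end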

figure;
plot(gb1,'-bs',...
    'LineWidth',1,...
    'MarkerSize',6,...
    'MarkerEdgeColor','k',...
    'MarkerFaceColor',[0.9,0.0,0.0]);

xlabel('Optimization iteration');
ylabel('Fitness value');


zhongjian1_num = round(g1(1));   % optimized number of LSTM hidden units
xue            = g1(2);          % optimized learning rate

% model training
layers = [ ...
    sequenceInputLayer(shuru_num)     % sequence input layer (shuru_num features)
    lstmLayer(zhongjian1_num)         % LSTM layer with the optimized hidden size
    fullyConnectedLayer(shuchu_num)   % fully connected layer (shuchu_num outputs)
    regressionLayer];                 % regression output layer
 
.................................................................
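% The elided training and prediction steps could, for example, take the following
% form (a hypothetical sketch; the option values and the input variables
% input_train, input_valid and input_test are assumptions, not the original code):
%
% options = trainingOptions('adam', ...
%     'MaxEpochs',100, ...
%     'InitialLearnRate',xue, ...                 % learning rate selected by PSO
%     'Verbose',false);
% net        = trainNetwork(input_train, output_train, layers, options);
% test_simu  = predict(net, input_train);         % fitted values on the training set
% test_simuy = predict(net, input_valid);         % predictions on the validation set
% test_simu1 = predict(net, input_test);          % predictions on the test set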

figure
plot(output_train,'-bs',...
    'LineWidth',1,...
    'MarkerSize',6,...
    'MarkerEdgeColor','k',...
    'MarkerFaceColor',[0.9,0.0,0.0]);
hold on
plot(test_simu,'-r>',...
    'LineWidth',1,...
    'MarkerSize',6,...
    'MarkerEdgeColor','k',...
    'MarkerFaceColor',[0.9,0.9,0.0]);
hold off
legend(["Ground truth" "Prediction"])
xlabel("Sample")
title("Training set")

figure
plot(YValidationy,'-bs',...
    'LineWidth',1,...
    'MarkerSize',6,...
    'MarkerEdgeColor','k',...
    'MarkerFaceColor',[0.9,0.0,0.0]);
hold on
plot(test_simuy,'-r>',...
    'LineWidth',1,...
    'MarkerSize',6,...
    'MarkerEdgeColor','k',...
    'MarkerFaceColor',[0.9,0.9,0.0]);
hold off
legend(["Ground truth" "Prediction"])
xlabel("Sample")
title("Validation set")

figure
plot(output_test,'-bs',...
    'LineWidth',1,...
    'MarkerSize',6,...
    'MarkerEdgeColor','k',...
    'MarkerFaceColor',[0.9,0.0,0.0]);
hold on
plot(test_simu1,'-r>',...
    'LineWidth',1,...
    'MarkerSize',6,...
    'MarkerEdgeColor','k',...
    'MarkerFaceColor',[0.9,0.9,0.0]);
hold off
legend(["Ground truth" "Prediction"])
xlabel("Sample")
title("Test set")
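% As a possible extension, the test-set accuracy can also be quantified with
% standard error metrics, assuming output_test and test_simu1 have the same length:
rmse_test = sqrt(mean((test_simu1(:) - output_test(:)).^2));      % root-mean-square error
mae_test  = mean(abs(test_simu1(:) - output_test(:)));            % mean absolute error
R2_test   = 1 - sum((test_simu1(:) - output_test(:)).^2) / ...
                sum((output_test(:) - mean(output_test(:))).^2);  % coefficient of determination
fprintf('Test set: RMSE = %.4f, MAE = %.4f, R^2 = %.4f\n', rmse_test, mae_test, R2_test);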

4. Complete algorithm code file



Origin blog.csdn.net/hlayumi1234567/article/details/130117982