Principle part
LSTM was proposed in 1997, so it is already an "old" method in terms of publication date. Like other neural networks, LSTM can be used for classification, regression, and time-series prediction. For an introduction to the underlying principles, please refer to this blog. This article focuses on implementing LSTM with MATLAB.
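As a quick reference, the standard LSTM cell (the 1997 formulation plus the now-standard forget gate) can be summarized by its gate equations; symbols follow common convention, with \(\sigma\) the logistic sigmoid and \(\odot\) the elementwise product:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate cell state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state update} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state (output)}
\end{aligned}
```

The gates control what is forgotten from, written to, and read out of the cell state, which is what lets the LSTM carry information across long sequences.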
Code part
Task: using simulated data from a penicillin fermentation process as an example, build an LSTM model to predict quality variables.
Introduction to the simulated penicillin fermentation process: there are 18 process variables in total, 15 of which are measurable; the remaining 3 are generally treated as quality variables. A total of 30 batches of data were generated, each batch running for 400 hours with a sampling interval of 1 hour; 25 batches are used for training and 5 for testing. The data used in this article were downloaded, and the penicillin concentration is predicted with the MATLAB Deep Learning Toolbox.
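The toolbox expects the data as N-by-1 cell arrays, each cell holding one batch as a (numVariables x numTimeSteps) matrix. A minimal sketch of arranging the batches this way; the raw array name `data`, its layout, and the row holding penicillin concentration are assumptions, not part of the original download:

```matlab
% Hypothetical: assume 'data' is an 18 x 400 x 30 array (variables x time x batch),
% rows 1:15 are the measurable process variables, and row 16 is the
% penicillin concentration used as the prediction target.
XTrain = cell(25,1);  YTrain = cell(25,1);
XTest  = cell(5,1);   YTest  = cell(5,1);
for k = 1:25
    XTrain{k} = squeeze(data(1:15,:,k));   % 15 x 400 input sequence
    YTrain{k} = squeeze(data(16,:,k));     % 1 x 400 target sequence
end
for k = 1:5
    XTest{k} = squeeze(data(1:15,:,25+k));
    YTest{k} = squeeze(data(16,:,25+k));
end
```

However your copy of the data is stored, the end result should be cell arrays of matrices whose rows are variables and whose columns are time steps.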
Data normalization
% Compute normalization statistics on the training set
XTrain_mu  = mean([XTrain{:}],2);
XTrain_sig = std([XTrain{:}],0,2);
YTrain_mu  = mean([YTrain{:}],2);
YTrain_sig = std([YTrain{:}],0,2);
% Standardize every batch; the test set is scaled with the training
% statistics so that no test information leaks into preprocessing
for i = 1:numel(XTrain)
    XTrain{i} = (XTrain{i} - XTrain_mu) ./ XTrain_sig;
    YTrain{i} = (YTrain{i} - YTrain_mu) ./ YTrain_sig;
end
for i = 1:numel(XTest)
    XTest{i} = (XTest{i} - XTrain_mu) ./ XTrain_sig;
    YTest{i} = (YTest{i} - YTrain_mu) ./ YTrain_sig;
end
Define network structure
numResponses = size(YTrain{1},1);   % number of quality variables to predict
numFeatures = 15;                   % number of measurable process variables
numHiddenUnits = 200;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(50)
    dropoutLayer(0.5)
    fullyConnectedLayer(numResponses)
    regressionLayer];
maxEpochs = 90;
Set hyperparameters
options = trainingOptions('adam', ...
    'MaxEpochs',maxEpochs, ...
    'InitialLearnRate',0.01, ...
    'GradientThreshold',1, ...
    'Shuffle','never', ...
    'Plots','training-progress', ...
    'Verbose',0);
Model training
net = trainNetwork(XTrain,YTrain,layers,options);
Regression prediction
YPred = predict(net,XTest);
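Because the targets were standardized, `YPred` is on the normalized scale. To report concentrations in the original engineering units, invert the transformation. A minimal sketch, assuming both Y sets were standardized with the training statistics `YTrain_mu` and `YTrain_sig` computed earlier:

```matlab
% Map predictions and targets back to the original units
YPredOrig = cell(size(YPred));
YTestOrig = cell(size(YTest));
for k = 1:numel(YPred)
    YPredOrig{k} = YPred{k} .* YTrain_sig + YTrain_mu;
    YTestOrig{k} = YTest{k} .* YTrain_sig + YTrain_mu;
end
```

Plotting or computing errors on the de-normalized sequences gives results that are directly interpretable as penicillin concentration.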
Output visualization
idx = randperm(numel(YPred),4);   % pick 4 test batches at random
figure
for i = 1:numel(idx)
    subplot(2,2,i)
    plot(YTest{idx(i)},'--')
    hold on
    plot(YPred{idx(i)},'.-')
    hold off
    title("Test Observation " + idx(i))
    xlabel("Time Step")
    ylabel("Penicillin concentration")
    % RMSE of this batch (note the idx(i) index, not i)
    rmse = sqrt(mean((YPred{idx(i)} - YTest{idx(i)}).^2))
end
legend(["True" "Predicted"],'Location','southeast')
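Besides the per-batch RMSE printed in the loop, a single overall test RMSE over all five batches can be computed by concatenating the sequences along time (still in normalized units):

```matlab
% Overall RMSE across all test batches (normalized units)
rmse_all = sqrt(mean(([YPred{:}] - [YTest{:}]).^2, 'all'))
```

This gives one summary number for the whole test set rather than four per-batch values.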
Result
Training process:
Regression prediction:
Overall code
% Compute normalization statistics on the training set
XTrain_mu  = mean([XTrain{:}],2);
XTrain_sig = std([XTrain{:}],0,2);
YTrain_mu  = mean([YTrain{:}],2);
YTrain_sig = std([YTrain{:}],0,2);
% Standardize every batch; the test set is scaled with the training statistics
for i = 1:numel(XTrain)
    XTrain{i} = (XTrain{i} - XTrain_mu) ./ XTrain_sig;
    YTrain{i} = (YTrain{i} - YTrain_mu) ./ YTrain_sig;
end
for i = 1:numel(XTest)
    XTest{i} = (XTest{i} - XTrain_mu) ./ XTrain_sig;
    YTest{i} = (YTest{i} - YTrain_mu) ./ YTrain_sig;
end
% Define the network structure
numResponses = size(YTrain{1},1);   % number of quality variables to predict
numFeatures = 15;                   % number of measurable process variables
numHiddenUnits = 200;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(50)
    dropoutLayer(0.5)
    fullyConnectedLayer(numResponses)
    regressionLayer];
% Set hyperparameters
maxEpochs = 90;
options = trainingOptions('adam', ...
    'MaxEpochs',maxEpochs, ...
    'InitialLearnRate',0.01, ...
    'GradientThreshold',1, ...
    'Shuffle','never', ...
    'Plots','training-progress', ...
    'Verbose',0);
% Train the model
net = trainNetwork(XTrain,YTrain,layers,options);
% Regression prediction
YPred = predict(net,XTest);
% Visualize results on 4 randomly chosen test batches
idx = randperm(numel(YPred),4);
figure
for i = 1:numel(idx)
    subplot(2,2,i)
    plot(YTest{idx(i)},'--')
    hold on
    plot(YPred{idx(i)},'.-')
    hold off
    title("Test Observation " + idx(i))
    xlabel("Time Step")
    ylabel("Penicillin concentration")
    rmse = sqrt(mean((YPred{idx(i)} - YTest{idx(i)}).^2))
end
legend(["True" "Predicted"],'Location','southeast')
Note: all mainstream network architectures can be built by yourself with MATLAB's Deep Learning Toolbox, which avoids complex environment configuration. If you are not engaged in algorithm research, it is very convenient and highly recommended.