[Time series forecasting] Multi-feature (multi-factor) electricity load forecasting with BP, RNN, LSTM, and CNN-LSTM [step-by-step hands-on tutorial]

Series Article Directory

Principles of deep learning ----- linear regression + gradient descent
Principles of deep learning ----- logistic regression algorithm
Principles of deep learning ----- fully connected neural network
Principles of deep learning ----- convolutional neural network
Principles of deep learning ----- recurrent neural network (RNN, LSTM)
Time series forecasting ----- single-feature electricity load forecasting based on BP, LSTM, and CNN-LSTM neural network algorithms
Time series forecasting (multi-feature) ----- multi-feature electricity load forecasting based on BP, LSTM, and CNN-LSTM neural network algorithms


Series of teaching videos

Quick introduction to deep learning with hands-on practice
[Hands-on teaching] Single-feature electricity load forecasting based on BP neural network
[Hands-on teaching] Single-feature electricity load forecasting based on RNN and LSTM neural networks
[Hands-on teaching] Single-feature electricity load forecasting based on CNN-LSTM neural network
[Multi-feature forecasting] Multi-feature electricity load forecasting based on BP neural network
[Multi-feature forecasting] Multi-feature electricity load forecasting based on RNN and LSTM
[Multi-feature forecasting] Multi-feature electricity load forecasting based on CNN-LSTM network


Foreword

  Time series forecasting tasks generally fall into two types according to their input. The first is the single-feature task. For example, when predicting the closing price of a stock, the only data fed to the neural network is the closing price itself: past closing prices are used to predict the next closing price, so the closing price is both the feature X and the label Y. The second is the multi-feature task, in which the network receives several features rather than one. In the stock example, the opening price, highest price, lowest price, and closing price can all be used as inputs to predict the closing price. Data preprocessing for a multi-feature task is somewhat more involved than for a single-feature task.
  Power load forecasting works the same way. In single-feature load forecasting, past load values are used to predict the next load value. Multi-feature load forecasting is the project explained in this article: the input features include temperature, humidity, electricity price, and load, and together they are used to predict the power load. Specifically, this article covers multi-feature power load forecasting with a BP neural network, RNN and LSTM neural networks, and the combined CNN-LSTM neural network, and compares them at the end.


1. Multi-factor power load data analysis

1.1. Data display

  The data set used for the multi-factor power load forecasting experiments is shown below. The data itself can be obtained from the following courses, which cover the material from theory to practice.
  [Multi-feature forecasting] Multi-feature electricity load forecasting based on BP neural network
  [Multi-feature forecasting] Multi-feature electricity load forecasting based on RNN and LSTM
  [Multi-feature forecasting] Multi-feature electricity load forecasting based on CNN-LSTM network
[Figure: screenshot of the data set]
  The screenshot shows that the data contains dry bulb temperature, dew point temperature, wet bulb temperature, humidity, electricity price, and electricity load. The data covers five years, from 2006 to 2010, and is sampled every half hour, which gives 48 sampling points per day. The leap year 2008 has 17,568 sampling points, and 2006, 2007, 2009, and 2010 have 17,520 each, so the data set contains 87,648 sampling points in total.
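  The sample counts quoted above follow directly from the half-hourly sampling rate. The short check below reproduces them with the standard library; the constant and function names are only illustrative.

```python
import calendar

SAMPLES_PER_DAY = 48  # one sample every 30 minutes

def samples_in_year(year: int) -> int:
    """Number of half-hourly samples in a calendar year."""
    days = 366 if calendar.isleap(year) else 365
    return days * SAMPLES_PER_DAY

per_year = {year: samples_in_year(year) for year in range(2006, 2011)}
total = sum(per_year.values())

for year, count in per_year.items():
    print(year, count)
print("total:", total)  # total: 87648
```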

1.2. Comparison of power load and influencing factors

  Intuitively, weather factors and electricity prices are related to electricity consumption, so the power load is plotted together with the weather and electricity price data:
[Figure: power load plotted against the weather factors and electricity price]
  The dry bulb, dew point, and wet bulb temperatures follow extremely similar trends over time, so they are strongly correlated with one another. Comparing the power load waveform with the temperature data shows that the load is larger when the temperature is either high or low, so temperature has a strong influence on the load. Humidity and electricity price are different. Unlike temperature, the humidity data shows no obvious regular or periodic pattern, and the electricity price is mostly flat apart from large changes in certain periods. From the figure alone, no obvious influence of humidity or electricity price on the load can be observed.
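  The visual comparison above can be backed up with a correlation coefficient. The sketch below uses synthetic series, not the article's data set, and a made-up U-shaped load curve. It illustrates one caveat: when the load rises for both high and low temperatures, the plain Pearson coefficient between temperature and load can be close to zero even though the dependence is strong, so it helps to also look at the distance from a comfortable temperature (20 degrees here is an assumption).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic half-hourly data for 30 days (NOT the article's data set).
t = np.arange(48 * 30)
daily = np.sin(2 * np.pi * t / 48)

dry_bulb = 20 + 8 * daily + rng.normal(0, 0.5, t.size)
dew_point = 12 + 6 * daily + rng.normal(0, 0.5, t.size)
# Hypothetical U-shaped load: demand rises when it is hot or cold.
load = 8000 + 30 * (dry_bulb - 20) ** 2

def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

print("dry bulb vs dew point:  ", round(pearson(dry_bulb, dew_point), 3))
print("dry bulb vs load:       ", round(pearson(dry_bulb, load), 3))
print("|dry bulb - 20| vs load:", round(pearson(np.abs(dry_bulb - 20), load), 3))
```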

1.3. Analyze the law in the power load

  The figure shows the annual power load curves for the five years from 2006 to 2010.
[Figure: annual power load curves, 2006 to 2010]
  Over these five years the annual load follows roughly the same trend and shows a clear periodicity. The annual consumption peaks fall roughly in December to February and in June to August, which is clearly seasonal.
  For a closer look at the data, take the weekly load for the penultimate week of April in each year. The figure shows five weeks of load data: April 17 to 23, 2006; April 16 to 22, 2007; April 21 to 27, 2008; April 20 to 26, 2009; and April 19 to 25, 2010.
[Figure: weekly power load for the penultimate week of April, 2006 to 2010]
  The weekly load curves for the penultimate week of April are very similar from year to year, and the Saturday and Sunday data are especially consistent. Compared with the other years, the load on April 17, 2006 and April 25, 2008 is noticeably lower, although the shape of the curve is similar. A likely explanation is that load data carries some randomness and is affected by external factors: the curves for the same week in different years can differ because of abnormal weather, power outages, and similar causes at the time. Despite these differences, the years share a common pattern. In every year the weekly data varies periodically, with a waveform resembling a sine wave, which clearly follows people's daily routines. Consumption drops sharply at night while people rest and stays very low overnight; during the day a large amount of electricity is consumed for work and daily life, so consumption rises sharply once people get up. This cycle repeats day after day.
  From the above analysis, a basic conclusion can be drawn: weather factors have some influence on electricity consumption, and power load data is typical time series data. The data varies periodically, and this variation is strongly tied to people's daily routines, so the consumption at earlier moments affects the consumption that follows, which is a typical property of time series data.
  The multi-feature experiment data in this article is sampled every half hour, that is, 48 points per day. Because the load varies periodically with people's daily routines and is also affected by weather and electricity price, the weather factors, electricity price, and load at the previous 48 sampling points are used as the model's input features to predict the load at the 49th sampling point. The window then slides forward one sampling point at a time and the process repeats.
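  The sliding-window rule described above can be tried on a toy array before touching the real data. This is a minimal sketch: the function name `make_windows` is made up for illustration, and it assumes the load sits in column 5, as in the code later in this article.

```python
import numpy as np

WINDOW = 48       # sampling points per input window (one day)
LOAD_COLUMN = 5   # assumed position of the load column

def make_windows(data, window=WINDOW, target_col=LOAD_COLUMN):
    """Split a (rows, features) array into sliding windows and next-step labels."""
    x, y = [], []
    for i in range(window, len(data)):
        x.append(data[i - window:i, :])   # rows i-48 .. i-1, all features
        y.append(data[i, target_col])     # load at row i (the "49th" point)
    return np.array(x), np.array(y)

# Toy stand-in for the real data: 100 rows, 6 features, values 0..599.
toy = np.arange(100 * 6, dtype=float).reshape(100, 6)
x, y = make_windows(toy)

print(x.shape)  # (52, 48, 6): 100 - 48 windows
print(y.shape)  # (52,)
```

  Each additional row of data adds exactly one window, so a series with n rows yields n - 48 samples.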


2. Multi-feature electric load forecasting based on BP neural network

2.1. BP neural network model applied to multi-feature power load forecasting

  Now back to the multi-feature power load forecasting task. In essence it is a regression task. Compared with single-feature load forecasting, the input of the neural network additionally contains the weather and electricity price factors. The network itself is built in much the same way as for single-feature forecasting; what becomes somewhat more complicated is the data processing, that is, building a data format the BP neural network can learn from.
  In my experience, beginners who train a deep learning model for their own task find two things hardest: the first is setting up the environment, and the second is data processing. The step-by-step video tutorials listed at the beginning of this article cover both.
  Let's first look, as a whole, at how the BP neural network uses multi-feature power load data for forecasting. As in the earlier single-feature load forecasting, the structure of the BP neural network model is as follows:
[Figure: structure of the BP neural network model for multi-feature load forecasting]
  The figure shows that the only difference from the single-feature load forecasting model is the input data. The single-feature input contains only the load at the previous sampling points, while the multi-feature input contains both the load and its influencing factors at the previous sampling points. In both cases the predicted value is the power load. The two models therefore differ mainly in how the input data is processed, which is also one of the difficulties of building the load forecasting model.

2.2. Data preprocessing and data set division

  The following is the preprocessing and division part of the power load forecasting data set:

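import numpy as np
from sklearn.preprocessing import MinMaxScaler

# `train` and `val` are assumed to be 2-D arrays of shape (rows, 6) prepared earlier,
# with the electricity load in column 5.

# Normalize the data to the range 0-1. The scaler is fitted on the training set only
# and then reused for the validation set, so both sets share the same scaling.
scaler = MinMaxScaler(feature_range=(0, 1))
train = scaler.fit_transform(train)
val = scaler.transform(val)

"""
Split the data into features and labels: the weather, electricity price, and load
features at the previous 48 sampling points are used to predict the load at the 49th point.
"""

# Feature list and label list for the training set
x_train = []
y_train = []

for i in np.arange(48, len(train)):
    x_train.append(train[i - 48:i, :])
    y_train.append(train[i, 5])
    
# Convert the training set from lists to arrays and flatten each 48x6 window for the BP network
x_train, y_train = np.array(x_train), np.array(y_train)
x_train, y_train = np.reshape(x_train, (x_train.shape[0], 48*6)), np.reshape(y_train, (y_train.shape[0], 1))

# Feature list and label list for the validation set
x_val = []
y_val = []

for i in np.arange(48, len(val)):
    x_val.append(val[i - 48:i, :])
    y_val.append(val[i, 5])

# Convert the validation set from lists to arrays and flatten each window
x_val, y_val = np.array(x_val), np.array(y_val)
x_val, y_val = np.reshape(x_val, (x_val.shape[0], 48*6)), np.reshape(y_val, (y_val.shape[0], 1))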

  First, the acquired data is normalized. The features in the power load data have different physical units, so some of them take much larger values than others. During training the network may then wrongly treat those features as more important, and the computation also becomes slower. The first step is therefore to normalize every feature to the range 0 to 1.
  Next, feature lists and label lists are created for the training set and the validation set to hold the windows and their labels. The network learns from these features: the error between the forward-propagation output and the true label is back-propagated to update the weights, so that the network keeps learning useful information from the data.
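  Min-max normalization maps each column to x' = (x - min) / (max - min). The NumPy sketch below spells that formula out on made-up numbers; the helper names are hypothetical. It also shows why the minimum and maximum are usually taken from the training set and then reused: validation values outside the training range simply land slightly outside 0 to 1, while both sets stay on the same scale.

```python
import numpy as np

def fit_min_max(train):
    """Return the per-column minimum and range of the training data."""
    col_min = train.min(axis=0)
    col_range = train.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid division by zero for constant columns
    return col_min, col_range

def apply_min_max(data, col_min, col_range):
    """Scale data with statistics that were computed on the training set."""
    return (data - col_min) / col_range

# Toy data: column 0 is a temperature, column 1 a load (made-up values).
train = np.array([[10.0, 6000.0], [20.0, 8000.0], [30.0, 10000.0]])
val = np.array([[25.0, 9000.0], [35.0, 11000.0]])

col_min, col_range = fit_min_max(train)
train_scaled = apply_min_max(train, col_min, col_range)
val_scaled = apply_min_max(val, col_min, col_range)

print(train_scaled)  # rows 0.0, 0.5, 1.0 in both columns
print(val_scaled)    # rows 0.75 and 1.25 in both columns
```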


2.3. Model building

The following is the BP neural network model built using the deep learning framework:

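# The imports below assume the Keras that ships with TensorFlow.
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Build the BP neural network with Keras: two hidden layers with 10 neurons each
model = Sequential()
model.add(Dense(10, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1))


# Compile the model with the Adam optimizer and a learning rate of 0.01
model.compile(optimizer=keras.optimizers.Adam(0.01), loss='mean_squared_error')

# Train on the training set for 30 epochs with a batch size of 512;
# the validation set is evaluated after every epoch to track its loss
history = model.fit(x_train, y_train, batch_size=512, epochs=30, validation_data=(x_val, y_val))

# Save the trained model
model.save('BP_model.h5')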
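
  To make the layer sizes of this network concrete, the forward pass can be written out in plain NumPy. This is only an illustration of the 288-10-10-1 structure: the weights are random stand-ins, not trained parameters, and nothing here replaces the Keras model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes: 48 time steps x 6 features flattened, two hidden layers of 10, one output.
sizes = [48 * 6, 10, 10, 1]

# Random weights stand in for trained parameters (illustration only).
weights = [rng.normal(0, 0.1, (n_in, n_out)) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

def forward(x):
    """Forward pass: ReLU on the hidden layers, linear output layer."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ w + b, 0.0)
    return x @ weights[-1] + biases[-1]

batch = rng.random((4, 48 * 6))  # four flattened windows
n_params = sum(w.size + b.size for w, b in zip(weights, biases))

print(forward(batch).shape)  # (4, 1)
print(n_params)              # 3011
```

  The parameter count is (288×10 + 10) + (10×10 + 10) + (10×1 + 1) = 3011, almost all of it in the first layer because the flattened window is wide.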

  The BP neural network is now built and can be trained. The trained model is saved to BP_model.h5, and its parameters can later be used for load prediction (inference).
  During training, the loss on both the training set and the validation set keeps decreasing, which shows that the model is converging. The figure below compares the two losses:
[Figure: training and validation loss of the BP neural network]
  The figure records the training and validation loss over the 30 epochs of training. After about 15 epochs the model has clearly converged.


2.4. Model prediction

  The project folder now contains the saved model file. With this file, inference can be run directly on input features. The generated model file is shown in the figure below:
[Figure: the saved model file in the project folder]
  The trained model can now be run on the test set, and the predicted values denormalized. The code is as follows:


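from tensorflow.keras.models import load_model

# Load the trained model file
model = load_model("BP_model.h5")


# Feed the test set into the model for prediction.
# `test`, `x_test`, and `y_test` are assumed to be prepared the same way as the validation set.
predicted = model.predict(x_test)
# print(predicted.shape)
# print(test.shape)

# Denormalize the true labels so they can be compared with the predictions.
# The scaler expects all 6 columns, so the label column is padded with the 5 other feature columns.
real = np.concatenate((test[48:, :-1], y_test), axis=1)
real = scaler.inverse_transform(real)
real = real[:, 5]


# Denormalize the values predicted by the model
prediction = np.concatenate((test[48:, :-1], predicted), axis=1)
prediction = scaler.inverse_transform(prediction)
prediction = prediction[:, 5]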
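
  The padding with feature columns in the code above exists only because the scaler works on all six columns at once. Since min-max scaling treats every column independently, the load column can also be denormalized on its own. The sketch below demonstrates the equivalence with made-up minimum and maximum values; with a fitted scikit-learn `MinMaxScaler`, the corresponding statistics are available as `scaler.data_min_` and `scaler.data_max_`.

```python
import numpy as np

# Made-up training statistics for 6 columns; the load is column 5.
col_min = np.array([0.0, -5.0, 0.0, 10.0, 15.0, 6000.0])
col_max = np.array([40.0, 25.0, 30.0, 100.0, 300.0, 12000.0])

def inverse_full(scaled):
    """Inverse min-max transform on all columns (what a fitted scaler does)."""
    return scaled * (col_max - col_min) + col_min

def inverse_load(scaled_load, col=5):
    """Inverse min-max transform for the load column only."""
    return scaled_load * (col_max[col] - col_min[col]) + col_min[col]

scaled_features = np.full((3, 5), 0.5)          # placeholder feature columns
scaled_pred = np.array([[0.0], [0.25], [1.0]])  # scaled load predictions

padded = np.concatenate((scaled_features, scaled_pred), axis=1)
via_padding = inverse_full(padded)[:, 5]
direct = inverse_load(scaled_pred[:, 0])

print(via_padding)  # values 6000, 7500, 12000
print(direct)       # the same values
```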

  Compare the predicted values with the true labels and plot them together:
[Figure: predicted versus actual load for the BP neural network]
  The comparison shows that the BP neural network is reasonably effective for load forecasting, but there are visible errors between the predicted and actual values at the peaks and valleys. The model can be assessed more precisely with evaluation metrics.


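from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Compute the evaluation metrics
R2 = r2_score(real, prediction)
MAE = mean_absolute_error(real, prediction)
RMSE = np.sqrt(mean_squared_error(real, prediction))
# MAPE divides the absolute error by the true value, not by the prediction
MAPE = np.mean(np.abs((real - prediction) / real))

# Print the evaluation metrics
print('R2:', R2)
print('MAE:', MAE)
print('RMSE:', RMSE)
print('MAPE:', MAPE)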
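
  For readers who want to see what these four metrics compute, here is a NumPy-only sketch on made-up numbers. The function name `regression_metrics` is hypothetical, and the MAPE line assumes the true values contain no zeros, which holds for load data.

```python
import numpy as np

def regression_metrics(real, prediction):
    """Return R2, MAE, RMSE, and MAPE for two 1-D arrays of equal length."""
    real = np.asarray(real, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    error = real - prediction
    ss_res = np.sum(error ** 2)                 # residual sum of squares
    ss_tot = np.sum((real - real.mean()) ** 2)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": np.mean(np.abs(error)),
        "RMSE": np.sqrt(np.mean(error ** 2)),
        "MAPE": np.mean(np.abs(error / real)),  # assumes no zero values in `real`
    }

# Made-up load values, for illustration only.
real_toy = [100.0, 200.0, 300.0, 400.0]
prediction_toy = [110.0, 190.0, 310.0, 380.0]

metrics = regression_metrics(real_toy, prediction_toy)
for name, value in metrics.items():
    print(name, round(float(value), 4))
```

  For these toy values the MAE is 12.5 and R2 is 0.986. R2 close to 1 and small MAE, RMSE, and MAPE all indicate a better fit.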

  The computed evaluation metrics are shown below.
[Figure: evaluation metrics of the BP neural network]
  More detailed explanations and the complete code are available in the multi-feature forecasting courses listed at the beginning of this article.


3. Multi-feature electric load forecasting based on RNN and LSTM neural network

3.1. RNN and LSTM neural network models are applied to multi-feature power load forecasting

  The principles of the RNN and LSTM models were explained in detail in an earlier article in this series; readers unfamiliar with them can start there. The recorded course, Quick introduction to deep learning with hands-on practice, also explains these models in detail.
  The RNN model and the LSTM model can be represented by the following figure:
[Figure: structure diagrams of the RNN and LSTM neural networks]
  The figure shows the structure diagram of the RNN neural network and the structure diagram of the LSTM neural network. Apart from the different units, the two networks are basically the same. The LSTM is used below as the example of how multi-feature power load data is used for load forecasting.
[Figure: multi-feature power load data fed into the LSTM network over three time steps]
  The figure shows how multi-feature load data is fed into the LSTM. It is important to read this figure from left to right, in time order, rather than as a single network. Three time steps are drawn. First, the load influencing factors and the load value at time t-1 are fed into the LSTM unit. The result goes to two places: it is passed to the next layer as the output of the hidden layer, and it is passed on as input to the next time step. The input at time t therefore consists of the influencing factors and load at time t together with the output of the previous time step, so it already carries information from the previous moment. The output at time t goes to the same two places, time t+1 is computed in the same way, and the calculation continues by this rule until the last time step.
  The data in this article is sampled every half hour, that is, 48 points per day. Because the load varies periodically with people's daily routines and is also affected by weather and electricity price, the weather factors, electricity price, and load at the previous 48 sampling points are used as the model's input features to predict the load at the 49th sampling point, and the window then slides forward one point at a time.
  In this experiment the LSTM must therefore compute 48 time steps before it outputs the final prediction. Note that 48 time steps does not mean there are 48 separate units; the same unit repeats the calculation 48 times, which is why such a network is called a recurrent neural network. The RNN and the LSTM follow the same calculation process.
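  The point that one unit is reused 48 times can be shown with a few lines of NumPy. The sketch below implements a plain RNN cell (not an LSTM, which adds gates and a cell state) with random, untrained weights; the names and sizes other than 48, 6, and 10 are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

TIME_STEPS, N_FEATURES, N_UNITS = 48, 6, 10

# One set of parameters, shared by every time step (random values, illustration only).
w_x = rng.normal(0, 0.1, (N_FEATURES, N_UNITS))  # input-to-hidden weights
w_h = rng.normal(0, 0.1, (N_UNITS, N_UNITS))     # hidden-to-hidden weights
b = np.zeros(N_UNITS)

def rnn_last_state(window):
    """Run a simple RNN cell over one (48, 6) window and return the final hidden state."""
    h = np.zeros(N_UNITS)
    for x_t in window:  # 48 iterations of the SAME cell
        # New state depends on the current input and the previous state (ReLU activation).
        h = np.maximum(x_t @ w_x + h @ w_h + b, 0.0)
    return h

window = rng.random((TIME_STEPS, N_FEATURES))
n_params = w_x.size + w_h.size + b.size

print(rnn_last_state(window).shape)  # (10,)
print(n_params)                      # 170, independent of the number of time steps
```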


3.2. Data preprocessing and data set division

  The following is the preprocessing and division part of the power load forecasting data set:

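import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Normalize the data to the range 0-1. The scaler is fitted on the training set only
# and then reused for the validation set, so both sets share the same scaling.
scaler = MinMaxScaler(feature_range=(0, 1))
train = scaler.fit_transform(train)
val = scaler.transform(val)
"""
Split the data into features and labels: the weather, electricity price, and load
features at the previous 48 sampling points are used to predict the load at the 49th point.
"""
# Feature list and label list for the training set
x_train = []
y_train = []

# Add the weather, electricity price, and load features of the previous 48 sampling points
# to the list as training features, sliding the window forward one point at a time
for i in np.arange(48, len(train)):
    x_train.append(train[i - 48:i, :])
    y_train.append(train[i, 5])

# Convert the training set from lists to arrays. The LSTM input is 3-D,
# with shape (number of samples, time steps, number of features)
x_train, y_train = np.array(x_train), np.array(y_train)

# Feature list and label list for the validation set
x_val = []
y_val = []

# Same sliding window as above, applied to the validation set
for i in np.arange(48, len(val)):
    x_val.append(val[i - 48:i, :])
    y_val.append(val[i, 5])

# Convert the validation set from lists to arrays
x_val, y_val = np.array(x_val), np.array(y_val)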

  The above is the data preprocessing for the LSTM model and the division into training and validation sets. As with the BP neural network, the data must be normalized first; this speeds up the parameter updates and also helps the prediction accuracy.
  Note that the input data format for RNN and LSTM models is [number of samples, time steps, number of features]. The number of samples is simply how many samples the set contains. The time steps are the number of times the recurrent network computes per sample: this experiment uses the influencing factors and load of the previous 48 time steps to predict the 49th point, so the network computes 48 times, each time on the features of one time step. The number of features is the influencing factors plus the load, that is, dry bulb temperature, dew point temperature, wet bulb temperature, humidity, electricity price, and electricity load, six features in total (note that the load is both a feature and the label). The dimensions of the final data sets are:
[Figure: printed shapes of the training and validation sets]
  The training set has 67,600 samples and the validation set has 9,952 samples, each with 48 time steps and 6 features.
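  These sample counts can be related to the raw row counts. The sliding loop shown earlier produces one sample per row after the first 48, so a set with n rows yields n - 48 samples. Working backwards, and assuming the shapes above were produced by exactly that loop, the training split would contain 67,648 rows and the validation split 10,000 rows. The helper names below are hypothetical.

```python
WINDOW = 48

def window_count(n_rows, window=WINDOW):
    """Number of (window -> next point) samples a series of n_rows yields."""
    return max(n_rows - window, 0)

def rows_needed(n_samples, window=WINDOW):
    """Inverse: number of rows that produce n_samples windows."""
    return n_samples + window

train_rows = rows_needed(67600)
val_rows = rows_needed(9952)

print(train_rows, val_rows)                   # 67648 10000
print((window_count(train_rows), WINDOW, 6))  # (67600, 48, 6)
print((window_count(val_rows), WINDOW, 6))    # (9952, 48, 6)
```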

3.3. Model building

  The following is the RNN neural network model built using the deep learning framework:

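# The imports below assume the Keras that ships with TensorFlow.
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

# Build the RNN with Keras: two recurrent hidden layers with 10 units each
model = Sequential()
model.add(SimpleRNN(10, return_sequences=True, activation='relu'))
model.add(SimpleRNN(10, return_sequences=False, activation='relu'))
model.add(Dense(5, activation='relu'))
model.add(Dense(1))

# Compile the model with the Adam optimizer and a learning rate of 0.01
model.compile(optimizer=keras.optimizers.Adam(0.01), loss='mean_squared_error')

# Train on the training set for 30 epochs with a batch size of 512;
# the validation set is evaluated after every epoch to track its loss
history = model.fit(x_train, y_train, batch_size=512, epochs=30, validation_data=(x_val, y_val))

# Save the trained model
model.save('RNN_model.h5')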

The following is the LSTM neural network model built using the deep learning framework:

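# The imports below assume the Keras that ships with TensorFlow.
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Build the LSTM network with Keras: two LSTM hidden layers with 10 units each
model = Sequential()
model.add(LSTM(10, return_sequences=True, activation='relu'))
model.add(LSTM(10, return_sequences=False, activation='relu'))
model.add(Dense(5, activation='relu'))
model.add(Dense(1))

# Compile the model with the Adam optimizer and a learning rate of 0.01
model.compile(optimizer=keras.optimizers.Adam(0.01), loss='mean_squared_error')

# Train on the training set for 30 epochs with a batch size of 512;
# the validation set is evaluated after every epoch to track its loss
history = model.fit(x_train, y_train, batch_size=512, epochs=30, validation_data=(x_val, y_val))

# Save the trained model
model.save('LSTM_model.h5')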

  The LSTM neural network is used as the example below. The figure shows its training process: the loss keeps decreasing throughout training.
[Figure: training and validation loss of the LSTM neural network]
  The network is clearly close to convergence after about 5 epochs, which is faster than the BP neural network.


3.4. Model prediction

  After training, the model is saved, and the saved file can be used to run inference directly on input features. The generated model file is shown below:
[Figure: the saved LSTM model file]
  The trained model can now be run on the test set, and the predicted values denormalized. The code is as follows:

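from tensorflow.keras.models import load_model

# Load the trained model
model = load_model("LSTM_model.h5")


# Feed the test set into the model for prediction.
# `test`, `x_test`, and `y_test` are assumed to be prepared the same way as the validation set.
predicted = model.predict(x_test)


# Denormalize the true labels so they can be compared with the predictions.
# y_test is reshaped to a column so it can be concatenated with the 5 other feature columns.
real = np.concatenate((test[48:, :-1], y_test.reshape(-1, 1)), axis=1)
real = scaler.inverse_transform(real)
real = real[:, 5]

# Denormalize the values predicted by the model
prediction = np.concatenate((test[48:, :-1], predicted), axis=1)
prediction = scaler.inverse_transform(prediction)
prediction = prediction[:, 5]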

  Compare the predicted values with the true labels and plot them together:

[Figure: predicted versus actual load for the LSTM neural network]
  The comparison shows that the LSTM neural network performs very well on power load forecasting. The model can be assessed more precisely with evaluation metrics.
  The code for the evaluation metrics is as follows:

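from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Compute the evaluation metrics
R2 = r2_score(real, prediction)
MAE = mean_absolute_error(real, prediction)
RMSE = np.sqrt(mean_squared_error(real, prediction))
# MAPE divides the absolute error by the true value, not by the prediction
MAPE = np.mean(np.abs((real - prediction) / real))

# Print the evaluation metrics
print('R2:', R2)
print('MAE:', MAE)
print('RMSE:', RMSE)
print('MAPE:', MAPE)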

  The computed evaluation metrics are shown below:
[Figure: evaluation metrics of the LSTM neural network]
  More detailed explanations and the complete code are available in the multi-feature forecasting courses listed at the beginning of this article.


4. Multi-feature electric load forecasting based on CNN-LSTM neural network

  In some scenarios a combined model performs better than a single neural network model, because the components complement each other. CNN-LSTM is the first combined model explained in this series. As the name suggests, it combines a CNN and an LSTM to learn from and predict power load data. Note that the CNN used here is a one-dimensional convolutional neural network. The convolutional networks most people meet when learning deep learning are two-dimensional, because two-dimensional convolution is particularly good at extracting features from images. This article, however, extracts features from power load data, which the analysis above showed to be typical time series data, so a one-dimensional convolutional neural network is used instead. It is recommended to learn two-dimensional convolutional neural networks before one-dimensional ones; see the convolutional neural network article and the video course listed at the beginning of this article.

4.1. One-dimensional convolution operation and pooling operation

4.1.1. One-dimensional convolution operation

  A two-dimensional convolution slides over the entire feature map in both directions, left to right and top to bottom. A one-dimensional convolution slides in one direction only, along the height H of the data, as shown in the figure.
[Figure: one-dimensional convolution on H×W data]
  As the figure shows, if the data has dimension H×W, only the height of the convolution kernel needs to be set. With a kernel height of 3, the kernel's dimension must be 3×W, because a one-dimensional convolution slides in one direction only. As with two-dimensional convolution, parameters such as the stride can also be set. The kernel slides over the data; at each position the kernel and the corresponding receptive field are multiplied element by element and the products are summed. This repeats until the kernel can slide no further.
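  The sliding multiply-and-sum described above fits in a few lines of NumPy. This is a minimal sketch without padding; the function name `conv1d_single` is hypothetical, and an all-ones kernel is used so the results are easy to verify by hand. The output length is OH = (H - FH) // stride + 1.

```python
import numpy as np

def conv1d_single(data, kernel, stride=1):
    """Slide an (FH, W) kernel down an (H, W) array; return OH output values."""
    h, w = data.shape
    fh, kw = kernel.shape
    assert kw == w, "kernel width must equal data width"
    out_len = (h - fh) // stride + 1
    out = np.empty(out_len)
    for i in range(out_len):
        field = data[i * stride:i * stride + fh, :]  # receptive field
        out[i] = np.sum(field * kernel)              # multiply element-wise, then sum
    return out

data = np.arange(12, dtype=float).reshape(6, 2)  # H=6, W=2
kernel = np.ones((3, 2))                         # FH=3; an all-ones kernel sums the field

print(conv1d_single(data, kernel))            # values 15, 27, 39, 51
print(conv1d_single(data, kernel, stride=2))  # values 15, 39
```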
  The data for a one-dimensional convolution is not necessarily single-channel; it may have multiple channels, and the number of output channels after the convolution can also be chosen. The calculation is shown in the figure below:
[Figure: one-dimensional convolution with multiple channels and multiple kernels]
  Suppose the input data has dimension H×W×C. The convolution kernel must then also have C channels, and its width must equal the width of the feature map, W. If the kernel height is set to FH, the kernel has dimension FH×W×C. The number of output channels is set through the number of kernels: for FN output channels, FN kernels are needed. The final output therefore has dimension OH×1×FN.
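  The same idea extends to several channels and several kernels. In the sketch below the input has shape (H, W, C) and the kernels have shape (FN, FH, W, C), following the notation above; the width-1 axis of the OH×1×FN output is dropped, so the result has shape (OH, FN). The function name and the choice C=2 are arbitrary, and the values are random.

```python
import numpy as np

def conv1d_multi(data, kernels, stride=1):
    """data: (H, W, C); kernels: (FN, FH, W, C); returns an (OH, FN) array."""
    h, w, c = data.shape
    fn, fh, kw, kc = kernels.shape
    assert (kw, kc) == (w, c), "kernel width and channels must match the data"
    out_len = (h - fh) // stride + 1
    out = np.empty((out_len, fn))
    for i in range(out_len):
        field = data[i * stride:i * stride + fh]          # (FH, W, C) receptive field
        out[i] = np.sum(kernels * field, axis=(1, 2, 3))  # one value per kernel
    return out

rng = np.random.default_rng(0)
data = rng.random((48, 6, 2))       # H=48, W=6, C=2
kernels = rng.random((7, 3, 6, 2))  # FN=7 kernels with FH=3

result = conv1d_multi(data, kernels)
print(result.shape)  # (46, 7)
```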

4.1.2. One-dimensional pooling operation

  Like two-dimensional pooling, one-dimensional pooling comes in two forms: maximum pooling and average pooling.
  (1) Maximum pooling
  Maximum pooling takes the maximum value of the selected area, just as in the two-dimensional pooling operation, as shown in the figure below:
[Figure: one-dimensional maximum pooling]
  (2) Average pooling
  Average pooling takes the average value of the selected area, just as in the two-dimensional pooling operation, as shown in the figure below:
[Figure: one-dimensional average pooling]
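  Both pooling variants can be sketched with one small NumPy function. The name `pool1d` and its parameters are hypothetical; the stride defaults to the pool size and no padding is applied. Each channel is pooled independently, so the number of channels does not change.

```python
import numpy as np

def pool1d(data, size=2, stride=None, mode="max"):
    """Pool a (length, channels) array along its first axis (no padding)."""
    stride = stride or size
    out_len = (data.shape[0] - size) // stride + 1
    reduce = np.max if mode == "max" else np.mean
    return np.stack([
        reduce(data[i * stride:i * stride + size], axis=0)
        for i in range(out_len)
    ])

data = np.array([[1.0, 8.0],
                 [3.0, 6.0],
                 [5.0, 4.0],
                 [7.0, 2.0]])  # length 4, 2 channels

max_pooled = pool1d(data, mode="max")    # rows [3, 8] and [7, 4]
avg_pooled = pool1d(data, mode="mean")   # rows [2, 7] and [6, 3]
print(max_pooled)
print(avg_pooled)
```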

4.2. CNN-LSTM model applied to electric load forecasting with multiple influencing factors

  Based on the one-dimensional convolution and pooling operations above, the combined CNN-LSTM neural network model is built as shown in the figure below.
[Figure: structure of the combined CNN-LSTM model]
  The input of the network is multi-feature power load data sampled every 30 minutes, so one day has 48 sampling points. Power load data is typical time series data, in which earlier data influences later data. As analyzed earlier, the data follows a daily cycle, so 48 points are taken as one input. Each point has six features: dry bulb temperature, dew point temperature, wet bulb temperature, humidity, electricity price, and electricity load, so the input has dimension 48×6.
  First, a one-dimensional convolution is applied to the data. As shown above, one kernel turns the data into N×1-dimensional data. The figure uses seven kernels, so the result is seven N×1 outputs, that is, data with 7 channels, each of size N×1.
  Then a one-dimensional pooling operation is applied to the convolved data. For both two-dimensional and one-dimensional pooling, the number of channels is unchanged by the pooling operation. The length after one-dimensional pooling is normally smaller than before (it can also stay the same or become larger, depending on the pooling parameters). Further convolution layers could follow the pooling, but to keep the overall structure easy to read, the model here is drawn with one convolution and one pooling operation.
  Finally, the features extracted by the convolutional layers are fed into the LSTM neural network, which learns the temporal information. Fully connected layers at the end of the network take the extracted and learned features and output the predicted value.

4.3. Data preprocessing and data set division

  The following is the preprocessing and division part of the power load forecasting data set:

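import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Normalize the data to the range 0-1.
# The scaler is fitted on the training set only and reused for the validation set.
scaler = MinMaxScaler(feature_range=(0, 1))
train = scaler.fit_transform(train)
val = scaler.transform(val)

"""
Split the training set into features and labels: the weather, electricity price,
and load features of the previous 48 sampling points are used to predict
the electricity load at the 49th point.
"""
# Feature list and label list for the training set
x_train = []
y_train = []
# Use the weather, price, and load features of the previous 48 sampling points
# as one training sample, sliding the window forward one point at a time
for i in np.arange(48, len(train)):
    x_train.append(train[i - 48:i, :])
    y_train.append(train[i, 5])  # column 5 holds the electricity load (the label)

# Convert the training set from list to array. The LSTM expects a 3-D input
# of shape (number of samples, time steps, number of features)
x_train, y_train = np.array(x_train), np.array(y_train)

# Feature list and label list for the validation set
x_val = []
y_val = []

# Build the validation samples with the same 48-point sliding window
for i in np.arange(48, len(val)):
    x_val.append(val[i - 48:i, :])
    y_val.append(val[i, 5])

# Convert the validation set from list to array
x_val, y_val = np.array(x_val), np.array(y_val)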
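
  To make the resulting array shapes concrete, here is a minimal sketch of the same sliding-window logic wrapped in a helper. The function name `make_windows` and the toy array are hypothetical and stand in for the real data set; only the window length of 48 and the label column 5 come from the code above.

```python
import numpy as np

def make_windows(data, window=48, target_col=5):
    """Return (samples, window, features) inputs and the matching labels."""
    xs, ys = [], []
    for i in range(window, len(data)):
        xs.append(data[i - window:i, :])   # the previous `window` rows, all features
        ys.append(data[i, target_col])     # the label is the load at the next point
    return np.array(xs), np.array(ys)

# Toy stand-in for the normalized data: 100 time points, 6 features
toy = np.arange(100 * 6, dtype=float).reshape(100, 6)
x_toy, y_toy = make_windows(toy)
print(x_toy.shape, y_toy.shape)  # (52, 48, 6) (52,)
```

  A series of length T yields T - 48 samples, because the first 48 points have no complete history window before them.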

4.4. Model building

  The following is the CNN-LSTM neural network model built using the deep learning framework:

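# Imports assume the Keras API bundled with TensorFlow
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

model = Sequential()
# Two convolution + pooling stages extract local features along the time axis
model.add(Conv1D(filters=32, kernel_size=3, strides=1, activation="relu"))
model.add(MaxPooling1D(pool_size=2, strides=1))
model.add(Conv1D(filters=64, kernel_size=2, strides=1, activation="relu"))
model.add(MaxPooling1D(pool_size=3, strides=1))
# Two stacked LSTM layers learn the temporal dependencies
model.add(LSTM(10, return_sequences=True, activation='relu'))
model.add(LSTM(10, return_sequences=False, activation='relu'))
# Fully connected layers produce the single predicted load value
model.add(Dense(5, activation='relu'))
model.add(Dense(1))

# Compile the model with the Adam optimizer and a learning rate of 0.01
model.compile(optimizer=keras.optimizers.Adam(0.01), loss='mean_squared_error')

# Train on the training set with a batch size of 512 for 30 epochs;
# the validation set is used to compute the validation loss after each epoch
history = model.fit(x_train, y_train, batch_size=512, epochs=30, validation_data=(x_val, y_val))
# print(model.summary())

# Save the trained model
model.save('CNN_LSTM_model.h5')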
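
  It can help to trace how the sequence length changes through the convolution and pooling layers before it reaches the LSTM. The sketch below does this arithmetic in plain Python; it assumes the Keras default `padding='valid'` for both layer types, and the helper name `valid_length` is made up for this example.

```python
def valid_length(n, size, stride=1):
    """Output length of a convolution or pooling layer without padding."""
    return (n - size) // stride + 1

# (layer name, kernel or pool size, stride), taken from the model above
layers = [
    ("Conv1D", 3, 1),
    ("MaxPooling1D", 2, 1),
    ("Conv1D", 2, 1),
    ("MaxPooling1D", 3, 1),
]

steps = 48  # time steps in one input sample
lengths = [steps]
for name, size, stride in layers:
    steps = valid_length(steps, size, stride)
    lengths.append(steps)
    print(name, steps)

print(lengths)  # [48, 46, 45, 44, 42]
```

  Under these assumptions the first LSTM layer receives a sequence of 42 time steps with 64 channels, since the last convolution uses 64 filters and pooling leaves the channel count unchanged.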

  The process of training the CNN-LSTM neural network is shown in the figure below, and you can see that the loss value is constantly declining:
insert image description here

4.5. Model prediction

  After training is completed, the trained model is saved, and the saved model parameter file can be used directly for inference on input features. The generated model parameter file is shown below:
insert image description here
  The trained model can now be used on the test set, and the predicted values are then denormalized. The specific code is as follows:

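from tensorflow.keras.models import load_model

# Load the trained model
model = load_model("CNN_LSTM_model.h5")

# Feed the test set into the model for prediction
predicted = model.predict(x_test)

# Denormalize the true labels so they can be compared with the predictions.
# The scaler expects all six columns, so the label column is joined to the
# first five feature columns before inverting.
real = np.concatenate((test[48:, :-1], y_test.reshape(-1, 1)), axis=1)
real = scaler.inverse_transform(real)
real = real[:, 5]

# Denormalize the values predicted by the model in the same way
prediction = np.concatenate((test[48:, :-1], predicted), axis=1)
prediction = scaler.inverse_transform(prediction)
prediction = prediction[:, 5]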
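
  The concatenation above is needed because the scaler was fitted on six columns and inverts all six at once. An equivalent view is that min-max scaling works column by column, so a single column can be restored from its own minimum and maximum. The sketch below shows this with NumPy only; the random data and its value ranges are invented for illustration. In scikit-learn, the fitted `MinMaxScaler` exposes the per-column values as `data_min_` and `data_max_`.

```python
import numpy as np

# Invented raw data: 200 rows, 6 columns, with the load in column 5
rng = np.random.default_rng(0)
raw = rng.uniform(low=[0, 0, 0, 0, 10, 5000],
                  high=[40, 30, 30, 100, 200, 9000],
                  size=(200, 6))

# Min-max scaling to 0-1, done per column
col_min = raw.min(axis=0)
col_max = raw.max(axis=0)
scaled = (raw - col_min) / (col_max - col_min)

# Restore only the load column from its own minimum and maximum
load_col = 5
restored = scaled[:, load_col] * (col_max[load_col] - col_min[load_col]) + col_min[load_col]

print(np.allclose(restored, raw[:, load_col]))  # True
```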

  Compare the predicted value with the real label, and draw the comparison chart of the predicted value and the real value as follows.

insert image description here
  The prediction results of the CNN-LSTM neural network appear to be slightly worse than those of the single LSTM neural network. To confirm this, look at the model's evaluation metrics.
  The code for computing the evaluation metrics is as follows:

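from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Compute the evaluation metrics of the model
R2 = r2_score(real, prediction)
MAE = mean_absolute_error(real, prediction)
RMSE = np.sqrt(mean_squared_error(real, prediction))
# MAPE is the absolute error relative to the true value
MAPE = np.mean(np.abs((real - prediction) / real))

# Print the evaluation metrics of the model
print('R2:', R2)
print('MAE:', MAE)
print('RMSE:', RMSE)
print('MAPE:', MAPE)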
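
  For reference, the four metrics can also be written out directly from their usual definitions. The sketch below uses NumPy only; the function name `regression_metrics` and the four sample values are made up for this example, and MAPE is taken relative to the true values, as it is conventionally defined.

```python
import numpy as np

def regression_metrics(real, prediction):
    """Compute R2, MAE, RMSE, and MAPE from their standard definitions."""
    real = np.asarray(real, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    err = real - prediction
    ss_res = np.sum(err ** 2)                    # residual sum of squares
    ss_tot = np.sum((real - real.mean()) ** 2)   # total sum of squares
    return {
        "R2": 1 - ss_res / ss_tot,
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MAPE": np.mean(np.abs(err / real)),     # requires non-zero true values
    }

m = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])
print(m)
```

  On this toy input every prediction is off by 10, so MAE and RMSE are both 10 and R2 is 0.992.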

  The specific evaluation indicators are calculated as follows:
insert image description here
  For more detailed explanations and the complete code, see my courses:
  [Multi-feature forecasting] Multi-feature power load forecasting based on BP neural network
  [Multi-feature forecasting] Multi-feature power load forecasting based on RNN and LSTM
  [Multi-feature forecasting] Multi-feature power load forecasting based on CNN-LSTM network


5. Summary of comparative analysis of multi-feature power load forecasting models

  We have built the BP, LSTM, and CNN-LSTM neural networks, tested each on the test set, and computed the evaluation metrics of each model. Next, these metric values are used to compare the models built above. The specific values are shown in the table below:

Model      R2       MAE        RMSE       MAPE
BP         0.9734   170.7927   202.6840   0.0208
LSTM       0.9924   86.0592    108.9543   0.0106
CNN-LSTM   0.9894   96.7660    128.6885   0.0118

  The table above lists the evaluation metrics for multi-feature electricity load forecasting with the BP, LSTM, and CNN-LSTM neural networks in this article. The metrics are R2, MAE, RMSE, and MAPE: for R2, the closer to 1 the better; for MAE, RMSE, and MAPE, the closer to 0 the better. The table shows that LSTM performs best, followed by CNN-LSTM, and BP performs worst. The following conclusions can be drawn:
  1. The amount of data in this article is relatively sufficient, so all three models perform well and predict the electricity load accurately.
  2. Many papers report that CNN-LSTM outperforms LSTM, but in this article LSTM performs better. Possible reasons are as follows:
  (1) The parameters of the LSTM and CNN-LSTM models have not been tuned, so neither is necessarily the optimal model.
  (2) This article does not report statistics over a large number of experiments, only the results of one or two runs, so the outcome may be accidental.
  (3) The data in this article is relatively sufficient; it remains to be seen whether the advantage of the CNN-LSTM model shows when data is scarce.
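
  The ranking stated above can be reproduced from the table with a few lines of Python. This is only a convenience sketch: the numbers are copied from the table, and the variable names are chosen for this example.

```python
# Metric values copied from the comparison table above
results = {
    "BP":       {"R2": 0.9734, "MAE": 170.7927, "RMSE": 202.6840, "MAPE": 0.0208},
    "LSTM":     {"R2": 0.9924, "MAE": 86.0592,  "RMSE": 108.9543, "MAPE": 0.0106},
    "CNN-LSTM": {"R2": 0.9894, "MAE": 96.7660,  "RMSE": 128.6885, "MAPE": 0.0118},
}

# Lower RMSE is better, so sort in ascending order
ranking = sorted(results, key=lambda name: results[name]["RMSE"])
print(ranking)  # ['LSTM', 'CNN-LSTM', 'BP']

# Relative RMSE reduction of LSTM compared with BP
rmse_cut = 1 - results["LSTM"]["RMSE"] / results["BP"]["RMSE"]
print(f"LSTM lowers RMSE by {rmse_cut:.1%} relative to BP")
```

  Sorting by MAE or MAPE gives the same order, and sorting by R2 in descending order does as well.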


6. Subsequent update plan of power load forecasting model

  The following articles and teaching videos on power load forecasting models are expected to be updated by the end of 2023. If you want to keep learning power load forecasting, or need it to complete a design project, you can follow me and bookmark this article. I will keep updating the related articles and teaching videos.
  6.1. Single-feature electricity load forecasting based on BP, LSTM, CNN-LSTM neural network algorithms (finished)
  6.2. Multi-feature electricity load forecasting based on BP, RNN, LSTM, CNN-LSTM algorithms
  6.3. BP, RNN, LSTM, CNN-LSTM power load forecasting based on Bayesian optimization
  6.4. BP, RNN, LSTM, CNN-LSTM power load forecasting based on particle swarm optimization
  6.5. BP, RNN, LSTM, CNN-LSTM power load forecasting based on ant colony algorithm optimization
  6.6. BP, RNN, LSTM, CNN-LSTM power load forecasting based on genetic algorithm
  6.7. BP, RNN, LSTM, CNN-LSTM power load forecasting based on wolf pack algorithm
  6.8. BP, RNN, LSTM, CNN-LSTM power load forecasting optimized by wolf pack algorithm
  6.9. BP, RNN, LSTM, CNN-LSTM power load forecasting based on attention mechanism
  6.10. Power load forecasting based on the transformer algorithm


Origin blog.csdn.net/didiaopao/article/details/127038492