Time series prediction - CNN-LSTM implements multi-variable multi-step photovoltaic prediction (Tensorflow)

Table of contents

1 Data processing

1.1 Import library files

1.2 Import data set

1.3 Missing value analysis

2 Construct training data

3 Model training

3.1 CNN-LSTM network 

3.2 Model training

4 Model prediction

Column link: https://blog.csdn.net/qq_41921826/category_12495091.html

1 Data processing

1.1 Import library files

import scipy
import pandas as pd
import numpy as np
import math
import datetime
from matplotlib import pyplot as plt

# Import the TensorFlow deep learning framework
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential, layers, callbacks
from tensorflow.keras.layers import Input, Reshape, Conv2D, MaxPooling2D, LSTM, Dense, Dropout, Flatten

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error, mean_absolute_percentage_error

# Ignore warning messages
import warnings
warnings.filterwarnings("ignore")

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
plt.rcParams.update({'font.size': 18})  # use one font size throughout

1.2 Import data set

The experiments use data set 6: Australian electricity load and price forecast data (Download link). The data set includes the date, hour, dry bulb temperature, dew point temperature, wet bulb temperature, humidity, electricity price and power load, sampled at 30-minute intervals. Two years of data are selected for the experiments and visualized:

from itertools import cycle

# Import the data
data_raw = pd.read_excel("E:\\课题\\08数据集\\澳大利亚电力负荷与价格预测数据\\澳大利亚电力负荷与价格预测数据.xlsx")
# Keep the last two years: 365 days * 24 hours * 2 samples per hour * 2 years
data_raw = data_raw[-365*24*2*2-1:-1]

# Visualize the data
def visualize_data(data, row, col):
    cycol = cycle('bgrcmk')
    cols = list(data.columns)
    # squeeze=False always returns a 2D array of axes, even for a 1x1 grid
    fig, axes = plt.subplots(row, col, figsize=(16, 4), squeeze=False)
    for i, ax in enumerate(axes.flat):
        if i < len(cols):
            ax.plot(data.iloc[:, i], c=next(cycol))
            ax.set_title(cols[i])
        else:
            ax.axis('off')  # more subplots than columns: hide the unused ones

visualize_data(data_raw.iloc[:, 2:], 2, 3)


Looking at part of the power load data on its own shows strong regularity.


1.3 Missing value analysis

Checking the data information first shows that there are no missing values.

Further statistics on missing values confirm that the data is complete. Outlier handling and other preprocessing are left to the reader.
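
The two checks mentioned above are not shown as code in the text. A minimal sketch of how they might look with pandas follows. The small DataFrame and its column names are made up for illustration and stand in for data_raw; the interpolation at the end is only one possible way to fill gaps, not a step taken in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data_raw (column names and values are assumptions)
demo = pd.DataFrame({
    "temperature": [20.1, 21.3, np.nan, 19.8],
    "humidity": [55.0, 57.5, 60.2, 58.1],
    "load": [8013.3, 7726.9, 7372.9, np.nan],
})

# Check 1: dtypes and non-null counts per column
demo.info()

# Check 2: number of missing values in each column
missing_per_column = demo.isnull().sum()
print(missing_per_column)

# If gaps were found, linear interpolation is one simple way to fill them
filled = demo.interpolate(method="linear", limit_direction="both")
print(filled.isnull().sum().sum())
```

On the real data, the same two calls would be applied to data_raw directly.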


2 Construct training data

Select the data columns and drop the time features:

data = data_raw.iloc[:,2:].values

Constructing the training data properly is the key to genuinely predicting the future. First set three parameters, which can be changed as needed: timesteps, the number of past time steps used as input; predict_steps, the number of steps predicted in one pass (it should not exceed the total prediction length); and length, the total number of steps to predict.

timesteps = 48*7  # input window x: 48*7 = 336 samples, i.e. the previous 7 days at 30-minute resolution
predict_steps = 48  # output window y: the next 48 samples (1 day)
length = 48  # total number of steps to predict (48 samples)
feature_num = 5  # number of input features

To predict the predict_steps samples of the next day from the timesteps samples of the previous 7 days, the data set is split with a rolling window: each training sample pairs the features of timesteps consecutive rows with the labels of the following predict_steps rows. At prediction time, the features of the latest timesteps rows are then enough to predict the next predict_steps labels. Because the problem is multi-variable, features and labels are split separately.

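# Build the data set used to actually predict future data.
# The idea: train on pairs of (previous timesteps rows, following predict_steps rows),
# then at prediction time feed the latest timesteps rows to get the next predict_steps values.
# A single variable is split on its own; with multiple variables, features and labels are split separately.
def create_dataset(datasetx, datasety=None, timesteps=36, predict_size=6):
    datax = []  # input windows x
    datay = []  # target windows y
    for each in range(len(datasetx) - timesteps - predict_size):
        x = datasetx[each:each + timesteps]
        if datasety is not None:
            # Multi-variable case: targets come from the separate label array
            y = datasety[each + timesteps:each + timesteps + predict_size]
        else:
            # Single-variable case: targets come from the first column of datasetx
            y = datasetx[each + timesteps:each + timesteps + predict_size, 0]
        datax.append(x)
        datay.append(y)
    return datax, datay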
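
To see what the rolling split produces, here is a small self-contained sketch on toy data. The helper sliding_windows is a hypothetical, simplified version of the function above (multi-variable branch only), and the toy sizes are chosen so the result is easy to check by hand.

```python
import numpy as np

def sliding_windows(features, labels, timesteps, predict_size):
    """Pair each window of `timesteps` feature rows with the next `predict_size` labels."""
    xs, ys = [], []
    for start in range(len(features) - timesteps - predict_size):
        xs.append(features[start:start + timesteps])
        ys.append(labels[start + timesteps:start + timesteps + predict_size])
    return np.array(xs), np.array(ys)

# Toy data: 20 time steps, 2 features, and a label equal to the row index
features = np.arange(40, dtype=float).reshape(20, 2)
labels = np.arange(20, dtype=float)

x, y = sliding_windows(features, labels, timesteps=6, predict_size=3)
print(x.shape, y.shape)  # (windows, window length, features) and (windows, horizon)
print(y[0])              # the labels that directly follow the first window
```

With 20 rows, a window of 6 and a horizon of 3, the loop yields 20 - 6 - 3 = 11 samples, and the first target is rows 6 to 8, which directly follow the first input window (rows 0 to 5).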

Before the split, the data needs to be normalized. The function below normalizes the data, splits it with the method above, and returns the split data together with the fitted scalers. Features and labels are normalized separately because the problem is multi-variable; normalizing them together would cause information leakage when the normalization is applied later. The function is defined as follows:

# Normalize the data and build the training set
def data_scaler(datax, datay=None, timesteps=36, predict_steps=6):
    # Scale the features to [0, 1]
    scaler1 = MinMaxScaler(feature_range=(0, 1))
    datax = scaler1.fit_transform(datax)
    # Train on the earlier rows and keep the last timesteps + predict_steps rows for prediction
    holdout = timesteps + predict_steps
    if datay is not None:
        # Multi-variable case: the labels get their own scaler
        scaler2 = MinMaxScaler(feature_range=(0, 1))
        datay = scaler2.fit_transform(datay)
        trainx, trainy = create_dataset(datax[:-holdout], datay[:-holdout], timesteps, predict_steps)
        trainx = np.array(trainx)
        trainy = np.array(trainy)
        return trainx, trainy, scaler1, scaler2
    else:
        # Single-variable case: the labels are taken from datax itself
        trainx, trainy = create_dataset(datax[:-holdout], timesteps=timesteps, predict_size=predict_steps)
        trainx = np.array(trainx)
        trainy = np.array(trainy)
        return trainx, trainy, scaler1, None

The data is then split and normalized with the functions above: the 48*7 samples of the previous seven days are the features and the 48 samples of the following day are the labels, so that later predictions can map 48*7 rows of features to the next 48 labels.

datax = data[:,:-1]  # all columns except the last one are features
datay = data[:,-1].reshape(data.shape[0],1)  # the last column (power load) is the label
# Pass timesteps and predict_steps explicitly; the defaults (36, 6) do not match the model input
trainx, trainy, scaler1, scaler2 = data_scaler(datax, datay, timesteps, predict_steps)
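
One practical benefit of keeping a separate scaler for the labels is that model outputs can be mapped back to the original units without touching the features. The sketch below illustrates this round trip on random data; the array sizes, value ranges and the variable names feature_scaler and label_scaler are assumptions for the demo.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
features = rng.uniform(0, 100, size=(50, 5))   # hypothetical feature columns
load = rng.uniform(6000, 9000, size=(50, 1))   # hypothetical label column

feature_scaler = MinMaxScaler(feature_range=(0, 1))
label_scaler = MinMaxScaler(feature_range=(0, 1))
features_scaled = feature_scaler.fit_transform(features)
load_scaled = label_scaler.fit_transform(load)

# A model output lives in scaled label space, e.g. with shape (1, predict_steps)
fake_prediction = load_scaled[:8].reshape(1, -1)

# Reshape to (n, 1) so the column layout matches what the label scaler was fitted on
restored = label_scaler.inverse_transform(fake_prediction.reshape(-1, 1)).ravel()
print(np.allclose(restored, load[:8, 0]))
```

Here fake_prediction simply reuses scaled true values, so the inverse transform should recover the original loads.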

3 Model training

3.1 CNN-LSTM network 

CNN-LSTM is a hybrid neural network that combines the feature extraction capabilities of CNN with the long-term memory capabilities of LSTM for time series.

The CNN part mainly consists of four layers: an input layer, a convolution layer, an activation layer (ReLU) and a pooling layer. Each layer processes the data and passes it on to the next. The convolution layer is the most important: it convolves the feature data and hands the result to the activation layer, where the activation function filters it. The last part is the LSTM layer, which further adjusts the model's weights and biases based on the features extracted by the CNN, in preparation for outputting more accurate predictions. Each LSTM unit contains an input gate, a forget gate and an output gate, so during training the usual approach is to tune accuracy by increasing or decreasing the number of LSTM units, and with it the number of forget and input gate units.

Source: Research on short-term wind power prediction method based on improved CNN-LSTM
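
To make the gates mentioned above concrete, here is a minimal NumPy sketch of one LSTM time step using the standard LSTM equations. The weight layout (W, U and b stacked in the order input, forget, candidate, output), the sizes and the random weights are assumptions for the demo and are not taken from the Keras layer used later.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the input (i), forget (f),
    candidate (g) and output (o) parameters along the first axis."""
    z = W @ x_t + U @ h_prev + b                    # shape (4 * units,)
    i, f, g, o = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)    # gates take values in (0, 1)
    g = np.tanh(g)                                  # candidate cell state
    c_t = f * c_prev + i * g                        # forget old memory, write new memory
    h_t = o * np.tanh(c_t)                          # output gate exposes part of the memory
    return h_t, c_t

units, n_features = 4, 5                            # sizes chosen only for the demo
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * units, n_features))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)

h, c = np.zeros(units), np.zeros(units)
for x_t in rng.normal(size=(10, n_features)):       # run 10 time steps
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```

Each of the units has its own input, forget and output gate value, which is why changing the number of units changes the capacity of the layer.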

Data fed into the CNN-LSTM first passes through the CNN's convolution layer, which extracts local features. The resulting feature maps go to the pooling layer, which downsamples them and reduces the amount of data. The features produced by the convolution and pooling layers are then flattened into one-dimensional vectors and fed into the LSTM. Dropout is applied after each LSTM layer to prevent the model from overfitting.
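
The shape flow described above can be traced with a small NumPy sketch. It is not the Keras implementation: the naive single-filter convolution, the padding convention, the random kernels and the toy sizes are all assumptions chosen for illustration. The point is only to show how a (timesteps, features) window becomes a sequence of flattened feature vectors, one per time step, that an LSTM could consume.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive single-filter 2D convolution with zero padding (output keeps the input shape), then ReLU."""
    kh, kw = kernel.shape                      # odd kernel sizes assumed
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(x)
    for r in range(x.shape[0]):
        for c in range(x.shape[1]):
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel)
    return np.maximum(out, 0.0)

def maxpool2d_same(x, size=2):
    """Max pooling with stride 1; the far edges are padded with -inf so the shape is kept."""
    padded = np.pad(x, ((0, size - 1), (0, size - 1)), constant_values=-np.inf)
    out = np.empty_like(x)
    for r in range(x.shape[0]):
        for c in range(x.shape[1]):
            out[r, c] = padded[r:r + size, c:c + size].max()
    return out

timesteps, feature_num, filters = 12, 5, 3     # toy sizes
rng = np.random.default_rng(0)
window = rng.normal(size=(timesteps, feature_num))
kernels = rng.normal(size=(filters, 3, 3))

# One feature map per filter, stacked along a trailing channel axis
maps = np.stack([maxpool2d_same(conv2d_same(window, k)) for k in kernels], axis=-1)
# Flatten everything except the time axis: one feature vector per time step
sequence = maps.reshape(timesteps, -1)
print(maps.shape, sequence.shape)
```

Keeping the time axis while flattening the rest is what lets the LSTM still see an ordered sequence after the CNN stage.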

3.2 Model training

First build the model in the usual way, then train it on trainx and trainy for 50 epochs with a batch size of 128. Here input_shape is the shape of each x produced when the data set was split, i.e. (timesteps, feature_num). (Training on a GPU is recommended. The number of epochs is kept small here because of limited hardware; increasing it is recommended.)

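def CNN_LSTM_model_train(trainx, trainy, timesteps, feature_num, predict_steps):
    # Enable GPU acceleration with on-demand memory growth
    gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    # Define the CNN-LSTM model
    start_time = datetime.datetime.now()
    model = Sequential()
    model.add(Input((timesteps, feature_num)))
    model.add(Reshape((timesteps, feature_num, 1)))  # add a channel axis for Conv2D
    # Convolution layer described in 3.1; the filter count and kernel size are assumed values
    model.add(Conv2D(filters=64, kernel_size=3, padding="same", activation="relu"))
    model.add(MaxPooling2D(pool_size=2, strides=1, padding="same"))
    model.add(Reshape((timesteps, -1)))  # back to (timesteps, features) for the LSTM
    model.add(LSTM(128, return_sequences=True, dropout=0.2))  # LSTM layer with dropout
    model.add(LSTM(64, return_sequences=False, dropout=0.2))  # LSTM layer with dropout
    model.add(Dense(64, activation="relu"))  # wider Dense layer
    model.add(Dense(predict_steps))  # one output per predicted step
    model.compile(loss="mean_squared_error", optimizer="adam", metrics=['mse'])
    # Train the model
    model.fit(trainx, trainy, epochs=50, batch_size=128)
    end_time = datetime.datetime.now()
    running_time = end_time - start_time
    print("Running time:", running_time)
    # Save the model under the file name that section 4 loads
    model.save('BiLSTM_model.h5')
    # Return the trained model
    return model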

Train on the split data:

model = CNN_LSTM_model_train(trainx, trainy, timesteps, feature_num, predict_steps)

4 Model prediction

First load the trained model

# Load the model
from tensorflow.keras.models import load_model
model = load_model('BiLSTM_model.h5')

Prepare the data to be predicted. The last timesteps + predict_steps rows (8 days) are kept aside: the first 7 days serve as the input, and the prediction is compared with the true values of the last day. The input is scaled with the feature scaler fitted earlier, because the model was trained on normalized data.

x_pred = scaler1.transform(datax[-timesteps-predict_steps:-predict_steps])
y_true = datay[-predict_steps:]
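
A quick way to check a hold-out split like this is to run the slices on a toy index array and confirm that the pieces are contiguous. The sketch below assumes the input window is meant to end exactly where the target day begins; the sizes are toy values.

```python
import numpy as np

timesteps, predict_steps = 6, 2        # toy sizes
series = np.arange(20)                 # stand-in for the row index of the data

train_part = series[:-timesteps - predict_steps]              # rows available for training
x_window = series[-timesteps - predict_steps:-predict_steps]  # model input window
y_window = series[-predict_steps:]                            # ground truth to compare against

print(train_part[-1], x_window, y_window)
```

If the three slices are right, the last training row is followed directly by the first input row, and the last input row is followed directly by the first target row.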

Prediction, error calculation and visualization are wrapped in a single function.

# Predict, compute the errors and plot the result
def predict_and_plot(x, y_true, model, scaler, timesteps):
    # Reshape the input x to the (batch, timesteps, features) layout the model expects
    predict_x = np.reshape(x, (1, timesteps, feature_num))
    # Predict
    predict_y = model.predict(predict_x)
    # The label scaler was fitted on a single column, so reshape to (n, 1) before inverting
    y_predict = scaler.inverse_transform(predict_y.reshape(-1, 1)).ravel()
    y_true = np.ravel(y_true)
    # Compute the errors
    r2 = r2_score(y_true, y_predict)
    rmse = np.sqrt(mean_squared_error(y_true, y_predict))
    mae = mean_absolute_error(y_true, y_predict)
    mape = mean_absolute_percentage_error(y_true, y_predict)
    print("r2: %.2f\nrmse: %.2f\nmae: %.2f\nmape: %.2f" % (r2, rmse, mae, mape))
    # Plot the prediction against the true values
    cycol = cycle('bgrcmk')
    plt.figure(dpi=100, figsize=(14, 5))
    plt.plot(y_true, c=next(cycol), markevery=5)
    plt.plot(y_predict, c=next(cycol), markevery=5)
    plt.legend(['y_true', 'y_predict'])
    return y_predict

y_predict = predict_and_plot(x_pred, y_true, model, scaler2, timesteps)

Finally, the visual results and calculated errors are obtained, and the model prediction effect can be further improved through parameter adjustment and data processing.

  • r2: 0.19
  • rmse: 725.34
  • mae: 640.73
  • mape: 0.08
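
For reference, the four metrics in this list can be computed by hand from their usual definitions. The sketch below does so in NumPy on made-up numbers; the helper name regression_metrics and the sample values are assumptions for illustration and are unrelated to the results reported here.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """R2, RMSE, MAE and MAPE from their standard definitions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / y_true))   # a fraction, not a percentage
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"r2": r2, "rmse": rmse, "mae": mae, "mape": mape}

# Made-up numbers, only to exercise the formulas
y_true_demo = np.array([100.0, 200.0, 300.0, 400.0])
y_pred_demo = np.array([110.0, 190.0, 310.0, 390.0])
metrics = regression_metrics(y_true_demo, y_pred_demo)
print(metrics)
```

Note that MAPE is a fraction in this convention, so a value of 0.08 corresponds to an average relative error of about 8%.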


Origin blog.csdn.net/qq_41921826/article/details/134933531