Time series forecasting - BiLSTM implements multi-variable multi-step photovoltaic forecasting (Tensorflow)

Table of contents

1 Data processing

1.1 Import library files

1.2 Import data set

1.3 Missing value analysis

2 Construct training data

3 Model training

3.1 BiLSTM network 

3.2 Model training

4 Model prediction


1 Data processing

1.1 Import library files

import time
import datetime
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt  
from sampen import sampen2  # sampen library for computing sample entropy
from vmdpy import VMD  # VMD decomposition library

import tensorflow as tf 
from sklearn.cluster import KMeans
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error, mean_absolute_percentage_error 
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, LSTM, GRU, Bidirectional
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping

# Ignore warning messages
import warnings
warnings.filterwarnings('ignore')  

1.2 Import data set

The experiment uses data set 8: the Xinjiang photovoltaic and wind power data set (download link). The data set contains component temperature (℃), temperature (℃), air pressure (hPa), humidity (%), total radiation (W/m²), direct radiation (W/m²), scattered radiation (W/m²), and actual power generation (MW), at a 15-minute sampling interval. First, visualize the data:

# Load the data
data_raw = pd.read_excel("E:\\课题\\08数据集\\新疆风电光伏数据\\光伏2019.xlsx")
data_raw
from itertools import cycle
# Visualize the data
def visualize_data(data, row, col):
    cycol = cycle('bgrcmk')
    cols = list(data.columns)
    fig, axes = plt.subplots(row, col, figsize=(16, 4))
    fig.tight_layout()
    if row == 1 and col == 1:  # handle the single-subplot case
        axes = np.array([axes])  # wrap in an array so axes.flat works uniformly
    for i, ax in enumerate(axes.flat):
        if i < len(cols):
            ax.plot(data.iloc[:,i], c=next(cycol))
            ax.set_title(cols[i])
        else:
            ax.axis('off')  # turn off surplus subplots when there are fewer data columns than axes
    plt.subplots_adjust(hspace=0.6)
    plt.show()

visualize_data(data_raw.iloc[:,1:], 2, 4)

Looking at the power data on its own, we find that it shows strong regularities.

The actual power generation (MW) is selected as the prediction target, and the remaining seven variables serve as input features:

1.3 Missing value analysis

First check the data information and find that there are no missing values.

data_raw.info()

Count the missing values to further confirm:

data_raw.isnull().sum()

2 Construct training data

Constructing the training data is the key to truly predicting the future. First set timesteps (the input time window), predict_steps (the number of steps predicted at a time, which should be no larger than the total prediction length), and length (the total prediction length); these parameters can be changed as needed.

timesteps = 96*5  # input window: 96*5 points, i.e. the previous five days form one segment
predict_steps = 96  # output window: the following 96 points form one segment
length = 96  # total number of steps to predict (96 points, one day)
feature_num = 7  # number of input features
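Since the sampling interval is 15 minutes, one day contains 24 × 4 = 96 points, so timesteps = 96*5 corresponds to five days of history and predict_steps = 96 to one day ahead.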

To use the previous five days' timesteps points to predict the next day's predict_steps points, the data set must be divided with a rolling window: the first timesteps rows are the features and the following predict_steps rows are the labels for training, and at prediction time timesteps rows of features predict the next predict_steps labels. Because the data is multi-variable, features and labels are divided separately; otherwise there would be information-leakage problems during normalization.

# Construct the data set used to truly predict future data
# Idea: the first timesteps points serve as features for the following
# predict_steps points, both during training and at prediction time
def create_dataset(datasetx, datasety, timesteps=36, predict_steps=6):
    datax = []  # feature windows
    datay = []  # label windows
    for each in range(len(datasetx) - timesteps - predict_steps):
        x = datasetx[each:each + timesteps]
        y = datasety[each + timesteps:each + timesteps + predict_steps]
        datax.append(x)
        datay.append(y)
    return datax, datay

Before training, the data needs to be normalized and then divided using the method above. The function below returns the divided data together with the fitted scalers:

# Normalize the data and build the training set
def data_scaler(datax, datay):
    # Min-max normalization; features and labels are scaled separately
    scaler1 = MinMaxScaler(feature_range=(0, 1))
    scaler2 = MinMaxScaler(feature_range=(0, 1))
    datax = scaler1.fit_transform(datax)
    datay = scaler2.fit_transform(datay)
    # Train on the earlier data; the last rows are kept for prediction
    trainx, trainy = create_dataset(datax[:-timesteps-predict_steps,:], datay[:-timesteps-predict_steps,0], timesteps, predict_steps)
    trainx = np.array(trainx)
    trainy = np.array(trainy)
    
    return trainx, trainy, scaler1, scaler2

The data is then divided and normalized with the function above. Using the 96*5 points of the previous five days to predict the 96 points of the next day means rolling the data set into samples whose features are the first 96*5 rows and whose labels are the following 96 rows; at prediction time, 96*5 rows of features predict the next 96 labels.
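Note that df_vmd is not constructed anywhere in this excerpt (the name suggests the output of the VMD decomposition imported from vmdpy). As a minimal stand-in, assuming no decomposition is applied, it can be built from the numeric columns of data_raw with actual power generation as the last column:

# Hypothetical stand-in for df_vmd: drop the timestamp column and keep the
# 7 feature columns plus actual power generation (last column)
df_vmd = data_raw.iloc[:, 1:].values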

datax = df_vmd[:,:-1]
datay = df_vmd[:,-1].reshape(df_vmd.shape[0],1)
trainx, trainy, scaler1, scaler2 = data_scaler(datax, datay)
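As a quick sanity check, the resulting shapes should match what the network expects (the number of samples depends on the length of df_vmd):

print(trainx.shape)  # (num_samples, timesteps, feature_num), e.g. (?, 480, 7)
print(trainy.shape)  # (num_samples, predict_steps), e.g. (?, 96)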

3 Model training

3.1 BiLSTM network 

Long Short-Term Memory (LSTM) is a recurrent neural network specially designed to solve the long-term dependency problem of ordinary RNNs.
All RNNs have the form of a chain of repeated neural network modules. In a standard RNN this repeated module has a very simple structure, such as a single tanh layer. LSTM replaces the simple hidden-layer neurons of the recurrent network with a gating mechanism, which alleviates the long-term dependence problem and performs well on time-series tasks.


LSTM neural network
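For reference, the standard LSTM gate equations (a textbook formulation, not taken from the original post), where $\sigma$ is the sigmoid function and $\odot$ is element-wise multiplication:

$$\begin{aligned} f_t &= \sigma(W_f[h_{t-1}, x_t] + b_f) \\ i_t &= \sigma(W_i[h_{t-1}, x_t] + b_i) \\ \tilde{C}_t &= \tanh(W_C[h_{t-1}, x_t] + b_C) \\ C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\ o_t &= \sigma(W_o[h_{t-1}, x_t] + b_o) \\ h_t &= o_t \odot \tanh(C_t) \end{aligned}$$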

A traditional LSTM network only encodes forward along the historical sequence and cannot account for the influence of the reverse sequence. The variation of power data is closely tied to time, and future data is usually similar to past data, so a more comprehensive and accurate prediction also needs to consider the reverse direction. Bi-directional Long Short-Term Memory (BiLSTM) introduces bidirectional computation on top of the original LSTM network: it extracts forward and backward information simultaneously, better mines the time-series characteristics of the data, and further improves the accuracy of the prediction model.



BiLSTM Neural Network
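In formula terms (again a standard formulation), the BiLSTM output at step t concatenates the forward and backward hidden states:

$$h_t = [\overrightarrow{h}_t;\ \overleftarrow{h}_t]$$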

You can use Bidirectional() to build a BiLSTM model and perform the training process. The main implementation code is as follows:

    model.add(Bidirectional(LSTM(units=50, return_sequences=True), input_shape=(timesteps, feature_num)))
    model.add(Bidirectional(LSTM(units=100, return_sequences=True)))
    model.add(Bidirectional(LSTM(units=150)))
  • units=50: the LSTM layer has 50 neurons
  • return_sequences=True: the layer returns the full output sequence rather than only the last output
  • input_shape=(timesteps, feature_num): the shape of the input data, where timesteps and feature_num are the predefined number of time steps and features

The first line adds a bidirectional LSTM layer with units=50 neurons as the first layer of the model, which is why it specifies input_shape.

The second line adds another bidirectional LSTM layer, similar to the previous one but with units=100 neurons.

The third line adds a final bidirectional LSTM layer without return_sequences=True, so it returns only the last value of the output sequence instead of the whole sequence.
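Because the default merge_mode='concat' joins the forward and backward outputs, each Bidirectional layer doubles the feature dimension. A small illustrative check (shapes assume the timesteps=480, feature_num=7 settings above):

import tensorflow as tf
from tensorflow.keras.layers import Bidirectional, LSTM

x = tf.zeros((1, 480, 7))  # (batch, timesteps, feature_num)
y = Bidirectional(LSTM(units=50, return_sequences=True))(x)
print(y.shape)  # (1, 480, 100): forward and backward outputs concatenated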
 

3.2 Model training

First build the model in the usual way, then train it with trainx and trainy for 50 epochs with a batch size of 64; input_shape is the shape of each x produced when dividing the data set. (GPU training is recommended. Because my computer's performance is limited, consider increasing the epochs value; the units of the successive LSTM layers can also be increased.)

# Build and train the BiLSTM model
def BiLSTM_model_train(trainx, trainy):
    # Enable GPU memory growth for acceleration
    gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
              
    # Build the BiLSTM network
    start_time = datetime.datetime.now()
    model = Sequential()
    model.add(Bidirectional(LSTM(units=50, return_sequences=True), input_shape=(timesteps, feature_num)))
    model.add(Bidirectional(LSTM(units=100, return_sequences=True)))
    model.add(Bidirectional(LSTM(units=150)))
    model.add(Dropout(0.1))
    model.add(Dense(predict_steps))
    model.compile(loss='mse', optimizer='adam')
    # Train the model
    model.fit(trainx, trainy, epochs=50, batch_size=64)
    end_time = datetime.datetime.now()
    running_time = end_time - start_time
    # Save the model
    model.save('BiLSTM_model.h5')
    
    # Return the trained model
    return model
model = BiLSTM_model_train(trainx, trainy)
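The ReduceLROnPlateau and EarlyStopping callbacks imported at the top are not used above; a sketch of how the model.fit call inside BiLSTM_model_train could use them (the validation_split value is an assumption, adjust as needed):

# Optional (illustrative): stabilize training with the imported callbacks
callbacks = [
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=1e-5),
    EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True),
]
model.fit(trainx, trainy, epochs=50, batch_size=64,
          validation_split=0.1, callbacks=callbacks)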

4 Model prediction

First load the trained model

# Load the saved model
from tensorflow.keras.models import load_model
model = load_model('BiLSTM_model.h5')

Prepare the data to be predicted. The last six days of data (timesteps + predict_steps rows) were held out during training; the first five of those days are used as input, and the predicted results are compared with the true values of the last day.

# Ground truth: the actual power of the last day (predict_steps rows)
y_true = datay[-predict_steps:]
# Model input: the preceding five days of features, scaled with the fitted scaler
x_pred = scaler1.transform(datax[-timesteps-predict_steps:-predict_steps])

The prediction, error calculation, and visualization steps are encapsulated in a single function.

# Predict, compute errors, and visualize
def predict_and_plot(x, y_true, model, scaler, timesteps):
    # Reshape the input x to the (batch, timesteps, features) format the LSTM expects
    predict_x = np.reshape(x, (1, timesteps, feature_num))  
    # Predict and invert the normalization
    predict_y = model.predict(predict_x)
    predict_y = scaler.inverse_transform(predict_y)
    y_predict = []
    y_predict.extend(predict_y[0])
    
    # Compute error metrics
    r2 = r2_score(y_true, y_predict)
    rmse = mean_squared_error(y_true, y_predict, squared=False)
    mae = mean_absolute_error(y_true, y_predict)
    mape = mean_absolute_percentage_error(y_true, y_predict)
    print("r2: %.2f\nrmse: %.2f\nmae: %.2f\nmape: %.2f" % (r2, rmse, mae, mape))
    
    # Visualize the predictions against the ground truth
    cycol = cycle('bgrcmk')
    plt.figure(dpi=100, figsize=(14, 5))
    plt.plot(y_true, c=next(cycol), markevery=5)
    plt.plot(y_predict, c=next(cycol), markevery=5)
    plt.legend(['y_true', 'y_predict'])
    plt.xlabel('Time')
    plt.ylabel('Power (MW)')
    plt.show()
    
    return y_predict
    
y_predict_nowork = predict_and_plot(x_pred, y_true, model, scaler2, timesteps)

Finally, the visualization shows that the results are not yet very good; the prediction performance can be further improved through parameter tuning and additional data processing.


Origin blog.csdn.net/qq_41921826/article/details/134899327