[Numerical prediction case] (3) LSTM time series power prediction, complete with TensorFlow code

Hello everyone, today I will share how to use an LSTM recurrent neural network for time series prediction. This article covers prediction from a single feature; the next one covers prediction from multiple features. The full code is at the end of the article.

1. Import the toolkit

Here, GPU-accelerated computing is used to speed up the training of the network.

import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
# use GPU acceleration
gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

2. Get the dataset

Download the dataset yourself: https://pan.baidu.com/s/1uWW7w1Ci04U3d8YFYPf3Cw  Extraction code: 00qw

Read the power time series data with the pandas library; it has two columns, the time (Datetime) and the power (AEP_MW).

filepath = 'energy.csv'
data = pd.read_csv(filepath)

3. Data preprocessing

Since the prediction is based on a time series, the index of the data is changed to the time column, and the AEP_MW power column is taken as the training feature.

Because the maximum and minimum of the raw data differ greatly, the training features are standardized so that the scale of the data does not destabilize network training.

temp = data['AEP_MW']  # get the power data
temp.index = data['Datetime']  # use the time column as the index
temp.plot()  # plot the raw series

# train_num is defined in section 4 (the first 90% of the data)
temp_mean = temp[:train_num].mean()  # mean of the training split
temp_std = temp[:train_num].std()  # standard deviation of the training split
# standardize
inputs_feature = (temp - temp_mean) / temp_std
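The effect of this standardization can be checked on a toy series (numpy only; the values below are made up and stand in for the real power data):

```python
import numpy as np

# Toy stand-in for the power series (made-up values, not the real dataset)
series = np.arange(100, dtype=float)
train_num = 80  # statistics must come from the training split only
mu = series[:train_num].mean()
sigma = series[:train_num].std()
scaled = (series - mu) / sigma  # same formula as above

# the training portion of the scaled series now has mean ~0 and std ~1
```

Computing the mean and standard deviation on the training split only avoids leaking information from the validation and test data into training.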

Plot the original data distribution

4. Divide the dataset

First, feature values and their corresponding labels need to be selected with a sliding window over the time series. Here, every 20 consecutive values form one feature window, and the value that follows is the label. Since there is only one feature column, this is equivalent to predicting the 21st value from the first 20. Likewise, to predict a future segment rather than a single point, the 1st to 20th values would be used to predict, for example, the 21st to 30th power values.

start_index — where in the data to start taking sequences from; usually 0.
indices = range(i - history_size, i) — the indices of all elements in one window; i marks the position of the label, so the window covers the history_size values before it.
def database(dataset, start_index, end_index, history_size, target_size):
    data = []  # feature windows
    labels = []  # target values
    # the first window is [0:history_size]
    start_index = start_index + history_size

    # if no end index is given, stop target_size steps before the end
    if end_index is None:
        end_index = len(dataset) - target_size
    # walk over the power data, collecting each window and its target
    for i in range(start_index, end_index):
        indices = range(i - history_size, i)  # indices of all elements in the window
        # save the feature window and its label
        data.append(np.reshape(dataset[indices], (history_size, 1)))
        labels.append(dataset[i + target_size])  # the power value target_size steps ahead
    # return the dataset
    return np.array(data), np.array(labels)
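As a quick sanity check, the windowing function can be run on a toy array (pure numpy; the values and window size here are made up for illustration):

```python
import numpy as np

def database(dataset, start_index, end_index, history_size, target_size):
    data, labels = [], []
    start_index = start_index + history_size
    if end_index is None:
        end_index = len(dataset) - target_size
    for i in range(start_index, end_index):
        indices = range(i - history_size, i)  # the window before position i
        data.append(np.reshape(dataset[indices], (history_size, 1)))
        labels.append(dataset[i + target_size])
    return np.array(data), np.array(labels)

series = np.arange(30, dtype=float)  # toy "power" series 0..29
x, y = database(series, 0, None, history_size=5, target_size=0)
print(x.shape, y.shape)    # (25, 5, 1) (25,)
print(x[0].ravel(), y[0])  # [0. 1. 2. 3. 4.] 5.0
```

Each sample is a window of 5 consecutive values, and its label is the value immediately after the window, matching the "predict the 21st value from the first 20" scheme described above.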

Next, the original data can be divided into a training set, a validation set, and a test set, taking 90%, 9.8%, and 0.2% of the data respectively.

# the first 90% of the data is the training set
train_num = int(len(data) * 0.90)
# 90%–99.8% is used for validation
val_num = int(len(data) * 0.998)
# the last 0.2% is used for testing

# each window holds 20 records and predicts the power at the next time step
history_size = 20
target_size = 0  # offset of the label after the window; 0 means the very next value

# training set
x_train, y_train = database(inputs_feature.values, 0, train_num, 
                            history_size, target_size)

# validation set
x_val, y_val = database(inputs_feature.values, train_num, val_num,
                          history_size, target_size)

# test set
x_test, y_test = database(inputs_feature.values, val_num, None,
                          history_size, target_size)

# inspect the data
print('x_train.shape:', x_train.shape)  # x_train.shape: (109125, 20, 1)

5. Construct the dataset

Convert the divided NumPy training and validation sets to tensor type for network training. The shuffle() function shuffles the training data, and batch() sets how many samples are processed per training step. With the help of an iterator created by iter(), the next() function fetches one batch from the dataset for inspection.

# training set
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_ds = train_ds.shuffle(10000).batch(128)
# validation set
val_ds = tf.data.Dataset.from_tensor_slices((x_val, y_val))
val_ds = val_ds.batch(128) 

# inspect one batch
sample = next(iter(train_ds))
print('x_batch.shape:', sample[0].shape, 'y_batch.shape:', sample[1].shape)
print('input_shape:', sample[0].shape[-2:])
# x_batch.shape: (128, 20, 1) y_batch.shape: (128,)
# input_shape: (20, 1)
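What shuffle().batch(128) does can be mimicked with plain numpy — a simplified sketch of the semantics, not TensorFlow's actual implementation, using dummy arrays with the real window shape:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.zeros((1000, 20, 1))  # dummy features with the real window shape
y = np.zeros(1000)           # dummy labels

idx = rng.permutation(len(x))  # shuffle the sample order
x_batch = x[idx[:128]]         # take the first batch of 128 windows
y_batch = y[idx[:128]]
print(x_batch.shape, y_batch.shape)  # (128, 20, 1) (128,)
```

Shuffling breaks the temporal ordering between windows so that each batch is a mixed sample of the series, which generally stabilizes gradient estimates.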

6. Model building

Since the amount of data in this case is relatively small and there is only one feature, there is no need to use a complex network. One LSTM layer is used to extract features, and a fully connected layer is used to output prediction results.

# input layer
inputs = keras.Input(shape=sample[0].shape[-2:])
# network layers
x = keras.layers.LSTM(8)(inputs)
x = keras.layers.Activation('relu')(x)
outputs = keras.layers.Dense(1)(x)  # a single output value
# build the model
model = keras.Model(inputs, outputs)
# print the model structure
model.summary()
The network architecture is as follows:

Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 20, 1)]           0         
lstm_1 (LSTM)                (None, 8)                 320       
activation_1 (Activation)    (None, 8)                 0         
dense_1 (Dense)              (None, 1)                 9         
Total params: 329
Trainable params: 329
Non-trainable params: 0
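The parameter counts in the summary can be verified by hand. An LSTM layer has four gates (input, forget, cell, output), each with an input kernel, a recurrent kernel, and a bias:

```python
units, input_dim = 8, 1
# each of the 4 gates has: input kernel (input_dim * units),
# recurrent kernel (units * units), and bias (units)
lstm_params = 4 * (input_dim * units + units * units + units)
dense_params = units * 1 + 1  # Dense(1): weights plus bias
print(lstm_params, dense_params, lstm_params + dense_params)  # 320 9 329
```

This matches the 320 LSTM parameters and 9 Dense parameters shown in the summary above, 329 in total.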

7. Network training

First, compile the model: use the Adam optimizer with a learning rate of 0.001, use the mean absolute error as the loss function, and train the network for 20 epochs. For a regression problem the metrics cannot be set to accuracy, which is generally used for classification problems.

epochs = 20  # number of training iterations
opt = keras.optimizers.Adam(learning_rate=0.001)  # optimizer

model.compile(optimizer=opt, loss='mae')  # mean absolute error loss

history = model.fit(train_ds, epochs=epochs, validation_data=val_ds)

The training process is as follows:

Epoch 1/20
853/853 [==============================] - 5s 5ms/step - loss: 0.4137 - val_loss: 0.0878
Epoch 2/20
853/853 [==============================] - 4s 5ms/step - loss: 0.0987 - val_loss: 0.0754
...
Epoch 19/20
853/853 [==============================] - 4s 5ms/step - loss: 0.0740 - val_loss: 0.0607
Epoch 20/20
853/853 [==============================] - 4s 4ms/step - loss: 0.0736 - val_loss: 0.0628

8. View training information

The history variable holds all the information about the training process, and we plot the training set loss and validation set loss curves.

history_dict = history.history  # dictionary of training statistics
train_loss = history_dict['loss']  # training loss
val_loss = history_dict['val_loss']  # validation loss

plt.plot(range(epochs), train_loss, label='train_loss')  # training loss curve
plt.plot(range(epochs), val_loss, label='val_loss')  # validation loss curve
plt.legend()  # show the legend

9. Prediction Phase

Now predict on the test set that was split off earlier. The trained weights are held in the model; use the predict() function to obtain the predicted power y_predict for the features x_test, and plot it against the actual values y_test to show how far the predictions deviate. Metrics such as the variance or standard deviation of the error between predicted and true values can also be computed to quantify the prediction accuracy.

y_predict = model.predict(x_test)  # predict on the test-set features

# x_test corresponds to the preprocessed temp[val_num:-20].values
dates = temp[val_num:-20].index  # time index of the test samples

fig = plt.figure(figsize=(10,5))
axes = fig.add_subplot(111)
# actual values, blue dots
axes.plot(dates, y_test, 'bo', label='actual')
# predicted values, red dots
axes.plot(dates, y_predict, 'ro', label='predict')
# the x-axis tick marks could be set here

plt.legend()  # legend
plt.grid()  # grid

Since x_test corresponds to the feature windows starting at index val_num in the original data, the timestamps dates matching each element of x_test are used as the x-axis scale.
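As mentioned above, error statistics can quantify the prediction quality. A minimal sketch with numpy, using made-up standardized values (y_test, y_predict, and temp_std here are placeholders for the real arrays and training-split statistics):

```python
import numpy as np

# Hypothetical standardized test values and predictions (illustration only)
y_test = np.array([0.2, -0.1, 0.5, 0.0])
y_predict = np.array([0.25, -0.05, 0.4, 0.1])

mae = np.mean(np.abs(y_predict - y_test))           # mean absolute error
rmse = np.sqrt(np.mean((y_predict - y_test) ** 2))  # root mean squared error

# to report errors in MW, scale back by the standard deviation
temp_std = 2500.0  # placeholder for the real training-split std
mae_mw = mae * temp_std
print(mae, rmse, mae_mw)
```

Because the model was trained on standardized data, errors computed directly on y_predict and y_test are in standardized units; multiplying by the training-split standard deviation converts them back to megawatts.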


Origin blog.csdn.net/dgvv4/article/details/124349963