Time series forecasting 18: Forecasting electricity consumption / power generation with ConvLSTM

Continuing from the previous articles, this one uses a ConvLSTM model to forecast electricity consumption / power generation.


Previous LSTM articles on the electricity consumption / generation forecasting task:

[Part1] Encoder-Decoder LSTM model to realize electricity consumption / generation prediction
[Part2] CNN-LSTM model to realize electricity consumption / generation prediction
[Part3]



1. ConvLSTM

1.1 ConvLSTM model

A further extension of the CNN-LSTM approach is to perform the CNN's convolutions (that is, the way a CNN reads the input sequence data) as part of the LSTM at each time step. This combination is called ConvLSTM and, like CNN-LSTM, it is used for spatiotemporal data. An LSTM reads the data directly to compute its internal state and state transitions, and a CNN-LSTM interprets the output of a CNN model; a ConvLSTM instead uses convolutions directly as part of reading the input into the LSTM cells.

The Keras library provides the ConvLSTM2D class, which supports ConvLSTM models for two-dimensional data. It can also be configured for one-dimensional multivariate time series forecasting. By default, ConvLSTM2D expects input data of the shape [samples, timesteps, rows, cols, channels].

Each time step of the data is defined as an image of (rows × cols) data points. Here we are working with a one-dimensional sequence of total daily power consumption. If we use two weeks of data as input, rows is 1 and cols is 14, and the ConvLSTM reads the data in one go: the LSTM reads a single time step of 14 days and performs the convolution across those days.

For this task, the 14 days are instead divided into two subsequences of 7 days each. The ConvLSTM then reads two time steps and applies the CNN processing to the 7 days of data within each time step. With this framing of the problem, the ConvLSTM2D input shape is [n, 2, 1, 7, 1]. The dimensions are:

  • Samples (samples): n, the number of samples in the training data set.
  • Time steps (timesteps): 2, because each 14-day sample window is divided into two subsequences.
  • Rows (rows): 1, because each subsequence is one-dimensional and therefore has a single row.
  • Columns (cols): 7, the number of columns (days) in each subsequence.
  • Channels (channels): 1. In image recognition this is the number of channels; in time series forecasting it corresponds to the number of features, as emphasized repeatedly in the previous articles. The goal in this example is to predict the total daily power consumption of the next week from past total daily power consumption, so there is a single channel (feature): the total daily power consumption. If other features are added, this dimension must change accordingly. Inspecting the columns of the data set makes this clear.
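The shape described above can be illustrated with NumPy alone. The following is a minimal sketch on a made-up series of day indices, not the article's data set; the helper `make_windows` is hypothetical and only mirrors the idea of a stride-1 sliding window with 14 input days and 7 output days.

```python
import numpy as np

def make_windows(series, sw_width=14, n_out=7):
    """Slide a window of sw_width input days and n_out target days (stride 1)."""
    X, y = [], []
    for start in range(len(series) - sw_width - n_out + 1):
        in_end = start + sw_width
        X.append(series[start:in_end])
        y.append(series[in_end:in_end + n_out])
    return np.array(X), np.array(y)

# Toy stand-in for the daily total power consumption: day indices 0..34.
series = np.arange(35, dtype=float)
n_steps, n_length, n_features = 2, 7, 1

X, y = make_windows(series, sw_width=n_steps * n_length, n_out=7)

# [samples, timesteps, rows, cols, channels]
X5d = X.reshape((X.shape[0], n_steps, 1, n_length, n_features))

print(X.shape, y.shape, X5d.shape)
print(X5d[0, 0, 0, :, 0])  # first 7-day subsequence of sample 0
print(X5d[0, 1, 0, :, 0])  # second 7-day subsequence of sample 0
```

Because the reshape keeps the day order, the first time step of each sample holds the first week of the window and the second time step holds the second week.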
Other configurations can also be explored, such as using the total power consumption of the previous 21 days as input, divided into 3 subsequences, and/or providing all eight features as input channels. ConvLSTM2D requires the training data set to be reshaped into the structure [samples, timesteps, rows, cols, channels]. Starting from the complete CNN-LSTM code, the following modifications are needed:

1. Reshape the training samples:

train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))

2. Set the input shape of the ConvLSTM model:

model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu', input_shape=(n_steps, 1, n_length, n_features)))
model.add(Flatten())

3. Reshape the test sample:

input_x = input_x.reshape((1, n_steps, 1, n_length, 1))
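As a rough check on the kernel_size=(1,3) setting: assuming the layer uses unpadded ('valid') convolution, which is the Keras default, each 1 × 7 subsequence shrinks to 1 × 5, which matches the (None, 1, 5, 64) output shape reported in the model summary. In the sketch below, the helper `valid_conv_width` and the kernel values are illustrative assumptions, and NumPy's `correlate` only stands in for the windowing that the layer performs.

```python
import numpy as np

def valid_conv_width(n_length, kernel_width, stride=1):
    """Output width of an unpadded ('valid') convolution."""
    return (n_length - kernel_width) // stride + 1

# One 7-day subsequence and an arbitrary 3-wide kernel (illustrative values).
week = np.array([1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0])
kernel = np.array([0.25, 0.5, 0.25])

# 'valid' mode slides the kernel without padding, the same windowing
# a (1, 3) kernel applies along the columns of a 1 x 7 row.
out = np.correlate(week, kernel, mode='valid')

print(valid_conv_width(7, 3))  # 5
print(out.shape)               # (5,)
```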

1.2 Complete code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Font configuration for displaying CJK text in plots
plt.rcParams['font.sans-serif'] = ['Microsoft JhengHei']
plt.rcParams['axes.unicode_minus'] = False

import math
import sklearn.metrics as skm
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Flatten
from tensorflow.keras.layers import RepeatVector, TimeDistributed
from tensorflow.keras.layers import ConvLSTM2D


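def split_dataset(data):
    '''
    Split the data into training and test sets made up of whole weeks.
    '''
    # data holds the daily power consumption statistics, shape (1442, 8).
    # The test set is the 46 weeks (322 days) of the final year; the remaining
    # 159 weeks (1113 days) form the training set. The slices below do this.
    train, test = data[1:-328], data[-328:-6]
    train = np.array(np.split(train, len(train) // 7)) # group the days into weeks
    test = np.array(np.split(test, len(test) // 7))
    return train, test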

def evaluate_forecasts(actual, predicted):
    '''
    Evaluate one or more weekly forecasts against the expected values.
    Approach: compute the RMSE of each forecast day, then the overall RMSE.
    '''
    scores = list()
    for i in range(actual.shape[1]):
        mse = skm.mean_squared_error(actual[:, i], predicted[:, i])
        rmse = math.sqrt(mse)
        scores.append(rmse)
    
    s = 0 # accumulate the squared errors for the overall RMSE
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col]) ** 2
    score = math.sqrt(s / (actual.shape[0] * actual.shape[1]))
    print('actual.shape[0]:{}, actual.shape[1]:{}'.format(actual.shape[0], actual.shape[1]))
    return score, scores

def summarize_scores(name, score, scores):
    s_scores = ', '.join(['%.1f' % s for s in scores])
    print('%s: [%.3f] %s\n' % (name, score, s_scores))
    
def sliding_window(train, sw_width=7, n_out=7, in_start=0):
    '''
    Cut the series into samples with a sliding window of width sw_width
    (default 7) and a stride of 1.
    '''
    data = train.reshape((train.shape[0] * train.shape[1], train.shape[2])) # flatten the weekly samples into a daily sequence
    X, y = [], []
    
    for _ in range(len(data)):
        in_end = in_start + sw_width
        out_end = in_end + n_out
        
        # Keep a sample only if the whole window fits inside the series;
        # otherwise discard it
        if out_end <= len(data):
            # the training windows advance with a stride of 1
            train_seq = data[in_start:in_end, 0]
            train_seq = train_seq.reshape((len(train_seq), 1))
            X.append(train_seq)
            y.append(data[in_end:out_end, 0])
        in_start += 1
        
    return np.array(X), np.array(y)

def conv_lstm_model(train, sw_width, n_steps, n_length, in_start=0, verbose_set=0, epochs_num=20, batch_size_set=4):
    '''
    Define and fit the ConvLSTM Encoder-Decoder model.
    '''
    train_x, train_y = sliding_window(train, sw_width, in_start=in_start)
    n_features, n_outputs = train_x.shape[2], train_y.shape[1]
    
    # ConvLSTM2D expects [samples, timesteps, rows, cols, channels]
    train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))
    # the decoder emits one value per output day: [samples, n_outputs, 1]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    
    model = Sequential()
    model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu',
                         input_shape=(n_steps, 1, n_length, n_features)))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    
    model.compile(loss='mse', optimizer='adam')
    print(model.summary())
    
    model.fit(train_x, train_y,
              epochs=epochs_num, batch_size=batch_size_set, verbose=verbose_set)
    return model

def forecast(model, pred_seq, sw_width, n_length, n_steps):
    '''
    Forecast the next week from the history in pred_seq.
    '''
    data = np.array(pred_seq)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    
    input_x = data[-sw_width:, 0] # take the last sw_width days of the history as input
    input_x = input_x.reshape((1, n_steps, 1, n_length, 1))
    
    yhat = model.predict(input_x, verbose=0) # forecast the next week
    yhat = yhat[0] # extract the forecast vector
    return yhat

def evaluate_model(model, train, test, sw_width, n_length, n_steps):
    '''
    Evaluate the model with walk-forward validation.
    '''
    history_fore = [x for x in train]
    predictions = list() # stores the walk-forward forecast for each week
    for i in range(len(test)):
        yhat_sequence = forecast(model, history_fore, sw_width, n_length, n_steps) # forecast the next week
        predictions.append(yhat_sequence) # save the forecast
        history_fore.append(test[i, :]) # add the real observations to the history for the next forecast
    
    predictions = np.array(predictions) # evaluate the forecasts for each day of the week
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores

def model_plot(score, scores, days, name):
    '''
    Plot the per-day RMSE curve.
    '''
    plt.figure(figsize=(8,6), dpi=150)
    plt.plot(days, scores, marker='o', label=name)
    plt.grid(linestyle='--', alpha=0.5)
    plt.ylabel(r'$RMSE$', size=15)
    plt.title('Conv-LSTM model prediction results',  size=18)
    plt.legend()
    plt.show()
    
def main_run(dataset, sw_width, days, name, in_start, verbose, epochs, batch_size, n_steps, n_length):
    '''
    Main routine: data preparation, model training, and evaluation.
    '''
    # split into training and test sets
    train, test = split_dataset(dataset.values)
    # train the model
    model = conv_lstm_model(train, sw_width, n_steps, n_length, in_start,
                            verbose_set=verbose, epochs_num=epochs, batch_size_set=batch_size)
    # compute the RMSE
    score, scores = evaluate_model(model, train, test, sw_width, n_length, n_steps)
    # print the scores
    summarize_scores(name, score, scores)
    # plot
    model_plot(score, scores, days, name)
    
    print('------ Not enough hair? Wear a hat. ------')
    
    
if __name__ == '__main__':
    
    dataset = pd.read_csv('household_power_consumption_days.csv', header=0, 
                   engine='c',
                   parse_dates=['datetime'], index_col=['datetime'])
    
    days = ['sun', 'mon', 'tue', 'wed', 'thu', 'fri', 'sat']
    name = 'Conv-LSTM'
    
    # Number and length of the subsequences:
    #   n_steps: number of subsequences; 2 here, splitting the 14 days into two 7-day subsequences
    #   n_length: number of elements in each subsequence row, i.e. the number of columns
    n_steps, n_length = 2, 7
    
    sliding_window_width= n_length * n_steps
    input_sequence_start=0
    
    epochs_num=20
    batch_size_set=16
    verbose_set=0
    
    
    main_run(dataset, sliding_window_width, days, name, input_sequence_start,
             verbose_set, epochs_num, batch_size_set, n_steps, n_length)

Output:

Model: "sequential_12"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv_lst_m2d_2 (ConvLSTM2D)  (None, 1, 5, 64)          50176     
_________________________________________________________________
flatten_3 (Flatten)          (None, 320)               0         
_________________________________________________________________
repeat_vector_8 (RepeatVecto (None, 7, 320)            0         
_________________________________________________________________
lstm_16 (LSTM)               (None, 7, 200)            416800    
_________________________________________________________________
time_distributed_16 (TimeDis (None, 7, 100)            20100     
_________________________________________________________________
time_distributed_17 (TimeDis (None, 7, 1)              101       
=================================================================
Total params: 487,177
Trainable params: 487,177
Non-trainable params: 0
_________________________________________________________________
None
actual.shape[0]:46, actual.shape[1]:7
Conv-LSTM: [382.156] 391.3, 386.4, 340.5, 388.9, 364.4, 309.1, 473.6
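The parameter counts in the summary can be checked by hand. The sketch below assumes the usual parameter formulas for ConvLSTM2D, LSTM, and Dense layers (four gates, each with input weights, recurrent weights, and a bias); the helper function names are hypothetical and are not part of Keras.

```python
def conv_lstm2d_params(filters, kernel_size, in_channels):
    """4 gates, each convolving over input and hidden channels, plus a bias."""
    kh, kw = kernel_size
    return 4 * filters * (kh * kw * (in_channels + filters) + 1)

def lstm_params(units, input_dim):
    """4 gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """Weight matrix plus one bias per unit."""
    return units * input_dim + units

conv = conv_lstm2d_params(filters=64, kernel_size=(1, 3), in_channels=1)
flat = 1 * 5 * 64                # Flatten of the (1, 5, 64) ConvLSTM2D output
lstm = lstm_params(200, flat)
dense_1 = dense_params(100, 200)
dense_2 = dense_params(1, 100)

print(conv, lstm, dense_1, dense_2)     # 50176 416800 20100 101
print(conv + lstm + dense_1 + dense_2)  # 487177
```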

Running the example prints the model summary and then the performance on the test set. Experiments showed that using two convolutional layers makes the model more stable than using a single layer; the listing above uses a single ConvLSTM2D layer. In this run the model reaches an overall RMSE of approximately 382 kW, with per-day RMSE values ranging from about 309 to 474 kW.
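The overall score and the per-day scores are related: every forecast day is evaluated on the same number of test weeks (46), so the overall mean squared error is the mean of the per-day mean squared errors. A small sketch using the rounded per-day values printed in the output:

```python
import math

# Per-day RMSE values as printed in the output (rounded to one decimal).
per_day_rmse = [391.3, 386.4, 340.5, 388.9, 364.4, 309.1, 473.6]

# Overall RMSE = sqrt(mean of the per-day MSEs).
overall = math.sqrt(sum(r ** 2 for r in per_day_rmse) / len(per_day_rmse))

print(f"{overall:.1f}")
```

The result agrees with the reported 382.156 up to the rounding of the per-day values.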


Extensions

  • Input size: explore the number of days used as model input, such as 3 days, 21 days, or 30 days.
  • Model tuning: adjust the structure and hyperparameters of the model to further improve performance.
  • Data scaling: explore whether scaling the data (for example standardization or normalization) improves the performance of the LSTM models.
  • Learning diagnostics: use diagnostics such as learning curves of the training and validation loss and the mean squared error to guide changes to the structure and hyperparameters of the LSTM model.
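For the data scaling idea, one possible starting point is standardization with statistics estimated on the training set only. This is a minimal sketch under that assumption; the function names and the toy numbers are made up for illustration and do not come from the data set.

```python
import numpy as np

def fit_standardizer(train_values):
    """Estimate mean and standard deviation on the training data only."""
    return float(np.mean(train_values)), float(np.std(train_values))

def standardize(values, mean, std):
    return (np.asarray(values) - mean) / std

def invert_standardize(values, mean, std):
    return np.asarray(values) * std + mean

# Toy daily totals (made-up numbers).
train = np.array([1200.0, 1500.0, 900.0, 1800.0, 1600.0, 1400.0, 1100.0])
test = np.array([1300.0, 1700.0])

mean, std = fit_standardizer(train)
train_scaled = standardize(train, mean, std)
test_scaled = standardize(test, mean, std)  # reuse the training statistics

# Forecasts made in scaled units must be mapped back before computing RMSE in kW.
restored = invert_standardize(test_scaled, mean, std)
print(np.allclose(restored, test))
```

Fitting the statistics on the training set alone keeps information from the test weeks out of the model inputs.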

Summary

These three articles described how to develop LSTM models for multi-step time series forecasting of household electricity consumption. The main topics were:

  • How to develop and evaluate univariate and multivariate Encoder-Decoder LSTM models for multi-step time series forecasting.
  • How to develop and evaluate a CNN-LSTM Encoder-Decoder model for multi-step time series forecasting.
  • How to develop and evaluate a ConvLSTM Encoder-Decoder model for multi-step time series forecasting.

This concludes the power consumption forecasting task. The next article begins to introduce time series classification tasks, such as human activity recognition and vehicle driving behavior recognition.



Origin blog.csdn.net/weixin_39653948/article/details/105447616