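A Hands-On Tutorial on Time Series Modeling with LSTMs and Their Variants

This article is adapted from Jason Brownlee's hands-on tutorial "How to Develop LSTM Models for Time Series Forecasting".

Long Short-Term Memory networks, or LSTMs for short, can be applied to time series forecasting.

There are many types of LSTM models that can be used for each specific type of time series forecasting problem.

In this tutorial, you will discover how to develop a suite of LSTM models for a range of standard time series forecasting problems.

The objective of this tutorial is to provide standalone examples of each model on each type of time series problem as a template that you can copy and adapt for your own forecasting problem.

After completing this tutorial, you will know:

How to develop LSTM models for univariate time series forecasting.
How to develop LSTM models for multivariate time series forecasting.
How to develop LSTM models for multi-step time series forecasting.

This is a long and important post; you may want to bookmark it for future reference.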

Let's get started.

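Tutorial Overview

In this tutorial, we will explore how to develop a suite of different types of LSTM models for time series forecasting.

The models are demonstrated on small contrived time series problems intended to give the flavor of the type of problem being addressed. The chosen model configurations are arbitrary and are not optimized for each problem; that is not the goal.

This tutorial is divided into four parts. They are:

Univariate LSTM Models
Multivariate LSTM Models
Multi-Step LSTM Models
Multivariate Multi-Step LSTM Models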

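Univariate LSTM Models

LSTMs can be used to model univariate time series forecasting problems.

These are problems comprised of a single series of observations, where a model is required to learn from the series of past observations to predict the next value in the sequence.

We will demonstrate a number of variations of the LSTM model for univariate time series forecasting.

This section is divided into six parts. They are:

Data Preparation
Vanilla LSTM
Stacked LSTM
Bidirectional LSTM
CNN LSTM
ConvLSTM

Each of these models is demonstrated for one-step univariate time series forecasting, but can easily be adapted and used as the input part of a model for other types of time series forecasting problems.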

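Data Preparation

Before a univariate series can be modeled, it must be prepared.

The LSTM model will learn a function that maps a sequence of past observations as input to an output observation. As such, the sequence of observations must be transformed into multiple examples from which the LSTM can learn.

Consider a given univariate sequence:

[10, 20, 30, 40, 50, 60, 70, 80, 90]

We can divide the sequence into multiple input/output patterns called samples, where three time steps are used as input and one time step is used as output for the one-step prediction being learned.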

X,				y
10, 20, 30		40
20, 30, 40		50
30, 40, 50		60
...

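The split_sequence() function below implements this behavior and will split a given univariate sequence into multiple samples, where each sample has a specified number of time steps and the output is a single time step.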

def split_sequence(sequence, n_steps):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the sequence
		if end_ix > len(sequence)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

We can demonstrate this function on our small contrived dataset above.

The complete example is listed below.

# univariate data preparation
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the sequence
		if end_ix > len(sequence)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# summarize the data
for i in range(len(X)):
	print(X[i], y[i])

Running the example splits the univariate series into six samples, where each sample has three input time steps and one output time step.

[10 20 30] 40
[20 30 40] 50
[30 40 50] 60
[40 50 60] 70
[50 60 70] 80
[60 70 80] 90
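
As a cross-check, the same windows can be produced without an explicit loop. The sketch below is not part of the original tutorial: it assumes NumPy 1.20 or later (for sliding_window_view), and the function name split_sequence_vectorized is a hypothetical one chosen for illustration.

```python
from numpy import array
from numpy.lib.stride_tricks import sliding_window_view

# hypothetical vectorized alternative to split_sequence()
def split_sequence_vectorized(sequence, n_steps):
    seq = array(sequence)
    # each window holds n_steps inputs followed by the single output value
    windows = sliding_window_view(seq, n_steps + 1)
    return windows[:, :-1].copy(), windows[:, -1].copy()

raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
X, y = split_sequence_vectorized(raw_seq, 3)
print(X.shape, y.shape)  # (6, 3) (6,)
```

The number of samples is len(sequence) - n_steps, here 9 - 3 = 6, which matches the six rows printed above.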

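Now that we know how to prepare a univariate series for modeling, let's look at developing LSTM models that can learn the mapping of inputs to outputs, starting with a Vanilla LSTM.

Vanilla LSTM

A Vanilla LSTM is an LSTM model that has a single hidden layer of LSTM units and an output layer used to make a prediction.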

# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

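Key in the definition is the shape of the input; that is what the model expects as input for each sample in terms of the number of time steps and the number of features.

We are working with a univariate series, so the number of features is one, for one variable.

The number of time steps as input is the number we chose when preparing our dataset as an argument to the split_sequence() function.

The shape of the input for each sample is specified in the input_shape argument on the definition of the first hidden layer.

We almost always have multiple samples, therefore, the model will expect the input component of training data to have the dimensions or shape:

[samples, timesteps, features]

Our split_sequence() function in the previous section outputs X with the shape [samples, timesteps], so we can easily reshape it to add an extra dimension for the one feature.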

# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

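In this case, we define a model with 50 LSTM units in the hidden layer and an output layer that predicts a single numerical value.

The model is fit using the efficient Adam version of stochastic gradient descent and optimized using the mean squared error, or 'mse', loss function.

Once the model is defined, we can fit it on the training dataset.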

# fit model
model.fit(X, y, epochs=200, verbose=0)

After the model is fit, we can use it to make a prediction.

We can predict the next value in the sequence by providing the input:

[70, 80, 90]

And expecting the model to predict something like:

[100]

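The model expects the input shape to be three-dimensional with [samples, timesteps, features], therefore, we must reshape the single input sample before making the prediction.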

# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

We can tie all of this together and demonstrate how to develop a Vanilla LSTM for univariate time series forecasting and make a single prediction.

# univariate lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the sequence
		if end_ix > len(sequence)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

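Running the example prepares the data, fits the model, and makes a prediction.

Your results may vary given the stochastic nature of the algorithm; try running the example a few times.

We can see that the model predicts the next value in the sequence.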

[[102.09213]]

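Stacked LSTM

Multiple hidden LSTM layers can be stacked one on top of another in what is referred to as a Stacked LSTM model.

An LSTM layer requires a three-dimensional input, and by default an LSTM produces a two-dimensional output as an interpretation from the end of the sequence.

We can address this by having the LSTM output a value for each time step in the input data by setting the return_sequences=True argument on the layer. This allows us to have 3D output from the hidden LSTM layer as input to the next.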
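
The difference between the two output shapes can be illustrated without Keras. The sketch below is an assumption-laden toy: it uses a plain tanh recurrence with random weights (not an LSTM, and not how Keras is implemented) purely to show why keeping every time step gives a 3D array while keeping only the last step gives a 2D array.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_steps, n_features, n_units = 6, 3, 1, 4

# toy weights for a plain tanh recurrence (NOT an LSTM; used only to show shapes)
W_x = rng.normal(size=(n_features, n_units))
W_h = rng.normal(size=(n_units, n_units))

X = rng.normal(size=(n_samples, n_steps, n_features))
h = np.zeros((n_samples, n_units))
states = []
for t in range(n_steps):
    # update the hidden state from the current input and the previous state
    h = np.tanh(X[:, t, :] @ W_x + h @ W_h)
    states.append(h)

all_states = np.stack(states, axis=1)  # analogous to return_sequences=True
last_state = all_states[:, -1, :]      # analogous to the default behavior
print(all_states.shape)  # (6, 3, 4)
print(last_state.shape)  # (6, 4)
```

Only the 3D array has a time-step axis, which is what a following recurrent layer needs as input.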

We can therefore define a Stacked LSTM as follows.

# define model
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

We can tie this together; the complete code example is listed below.

# univariate stacked lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a univariate sequence
def split_sequence(sequence, n_steps):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the sequence
		if end_ix > len(sequence)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example predicts the next value in the sequence, which we expect would be 100.

[[102.47341]]

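Bidirectional LSTM

On some sequence prediction problems, it can be beneficial to allow the LSTM model to learn the input sequence both forward and backward and concatenate both interpretations. This is called a Bidirectional LSTM.

We can implement a Bidirectional LSTM for univariate time series forecasting by wrapping the first hidden layer in a wrapper layer called Bidirectional.

An example of defining a Bidirectional LSTM to read input both forward and backward is as follows.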

# define model
model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
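
To make the idea of concatenating both interpretations concrete, here is a rough NumPy sketch. It again uses a toy tanh recurrence with random weights rather than a real LSTM, and the helper run_recurrence is a hypothetical name; the point is only that reading the sequence in both directions and concatenating doubles the width of the output.

```python
import numpy as np

# hypothetical helper: run a toy tanh recurrence and return the final state
def run_recurrence(X, W_x, W_h):
    h = np.zeros((X.shape[0], W_h.shape[0]))
    for t in range(X.shape[1]):
        h = np.tanh(X[:, t, :] @ W_x + h @ W_h)
    return h

rng = np.random.default_rng(1)
n_units = 4
X = rng.normal(size=(6, 3, 1))  # [samples, timesteps, features]

# each direction has its own (toy) weights
fw = (rng.normal(size=(1, n_units)), rng.normal(size=(n_units, n_units)))
bw = (rng.normal(size=(1, n_units)), rng.normal(size=(n_units, n_units)))

forward = run_recurrence(X, *fw)
backward = run_recurrence(X[:, ::-1, :], *bw)  # same data, reversed in time
merged = np.concatenate([forward, backward], axis=-1)
print(merged.shape)  # (6, 8)
```

With 50 units per direction, as in the model above, the concatenated output passed to the Dense layer would be 100 values wide.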

The complete example of the Bidirectional LSTM for univariate time series forecasting is listed below.

# univariate bidirectional lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Bidirectional

# split a univariate sequence
def split_sequence(sequence, n_steps):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the sequence
		if end_ix > len(sequence)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example predicts the next value in the sequence, which we expect would be 100.

[[101.48093]]

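CNN LSTM

A convolutional neural network, or CNN for short, is a type of neural network developed for working with two-dimensional image data.

The CNN can be very effective at automatically extracting and learning features from one-dimensional sequence data such as univariate time series data.

A CNN model can be used in a hybrid model with an LSTM backend, where the CNN is used to interpret subsequences of input that together are provided as a sequence to an LSTM model to interpret. This hybrid model is called a CNN-LSTM.

The first step is to split the input sequences into subsequences that can be processed by the CNN model. For example, we can first split our univariate time series data into input/output samples with four steps as input and one as output. Each sample can then be split into two sub-samples, each with two time steps. The CNN can interpret each subsequence of two time steps and provide a time series of interpretations of the subsequences to the LSTM model to process as input.

We can parameterize this and define the number of subsequences as n_seq and the number of time steps per subsequence as n_steps. The input data can then be reshaped to have the required structure:

[samples, subsequences, timesteps, features]

For example: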

# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, n_steps, n_features))
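
To see what this reshape does to one sample, the small sketch below applies it to the four-step input [10, 20, 30, 40]. The variable names sample and reshaped are introduced here for illustration only.

```python
from numpy import array

sample = array([10, 20, 30, 40])
n_seq, n_steps, n_features = 2, 2, 1

# one sample, split into two subsequences of two time steps each
reshaped = sample.reshape((1, n_seq, n_steps, n_features))
print(reshaped.shape)        # (1, 2, 2, 1)
print(reshaped[0, 0, :, 0])  # [10 20]
print(reshaped[0, 1, :, 0])  # [30 40]
```

The order of the observations is preserved: the first subsequence holds the two oldest values and the second holds the two most recent.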

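We want to reuse the same CNN model when reading in each subsequence of data separately.

This can be achieved by wrapping the entire CNN model in a TimeDistributed wrapper that will apply the entire model once per input, in this case, once per input subsequence.

The CNN model first has a convolutional layer for reading across the subsequence, which requires a number of filters and a kernel size to be specified. The number of filters is the number of reads or interpretations of the input sequence. The kernel size is the number of time steps included in each 'read' operation of the input sequence.

The convolutional layer is followed by a max pooling layer that distills the filter maps down to 1/2 of their size, keeping the most salient features. These structures are then flattened down to a single one-dimensional vector to be used as a single input time step to the LSTM layer.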

model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, n_steps, n_features)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
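
The pooling and flattening steps can be sketched in plain NumPy. This is a simplified illustration, not the Keras implementation: max_pool_1d is a hypothetical helper that assumes non-overlapping windows (stride equal to pool_size) and drops any trailing steps that do not fill a window, and the feature map values are made up.

```python
import numpy as np

# hypothetical helper: non-overlapping 1D max pooling over the time axis
def max_pool_1d(feature_map, pool_size=2):
    # feature_map has shape [timesteps, filters]
    n_windows = feature_map.shape[0] // pool_size
    trimmed = feature_map[:n_windows * pool_size]
    return trimmed.reshape(n_windows, pool_size, -1).max(axis=1)

# made-up feature map: 2 time steps, 2 filters
feature_map = np.array([[0.1, 0.9],
                        [0.7, 0.2]])
pooled = max_pool_1d(feature_map)
print(pooled)          # [[0.7 0.9]]
print(pooled.ravel())  # [0.7 0.9]
```

With two time steps per subsequence and a pool size of two, each filter map collapses to a single value, so flattening yields one vector per subsequence with one entry per filter.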

Next, we can define the LSTM part of the model that interprets the CNN model's read of the input sequence and makes a prediction.

model.add(LSTM(50, activation='relu'))
model.add(Dense(1))

We can tie all of this together; the complete example of a CNN-LSTM model for univariate time series forecasting is listed below.

# univariate cnn lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import TimeDistributed
from keras.layers import Conv1D
from keras.layers import MaxPooling1D

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the sequence
		if end_ix > len(sequence)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, n_steps, n_features))
# define model
model = Sequential()
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, n_steps, n_features)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([60, 70, 80, 90])
x_input = x_input.reshape((1, n_seq, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example predicts the next value in the sequence, which we expect would be 100.

[[101.69263]]

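ConvLSTM

A type of LSTM related to the CNN-LSTM is the ConvLSTM, where the convolutional reading of input is built directly into each LSTM unit.

The ConvLSTM was developed for reading two-dimensional spatio-temporal data, but can be adapted for use with univariate time series forecasting.

The layer expects input as a sequence of two-dimensional images, therefore the shape of input data must be:

[samples, timesteps, rows, columns, features]

For our purposes, we can split each sample into subsequences where timesteps will become the number of subsequences, or n_seq, and columns will be the number of time steps for each subsequence, or n_steps. The number of rows is fixed at 1 as we are working with one-dimensional data.

We can now reshape the prepared samples into the required structure.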

# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, rows, columns, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, 1, n_steps, n_features))
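
As a quick sanity check of the five-dimensional layout, the sketch below builds the four-step samples with a list comprehension (a shortcut assumed here in place of split_sequence()) and inspects where the values of the first sample end up.

```python
from numpy import array

raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
window = 4  # four input time steps per sample

# shortcut for the input part of split_sequence(raw_seq, 4)
X = array([raw_seq[i:i + window] for i in range(len(raw_seq) - window)])
n_seq, n_steps, n_features = 2, 2, 1

# [samples, timesteps, rows, columns, features]
X5 = X.reshape((X.shape[0], n_seq, 1, n_steps, n_features))
print(X5.shape)           # (5, 2, 1, 2, 1)
print(X5[0, 0, 0, :, 0])  # [10 20]
print(X5[0, 1, 0, :, 0])  # [30 40]
```

Each sample becomes a sequence of two "images" that are one row high and two columns wide, with a single channel.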

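We can define the ConvLSTM as a single layer in terms of the number of filters and a two-dimensional kernel size in terms of (rows, columns). As we are working with a one-dimensional series, the number of rows in the kernel is always fixed to 1.

The output of the model must then be flattened before it can be interpreted and a prediction made.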

model.add(ConvLSTM2D(filters=64, kernel_size=(1,2), activation='relu', input_shape=(n_seq, 1, n_steps, n_features)))
model.add(Flatten())

The complete example of a ConvLSTM for one-step univariate time series forecasting is listed below.

# univariate convlstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import ConvLSTM2D

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the sequence
		if end_ix > len(sequence)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, rows, columns, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, 1, n_steps, n_features))
# define model
model = Sequential()
model.add(ConvLSTM2D(filters=64, kernel_size=(1,2), activation='relu', input_shape=(n_seq, 1, n_steps, n_features)))
model.add(Flatten())
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([60, 70, 80, 90])
x_input = x_input.reshape((1, n_seq, 1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example predicts the next value in the sequence, which we expect would be 100.

[[103.68166]]

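Now that we have looked at LSTM models for univariate data, let's turn our attention to multivariate data.

Multivariate LSTM Models

Multivariate time series data means data where there is more than one observation for each time step.

There are two main models that we may require with multivariate time series data; they are:

Multiple Input Series.
Multiple Parallel Series.

Let's take a look at each in turn.

Multiple Input Series

A problem may have two or more parallel input time series and an output time series that is dependent on the input time series.

The input time series are parallel because each series has an observation at the same time steps.

We can demonstrate this with a simple example of two parallel input time series where the output series is the simple addition of the input series.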

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])

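We can reshape these three arrays of data as a single dataset where each row is a time step and each column is a separate time series. This is a standard way of storing parallel time series in a CSV file.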

# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
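
As an aside, NumPy's column_stack can build the same [rows, columns] dataset directly from the one-dimensional arrays, without the explicit reshape calls. This is an alternative sketch, not the approach used in the rest of this tutorial.

```python
from numpy import array, column_stack

in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
# element-wise sum of the two input series
out_seq = in_seq1 + in_seq2

# stack the 1D series as columns of a 2D array
dataset = column_stack((in_seq1, in_seq2, out_seq))
print(dataset.shape)  # (9, 3)
print(dataset[0])     # [10 15 25]
```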

The complete example is listed below.

# multivariate data preparation
from numpy import array
from numpy import hstack
# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
print(dataset)

Running the example prints the dataset with one row per time step and one column for each of the two input and one output parallel time series.

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

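As with the univariate time series, we must structure these data into samples with input and output elements.

An LSTM model needs sufficient context to learn a mapping from an input sequence to an output value. LSTMs can support parallel input time series as separate variables or features. Therefore, we need to split the data into samples maintaining the order of observations across the two input sequences.

If we chose three input time steps, then the first sample would look as follows:

Input:

10, 15
20, 25
30, 35

Output:

65

That is, the first three time steps of each parallel series are provided as input to the model, and the model associates this with the value in the output series at the third time step, in this case, 65.

We can see that, in transforming the time series into input/output samples to train the model, we will have to discard some values from the output time series where we do not have values in the input time series at prior time steps. In turn, the choice of the number of input time steps will have an important effect on how much of the training data is used.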
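
The effect of the window size on the amount of usable training data is simple arithmetic. In the sketch below, count_samples is a hypothetical helper that assumes the output is aligned with the last input time step, as described above.

```python
# hypothetical helper: samples available when the output is taken
# at the same time step as the last input row
def count_samples(n_rows, n_steps):
    return max(0, n_rows - n_steps + 1)

n_rows = 9  # rows in the contrived dataset
for n_steps in (2, 3, 5):
    n_samples = count_samples(n_rows, n_steps)
    n_discarded = n_rows - n_samples  # output values with too little history
    print(n_steps, n_samples, n_discarded)
# prints:
# 2 8 1
# 3 7 2
# 5 5 4
```

With three input time steps, nine rows yield seven samples, and the first two output values (25 and 45) are never used as targets.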

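We can define a function named split_sequences() that will take a dataset as we have defined it, with rows for time steps and columns for parallel series, and return input/output samples.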

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the dataset
		if end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

We can test this function on our dataset using three time steps for each input time series as input.

The complete example is listed below.

# multivariate data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the dataset
		if end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
	print(X[i], y[i])

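Running the example first prints the shape of the X and y components.

We can see that the X component has a three-dimensional structure.

The first dimension is the number of samples, in this case 7. The second dimension is the number of time steps per sample, in this case 3, the value specified to the function. Finally, the last dimension specifies the number of parallel time series or the number of variables, in this case 2 for the two parallel series.

This is the exact three-dimensional structure expected by an LSTM as input. The data is ready to use without further reshaping.

We can then see that the input and output for each sample is printed, showing the three time steps for each of the two input series and the associated output for each sample.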

(7, 3, 2) (7,)

[[10 15]
 [20 25]
 [30 35]] 65
[[20 25]
 [30 35]
 [40 45]] 85
[[30 35]
 [40 45]
 [50 55]] 105
[[40 45]
 [50 55]
 [60 65]] 125
[[50 55]
 [60 65]
 [70 75]] 145
[[60 65]
 [70 75]
 [80 85]] 165
[[70 75]
 [80 85]
 [90 95]] 185
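
One way to confirm that the samples are aligned correctly is to check that every output equals the sum of the inputs at the last time step of its sample. The sketch below rebuilds X and y with slicing (a shortcut assumed here in place of split_sequences()) and performs that check.

```python
from numpy import array, column_stack, array_equal

in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
dataset = column_stack((in_seq1, in_seq2, in_seq1 + in_seq2))

n_steps = 3
# shortcut equivalent of split_sequences(dataset, n_steps) for this check
X = array([dataset[i:i + n_steps, :-1] for i in range(len(dataset) - n_steps + 1)])
y = dataset[n_steps - 1:, -1]

# the target is the sum of the two inputs at the final time step of each sample
print(X.shape, y.shape)                           # (7, 3, 2) (7,)
print(array_equal(X[:, -1, :].sum(axis=1), y))    # True
```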

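We are now ready to fit an LSTM model on this data.

Any of the varieties of LSTMs in the previous section can be used, such as a Vanilla, Stacked, Bidirectional, CNN, or ConvLSTM model.

We will use a Vanilla LSTM where the number of time steps and parallel series (features) are specified for the input layer via the input_shape argument.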

# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

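When making a prediction, the model expects three time steps for two input time series.

We can predict the next value in the output series by providing the input values of: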

80,	 85
90,	 95
100, 105

The shape of the one sample with three time steps and two variables must be [1, 3, 2].

We would expect the next value in the sequence to be 100 + 105, or 205.

# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

The complete example is listed below.

# multivariate lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the dataset
		if end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example prepares the data, fits the model, and makes a prediction.

[[208.13531]]

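Multiple Parallel Series

An alternate time series problem is the case where there are multiple parallel time series and a value must be predicted for each.

For example, given the data from the previous section: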

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

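We may want to predict the value for each of the three time series for the next time step.

This might be referred to as multivariate forecasting.

Again, the data must be split into input/output samples in order to train a model.

The first sample of this dataset would be:

Input:

10, 15, 25
20, 25, 45
30, 35, 65

Output:

40, 45, 85

The split_sequences() function below will split multiple parallel time series, with rows for time steps and one series per column, into the required input/output shape.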

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the dataset
		if end_ix > len(sequences)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

We can demonstrate this on the contrived problem; the complete example is listed below.

# multivariate output data prep
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the dataset
		if end_ix > len(sequences)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
	print(X[i], y[i])

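Running the example first prints the shape of the prepared X and y components.

The shape of X is three-dimensional, including the number of samples (6), the number of time steps chosen per sample (3), and the number of parallel time series or features (3).

The shape of y is two-dimensional, as we might expect: the number of samples (6) and the number of time variables per sample to be predicted (3).

The data is ready to use in an LSTM model that expects three-dimensional input and two-dimensional output shapes for the X and y components of each sample.

Then, each of the samples is printed showing the input and output components of each sample.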

(6, 3, 3) (6, 3)

[[10 15 25]
 [20 25 45]
 [30 35 65]] [40 45 85]
[[20 25 45]
 [30 35 65]
 [40 45 85]] [ 50  55 105]
[[ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]] [ 60  65 125]
[[ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]] [ 70  75 145]
[[ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]] [ 80  85 165]
[[ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]] [ 90  95 185]

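We are now ready to fit an LSTM model on this data.

Any of the varieties of LSTMs in the previous section can be used, such as a Vanilla, Stacked, Bidirectional, CNN, or ConvLSTM model.

We will use a Stacked LSTM where the number of time steps and parallel series (features) are specified for the input layer via the input_shape argument. The number of parallel series is also used to specify the number of values for the model to predict in the output layer; again, this is three.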

# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')

We can predict the next value in each of the three parallel series by providing an input of three time steps for each series.

70, 75, 145
80, 85, 165
90, 95, 185

The shape of the input for making a single prediction must be 1 sample, 3 time steps, and 3 features, or [1, 3, 3].

# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

We would expect the vector output to be:

[100, 105, 205]

We can tie all of this together and demonstrate a Stacked LSTM for multivariate output time series forecasting below.

# multivariate output stacked lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the dataset
		if end_ix > len(sequences)-1:
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 3
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=400, verbose=0)
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example prepares the data, fits the model, and makes a prediction.

[[101.76599 108.730484 206.63577 ]]

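Multi-Step LSTM Models

A time series forecasting problem that requires a prediction of multiple time steps into the future can be referred to as multi-step time series forecasting.

Specifically, these are problems where the forecast horizon or interval is more than one time step.

There are two main types of LSTM models that can be used for multi-step forecasting; they are:

Vector Output Model
Encoder-Decoder Model

Before we look at these models, let's first look at the preparation of data for multi-step forecasting.

Data Preparation

As with one-step forecasting, a time series used for multi-step time series forecasting must be split into samples with input and output components.

Both the input and output components will be comprised of multiple time steps and may or may not have the same number of steps.

For example, given the univariate time series:

[10, 20, 30, 40, 50, 60, 70, 80, 90]

We could use the last three time steps as input and forecast the next two time steps.

The first sample would look as follows:

Input:

[10, 20, 30]

Output:

[40, 50]

The split_sequence() function below implements this behavior and will split a given univariate time series into samples with a specified number of input and output time steps.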

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out
		# check if we are beyond the sequence
		if out_end_ix > len(sequence):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

We can demonstrate this function on the small contrived dataset.

The complete example is listed below.

# multi-step data preparation
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out
		# check if we are beyond the sequence
		if out_end_ix > len(sequence):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# summarize the data
for i in range(len(X)):
	print(X[i], y[i])

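Running the example splits the univariate series into input and output time steps and prints the input and output components of each sample.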

[10 20 30] [40 50]
[20 30 40] [50 60]
[30 40 50] [60 70]
[40 50 60] [70 80]
[50 60 70] [80 90]

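Now that we know how to prepare data for multi-step forecasting, let's look at some LSTM models that can learn this mapping.

Vector Output Model

Like other types of neural network models, the LSTM can output a vector directly that can be interpreted as a multi-step forecast.

This approach was seen in the previous section, where one time step of each output time series was forecast as a vector.

As with the LSTMs for univariate data in a prior section, the prepared samples must first be reshaped. The LSTM expects data to have a three-dimensional structure of [samples, timesteps, features], and in this case we only have one feature, so the reshape is straightforward.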

# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

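With the number of input and output steps specified in the n_steps_in and n_steps_out variables, we can define a multi-step time series forecasting model.

Any of the presented LSTM model types could be used, such as Vanilla, Stacked, Bidirectional, CNN-LSTM, or ConvLSTM. Below defines a Stacked LSTM for multi-step forecasting.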

# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')

The model can make a prediction for a single sample. We can predict the next two steps beyond the end of the dataset by providing the input:

[70, 80, 90]

We would expect the predicted output to be:

[100, 110]

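As expected by the model, the shape of the single sample of input data when making the prediction must be [1, 3, 1] for the 1 sample, 3 time steps of the input, and the single feature.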

# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)

Tying all of this together, the Stacked LSTM for multi-step forecasting with a univariate time series is listed below.

# univariate multi-step vector-output stacked lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out
		# check if we are beyond the sequence
		if out_end_ix > len(sequence):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=50, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example forecasts and prints the next two time steps in the sequence.

[[100.98096 113.28924]]

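Encoder-Decoder Model

A model specifically developed for forecasting variable-length output sequences is called the Encoder-Decoder LSTM.

The model was designed for prediction problems where there are both input and output sequences, so-called sequence-to-sequence, or seq2seq, problems, such as translating text from one language to another.

This model can be used for multi-step time series forecasting.

As its name suggests, the model is comprised of two sub-models: the encoder and the decoder.

The encoder is a model responsible for reading and interpreting the input sequence. The output of the encoder is a fixed-length vector that represents the model's interpretation of the sequence. The encoder is traditionally a Vanilla LSTM model, although other encoder models can be used such as Stacked, Bidirectional, and CNN models.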

model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))

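The decoder uses the output of the encoder as an input.

First, the fixed-length output of the encoder is repeated, once for each required time step in the output sequence.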

model.add(RepeatVector(n_steps_out))
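
The repetition step can be pictured with NumPy. The sketch below is an illustrative equivalent, not the Keras implementation, and uses a made-up three-value encoder output in place of the 100 LSTM units.

```python
import numpy as np

# made-up encoder output with shape [samples, units]
encoded = np.array([[0.1, 0.2, 0.3]])
n_steps_out = 2

# copy the fixed-length vector once per output time step
repeated = np.repeat(encoded[:, np.newaxis, :], n_steps_out, axis=1)
print(repeated.shape)  # (1, 2, 3)
print(repeated[0])
# [[0.1 0.2 0.3]
#  [0.1 0.2 0.3]]
```

The result is three-dimensional, [samples, n_steps_out, units], which is the input shape the decoder LSTM requires.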

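This sequence is then provided to an LSTM decoder model. The model must output a value for each time step in the output sequence, which can be interpreted by a single output model.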

model.add(LSTM(100, activation='relu', return_sequences=True))

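We can use the same output layer or layers to make each one-step prediction in the output sequence. This can be achieved by wrapping the output part of the model in a TimeDistributed wrapper.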

model.add(TimeDistributed(Dense(1)))
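
Conceptually, applying one output layer at every time step means reusing a single set of weights across the time axis. The NumPy sketch below illustrates this with random stand-in values for the decoder output and the dense weights; it is a simplified picture, not the Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-in decoder output with shape [samples, n_steps_out, units]
decoder_out = rng.normal(size=(1, 2, 100))

# one shared set of dense weights: 100 units in, 1 value out
W = rng.normal(size=(100, 1))
b = np.zeros(1)

# the same weights are applied independently to every time step
y_hat = decoder_out @ W + b
print(y_hat.shape)  # (1, 2, 1)
```

The output has shape [samples, n_steps_out, 1], which is why the y part of the training data must also be three-dimensional for this model.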

The full definition for an Encoder-Decoder model for multi-step time series forecasting is listed below.

# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')

As with other LSTM models, the input data must be reshaped into the expected three-dimensional shape of [samples, timesteps, features].

X = X.reshape((X.shape[0], X.shape[1], n_features))

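In the case of the Encoder-Decoder model, the output, or y part, of the training dataset must also have this shape. This is because the model will predict a given number of time steps with a given number of features for each input sample.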

y = y.reshape((y.shape[0], y.shape[1], n_features))

The complete example of an Encoder-Decoder LSTM for multi-step time series forecasting is listed below.

# univariate multi-step encoder-decoder lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out
		# check if we are beyond the sequence
		if out_end_ix > len(sequence):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
y = y.reshape((y.shape[0], y.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=100, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example forecasts and prints the next two time steps in the sequence.

[[[101.9736  ]
  [116.213615]]]
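
Because the Encoder-Decoder model returns a three-dimensional array of [samples, timesteps, features], it is often convenient to flatten a single forecast back into a simple list. A minimal sketch, using plain lists and the values printed above (with a NumPy array, yhat[0, :, 0] would give the same result):

```python
# prediction printed above, shaped [samples, timesteps, features] = [1, 2, 1]
yhat = [[[101.9736], [116.213615]]]

# take the first (only) sample and the first (only) feature of each step
forecast = [step[0] for step in yhat[0]]
print(forecast)  # [101.9736, 116.213615]
```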

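Multivariate Multi-Step LSTM Models

In the previous sections, we looked at univariate, multivariate, and multi-step time series forecasting.

The different types of LSTM models presented so far can be mixed and matched for different problems. This also applies to time series forecasting problems that involve both multivariate and multi-step forecasting, although these may be more challenging.

In this section, we provide short examples of data preparation and modeling for multivariate multi-step time series forecasting as templates to ease this challenge, specifically:

Multiple Input Multi-Step Output.
Multiple Parallel Input and Multi-Step Output.

Perhaps the biggest stumbling block is the preparation of the data, so this is where we will focus our attention.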

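Multiple Input Multi-Step Output

There are multivariate time series forecasting problems in which the output series is separate from, but dependent on, the input time series, and multiple time steps are required for the output series.

For example, consider the multivariate time series from a prior section: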

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

We may use three prior time steps of each of the two input time series to predict two time steps of the output time series.

Input:

10, 15
20, 25
30, 35

Output:

65
85

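The split_sequences() function below implements this behavior. Note that the first output value shares its time step with the last input row (65 = 30 + 35 in the sample above), which is why the function slices the output from end_ix-1 and subtracts one when computing out_end_ix.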

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out-1
		# check if we are beyond the dataset
		if out_end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

We can demonstrate this on our contrived dataset.

The complete example is listed below.

# multivariate multi-step data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out-1
		# check if we are beyond the dataset
		if out_end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
	print(X[i], y[i])

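Running the example first prints the shape of the prepared training data.

We can see that the input portion of the samples is three-dimensional, comprising six samples, each with three time steps and two variables for the two input time series.

The output portion of the samples is two-dimensional: six samples, each with the two time steps to be predicted.

The prepared samples are then printed to confirm that the data was prepared as we specified.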

(6, 3, 2) (6, 2)

[[10 15]
 [20 25]
 [30 35]] [65 85]
[[20 25]
 [30 35]
 [40 45]] [ 85 105]
[[30 35]
 [40 45]
 [50 55]] [105 125]
[[40 45]
 [50 55]
 [60 65]] [125 145]
[[50 55]
 [60 65]
 [70 75]] [145 165]
[[60 65]
 [70 75]
 [80 85]] [165 185]
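
The number of samples follows directly from the window arithmetic in the split function. The sketch below is a small helper of our own (count_samples and its overlap argument are not part of the tutorial code) that reproduces the six samples seen here, and the five samples that result when the outputs start after the input window, as in the univariate multi-step split used earlier.

```python
def count_samples(n_rows, n_steps_in, n_steps_out, overlap):
    # overlap=1: the first target shares the last input time step
    # overlap=0: the targets start one step after the input window
    window = n_steps_in + n_steps_out - overlap
    return max(n_rows - window + 1, 0)

# multiple-input layout used here: 9 rows, 3 steps in, 2 steps out
print(count_samples(9, 3, 2, overlap=1))  # 6
# layout where the outputs follow the input window
print(count_samples(9, 3, 2, overlap=0))  # 5
```

A quick check like this can help catch off-by-one mistakes when changing n_steps_in or n_steps_out.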

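We can now develop an LSTM model for multi-step predictions.

A vector output or an encoder-decoder model could be used. In this case, we will demonstrate a vector output with a Stacked LSTM.

The complete example is listed below.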

# multivariate multi-step stacked lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out-1
		# check if we are beyond the dataset
		if out_end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([[70, 75], [80, 85], [90, 95]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

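Running the example fits the model and predicts two time steps of the output sequence: the value at the final input time step and the value one step beyond the dataset.

We would expect these two values to be: [185, 205]

It is a challenging framing of the problem with very little data, and the arbitrarily configured version of the model gets close.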

[[188.70619 210.16513]]
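
To put a number on "close", the following sketch compares the printed forecast with the expected values. The variable names are ours, and because results vary from run to run, the errors apply only to the output shown above.

```python
expected = [185, 205]
predicted = [188.70619, 210.16513]  # values printed above

# absolute and percentage error for each forecast step
abs_errors = [abs(p - e) for p, e in zip(predicted, expected)]
pct_errors = [100 * err / e for err, e in zip(abs_errors, expected)]

for e, p, a, pct in zip(expected, predicted, abs_errors, pct_errors):
    print(f"expected={e} predicted={p:.2f} abs_error={a:.2f} ({pct:.1f}%)")
```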

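Multiple Parallel Input and Multi-Step Output

A problem with parallel time series may require the prediction of multiple time steps of each time series.

For example, consider the multivariate time series from a prior section: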

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

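We may use the last three time steps from each of the three time series as input to the model and predict the next two time steps of each of the three time series as output.

The first sample in the training dataset would be the following.

Input: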

10, 15, 25
20, 25, 45
30, 35, 65

Output:

40, 45, 85
50, 55, 105

The split_sequences() function below implements this behavior.

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out
		# check if we are beyond the dataset
		if out_end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

We can demonstrate this function on the small contrived dataset.

The complete example is listed below.

# multivariate multi-step data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out
		# check if we are beyond the dataset
		if out_end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
	print(X[i], y[i])

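Running the example first prints the shape of the prepared training dataset.

We can see that both the input (X) and output (y) elements of the dataset are three-dimensional, with the dimensions being the number of samples, time steps, and variables or parallel time series, respectively.

The input and output elements of each sample are then printed side by side so that we can confirm that the data was prepared as we expected.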

(5, 3, 3) (5, 2, 3)

[[10 15 25]
 [20 25 45]
 [30 35 65]] [[ 40  45  85]
 [ 50  55 105]]
[[20 25 45]
 [30 35 65]
 [40 45 85]] [[ 50  55 105]
 [ 60  65 125]]
[[ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]] [[ 60  65 125]
 [ 70  75 145]]
[[ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]] [[ 70  75 145]
 [ 80  85 165]]
[[ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]] [[ 80  85 165]
 [ 90  95 185]]

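We can use either the vector output or the Encoder-Decoder LSTM to model this problem. In this case, we will use the Encoder-Decoder model.

The complete example is listed below.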

# multivariate multi-step encoder-decoder lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out
		# check if we are beyond the dataset
		if out_end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
# the dataset knows the number of features, e.g. 3
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(200, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=300, verbose=0)
# demonstrate prediction
x_input = array([[60, 65, 125], [70, 75, 145], [80, 85, 165]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

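Running the example fits the model and predicts the values of each of the three time series for the two time steps that follow the input window.

We would expect the values for these series and time steps to be as follows: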

90, 95, 185
100, 105, 205

We can see that the model forecast gets reasonably close to the expected values.

[[[ 91.86044   97.77231  189.66768 ]
  [103.299355 109.18123  212.6863  ]]]
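
The vector-output alternative mentioned earlier is not demonstrated in this section. If one wanted to try it, the three-dimensional targets would presumably need to be flattened to one vector per sample, with the final Dense layer sized to n_steps_out * n_features, and each forecast folded back afterwards. The helpers below are a hypothetical sketch of that reshaping with plain lists, not code from the tutorial.

```python
def flatten_targets(y):
    # [samples, steps, series] -> [samples, steps * series]
    return [[value for step in sample for value in step] for sample in y]

def unflatten_prediction(flat, n_series):
    # [steps * series] -> [steps, series]
    return [flat[i:i + n_series] for i in range(0, len(flat), n_series)]

# first training target from the parallel-series example
y = [[[40, 45, 85], [50, 55, 105]]]
flat = flatten_targets(y)
print(flat)                                       # [[40, 45, 85, 50, 55, 105]]
print(unflatten_prediction(flat[0], n_series=3))  # [[40, 45, 85], [50, 55, 105]]
```

The flattened form discards the explicit time-step structure, which is one reason the Encoder-Decoder model can be the more natural fit for this framing.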

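Summary

In this tutorial, you discovered how to develop a suite of LSTM models for a range of standard time series forecasting problems.

Specifically, you learned:

How to develop LSTM models for univariate time series forecasting.
How to develop LSTM models for multivariate time series forecasting.
How to develop LSTM models for multi-step time series forecasting.

I was glad to have the time to translate and work through Jason's tutorial on modeling data with the many types of LSTM networks myself. Anyone interested is welcome to exchange ideas, learn together, and improve together.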

Together_CZ, Blog Expert
