To "teach you to use a simple neural network and time series forecasting LSTM" code modification and interpretation

Original article and code: Teach You to Use a Simple Neural Network and LSTM for Time Series Forecasting (with code)

However, while testing I ran into some problems with the source code: it does not run directly as published. The original article already explains most of the code clearly, so I only add notes where needed.

Test environment for this article: Python 3.6, Jupyter Notebook, TensorFlow + Keras.

This article models time series data with an artificial neural network (ANN) and a long short-term memory recurrent neural network (LSTM RNN). The goal is to use the ANN and the LSTM to predict the S&P 500 volatility (VIX) time series.

Download the S&P 500 volatility (VIX) data set from here ( https://ca.finance.yahoo.com/quote/%5Evix/history?ltr=1 ); you can choose the time range of the data you want to download.

Let's start with the code. First, the ANN model.

import pandas as pd
import numpy as np
%matplotlib inline
# Required so that plots are displayed inline in Jupyter Notebook.
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import r2_score
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping
from keras.optimizers import Adam
from keras.layers import LSTM
# Load the data into a pandas DataFrame (a raw string keeps the backslashes in the Windows path intact).
df = pd.read_csv(r"E:\data\VIX.csv")
# Take a quick look at the first few rows.
print(df.head())
# Drop the unneeded columns, convert the 'Date' column to datetime, and set it as the index.
df.drop(['Open', 'High', 'Low', 'Close', 'Volume'], axis=1, inplace=True)
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index(['Date'], drop=True)
df.head(10)

Out: (the first rows of the DataFrame; only the Adj Close column of this data is used)

# Plot the time series as a line chart.
plt.figure(figsize=(10, 6))
df['Adj Close'].plot();

# Split the data into training and test sets at the date 2018-09-20.
split_date = pd.Timestamp('2018-09-20')
df = df['Adj Close']
train = df.loc[:split_date]
test = df.loc[split_date:]
plt.figure(figsize=(10, 6))
ax = train.plot()
test.plot(ax=ax)
plt.legend(['train', 'test']);

Out: (line plot of the train and test series)

# Scale the training and test data to [-1, 1].
train = np.array(train).reshape(-1, 1)
test = np.array(test).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(-1, 1))
train_sc = scaler.fit_transform(train)
# Note: this refits the scaler on the test set, as in the original article;
# scaler.transform(test) would avoid using test-set information for the scaling.
test_sc = scaler.fit_transform(test)

 

To convert normalized values back to the original scale later, use the scaler's inverse transform:

origin_data = scaler.inverse_transform(y_pred_test)

This is used in the last part to restore the predictions to their original scale.
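As a quick sanity check (a minimal sketch of my own with made-up values, not from the original article), you can verify that inverse_transform undoes the scaling:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative values only; the real data comes from the VIX CSV above.
values = np.array([12.0, 15.5, 20.3, 35.1]).reshape(-1, 1)
sc = MinMaxScaler(feature_range=(-1, 1))
scaled = sc.fit_transform(values)        # map the values into [-1, 1]
restored = sc.inverse_transform(scaled)  # map them back to the original scale
print(np.allclose(values, restored))     # True: the round trip recovers the data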

# Build the training and test inputs and targets: each input is the value at time t,
# and the corresponding target is the value at time t+1.
X_train = train_sc[:-1]
y_train = train_sc[1:]
X_test = test_sc[:-1]
y_test = test_sc[1:]

Explanation: for a sequence a = np.array([2, 3, 4, 5, 6, 7]),

print(a[:-1])  # everything except the last element
print(a[1:])   # everything from the second element on

Out:

[2 3 4 5 6]

[3 4 5 6 7]

Input x = [2 3 4 5 6]

Corresponding label y = [3 4 5 6 7] (each label is the next value in the sequence)
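The same idea generalizes to predicting more than one step ahead. The helper below is only an illustrative sketch (make_supervised and the lag parameter are names I introduce here, not part of the original code):

import numpy as np

def make_supervised(series, lag=1):
    # Build (X, y) pairs where each target y is the value `lag` steps after its input X.
    series = np.asarray(series).reshape(-1, 1)
    return series[:-lag], series[lag:]

# With lag=1 this reproduces X_train = train_sc[:-1], y_train = train_sc[1:].
X_train_alt, y_train_alt = make_supervised(train_sc, lag=1)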

# A simple artificial neural network (ANN) for time series prediction.
ann = Sequential()
ann.add(Dense(12, input_dim=1, activation='relu'))
ann.add(Dense(1))
ann.compile(loss='mean_squared_error', optimizer='adam')
early_stop = EarlyStopping(monitor='loss', patience=2, verbose=1)
history = ann.fit(X_train, y_train, epochs=100, batch_size=1, verbose=1, callbacks=[early_stop], shuffle=False)
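The fit call returns a History object whose history['loss'] records the loss at each epoch. As an optional check (my addition, not in the original article), you can plot it to see how the loss decreased and when early stopping triggered:

# Plot the training loss recorded by Keras during fit.
plt.figure(figsize=(10, 6))
plt.plot(history.history['loss'], label='train loss')
plt.xlabel('Epoch')
plt.ylabel('MSE loss')
plt.legend()
plt.show()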

# Make predictions.
y_pred_test = ann.predict(X_test)
y_train_pred = ann.predict(X_train)

print("The R2 score on the Train set is:\t{:0.3f}".format(r2_score(y_train, y_train_pred)))
print("The R2 score on the Test set is:\t{:0.3f}".format(r2_score(y_test, y_pred_test)))

 

Out:

The R2 score on the Train set is:     0.851
The R2 score on the Test set is:      0.823

Supplement:

R² is the coefficient of determination (goodness of fit).

The better the model fits: R² → 1

The worse the model fits: R² → 0
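For reference, r2_score computes R² = 1 - SS_res / SS_tot. A minimal sketch of the same computation by hand (the array values below are made up for illustration):

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, 2.5, 4.0, 5.5])
y_pred = np.array([2.8, 2.7, 4.2, 5.0])
ss_res = np.sum((y_true - y_pred) ** 2)               # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)        # total sum of squares
print(1 - ss_res / ss_tot, r2_score(y_true, y_pred))  # the two values agree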

# Display the first 6 values.
print('Data:')
print(y_test[:6].reshape([1, 6]))
print(y_pred_test[:6].reshape([1, 6]))
# Convert the normalized data back to the original scale.
y_test_pred_origin = scaler.inverse_transform(y_pred_test)
y_test_origin = scaler.inverse_transform(y_test)
print('Restored data:')
print('True values: ' + '%s' % y_test_origin[:6].reshape([1, 6]))
print('Predicted values: ' + '%s' % y_test_pred_origin[:6].reshape([1, 6]))

Out: (the true and predicted values before and after restoring the original scale; MSE is used as the evaluation metric later)

 

# Plot a comparison of the true and predicted values on the test set.
plt.figure(figsize=(10, 6))
plt.plot(y_test, label='True')
plt.plot(y_pred_test, label='NN')
plt.title("NN's Prediction")
plt.xlabel('Observation')
plt.ylabel('Adj Close Scaled')
plt.legend()
plt.show();

Out: (plot comparing the NN's predictions with the true test values)

The LSTM model

x_train = X_train.reshape(1509, 1, 1)  # Note the input shape the LSTM expects: (samples, timesteps, features)
lstm_model = Sequential()
lstm_model.add(LSTM(7, activation='relu', kernel_initializer='lecun_uniform', return_sequences=False))
lstm_model.add(Dense(1))
lstm_model.compile(loss='mean_squared_error', optimizer='adam')
early_stop = EarlyStopping(monitor='loss', patience=2, verbose=1)
history_lstm_model = lstm_model.fit(x_train, y_train, epochs=100, batch_size=1, verbose=1, shuffle=False, callbacks=[early_stop])

The LSTM's input shape differs from the ANN's, so X_train must first be reshaped to (samples, timesteps, features); the same is done for X_test below.
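The hard-coded 1509 and 251 only match this particular date range. A small variant of my own (not from the original article) derives the sample count from the arrays themselves and is otherwise equivalent:

# Reshape to (samples, timesteps=1, features=1) without hard-coding the lengths.
x_train = X_train.reshape(X_train.shape[0], 1, 1)
x_test = X_test.reshape(X_test.shape[0], 1, 1)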

Out: (LSTM training log)

Predict

x_test = X_test.reshape(251, 1, 1)  # 251 is the length of X_test
y_pred_test_lstm = lstm_model.predict(x_test)
y_train_pred_lstm = lstm_model.predict(x_train)
print("The R2 score on the Train set is:\t{:0.3f}".format(r2_score(y_train, y_train_pred_lstm)))
print("The R2 score on the Test set is:\t{:0.3f}".format(r2_score(y_test, y_pred_test_lstm)))

Out: (R² scores for the LSTM on the train and test sets)

# Plot a comparison of the true and predicted values on the test set.
plt.figure(figsize=(10, 6))
plt.plot(y_test, label='True')
plt.plot(y_pred_test_lstm, label='LSTM')
plt.title("LSTM's Prediction")
plt.xlabel('Observation')
plt.ylabel('Adj Close scaled')
plt.legend()
plt.show()

Out: (plot comparing the LSTM's predictions with the true test values)

# Compare the test MSE of the two models.
ann_test_mse = ann.evaluate(X_test, y_test, batch_size=1)
lstm_test_mse = lstm_model.evaluate(x_test, y_test, batch_size=1)
print('ANN: %f' % ann_test_mse)
print('LSTM: %f' % lstm_test_mse)

Out: (test MSE for the ANN and for the LSTM)
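If you also want to compare the two models on the original (un-scaled) values, here is a sketch of my own along the same lines, reusing the variables and scaler defined above:

from sklearn.metrics import mean_squared_error

# Undo the scaling for the true test values and both sets of predictions, then compare MSE.
ann_mse_origin = mean_squared_error(scaler.inverse_transform(y_test),
                                    scaler.inverse_transform(y_pred_test))
lstm_mse_origin = mean_squared_error(scaler.inverse_transform(y_test),
                                     scaler.inverse_transform(y_pred_test_lstm))
print('ANN (original scale): %f' % ann_mse_origin)
print('LSTM (original scale): %f' % lstm_mse_origin)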

Summary:

In fact, not much needed to be modified; this is a relatively simple example. If parts of it still feel unfamiliar, spend more time learning and practicing numpy and pandas.


Origin blog.csdn.net/qq_41647438/article/details/101147892