[Data analysis] Predictive analysis using machine learning algorithms (6): Long Short-Term Memory network (LSTM) (2021-01-22)

Machine learning methods in time series forecasting (6): Long Short-Term Memory network (LSTM)

This is the sixth article in the series "Machine Learning Methods in Time Series Forecasting". If you are interested, you can read the previous articles first:
[Data analysis] Predictive analysis using machine learning algorithms (1): Moving Average
[Data analysis] Predictive analysis using machine learning algorithms (2): Linear Regression
[Data analysis] Predictive analysis using machine learning algorithms (3): K-Nearest Neighbours
[Data analysis] Predictive analysis using machine learning algorithms (4): Autoregressive Integrated Moving Average model (AutoARIMA)
[Data analysis] Predictive analysis using machine learning algorithms (5): Prophet

1. Introduction to LSTM

Long Short-Term Memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the field of deep learning. Unlike standard feedforward neural networks, LSTM has feedback connections. It can process not only a single data point (such as an image), but also an entire data sequence (such as voice or video).

LSTM networks are well suited to classifying, processing, and making predictions from time series data, because important events in a time series may be separated by lags of unknown duration. LSTM was developed to deal with the vanishing gradient problem that can be encountered when training traditional RNNs. In many applications, this relative insensitivity to gap length is the advantage of LSTM over plain RNNs, hidden Markov models, and other sequence learning methods.
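
To make the gating idea concrete, here is a minimal NumPy sketch of a single LSTM time step. It is only an illustration under my own assumptions: the function names, the weight layout, and the gate order (input, forget, candidate, output) are chosen for this sketch and are not taken from Keras. The point to notice is that the cell state is updated additively, which is what helps gradients survive over long gaps.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4*units, input_dim), U: (4*units, units), b: (4*units,)
    Gate order inside the stacked weights is a choice made for this sketch:
    input, forget, candidate, output.
    """
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:units])                  # input gate
    f = sigmoid(z[units:2 * units])          # forget gate
    g = np.tanh(z[2 * units:3 * units])      # candidate cell state
    o = sigmoid(z[3 * units:4 * units])      # output gate
    c_t = f * c_prev + i * g                 # additive cell-state update
    h_t = o * np.tanh(c_t)                   # new hidden state
    return h_t, c_t

# Run three time steps on a toy one-feature sequence with random weights.
rng = np.random.default_rng(0)
input_dim, units = 1, 4
W = rng.normal(scale=0.1, size=(4 * units, input_dim))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)

h = np.zeros(units)
c = np.zeros(units)
for x in [0.1, 0.2, 0.3]:
    h, c = lstm_step(np.array([x]), h, c, W, U, b)

print(h.shape, c.shape)
```

The hidden state h is what the next layer sees at each step, while c carries the longer-term memory forward.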

2. "Stock price forecast" example

The data set is the same as in the previous articles, so that the forecasts of different algorithms can be compared on the same data. The data set and code are on my GitHub, where you can download them.

Import the package and read in the data. Please ensure that the sklearn and keras packages are installed correctly.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM
df = pd.read_csv('NSE-TATAGLOBAL11.csv')

Set the index. To leave the original data untouched, build a separate DataFrame, new_data.

# set the date as the index
df['Date'] = pd.to_datetime(df.Date, format='%Y-%m-%d')
df.index = df['Date']

# build a new dataframe that holds only the closing price, sorted by date
data = df.sort_index(ascending=True, axis=0)
# selecting the column and copying avoids the row-by-row chained assignment,
# which raises warnings (or silently fails) in recent pandas versions
new_data = data[['Close']].copy()

Take a look at the format of new_data.

new_data

#creating train and test sets
dataset = new_data.values
dataset

Split the data into a training set and a validation set: the first 987 rows are used for training and the remaining 248 for validation.

train = dataset[0:987,:]
valid = dataset[987:,:]

Normalize the data set. Note that the scaler here is fitted on the whole data set, including the validation rows.

# scale the closing prices to the range [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(dataset)
scaled_data
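
For reference, min-max scaling to (0, 1) maps each value x to (x - min) / (max - min). The NumPy-only sketch below is not the code used in this article: the helper names and the prices are made up for illustration. It shows a common alternative in which the minimum and maximum are learned from the training rows only, so that nothing from the validation period leaks into the preprocessing. With scikit-learn, the equivalent would be calling scaler.fit on the training rows and scaler.transform on everything else.

```python
import numpy as np

def fit_min_max(values):
    """Learn the scaling range (min, max) from the given values."""
    return float(np.min(values)), float(np.max(values))

def scale(values, lo, hi):
    """Map values to (x - lo) / (hi - lo)."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def unscale(scaled, lo, hi):
    """Inverse of scale(): back to the original units."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

# Made-up prices, only to illustrate the arithmetic.
prices = np.array([100.0, 120.0, 110.0, 150.0, 160.0])
train_part, valid_part = prices[:3], prices[3:]

lo, hi = fit_min_max(train_part)          # range learned from training rows only
scaled_train = scale(train_part, lo, hi)  # stays inside [0, 1]
scaled_valid = scale(valid_part, lo, hi)  # may fall outside [0, 1]

print(scaled_train, scaled_valid)
```

The trade-off is that validation values can land outside the 0 to 1 range when prices move beyond what the training period saw.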

Each training sample uses the 60 values preceding a given value as the input and that value itself as the target.

x_train, y_train = [], []
for i in range(60,len(train)):
    x_train.append(scaled_data[i-60:i,0])
    y_train.append(scaled_data[i,0])
    
x_train, y_train = np.array(x_train), np.array(y_train)
x_train.shape

y_train.shape

x_train = np.reshape(x_train, (x_train.shape[0],x_train.shape[1],1))
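
The windowing and reshaping steps above can be wrapped in one small helper. This is a sketch with a hypothetical function name (make_windows), shown on a toy series so the shapes are easy to follow. The final reshape produces the (samples, time steps, features) layout that the LSTM layer expects.

```python
import numpy as np

def make_windows(series, lookback):
    """Turn a 1-D series into (X, y) pairs for one-step-ahead prediction.

    X[k] holds the `lookback` values before position k + lookback,
    y[k] holds the value at that position.
    """
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for i in range(lookback, len(series)):
        X.append(series[i - lookback:i])
        y.append(series[i])
    # add a trailing axis: (samples, time steps, features)
    X = np.array(X).reshape(-1, lookback, 1)
    return X, np.array(y)

toy = np.arange(10)                 # 0, 1, ..., 9
X, y = make_windows(toy, lookback=3)
print(X.shape, y.shape)             # (7, 3, 1) (7,)
print(X[0, :, 0], y[0])             # [0. 1. 2.] 3.0
```

With 987 training rows and a lookback of 60, the same logic yields 987 - 60 = 927 training samples.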

Build the LSTM network. Sequential is the Keras sequential model, whose core operation is adding layers one after another. Besides LSTM, Keras also offers layers such as the convolutional layer Conv2D, the max pooling layer MaxPooling2D, and the flattening layer Flatten.

# create and fit the LSTM network
model = Sequential() # sequential model; the core operation is adding layers
model.add(LSTM(units=50, return_sequences=True, input_shape=(x_train.shape[1],1)))
model.add(LSTM(units=50))
model.add(Dense(1)) # fully connected layer

model.compile(loss='mean_squared_error', optimizer='adam') # choose the optimizer and specify the loss function
model.fit(x_train, y_train, epochs=1, batch_size=1, verbose=2)
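
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard LSTM formulation with four gate blocks, each with input weights, recurrent weights, and a bias (which, as far as I know, matches the Keras defaults). The helper names are mine. If those assumptions hold, the total should agree with what model.summary() reports.

```python
def lstm_params(input_dim, units):
    """Parameters of one LSTM layer: 4 gate blocks, each with
    input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """Parameters of a fully connected layer: weights plus biases."""
    return input_dim * units + units

first_lstm = lstm_params(input_dim=1, units=50)    # sees 1 feature per time step
second_lstm = lstm_params(input_dim=50, units=50)  # sees the 50 outputs of the first layer
head = dense_params(input_dim=50, units=1)         # maps 50 values to one price
total = first_lstm + second_lstm + head

print(first_lstm, second_lstm, head, total)        # 10400 20200 51 30651
```

The second LSTM layer is the largest because its input is the 50-dimensional output of the first layer rather than a single feature.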

Each value is predicted from the 60 values that precede it. A total of 248 values are predicted.

#predicting 248 values, using past 60 from the train data
inputs = new_data[len(new_data) - len(valid) - 60:].values # starts at row 1235 - 248 - 60 = 927, which gives 308 rows
inputs = inputs.reshape(-1,1)
inputs  = scaler.transform(inputs)
inputs.shape

X_test holds the model inputs for the validation period: one 60-value window for each value to be predicted.

X_test = []
for i in range(60,inputs.shape[0]):
    X_test.append(inputs[i-60:i,0])
X_test = np.array(X_test)

X_test = np.reshape(X_test, (X_test.shape[0],X_test.shape[1],1))
X_test.shape
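
The index arithmetic here is easy to get wrong, so it is worth checking on a stand-in series. In the sketch below, the series is simply 0, 1, ..., 1234, so each value equals its own row position. The sizes (1235 rows, 987 for training, lookback of 60) come from this article, and the variable names are mine.

```python
import numpy as np

n_total, n_train, lookback = 1235, 987, 60
series = np.arange(n_total)             # stand-in data: value == row position
n_valid = n_total - n_train             # 248 validation rows

start = n_total - n_valid - lookback    # 927: 60 rows before the first validation row
inputs = series[start:]                 # 308 rows = 60 lookback rows + 248 validation rows

# one window per validation row, built exactly like X_test
windows = np.array([inputs[i - lookback:i] for i in range(lookback, len(inputs))])
targets = inputs[lookback:]             # the rows each window is meant to predict

print(start, len(inputs), windows.shape)
print(windows[0, -1], targets[0])       # last input row and its target row
```

The first window ends at row 986, the last training row, and its target is row 987, the first validation row, so every prediction only sees values that come before it.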

Predict with the model, then convert the normalized predictions back to the original price scale.

closing_price = model.predict(X_test)
closing_price = scaler.inverse_transform(closing_price)

Take a look at the RMSE this time. It is much smaller than the values obtained with the previous prediction methods, which indicates a smaller error on this validation set.

rmse = np.sqrt(np.mean(np.power((valid - closing_price),2)))
rmse
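
The RMSE line above works because both arrays have the shape (248, 1). If one of them were flat, NumPy broadcasting would silently produce a 248 x 248 matrix of differences and a misleading score. A slightly more defensive version, with a function name of my own choosing, flattens both inputs first:

```python
import numpy as np

def compute_rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same number of values")
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

actual = np.array([[1.0], [2.0], [3.0]])   # column vector, shape (3, 1)
predicted = np.array([1.0, 2.0, 5.0])      # flat array, shape (3,)

print(compute_rmse(actual, predicted))     # sqrt(4 / 3)
print((actual - predicted).shape)          # (3, 3): the broadcasting pitfall
```

Only the last pair differs (3 versus 5), so the squared errors are 0, 0, and 4, and the RMSE is the square root of 4/3.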

Plot the forecast against the actual values.

#for plotting
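train = new_data[:987]
valid = new_data[987:].copy() # copy so that adding a column does not raise SettingWithCopyWarning
valid['Predictions'] = closing_price

plt.figure(figsize=(16,8))
plt.plot(train['Close'])
plt.plot(valid[['Close','Predictions']])
plt.show() # the parentheses are needed to actually call the function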

As the plot shows, the LSTM predictions on this closing-price data set follow the actual values fairly closely. Of course, we can also tune the LSTM model, for example by changing the number of LSTM layers or increasing the number of training epochs.

Across these experiments, the moving average, linear regression, and ARIMA algorithms did not forecast this stock well, while the LSTM forecast is closer to what we expected. Of course, this does not mean that LSTM is the best time series forecasting algorithm; each problem needs its own analysis. For example, a supermarket that wants to forecast future sales of different products will not find a single once-and-for-all algorithm that suits every category. A better approach is to try different algorithms on each category and keep the one whose forecast fits that category best.
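
The per-category idea can be sketched as a small selection loop: hold out the last few points of each series, score every candidate on them, and keep the one with the lowest RMSE. Everything below is illustrative. The two forecasters are deliberately trivial stand-ins (in practice the candidates would be the methods from this series), and the two "categories" are made-up numbers, not real sales data.

```python
import numpy as np

def last_value_forecast(history, horizon):
    """Stand-in model: repeat the last observed value."""
    return np.full(horizon, history[-1], dtype=float)

def mean_forecast(history, horizon):
    """Stand-in model: repeat the historical mean."""
    return np.full(horizon, np.mean(history), dtype=float)

CANDIDATES = {"last_value": last_value_forecast, "mean": mean_forecast}

def pick_best(series, horizon):
    """Score each candidate on the last `horizon` points and return
    the name of the best one together with all scores."""
    series = np.asarray(series, dtype=float)
    history, holdout = series[:-horizon], series[-horizon:]
    scores = {}
    for name, forecast in CANDIDATES.items():
        pred = forecast(history, horizon)
        scores[name] = float(np.sqrt(np.mean((holdout - pred) ** 2)))
    return min(scores, key=scores.get), scores

# Made-up series: one steadily rising, one oscillating around a level.
categories = {
    "trending": [1, 2, 3, 4, 5, 6, 7, 8],
    "flat_noisy": [5, 3, 5, 3, 5, 3, 4, 4],
}
best = {name: pick_best(values, horizon=2)[0] for name, values in categories.items()}
print(best)
```

Even with these toy candidates, the rising series prefers the last-value forecast and the oscillating one prefers the mean, which is the point of choosing per category.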

Even with the support of machine learning algorithms, accurate prediction is difficult in most cases. Predicting the stock market is especially hard because there are too many uncertain factors. And as time passes, even if the currently best forecasting algorithm for a product's sales has been found, there is no guarantee that it will keep working in the future. When will the turning point appear? That question also needs to be considered.

Postscript: the article "Stock Prices Prediction Using Machine Learning and Deep Learning Techniques" helped me a great deal in learning how to use machine learning methods for predictive analysis on time series. These blog posts are my study notes; writing them helps me learn.

Origin blog.csdn.net/be_racle/article/details/112999853