Predict tomorrow's stock price using a time series model


Preface

Last time I introduced a simple quantitative trading strategy, and afterwards I received a lot of private messages asking how to use a deep learning time series model. Here are some ideas on how an LSTM model can be used to predict tomorrow's price.

LSTM model

Long short-term memory (LSTM) is a special kind of recurrent neural network for sequence data. An ordinary LSTM unit consists of a "cell", an "input gate", an "output gate" and a "forget gate".

[Figure: LSTM unit]
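
To make the gates concrete, here is a minimal NumPy sketch of a single LSTM step in the standard textbook formulation, applied repeatedly over a 60-step sequence. The function name `lstm_step`, the layout of the stacked weight matrices (input gate, forget gate, candidate, output gate) and the random weights are assumptions for illustration only; this is not how Keras stores or trains its weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    x_t: (input_dim,), h_prev and c_prev: (units,)
    W: (input_dim, 4*units), U: (units, 4*units), b: (4*units,)
    """
    units = h_prev.shape[0]
    z = x_t @ W + h_prev @ U + b          # pre-activations for all four parts
    i = sigmoid(z[0 * units:1 * units])   # input gate: how much new information to write
    f = sigmoid(z[1 * units:2 * units])   # forget gate: how much old cell state to keep
    g = np.tanh(z[2 * units:3 * units])   # candidate values for the cell state
    o = sigmoid(z[3 * units:4 * units])   # output gate: how much of the cell to expose
    c_t = f * c_prev + i * g              # updated cell state
    h_t = o * np.tanh(c_t)                # updated hidden state (the unit's output)
    return h_t, c_t

# Feed a 60-step sequence of scalar inputs through a 50-unit cell.
rng = np.random.default_rng(0)
input_dim, units, steps = 1, 50, 60
W = rng.normal(scale=0.1, size=(input_dim, 4 * units))
U = rng.normal(scale=0.1, size=(units, 4 * units))
b = np.zeros(4 * units)

h = np.zeros(units)
c = np.zeros(units)
for x_t in rng.random((steps, input_dim)):
    h, c = lstm_step(x_t, h, c, W, U, b)  # the state from one step feeds the next

print(h.shape)  # (50,)
```

The loop is the "cyclic" part: the same weights are reused at every time step, and only the hidden and cell states carry information forward.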

Put simply, a time series model uses a continuous stretch of historical data to predict future data. The LSTM module is applied to the input sequence step by step, with each step's state fed into the next, as shown in the figure below.

[Figure: LSTM modules chained across time steps]

Environment preparation

We first open a new Notebook on Jukuan (JoinQuant). Remember to use the Python 3 version, which has TensorFlow pre-installed.

If it is not installed, install it manually.

!pip install tensorflow

import math
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

First check the versions: TensorFlow 1.12 and Keras 2.2.4 here, which is basically enough for a simple model.

import keras
print('tensorflow:',tf.__version__)  # 1.12.2
print('keras:',keras.__version__)  # 2.2.4

Data preparation

Taking the "CSI 300 Index Fund" as an example, we use the previous 60 days of data as one time series window, and we fetch all closing prices since 2015 as the data set.

security = '510300.XSHG'
his_period = 60  # length of the history window, in trading days
df = get_price(security, start_date="2015-01-05", end_date="2021-12-31", frequency='daily')
df[:3]

There are 1705 rows in total. Let's visualize them first.

plt.figure(figsize=(16,8))
plt.title('Close Price History')
plt.plot(df['close'])
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price', fontsize=18)
plt.show()

Divide the training set

Select the closing prices and take the first 80% as the training set, which gives 1364 records.

data = df.filter(['close'])
dataset = data.values
training_data_len = math.ceil( len(dataset) * .8 )

Normalize the data by scaling it to the range [0, 1] to make training easier.

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(dataset)
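
One thing to be aware of: `fit_transform(dataset)` learns the minimum and maximum from the whole series, so information from the later test period leaks into the scaling used for training. A stricter variant learns the range from the training rows only. The sketch below shows the idea with plain NumPy on made-up prices; it is an alternative to the article's approach, not what the article's code does. With `MinMaxScaler`, the equivalent would be calling `fit` on the training rows and `transform` on all rows.

```python
import numpy as np

# Toy closing prices (made-up values); the last two rows play the role of the test set.
prices = np.array([[3.0], [3.5], [4.0], [4.5], [5.0]])
train_len = 3

# Learn the minimum and maximum from the training rows only.
lo = prices[:train_len].min(axis=0)
hi = prices[:train_len].max(axis=0)

scaled = (prices - lo) / (hi - lo)   # min-max scaling to the range (0, 1)
restored = scaled * (hi - lo) + lo   # inverse transform back to price units

print(scaled.ravel())  # training rows lie in [0, 1]; later rows can exceed 1
```

The trade-off is that test values outside the training range fall outside [0, 1], which is exactly the situation the model would face on genuinely new data.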

Each run of 60 consecutive historical values forms one training sample, and the value on the following day is its label. Sliding this window over the data generates the training set.

train_data = scaled_data[0:training_data_len, :]
x_train = []
y_train = []

for i in range(his_period, len(train_data)):
    x_train.append(train_data[i-his_period:i, 0])
    y_train.append(train_data[i, 0])

x_train, y_train = np.array(x_train), np.array(y_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
x_train.shape

After conversion to a NumPy array and reshaping, the dimensions are (1304, 60, 1).
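
The windowing loop above can be wrapped in a small helper, which makes it easy to check on a toy series. `make_windows` is a hypothetical name introduced here for illustration; the logic mirrors the loop in the article.

```python
import numpy as np

def make_windows(series, window):
    """Split a 1-D series into (samples, window, 1) inputs and next-step labels."""
    series = np.asarray(series, dtype=float)
    x = np.array([series[i - window:i] for i in range(window, len(series))])
    y = series[window:]
    return x.reshape(-1, window, 1), y

# Toy example: 6 values with a window of 3 give 3 samples.
x, y = make_windows([10, 11, 12, 13, 14, 15], window=3)
print(x.shape)       # (3, 3, 1)
print(x[0].ravel())  # [10. 11. 12.]
print(y)             # [13. 14. 15.]
```

With 1364 training rows and a window of 60, the same helper yields 1364 - 60 = 1304 samples, which matches the shape reported above.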

Modeling

We use the Keras Sequential model to build the neural network: two stacked 50-unit LSTM layers extract features, and fully connected layers (25 units, then 1) output the result. The optimizer is Adam, and the loss function is MSE.

model = Sequential()
model.add(LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(LSTM(50, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))

model.compile(optimizer='adam', loss='mean_squared_error')
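
To get a feel for the size of this network, the trainable parameters can be counted by hand. The sketch below assumes the standard LSTM formulation with biases (four weight blocks per layer) and ordinary dense layers; `lstm_params` and `dense_params` are hypothetical helper names. Calling `model.summary()` in the notebook should report matching numbers, but treat that as something to verify in your own environment.

```python
def lstm_params(input_dim, units):
    # Four blocks (input gate, forget gate, candidate, output gate), each with
    # input weights, recurrent weights and a bias vector.
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # One weight per input-output pair plus one bias per output.
    return input_dim * units + units

layers = {
    "lstm_1": lstm_params(1, 50),
    "lstm_2": lstm_params(50, 50),
    "dense_1": dense_params(50, 25),
    "dense_2": dense_params(25, 1),
}
total = sum(layers.values())
print(layers)
print(total)  # 31901
```

Roughly 32 thousand parameters against 1304 training samples is a lot of capacity, which is one reason to keep the number of epochs modest.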

Training model

Set the batch size to 1 and train for 5 epochs first.

model.fit(x_train, y_train, batch_size=1, epochs=5)
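
A rough way to see why this takes a while: with a batch size of 1, every sample triggers its own weight update. The sketch below only does the arithmetic; `weight_updates` is a hypothetical helper, and the batch size of 32 is an arbitrary comparison value, not something the article uses.

```python
import math

def weight_updates(n_samples, batch_size, epochs):
    """Number of gradient steps taken over the whole training run."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs

print(weight_updates(1304, 1, 5))   # 6520
print(weight_updates(1304, 32, 5))  # 205
```

A larger batch size trains faster per epoch but takes fewer, smoother steps, so the number of epochs usually has to be revisited if the batch size changes.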

Okay, everything is ready. The next step is to make a cup of coffee and wait patiently...

Validate model

The remaining 20% of the data is the test set. The slice starts 60 rows before the split point so that the first test day has a full history window. It is also converted to a NumPy array, with dimensions (341, 60, 1).

test_data = scaled_data[training_data_len - his_period:, :]
x_test = []
y_test = dataset[training_data_len:, :]
for i in range(his_period, len(test_data)):
    x_test.append(test_data[i-his_period:i, 0])
    
x_test = np.array(x_test)
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
x_test.shape
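
The shapes quoted so far all follow from the row count and the window length. A quick arithmetic check, using only the numbers given in the article:

```python
import math

n_rows = 1705     # rows in the article's data set
his_period = 60   # window length

training_data_len = math.ceil(n_rows * .8)
n_train_samples = training_data_len - his_period

# The test slice starts his_period rows early so the first test day has a full window.
test_rows = n_rows - (training_data_len - his_period)
n_test_samples = test_rows - his_period

print(training_data_len, n_train_samples, n_test_samples)  # 1364 1304 341
```

If the data range changes, rerunning this check is a cheap way to confirm that `x_test` and `y_test` still line up one to one.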

Now make the predictions and first calculate the RMSE.

predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)  # back to price units
rmse = np.sqrt(((predictions - y_test) ** 2).mean())
rmse
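
An RMSE value is hard to judge on its own, so it helps to compare it with a trivial baseline. The sketch below, on made-up prices, computes the RMSE of a persistence forecast that simply predicts tomorrow's close to be today's close. This baseline is an addition for comparison and is not part of the article's method; `rmse` here is a helper function defined for the example.

```python
import numpy as np

def rmse(pred, actual):
    """Root mean squared error between two equally long sequences."""
    pred = np.asarray(pred, dtype=float).ravel()
    actual = np.asarray(actual, dtype=float).ravel()
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

# Toy closing prices (made-up values).
closes = np.array([4.80, 4.85, 4.83, 4.90, 4.95])

actual = closes[1:]        # the days being predicted
persistence = closes[:-1]  # baseline: tomorrow's close equals today's close

print(round(rmse(persistence, actual), 4))  # 0.0507
```

In the notebook, the matching baseline for the test period would be `dataset[training_data_len - 1:-1]` compared against `y_test`. If the LSTM does not beat that number, it has learned little beyond copying the previous day.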

Visualize the results

train = data[:training_data_len]
valid = data[training_data_len:].copy()  # copy so adding a column does not trigger SettingWithCopyWarning
valid['Predictions'] = predictions

plt.figure(figsize=(16, 8))
plt.title('Model')
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price', fontsize=18)
plt.plot(train['close'])
plt.plot(valid[['close', 'Predictions']])
plt.legend(['Train', 'Val', 'Predictions'], loc='lower right')
plt.show()

Overall, the model basically follows the trend, but its predicted prices are conservative. Possible causes are too few training epochs, or that the index has traded at relatively high levels over the past two years, a range the model rarely saw during training.

Interested readers can change some parameters, or introduce more features and fine-tune the model.

Hands-on

Fetch the data set again, this time with all data from 2015 up to yesterday.

new_df = get_price(security, start_date="2015-01-05", end_date="2022-1-4", frequency='daily')
new_df.tail()

Let’s predict tomorrow’s closing price

last_days = new_df.filter(['close'])[-his_period:].values
last_days_scaled = scaler.transform(last_days)
X_test = []
X_test.append(last_days_scaled)
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
pred_price = model.predict(X_test)
pred_price = scaler.inverse_transform(pred_price)
print(pred_price)
[[4.901391]]
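
The inference steps above can be collected into one function so that they are easy to rerun each day. This is a sketch under assumptions: `predict_next_close` is a hypothetical helper, `model` is anything with a Keras-style `predict()` method, and `scaler` is anything with `transform()` and `inverse_transform()`, as `MinMaxScaler` provides.

```python
import numpy as np

def predict_next_close(model, scaler, closes, window=60):
    """Predict the next close from the most recent `window` closing prices."""
    closes = np.asarray(closes, dtype=float).reshape(-1, 1)
    if len(closes) < window:
        raise ValueError("need at least %d closing prices, got %d" % (window, len(closes)))
    scaled = scaler.transform(closes[-window:])  # shape (window, 1)
    batch = scaled.reshape(1, window, 1)         # (samples, time steps, features)
    pred_scaled = np.asarray(model.predict(batch)).reshape(-1, 1)
    return float(scaler.inverse_transform(pred_scaled)[0, 0])
```

In the notebook this would be called as `predict_next_close(model, scaler, new_df['close'].values, his_period)`, reusing the scaler that was fitted earlier rather than fitting a new one.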

Remember to check the actual price after the market closes tomorrow.

get_price(security, start_date="2022-1-5", end_date="2022-1-5", frequency='daily').filter(['close'])

Okay, I've told you the wealth code. It just depends on whether the prediction for tomorrow's closing price is accurate.

Finally, I would like to mention that there are risks in entering the market, so investment needs to be cautious.

Source code download

The relevant files and materials for this post are available through the public account "Deep Awakening": reply "trade01" in the account's chat to get the download link.


Origin blog.csdn.net/weixin_47479625/article/details/122335490