Predicting stock trends using a MACD decision tree model


Preface

As mentioned last time, an LSTM time series model can be used to predict stock prices. In technical analysis of stock prices, MACD is a classic trading indicator that has long been used to time buying and selling. In this article, we will build a decision tree model on this indicator to predict future stock trends.

Basic concepts

MACD indicator

MACD (Moving Average Convergence/Divergence) is a technical indicator used for stock price analysis, created by Gerald Appel in the late 1970s. It aims to reveal changes in the strength, direction, momentum and duration of stock price trends.


The MACD indicator is a set of three time series calculated from historical price data (usually closing prices): the MACD series, the average (signal) series, and the divergence series, which is the difference between the two. It therefore depends on three time parameters, namely the time constants of three EMAs, and is usually written "MACD(a, b, c)". The MACD series is the difference between the exponential moving averages (EMAs) of the price with characteristic times a (short period) and b (long period), and the average series is the EMA of the MACD series with characteristic time c.
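Written out, the definition above can be sketched as follows. The symbols $p_t$ (closing price on day $t$) and $\alpha_n$ are notation introduced here, and the smoothing factor $\alpha_n = 2/(n+1)$ is the common "span" convention, assumed rather than stated in the text.

```latex
\begin{align*}
\alpha_n &= \frac{2}{n + 1} \\
\mathrm{EMA}_n(x)_t &= \alpha_n\, x_t + (1 - \alpha_n)\, \mathrm{EMA}_n(x)_{t-1} \\
\mathrm{MACD}_t &= \mathrm{EMA}_a(p)_t - \mathrm{EMA}_b(p)_t \\
\mathrm{Signal}_t &= \mathrm{EMA}_c(\mathrm{MACD})_t \\
\mathrm{Divergence}_t &= \mathrm{MACD}_t - \mathrm{Signal}_t
\end{align*}
```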

These parameters are usually measured in days. The most commonly used values are 12, 26 and 9 days, i.e. MACD(12,26,9). Like most technical indicators, MACD takes its period settings from the days when technical analysis was mainly based on daily charts.

Since the working week used to be 6 days, the period settings of (12, 26, 9) represent 2 weeks, 1 month, and 1 and a half weeks respectively. Now, when the trading week only has 5 days, the cycle parameters can also be adjusted accordingly, such as (10, 20, 7). However, it is best to stick to the period settings used by most traders, as buying and selling decisions based on standard settings can further influence the price in that direction.

The short-term EMA reacts more quickly to recent stock price changes than the long-term EMA. By comparing EMAs of different periods, the MACD series can indicate changes in a stock's trend. The divergence series (the MACD series minus its average series) is said to reveal subtle changes in stock trends.

Since MACD is based on moving averages, it is a lagging indicator. As a predictor of future price trends, it is of little use when prices move in a trendless range or erratically, and by the time MACD shows a trend, that trend is often complete or almost complete.

Decision tree model

Decision tree learning is one of the predictive modeling methods used in statistics, data mining, and machine learning. It uses a decision tree as the prediction model, representing observations about the historical data in the branches and the predicted target values in the leaves.


A tree model whose target can take a discrete set of values is called a classification tree; in these tree structures, the leaves represent class labels and the branches represent the conjunctions of features that lead to those labels. A decision tree whose target can take continuous values (usually real numbers) is called a regression tree. Our prediction target here is the stock price, so a regression tree is used.
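To make the regression-tree idea concrete, here is a minimal sketch of how a single split could be chosen: try each threshold on one feature and keep the one that minimizes the total squared error when every leaf predicts the mean of its targets. The function name `best_split` and the toy numbers are illustrative assumptions; libraries such as sklearn apply this idea recursively and over many features.

```python
def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimizing the squared error."""
    def sse(values):
        # Sum of squared errors when predicting the mean of `values`.
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            best = (error, threshold,
                    sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


# Toy data: the target jumps once the feature passes 3.
threshold, left_mean, right_mean = best_split(
    [1, 2, 3, 4, 5, 6], [10, 10, 10, 20, 20, 20])
print(threshold, left_mean, right_mean)  # 3.5 10.0 20.0
```

Each leaf here predicts a constant, which is why the predictions of a regression tree look like a step function.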

Decision trees are one of the most popular machine learning algorithms due to their understandability and simplicity. In decision analysis, decision trees can be used to intuitively and unambiguously represent decisions and decision making.

To sum up, MACD is an indicator built on averages. Although it is not suited to reflecting ever-changing real-time stock prices directly, it can reflect the degree of price divergence over a period of time. Next, we will use a decision tree to predict the price trend over a future period.

Environment preparation

As before, first create a new Jupyter Notebook and import the sklearn library.

import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeRegressor
plt.style.use('fivethirtyeight')

Data preparation

Get transaction data

Obtain historical data since 2015. This is similar to the previous article, so I won’t go into details.

security = '510300.XSHG'
his_period = 60  # forecast horizon in trading days
df = get_price(security, start_date="2015-01-05", end_date="2022-1-25", frequency='daily')

Calculate EMA

Calculate the series with their respective time parameters. Although the trading week now has 5 days, we still use the classic MACD(12, 26, 9). If you are interested, you can adjust the parameters yourself.

ShortEMA = df.close.ewm(span=12, adjust=False).mean()
LongEMA = df.close.ewm(span=26, adjust=False).mean()
MACD = ShortEMA - LongEMA
signal = MACD.ewm(span=9, adjust=False).mean()
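For readers curious what `ewm(span=n, adjust=False).mean()` computes, the recursion can be sketched in plain Python. The helper names `ema` and `macd` are hypothetical, not part of pandas. The sketch assumes the span convention alpha = 2 / (n + 1) and seeds the average with the first value, which is how pandas behaves with `adjust=False`.

```python
def ema(values, span):
    """Exponential moving average with alpha = 2 / (span + 1), seeded with values[0]."""
    alpha = 2 / (span + 1)
    out = []
    for v in values:
        if not out:
            out.append(float(v))  # first output equals the first input
        else:
            out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(closes, short=12, long=26, signal=9):
    """Return (macd_line, signal_line) for a list of closing prices."""
    short_ema = ema(closes, short)
    long_ema = ema(closes, long)
    macd_line = [s - l for s, l in zip(short_ema, long_ema)]
    signal_line = ema(macd_line, signal)
    return macd_line, signal_line


closes = [10.0, 11.0, 12.0, 11.0, 10.0]  # toy prices, not real market data
macd_line, signal_line = macd(closes)
print(macd_line[0], signal_line[0])  # 0.0 0.0
```

Both lines start at zero because the two EMAs are seeded with the same first price, so the earliest rows carry little information until the averages have had time to diverge.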

Visualize the parameter curve

macd_parameter = (12, 26, 9)
ShortEMA = df.close.ewm(span=macd_parameter[0], adjust=False).mean()
LongEMA = df.close.ewm(span=macd_parameter[1], adjust=False).mean()
MACD = ShortEMA - LongEMA
signal = MACD.ewm(span=macd_parameter[2], adjust=False).mean()
plt.figure(figsize=(16,8))
plt.plot(df.index, MACD, label='MACD', color='red')
plt.plot(df.index, signal, label='Signal Line', color='blue')
plt.legend(loc='upper left')
plt.show()

Save indicator data into a dataset

df['MACD'] = MACD
df['Signal Line'] = signal
df.tail()

Get buy and sell signals

In a MACD crossover, when the MACD falls below the signal line, it is a bearish signal that it may be time to sell. Conversely, when the MACD rises above the signal line, the indicator gives a bullish signal, suggesting that the asset price may be gaining upward momentum.

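def buy_sell(data):
    Buy = []
    Sell = []
    flag = -1  # -1: no signal yet, 1: last signal was a buy, 0: last was a sell

    for i in range(len(data)):
        # .iloc selects by position; the frame is indexed by date
        macd_value = data['MACD'].iloc[i]
        signal_value = data['Signal Line'].iloc[i]
        if macd_value > signal_value:
            Sell.append(np.nan)
            if flag != 1:
                # buy signal: MACD has just moved above the signal line
                Buy.append(data['close'].iloc[i])
                flag = 1
            else:
                Buy.append(np.nan)
        elif macd_value < signal_value:
            Buy.append(np.nan)
            if flag != 0:
                # sell signal: MACD has just moved below the signal line
                Sell.append(data['close'].iloc[i])
                flag = 0
            else:
                Sell.append(np.nan)
        else:
            Buy.append(np.nan)
            Sell.append(np.nan)

    return (Buy, Sell)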
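The same crossover rule can be written more compactly by tracking whether MACD was above or below the signal line on the last day the two differed. The function `crossovers` and the toy series are illustrative assumptions. It returns row positions rather than prices and, like `buy_sell`, treats the first day on which the lines differ as a signal.

```python
def crossovers(macd, signal):
    """Return (buy_positions, sell_positions) where MACD crosses the signal line."""
    buys, sells = [], []
    state = None  # None until the first day the two lines differ
    for i, (m, s) in enumerate(zip(macd, signal)):
        if m == s:
            continue  # a tie neither fires a signal nor changes the state
        above = m > s
        if above != state:
            (buys if above else sells).append(i)
            state = above
    return buys, sells


# Toy series, not real market data.
toy_macd = [0.0, 0.2, 0.5, 0.1, -0.3, 0.4]
toy_signal = [0.0, 0.1, 0.3, 0.3, 0.0, 0.1]
print(crossovers(toy_macd, toy_signal))  # ([1, 5], [3])
```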

Import buy and sell signals into dataset

df['Buy_Signal_Price'], df['Sell_Signal_Price'] = buy_sell(df)
df.head()

Visualize buy and sell signals

period = 500  # number of most recent days to plot
plt.figure(figsize=(16,8))
plt.scatter(df.index[-period:], df['Buy_Signal_Price'][-period:], label="Buy", color='green', marker='^',alpha=1, linewidths=5)
plt.scatter(df.index[-period:], df['Sell_Signal_Price'][-period:], label="Sell", color='red', marker='v',alpha=1, linewidths=5)
plt.plot(df['close'][-period:], label='Close Price', alpha=0.35)
plt.title('Close Price Buy & Sell Signals')
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price', fontsize=18)
plt.legend(loc='upper left')
plt.show()

In the picture, green marks a buy signal and red a sell signal. The indicator is more useful during large swings; during small fluctuations there are many cases of selling at the bottom and buying at the top. Technical indicators therefore cannot be trusted completely, and in real trading you still need to use your own judgment.

Decision tree regression model

Generate target value

To predict the trend over the next 60 days, we first shift the closing prices upward by 60 rows, so that each row's target is the close 60 trading days later.

# his_period = 60
df['Prediction'] = df[['close']].shift(-his_period)
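What `shift(-his_period)` does to the target column can be seen on a toy list. The helper `shift_up` is a hypothetical stand-in for the pandas call, with `None` playing the role of `NaN`.

```python
def shift_up(values, k):
    """Mimic Series.shift(-k): row i receives the value from row i + k."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Rows that would look past the end of the data are left empty.
    return list(values[k:]) + [None] * min(k, len(values))


closes = [1, 2, 3, 4, 5]  # toy prices
print(shift_up(closes, 2))  # [3, 4, 5, None, None]
```

The last `k` rows end up without a target, which is why they have to be excluded from training and testing later on.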

Data preprocessing

When we visualized the buy and sell signals earlier, the signal columns held closing prices. To avoid leaking this price data into training, these columns are first converted to one-hot (0/1) flags and then added to the dataset as features.

# 1 where a signal fired on that day, 0 otherwise
df['Buy_Signal_Price'] = df['Buy_Signal_Price'].notna().astype(int)
df['Sell_Signal_Price'] = df['Sell_Signal_Price'].notna().astype(int)

Divide training set and test set

We want to predict the trend over the next 60 days. The last 60 rows have no target value after the shift, so the labelled data set runs from 2015 up to 60 days before the end. Within it, the training set runs from 2015 up to 120 days before the end, and the test set covers the span from 120 to 60 days before the end. Finally, the features of the last 60 days are used to predict the trend of the next 60 days.

X = np.array(df.drop(columns=['Prediction'])[:-his_period])
y = np.array(df['Prediction'][:-his_period])
x_train = X[:-his_period]
x_test = X[-his_period:]
y_train = y[:-his_period]
y_test = y[-his_period:]
print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)

Output: (1601, 10) (60, 10) (1601,) (60,)
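The slicing above can be summarised as a small helper. `chronological_split` is a hypothetical name, and the total of 1721 rows is inferred from the printed shapes (1601 + 60 + 60), not stated in the article.

```python
def chronological_split(n_rows, horizon):
    """Return (train, test, future) row ranges for a time-ordered dataset."""
    if n_rows <= 2 * horizon:
        raise ValueError("not enough rows for the requested horizon")
    labelled = n_rows - horizon              # rows that still have a target
    train = range(0, labelled - horizon)     # oldest rows
    test = range(labelled - horizon, labelled)
    future = range(labelled, n_rows)         # features only, no target yet
    return train, test, future


train, test, future = chronological_split(1721, 60)
print(len(train), len(test), len(future))  # 1601 60 60
```

Splitting by position rather than at random keeps every test row later in time than every training row.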

Train a decision tree model

Everything is ready. We use DecisionTreeRegressor from the sklearn library to build the model and train it on the training set divided earlier.

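# 'squared_error' replaces the old 'mse' name, which recent sklearn versions no longer accept
tree = DecisionTreeRegressor(criterion='squared_error', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1).fit(x_train, y_train)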

Verify model effect

Feed the test set into the model for prediction.

prediction = tree.predict(x_test)
print("The model training score is" , tree.score(X, y))

Output: The model training score is 0.9977305489482631
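Note that `tree.score(X, y)` is evaluated on `X`, which contains the training rows as well as the 60 test rows, so the figure mostly reflects how well the tree fits data it has already seen. For a regressor, `score` returns the coefficient of determination R², which can be sketched in plain Python and applied to the held-out rows alone. The helper `r2_score_simple` and the toy numbers are illustrative assumptions.

```python
def r2_score_simple(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]  # toy targets
print(r2_score_simple(y_true, y_true))                # 1.0, perfect fit
print(r2_score_simple(y_true, [2.5, 2.5, 2.5, 2.5]))  # 0.0, no better than the mean
```

With the variables defined earlier, the held-out figure would be `tree.score(x_test, y_test)`. It can be much lower than the score on `X`, and even negative, for a tree grown without a depth limit.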

period = 500  # number of most recent days to plot
valid = df[X.shape[0]-his_period:-his_period].copy()  # copy so the new column does not write into df
valid['Prediction'] = prediction
plt.figure(figsize=(16,8))
plt.title('Model')
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price', fontsize=18)
plt.plot(df[-period:X.shape[0]-his_period+1]['close'], linewidth=3, color='blue')
plt.plot(valid['close'], linewidth=5, alpha=0.5)
plt.plot(valid['Prediction'], linewidth=2, color='red')
plt.legend(['Train', 'Val', 'Predictions'], loc='lower right')
plt.show()

As can be seen in the figure, the model predicted that a big plunge would occur in about 30 days (but I didn't expect it to happen so quickly in reality...), and then the price rebounded slightly and gradually stabilized.

Predict future trends

We take the features of the last 60 days and feed them to the model to obtain the predicted values.

x_future = np.array(df.drop(columns=['Prediction'])[-his_period:])
prediction = tree.predict(x_future)

plt.figure(figsize=(16,8))
plt.title('Model')
plt.xlabel('Days', fontsize=18)
plt.ylabel('Close Price', fontsize=18)
plt.plot(prediction)
plt.legend(['Predictions'], loc='lower right')
plt.show()

The model shows that the market will be relatively flat in the next month, and there seems to be some opportunity in the second month. But is it accurate? Let's wait and see...

Source code download


The relevant documents and materials for this issue are available from the public account "Deep Awakening": reply "trade02" in the background to obtain the download link.



Origin blog.csdn.net/weixin_47479625/article/details/122711265