[Python Quant] A backtrader-based backtesting framework for deep learning models


Before we begin

This article shows how to build a deep learning model with PyTorch and integrate it into the backtrader backtesting framework. Specifically, we implement a long short-term memory (LSTM) network in PyTorch and apply it to stock price prediction. Because backtrader has no native deep learning module, we first implement, train, and test the model on its own, then load the saved model inside a backtrader strategy for backtesting.

1

Introduction

Backtrader is an open-source Python framework for designing, testing, and deploying trading strategies. Its engine is event-driven: a strategy is called bar by bar, although indicators can be precomputed in a vectorized batch mode (the runonce option). It provides a rich set of tools and data structures for developing and backtesting strategies. With Backtrader you can load, process, and analyze financial market data, write and optimize trading strategies, and backtest and visualize them. It ships with many built-in indicators and a simulated broker, which make it quick to test and compare strategies. For more detail, see the API documentation on the official website (https://www.backtrader.com/).

2

Environment configuration

Local environment:

Python 3.7
IDE: PyCharm

Library version:

numpy 1.18.1
pandas 1.0.3
scikit-learn 0.22.2
matplotlib 3.2.1
torch 1.10.1
tushare 1.2.60
backtrader 1.9.76.123

3

Code

Overall design

To implement this deep learning-based backtrader backtesting framework, we divide it into the following modules:

1. Data acquisition module: uses the tushare library to fetch historical stock data and preprocess it.

2. Deep learning model module: uses PyTorch to build a basic LSTM model, trains and tests it on historical stock data with a sliding window, and saves the trained model so that the backtesting framework can load it.

3. Quantitative strategy module: implements a simple LSTM-driven trading strategy on top of Backtrader. At each day's close the strategy calls the trained LSTM model to predict whether the market will rise. If a rise is predicted, it buys at the next day's opening price; if a fall is predicted, it sells at the next day's opening price.

Data collection

Next, we implement a function that fetches historical stock data; replace YOUR_API_TOKEN in the code with your own tushare API token. You can then call get_stock_data to fetch the history of a given stock and save it locally. Here we use two years of history for China Merchants Bank as an example.

import tushare as ts


def get_stock_data(code, start_date, end_date, token):
    """Fetch daily bars from tushare pro, sorted oldest-first and indexed by trade_date."""
    ts.set_token(token)
    pro = ts.pro_api()
    df = pro.daily(ts_code=code, start_date=start_date, end_date=end_date)
    # Sort chronologically so that the sliding window runs from oldest to newest
    df = df.sort_values(by="trade_date", ascending=True)
    df.set_index("trade_date", inplace=True)
    return df


stock_code = "600036.SH"  # China Merchants Bank
start_date = "20200101"
end_date = "20220101"
api_token = "YOUR_API_TOKEN"

data = get_stock_data(stock_code, start_date, end_date, api_token)
data.to_csv('./data.csv')
print(data.head())
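
To make the shape of the saved data concrete, the sketch below builds a tiny frame with made-up prices in the column layout that tushare's pro.daily is documented to return (ts_code, trade_date, open, high, low, close, pre_close, change, pct_chg, vol, amount), then applies the same sort and index steps as get_stock_data. The rows are illustrative values, not real quotes, and no network call or token is involved.

```python
import pandas as pd

# Illustrative rows only (made-up prices), listed newest-first
raw = pd.DataFrame({
    "ts_code": ["600036.SH"] * 3,
    "trade_date": ["20200106", "20200103", "20200102"],
    "open": [38.0, 38.5, 38.0],
    "high": [38.6, 38.9, 38.7],
    "low": [37.8, 38.1, 37.9],
    "close": [38.2, 38.3, 38.6],
    "pre_close": [38.3, 38.6, 37.9],
    "change": [-0.1, -0.3, 0.7],
    "pct_chg": [-0.26, -0.78, 1.85],
    "vol": [500000.0, 450000.0, 600000.0],
    "amount": [1900000.0, 1700000.0, 2300000.0],
})

# Same preprocessing as get_stock_data: oldest first, trade_date as the index
df = raw.sort_values(by="trade_date", ascending=True).set_index("trade_date")

# Positional column indices once trade_date has become the index
column_positions = {name: i for i, name in enumerate(df.columns)}
print(df.index.tolist())
print(column_positions)
```

With this layout, open, high, low, close and vol sit at positions 1, 2, 3, 4 and 8, which are the column indices passed to bt.feeds.PandasData later in the article.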

Model implementation

A simple LSTM model is used here for testing; it has not been tuned further. The following methods may improve its predictive performance:

1. Change the network structure: adding layers or hidden units increases the expressive power of the LSTM model. Too many layers or hidden units can also cause overfitting, so the size should be chosen by comparing performance on the training set and the test set.

2. Add regularization: methods such as L2 regularization can reduce overfitting.

3. Add more features: in addition to the closing price, consider other features such as the highest price, lowest price, and trading volume to improve the expressive power of the model.

4. Adjust hyperparameters: methods such as grid search can be used to find a good combination of hyperparameters, such as learning rate, batch size, and sliding window size.

The LSTM model code used is as follows:

import torch
import torch.nn as nn


class SimpleLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes, dropout_rate=0.2):
        super().__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        # nn.LSTM (not nn.RNN) so the layer matches the LSTM model described in the text
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc1 = nn.Linear(hidden_size, hidden_size)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(dropout_rate)
        self.bn = nn.BatchNorm1d(hidden_size)
        self.fc2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (batch, window_size, input_size); hidden and cell states default to zeros
        out, _ = self.lstm(x)
        out = self.fc1(out[:, -1, :])  # keep only the last time step
        out = self.relu(out)
        out = self.dropout(out)
        out = self.bn(out)
        # Return raw logits: nn.CrossEntropyLoss applies log-softmax itself,
        # so no sigmoid is applied here
        return self.fc2(out)
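
As a quick shape check, the sketch below pushes a random batch through a standalone nn.LSTM layer of the kind the article describes. The sizes (batch of 4, window of 10, 1 feature, 16 hidden units, 2 layers) are illustrative assumptions, not the article's settings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, window_size, input_size = 4, 10, 1  # assumed sizes
hidden_size, num_layers = 16, 2                 # assumed sizes

lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
x = torch.randn(batch_size, window_size, input_size)

# With batch_first=True the output is laid out as (batch, time, hidden)
out, (h_n, c_n) = lstm(x)

# The last time step is what the model passes on to its first linear layer
last_step = out[:, -1, :]

print(out.shape)        # (batch, window, hidden)
print(last_step.shape)  # (batch, hidden)
print(h_n.shape)        # (layers, batch, hidden)
```

For a unidirectional LSTM, the last time step of the output equals the final hidden state of the top layer, so out[:, -1, :] and h_n[-1] carry the same values.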

Next, we build the data set, using only the closing price of the stock data. The raw data is first normalized. A sliding window over past closing prices then serves as the input, and the label indicates whether the closing price three trading days after the window is higher than the last close in the window. Finally, the data is converted into a form suitable for model training:

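import numpy as np
import torch
from sklearn.preprocessing import MinMaxScaler


def create_dataset(stock_data, window_size):
    """Build sliding-window inputs X and binary up/down labels y from a close-price Series."""
    X = []
    y = []
    scaler = MinMaxScaler()
    # Scale closes to [0, 1]; the scaler is returned so the strategy can reuse it
    stock_data_normalized = scaler.fit_transform(stock_data.values.reshape(-1, 1))

    for i in range(len(stock_data) - window_size - 2):
        X.append(stock_data_normalized[i:i + window_size])
        # Label 1 if the close three days after the window exceeds the window's last close
        if stock_data.iloc[i + window_size + 2] > stock_data.iloc[i + window_size - 1]:
            y.append(1)
        else:
            y.append(0)

    X, y = np.array(X), np.array(y)
    X = torch.from_numpy(X).float()
    y = torch.from_numpy(y).long()
    return X, y, scaler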
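
The window and label indexing can be traced by hand on a short series. The sketch below reuses the same index arithmetic as create_dataset on plain Python lists and leaves out the normalization step; the prices are made up, and label_windows and its horizon parameter are hypothetical names introduced only for this illustration.

```python
def label_windows(closes, window_size, horizon=3):
    """Return (windows, labels) using the same indexing as create_dataset."""
    windows, labels = [], []
    for i in range(len(closes) - window_size - (horizon - 1)):
        window = closes[i:i + window_size]
        last_close = closes[i + window_size - 1]              # last close inside the window
        future_close = closes[i + window_size - 1 + horizon]  # close `horizon` days later
        windows.append(window)
        labels.append(1 if future_close > last_close else 0)
    return windows, labels


closes = [10, 11, 12, 11, 13, 11, 12, 15]  # made-up prices
windows, labels = label_windows(closes, window_size=3)

for window, label in zip(windows, labels):
    print(window, "->", label)
```

The first window [10, 11, 12] ends at 12 and the close three days later is 11, so its label is 0; the next two windows are followed by higher closes and are labeled 1. Eight prices with a window of three therefore yield only three samples.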

After that, split the data into a training set and a test set, train and test the model, and save the trained model so that the backtrader strategy can load it later:

from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, TensorDataset

# X, y and scaler are the outputs of create_dataset; input_size, hidden_size,
# num_layers and num_classes must be defined beforehand (the data set above
# implies input_size = 1 and num_classes = 2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

train_data = TensorDataset(X_train, y_train)
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)

model = SimpleLSTM(input_size, hidden_size, num_layers, num_classes)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

num_epochs = 200

# Train the model
for epoch in range(num_epochs):
    for batch_X, batch_y in train_loader:
        outputs = model(batch_X)
        loss = criterion(outputs, batch_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Log the loss of the last batch every 10 epochs
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')

print('Finished Training')
torch.save(model.state_dict(), 'lstm_model.pth')

# Evaluate classification accuracy on the test set
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    test_data = TensorDataset(X_test, y_test)
    test_loader = DataLoader(test_data, batch_size=32, shuffle=False)
    for batch_X, batch_y in test_loader:
        outputs = model(batch_X)
        _, predicted = torch.max(outputs, 1)  # index of the larger class score
        total += batch_y.size(0)
        correct += (predicted == batch_y).sum().item()

print(f'Accuracy of the model on the test data: {100 * correct / total}%')
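
The L2 regularization mentioned among the possible improvements can be tried through the weight_decay argument of torch.optim.Adam, and dropout between stacked LSTM layers through the dropout argument of nn.LSTM. The sketch below is only an illustration on random data; the layer sizes and the weight_decay and dropout values are assumptions, not settings from the article.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in network; all sizes are illustrative assumptions
lstm = nn.LSTM(input_size=1, hidden_size=16, num_layers=2, batch_first=True, dropout=0.2)
head = nn.Linear(16, 2)
params = list(lstm.parameters()) + list(head.parameters())

# weight_decay adds an L2 penalty on the parameters to every update
optimizer = torch.optim.Adam(params, lr=1e-4, weight_decay=1e-5)
criterion = nn.CrossEntropyLoss()

# One training step on random data, just to show the pieces fit together
x = torch.randn(8, 10, 1)      # (batch, window, features)
y = torch.randint(0, 2, (8,))  # binary up/down labels
out, _ = lstm(x)
loss = criterion(head(out[:, -1, :]), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

print(f"loss after one step: {loss.item():.4f}")
```

Whether a given weight_decay value helps would have to be checked on held-out data; the value above is a placeholder.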

Backtrader strategy implementation

Next, we build the backtrader strategy. The saved model is loaded in the Strategy's __init__ and then called in next, which runs at the close of each bar. Because prediction uses a sliding window, a counter variable tracks whether enough bars have been collected. Once they have, the window is normalized and fed to the model, and the model's prediction decides the order for the next trading day. The trading rules are simple: if a rise is predicted and there is no open position, buy at the next trading day's open; if a fall is predicted and there is an open position, sell at the next trading day's open. In the backtest the initial capital is 10,000, each trade is one lot (100 shares), and the commission is 0.05% (5 per 10,000).

import backtrader as bt
import numpy as np
import pandas as pd
import torch


# Build the strategy
class LSTMStrategy(bt.Strategy):

    def __init__(self):
        self.data_close = self.datas[0].close
        # Rebuild the network and load the trained weights
        self.model = SimpleLSTM(input_size, hidden_size, num_layers, num_classes)
        self.model.load_state_dict(torch.load('lstm_model.pth'))
        self.model.eval()
        self.scaler = scaler  # the scaler fitted in create_dataset
        self.counter = 1

    def next(self):
        # Wait until window_size bars are available
        if self.counter < window_size:
            self.counter += 1
            return
        # Collect the last window_size closes oldest-first, the order used in training
        # (index 0 is the current bar, -1 the previous bar, and so on)
        previous_close_prices = [self.data_close[-i] for i in range(window_size - 1, -1, -1)]
        X = np.array(previous_close_prices).reshape(-1, 1)
        X = self.scaler.transform(X).reshape(1, window_size, -1)

        with torch.no_grad():
            prediction = self.model(torch.tensor(X).float())

        # Class with the larger score: 1 = rise, 0 = no rise
        predicted_trend = torch.argmax(prediction, dim=1).item()

        if predicted_trend == 1 and not self.position:  # rise predicted, no position
            self.order = self.buy()  # buy
        elif predicted_trend == 0 and self.position:  # no rise predicted while holding a position
            self.order = self.sell()  # sell




# Load test data
test_data = pd.read_csv('data.csv', index_col=0, parse_dates=True)


# Create a cerebro entity
cerebro = bt.Cerebro(runonce=False)


# Add data to cerebro
data = bt.feeds.PandasData(
    dataname=test_data,
    datetime=None,
    open=1,
    high=2,
    low=3,
    close=4,
    volume=8,
    openinterest=-1)
cerebro.adddata(data)


# Add strategy to cerebro
cerebro.addstrategy(LSTMStrategy)


# Initial capital of 10,000; each trade is 100 shares
cerebro.broker.setcash(10000)
cerebro.addsizer(bt.sizers.FixedSize, stake=100)


# Commission of 0.05% (5 per 10,000)
cerebro.broker.setcommission(commission=0.0005)
# Print out the starting conditions
print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())


# Run over everything
cerebro.run()


# Print out the final result
print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())


# Plot the result
cerebro.plot()
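
To put the commission setting in perspective, the sketch below works through one round trip of a single lot by hand, assuming the 0.0005 rate is charged as a percentage of the traded value on each fill. The prices are made up, and percent_commission is a hypothetical helper, not a backtrader function.

```python
def percent_commission(price, shares, rate=0.0005):
    """Commission on one fill, charged as a fraction of the traded value."""
    return price * shares * rate


buy_price, sell_price, shares = 40.0, 42.0, 100  # made-up prices, one lot

buy_fee = percent_commission(buy_price, shares)
sell_fee = percent_commission(sell_price, shares)
gross_profit = (sell_price - buy_price) * shares
net_profit = gross_profit - buy_fee - sell_fee

print(f"fees: {buy_fee:.2f} + {sell_fee:.2f}, net profit: {net_profit:.2f}")
```

At these prices the two fees come to about 4.1 in total against a gross profit of 200, so with one-lot trades the commission is small relative to typical price moves.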

Strategy results

First, the model was trained and tested. After 200 training epochs, the trained model reached a classification accuracy of about 66% on the test set.

Finished Training
Accuracy of the model on the test data: 66.3157894736842%

We then run the backtrader engine for the backtest; the figure below visualizes the results. Each trade is one lot, and the final portfolio value is 11686.92. The strategy took several losses early on, then the model caught a large upward trend in the middle of the period, followed by several profitable trades.

[Figure: backtrader plot of the backtest results]
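
The reported final value can be turned into a total return with a line of arithmetic. The sketch below uses only the two figures given in the article, the initial capital of 10,000 and the final value of 11686.92.

```python
initial_cash = 10000.0
final_value = 11686.92  # final portfolio value reported by the backtest

profit = final_value - initial_cash
total_return = profit / initial_cash

print(f"profit: {profit:.2f}, total return: {total_return:.2%}")
```

This gives a profit of 1686.92, or a total return of roughly 16.87% over the two-year backtest window.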

4

Summary

The trading strategy in this article derives its trading signals from a deep learning model and is backtested with the backtrader framework. Compared with traditional technical analysis, a deep learning model may make fuller use of the information in historical data and broadens the range of possible strategies. The backtrader framework, in turn, simplifies the backtesting process, automates trading decisions, and provides visualization tools for analyzing and tuning backtest results. The framework here only applies backtrader to a historical backtest of one deep learning model, so much can still be improved, for example by adding filter conditions, running out-of-sample tests, swapping in other models, and introducing more features. In short, this article offers one way to build a deep learning backtesting framework on top of backtrader; interested readers can use it to test other deep learning and machine learning models.
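
One of the improvements listed above, out-of-sample testing, could start from a chronological split, in which the test samples all come after the training samples in time. The sketch below is a minimal illustration on plain Python lists; chronological_split is a hypothetical helper, and the 20% test fraction simply mirrors the test_size used earlier.

```python
def chronological_split(samples, test_fraction=0.2):
    """Split time-ordered samples so the test part lies entirely after the training part."""
    n_test = int(round(len(samples) * test_fraction))
    split_at = len(samples) - n_test
    return samples[:split_at], samples[split_at:]


samples = list(range(10))  # stand-in for time-ordered windows, oldest first
train, test = chronological_split(samples)

print(train)
print(test)
```

The same slicing could be applied to the X and y tensors from create_dataset; because neighbouring windows overlap, leaving a gap of a few samples between the two parts may be worth considering.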

The content of this article is only for technical discussion and learning, and does not constitute any investment advice.


Origin blog.csdn.net/FrankieHello/article/details/130453151