Interpretation: GCN-based stock forecasting model

Foreword: Since the graph convolutional network (GCN) was introduced at ICLR 2017, the model has shown excellent performance in tasks such as node classification and edge prediction. Traditional factor-based stock selection models often treat stocks as independent individuals, but in reality stocks are connected by intricate relationships. GCN can feed these relationships into the prediction model as incremental information, which adds value and offers a useful source of ideas.

Scholars have done extensive research on how the relationships between stocks affect them. The REST model covered in the previous article models the stock relationship graph and improves model performance, but it does not use a graph convolutional network. Given GCN's strong performance across a range of tasks, this article selects two papers that construct different stock relationship networks and apply GCN for prediction.



Part I: Overview of GCN

A graph is generally expressed as $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges. In the models discussed in this article, each stock is a vertex, the relationships between stocks are the edges, and the edge weights represent the strength of each relationship.

Graph Convolutional Neural Network (GCN)

For the basic concepts and applications of graph neural networks, the Google team published two articles on Distill [1], "A Gentle Introduction to Graph Neural Networks" and "Understanding Convolutions on Graphs", which provide interactive pages and very detailed explanations.

GCN grew out of graph signal theory and spectral graph convolution, and is one of the most commonly used GNN variants. Inspired by CNNs, researchers redefined the concept of convolution for graph data. The basic idea of GCN is to model the relationships between samples explicitly or implicitly. Taking the graph structure and node features as input, it captures the complex interactions between nodes by aggregating information and applying nonlinear transformations to create new features.

Figure 1 GCN

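The propagation rule between GCN layers is:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)$$

where:

  • $\tilde{A} = A + I$, where $I$ is the identity matrix and $A$ is the adjacency matrix

  • $\tilde{D}$ is the degree matrix of $\tilde{A}$, with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$

  • $H^{(l)}$ is the feature matrix of layer $l$, and in the input layer $H^{(0)} = X$ is the feature matrix of the nodes

  • $W^{(l)}$ is the trainable weight matrix of layer $l$

  • $\sigma$ is the nonlinear activation function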
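The propagation rule can be sketched in a few lines of NumPy. This is a minimal illustration, not code from either paper: the toy adjacency matrix, the one-hot features, and the random weight matrix `W0` are all assumptions made for the example, and edge weights are assumed to be non-negative.

```python
import numpy as np

def gcn_layer(A, H, W, activation=lambda x: np.maximum(x, 0.0)):
    """One GCN propagation step: activation(D~^-1/2 (A + I) D~^-1/2 H W)."""
    A_tilde = A + np.eye(A.shape[0])        # add self-loops
    deg = A_tilde.sum(axis=1)               # diagonal of the degree matrix of A_tilde
    d_inv_sqrt = 1.0 / np.sqrt(deg)         # deg >= 1 because of the self-loops
    A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return activation(A_hat @ H @ W)

# Toy graph (assumed): 3 stocks, stock 0 is linked to stocks 1 and 2.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
X = np.eye(3)                               # one-hot node features
rng = np.random.default_rng(0)
W0 = rng.normal(size=(3, 2))                # random, untrained weights
H1 = gcn_layer(A, X, W0)                    # new 2-dimensional feature per stock
print(H1.shape)
```

Each row of `H1` mixes a stock's own features with those of its neighbours, weighted by the normalized adjacency matrix.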

GCN was proposed as a semi-supervised classification method: it can be trained even when only a few nodes are labeled. Notably, even an untrained GCN already extracts useful node features, which sets it apart from other commonly used neural network models.

Problems on graph-structured data fall into three levels: graphs, edges, and nodes. Since its introduction, GCN has been widely used on all three. At the node level, nodes can be classified and clustered, for example to identify the most influential nodes; at the edge level, missing connections can be predicted; at the graph level, whole graphs can be classified. GCN has also performed well in recommender systems, fake user detection, and other problems. In the setting of this article, the focus is the state of each node, that is, each stock: changes in stock state propagate along the edges and are used to predict the rise or fall of a specific stock.


Part II: Graphs in the Stock Market

Why the stock market can be represented using a graph

Stock linkage phenomenon: stock linkage is a regularity that arises naturally in the market. Highly correlated stocks often rise and fall together, and this simultaneous movement is called stock linkage.

Because stocks are connected in many ways, such as shared industries, upstream and downstream supply chains, payment networks, business partnerships, and equity ties, they influence one another, which partly explains the stock linkage phenomenon.

Lead-lag theory: A large body of empirical work confirms that stock prices show clear lead-lag behavior, that is, the prices of some stocks lead or lag behind those of others. The industry information diffusion hypothesis offers an explanation: new information is reflected first in industry leaders and then spreads to other companies in the same industry.

Given these characteristics of the stock market, researchers should take the complex relationships between stocks into account. Because a graph structure represents relationships between entities well, the stock market is represented as a graph, which then serves as the basis for further analysis.

Graphs used in the models

A relationship graph between stocks can be built from many different angles. Some scholars learn the relationships through a model, while others use prior financial knowledge to construct the graph. The types of relationships also differ, and the two papers selected in this article use different relationship graphs.

GCNET: Influence Network

Figure 2 Steps to build an influence network

This model builds an influence network from historical stock data. First, four prediction models are trained on the basis of quadratic discriminant analysis (QDA). Once the predicted values are obtained, an influence score is calculated for each pair of stocks and used as the weight of the edge connecting the two nodes. The edges are then sorted by influence score and removed in ascending order up to the point where the graph would no longer be connected, which keeps the strong connections and avoids noise. Finally, the weights are normalized to obtain the final influence network.
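The pruning and normalization steps can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the influence scores are taken as given, pruning stops at the first edge whose removal would disconnect the graph, and dividing by the largest remaining weight is an assumed normalization. The function names and the toy edge list are hypothetical.

```python
def is_connected(nodes, edges):
    """Depth-first search over an undirected, weighted edge list."""
    nodes = list(nodes)
    if not nodes:
        return True
    adj = {n: set() for n in nodes}
    for u, v, _ in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for nb in adj[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == len(nodes)

def prune_influence_graph(nodes, edges):
    """Drop the weakest edges first; stop before the graph would disconnect."""
    kept = sorted(edges, key=lambda e: e[2])       # ascending influence score
    while len(kept) > 1 and is_connected(nodes, kept[1:]):
        kept = kept[1:]                            # remove the current weakest edge
    if not kept:
        return []
    top = max(w for _, _, w in kept)
    return [(u, v, w / top) for u, v, w in kept]   # assumed: scale weights to (0, 1]

# Toy influence scores (assumed values) between four stocks.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B", 0.9), ("B", "C", 0.8), ("C", "D", 0.7),
         ("A", "C", 0.2), ("B", "D", 0.1), ("A", "D", 0.3)]
pruned = prune_influence_graph(nodes, edges)
print(pruned)
```

In this toy case the three weakest edges are dropped and the chain A-B-C-D remains, since removing the next edge (C, D) would isolate stock D.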

Multi-GCGRU   

This model constructs its graphs mainly from prior financial knowledge. Three relationship graphs are extracted, corresponding to shareholding relationships, lead-lag effects, and the influence of current news topics.

  • Share-holding graph

A company's performance affects not only its own financial reports but also the interests of its shareholders, and thus their stock prices. Based on the shareholding relationships between companies, the share-holding graph uses the shareholding ratio as the edge weight. However, because such shareholding relationships are uncommon, the graph is relatively sparse, which limits its expressive power to some extent.

  • Industry graph

The lead-lag effect in the market is related to company size: the stock returns of large companies often lead those of small companies in the same industry. The model focuses on influence within an industry, so there are no edges between industries, and the edge weight between companies in the same industry is derived from company size.

  • Topicality graph

Because news affects the stocks related to a given topic, and stocks sharing a topic are correlated to some degree, the topic information of each stock is collected from the Internet to build the topicality graph. A stock may carry several topic tags, and the number of tags two stocks share measures the strength of their relationship to some extent. The edge weight is computed from the number of topics the two stocks share and the number of topics each stock has.
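As a rough illustration of the topicality graph, the sketch below builds edge weights from shared topic tags. The exact weighting used in the paper is not reproduced here; dividing the shared-topic count by the geometric mean of the two stocks' topic counts is an assumption chosen for the example, and the tickers and tags are made up.

```python
import math

def topic_weight(topics_i, topics_j):
    """Assumed weighting: shared topics over the geometric mean of topic counts."""
    shared = len(topics_i & topics_j)
    if shared == 0:
        return 0.0
    return shared / math.sqrt(len(topics_i) * len(topics_j))

def topicality_graph(stock_topics):
    """Return {(stock_a, stock_b): weight} for every pair sharing at least one topic."""
    names = sorted(stock_topics)
    graph = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            w = topic_weight(stock_topics[a], stock_topics[b])
            if w > 0:
                graph[(a, b)] = w
    return graph

# Hypothetical tickers and topic tags.
stock_topics = {
    "AAA": {"5G", "chips"},
    "BBB": {"chips", "EV"},
    "CCC": {"banking"},
}
topic_graph = topicality_graph(stock_topics)
print(topic_graph)
```

Stocks with no topic in common get no edge, so the resulting graph is only as dense as the tag overlap in the data.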


Part III: GCN Model

Brief description of existing models and their shortcomings

Many scholars have shown experimentally that the data of related stocks helps predict stock price movements, but research in this area is still limited. How to model these relationships and use them to improve predictive ability remains a challenging problem. Many current methods rely only on graph analysis techniques, which lack generality and flexibility and have limited predictive power.

Although some studies also use the data of related stocks to train the prediction model, most of them have the following disadvantages:

  1. It is difficult to flexibly and intuitively represent the associated information in the model.

  2. Excessive reliance on expert experience as prior knowledge.

  3. The model architecture is based on a specific task and only contains a fixed set of stocks, which is inconvenient to expand.

GCN addresses these problems: it is more interpretable, more flexible, and aggregates useful information more effectively. To better fit the stock price prediction problem, the following two models each modify the basic network.

The basic structure of the two models

GCNET

Figure 3 GCNET construction

GCNET transforms the original problem into a label prediction problem on a graph: it models the historical price data of related stocks and the degree to which they influence each other, and builds the prediction model with a semi-supervised algorithm. The PLD (plausible label discovery) method first makes reliable predictions for some nodes to serve as initial labels; GCN then refines these labels and predicts the labels of the unlabeled stocks.

The basic idea of the PLD method is to train several base classifiers on the historical data before the target time point and to take the prediction of the best-performing classifier as the initial label.
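The selection step of PLD can be sketched as below. This is only an illustration of "pick the best base classifier and trust its prediction": the three rule-based classifiers stand in for trained models, and the `window` and `min_accuracy` parameters are hypothetical names introduced for the example. Movements are encoded as +1 (rise) and -1 (fall).

```python
def momentum(history):
    """Predict that the last movement repeats."""
    return history[-1]

def reversal(history):
    """Predict that the last movement reverses."""
    return -history[-1]

def majority(history):
    """Predict the more frequent movement so far (ties count as a rise)."""
    return 1 if sum(history) >= 0 else -1

def plausible_label(history, classifiers, window=5, min_accuracy=0.6):
    """Score each classifier on the last `window` days (requires
    len(history) > window) and return the best one's prediction for the
    next day, or None when no classifier is reliable enough."""
    def accuracy(clf):
        start = len(history) - window
        hits = sum(clf(history[:t]) == history[t]
                   for t in range(start, len(history)))
        return hits / window

    best = max(classifiers, key=accuracy)
    if accuracy(best) < min_accuracy:
        return None              # leave the node unlabeled for the GCN to fill in
    return best(history)

base_classifiers = [momentum, reversal, majority]
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
noisy = [1, 1, -1, 1, -1, -1, 1, 1]
print(plausible_label(alternating, base_classifiers))            # reversal rule wins
print(plausible_label(noisy, base_classifiers, min_accuracy=0.7))
```

Returning `None` for unreliable stocks mirrors the idea that only part of the nodes receive an initial label.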

The graphs carrying the true stock labels for the 4 days before the forecast date, together with the graph partially labeled by the PLD method on the forecast day, are used as input to predict the complete set of labels. The model has three layers: the first two are basic GCN propagation layers with ReLU as the activation function, and the last layer uses the Softmax function to produce the classification output. Training uses cross-entropy loss.
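A forward pass through such a three-layer network, with the loss computed only on labeled nodes, might look like the sketch below. It assumes the last layer is also a graph propagation step followed by Softmax; the toy graph, the feature size of 5 (for example, four past labels plus one PLD label), the hidden size, and the random weights are all assumptions for illustration, and no training loop is shown.

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization D~^-1/2 (A + I) D~^-1/2."""
    A_tilde = A + np.eye(A.shape[0])
    d = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d[:, None] * d[None, :]

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)    # subtract row max for stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def forward(A_hat, X, W1, W2, W3):
    H1 = np.maximum(A_hat @ X @ W1, 0.0)    # GCN layer 1 + ReLU
    H2 = np.maximum(A_hat @ H1 @ W2, 0.0)   # GCN layer 2 + ReLU
    return softmax(A_hat @ H2 @ W3)         # output layer + Softmax

def masked_cross_entropy(P, labels, mask):
    """Cross-entropy averaged over the nodes where mask == 1."""
    picked = P[np.arange(len(labels)), labels]
    return float(-(np.log(picked + 1e-12) * mask).sum() / mask.sum())

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)   # toy 4-stock graph
X = rng.normal(size=(4, 5))                 # toy node features
W1 = rng.normal(size=(5, 8))
W2 = rng.normal(size=(8, 8))
W3 = rng.normal(size=(8, 2))                # 2 classes: fall / rise
labels = np.array([1, 0, 0, 0])             # entries of unlabeled nodes are ignored
mask = np.array([1.0, 1.0, 0.0, 0.0])       # only stocks 0 and 1 carry initial labels

P = forward(normalize_adj(A), X, W1, W2, W3)
loss = masked_cross_entropy(P, labels, mask)
print(P.shape, loss)
```

Because the loss is masked, unlabeled stocks still receive predictions in `P` through the graph, but only the labeled ones drive the training signal.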

Multi-GCGRU

Figure 4 Multi-GCGRU architecture

The model modifies the basic GCN propagation layer so that several predefined graph structures can be combined in one model. The resulting multi-graph convolution layer uses the Laplacian matrices corresponding to the three adjacency matrices.
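One simple way to combine several graphs in a single layer is to give each graph its own weight matrix and sum the propagated features. The sketch below does exactly that; it is an assumed form, not necessarily the paper's formula, and it uses the self-loop-normalized adjacency matrix of the basic GCN layer in place of each graph's Laplacian. The three toy graphs and random weights are made up.

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization D~^-1/2 (A + I) D~^-1/2."""
    A_tilde = A + np.eye(A.shape[0])
    d = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d[:, None] * d[None, :]

def multi_graph_conv(adjs, H, weights):
    """Assumed form: ReLU(sum over graphs k of A_hat_k @ H @ W_k)."""
    out = np.zeros((H.shape[0], weights[0].shape[1]))
    for A, W in zip(adjs, weights):
        out += normalize_adj(A) @ H @ W
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
n_stocks, n_feat, n_out = 4, 6, 3
shareholding = np.zeros((n_stocks, n_stocks))
shareholding[0, 1] = shareholding[1, 0] = 0.3    # sparse: a single holding link
industry = np.array([[0, 1, 0, 0],
                     [1, 0, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]], dtype=float)  # two industries, no cross links
topicality = np.array([[0.0, 0.5, 0.5, 0.0],
                       [0.5, 0.0, 0.0, 0.0],
                       [0.5, 0.0, 0.0, 1.0],
                       [0.0, 0.0, 1.0, 0.0]])
graphs = [shareholding, industry, topicality]
graph_weights = [rng.normal(size=(n_feat, n_out)) for _ in graphs]
H = rng.normal(size=(n_stocks, n_feat))           # one day's stock features

H_next = multi_graph_conv(graphs, H, graph_weights)
print(H_next.shape)
```

Adding a further relationship graph only requires appending another adjacency matrix and weight matrix to the two lists.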

Besides the three relationship graphs defined above, further relationship graphs can easily be merged in, which makes the model easy to extend. Even without a predefined graph, relationships can be learned from historical data through dynamic graph convolutional layers.

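The output of the Multi-GCN part is computed day by day, then concatenated and fed into a GRU (Gated Recurrent Unit). As a variant of RNN, GRU handles time series problems well. In its standard form, the GRU layer is:

$$z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)$$

$$r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)$$

$$\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)$$

$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

where $\sigma$ is the sigmoid activation function, $x_t$ is the input at step $t$, $h_t$ is the hidden state, and $\odot$ denotes element-wise multiplication.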
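For readers who prefer code, here is a minimal NumPy sketch of one standard GRU step, run over a 7-day sequence. The parameter names, the input and hidden sizes, and the random inputs (standing in for the daily Multi-GCN outputs) are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, p):
    """One standard GRU update for input x and previous hidden state h."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])             # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])             # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * h_cand

rng = np.random.default_rng(0)
n_in, n_hid, n_days = 6, 4, 7
params = {}
for gate in "zrh":
    params["W" + gate] = rng.normal(scale=0.1, size=(n_hid, n_in))
    params["U" + gate] = rng.normal(scale=0.1, size=(n_hid, n_hid))
    params["b" + gate] = np.zeros(n_hid)

daily_features = rng.normal(size=(n_days, n_in))  # stand-in for daily Multi-GCN outputs
h = np.zeros(n_hid)
for x in daily_features:
    h = gru_step(x, h, params)                    # carry the state across days
print(h)
```

The final hidden state `h` summarizes the whole window and would be passed to an output layer for the rise/fall prediction.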

Model performance and value

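Both models use Accuracy and MCC (Matthews correlation coefficient) as the main evaluation metrics:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$

where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives.

Both models generally obtain better results than the benchmark models. From a modeling perspective, they also offer a new approach for using stock relationships to improve stock price prediction in future work.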
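Both metrics are easy to compute from the confusion counts. The sketch below uses plain Python with 1 for "rise" and 0 for "fall"; the example labels are made up, and returning 0 when the MCC denominator is zero is a common convention adopted here as an assumption.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, 1 = rise, 0 = fall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0               # assumed convention when a row or column is empty
    return (tp * tn - fp * fn) / denom

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))  # 0.75
print(mcc(y_true, y_pred))       # 0.5
```

Unlike Accuracy, MCC stays near 0 for a model that always predicts the majority class, which is why it is informative when rises and falls are imbalanced.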

GCNET

The authors selected 93 stocks from the Nasdaq stock exchange, used daily stock data from 2011 to 2020, and compared GCNET with other common models as follows:

Figure 5 Performance of GCNET on the Nasdaq dataset

The prediction results of the top 20 stocks on Nasdaq are compared as shown in the figure:

Figure 6 Comparison of GCNET model prediction results

GCNET performs better than the other baseline models, especially on the MCC metric, showing a higher success rate in predicting both the rise and fall classes. Graph-based methods can make full use of the graph structure to infer labels: even if some initial labels are wrong, the model can correct them.

Figure 7 Label node ratio and accuracy

The initially labeled nodes in the model account for roughly 30% of all stocks. If the ratio is too low, the information is insufficient; if it is too high, the labels carry too much noise and predictive ability is limited.

Multi-GCGRU

The authors use CSI 300 and CSI 500 data from June 2015 to December 2019. The comparison with other models is as follows:

Figure 8 Multi-GCGRU performance

Multi-GCGRU performs better than the compared models. Comparative experiments with models built on different graphs show that the topicality graph has the strongest influence and the share-holding graph the weakest. This may be because China's stock market has a large share of retail investors, who are more sensitive to news.

Figure 9 Multi-GCGRU performance of different durations

For the length of the historical data window, 7 days gives the best forecasting accuracy on both the CSI 300 and the CSI 500.


Part IV: Summary

The interconnected nature of stocks in the market makes them well suited to modeling with graph structures, and graph neural networks give researchers a new way to analyze the relationships between stocks. There are many graph-based models, such as the graph attention network (GAT) and GraphSAGE. This article selects only papers that apply GCN to predicting stock movements and gives a brief overview of them.

The two papers improve on GCN in different ways. GCNET uses historical data to build an influence network and turns the original problem into a semi-supervised label prediction problem; Multi-GCGRU designs three kinds of relationship networks and uses a GRU to model temporal information. Although accuracy still has room for improvement, both provide useful ideas for further research, and their easy extensibility leaves room for further gains.

Markets carry risk, and investment requires caution. The above is only a review of historical events; it does not represent a view on the future and does not constitute investment advice.


References

[1] A Gentle Introduction to Graph Neural Networks (distill.pub)

[2] Ye J, Zhao J, Ye K, et al. Multi-graph convolutional network for relationship-driven stock movement prediction[C]//2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021: 6702-6709.

[3] Jafari A, Haratizadeh S. GCNET: graph-based prediction of stock price movement using graph convolutional network[J]. The Journal of Financial Data Science, 2022, 4(4): 152-166. DOI: 10.3905/jfds.2022.1.104

[4] Chen Q, Robert C Y. Graph-Based Learning for Stock Movement Prediction with Textual and Relational Data[J]. The Journal of Financial Data Science, 2022, 4(4): 152-166.

Origin blog.csdn.net/FrankieHello/article/details/130355560