A beginner's understanding of GCN (graph convolutional networks) (1)

Over the past two days I have needed the GCN model for an article I am writing, so I have been looking for relevant material to study. I will record my learning notes here; if anything is wrong, corrections are welcome.

GCN can be seen as a type of GNN, so in related videos you will often see bloggers explain the model starting from the most basic GNN.

First, here are a few articles and videos that I think are particularly good:

Articles

https://distill.pub/2021/gnn-intro/

This article is posted on Distill. It contains many visual diagrams and interactive explanations, and many YouTubers have analyzed and interpreted it, so it is highly recommended. Note, however, that it mainly focuses on GNNs in general and only briefly touches on GCN.

Understanding Convolutions on Graphs

A second article by the same authors, which explains graph convolutions and works well as follow-up reading.

Video

There are several videos on YouTube interpreting GCN that are very clear and vivid:

https://www.youtube.com/watch?v=uMtDrG107Ws

Below are some of my reading notes. The pictures are from the two articles and the video above.

First, what is a GNN?

"A GNN is an optimizable transformation on all attributes of the graph (nodes, edges, global-context) that preserves graph symmetries (permutation invariances)."

In other words: a GNN is an optimizable transformation of all attributes of the graph (nodes, edges, global context) that preserves the symmetries of the graph (permutation invariance). It transforms the graph's attributes without changing the graph's structure.
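Permutation invariance can be seen in a tiny toy example (my own, not from the article): a graph-level readout such as sum-pooling over node features gives the same answer no matter how the nodes are ordered.

```python
import numpy as np

# 3 nodes, 2 features each -- arbitrary toy values
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

perm = [2, 0, 1]          # an arbitrary reordering of the nodes
X_perm = X[perm]

readout = X.sum(axis=0)           # graph-level feature vector
readout_perm = X_perm.sum(axis=0)

print(np.allclose(readout, readout_perm))  # True: order does not matter
```

Operations that depend on node order (e.g. flattening X into one long vector) would break this property, which is why GNN layers are built from order-independent aggregations.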

The main categories of GNN are: 1) Recurrent GNNs; 2) Convolutional GNNs; 3) Graph Autoencoders (GAE); 4) Spatial-temporal GNNs.

A graph neural network follows the principle of graph-in, graph-out: what goes in is a graph, and what comes out as a prediction is also a graph. For data processing, GNNs also follow the idea of "turning things (data) into numbers, and finding the patterns in those numbers". So when building one, the main task is to find the appropriate data to build, train and validate the model.

Here the author uses several real-life examples of a graph G, such as article citations, people's social circles, Baidu link entries, etc. Graphs can be directed or undirected. For example, in WeChat Moments (special cases such as blocking and deletion aside), friendship is mutual, so the resulting graph is undirected. On platforms like Weibo or Bilibili, however, the people you follow do not necessarily follow you back, so the resulting graph is directed.

A graph G mainly contains three kinds of attributes, G = (V, E, U): 1) V: node attributes; 2) E: edge attributes; 3) U: global attributes. In actual calculations the first two are used most often, and the last is optional.

The matrix describing this connectivity is called the adjacency matrix. It is worth noting that when the vertices are ordered differently, the resulting matrices are likely to differ: as shown in the figure below, the matrices represent the same graph, but different vertex orderings produce different matrices. It can also be seen that the resulting matrix is usually quite sparse.
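A quick sketch of this point (the graph and orderings are my own toy choices): building the adjacency matrix of the same graph under two different vertex orderings gives two different matrices.

```python
import numpy as np

# A 4-node path graph: 0 - 1 - 2 - 3
edges = [(0, 1), (1, 2), (2, 3)]
n = 4

def adjacency(edges, n, order):
    """Adjacency matrix when position k holds node order[k]."""
    pos = {node: k for k, node in enumerate(order)}
    A = np.zeros((n, n), dtype=int)
    for u, v in edges:
        A[pos[u], pos[v]] = 1
        A[pos[v], pos[u]] = 1   # undirected: symmetric
    return A

A1 = adjacency(edges, n, order=[0, 1, 2, 3])
A2 = adjacency(edges, n, order=[3, 1, 0, 2])

print((A1 == A2).all())   # False: same graph, different matrices
```

Both matrices are symmetric and have the same number of edges; only the labeling of rows and columns differs.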

In this case, a more efficient storage format can be used: an adjacency list, as shown in the figure below. Stored this way, no matter how the order of the nodes is permuted, only the corresponding node indices need to be relabeled accordingly.
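A minimal sketch of this storage scheme (the dictionary layout and feature sizes are my own choices, roughly following the Distill article's nodes/edges/adjacency-list split): connectivity is kept as an edge list, so storage is O(V + E) instead of O(V²), and reordering nodes is just relabeling indices.

```python
import numpy as np

# One graph stored as feature arrays plus an edge list
graph = {
    "edges": [(0, 1), (1, 2), (2, 3)],       # connectivity only
    "node_feats": np.zeros((4, 8)),          # one vector per node
    "edge_feats": np.zeros((3, 8)),          # one vector per edge
}

# Renaming nodes only touches the indices in "edges"
relabel = {0: 2, 1: 0, 2: 3, 3: 1}
edges_renamed = [(relabel[u], relabel[v]) for u, v in graph["edges"]]
print(edges_renamed)   # [(2, 0), (0, 3), (3, 1)]
```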

The author then introduces the simplest GNN model:

For prediction on graphs, we generally predict attributes of the nodes, the edges and the global context. In this simplest GNN model, a separate MLP is applied to the node, edge and global features respectively, and the final prediction is obtained from their outputs.

In fact, the three kinds of attributes V, E and U can be converted into one another: node features can be pooled into edge features, and edge features can be pooled into node features. This is useful when a node or edge has no features of its own but a prediction is still needed. The process is as follows.
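The edge-to-node pooling step can be sketched like this (a toy triangle graph with scalar edge features, my own numbers): each node sums the features of all edges incident to it.

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 0)]        # a triangle on 3 nodes
edge_feats = np.array([[1.0], [2.0], [4.0]])
n_nodes = 3

# Pool edge features onto nodes: each edge contributes its feature
# vector to both of its endpoints
node_feats = np.zeros((n_nodes, 1))
for (u, v), f in zip(edges, edge_feats):
    node_feats[u] += f
    node_feats[v] += f

print(node_feats.ravel())   # [5. 3. 6.]
```

Sum-pooling is one common choice; mean or max aggregation works the same way and keeps the operation permutation-invariant.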

At this point GCN is introduced: when pooling is performed over edges and nodes respectively, the operation is somewhat similar to a convolution, and this is where graph convolution comes in.
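A minimal sketch of one GCN layer, following the standard Kipf & Welling propagation rule H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W); the graph, feature sizes and weights here are my own toy choices:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer on dense matrices."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)                   # node degrees
    D_inv_sqrt = np.diag(d ** -0.5)
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # aggregate, transform, ReLU

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)      # 3-node path graph
H = np.random.rand(3, 4)                    # input node features
W = np.random.rand(4, 2)                    # learnable weights

H_out = gcn_layer(A, H, W)
print(H_out.shape)   # (3, 2)
```

Each node's output mixes its own features with its neighbors', weighted by degree, which is the "convolution" analogy: a shared local filter applied at every node.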

There are a few other concepts that are important in the calculation process.

Batch size: when training a model on data, samples are usually not processed one by one but in batches. The batch size generally depends on your machine's capacity and the properties of your graph; common values include 32, 64, 128, etc.
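For graphs, a common batching trick (not spelled out in the article; this is a standard technique) is to pack several graphs into one large graph with a block-diagonal adjacency matrix, so a single forward pass covers the whole batch:

```python
import numpy as np

def batch_graphs(adjs):
    """Stack adjacency matrices block-diagonally into one big graph."""
    n = sum(a.shape[0] for a in adjs)
    big = np.zeros((n, n))
    offset = 0
    for a in adjs:
        k = a.shape[0]
        big[offset:offset + k, offset:offset + k] = a
        offset += k
    return big

A1 = np.array([[0, 1], [1, 0]], dtype=float)               # 2-node graph
A2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-node graph

A_batch = batch_graphs([A1, A2])
print(A_batch.shape)    # (5, 5)
print(A_batch[0, 2])    # 0.0 -- no edges across graphs, so message
                        # passing never mixes the two graphs
```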

Multigraph: there may be more than one edge between a pair of nodes, and these parallel edges can form different graphs, each represented by a different adjacency matrix. A GNN built for this case is a multigraph GNN.

Common Python packages used to build GNN models include PyTorch, TensorFlow and scikit-learn.


Originally published at blog.csdn.net/weixin_44897685/article/details/130688318