Detailed Explanation of Graph Neural Networks (GNNs) in Deep Learning

Introduction 

picture

A graph neural network (GNN) is a branch of deep learning.

The four branches of deep learning correspond to four common data formats: a feedforward neural network (FNN) processes tabular data, such as feature vectors; a convolutional neural network (CNN) processes image data; a recurrent neural network (RNN) processes time-series data; and a graph neural network (GNN) processes graph data.

The FNN here refers to the network with input layer, hidden layer and output layer as shown in the figure below.

picture

But these four correspondences are not fixed.

If you flatten an image into a vector, you can also use an FNN to process it.

RNN-type networks are best suited to sequential data such as sound and text, but if you treat a sentence as a 1×N matrix, you can also process it with a CNN. However, a CNN needs to stack multiple layers to capture contextual information in sequential data; the span of input that each output can see is what the image field calls the receptive field.

Note: Although sentences can be expressed as matrices through certain techniques and then processed by CNN, in essence, CNN is still not as suitable as RNN for processing time series data.

If an image is divided into patches, these patches form a sequence, which can also be processed with a Transformer. This is the Vision Transformer (ViT).
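As a rough sketch of how an image becomes a sequence of patches, the snippet below uses NumPy. The function name `image_to_patches`, the 32×32 image size, and the 8-pixel patch size are illustrative assumptions, not values from the original article.

```python
import numpy as np

def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into a sequence of flattened, non-overlapping patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    # One row per patch, read left to right and top to bottom.
    return patches.reshape(rows * cols, patch_size * patch_size * c)

image = np.zeros((32, 32, 3), dtype=np.float32)   # placeholder image
sequence = image_to_patches(image, patch_size=8)
print(sequence.shape)  # (16, 192)
```

Each row of `sequence` plays the role of one token, so the result can be fed to a sequence model in the same way as a sentence.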

The GNN we are going to talk about today can process not only tabular data, but also images and text.

More precisely, as long as it can be expressed as a GNN-compatible graph data structure, it can be processed by GNN.

Today’s article is based on a blog post published by Distill: A Gentle Introduction to Graph Neural Networks.

URL: https://distill.pub/2021/gnn-intro/
Video walkthrough by Mu Shen (沐神) on Bilibili: https://www.bilibili.com/video/BV1iT4y1d7zP/?spm_id_from=333.999.0.0

This article is very well written. Its highlights include not only interactive charts but also a playground, and it is clear that the authors put a lot of effort into it.

Let’s first look at what a playground is.

A playground usually refers to a system or platform for interactively trying out and visualizing models in machine learning and artificial intelligence.

Two AI Playgrounds are shown below.

picture

https://catalog.ngc.nvidia.com/orgs/nvidia/teams/playground/models/clip

picture

https://catalog.ngc.nvidia.com/orgs/nvidia/teams/playground/models/codellama

NVIDIA NGC and Hugging Face host more fun AI playgrounds that you can try out yourself: https://huggingface.co/

https://catalog.ngc.nvidia.com/

A Gentle Introduction to Graph Neural Networks

https://distill.pub/2021/gnn-intro/

I will not copy the article above as it is, but will give a general summary. Everyone must read the original article, because its many interactive charts and playgrounds can speed up your understanding of GNNs.

Today’s protagonist is the GNN, and the object a GNN processes is the graph. As with other neural networks, traditional methods for handling the corresponding data structure existed before GNNs emerged.

However, with more data and computing power available, neural networks emerged to handle more complex tasks, and the same is true for GNNs.

The rest of this article is divided into four parts:

1. What kind of data can be naturally represented as a graph?

2. The difference between graphs and other types of data and how to deal with this difference

3. Build a GNN

4. Build a playground to train on real data

What is a graph?

A graph is a data structure composed of nodes and edges that represents relationships between objects.

Graphs are divided into directed graphs and undirected graphs. For example, on social media, if we follow each other, the relationship forms an undirected graph. If, as on Bilibili, I follow you but you do not follow me, it forms a directed graph.

picture

picture

Nodes, edges, and the entire graph each have their own attributes. As shown in the figure above, attributes can be represented as scalars or vectors, and the vector sizes used for nodes, edges, and the graph can differ.
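To make the description above concrete, here is a minimal sketch of a graph container in plain Python. The class name `Graph` and its fields are hypothetical and chosen only for illustration; they do not come from any library.

```python
from dataclasses import dataclass, field

@dataclass
class Graph:
    """A tiny graph container; the field names are illustrative, not a standard API."""
    node_attrs: dict                                   # node id -> attribute vector
    edge_attrs: dict = field(default_factory=dict)     # (src, dst) -> attribute vector
    global_attr: list = field(default_factory=list)    # attribute of the whole graph
    directed: bool = True

    def add_edge(self, src, dst, attr):
        self.edge_attrs[(src, dst)] = attr
        if not self.directed:
            # An undirected edge is stored in both directions.
            self.edge_attrs[(dst, src)] = attr

    def neighbors(self, node):
        """Nodes reachable from `node` by following one edge."""
        return sorted(dst for (src, dst) in self.edge_attrs if src == node)

# Directed: "me" follows "you", but not the other way round.
follows = Graph(node_attrs={"me": [1.0, 0.0], "you": [0.0, 1.0]}, directed=True)
follows.add_edge("me", "you", [1.0])

# Undirected: the two users follow each other.
friends = Graph(node_attrs={"me": [1.0, 0.0], "you": [0.0, 1.0]}, directed=False)
friends.add_edge("me", "you", [1.0])

print(follows.neighbors("you"))  # []
print(friends.neighbors("you"))  # ['me']
```

Note that the node vectors here have length 2 while the edge vectors have length 1, matching the point that attribute sizes need not agree.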

It has been revealed earlier that GNN can process images and text, so images and text can be represented as graphs.

First, let me explain how computers store graphs. One option is the adjacency matrix, and the other is the adjacency list.

Each pixel of an image is a node, and the attribute of the node is naturally its RGB value. Adjacent pixels are connected by an edge that represents the adjacency relationship, and here an adjacency matrix is used to represent connectivity.

picture
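The image-as-graph idea can be sketched as follows. This is a minimal example that assumes each pixel is linked only to its up, down, left, and right neighbours; diagonal neighbours could be included as well. The helper name `grid_adjacency` is hypothetical.

```python
import numpy as np

def grid_adjacency(height, width):
    """Adjacency matrix for pixels on a grid, linking 4-neighbours (an assumption)."""
    n = height * width
    adj = np.zeros((n, n), dtype=np.int8)
    for r in range(height):
        for c in range(width):
            i = r * width + c                 # node index of pixel (r, c)
            if c + 1 < width:                 # neighbour to the right
                adj[i, i + 1] = adj[i + 1, i] = 1
            if r + 1 < height:                # neighbour below
                adj[i, i + width] = adj[i + width, i] = 1
    return adj

# A random 3x3 RGB image stands in for real data.
image = np.random.default_rng(0).integers(0, 256, size=(3, 3, 3))
node_features = image.reshape(-1, 3)          # one RGB vector per node
adj = grid_adjacency(3, 3)

print(node_features.shape)  # (9, 3)
print(adj.shape)            # (9, 9)
```

Even for this 3×3 image the matrix already has 81 cells, of which only 24 are non-zero.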

In the same way, each character, each word, and each Token in a sentence can be regarded as a node, and adjacent nodes can be connected by a directed edge, which forms a directed graph.

picture
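A sentence can be turned into a directed graph in a few lines. This sketch assumes whitespace tokenization and an example sentence chosen for illustration; a real pipeline would use a proper tokenizer.

```python
sentence = "graphs are all around us"
tokens = sentence.split()                      # each token becomes a node

# A directed edge points from every token to the token that follows it.
edges = [(i, i + 1) for i in range(len(tokens) - 1)]

# The same connectivity as an adjacency matrix (row = source, column = target).
n = len(tokens)
adj = [[0] * n for _ in range(n)]
for src, dst in edges:
    adj[src][dst] = 1

print(edges)  # [(0, 1), (1, 2), (2, 3), (3, 4)]
```

Because the edges are directed, `adj[0][1]` is 1 while `adj[1][0]` stays 0.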

In addition, there are many other data that can be represented as graphs.

A molecule is composed of two or more atoms held together by chemical bonds formed through shared electron pairs. Treating atoms as nodes and bonds as edges gives a graph, and different ways of connecting the atoms constitute different substances.

picture

Society is a big family, and all members and the relationships between them constitute a complex social network.

picture

If this network can be fully utilized, it can deliver great value, for example in knowledge graphs and recommendation systems.

Citation relationships can also be organized into graphs. For example, every page in Wikipedia lists citations.

Okay, so far we have been able to represent many things in the form of graphs, which is enough to use GNN to process them. So what can we do?

GNN can mainly handle three types of tasks.

Graph-level tasks

The GNN predicts an attribute of the entire graph. For example, in the figure below, the goal is to identify the graphs that contain two rings. This is a classification task over the entire graph.

picture

For images and text, graph-level tasks are similar to MNIST digit classification or sentence sentiment analysis, for example determining whether a sentence expresses positive or negative emotion.

Node-level tasks

GNN predicts the attributes of each node in the graph. For example, in the figure below, it predicts whether a member of the social network is a fraudster. This is a node classification task.

picture

For images and text, node-level tasks are similar to image segmentation, which classifies each pixel. For text, the analogous task is predicting the part of speech of each word in a sentence (noun, verb, adverb, and so on).

Edge-level tasks

The existence or absence of edges is a structural attribute, that is, connectivity. GNN can also predict non-structural attributes of edges in the graph. What are non-structural attributes? For a social network, edges represent relationships between nodes, but how to measure closeness? This involves the non-structural properties of edges.

Predicting closeness is a regression task, while the figure below is a classification task. Each edge can only be one of watching, fighting, and standing on.

picture

picture

We will see later that these three types of tasks can be solved with a unified GNN.

Challenges faced by GNN

A GNN can handle attribute prediction for nodes, edges, and graphs, but representing the connectivity between nodes is a challenge. As mentioned before, an adjacency matrix can be used to represent connectivity. When the number of nodes increases, the matrix becomes very sparse and memory utilization is very low.
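A quick back-of-the-envelope calculation shows how sparse the matrix gets. The network size below (one million users, five million friendships) is a made-up example, and the helper name is hypothetical.

```python
def adjacency_matrix_density(num_nodes, num_edges):
    """Fraction of non-zero cells in the adjacency matrix of an undirected graph."""
    total_cells = num_nodes * num_nodes
    nonzero_cells = 2 * num_edges   # each undirected edge fills cells (i, j) and (j, i)
    return nonzero_cells / total_cells

# Hypothetical social network: one million users with five million friendships.
density = adjacency_matrix_density(1_000_000, 5_000_000)
print(f"{density:.6%} of the cells are non-zero")
```

In this example the dense matrix would hold 10^12 cells to describe only 10^7 non-zero entries.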

For example, in the figure below, even a graph with only four nodes corresponds to a large number of adjacency matrices, one for each ordering of the nodes.

What about more nodes?

picture

Also, when you reorder the nodes, which permutes the rows and columns of the matrix, the GNN output should not change, because the reordering does not change the relationships between nodes.

picture

picture
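Reordering the nodes permutes the rows and columns of the adjacency matrix without changing the graph itself. The sketch below illustrates this on a made-up four-node graph, using a simple sum-based readout as a stand-in for a network that respects this invariance; the `readout` function is an illustrative assumption, not a full GNN.

```python
import numpy as np

# A small undirected graph: node 1 is connected to nodes 0, 2 and 3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 1],
                [0, 1, 0, 0],
                [0, 1, 0, 0]])
features = np.array([[1.0], [2.0], [3.0], [4.0]])

# Relabel the nodes: new node i is old node perm[i].
perm = np.array([2, 0, 3, 1])
P = np.eye(4, dtype=int)[perm]          # permutation matrix
adj_perm = P @ adj @ P.T                # same graph, different matrix
features_perm = P @ features

def readout(adj, x):
    """Sum each node's neighbours, then sum over all nodes (order-independent)."""
    return float((adj @ x).sum() + x.sum())

print(np.array_equal(adj, adj_perm))                            # False
print(readout(adj, features) == readout(adj_perm, features_perm))  # True
```

The two matrices differ cell by cell, yet any operation built from sums over neighbours and sums over nodes gives the same answer for both.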

To address the challenges posed by the adjacency matrix, adjacency lists came into being.

picture
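The adjacency-list idea can be sketched as storing one (i, j) pair per edge instead of a full matrix. The helper name `matrix_to_adjacency_list` and the four-node graph are illustrative assumptions.

```python
def matrix_to_adjacency_list(adj):
    """List each undirected edge once as an (i, j) pair with i < j."""
    n = len(adj)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if adj[i][j]]

adj = [[0, 1, 0, 0],
       [1, 0, 1, 1],
       [0, 1, 0, 0],
       [0, 1, 0, 0]]

adjacency_list = matrix_to_adjacency_list(adj)
print(adjacency_list)  # [(0, 1), (1, 2), (1, 3)]
```

The list grows with the number of edges rather than with the square of the number of nodes, so three pairs replace sixteen matrix cells here. The k-th pair can also line up with the k-th row of an edge-attribute array.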

Now that we have cleared these obstacles, let's build a simple classification network to see how a GNN makes predictions.

picture

We said before that the attributes of nodes, edges, and graphs are a bunch of vectors, which are the tabular data mentioned earlier, so you can use an MLP to process them (other networks would also work). As shown in the figure above, nodes, edges, and the graph each have their own MLP, and the parameters are shared within each type: all nodes share one MLP, and all edges share one MLP.

In addition, compared with the previous layer, Layer n+1's graph connectivity has not changed, only its attributes have changed. That is, you can continue to use the initial adjacency list.

The figure above constitutes a GNN block, also called a GNN layer, similar to a convolutional layer in a CNN. Multiple GNN layers can be stacked to form a more complex network.
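A minimal NumPy sketch of such a GNN layer follows. It assumes two-layer MLPs with ReLU and random, untrained weights; the dimensions and the names `make_mlp`, `mlp`, and `gnn_layer` are illustrative choices, not the original article's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(in_dim, hidden_dim, out_dim):
    """Parameters of a two-layer MLP (randomly initialised, untrained)."""
    return {
        "w1": rng.normal(size=(in_dim, hidden_dim)) * 0.1,
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(size=(hidden_dim, out_dim)) * 0.1,
        "b2": np.zeros(out_dim),
    }

def mlp(params, x):
    hidden = np.maximum(x @ params["w1"] + params["b1"], 0.0)  # ReLU
    return hidden @ params["w2"] + params["b2"]

def gnn_layer(params, nodes, edges, global_attr):
    """Update node, edge and global attributes independently of each other."""
    return (
        mlp(params["node"], nodes),          # one MLP shared by all nodes
        mlp(params["edge"], edges),          # one MLP shared by all edges
        mlp(params["global"], global_attr),  # one MLP for the global attribute
    )

# 4 nodes, 3 edges, 1 global vector; attribute sizes differ on purpose.
nodes = rng.normal(size=(4, 3))
edges = rng.normal(size=(3, 2))
global_attr = rng.normal(size=(1, 5))
adjacency_list = [(0, 1), (1, 2), (1, 3)]   # unchanged by every layer

layer1 = {"node": make_mlp(3, 8, 8), "edge": make_mlp(2, 8, 8), "global": make_mlp(5, 8, 8)}
layer2 = {name: make_mlp(8, 8, 8) for name in ("node", "edge", "global")}

hidden = gnn_layer(layer1, nodes, edges, global_attr)
out_nodes, out_edges, out_global = gnn_layer(layer2, *hidden)
print(out_nodes.shape, out_edges.shape, out_global.shape)  # (4, 8) (3, 8) (1, 8)
```

The layer never reads `adjacency_list`: only the attributes change, so the same list can be reused after every layer.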

With the GNN block, if you want to build a node binary classification task, then the following network is enough.

picture

However, what should we do if a graph only has information on its edges, but no information is stored on the nodes, and we need to classify the nodes?

In this case the edge information has to be used, which is called information aggregation (pooling).

picture

As shown in the figure above, when predicting a node, the information of all the edges connected to it is aggregated as the attributes of the node.

Possible aggregation operations include average, maximum, and sum.
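Aggregating edge information onto nodes can be sketched as below. The function name `pool_edges_to_nodes`, the toy graph, and the zero-vector fallback for isolated nodes are assumptions made for illustration.

```python
import numpy as np

def pool_edges_to_nodes(edge_attrs, adjacency_list, num_nodes, how="sum"):
    """Give each node the aggregate of the attributes of the edges touching it."""
    reducer = {"sum": np.sum, "mean": np.mean, "max": np.max}[how]
    gathered = [[] for _ in range(num_nodes)]
    for (src, dst), attr in zip(adjacency_list, edge_attrs):
        gathered[src].append(attr)
        gathered[dst].append(attr)
    dim = edge_attrs.shape[1]
    # Isolated nodes (no edges) fall back to a zero vector.
    return np.stack([
        reducer(np.stack(attrs), axis=0) if attrs else np.zeros(dim)
        for attrs in gathered
    ])

adjacency_list = [(0, 1), (1, 2), (1, 3)]
edge_attrs = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])

node_attrs = pool_edges_to_nodes(edge_attrs, adjacency_list, num_nodes=4, how="sum")
print(node_attrs)
```

Node 1 touches all three edges, so with `how="sum"` it receives `[4.0, 5.0]`; switching to `"max"` or `"mean"` changes only the reduction, not the gathering step.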

So, with information aggregation, if there are only edge attributes and the nodes need to be classified, the network structure is as follows:

picture

Similarly, with information aggregation, if there is only node information and the edges must be classified, the network structure is as follows:

picture

With information aggregation, if there is only node information and the graph must be classified:

picture
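The two node-only cases above can be sketched together: node attributes are aggregated onto edges or onto the whole graph, and the result is passed to a classifier. The sum aggregation, the linear softmax classifier with random untrained weights, and all names here are illustrative assumptions.

```python
import numpy as np

def pool_nodes_to_edges(node_attrs, adjacency_list):
    """Each edge receives the sum of the attributes of its two endpoint nodes."""
    return np.stack([node_attrs[src] + node_attrs[dst] for src, dst in adjacency_list])

def pool_nodes_to_global(node_attrs):
    """The graph receives the sum of all node attributes."""
    return node_attrs.sum(axis=0, keepdims=True)

def classify(attrs, weights, bias):
    """Linear layer followed by a softmax over classes."""
    logits = attrs @ weights + bias
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
node_attrs = rng.normal(size=(4, 8))            # e.g. output of the last GNN layer
adjacency_list = [(0, 1), (1, 2), (1, 3)]
weights, bias = rng.normal(size=(8, 2)), np.zeros(2)   # untrained 2-class classifier

edge_probs = classify(pool_nodes_to_edges(node_attrs, adjacency_list), weights, bias)
graph_probs = classify(pool_nodes_to_global(node_attrs), weights, bias)
print(edge_probs.shape, graph_probs.shape)  # (3, 2) (1, 2)
```

Using more than two output columns in `weights` turns the same sketch into a multi-class classifier.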

The final GNN structure is as follows:

picture

With information aggregation, a more complex GNN can be constructed. This network can be used not only for binary classification but also for multi-class problems.

But did you notice that the simple GNN network above does not utilize connectivity information? Each node, each edge and the global context are processed independently. Connectivity is only used when aggregating information to make predictions.

Therefore, we can make further use of aggregation operations to make more complex predictions.

So what to do?

We can use message passing to achieve this: adjacent nodes or edges exchange information and influence the updates of each other's embeddings.

picture

As shown in the figure above, when processing the current node, the information of adjacent nodes is aggregated.
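One message-passing step can be sketched as follows, assuming sum aggregation over neighbours followed by a shared linear map and ReLU. The identity weight matrix is used only to keep the numbers easy to follow; a trained network would learn these weights.

```python
import numpy as np

def message_passing_step(node_attrs, adjacency_list, weights):
    """Each node sums its own embedding with its neighbours', then applies a shared update."""
    aggregated = node_attrs.copy()              # start from the node's own embedding
    for src, dst in adjacency_list:             # undirected: messages flow both ways
        aggregated[src] += node_attrs[dst]
        aggregated[dst] += node_attrs[src]
    return np.maximum(aggregated @ weights, 0.0)  # shared linear map + ReLU

node_attrs = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 3.0]])
adjacency_list = [(0, 1), (1, 2), (1, 3)]
weights = np.eye(2)                              # identity, for readability only

step1 = message_passing_step(node_attrs, adjacency_list, weights)
step2 = message_passing_step(step1, adjacency_list, weights)
print(step1[0])  # [1. 1.]
print(step2[0])  # [4. 5.]
```

After one step node 0 has only seen its direct neighbour, node 1. After two steps its embedding also reflects nodes 2 and 3, which are two hops away, so stacking layers widens the neighbourhood each node can see.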

This is a bit like a convolution operation, where a pixel of a feature map corresponds to a neighboring region of the previous layer.

picture

After adding message passing, our GNN is updated as follows:

Node-level tasks:

picture

Edge-level tasks:

picture

Graph-level tasks:

picture

Finally, a playground is shown that can change the number of layers of the network, the type of aggregation operation, and the length of each attribute vector, and the impact of each parameter can be demonstrated through the final model performance.

picture


Origin blog.csdn.net/qq_39312146/article/details/134477679