A Preliminary Study on the Graph Neural Network - GNN


title: Graph Neural Network (GNN)
date:
tags:
  • essay
  • Knowledge point
categories:
  • [study notes]

A First Look at Graph Neural Networks (GNN)

Article source: https://distill.pub/2021/gnn-intro/

Preface: why write this article? I have been hearing about "Graph Neural Networks" a lot recently without ever understanding what they are, so I am taking this opportunity to get a basic understanding of them. Let's get down to business.

The original article is not divided into chapters; it simply explains the topic from beginning to end, so I have filled in these notes in the same order as the original text.

The article opens with the sentence "We are starting to see......". This opening suggests that graph neural networks are still an emerging field with room to develop, which is both a challenge and an opportunity.

The author explains what a graph neural network is in four parts (these are also the main points of the article):

  1. What kinds of data can be represented as a graph;
  2. How graphs differ from other kinds of data;
  3. Building a GNN;
  4. A GNN playground.

What is a graph

A graph represents the relationships (edges) between entities (nodes), as shown in the following figure:

image-20211129201442254

Here U represents the entire graph (its global attributes).

To make the notions of node, edge, and whole graph more concrete, each of them can be given a vector of attributes:

image-20211129201925513

In addition, a graph can be either undirected or directed:

image-20211129202136128

Representing data as graphs

So how can other kinds of data be represented as a graph?

Images as graphs

We treat each pixel of the image as a node and connect adjacent pixels with edges, so that a graph can be constructed:

image-20211129202820443
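
As a rough sketch of this idea (my own illustration, not code from the article), the snippet below builds the adjacency matrix of an image graph in which every pixel is a node linked to its neighbouring pixels. The function name `image_to_adjacency` and the choice of 8 neighbours per pixel are assumptions made for the example.

```python
import numpy as np

def image_to_adjacency(height, width):
    """Adjacency matrix of a pixel grid; each pixel links to up to 8 neighbours."""
    n = height * width
    adj = np.zeros((n, n), dtype=int)
    for r in range(height):
        for c in range(width):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < height and 0 <= cc < width
                    if (dr, dc) != (0, 0) and inside:
                        adj[r * width + c, rr * width + cc] = 1
    return adj

adj = image_to_adjacency(5, 5)
print(adj.shape)      # (25, 25)
print(adj[12].sum())  # 8: the centre pixel touches 8 neighbours
print(adj[0].sum())   # 3: a corner pixel touches only 3
```

Even for this tiny 5x5 image the matrix already has 625 entries, most of them zero, which hints at why this encoding is rarely used for images in practice.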

Text as graphs

If a graph is used to represent a piece of text, it is a directed graph. Each character, word, or token in a passage becomes a node of the graph, with an edge pointing from each node to the one that follows it:

image-20211129203321360
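
A minimal sketch of the text case, assuming whitespace tokenization and one directed edge from each token to the next. The helper `text_to_edges` is a hypothetical name used only for this illustration.

```python
def text_to_edges(text):
    """Return the tokens and the directed edges (i, i + 1) between consecutive tokens."""
    tokens = text.split()
    edges = [(i, i + 1) for i in range(len(tokens) - 1)]
    return tokens, edges

tokens, edges = text_to_edges("graphs are all around us")
print(tokens)  # ['graphs', 'are', 'all', 'around', 'us']
print(edges)   # [(0, 1), (1, 2), (2, 3), (3, 4)]
```

The edge list is just a straight chain, which is another way to see why a graph adds little over a plain sequence for text.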

Of course, the article also points out that representing images and text as graphs is not how either is usually encoded in practice, because the graph representation is redundant: every image and every text has the same very regular structure.

Other forms of data representation

  • Representation of a molecule as a graph:

image-20211129204058449

  • Representation of social networks:

image-20211129204110758

  • Representation of competition relationships (briefly: this example is a karate club, where the graph records, for each pair of members, whether or not they have interacted with each other):

image-20211129204826923

Problems solved, or applications

So what problems can graphs solve? The tasks can be considered at three levels:

  • graph level (graph-level)
  • node level (node-level)
  • edge level (edge-level)

First, graph-level tasks. One task mentioned in the article is predicting whether a molecular graph contains two rings:

image-20211129205703476

Such problems are analogous to image classification, and to sentiment analysis of text.


For node-level tasks, the example given in the article is the karate club problem. Suppose the instructor Mr. Hi and the administrator John H have a falling out; the nodes represent the club members and the edges represent the interactions between them. The task is then to correctly classify each member as siding with Mr. Hi or with John H:

image-20211129210328340

For images, this is analogous to image segmentation, where each node represents a pixel.

For text, it is analogous to predicting the part of speech of each word in a sentence (part-of-speech tagging).


At the edge level, the article proposes the following scenario: in a game, the characters are represented by nodes, and the edges represent the relationships between them:

image-20211129210829481

In other words, the task becomes predicting the relationships between nodes, which is very similar to relation extraction in knowledge graphs:

image-20211129210943171

Problems in machine learning

The article mainly discusses one difficulty of applying neural networks to graphs: how to represent the graph so that a neural network can use it.

So far, we know that a graph has four kinds of information that we need to consider:

  • nodes;
  • edges;
  • global (whole-graph) attributes;
  • connectivity.

The first three can be represented by vectors. The hardest to represent is the last one, the connectivity of the graph: how do we express which nodes are connected to which?

A natural suggestion is to store connectivity in an adjacency matrix, but this approach has some problems:

  1. For a graph with many nodes, the adjacency matrix becomes too large to store, since its size grows with the square of the number of nodes;

  2. If it is stored as a sparse matrix instead, efficient parallel computation on the GPU is hard to achieve (this is still an open problem);

  3. The same graph can be represented by many different adjacency matrices, one for each ordering of the nodes, so how should the neural network deal with these different inputs? In other words, how can the data be processed so that the result does not depend on the node order?

    image-20211129213719468
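
To make the third problem concrete, here is a small sketch (a made-up 4-node path graph, not an example from the article) showing that relabelling the nodes changes the adjacency matrix even though the graph itself is the same.

```python
import numpy as np

# A small undirected graph with 4 nodes and edges 0-1, 1-2, 2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])

perm = [2, 0, 3, 1]                 # an arbitrary relabelling of the nodes
adj_perm = adj[np.ix_(perm, perm)]  # reorder rows and columns together

same_matrix = np.array_equal(adj, adj_perm)
same_degrees = sorted(adj.sum(axis=0)) == sorted(adj_perm.sum(axis=0))
print(same_matrix)   # False: the two matrices differ entry by entry
print(same_degrees)  # True: an order-independent summary still agrees
```

A network fed the raw matrix would see two different inputs here, which is why a representation that does not depend on node order is wanted.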

The article proposes the following representation:

image-20211129214841530

  • Each node in the graph is represented by a scalar (the nodes are numbered, and each one carries one value);
  • Each edge in the graph is likewise represented by a scalar;
  • Connectivity is represented by an adjacency list. The length of the list equals the number of edges, and each element is a pair giving the two nodes that the corresponding edge connects.
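
The representation above could be written down roughly as follows. The 4-node graph and all attribute values are made up for illustration, and the `neighbours` helper is a hypothetical addition that is not part of the article.

```python
# Hypothetical 4-node graph; the attribute values are invented for this sketch.
graph = {
    "nodes": [0, 1, 1, 0],                       # one scalar attribute per node
    "edges": [2, 1, 1],                          # one scalar attribute per edge
    "adjacency_list": [(0, 1), (1, 2), (2, 3)],  # edge k connects this pair of nodes
    "global": 0,                                 # one attribute for the whole graph
}

def neighbours(graph, node):
    """Nodes that share an edge with `node`, treating edges as undirected."""
    result = []
    for a, b in graph["adjacency_list"]:
        if a == node:
            result.append(b)
        elif b == node:
            result.append(a)
    return result

print(len(graph["adjacency_list"]) == len(graph["edges"]))  # True: one pair per edge
print(neighbours(graph, 1))                                 # [0, 2]
```

Storage grows with the number of edges rather than with the square of the number of nodes, and shuffling the edge order only requires shuffling the `edges` list in the same way.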

What is a graph neural network?

The article first gives a concept of GNN:

A GNN is an optimizable transformation on all attributes of the graph (nodes, edges, global-context) that preserves graph symmetries (permutation invariances).

In other words, a graph neural network is a learnable transformation of the graph's attributes that preserves the graph's symmetries.

Symmetry here means that reordering the nodes leaves the structure of the graph unchanged, so the result should not change either.

The article builds its GNNs with the "message passing neural network" framework, although other architectures can of course be used. In this framework the input is a graph and the output is still a graph.

The simplest GNN

Let's construct the simplest possible GNN. As shown in the figure below, we build a separate multi-layer perceptron (MLP) for the vertex vectors (the node vectors mentioned earlier), the edge vectors, and the global vector.

image-20211130161320595

In this way, three MLPs make up one GNN layer.

This layer feeds the vertex vectors, edge vectors, and global vector into their respective MLPs and outputs the resulting graph. Only the attributes of the graph are changed; its connectivity (structure) stays the same.
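
A minimal NumPy sketch of such a layer, under my own assumptions: untrained random weights, a two-layer MLP with ReLU, and the hypothetical names `make_mlp` and `simple_gnn_layer`. It is meant only to show the data flow, not a real implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(in_dim, hidden_dim, out_dim):
    """Return a two-layer perceptron (ReLU in between) applied row by row."""
    w1 = rng.normal(size=(in_dim, hidden_dim))
    w2 = rng.normal(size=(hidden_dim, out_dim))
    return lambda x: np.maximum(x @ w1, 0.0) @ w2

def simple_gnn_layer(nodes, edges, global_vec, mlps):
    """Apply a separate MLP to node, edge and global vectors; connectivity is untouched."""
    f_v, f_e, f_u = mlps
    return f_v(nodes), f_e(edges), f_u(global_vec)

nodes = rng.normal(size=(4, 3))       # 4 nodes, 3 features each
edges = rng.normal(size=(3, 2))       # 3 edges, 2 features each
global_vec = rng.normal(size=(1, 5))  # 1 global vector, 5 features
mlps = (make_mlp(3, 8, 6), make_mlp(2, 8, 6), make_mlp(5, 8, 6))

new_nodes, new_edges, new_global = simple_gnn_layer(nodes, edges, global_vec, mlps)
print(new_nodes.shape, new_edges.shape, new_global.shape)  # (4, 6) (3, 6) (1, 6)
```

Note that the adjacency list never appears in the function: the layer changes the feature vectors only, which is exactly why the structure of the graph is preserved.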

How to predict

Next, how do we make predictions?

  • Simple case

Consider the simplest binary classification problem first. We already have a vector for every vertex, so we can append a fully connected layer with an output size of 2 and then apply a softmax to obtain the classification.

Similarly, for a multi-class problem with n classes, we only need to append a fully connected layer with an output size of n, followed by a softmax.

For regression problems, a single fully connected layer with an output size of 1 is enough.

The figure below shows this: given the output of the last GNN layer (a graph), the vertex vectors are fed into the fully connected layer to obtain the predictions.

Note that all nodes share the parameters of the same fully connected layer; similarly, all edges share the parameters of one fully connected layer.

image-20211130163028677
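
The prediction step for the binary case might look like the sketch below. The weights are random rather than trained, and the sizes (4 nodes, 6 features) are assumptions chosen for the example.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
node_vectors = rng.normal(size=(4, 6))  # output of the last GNN layer: 4 nodes, 6 features
w = rng.normal(size=(6, 2))             # one weight matrix shared by every node
b = np.zeros(2)                         # shared bias, output size 2 for two classes

probs = softmax(node_vectors @ w + b)   # shape (4, 2): class probabilities per node
pred = probs.argmax(axis=1)             # predicted class index for each node
print(probs.shape, pred.shape)          # (4, 2) (4,)
```

Because `w` and `b` are applied to every row, the number of parameters does not depend on how many nodes the graph has.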

  • Complex case

What if the vertices carry no information (no vertex vectors), but we still need to make predictions for them?

The method mentioned in the article is pooling, which works in two steps:

  1. For the elements to be pooled (here, the edges adjacent to the vertex that has no information), collect their vector representations;
  2. Sum all the collected vectors to get a new vector, which serves as the vertex vector we want;
  3. (The global vector is also added in. The text has not explained it yet, but what it is will be covered under the sharing of global information.)

image-20211130165121454

Written as a formula, this is shown in the following figure:

image-20211130170143142

The figure above covers the case where there is edge information but no vertex information. For the opposite case, with vertex information but no edge information, the following can be used:

image-20211130170350855

So whichever kind of data is missing, we can obtain it from the other kinds by pooling.
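
The edge-to-vertex pooling steps can be sketched as follows, assuming sum pooling over an adjacency list of undirected edges and leaving out the global vector for brevity. `pool_edges_to_nodes` is a hypothetical helper name.

```python
import numpy as np

def pool_edges_to_nodes(edge_vectors, adjacency_list, num_nodes):
    """For each node, sum the vectors of all edges that touch it."""
    pooled = np.zeros((num_nodes, edge_vectors.shape[1]))
    for k, (a, b) in enumerate(adjacency_list):
        pooled[a] += edge_vectors[k]
        pooled[b] += edge_vectors[k]
    return pooled

adjacency_list = [(0, 1), (1, 2), (2, 3)]
edge_vectors = np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [2.0, 2.0]])

node_vectors = pool_edges_to_nodes(edge_vectors, adjacency_list, num_nodes=4)
print(node_vectors)
# Node 1 touches edges 0 and 1, so its vector is [1, 0] + [0, 1] = [1, 1].
```

Pooling in the other direction (vertices to edges) would sum the two endpoint vectors of each edge in the same way.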

To summarize the simple GNN, see the figure below. A graph is taken as input and transformed by the MLPs. If some attributes are missing, a pooling layer fills them in, and a classification layer then produces the final prediction.

image-20211130171021377

Message passing

The approach above has a major limitation: it cannot use the structure of the graph when making predictions. It only transforms the node, edge, and global attributes independently, without ever using the connectivity. Message passing solves this problem.

In fact, message passing is very similar to pooling. The process is shown in the following figure (each node's vector is summed with the vectors of its adjacent nodes):

image-20211130172512604

Written as a formula:

image-20211130173447125
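
One message passing step over the nodes could be sketched like this. I am assuming sum aggregation that includes the node's own vector, and I omit the MLP that would normally be applied to the aggregated result; `message_passing_step` is a hypothetical name.

```python
import numpy as np

def message_passing_step(node_vectors, adjacency_list):
    """Each node's new vector is its own vector plus the sum of its neighbours' vectors."""
    aggregated = node_vectors.copy()
    for a, b in adjacency_list:
        aggregated[a] += node_vectors[b]
        aggregated[b] += node_vectors[a]
    return aggregated

adjacency_list = [(0, 1), (1, 2), (2, 3)]
node_vectors = np.array([[1.0], [2.0], [3.0], [4.0]])

out = message_passing_step(node_vectors, adjacency_list)
print(out.ravel())  # [3. 6. 9. 7.]
```

After one step node 0 only knows about node 1; stacking k such steps lets information travel k edges away.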

The pooling described above is performed after the final layer. Is it possible to use the same method to fill in the attributes earlier, inside the GNN layers?

The answer is yes.

Based on this method, information in the graph can be shared between vertices and edges, which is essentially vector addition. Only adjacent elements are combined: the vectors of the edges connected to a vertex V are added to the vector of V, and in the same way the vectors of an edge's endpoint vertices are added to the vector of that edge.

image-20211130174210484

However, whether it is better to pass information from vertices to edges first or from edges to vertices first is not settled (the two orders produce different results). The article also describes other options, such as alternating between the two kinds of update, which will not be discussed here.
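
A toy sketch of why the order matters, using made-up scalar attributes on a 3-node path graph and plain addition as the update (an assumption; a real layer would also apply learned transformations).

```python
import numpy as np

adjacency_list = [(0, 1), (1, 2)]
nodes = np.array([1.0, 2.0, 3.0])  # one scalar per node
edges = np.array([10.0, 20.0])     # one scalar per edge

def nodes_to_edges(nodes, edges):
    """Add the values of both endpoint nodes to each edge."""
    return np.array([edges[k] + nodes[a] + nodes[b]
                     for k, (a, b) in enumerate(adjacency_list)])

def edges_to_nodes(nodes, edges):
    """Add the values of all incident edges to each node."""
    out = nodes.copy()
    for k, (a, b) in enumerate(adjacency_list):
        out[a] += edges[k]
        out[b] += edges[k]
    return out

# Order A: update edges from nodes first, then nodes from the new edges.
edges_a = nodes_to_edges(nodes, edges)
nodes_a = edges_to_nodes(nodes, edges_a)

# Order B: update nodes from edges first, then edges from the new nodes.
nodes_b = edges_to_nodes(nodes, edges)
edges_b = nodes_to_edges(nodes_b, edges)

print(nodes_a, nodes_b)  # [14. 40. 28.] [11. 32. 23.]
print(edges_a, edges_b)  # [13. 25.] [53. 75.]
```

Whichever attribute is updated second sees the already-updated values of the other, so the two orders end up with different numbers.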

Sharing of global information

In the cases discussed so far there is a problem: vertices or edges that are not directly adjacent cannot share information within a single layer.

This is why the global vector is introduced.

The method proposed in the article is to introduce a global representation U, called the master node or context vector. U is connected to all the edges and vertices in the graph, so it can act as a bridge for passing information. It can be pictured as follows:

image-20211130214542140
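
A rough sketch of U acting as a bridge, assuming node, edge, and global vectors all have the same size and that plain sums are used (a real layer would typically add learned transformations). `share_through_global` is a hypothetical name.

```python
import numpy as np

def share_through_global(node_vectors, edge_vectors, global_vector):
    """Pool all nodes and edges into the global vector, then add it back to every node."""
    new_global = global_vector + node_vectors.sum(axis=0) + edge_vectors.sum(axis=0)
    new_nodes = node_vectors + new_global  # broadcast: every node receives the same vector
    return new_nodes, new_global

node_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edge_vectors = np.array([[0.5, 0.5], [0.5, 0.5]])
global_vector = np.zeros(2)

new_nodes, new_global = share_through_global(node_vectors, edge_vectors, global_vector)
print(new_global)  # [3. 3.]
print(new_nodes)
```

Every node now carries a contribution from every other node and edge after a single step, regardless of how far apart they are in the graph.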

Build a GNN

The original text has an interactive figure in this part, which shows that a GNN is very sensitive to its hyperparameters. I do not fully understand what the figure means, so I will not elaborate here.

The rest of the article contains a lot of further discussion, which I will add when I have time.

Origin blog.csdn.net/c___c18/article/details/131154270