Graph neural network study notes

The core idea is to learn a mapping function f, through which a node v_i in the graph can aggregate its own features x_i and its neighbors' features x_j (j ∈ N(v_i)) to generate a new representation of v_i.
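As a minimal sketch of this idea (my own illustration; the toy graph, features, and weight matrix W below are all hypothetical), one such aggregation step might look like:

```python
import numpy as np

# Hypothetical toy graph: 4 nodes with 3-dimensional features x_i.
X = np.random.rand(4, 3)
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
W = np.random.rand(3, 3)  # learned weight matrix, shared by all nodes

def aggregate(v):
    """New representation of node v: mean of its neighbors' features,
    transformed by W and passed through a nonlinearity."""
    mean_nbrs = X[neighbors[v]].mean(axis=0)
    return np.tanh(W @ mean_nbrs)

h0 = aggregate(0)  # new representation of node 0
```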

The distance between the output embedding vectors should reflect how close the corresponding nodes are in the original graph: similar nodes should be embedded near each other.

The number of layers of a graph neural network actually refers to how far the computation graph extends: a k-layer network aggregates information from nodes up to k hops away.

As shown in the figure above, this is a two-layer network.

Each node has such a computation graph, and each computation graph can be regarded as one training sample.

It is worth mentioning that all the white boxes share one set of weights and all the black boxes share another; that is, within each layer of the computation graph, all nodes share the same model.

It is also worth mentioning that the graph neural network described so far is the most basic form, and the node inputs are just their attribute features (nothing beyond that is involved yet).

Then express it in mathematical form:
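In standard notation (assuming the usual mean-aggregation form), the update is:

$$h_v^{(0)} = x_v, \qquad h_v^{(k)} = \sigma\left(W_k \sum_{u \in N(v)} \frac{h_u^{(k-1)}}{|N(v)|}\right)$$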

The superscript k indicates the k-th layer of the computation graph, and σ is a nonlinear activation.

This is the most critical part, I think.
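In matrix form (assuming the usual degree-normalized formulation), one layer of this update can be written as:

$$H^{(k)} = \sigma\left(D^{-1} A H^{(k-1)} W_k\right)$$

where D is the diagonal degree matrix, so D^{-1}A averages each node's neighbors.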

A is the adjacency matrix, and H^{(k-1)} is the matrix whose rows are the embedding vectors at layer k-1 of the computation graph.

At this point, the content in the box, D^{-1}AH^{(k-1)}, corresponds to the neighbor-averaging sum in the per-node formula above.

 

But obviously there is a problem: this is an extremely crude, forced averaging. In other words, every neighbor gets the same weight, which is not reasonable; the weight should also depend on how many neighbors each neighbor itself has.

So consider not only the node's own degree but also the degree of the other endpoint, i.e. normalize by both: D^{-1}AD^{-1}.

But there is still a problem: compared with the original D^{-1} normalization, the entries of D^{-1}AD^{-1} are 1/(d_u d_v) instead of 1/d_v, so the magnitude after the matrix multiplication will definitely become smaller. To avoid this, take square roots on both sides and use D^{-1/2}AD^{-1/2} instead, whose entries are 1/sqrt(d_u d_v).
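In matrix form, the symmetrically normalized update (the standard GCN layer) then reads:

$$H^{(k)} = \sigma\left(D^{-1/2} A D^{-1/2} H^{(k-1)} W_k\right)$$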

 

The next problem is that the nodes contributing to node A's update are only its neighbors, not node A itself!!! That is absurd!

Fix: modify the adjacency matrix so that the diagonal is all 1s, i.e. add self-loops: Ã = A + I.

Core part!!!

Considering that a node's contribution to itself may differ from its neighbors' contributions to it, a separate weight matrix is used for the self term.
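With a separate self-weight B_k (assuming the common formulation with distinct neighbor and self weights), the per-node update becomes:

$$h_v^{(k)} = \sigma\left(W_k \sum_{u \in N(v)} \frac{h_u^{(k-1)}}{|N(v)|} + B_k h_v^{(k-1)}\right)$$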

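Putting the pieces together, here is a minimal NumPy sketch of one such layer and a two-layer stack (my own illustration under the assumptions above; the graph, features, and weights are all hypothetical):

```python
import numpy as np

def gnn_layer(A, H, W, B):
    """One layer: mean-aggregate each node's neighbors with weight W,
    and add the node's own contribution with a separate weight B."""
    deg = A.sum(axis=1, keepdims=True)       # node degrees
    nbr_mean = (A @ H) / np.maximum(deg, 1)  # row v = mean over N(v)
    return np.tanh(nbr_mean @ W + H @ B)     # sigma(neighbor term + self term)

# Hypothetical example: a 4-node path graph with 3-dimensional features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H0 = np.random.rand(4, 3)                    # H^{(0)} = X
W1, B1 = np.random.rand(3, 3), np.random.rand(3, 3)
W2, B2 = np.random.rand(3, 3), np.random.rand(3, 3)

H1 = gnn_layer(A, H0, W1, B1)                # embeddings after layer 1
H2 = gnn_layer(A, H1, W2, B2)                # embeddings after layer 2 (2-hop info)
```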


Origin blog.csdn.net/weixin_62375715/article/details/130092360