The core idea is to learn a mapping function f, through which each node aggregates its own features and the features of its neighbors (j ∈ N(v)) to generate a new representation of that node.
The distances between the output embedding vectors should reflect the relationships between the corresponding nodes in the original graph: nodes that are close in the graph should be mapped to embeddings that are close in vector space.
The number of layers of a graph neural network actually refers to how far the computation graph extends, i.e., how many hops of neighbors each node aggregates.
As shown in the figure above, it is a two-layer neural network.
Each node has such a computation graph, and each one can be regarded as a training sample.
It is worth mentioning that all white boxes share one set of weights and all black boxes share another; that is, within each layer, a single model is shared across all computation graphs.
It is worth mentioning that the graph neural network described here is the most basic form; node attribute features are not yet involved.
Then, expressed in mathematical form:
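The formula itself appears to have been an image in the original notes. A standard reconstruction of this neighborhood-averaging update (with assumed notation: x_v is node v's input feature, W_k the layer-k weight matrix, σ a nonlinearity) is:

```latex
h_v^{(0)} = x_v, \qquad
h_v^{(k)} = \sigma\!\left( W_k \sum_{u \in N(v)} \frac{h_u^{(k-1)}}{|N(v)|} \right),
\quad k = 1, \dots, K
```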
The superscript k indicates the k-th layer of the computation graph.
This is the most critical part, I think.
A is the adjacency matrix, and H^(k-1) denotes the matrix of embedding vectors at layer k-1 of the computation graph.
At this point, the content in the box corresponds to the part below.
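The point of the matrix form is that multiplying by A sums every node's neighbor embeddings in one step. A small NumPy sketch (the graph and embedding values are made up for illustration):

```python
import numpy as np

# Hypothetical 4-node undirected graph (adjacency matrix A).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Embeddings from layer k-1: one row per node (dimension 2 here).
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 0.0]])

# A @ H sums each node's neighbor embeddings in a single multiplication.
neighbor_sum = A @ H

# Dividing by the degree turns the sum into the average from the text.
deg = A.sum(axis=1, keepdims=True)
neighbor_mean = neighbor_sum / deg
print(neighbor_mean)
```

Row v of `neighbor_mean` is the averaged neighborhood embedding that the layer then transforms with its weight matrix.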
But there is an obvious problem: this is an extremely crude, uniform averaging. Every neighbor gets the same weight, which is not reasonable; the weight should depend on how many neighbors each neighbor itself has (its degree).
That is, the normalization should consider not only a node's own degree but also the degree of each neighbor.
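One natural way to write this (an assumption here, since the original formula is not shown) is to weight each neighbor by the product of the two degrees d_v and d_u:

```latex
h_v^{(k)} = \sigma\!\left( W_k \sum_{u \in N(v)} \frac{h_u^{(k-1)}}{d_v\, d_u} \right)
\quad\Longleftrightarrow\quad
H^{(k)} = \sigma\!\left( D^{-1} A\, D^{-1} H^{(k-1)} W_k^{\top} \right)
```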
But there is still a problem: compared with the original averaging, dividing by both degrees (as in D^{-1} A D^{-1}) makes the magnitude of the matrix product much smaller. To avoid this shrinkage, the square roots of the degrees are used instead.
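This fix is presumably the symmetric normalization popularized by GCNs (Kipf & Welling, 2017): replacing each degree with its square root keeps the eigenvalues of the propagation matrix in [-1, 1], so repeated layers neither blow up nor collapse the embeddings:

```latex
\frac{1}{d_v\, d_u} \;\longrightarrow\; \frac{1}{\sqrt{d_v}\sqrt{d_u}},
\qquad
H^{(k)} = \sigma\!\left( D^{-1/2} A\, D^{-1/2} H^{(k-1)} W_k^{\top} \right)
```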
The next problem is that the nodes aggregated into node A are only its neighbors, not node A itself!
The fix: modify the adjacency matrix so that the diagonal is all 1s, i.e., add a self-loop to every node.
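Putting the two fixes together, a NumPy sketch of the self-loop trick plus symmetric normalization (graph values are hypothetical, as before):

```python
import numpy as np

# Same hypothetical 4-node graph as before.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Add self-loops: set the diagonal to 1 so each node aggregates itself too.
A_tilde = A + np.eye(A.shape[0])

# Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, where D is the
# degree matrix of the self-looped graph.
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt

# Each row of A_hat now gives the aggregation weights for one node,
# including the node itself; the matrix stays symmetric.
print(np.round(A_hat, 3))
```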
This is the core part!
Considering that a node's contribution to itself may differ from its neighbors' contributions to it, a separate weight matrix is used for the self term.
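A minimal sketch of one such layer, assuming (as in common formulations) a weight W for the averaged neighbors and a separate weight B for the node's own embedding; all names and sizes here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def gnn_layer(A, H, W, B):
    """One layer with a separate weight B for the node's own embedding.

    Neighbors are averaged and transformed by W, the self term is
    transformed by B, and the two are summed and passed through ReLU.
    """
    deg = A.sum(axis=1, keepdims=True)
    neighbor_mean = (A @ H) / np.maximum(deg, 1)  # avoid divide-by-zero
    return np.maximum(neighbor_mean @ W + H @ B, 0.0)  # ReLU

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = rng.normal(size=(4, 2))   # layer k-1 embeddings
W = rng.normal(size=(2, 2))   # neighbor weight
B = rng.normal(size=(2, 2))   # self weight
H_next = gnn_layer(A, H, W, B)
print(H_next.shape)
```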