Paper address: A Generalization of Convolutional Neural Networks to Graph-Structured Data
The approach of this paper is to impose a regular, ordered structure on graph data, so that a 1-dimensional convolution can be applied to it.
Traditional convolution: a fixed number of neighbor nodes are sorted, multiplied element-wise by the same number of convolution-kernel parameters, and summed.
Then convolution can be divided into two steps:
1. Construct the neighborhood (neighbors are fixed and ordered)
2. Inner product of the points of the neighborhood and the convolution kernel parameters
But for graph-structured data, the neighbors of a node are not fixed in number, and the graph is unordered. A way to construct the neighborhood is therefore needed.
Solution idea: random walk — select a fixed number of neighbor nodes ranked by their expected number of visits.
Implementation :
First define:
$P$ is the transition matrix of a random walk on the graph; $P_{ij}$ is the transition probability from node $i$ to node $j$.
$S$ is a similarity matrix, which can be understood as the adjacency matrix.
$D$ is the degree matrix, with $D_{ii}=\sum_j S_{ij}$.
Assuming the graph structure is known, $S$ and $D$ are known, and $P$ is defined as:
$$P = D^{-1}S$$
The above formula is equivalent to normalizing S, that is, using the normalized adjacency matrix as the graph transition matrix.
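This normalization can be sketched as follows, using a made-up 4-node adjacency matrix:

```python
import numpy as np

# Hypothetical 4-node undirected graph given by its adjacency
# (similarity) matrix S.
S = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Degree matrix D: D_ii = sum_j S_ij.
D = np.diag(S.sum(axis=1))

# Transition matrix P = D^{-1} S: each row of S divided by its degree.
P = np.linalg.inv(D) @ S

print(P.sum(axis=1))  # every row sums to 1, i.e. P is row-stochastic
```

Row-normalizing makes each row of $P$ a valid probability distribution over the next step of the walk.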
The multi-step transition matrix is defined as $Q$:
$$Q^{(0)}=I,\quad Q^{(1)}=I+P,\quad \dots,\quad Q^{(k)}=\sum_{j=0}^{k}P^{j}$$
where $k$ is the number of steps and $Q^{(k)}_{ij}$ is the expected number of visits to node $j$ within $k$ steps of a random walk started at node $i$, visualized in the figure below.
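Numerically, $Q^{(k)}$ can be computed by accumulating powers of $P$; a sketch with a hypothetical 3-node chain graph:

```python
import numpy as np

# Row-stochastic transition matrix of a hypothetical 3-node chain graph.
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])

def multi_step(P, k):
    """Q^(k) = sum_{j=0}^{k} P^j, so Q[i, m] is the expected number
    of visits to node m within k steps of a walk started at node i."""
    Q = np.eye(len(P))       # P^0 = I
    power = np.eye(len(P))
    for _ in range(k):
        power = power @ P    # next power of P
        Q += power
    return Q

Q2 = multi_step(P, 2)
print(Q2)
```

Each node always "visits" itself at step 0, which is why the identity term keeps the diagonal entries at least 1.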
This solves the neighborhood-construction problem: neighbors can be chosen by their expected number of visits. Let $\pi_i^{(k)}(c)$ denote the index of the node with the $c$-th largest expected number of visits within $k$ steps of a walk started at node $i$. The nodes are then ordered by expected visits:
$$Q_{i\pi_i^{(k)}(1)} > Q_{i\pi_i^{(k)}(2)} > \dots > Q_{i\pi_i^{(k)}(N)}$$
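The ordering $\pi_i^{(k)}$ amounts to an argsort of row $i$ of $Q^{(k)}$ by decreasing value (illustrative values only):

```python
import numpy as np

# Row of a hypothetical expected-visits matrix Q^(k) for node i.
q_i = np.array([1.5, 0.9, 2.1, 0.3, 1.2])

# pi_i^(k): node indices sorted by decreasing expected visits,
# so pi[0] is the most-visited node, pi[1] the second, and so on.
pi = np.argsort(-q_i)

print(pi)  # [2, 0, 4, 1, 3]
```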
Define convolution: for each node, the $p$ nodes with the largest expected number of visits are selected as its neighborhood, with their order determined by $Q$. The convolution kernel $W$ has $p$ parameters.
For example:
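A toy end-to-end sketch of the operation (the graph, features, $p$, and kernel values are all made up, not the paper's experiment):

```python
import numpy as np

# Hypothetical 4-node graph: adjacency S and one feature per node.
S = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
x = np.array([1.0, 2.0, 3.0, 4.0])

# Transition matrix and 2-step expected-visits matrix.
P = S / S.sum(axis=1, keepdims=True)          # P = D^{-1} S
Q = np.eye(4) + P + P @ P                     # Q^(2) = I + P + P^2

p = 3                                         # neighborhood size
W = np.array([1.0, 0.5, 0.25])                # kernel with p parameters

# For each node i: take the p nodes with the largest expected visits
# (ordered by Q), then inner-product their features with W.
y = np.empty(4)
for i in range(4):
    top_p = np.argsort(-Q[i])[:p]             # pi_i^(2)(1), ..., pi_i^(2)(p)
    y[i] = x[top_p] @ W

print(y)
```

Because the neighborhood is ordered by $Q$, every node's first kernel weight always multiplies its most-visited neighbor (typically itself), mimicking the fixed ordering of a grid convolution.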