GNN/GCN

Distill: https://distill.pub/2021/gnn-intro/

Three main types of tasks:

  1. Graph-level tasks (classify an entire graph)
  2. Node-level tasks (predict an attribute of each vertex)
  3. Edge-level tasks (predict an attribute of each edge)

Information storage (efficient, and unaffected by vertex ordering):

  • Nodes: scalar/vector
  • Edges: scalar/vector
  • Adjacency list: its length equals the number of edges; the i-th entry records which two vertices the i-th edge connects
  • Global: scalar/vector
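A minimal NumPy sketch of this storage scheme (the sizes and names below are illustrative, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy graph: 4 nodes, 3 edges, 8-d feature vectors throughout.
node_feats = rng.standard_normal((4, 8))   # one vector per node
edge_feats = rng.standard_normal((3, 8))   # one vector per edge
global_feat = rng.standard_normal(8)       # one vector for the whole graph

# Adjacency list: entry i gives the two vertices that edge i connects.
# Storage is O(#edges) and independent of how the vertices are numbered.
adjacency_list = np.array([[0, 1], [1, 2], [2, 3]])
```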

The “message passing neural network” framework is graph-in, graph-out, and does not change the connectivity of the graph.

The simplest GNN

A layer applies one MLP to the node vectors, another to the edge vectors, and a third to the global vector. No connection information is used at all.
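A minimal sketch of such a layer, reusing the toy arrays above (the one-linear-layer "MLP" and the weight shapes are stand-ins, not the article's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # feature dimension, matching the toy arrays above

def mlp(x, W, b):
    # a single linear layer + ReLU stands in for "an MLP" here
    return np.maximum(x @ W + b, 0.0)

# One set of weights per attribute type; all nodes share Wv, all edges share We.
Wv, bv = rng.standard_normal((d, d)), np.zeros(d)
We, be = rng.standard_normal((d, d)), np.zeros(d)
Wu, bu = rng.standard_normal((d, d)), np.zeros(d)

def simplest_gnn_layer(node_feats, edge_feats, global_feat):
    # Each MLP sees only its own attribute type; connectivity is ignored.
    return (mlp(node_feats, Wv, bv),
            mlp(edge_feats, We, be),
            mlp(global_feat, Wu, bu))
```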

Pooling operation

How do we get predictions from the output of the last layer?

For prediction on vertices, say binary classification: feed each vertex vector into an MLP whose output dimension is 2, followed by a softmax. Note that there is only one MLP, shared by all vertices.

What about a vertex that has no vector of its own?

Sum the vectors of the edges connected to that vertex together with the global vector (assuming the dimensions match), feed the result into the MLP, and output the prediction.

What if there are only vertex vectors and no edge vectors?

For an edge, aggregate (sum) the vectors of the two vertices it connects, optionally adding the U vector, feed the result into the edge MLP, and output the prediction.

What if there are vertex vectors but no global vector?

Sum all the vertex vectors, feed the result into U's MLP, and output the prediction.
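A sketch of these pooling tricks (the helper names are mine; shapes follow the toy arrays above):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def pool_edges_to_node(i, edge_feats, adjacency_list, global_feat):
    # Vertex i has no vector: sum its incident edges' vectors plus the
    # global vector (all assumed to share one dimension).
    incident = (adjacency_list == i).any(axis=1)
    return edge_feats[incident].sum(axis=0) + global_feat

def pool_nodes_to_global(node_feats):
    # No global vector: sum all vertex vectors, then feed U's MLP.
    return node_feats.sum(axis=0)

def classify_nodes(node_reprs, W, b):
    # One shared MLP head with output dimension 2, softmax per vertex.
    return softmax(node_reprs @ W + b)
```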

[Figure: structure of the simplest GNN]

Limitation:

The transformation never uses the structural information of the graph, so connectivity is not folded into the updated representations.

Improvement: message passing

The simplest message passing: to update a vertex, sum its vector with those of its neighbouring vertices, then feed the result into an MLP.

This is similar to a CNN: each position is combined with its adjacent positions, the kernel weights are shared everywhere, and the MLP plays the role of the channel transformation.
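A minimal sketch of one such step, under the same toy setup (one linear layer + ReLU again stands in for the MLP):

```python
import numpy as np

def message_passing_step(node_feats, adjacency_list, W, b):
    # For every vertex: sum its own vector with its neighbours' vectors,
    # then push the sum through one shared MLP.
    agg = node_feats.copy()
    for i, j in adjacency_list:      # treat each edge as undirected
        agg[i] += node_feats[j]
        agg[j] += node_feats[i]
    return np.maximum(agg @ W + b, 0.0)
```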

Edge and vertex information can also be aggregated earlier, within a layer (a sketch follows the list):

  1. Pass vertex information to the edges and update the edges, then aggregate the updated edge information back to the vertices and update them (the dimensions may differ, so the vectors cannot simply be added)
  2. Do it in the reverse order; the results differ
  3. Alternate between the two orders
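A sketch of order 1, assuming adjacency_list is an (E, 2) array; because node and edge dimensions may differ, this version concatenates rather than adds, and the weight shapes are illustrative:

```python
import numpy as np

def vertices_to_edges_to_vertices(node_feats, edge_feats, adjacency_list,
                                  We, Wv):
    # Step 1: pass vertex information to the edges and update the edges.
    endpoint_sum = (node_feats[adjacency_list[:, 0]] +
                    node_feats[adjacency_list[:, 1]])
    new_edges = np.maximum(np.hstack([edge_feats, endpoint_sum]) @ We, 0.0)

    # Step 2: aggregate the updated edges back onto their endpoints.
    pooled = np.zeros((node_feats.shape[0], new_edges.shape[1]))
    np.add.at(pooled, adjacency_list[:, 0], new_edges)
    np.add.at(pooled, adjacency_list[:, 1], new_edges)
    new_nodes = np.maximum(np.hstack([node_feats, pooled]) @ Wv, 0.0)
    return new_nodes, new_edges
```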

What do we do with the global information U?

When the graph is large, messages take many steps to reach distant vertices. The fix is to add a master node, or context vector, connected to all vertices and all edges: this is U.

Since U is connected to everything in E and V, it is included whenever edges or vertices aggregate; and when U itself is updated, it aggregates all edge and vertex vectors and goes through its MLP.

This resembles attention: an element gathers information from related elements, with U playing a role similar to the query q.

Aggregation: mean, max, sum
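A sketch of a vertex update that mixes in incident edges and the master node U, with a selectable aggregation (the function names are mine):

```python
import numpy as np

AGGREGATORS = {
    "sum":  lambda m: m.sum(axis=0),
    "mean": lambda m: m.mean(axis=0),
    "max":  lambda m: m.max(axis=0),
}

def gather_for_node(i, node_feats, edge_feats, adjacency_list, global_feat,
                    agg="sum"):
    # Messages for vertex i: its neighbours, its incident edges, and U
    # (U is connected to every vertex and every edge).
    incident = (adjacency_list == i).any(axis=1)
    neighbours = adjacency_list[incident]
    neighbours = neighbours[neighbours != i]   # the other endpoint of each edge
    msgs = np.vstack([node_feats[neighbours],
                      edge_feats[incident],
                      global_feat[None, :]])
    return AGGREGATORS[agg](msgs)              # then feed this into an MLP
```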

Other kinds of graphs

  1. Different kinds of edges (directed vs. undirected)
  2. Graphs that contain subgraphs
  3. ...

Graph sampling and batching

  1. Randomly sample some vertices, then take their neighbours to form subgraphs, reducing storage (a sketch follows the list)
  2. Randomly sample one vertex and do a random walk from it with a fixed number of steps to obtain a subgraph
  3. Walk a few random steps, then also take the neighbours of the visited vertices
  4. Diffusion sampling: take a vertex and expand to its nearest neighbours, repeating for k steps, to obtain a subgraph
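A sketch of strategy 1 (random seed vertices plus their 1-hop neighbours); adjacency_list is assumed to be a list of (i, j) pairs:

```python
import numpy as np

def sample_subgraph(adjacency_list, num_nodes, k, rng):
    # Sample k seed vertices, pull in their 1-hop neighbours, and keep
    # only the edges whose endpoints both survive.
    seeds = set(rng.choice(num_nodes, size=k, replace=False).tolist())
    keep = set(seeds)
    for i, j in adjacency_list:
        if i in seeds or j in seeds:
            keep.update((i, j))
    sub_edges = [(i, j) for i, j in adjacency_list
                 if i in keep and j in keep]
    return sorted(keep), sub_edges

rng = np.random.default_rng(0)
nodes, edges = sample_subgraph([(0, 1), (1, 2), (2, 3), (3, 4)], 5, 2, rng)
```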

Inductive biases

Every machine-learning method builds in assumptions.

CNN: spatial translation invariance

RNN: temporal continuity

GNN: preserve the symmetry of the graph (no matter how the vertices are permuted, the GNN's output is unchanged)

Among the aggregation operations, max, mean, and sum perform about the same.
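A quick check of the permutation-invariance claim for sum aggregation (toy data):

```python
import numpy as np

rng = np.random.default_rng(0)
node_feats = rng.standard_normal((5, 8))
perm = rng.permutation(5)

# Shuffling the vertex order leaves the sum-pooled representation unchanged.
assert np.allclose(node_feats.sum(axis=0), node_feats[perm].sum(axis=0))
```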

GCN as subgraph function approximators

GCN (the variant with aggregation): with k layers, each looking only at immediate neighbours, every vertex sees a subgraph of the vertices at most k steps away.
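A sketch that computes this receptive field by expanding one hop per layer (plain BFS; adjacency_list as (i, j) pairs):

```python
def receptive_field(node, adjacency_list, k):
    # After k message-passing layers, a vertex's output depends only on
    # the vertices at most k hops away.
    seen, frontier = {node}, {node}
    for _ in range(k):
        nxt = set()
        for i, j in adjacency_list:
            if i in frontier:
                nxt.add(j)
            if j in frontier:
                nxt.add(i)
        frontier = nxt - seen
        seen |= frontier
    return seen
```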

Point and edge duality

graph attention networks

Convolution weights depend on position, whereas GNN weights must be insensitive to position.

Instead, the weights can depend on the relationship between vertex vectors: take the dot product between them and apply a softmax, so each neighbouring vertex receives a weight.
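A sketch of such dot-product attention over a vertex's neighbours (the helper names are mine):

```python
import numpy as np

def attention_weights(i, neighbours, node_feats):
    # Score each neighbour by its dot product with vertex i's vector,
    # then normalise with a softmax; position plays no role.
    scores = node_feats[neighbours] @ node_feats[i]
    w = np.exp(scores - scores.max())
    return w / w.sum()

def attend(i, neighbours, node_feats):
    # Weighted sum of the neighbours' vectors.
    w = attention_weights(i, neighbours, node_feats)
    return w @ node_feats[neighbours]
```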

Interpretability of graphs

generative modelling

Origin: blog.csdn.net/qq_52038588/article/details/131713352