Embedding Entire Graphs
1. Basic concepts of graph embedding vectors
Unlike node embeddings, which represent individual nodes, graph embeddings encode an entire graph or subgraph as a single vector. Application scenarios include anomaly detection and molecular toxicity prediction.
2. Sum or average the node embeddings
As shown in the figure, first compute a node embedding vector $\mathbb{z}_u$ for each node in the graph/subgraph, then sum these vectors (or average the sum) to obtain the embedding vector of the graph/subgraph.
NOTE: This method is simple but works very well.
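The summation/averaging scheme above can be sketched as follows (the node embedding values are made-up placeholders for illustration):

```python
import numpy as np

# Hypothetical node embeddings for a 4-node graph, dimension 3.
node_embeddings = np.array([
    [0.2, 0.5, 0.1],
    [0.4, 0.1, 0.3],
    [0.6, 0.2, 0.2],
    [0.0, 0.8, 0.4],
])

# Graph embedding via summation of node embeddings.
z_graph_sum = node_embeddings.sum(axis=0)

# Graph embedding via averaging (the sum divided by the node count).
z_graph_mean = node_embeddings.mean(axis=0)
```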
3. Create a super node
As shown in the figure, a super node is added to the original graph and connected to every node of the graph/subgraph. A node-embedding method is then applied, and the resulting embedding of the super node serves as the embedding vector of the graph/subgraph.
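A minimal sketch of the super-node construction, using a plain adjacency-dict graph (the node names and the `add_super_node` helper are illustrative assumptions):

```python
# Adjacency-list representation of a small subgraph (illustrative).
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

def add_super_node(graph, super_id="super"):
    """Return a copy of the graph with a super node linked to every node."""
    g = {u: set(nbrs) for u, nbrs in graph.items()}
    g[super_id] = set(graph)   # super node connects to all original nodes
    for u in graph:
        g[u].add(super_id)     # and every node connects back to it
    return g

g = add_super_node(graph)
# A node-embedding method (DeepWalk, node2vec, ...) run on g would then
# produce an embedding for "super", used as the graph/subgraph embedding.
```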
4. Anonymous random walk
1. Anonymous random walk method:
An anonymous random walk records only the position at which each node first appears in the walk, not the identity of the node itself.
As shown in the figure, in the subgraph composed of nodes A, B, C, D, E and F:
The first walk visits the nodes A, B, C, B, C. A is the first node visited and is labeled 1, B is the second and is labeled 2, C is the third and is labeled 3; no other nodes appear. The anonymous walk is therefore: 1, 2, 3, 2, 3.
The second walk visits the nodes C, D, B, D, B. C is the first node visited and is labeled 1, D is the second and is labeled 2, B is the third and is labeled 3; no other nodes appear. The anonymous walk is therefore also: 1, 2, 3, 2, 3.
Although the two walks traverse different nodes, their anonymous walk sequences are identical: only the order of first appearance matters, not the specific nodes.
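The anonymization step described above can be written as a small helper (the function name is an assumption for illustration):

```python
def anonymize(walk):
    """Map each node in a walk to the index of its first appearance."""
    first_seen = {}
    anon = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen) + 1
        anon.append(first_seen[node])
    return anon

# The two walks from the example map to the same anonymous sequence.
print(anonymize(["A", "B", "C", "B", "C"]))  # [1, 2, 3, 2, 3]
print(anonymize(["C", "D", "B", "D", "B"]))  # [1, 2, 3, 2, 3]
```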
It can be seen from the figure that the number of distinct anonymous walk sequences grows exponentially with the walk length.
For example, as the figure shows, there are 5 distinct anonymous walks of length 3: 111, 112, 121, 122, and 123.
2. Graph embedding via anonymous random walks
For a graph/subgraph, perform many anonymous random walks of a fixed length, and record the number of occurrences (or the probability) of each distinct walk sequence as a vector; this vector is the embedding of the graph. As shown in the figure:
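A minimal sketch of this counting scheme, assuming an adjacency-dict graph and uniform random walks (the helper names are illustrative):

```python
import random
from collections import Counter

def random_walk(adj, start, length):
    """An unbiased random walk of `length` nodes, starting at `start`."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(sorted(adj[walk[-1]])))
    return walk

def anonymize(walk):
    """Replace each node by the index of its first appearance."""
    first_seen = {}
    return [first_seen.setdefault(node, len(first_seen) + 1) for node in walk]

def graph_embedding(adj, walk_length, num_samples):
    """Empirical distribution over anonymous walk sequences."""
    counts = Counter()
    nodes = sorted(adj)
    for _ in range(num_samples):
        walk = random_walk(adj, random.choice(nodes), walk_length)
        counts[tuple(anonymize(walk))] += 1
    return {seq: c / num_samples for seq, c in counts.items()}

# Toy triangle graph; the embedding is a dict {anonymous walk: probability}.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
embedding = graph_embedding(adj, walk_length=4, num_samples=1000)
```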
The walk length $l$ is a hyperparameter. For a chosen $l$, how many walks should be sampled from the graph/subgraph is a question worth considering, as shown in the figure:
To keep the sampling error of the walk distribution within $\varepsilon$ with probability at least $1-\delta$, let $\eta$ be the number of distinct anonymous walk sequences of length $l$ (looked up from a table or computed directly). The required number of samples $m$ is:
$$m=\left\lceil \frac{2}{\varepsilon^2}\left(\log\left(2^{\eta}-2\right)-\log(\delta)\right)\right\rceil$$
For example: for a walk length $l$ with $\eta=877$ distinct anonymous walk sequences, and error bounds $[\varepsilon=0.1,\ \delta=0.01]$, the computed $m$ is 122,500.
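As a check on this formula, a small helper can evaluate $m$ directly (Python's arbitrary-precision integers handle $2^{877}$ exactly; the function name is an assumption):

```python
import math

def num_samples(eta, eps, delta):
    """m = ceil((2 / eps**2) * (log(2**eta - 2) - log(delta)))."""
    return math.ceil((2 / eps ** 2) * (math.log(2 ** eta - 2) - math.log(delta)))

# eta = 877 walk sequences, eps = 0.1, delta = 0.01, as in the example above.
print(num_samples(877, 0.1, 0.01))  # 122500
```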
5. Learn Walk Embeddings
The difference from the counting approach above is that the counting approach produces only a single vector of walk-sequence counts (or probabilities), whereas this method learns a separate embedding for each anonymous walk sequence, together with an embedding vector for the graph/subgraph.
Here the embedding vector of the graph/subgraph is $\mathbb{z}_G$, and the anonymous random walk sequence embeddings are $Z=\{z_i : i=1\dots\eta\}$, where $\eta$ is the total number of distinct walk sequences for the fixed walk length.
The specific steps are:
- Sample random walks: $N_R(u)$ records the walks starting from node $u$, of which there are $T$ in total.
- Contextual self-supervision: given the previous $\Delta$ walk sequences, predict the $(\Delta+1)$-th walk sequence (similar to a decoder in NLP), maximizing $\frac{1}{T}\sum_{t=\Delta+1}^{T}\log P(w_t \mid w_{t-\Delta},\dots,w_{t-1})$. The optimization objective is shown in the figure above.
The specific steps of contextual self-supervision are:
First, sum and average the embedding vectors of the previous $\Delta$ walk sequences $\{\mathbb{z}_{t-\Delta},\dots,\mathbb{z}_{t-1}\}$, and concatenate the result with the whole-graph embedding vector $\mathbb{z}_G$. Then predict the $(\Delta+1)$-th walk sequence through a linear transformation and a softmax. Repeat this process to iteratively update all the $\mathbb{z}$ vectors. Note that negative sampling is used to keep the computation tractable.
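The prediction step can be sketched in NumPy. The dimensions, context size Δ, and randomly initialized parameters below are illustrative assumptions; a real implementation would also train these parameters with negative sampling:

```python
import numpy as np

rng = np.random.default_rng(0)

eta, dim, delta_ctx = 5, 8, 3        # walk vocab size, embed dim, context Δ
Z = rng.normal(size=(eta, dim))      # one embedding per anonymous walk type
z_G = rng.normal(size=dim)           # the learnable graph embedding
W = rng.normal(size=(eta, 2 * dim))  # linear layer: concat -> logits
b = np.zeros(eta)

def predict_next(context_ids):
    """P(next walk | previous Δ walks): average, concat with z_G, softmax."""
    ctx = Z[context_ids].mean(axis=0)     # average the Δ context embeddings
    h = np.concatenate([ctx, z_G])        # stack with the graph vector
    logits = W @ h + b
    exp = np.exp(logits - logits.max())   # numerically stable softmax
    return exp / exp.sum()

probs = predict_next([0, 2, 4])  # IDs of the previous Δ = 3 walk sequences
```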
6. Summary
Figures excerpted from Stanford CS224W: Machine Learning with Graphs.