Graph neural network tutorial based on the DGL library (1): basic graph construction operations

I recently started working with graph neural networks, and what appeals to me is the multi-granularity view they offer. Once the research object has been abstracted into a graph, you can classify individual nodes, individual edges, or the graph as a whole, so the same structure describes an object at several levels of detail.

So I decided to study graph neural networks.
As I see it, the theory you need to learn before working with them is:

  1. Basic graph theory. I am already familiar with this, so I did not need to study it again.
  2. Linear algebra, especially the matrix form of the Fourier transform. Going back to a textbook is enough. The theoretical basis of graph convolutional networks, as far as I have read, is an application of the Fourier transform in matrix form. This is standard material in a matrix analysis course rather than new knowledge, but it needs to be mastered carefully if you want to work with graph neural networks.
  3. Neural network fundamentals.

I have roughly gone through "Deep and Simple Explaining the Neural Network" and feel I have the gist of it, but practice is still needed. So this series covers the practical side.

Graph Neural Network Library

The library I use is DGL: https://docs.dgl.ai/index.html
GitHub address: https://github.com/dmlc/dgl
This library seems to have been written at New York University. It implements the commonly used graph neural network models, so you can assemble a network like building blocks and then only need to prepare the data.

Install DGL library

Installation without CUDA:

pip3 install dgl -i https://mirrors.aliyun.com/pypi/simple/

Installation for CUDA 10.0:

python3 -m pip install dgl-cu100 -i https://mirrors.aliyun.com/pypi/simple/

DGL uses PyTorch as its default underlying neural network library.
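After running the install command, it can be useful to confirm that both DGL and its PyTorch backend are visible to the interpreter. The following is a minimal sketch using only the standard library; the helper name `backend_status` is made up for illustration and is not part of DGL.

```python
import importlib.util


def backend_status(packages=("dgl", "torch")):
    """Return {package name: True/False} depending on whether it can be found."""
    # find_spec returns None when a top-level package is not installed.
    return {name: importlib.util.find_spec(name) is not None for name in packages}


status = backend_status()
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If either package is reported as missing, rerun the matching pip command above in the same Python environment.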

Tutorials

Add nodes, add edges, and visualize graphs

All edges in DGL are directed. To represent an undirected edge, create two edges in opposite directions.

__author__ = 'dk'
# Build a graph, adding nodes and edges
import networkx as nx
import dgl
import matplotlib.pyplot as plt
# Build a star graph
u = [0, 0, 0, 0, 0]
v = [1, 2, 3, 4, 5]
# First way: two arrays u and v of the same length
star1 = dgl.DGLGraph((u, v))
nx.draw(star1.to_networkx(), with_labels=True)  # visualize the graph
plt.show()

# Second way: for a star graph, the single source node can be broadcast
star2 = dgl.DGLGraph((0, v))
nx.draw(star2.to_networkx(), with_labels=True)
plt.show()

# Third way: enumerate the edges directly
star3 = dgl.DGLGraph([(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)])
nx.draw(star3.to_networkx(), with_labels=True)
plt.show()
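Since every edge is directed, an undirected star needs each edge in both directions. The sketch below is plain Python and does not call DGL; `both_directions` is a hypothetical helper that only prepares the source and destination lists, which could then presumably be passed to the constructor in the same way as `u` and `v` above.

```python
def both_directions(u, v):
    """Return source/destination lists that contain every edge in both directions."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    # Forward edges u -> v followed by the reversed edges v -> u.
    return list(u) + list(v), list(v) + list(u)


u = [0, 0, 0, 0, 0]
v = [1, 2, 3, 4, 5]
src, dst = both_directions(u, v)
print(list(zip(src, dst)))
```

The result has ten directed edges: five leaving node 0 and five pointing back to it.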

You can also add edges after the graph has been created, instead of passing them to the constructor:

# Start from an empty graph and add the edges afterwards
g = dgl.DGLGraph()  # an empty graph
g.add_nodes(9)  # add nodes; nodes must be added before edges
for i in range(1, 8):
    g.add_edge(0, i)
nx.draw(g.to_networkx(), with_labels=True)
plt.show()

Note: when adding an edge (u, v), neither u nor v may exceed the largest node ID the graph already has (the number of nodes minus 1). Nodes that end up with no edges are treated by DGL as isolated nodes.
For example, in the graph above, node 8 is not connected to any other node.
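The rule above can be modelled in a few lines of plain Python. This is a simplified sketch, not DGL's own validation; `check_edges` and `isolated_nodes` are hypothetical helpers written for illustration.

```python
def check_edges(num_nodes, edges):
    """Raise ValueError if any endpoint falls outside 0..num_nodes-1."""
    for u, v in edges:
        if not (0 <= u < num_nodes and 0 <= v < num_nodes):
            raise ValueError(f"edge ({u}, {v}) refers to a node ID outside 0..{num_nodes - 1}")


def isolated_nodes(num_nodes, edges):
    """Return the node IDs that do not appear in any edge."""
    touched = {node for edge in edges for node in edge}
    return [node for node in range(num_nodes) if node not in touched]


# Same layout as the graph g above: 9 nodes, edges from node 0 to nodes 1..7.
edges = [(0, i) for i in range(1, 8)]
check_edges(9, edges)
print(isolated_nodes(9, edges))  # [8]
```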

Assignment and extraction of node features

After creating the graph, you can attach features to its nodes.
In DGL, node features are stored in a dictionary. You choose a meaningful key name for each feature, and a node can carry several features at once.
For example, for the star graph above:

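import numpy as np
import torch as th
features = np.random.normal(0, 1, (9, 5))  # random 9x5 matrix drawn from a normal distribution
print(features)
# convert to a tensor explicitly; from_numpy keeps the float64 dtype
g.ndata['features'] = th.from_numpy(features)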

Output:

[[-0.73241917  0.78738566  1.21160063 -0.83944648 -0.15739715]
 [-0.05520377  0.83418124 -0.68477259 -1.29054549 -1.2375015 ]
 [-0.23807425 -0.40030208  1.74318389 -0.70699831 -0.61449034]
 [-0.48597221  0.65284435 -0.27101916 -0.69242791 -0.83134013]
 [-0.00580359  1.29773141  1.28545031 -0.41703836  0.97254182]
 [-1.19173936  1.18409306 -0.24284504 -1.93560515 -1.1080128 ]
 [-0.4854841   0.06257814 -1.3432515  -0.53297016 -0.01270537]
 [-0.16906863  0.17349874  1.0057332   1.85554737  0.13355367]
 [-1.45619866  0.77784642  1.52454762 -0.86608947  0.28595569]]

The ndata attribute is short for node data. It behaves like a dict.
Note that the number of rows in the feature matrix must equal the number of nodes currently in the graph.
Otherwise, an error is raised:

dgl._ffi.base.DGLError: Expect number of features to match number of nodes (len(u)). Got 7 and 9 instead.
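To make the row-count rule concrete, here is a toy stand-in for ndata written in plain Python. It is only a sketch of the behaviour described above and not how DGL is implemented; the class name `FeatureStore` and the feature names `degree` and `bad` are made up for illustration.

```python
class FeatureStore(dict):
    """Toy model of ndata: a dict that only accepts one feature row per node."""

    def __init__(self, num_nodes):
        super().__init__()
        self.num_nodes = num_nodes

    def __setitem__(self, key, rows):
        # Reject feature matrices whose row count differs from the node count.
        if len(rows) != self.num_nodes:
            raise ValueError(
                f"expected {self.num_nodes} rows for '{key}', got {len(rows)}"
            )
        super().__setitem__(key, rows)


ndata = FeatureStore(num_nodes=9)
ndata["features"] = [[0.0] * 5 for _ in range(9)]  # 9 rows: accepted
ndata["degree"] = [[1.0] for _ in range(9)]        # a second feature on the same nodes
print(sorted(ndata))

try:
    ndata["bad"] = [[0.0] * 5 for _ in range(7)]   # 7 rows for 9 nodes: rejected
except ValueError as err:
    print(err)
```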

Accessing node features:
Index directly with g.ndata[feature name][node ID].
For example, to access the features of node 3:

print(g.ndata['features'][3])

Output: row 3 of the features matrix (counting from 0, so the fourth row printed above)

tensor([-0.4860,  0.6528, -0.2710, -0.6924, -0.8313], dtype=torch.float64)

Of course, you can also modify the features of a node:

g.ndata['features'][3]=th.zeros(1,5)
print(g.ndata['features'][3])

Output:

tensor([0., 0., 0., 0., 0.], dtype=torch.float64)

As you can see, DGL stores node features internally as a tensor. Accessing or modifying a node's features ultimately comes down to accessing or modifying a row of this matrix.

print(g.ndata)

Output:

{'features': tensor([[-0.4771,  1.7900, -1.1160,  0.2916, -0.7986],
        [-1.6190, -0.5006, -0.0437,  1.6412, -1.6979],
        [ 1.8872,  0.5236,  0.5123, -0.7658, -0.5050],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
        [-0.3382, -0.4218,  0.8622,  1.1720,  0.3460],
        [-0.1710,  0.2713, -0.1639, -1.1159, -0.3623],
        [-0.9241,  1.2943,  0.1137,  1.5348,  0.1791],
        [-1.0372,  1.4145, -2.0653, -0.1469, -0.6041],
        [ 0.1035, -1.4215,  0.3288, -0.5195, -1.4120]], dtype=torch.float64)}

Assignment, access and modification of edge features

Like nodes, edges can be given features, which can then be accessed and modified in the same way.
Assignment:

g.edata['w'] = th.randn(g.number_of_edges(), 2)

Note that the number of rows in the assigned matrix must equal the number of edges.
Accessing the features of a particular edge ultimately comes down to accessing one row of the edge feature matrix, and the row index is the ID of the edge.
Therefore, we first need to find the ID of the edge we want to access.
For example, to get the ID of the edge (0,7):

g.edge_id(0,7)

Access the features of this edge:

g.edata['w'][g.edge_id(0,7)]
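The relationship between an edge, its ID, and its feature row can be sketched without DGL. The example below assumes that edge IDs follow the order in which the edges were added, which matches how the graph g was built above; the `weights` values are made up for illustration.

```python
# Edges of the graph g above, in the order they were added.
edges = [(0, i) for i in range(1, 8)]

# Assumption: the ID of an edge is its position in insertion order.
edge_id = {edge: idx for idx, edge in enumerate(edges)}

# One feature row per edge, two values each (the same shape as 'w' above).
weights = [[float(idx), float(idx) * 2] for idx in range(len(edges))]

eid = edge_id[(0, 7)]
print(eid, weights[eid])  # 6 [6.0, 12.0]
```

Looking up the edge ID first and then indexing the feature matrix with it is the same two-step pattern as the DGL call above.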

You can also delete features; this is just an ordinary dictionary operation:

g.ndata.pop('features')
g.edata.pop('w')


Origin blog.csdn.net/jmh1996/article/details/106881394