[Graph neural network] Graph structure and graph representation

1. Graph structure

        Graph: a common language for describing linked (relational) data. In a graph, the nodes are related to one another, whereas traditional machine learning assumes that data samples are independent and identically distributed.

       A model that takes graphs as input has to handle inputs of arbitrary size (variable length) and complex topology, with no reference anchor (unlike CNNs and RNNs, which process their input from a defined starting point). Graphs are also often dynamic and carry multimodal features (of different types).

2. Graph representation

        A graph consists of nodes, denoted N; edges, denoted E; and the whole graph, denoted G(N, E).

        Before building a graph, first draw an ontology diagram: a diagram of the kinds of things involved and the possible connections between them.

        The ontology diagram should meet the following requirements: it must be unique and unambiguous, and nodes may contain many different data types.

        1. Properties of graphs

                Graphs can be divided into undirected graphs (edges are symmetric, bidirectional) and directed graphs.

                In addition, there is the heterogeneous graph, which may contain several kinds of nodes and connections. It is denoted G = (V, E, R, T), where V is the set of nodes, E the set of edges, R the set of relation (edge) types, and T the set of node types. A heterogeneous graph with exactly two kinds of nodes, in which edges only connect nodes of different kinds, is called a bipartite graph.

      

               Projecting a bipartite graph onto each of its two node types produces the folded (projected) networks.

The projection works as follows: separate the two types of nodes of the bipartite graph into two new graphs, then add edges within each new graph according to the connections of the bipartite graph (two nodes of the same type are linked when they share a neighbor in the bipartite graph).
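The folding step can be sketched in a few lines of Python. This is only an illustration, assuming the usual projection rule (two nodes of the same type are linked when they share at least one neighbor of the other type). The function name `fold_bipartite` and the author/paper node names are made up for the example.

```python
from collections import defaultdict
from itertools import combinations

def fold_bipartite(edges):
    """Project a bipartite graph onto each of its two node sets.

    `edges` is a list of (u, v) pairs, u from the first node set and
    v from the second. Returns two sets of undirected edges.
    """
    neighbors_of_u = defaultdict(set)
    neighbors_of_v = defaultdict(set)
    for u, v in edges:
        neighbors_of_u[u].add(v)
        neighbors_of_v[v].add(u)

    # Two U-nodes are linked if they share at least one V-neighbor.
    proj_u = set()
    for us in neighbors_of_v.values():
        proj_u.update(combinations(sorted(us), 2))

    # Two V-nodes are linked if they share at least one U-neighbor.
    proj_v = set()
    for vs in neighbors_of_u.values():
        proj_v.update(combinations(sorted(vs), 2))
    return proj_u, proj_v

# Hypothetical author-paper bipartite graph.
edges = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p2")]
proj_authors, proj_papers = fold_bipartite(edges)
print(proj_authors)  # author pairs that co-wrote a paper
print(proj_papers)   # paper pairs that share an author
```

Here `a1` and `a2` end up connected because both touch `p1`, and `p1` and `p2` end up connected because both touch `a2`.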

                Node degree (number of connections)

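                Node A in the following figure (undirected graph) has degree $k_A$, the number of edges attached to it. The average degree of a graph with N nodes and E edges is $\bar{k} = \frac{1}{N}\sum_{i=1}^{N} k_i = \frac{2E}{N}$ (each edge contributes to the degree of both of its endpoints, hence the factor of 2).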
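As a quick check of the factor of 2, here is a minimal sketch that counts degrees from a list of undirected edges. The four-node graph is a made-up example, not the one in the figure, and the helper `degrees` is only for illustration. Isolated nodes would have to be added separately, since they never appear in an edge.

```python
from collections import Counter

def degrees(edges):
    """Return the degree of every node that appears in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        # Each edge adds 1 to the degree of both endpoints.
        deg[u] += 1
        deg[v] += 1
    return deg

# Hypothetical undirected graph with 4 nodes and 4 edges.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
deg = degrees(edges)
num_nodes = len(deg)
avg_degree = 2 * len(edges) / num_nodes  # 2E / N

print(dict(deg))
print(avg_degree)
```

The sum of all degrees equals twice the number of edges, which is exactly why the average is 2E/N.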

                For directed graphs, in-degree and out-degree must be distinguished. For node C in the figure below, the in-degree is $k_C^{in}$, the out-degree is $k_C^{out}$, and the total degree is their sum: $k_C = k_C^{in} + k_C^{out}$.

                A node with in-degree 0 is called a source; a node with out-degree 0 is called a sink. The average degree is $\bar{k} = \frac{E}{N}$ (counted per direction), and because every edge leaves exactly one node and enters exactly one node, the average in-degree equals the average out-degree: $\overline{k^{in}} = \overline{k^{out}}$.
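A small sketch of the directed case, again on a made-up graph (not the one in the figure). The helper name `directed_degrees` is hypothetical. Each edge is a `(source, target)` pair.

```python
def directed_degrees(nodes, edges):
    """Return (in_degree, out_degree) dicts for a directed edge list."""
    in_deg = {n: 0 for n in nodes}
    out_deg = {n: 0 for n in nodes}
    for src, dst in edges:
        out_deg[src] += 1  # the edge leaves src
        in_deg[dst] += 1   # the edge enters dst
    return in_deg, out_deg

# Hypothetical directed graph.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
in_deg, out_deg = directed_degrees(nodes, edges)

sources = [n for n in nodes if in_deg[n] == 0]   # nothing points at them
sinks = [n for n in nodes if out_deg[n] == 0]    # they point at nothing
avg_in = sum(in_deg.values()) / len(nodes)
avg_out = sum(out_deg.values()) / len(nodes)

print(sources, sinks, avg_in, avg_out)
```

Both averages come out as E/N, because every edge is counted once as an outgoing edge and once as an incoming edge.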

        Besides the ordinary unweighted graph, there are graphs whose connections carry weights, called weighted graphs. The elements of a weighted graph's adjacency matrix are the weights of its edges rather than just 1 or 0. For an undirected weighted graph, the number of connections is the number of non-zero elements divided by 2, since each edge appears twice in the matrix.

        When the graph has self-loops, the main diagonal elements of the adjacency matrix are no longer all 0. When counting connections, the main diagonal elements are added in full (no division by 2), since a self-loop appears only once in the matrix.
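The self-loop rule can be checked with a short sketch. It assumes an undirected, unweighted graph whose adjacency matrix stores a 1 on the diagonal for each self-loop (some texts store 2 instead, and this sketch does not cover that convention). The function name `count_edges` and the matrix are made up for the example.

```python
def count_edges(adj):
    """Count edges of an undirected 0/1 adjacency matrix that may have self-loops."""
    n = len(adj)
    # Every ordinary edge i-j shows up twice: at (i, j) and at (j, i).
    off_diagonal = sum(adj[i][j] for i in range(n) for j in range(n) if i != j)
    # A self-loop shows up once, on the main diagonal.
    diagonal = sum(adj[i][i] for i in range(n))
    return off_diagonal // 2 + diagonal

# Hypothetical graph: a self-loop on node 0, plus edges 0-1 and 1-2.
adj = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
num_edges = count_edges(adj)
print(num_edges)
```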

         In a multigraph (a graph with multiple connections between the same pair of nodes), the element $A_{ij}$ of the adjacency matrix equals the number of connections between node i and node j.

                 Graph connectivity

                        In a connected graph there is a path (possibly more than one) between any two nodes, so every node can be reached from every other. A maximal set of nodes that can all reach one another is called a connected component.

                        If the graph consists of several components that are not connected to each other, its adjacency matrix can be written in "block diagonal" form (with the nodes ordered component by component).

                         This block-diagonal structure is lost as soon as a connection appears between two of the components.
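To make the link between connected components and the block-diagonal matrix concrete, here is a rough sketch using breadth-first search. The graph and the function name `connected_components` are made up for illustration, and the graph is assumed to be undirected and given as an adjacency list (a dict mapping each node to its neighbors).

```python
from collections import deque

def connected_components(adj_list):
    """Return the connected components of an undirected graph via BFS."""
    seen = set()
    components = []
    for start in adj_list:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nb in adj_list[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(sorted(component))
    return components

# Hypothetical graph with two components: {A, B, C} and {D, E}.
adj_list = {
    "A": ["B"], "B": ["A", "C"], "C": ["B"],
    "D": ["E"], "E": ["D"],
}
components = connected_components(adj_list)

# Order the nodes component by component and build the adjacency matrix.
order = [n for comp in components for n in comp]
index = {n: i for i, n in enumerate(order)}
matrix = [[0] * len(order) for _ in order]
for u, nbs in adj_list.items():
    for v in nbs:
        matrix[index[u]][index[v]] = 1

for row in matrix:
    print(row)
```

With this ordering, the non-zero entries sit in a 3x3 block and a 2x2 block along the diagonal, and everything outside those blocks is 0. Adding a single edge such as C-D would put a 1 outside the blocks.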

                         For a directed graph, if any two nodes can reach each other along directed connections, the graph is called strongly connected. The subgraph formed by the three nodes A, B, C in the figure below is strongly connected, and is known as a strongly connected component (SCC). Nodes that point into the SCC form the in-component (such as E, G), and nodes that the SCC points to form the out-component (such as D, F). A graph that is connected only when the direction of the connections is ignored is called weakly connected; the whole graph in the figure below is weakly connected.
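The two notions can be told apart with a simple reachability test. The sketch below is a naive check (run a search from every node), not an efficient SCC algorithm, and the example graphs are made up rather than copied from the figure. The function names are hypothetical. Every node is assumed to appear as a key of the adjacency list.

```python
def reachable(adj_list, start):
    """Return the set of nodes reachable from `start` (including itself)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nb in adj_list.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen

def is_strongly_connected(adj_list):
    """True if every node reaches every other node along directed edges."""
    nodes = set(adj_list)
    return all(reachable(adj_list, n) >= nodes for n in nodes)

def is_weakly_connected(adj_list):
    """True if the graph is connected once edge directions are ignored."""
    undirected = {n: set() for n in adj_list}
    for u, nbs in adj_list.items():
        for v in nbs:
            undirected[u].add(v)
            undirected.setdefault(v, set()).add(u)
    start = next(iter(undirected))
    return reachable(undirected, start) == set(undirected)

# Hypothetical directed cycle A -> B -> C -> A.
cycle = {"A": ["B"], "B": ["C"], "C": ["A"]}
# The same cycle with one node pointing in (E) and one pointed to (D).
full = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": [], "E": ["A"]}

print(is_strongly_connected(cycle))  # the cycle alone
print(is_strongly_connected(full))   # D cannot get back to the cycle
print(is_weakly_connected(full))     # connected if directions are ignored
```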

         2. Matrix representation of graph

                ① Adjacency matrix

                A graph is usually represented by an adjacency matrix A. When there is a connection between node i and node j, $A_{ij} = 1$ in the matrix; if there is no connection, then $A_{ij} = 0$. The following figure can be expressed as:

                When the graph is undirected, the adjacency matrix is symmetric and its main diagonal is 0 (when there are no self-loops); the degree of a node equals the sum along that node's row, or equivalently along its column.

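                For directed graphs, the adjacency matrix is in general no longer symmetric. Taking $A_{ij} = 1$ to mean a connection from node i to node j, the in-degree of a node is the sum of its column and the out-degree is the sum of its row.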
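A minimal sketch of these row and column sums, using plain nested lists. It assumes nodes are numbered from 0 and that `adj[i][j] = 1` means a connection from node i to node j. The helper `build_adjacency` and the edge data are made up for the example.

```python
def build_adjacency(num_nodes, edges, directed=False):
    """Build a 0/1 adjacency matrix; adj[i][j] = 1 means an edge from i to j."""
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = 1
        if not directed:
            adj[j][i] = 1  # undirected edges are stored in both directions
    return adj

# Hypothetical graph on nodes 0..3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
undirected = build_adjacency(4, edges)
directed = build_adjacency(4, edges, directed=True)

# Undirected: row sums and column sums both give the degree.
row_sums = [sum(row) for row in undirected]
col_sums = [sum(col) for col in zip(*undirected)]

# Directed: row sums give out-degree, column sums give in-degree.
out_degree = [sum(row) for row in directed]
in_degree = [sum(col) for col in zip(*directed)]

print(row_sums, col_sums)
print(out_degree, in_degree)
```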

                In practice, however, most real-world networks are sparse: most possible connections do not exist, so the adjacency matrix is mostly zeros. In this case the graph can be represented more compactly by a connection list (edge list) or an adjacency list.

                ② Connection list (edge list)

                A connection list uses an array that records only the pairs of nodes that are connected, as follows:

(2,3)    # an edge points from node 2 to node 3
(2,4)    # an edge points from node 2 to node 4

                ③ Adjacency list

                An adjacency list stores, for each node, an array of all the nodes it is connected to, as follows:

2:3,4    # node 2 points to node 3 and node 4
3:2,4    # node 3 points to node 2 and node 4
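The two sparse formats are easy to convert between. Below is a minimal sketch that turns a connection list into an adjacency list. The function name `edge_list_to_adjacency_list` is hypothetical, and the edges are chosen to reproduce the small adjacency list shown above.

```python
from collections import defaultdict

def edge_list_to_adjacency_list(edges):
    """Group directed (source, target) pairs by their source node."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
    return dict(adjacency)

# Connection list: each pair is one directed edge.
edges = [(2, 3), (2, 4), (3, 2), (3, 4)]
adjacency = edge_list_to_adjacency_list(edges)

for node, targets in adjacency.items():
    print(f"{node}:{','.join(map(str, targets))}")
```

Only connections that actually exist are stored, so memory grows with the number of edges rather than with the square of the number of nodes. Nodes without outgoing edges (node 4 here) do not get an entry in this sketch.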


Origin blog.csdn.net/weixin_37878740/article/details/129555310