1. Composition of the graph
- A graph (G) consists of nodes (N) and connections, also called edges (E).
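As a minimal sketch of this definition, a graph can be stored as a set of nodes plus a set of node pairs. The node names and the helper `is_valid_graph` are hypothetical and only serve as an illustration.

```python
# Hypothetical toy graph: the node names are illustrative only.
nodes = {"A", "B", "C", "D"}
edges = {("A", "B"), ("B", "C"), ("C", "A")}  # "D" has no connections


def is_valid_graph(nodes, edges):
    """Return True if both endpoints of every connection are known nodes."""
    return all(u in nodes and v in nodes for u, v in edges)


N = len(nodes)  # number of nodes
E = len(edges)  # number of connections
print(N, E, is_valid_graph(nodes, edges))
```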
2. Ontology diagram
2.1 What is an ontology diagram
- Designing the ontology diagram is the first step in designing a graph.
- That is, before building the graph, it is necessary to clarify which node types and connection types are possible.
- The picture below shows the ontology diagram of a medical knowledge graph.
- After the ontology diagram is designed, import the data to generate the graph.
- The picture below shows part of the medical knowledge graph generated from the ontology diagram above.
2.2 How to design an ontology diagram
- First, the guiding principle is that the design depends on the problem we want to solve;
- Second, in simple cases the ontology diagram is unique and unambiguous. For example, in a network of interpersonal relationships, the nodes are people, and a connection indicates whether two people have a relationship;
- Third, in richer cases, like the earlier medical knowledge graph example, there are many types of nodes and many types of relationships;
- In short, the ontology diagram is designed flexibly according to the target task.
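One way to make this concrete is to treat the ontology diagram as a schema that lists the allowed (source type, relation, target type) triples, and to check data against it before import. The sketch below is only an illustration: the types, relations, node names, and the helper `conforms` are all hypothetical and are not taken from the medical knowledge graph in the pictures.

```python
# Hypothetical ontology: allowed (source type, relation, target type) triples.
ONTOLOGY = {
    ("Disease", "has_symptom", "Symptom"),
    ("Drug", "treats", "Disease"),
}

# Hypothetical instance data: node name -> node type.
node_types = {"flu": "Disease", "fever": "Symptom", "drug_x": "Drug"}


def conforms(source, relation, target):
    """Check that a connection between two typed nodes is allowed by the ontology."""
    triple = (node_types.get(source), relation, node_types.get(target))
    return triple in ONTOLOGY


print(conforms("flu", "has_symptom", "fever"))  # allowed by the schema
print(conforms("fever", "treats", "flu"))       # not allowed by the schema
```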
3. Types of graphs
3.1 According to whether the connections are directed
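- Directed graph: connections have a direction, such as the Weibo follow graph (A following B does not mean that B follows A).
- Undirected graph: connections have no direction, such as a subway line map.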
3.2 According to the ontology diagram
- Ordinary graph: there is only one type of node and one type of connection;
- Heterogeneous graph: there is more than one type of node or connection;
- Bipartite graph: a special heterogeneous graph with two node types, in which connections only run between nodes of different types.
Note: A bipartite graph can be projected (folded) into two graphs, one per node type, for separate processing.
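One common reading of the note above is projection: two nodes of the same type are linked whenever they share a neighbor of the other type. The sketch below assumes that reading. The author/paper data and the function `project` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical bipartite graph: authors on one side, papers on the other.
author_paper = [("alice", "p1"), ("bob", "p1"), ("bob", "p2"), ("carol", "p2")]


def project(pairs, side=0):
    """Project a bipartite connection list onto one side (0 = left, 1 = right).

    Two nodes on the chosen side are linked if they share a neighbor
    on the other side.
    """
    groups = defaultdict(set)
    for pair in pairs:
        keep, other = pair[side], pair[1 - side]
        groups[other].add(keep)
    projected = set()
    for members in groups.values():
        for u, v in combinations(sorted(members), 2):
            projected.add((u, v))
    return projected


author_graph = project(author_paper, side=0)  # authors linked by a shared paper
paper_graph = project(author_paper, side=1)   # papers linked by a shared author
print(author_graph, paper_graph)
```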
3.3 According to whether the connections are weighted
- Weighted graph: as the name suggests, each connection carries a weight.
- If there are multiple connections (parallel edges) between two nodes, they can be merged into one connection whose weight is the sum of the individual weights.
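The summing rule can be sketched as follows, assuming it refers to several parallel connections between the same two nodes in an undirected graph. That interpretation, the sample data, and the function `merge_parallel_edges` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical connection list: (u, v, weight); repeated pairs are parallel connections.
multi_edges = [("A", "B", 1.0), ("A", "B", 2.5), ("B", "C", 4.0), ("B", "A", 0.5)]


def merge_parallel_edges(edges):
    """Collapse parallel undirected connections into one, summing the weights."""
    weights = defaultdict(float)
    for u, v, w in edges:
        key = tuple(sorted((u, v)))  # undirected: (A, B) and (B, A) are the same
        weights[key] += w
    return dict(weights)


weighted = merge_parallel_edges(multi_edges)
print(weighted)
```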
4. Number of node connections (degree of node)
- The degree of a node can be used as an indicator of the importance of the node.
4.1 Node degree in an undirected graph
- The number of connections that exist for a node is the degree of the node.
- The average degree of an undirected graph is $\bar{K} = \frac{2E}{N}$, where $E$ is the total number of connections and $N$ is the total number of nodes. The factor of 2 appears because each connection adds one to the degree of both of its endpoints.
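The degree count and the average-degree formula can be checked on a small example. This is a minimal sketch with a hypothetical connection list and no self-connections. The helper `degrees` is not from any library.

```python
# Hypothetical undirected graph (no self-connections, no duplicate connections).
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]


def degrees(nodes, edges):
    """Count how many connections touch each node."""
    deg = dict.fromkeys(nodes, 0)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


deg = degrees(nodes, edges)
avg_degree = sum(deg.values()) / len(nodes)

# The direct average matches 2E / N.
print(deg, avg_degree, 2 * len(edges) / len(nodes))
```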
4.2 Node degree in a directed graph
- The degree of a node in a directed graph is divided into in-degree and out-degree.
- In-degree: the number of connections pointing to the node.
- Out-degree: the number of connections leaving the node.
- The total degree of a node is the sum of its in-degree and out-degree.
- A node with an in-degree of 0 is called a source node, and a node with an out-degree of 0 is called a sink node.
- The average degree of a directed graph is $\bar{K} = \frac{E}{N}$, counted as the average in-degree. The average out-degree is the same, because every connection leaves exactly one node and enters exactly one node.
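The definitions in this subsection can be sketched on a small directed graph. The data below is hypothetical, and `(u, v)` is assumed to mean a connection from `u` to `v`.

```python
# Hypothetical directed graph: (u, v) means a connection from u to v.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

in_deg = dict.fromkeys(nodes, 0)
out_deg = dict.fromkeys(nodes, 0)
for u, v in edges:
    out_deg[u] += 1  # the connection leaves u
    in_deg[v] += 1   # the connection points to v

sources = [n for n in nodes if in_deg[n] == 0]  # nothing points to them
sinks = [n for n in nodes if out_deg[n] == 0]   # nothing leaves them

avg_in = sum(in_deg.values()) / len(nodes)
avg_out = sum(out_deg.values()) / len(nodes)
print(sources, sinks, avg_in, avg_out)
```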
5. Graph representations
5.1 Adjacency matrix
- Entry $A_{ij}$ is 1 if there is a connection from node $i$ to node $j$, and 0 otherwise.
- Features: The adjacency matrix of an undirected graph is symmetric, while the adjacency matrix of a directed graph is generally not symmetric.
- For undirected graphs, the total number of connections is half the sum of all elements of the adjacency matrix; the degree of a node can be obtained by summing along its row or its column.
Note: If there are self-connections, each one appears only once, on the diagonal, so the diagonal entries are not divided by two when calculating the total number of connections.
- For directed graphs, the total number of connections is the sum of all elements of the adjacency matrix; the in-degree of a node is its column sum, and the out-degree is its row sum.
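The row and column sums described above can be verified with plain nested lists. This is a minimal sketch: the function `adjacency_matrix` and the sample graph are hypothetical, and self-connections are assumed to be absent.

```python
def adjacency_matrix(nodes, edges, directed=False):
    """Build a 0/1 adjacency matrix as a list of rows."""
    index = {n: i for i, n in enumerate(nodes)}
    A = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        A[index[u]][index[v]] = 1
        if not directed:
            A[index[v]][index[u]] = 1  # mirror the entry for undirected graphs
    return A


nodes = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C")]

A_und = adjacency_matrix(nodes, edges)
A_dir = adjacency_matrix(nodes, edges, directed=True)

und_degrees = [sum(row) for row in A_und]       # row sums = column sums
out_degrees = [sum(row) for row in A_dir]       # row sums
in_degrees = [sum(col) for col in zip(*A_dir)]  # column sums

num_edges_und = sum(map(sum, A_und)) // 2  # halve: each connection is stored twice
num_edges_dir = sum(map(sum, A_dir))
print(und_degrees, out_degrees, in_degrees, num_edges_und, num_edges_dir)
```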
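5.2 Connection list (edge list) and adjacency list
Adjacency matrices are mostly sparse, which wastes storage space.
- Connection list: records only the node pairs that have a connection.
- Adjacency list: for each node, records only the connections leaving that node.
- Compared with a connection list, an adjacency list reduces the storage requirement further, because each source node is stored only once.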
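The relationship between the two formats can be sketched by converting one into the other. The sample data and the function `to_adjacency_list` are hypothetical, and the connections are assumed to be directed.

```python
from collections import defaultdict

# Hypothetical connection list for a directed graph: one (source, target) pair per connection.
edge_list = [("A", "B"), ("A", "C"), ("C", "B")]


def to_adjacency_list(edge_list):
    """Group targets by source node, so each source is stored only once."""
    adj = defaultdict(list)
    for u, v in edge_list:
        adj[u].append(v)
    return dict(adj)


adj_list = to_adjacency_list(edge_list)
print(adj_list)
```

Note that in this sketch a node with no outgoing connections, such as "B", does not appear as a key at all.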
6. Graph connectivity
- For an undirected graph, if any two nodes can reach each other, it is called a connected graph; otherwise, it is called a disconnected graph, and each maximal connected subgraph of a disconnected graph is called a connected component (connected domain).
- For a directed graph, if any two nodes can reach each other, it is called a strongly connected graph; if a graph that is not strongly connected becomes a connected graph after the directions are removed, it is called a weakly connected graph.
- Strongly connected component (SCC): a maximal subgraph that meets the definition of a strongly connected graph. SCC decomposition is an important step when analyzing a directed graph.
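The definitions above can be sketched in code. The example below finds connected components with directions ignored, and strongly connected components with Kosaraju's algorithm, which is one standard choice and is not named in the text above. The function names and the sample graph are hypothetical.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Connected components with connection directions ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for n in nodes:
        if n in seen:
            continue
        seen.add(n)
        comp, stack = [], [n]
        while stack:
            cur = stack.pop()
            comp.append(cur)
            for nxt in adj[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(sorted(comp))
    return components


def strongly_connected_components(nodes, edges):
    """Kosaraju's algorithm: DFS finishing order, then DFS on the reversed graph."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    def dfs(start, graph, seen, out):
        # Iterative DFS that appends nodes to `out` in finishing order.
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbors = stack[-1]
            for nxt in neighbors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                stack.pop()
                out.append(node)

    seen, order = set(), []
    for n in nodes:
        if n not in seen:
            dfs(n, fwd, seen, order)

    seen, components = set(), []
    for n in reversed(order):
        if n not in seen:
            comp = []
            dfs(n, rev, seen, comp)
            components.append(sorted(comp))
    return components


# Hypothetical directed graph: A -> B -> C -> A forms a cycle, and C -> D leaves it.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(connected_components(nodes, edges))
print(strongly_connected_components(nodes, edges))
```

In this example the graph is weakly connected (one component once directions are ignored) but not strongly connected, since "D" cannot reach any other node.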