Data Structures and Algorithms (4) Graph Structure

image-20220818210450592

Graph structure

Graph structures are actually very common in our lives, the most obvious of which is our map, such as my hometown of Chongqing:

image-20220821222600741

As you can see, the map is intricate, with different roads connecting to one another. We can travel freely along these roads to get from one place to another. Besides maps, computer networks, your social network, and so on can all be represented by graph structures. The graph is also one of the harder parts of data structures, and in this chapter we will explore its properties and applications.

A graph is also made up of multiple nodes connected together, but one node can connect to several other nodes at once, and several nodes can also point to the same node. Unlike the tree structure we covered before, it is a many-to-many relationship:

image-20220821223128857

It is more complex than a tree: there is no clear hierarchy, and nodes can be connected much more freely. A graph is a data structure in which any two data objects may be related.

Basic concepts

A graph is generally composed of two sets: a non-empty but finite vertex set V (Vertex), and an edge set E (Edge) that describes the connections between vertices. The edge set may be empty (for example, with only a single vertex there is nothing to connect). A graph is made up of these nodes (vertices) and the corresponding edges, so it can be written as $G = (V, E)$.

For example, we can represent a graph with the vertex set $V = \{A,B,C,D\}$ and the edge set $E = \{(A,B),(B,C),(C,D),(D,A),(C,A)\}$. Graphs come in two basic forms. One is the directed graph, like the one pictured above, where each edge has a direction (from which vertex to which vertex). The other is the undirected graph, where an edge is just a connection and has no direction. The representation we just wrote, with edges in parentheses, is an undirected graph:

image-20220822101619660

The degree of each node is the number of edges connected to it. Each edge may or may not contain a weight.

Of course, we can also express it as a directed graph, with the vertex set $V = \{A,B,C,D\}$ and the edge set $E = \{<A,B>,<B,C>,<C,D>,<D,A>,<C,A>\}$. Note that the edges of a directed graph are written with angle brackets <>. For example, the directed graph above looks like this:

image-20220822104015728

If it is an edge (A, B) of an undirected graph, then A and B are said to be adjacent points to each other; if it is an edge <A, B> of a directed graph, then the starting point A is said to be adjacent to the end point B. Each node of a directed graph is divided into in-degree and out-degree, where in-degree is the number of edges connected to a vertex and pointing to the vertex, and out-degree is the number of edges from the vertex to adjacent vertices.
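As a small illustration of in-degree and out-degree, the sketch below counts both from a list of directed edges, with vertices A to D numbered 0 to 3. The `Edge` struct and the `countDegrees` function are names made up for this example; they are not part of the code developed later in this chapter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: a directed edge <from, to>, vertices numbered from 0. */
typedef struct {
    int from;
    int to;
} Edge;

/* Counts the out-degree and in-degree of every vertex.
 * outDeg and inDeg must each have room for vertexCount ints. */
void countDegrees(const Edge *edges, size_t edgeCount, int vertexCount,
                  int *outDeg, int *inDeg) {
    for (int v = 0; v < vertexCount; ++v) outDeg[v] = inDeg[v] = 0;
    for (size_t i = 0; i < edgeCount; ++i) {
        assert(edges[i].from >= 0 && edges[i].from < vertexCount);
        assert(edges[i].to >= 0 && edges[i].to < vertexCount);
        outDeg[edges[i].from]++;   /* the edge leaves `from` */
        inDeg[edges[i].to]++;      /* the edge enters `to` */
    }
}
```

For the edges <A,B>, <B,C>, <C,D>, <D,A>, <C,A>, vertex C ends up with out-degree 2 and in-degree 1, while vertex A has out-degree 1 and in-degree 2.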

As long as a graph has no self-loops and no duplicate edges, we can call it a simple graph. For example, the two graphs above are both simple graphs. The following are typical non-simple graphs: Figure 1 contains a self-loop, and Figure 2 contains multiple edges between the same pair of vertices:

image-20220822112214106

If in an undirected graph, any two vertices are connected by an edge, the graph is called an undirected complete graph :

image-20220822121243988

Similarly, in a directed graph, if any two vertices are connected by two edges with opposite directions, the graph is called a directed complete graph :

image-20220822113126420

The graph connects vertices through edges, so that we can reach other vertices from one vertex through a certain path. For example, we now want to reach the V1 point from the V5 point below:

image-20220822205354964

Then we can have many routes, such as arriving via V2, arriving via V3, etc.:

image-20220822205824613

In an undirected graph, two vertices are said to be connected if there is a path from one vertex to another. As you can see, we have many options to get from V5 to V1. We can get from V5 to V1 (of course, we can also go in reverse), so we say that V5 and V1 are connected. In particular, if any two points in the graph are connected, then we call the graph a connected graph . For a directed graph, if any vertices A and B in the graph have both a path from A to B and a path from B to A, then the directed graph is said to be a strongly connected graph .

For graphs $G = (V, E)$ and $G' = (V', E')$, if $V'$ is a subset of $V$ and $E'$ is a subset of $E$, then $G'$ is called a subgraph of $G$, such as the following two graphs:

image-20220822212041079

The graph on the right satisfies the above properties, so the graph on the right is a subgraph of the graph on the left.

The maximal connected subgraph of an undirected graph is called a connected component, and the maximal strongly connected subgraph of a directed graph is called a strongly connected component. So what is a maximal connected subgraph? First, it is a subgraph of the original graph that is itself connected. Second, it has as many vertices as possible: no other vertex of the original graph can be added without making the subgraph disconnected. Finally, along with those vertices it must include all the edges of the original graph attached to them. For example:

image-20220822214010526

You can see that Figures 1, 2, and 3 on the right are all subgraphs of the graph on the left, but not all of them are connected components of it. Figure 1 is connected and contains as many vertices as possible together with all of their edges (that is, one whole piece of the original graph), so it is a connected component. Figure 2 is also connected, but it does not contain as many vertices as possible (D could still be added, but is not), so it is not a connected component. Figure 3 is connected and contains as many vertices and edges as possible, so it is a connected component.

  • If the original graph is connected, its only connected component is the graph itself.
  • If the original graph is not connected, it has multiple connected components.
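To make the idea of connected components concrete, here is a minimal sketch that counts them in an undirected graph. It assumes the graph is given as a 0/1 matrix where `adj[i][j]` is non-zero when vertices i and j share an edge (the adjacency matrix covered in the storage structure section), and it uses a recursive "visit everything reachable" helper of the kind the traversal section explains. `MAX_V`, `markReachable`, and `countComponents` are names invented for this example.

```c
#include <assert.h>

#define MAX_V 8   /* assumed upper bound on the vertex count, for this sketch only */

/* Marks every vertex reachable from v in an undirected graph. */
static void markReachable(int adj[MAX_V][MAX_V], int n, int v, int *visited) {
    visited[v] = 1;
    for (int w = 0; w < n; ++w)
        if (adj[v][w] && !visited[w])
            markReachable(adj, n, w, visited);
}

/* Returns the number of connected components among the first n vertices. */
int countComponents(int adj[MAX_V][MAX_V], int n) {
    assert(n >= 0 && n <= MAX_V);
    int visited[MAX_V] = {0};
    int components = 0;
    for (int v = 0; v < n; ++v) {
        if (!visited[v]) {          /* v belongs to a component not seen before */
            components++;
            markReachable(adj, n, v, visited);
        }
    }
    return components;
}
```

A connected graph gives 1; every extra isolated piece adds one more to the count.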

For minimally connected subgraphs, we will explain it in the spanning tree section later.


Storage structure

Earlier we introduced some basic concepts of graphs, and then we will look at how to represent graph structures in programs. This part may involve some concepts that appear in the course "Linear Algebra".

Adjacency matrix

An adjacency matrix uses a matrix to represent the adjacency relationships and weights between the vertices of a graph. Suppose a graph $G = (V, E)$ has N vertices; then we can represent it with an N×N matrix. Take the following graph with four vertices A, B, C, and D:

image-20220822104015728

At this point we need to use an adjacency matrix to represent it, like this:

image-20220822220549501

For a graph without weights:

$$
G_{ij} = \begin{cases} 1, & \text{if } (v_i,v_j) \text{ of an undirected graph or } <v_i,v_j> \text{ of a directed graph is an edge in the graph} \\ 0, & \text{if } (v_i,v_j) \text{ of an undirected graph or } <v_i,v_j> \text{ of a directed graph is not an edge in the graph} \end{cases}
$$

For a weighted graph, if the edge exists, fill in its weight directly. If it does not, fill in 0 or ∞. Some graphs treat 0 as a valid weight, so ∞ is the safer choice; in a program it can be represented by a number larger than the weight of every edge, such as the largest value the integer type allows:

$$
G_{ij} = \begin{cases} w_{ij}, & \text{if } (v_i,v_j) \text{ of an undirected graph or } <v_i,v_j> \text{ of a directed graph is an edge in the graph} \\ 0 \text{ or } \infty, & \text{if } (v_i,v_j) \text{ of an undirected graph or } <v_i,v_j> \text{ of a directed graph is not an edge in the graph} \end{cases}
$$

So, for the directed graph above, we should fill it in like this:

image-20220822221214967

Now let's look at the adjacency matrix of an undirected graph, such as the one below:

image-20220822101619660

For an undirected graph, an edge connects its two endpoints in both directions: if A is connected to B, then B is also connected to A. So the matrix looks like this:

image-20220822222331925

As you can see, the resulting matrix is symmetric in the sense defined in "Linear Algebra" (it is mirrored across the main diagonal). Because no vertex has a self-loop, the elements on the main diagonal are all 0. Since an undirected edge has no direction and connects both endpoints to each other, the adjacency matrix of an undirected graph is always symmetric.

We can summarize the properties:

  • The adjacency matrix of an undirected graph is always symmetric, so to save space we can sometimes store only the upper half.
  • For an undirected graph, the number of non-zero (or non-∞) entries in row i of the adjacency matrix is the degree of vertex i.
  • For a directed graph, the number of non-zero (or non-∞) entries in row i of the adjacency matrix is the out-degree of vertex i (counting down column i instead gives the in-degree).
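The row and column rule above can be sketched in a few lines. The example assumes an unweighted directed graph with four vertices stored as a plain 0/1 matrix; `VERTICES`, `outDegree`, and `inDegree` are names chosen for this illustration only.

```c
#include <assert.h>

#define VERTICES 4   /* number of vertices in this example */

/* Out-degree of vertex v: the number of non-zero entries in row v. */
int outDegree(int m[VERTICES][VERTICES], int v) {
    assert(v >= 0 && v < VERTICES);
    int count = 0;
    for (int j = 0; j < VERTICES; ++j)
        if (m[v][j] != 0) count++;
    return count;
}

/* In-degree of vertex v: the number of non-zero entries in column v. */
int inDegree(int m[VERTICES][VERTICES], int v) {
    assert(v >= 0 && v < VERTICES);
    int count = 0;
    for (int i = 0; i < VERTICES; ++i)
        if (m[i][v] != 0) count++;
    return count;
}
```

For an undirected graph the matrix is symmetric, so both functions return the same value, which is the degree of the vertex.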

Next let's take a look at how to implement it through code. First we need to define the structure. Here we take a directed graph as an example:

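#include <stdio.h>
#include <stdlib.h>

#define MaxVertex 5

typedef char E;   //type of the data stored in each vertex

typedef struct MatrixGraph {
    int vertexCount;   //number of vertices
    int edgeCount;     //number of edges
    int matrix[MaxVertex][MaxVertex];   //adjacency matrix
    E data[MaxVertex];    //data stored in each vertex
} * Graph;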

Then we can initialize it and return it:

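Graph create(){
    //we could let the caller specify an initial vertex count; this version starts empty
    Graph graph = malloc(sizeof(struct MatrixGraph));
    if(graph == NULL) return NULL;   //allocation failed
    graph->vertexCount = 0;    //a new graph has no vertices and no edges
    graph->edgeCount = 0;
    for (int i = 0; i < MaxVertex; ++i)    //remember to set every cell of the matrix to 0
        for (int j = 0; j < MaxVertex; ++j)
            graph->matrix[i][j] = 0;
    return graph;
}

int main(){
    Graph graph = create();   //create one graph here
}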

Then we can write functions for adding vertices and adding edges:

void addVertex(Graph graph, E element){
    if(graph->vertexCount >= MaxVertex) return;   //the graph is full
    graph->data[graph->vertexCount++] = element;   //store the new element
}

void addEdge(Graph graph, int a, int b){
    //adds an edge from vertex number a to vertex number b
    if(graph->matrix[a][b] == 0) {
        graph->matrix[a][b] = 1;  //note: for an undirected graph, both [a][b] and [b][a] must be set to 1
        graph->edgeCount++;
    }
}

Let’s try to construct this directed graph:

image-20220822104015728

Graph graph = create();
for (int c = 'A'; c <= 'D' ; ++c) 
    addVertex(graph, (char) c);
addEdge(graph, 0, 1);   //A -> B
addEdge(graph, 1, 2);   //B -> C
addEdge(graph, 2, 3);   //C -> D
addEdge(graph, 3, 0);   //D -> A
addEdge(graph, 2, 0);   //C -> A

Then we print this adjacency matrix to see whether it turns out as we expected:

void printGraph(Graph graph){
    for (int i = -1; i < graph->vertexCount; ++i) {
        for (int j = -1; j < graph->vertexCount; ++j) {
            if(i == -1 && j == -1)
                printf(" ");   //the top-left corner stays blank
            else if(j == -1)
                printf("%c", graph->data[i]);    //row label
            else if(i == -1)
                printf("%3c", graph->data[j]);   //column label
            else
                printf("%3d", graph->matrix[i][j]);
        }
        putchar('\n');
    }
}

Finally get:

image-20220830123847943

You can see that the result is exactly the same as the adjacency matrix we derived above. Of course, this only demonstrates an ordinary directed graph. With small changes to the code we can also turn it into an undirected graph or a weighted directed graph; we will not demonstrate that here.
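For readers who want to try that modification, here is one possible sketch of an undirected weighted variant. It is not the chapter's own code: it assumes `INT_MAX` is used as the "no edge" marker standing in for ∞, and the names `WeightedGraph`, `NO_EDGE`, `initWeighted`, and `addWeightedEdge` are invented for this example.

```c
#include <assert.h>
#include <limits.h>

#define MaxVertex 5
#define NO_EDGE INT_MAX   /* stands in for infinity: no edge between the two vertices */

typedef struct WeightedGraph {
    int vertexCount;
    int edgeCount;
    int matrix[MaxVertex][MaxVertex];
} WeightedGraph;

/* Prepares a graph with vertexCount vertices and no edges. */
void initWeighted(WeightedGraph *g, int vertexCount) {
    assert(vertexCount >= 0 && vertexCount <= MaxVertex);
    g->vertexCount = vertexCount;
    g->edgeCount = 0;
    for (int i = 0; i < MaxVertex; ++i)
        for (int j = 0; j < MaxVertex; ++j)
            g->matrix[i][j] = NO_EDGE;
}

/* Adds the undirected edge (a, b) with the given weight.
 * Returns 1 if the edge was added, 0 if it already existed. */
int addWeightedEdge(WeightedGraph *g, int a, int b, int weight) {
    assert(a >= 0 && a < g->vertexCount && b >= 0 && b < g->vertexCount);
    assert(weight != NO_EDGE);   /* the marker value cannot be used as a weight */
    if (g->matrix[a][b] != NO_EDGE) return 0;
    g->matrix[a][b] = weight;    /* undirected: store both directions */
    g->matrix[b][a] = weight;
    g->edgeCount++;
    return 1;
}
```

Dropping the `matrix[b][a]` assignment turns this into a weighted directed graph. The edge still counts once in `edgeCount` even though it occupies two cells.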

Adjacency list

Earlier we introduced the adjacency matrix, which lets a program store the edge information of a graph in a two-dimensional array. However, arrays have capacity limits (after so many lessons you should appreciate that an array needs one contiguous block of memory, which is troublesome to allocate and use). Also, once the adjacency matrix is created, its utilization is quite high if the graph has many edges (a dense graph), but for a graph with few edges (a sparse graph) most positions in the table hold 0 and are never actually used, which is very wasteful.

At this point, we can consider using a linked structure to solve this problem, like this:

image-20220830125309778

We create an array holding one head node per vertex, and each head node uses a linked list to record the vertices adjacent to it (it looks a lot like the hash table covered earlier). This also represents the connections of a graph, and it uses memory more efficiently. Of course, for an undirected graph, as before, each edge has to be stored on both sides:

image-20220830141940278

Let's try to implement it in code. First, the definitions:

#define MaxVertex 5

typedef char E;

typedef struct Node {   //ordinary nodes and head nodes are defined separately; an ordinary node records an adjacent vertex
    int nextVertex;
    struct Node * next;
} * Node;

struct HeadNode {   //a head node stores the vertex's element
    E element;
    struct Node * next;
};

typedef struct AdjacencyGraph {
    int vertexCount;   //number of vertices
    int edgeCount;     //number of edges
    struct HeadNode vertex[MaxVertex];
} * Graph;

Next is to initialize it:

Graph create(){
    //we could let the caller specify an initial vertex count; this version starts empty
    Graph graph = malloc(sizeof(struct AdjacencyGraph));
    graph->vertexCount = graph->edgeCount = 0;
    return graph;   //the head node array can be left alone for now
}

When adding edges and vertices, it's a little trickier:

void addVertex(Graph graph, E element){
    if(graph->vertexCount >= MaxVertex) return;   //same as before
    graph->vertex[graph->vertexCount].element = element;   //the head node is filled in when the vertex is added
    graph->vertex[graph->vertexCount].next = NULL;
    graph->vertexCount++;
}

void addEdge(Graph graph, int a, int b){
    Node node = graph->vertex[a].next;
    while (node) {   //a may already be connected to b, so check the existing list first
        if(node->nextVertex == b) return;   //already connected: nothing to do
        if(!node->next) break;   //no next node, so this is the last one
        node = node->next;       //otherwise keep walking
    }
    Node newNode = malloc(sizeof(struct Node));   //allocate only once we know the edge is new
    newNode->next = NULL;
    newNode->nextVertex = b;
    if(!node)
        graph->vertex[a].next = newNode;   //the head node has no successor yet: attach directly
    else
        node->next = newNode;   //otherwise append after the last node
    graph->edgeCount++;   //edge count +1
}

Let’s build it up, taking the picture above as an example:

image-20220822104015728

int main(){
    
    
    Graph graph = create();
    for (int c = 'A'; c <= 'D' ; ++c)
        addVertex(graph, (char) c);
    addEdge(graph, 0, 1);   //A -> B
    addEdge(graph, 1, 2);   //B -> C
    addEdge(graph, 2, 3);   //C -> D
    addEdge(graph, 3, 0);   //D -> A
    addEdge(graph, 2, 0);   //C -> A

    printGraph(graph);
}

Let's print it and see the effect:

void printGraph(Graph graph){
    
    
    for (int i = 0; i < graph->vertexCount; ++i) {
    
    
        printf("%d | %c", i, graph->vertex[i].element);
        Node node = graph->vertex[i].next;
        while (node) {
    
    
            printf(" -> %d", node->nextVertex);
            node = node->next;
        }
        putchar('\n');
    }
}

The result is as follows:

image-20220830132526621

We can see that the results are in line with our expectations.

However, although this method looks simpler and more efficient, it brings some trouble of its own. For example, in the adjacency list built above, we can only quickly tell which vertices a given vertex points to. In other words, we can quickly compute the out-degree of a vertex, but not its in-degree, which can only be obtained by going through all the nodes. So when representing a directed graph, lookups are not as convenient as with the adjacency matrix.

To solve this problem, we can build an inverse adjacency list, which records for each vertex the list of all vertices pointing to it:

image-20220830133244446

It is simply the adjacency list reversed. By maintaining both lists, the inconvenience can be alleviated to a certain extent.
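As a rough sketch of how an inverse adjacency list could be derived, the code below walks a forward adjacency list and, for every edge u -> v, records u in the list of v. To keep it self-contained it uses its own small `ListNode` type rather than the `Node` and `HeadNode` structs defined above; `push`, `buildInverse`, and `inDegree` are hypothetical helper names, and the allocated nodes are never freed.

```c
#include <assert.h>
#include <stdlib.h>

#define MaxVertex 5

struct ListNode {            /* one entry of an adjacency (or inverse adjacency) list */
    int vertex;
    struct ListNode *next;
};

/* Pushes `vertex` onto the front of a list and returns the new head. */
static struct ListNode *push(struct ListNode *head, int vertex) {
    struct ListNode *node = malloc(sizeof(struct ListNode));
    if (node == NULL) abort();   /* keep the sketch simple: give up if out of memory */
    node->vertex = vertex;
    node->next = head;
    return node;
}

/* Fills inverse[] so that inverse[v] lists every u that has an edge u -> v. */
void buildInverse(struct ListNode *forward[], struct ListNode *inverse[], int n) {
    assert(n >= 0 && n <= MaxVertex);
    for (int v = 0; v < n; ++v) inverse[v] = NULL;
    for (int u = 0; u < n; ++u) {
        for (struct ListNode *p = forward[u]; p != NULL; p = p->next) {
            assert(p->vertex >= 0 && p->vertex < n);
            inverse[p->vertex] = push(inverse[p->vertex], u);
        }
    }
}

/* With the inverse list, the in-degree of v is just the length of inverse[v]. */
int inDegree(struct ListNode *inverse[], int v) {
    int count = 0;
    for (struct ListNode *p = inverse[v]; p != NULL; p = p->next) count++;
    return count;
}
```

Building the inverse list costs one pass over all edges; afterwards the in-degree of a vertex is found by walking only its own list.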

Graph exercises:

  1. In a directed graph with n vertices, if the sum of out-degrees of all vertices is s, what is the sum of in-degrees of all vertices?

    A. s B. s - 1 C. s + 1 D. 2s

    The sum of the out-degrees in a directed graph is simply the total number of edges leaving all the vertices. Every edge leaves exactly one vertex (adding 1 to that vertex's out-degree) and enters exactly one vertex (adding 1 to that vertex's in-degree), so each edge is counted once in both sums. The sum of the in-degrees of all vertices therefore equals the sum of the out-degrees, s. Choose A.

  2. In an undirected complete graph with n vertices, what is the number of edges?

    A. n B. n(n-1) C. n(n - 1)/2 D. n(n + 1)/2

    First, recall the definition of an undirected complete graph: an undirected graph in which every pair of vertices is connected by an edge. Each vertex therefore has n - 1 edges attached to it, giving $n \times (n-1)$ in total. But an undirected edge has no direction, so every edge has been counted twice (once from each endpoint), and we halve the total to get $\frac{n \times (n-1)}{2}$. Choose C.

  3. To connect n vertices into a connected graph, at least how many edges are needed?

    A. n B. n - 1 C. n + 1 D. 2n

    A connected graph requires a path between every pair of vertices, so we only need the sparsest arrangement that still connects every vertex, namely a straight line (or a tree), which uses n - 1 edges. Choose B.

  4. For an undirected graph with n vertices and e edges, how many edge nodes are contained in its corresponding adjacency list?

    A. by B. by C. by D. 2e

    In the adjacency list of an undirected graph, every edge is stored twice (once in the list of each endpoint), so the number of edge nodes is twice the number of edges. For a directed graph it is exactly the number of edges. So choose D.
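The counting arguments used in exercises 1, 2, and 4 can be written compactly as follows. The notation is an assumption of this summary rather than something defined earlier: $\deg^{+}(v)$ is the out-degree of $v$, $\deg^{-}(v)$ its in-degree, and $\deg(v)$ its degree in an undirected graph.

```latex
\begin{aligned}
\sum_{v \in V} \deg^{+}(v) &= \sum_{v \in V} \deg^{-}(v) = |E|
  && \text{(directed graph, exercise 1)} \\
|E| &= \frac{n \times (n-1)}{2}
  && \text{(undirected complete graph with } n \text{ vertices, exercise 2)} \\
\sum_{v \in V} \deg(v) &= 2e
  && \text{(undirected graph with } e \text{ edges, exercise 4)}
\end{aligned}
```

The last line is why the adjacency list of an undirected graph holds 2e edge nodes: the list of each vertex has exactly $\deg(v)$ entries.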


Graph traversal

I remember that every time I went to a bookstore when I was a kid, I could see the maze book:

image-20220831141620073

Every time I saw it I wanted to buy one, but back then my family could not afford such an expensive book, so I could only look at it a little longer in the bookstore before going home. Solving a maze means finding a path from the starting point to the end point in a complicated map. Starting from the entrance, each intersection may have several forks: some lead to a dead end, and some lead on to the next intersection.

So how does our human brain find the correct path?

image-20220831142540478

We will first start from the starting point, and we will try to go in every direction of the forked road. If we encounter a dead end, then we will go back to the previous intersection and try other directions until we can go all the way down. After repeating the above operations, we will definitely be able to reach the exit of the maze in the end.

Graph search is actually similar to a maze. We need to start from a certain vertex of the graph to find the position of the corresponding vertex in the graph. In this part, we will discuss the graph search algorithm.

image-20220831144250794

Depth First Search (DFS)

In the process of learning binary trees, we explained the pre-order traversal of the tree. Think about it, how did we traverse it at that time?

image-20220814145531577

Preorder traversal charges straight ahead: it goes as deep as it can, then comes back and takes the other branches. We can treat a graph the same way. We keep going forward; when we hit a dead end we go back and try another direction; and if no direction is left, we go back to the previous intersection and continue from there (this is exactly how our brains solve the maze). If we keep searching this way, we are bound to find the target.

For example, now we need to start from A to find I in the picture below:

image-20220831145024885

Then our route can be like this:

image-20220831145204170

Vertex B now offers three directions, so we pick any one of them first and take a look (in general, for the sake of consistency, it is better to go in alphabetical order; for this demonstration we simply pick one):

image-20220831145313492

When we come to K at this time, we find that K is already a dead end and there is no other way. So at this time we need to go back to the previous intersection and continue to explore other paths:

image-20220831145530501

At this time, we then go to the next adjacent vertex G and find that G has other branches, then we continue to move forward:

image-20220831145910420

When we reach F we find that it is another dead end, so we go back to G and try another direction:

image-20220831150008288

Out of luck: we have hit a dead end again. As before, we return to G and continue in another direction:

image-20220831150236884

After reaching C there are more roads to take, so we keep moving forward:

image-20220831150354010

When we reach vertex H, we find that the only way onward from H leads to vertex B, which we have already passed. We cannot go forward from here, so we go straight back to C and take the other side:

image-20220831150617828

When you come to E, there are two more roads, so continue to choose one:

image-20220831150820472

When we reach vertex J we find that it is a dead end again, so we go back to E and continue on the other side:

image-20220831150913443

Okay, after so many trials and errors, I finally found the I vertex. This method is depth-first search.

So let’s play with some code. Here we build a simpler graph:

image-20220831152924911

Here we use the adjacency list to represent the graph. Because the adjacency list stores each vertex's neighbors directly, iterating over the neighbors of the current vertex is faster, and the whole traversal runs in linear time, $O(V + E)$. With an adjacency matrix we would have to scan the entire two-dimensional array, which takes more time: quadratic, $O(V^2)$.
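For comparison, here is a minimal sketch of the same depth-first idea running over an adjacency matrix instead. It shows where the quadratic cost comes from: every visited vertex has to scan its whole row. The names `visitRow` and `dfsMatrix` and the plain `int` matrix are assumptions of this example, not the chapter's implementation.

```c
#include <assert.h>

#define MaxVertex 6

/* Visits v, then recurses into every unvisited vertex that v has an edge to. */
static void visitRow(int matrix[MaxVertex][MaxVertex], int n, int v,
                     int *visited, int *order, int *count) {
    visited[v] = 1;
    order[(*count)++] = v;
    for (int w = 0; w < n; ++w)   /* the whole row is scanned: n checks per vertex */
        if (matrix[v][w] && !visited[w])
            visitRow(matrix, n, w, visited, order, count);
}

/* Depth-first traversal from `start`. Writes the visiting order into order[]
 * (room for n ints) and returns how many vertices were reached. */
int dfsMatrix(int matrix[MaxVertex][MaxVertex], int n, int start, int *order) {
    assert(n > 0 && n <= MaxVertex && start >= 0 && start < n);
    int visited[MaxVertex] = {0};
    int count = 0;
    visitRow(matrix, n, start, visited, order, &count);
    return count;
}
```

With V vertices, each call loops V times regardless of how many edges the vertex really has, so the total work is proportional to V × V even for a sparse graph.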

For example, suppose we want to search for vertex F starting from A. First, build the graph (note that there are now 6 vertices, so remember to raise the capacity MaxVertex to at least 6):

int main(){
    
    
    Graph graph = create();
    for (int c = 'A'; c <= 'F' ; ++c)
        addVertex(graph, (char) c);
    addEdge(graph, 0, 1);   //A -> B
    addEdge(graph, 1, 2);   //B -> C
    addEdge(graph, 1, 3);   //B -> D
    addEdge(graph, 1, 4);   //B -> E
    addEdge(graph, 4, 5);   //E -> F

    printGraph(graph);
}

image-20220831154358394

Then there is our depth-first search algorithm:

/**
 * Depth-first search
 * @param graph the graph
 * @param startVertex index of the starting vertex
 * @param targetVertex index of the target vertex
 * @param visited array marking the vertices already reached
 */
void dfs(Graph graph, int startVertex, int targetVertex, int * visited){

}

Let’s first write out the depth-first traversal:

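/**
 * Depth-first traversal (works for both undirected and directed graphs)
 * @param graph the graph
 * @param startVertex index of the starting vertex
 * @param targetVertex index of the target vertex
 * @param visited array marking the vertices already reached
 */
void dfs(Graph graph, int startVertex, int targetVertex, int * visited) {
    visited[startVertex] = 1;   //always remember to mark a vertex once we have been there
    printf("%c -> ", graph->vertex[startVertex].element);   //print the current vertex
    Node node = graph->vertex[startVertex].next;   //walk through all branches of the current vertex
    while (node) {
        if(!visited[node->nextVertex])   //skip vertices already reached (via another branch, or the way we came)
            dfs(graph, node->nextVertex, targetVertex, visited);  //not reached yet: go deeper, with the branch's vertex as the new startVertex
        node = node->next;
    }
}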

int main(){
    
    
    ...

    int arr[graph->vertexCount];
    for (int i = 0; i < graph->vertexCount; ++i) arr[i] = 0;
    dfs(graph, 0, 5, arr);
}

The results of depth-first traversal are as follows:

image-20220831163728799

The route is as follows:

image-20220831163909522

Now we add the check for the vertex we are looking for:

/**
 * Depth-first search
 * @param graph the graph
 * @param startVertex index of the starting vertex
 * @param targetVertex index of the target vertex
 * @param visited array marking the vertices already reached
 * @return the search result: 1 if found, 0 if not
 */
_Bool dfs(Graph graph, int startVertex, int targetVertex, int * visited) {
    visited[startVertex] = 1;
    printf("%c -> ", graph->vertex[startVertex].element);
    if(startVertex == targetVertex) return 1;   //the current vertex is the one we want: return right away
    Node node = graph->vertex[startVertex].next;
    while (node) {
        if(!visited[node->nextVertex])
            if(dfs(graph, node->nextVertex, targetVertex, visited))  //found it: return 1 without looking at the other branches
                return 1;
        node = node->next;
    }
    return 0;   //the loop finished without finding it: return 0
}

int main(){
    
    
    ...

    int arr[graph->vertexCount];
    for (int i = 0; i < graph->vertexCount; ++i) arr[i] = 0;
    printf("\n%d", dfs(graph, 0, 5, arr));
}

The result is as follows:

image-20220831164615659

Let’s find vertex D again:

image-20220831164641467

You can see that it stops after reaching D because it has been found. So what if we want to find a node that is not connected to the graph?

image-20220831164739301

It can be seen that the entire graph was traversed according to depth first and was not found after searching.
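In the maze analogy we usually want the route itself, not just a yes or no. One way this could be done, shown here as a sketch rather than as part of the chapter's code, is to keep the current route in an array while the depth-first search runs. To stay self-contained the example works on a plain 0/1 adjacency matrix; `findPath` and `dfsPath` are hypothetical names.

```c
#include <assert.h>

#define MaxVertex 6

/* Tries to extend the route path[0..depth-1] through `current` to `target`.
 * Returns the number of vertices on the route found, or 0 if there is none. */
static int findPath(int adj[MaxVertex][MaxVertex], int n, int current, int target,
                    int *visited, int *path, int depth) {
    visited[current] = 1;
    path[depth] = current;                    /* tentatively put current on the route */
    if (current == target) return depth + 1;
    for (int w = 0; w < n; ++w) {
        if (adj[current][w] && !visited[w]) {
            int len = findPath(adj, n, w, target, visited, path, depth + 1);
            if (len) return len;              /* the target was found down this branch */
        }
    }
    return 0;   /* dead end: the caller's next attempt overwrites this slot of path */
}

/* Writes a route from start to target into path[] (room for n ints) and
 * returns its length in vertices, or 0 if the target cannot be reached. */
int dfsPath(int adj[MaxVertex][MaxVertex], int n, int start, int target, int *path) {
    assert(n > 0 && n <= MaxVertex);
    assert(start >= 0 && start < n && target >= 0 && target < n);
    int visited[MaxVertex] = {0};
    return findPath(adj, n, start, target, visited, path, 0);
}
```

The route found this way is a valid path, but depth-first search gives no guarantee that it is the shortest one.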

Breadth First Search (BFS)

We introduced depth-first search earlier; now let's look at another approach. Remember the level-order traversal we learned for binary trees?

image-20220831165617419

Level-order traversal visits one layer at a time, rather than charging straight ahead like preorder traversal. Graph search can use the same approach: we first explore all the branches of a vertex, and then look in turn at all the branches of each of those branches:

image-20220831170114857

First, we still go from A to B. At this time, B has three bifurcated roads. We visit the vertices of these three roads in turn:

image-20220831172011576

Let's first record these three vertices, which again is done with a queue: H, G, K

Be careful not to continue downward right after visiting them. Next we start from H, the first of the three, and proceed in the same way:

image-20220831172153888

There is only one branch here, leading to C, so we record C by adding it to the queue: G, K, C

Note that we now need to go back and look at G, the second of the three vertices from before:

image-20220831172312762

C has already been seen, so we only find F and D, and record them as well: K, C, F, D

Then, we continue to look at the last of the three previous nodes:

image-20220831172726616

K is a dead end, so we are done with it and move on to the next vertex, C:

image-20220831172941671

We record E, giving F, D, E. Then we look at F and D, which have nothing further, so in the end only E is left:

image-20220831173224689

We have successfully found the target vertex I. Breadth-first traversal widens the scope as much as possible and explores broadly instead of clinging to a single path. It is just like love: if you really cannot win her over, let it go. She never loved you from start to finish. Don't keep wasting your feelings on her; make more new friends, and I believe you will meet someone better.

Following this idea, let's implement it in code. First, bring over the queue we wrote earlier:

typedef int T;   //the queue elements are vertex indices

struct QueueNode {
    
    
    T element;
    struct QueueNode * next;
};

typedef struct QueueNode * QNode;

struct Queue{
    
    
    QNode front, rear;
};

typedef struct Queue * LinkedQueue;

_Bool initQueue(LinkedQueue queue){
    QNode node = malloc(sizeof(struct QueueNode));
    if(node == NULL) return 0;
    node->next = NULL;   //the head node has no successor yet
    queue->front = queue->rear = node;
    return 1;
}

_Bool offerQueue(LinkedQueue queue, T element){
    QNode node = malloc(sizeof(struct QueueNode));
    if(node == NULL) return 0;
    node->element = element;
    node->next = NULL;   //the new node becomes the tail
    queue->rear->next = node;
    queue->rear = node;
    return 1;
}

_Bool isEmpty(LinkedQueue queue){
    
    
    return queue->front == queue->rear;
}

T pollQueue(LinkedQueue queue){
    
    
    T e = queue->front->next->element;
    QNode node = queue->front->next;
    queue->front->next = queue->front->next->next;
    if(queue->rear == node) queue->rear = queue->front;
    free(node);
    return e;
}

Let’s take the picture above as an example:

image-20220831152924911

/**
 * Breadth-first traversal
 * @param graph the graph
 * @param startVertex index of the starting vertex
 * @param targetVertex index of the target vertex
 * @param visited array marking the vertices already reached
 * @param queue auxiliary queue
 */
void bfs(Graph graph, int startVertex, int targetVertex, int * visited, LinkedQueue queue) {
    offerQueue(queue, startVertex);  //first put the starting vertex into the queue
    visited[startVertex] = 1;   //mark the starting vertex as visited
    while (!isEmpty(queue)) {
        int next = pollQueue(queue);
        printf("%c -> ", graph->vertex[next].element);  //take the next vertex out of the queue and print it
        Node node = graph->vertex[next].next;    //as before, walk through every branch
        while (node) {
            if(!visited[node->nextVertex]) {   //not visited yet: enqueue it directly
                offerQueue(queue, node->nextVertex);
                visited[node->nextVertex] = 1;   //it must be marked as soon as it is enqueued
            }
            node = node->next;
        }
    }
}

Let’s test it out:

int main(){
    
    
  	...
      
    int arr[graph->vertexCount];
    struct Queue queue;
    initQueue(&queue);
    for (int i = 0; i < graph->vertexCount; ++i) arr[i] = 0;
    bfs(graph, 0, 5, arr, &queue);
}

Successfully got the result:

image-20220831184445728

If we want to search for a specific vertex, it is just as simple:

_Bool bfs(Graph graph, int startVertex, int targetVertex, int * visited, LinkedQueue queue) {
    if(startVertex == targetVertex) return 1;   //the starting vertex itself is the target
    offerQueue(queue, startVertex);
    visited[startVertex] = 1;
    while (!isEmpty(queue)) {
        int next = pollQueue(queue);
        printf("%c -> ", graph->vertex[next].element);
        Node node = graph->vertex[next].next;
        while (node) {
            if(node->nextVertex == targetVertex) return 1;   //this is the vertex we are looking for: return 1
            if(!visited[node->nextVertex]) {
                offerQueue(queue, node->nextVertex);
                visited[node->nextVertex] = 1;
            }
            node = node->next;
        }
    }
    return 0;   //searched everything without finding it: return 0
}

In this way, we achieve breadth-first search of the graph.

Graph exercises:

  1. If the edge set of a graph is: {(A, B),(A, C),(B, D),(C, F),(D, E),(D, F)}, and a depth-first search is performed on the graph, the resulting vertex sequence may be:

    A. ABCFDE B. ACFDEB C. ABDCFE D. ABDFEC

    For this kind of question, draw the graph directly. Because the edges are written in parentheses, it must be an undirected graph. Draw it first and then reason about it:

    image-20220902112113153

    All four options start with A, so we start from A. A is connected to B and C, so A can be followed by either B or C. Consider B first: B's only other neighbor is D, so after A, B the search must go to D, which excludes option A. Next, D is connected to E and F but not to C, which excludes option C. That leaves option D among those starting with AB. After D it goes to F, but from F the only unvisited neighbor is C, so C must come next rather than E, and option D fails as well. So we choose B (if you want to be sure, trace option B too: A, C, F, D, E, then back to D and on to B).

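  2. If the edge set of a graph is: {(A, B),(A, C),(B, D),(C, F),(D, E),(D, F)}, and a breadth-first search is performed on the graph, the resulting vertex sequence may be:

    A. ABCDEF B. ABCFDE C. ABDCEF D. ACBFDE

    It is the same idea as above. BFS visits A, then both of A's neighbors B and C (in either order), then their unvisited neighbors D and F in the matching order, and E last. Only ACBFDE fits, so choose D. As long as you understand the ideas of BFS and DFS, there will be no problem.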

  3. For the undirected connected graph shown in the figure below, a breadth-first traversal of the graph is performed starting from vertex A. The resulting vertex sequence may be:

    image-20220902110829087

    With the same idea, choose D
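The answers to exercises 1 and 2 can also be checked by machine. The following is only a small sketch, not the traversal code from this chapter: it stores the exercise's edge set in an adjacency matrix (A to F mapped to 0 to 5), and the `pref` parameter is a hypothetical addition that fixes the order in which neighbors are tried, since the exercises allow any order.

```c
#define V 6   /* vertices A..F mapped to indices 0..5 */

/* Edge set from the exercises: (A,B),(A,C),(B,D),(C,F),(D,E),(D,F) */
static const int adj[V][V] = {
    /*        A  B  C  D  E  F */
    /* A */ { 0, 1, 1, 0, 0, 0 },
    /* B */ { 1, 0, 0, 1, 0, 0 },
    /* C */ { 1, 0, 0, 0, 0, 1 },
    /* D */ { 0, 1, 0, 0, 1, 1 },
    /* E */ { 0, 0, 0, 1, 0, 0 },
    /* F */ { 0, 0, 1, 1, 0, 0 },
};

/* Recursive DFS step; neighbors are tried in the order given by pref. */
static void dfsVisit(int v, const int *pref, int *visited, char *out, int *len) {
    visited[v] = 1;
    out[(*len)++] = (char)('A' + v);
    for (int i = 0; i < V; ++i) {
        int w = pref[i];
        if (adj[v][w] && !visited[w]) dfsVisit(w, pref, visited, out, len);
    }
}

/* Writes the DFS visiting order from `start` into out (needs V + 1 chars). */
void dfsOrder(int start, const int *pref, char *out) {
    int visited[V] = {0}, len = 0;
    dfsVisit(start, pref, visited, out, &len);
    out[len] = '\0';
}

/* Writes the BFS visiting order from `start` into out (needs V + 1 chars). */
void bfsOrder(int start, const int *pref, char *out) {
    int visited[V] = {0}, queue[V], head = 0, tail = 0, len = 0;
    queue[tail++] = start;
    visited[start] = 1;
    while (head < tail) {
        int v = queue[head++];
        out[len++] = (char)('A' + v);
        for (int i = 0; i < V; ++i) {
            int w = pref[i];
            if (adj[v][w] && !visited[w]) {
                visited[w] = 1;      /* mark at enqueue time */
                queue[tail++] = w;
            }
        }
    }
    out[len] = '\0';
}
```

With `pref = {5, 4, 3, 2, 1, 0}` (neighbors tried in reverse alphabetical order), `dfsOrder(0, pref, out)` produces ACFDEB, which is option B of exercise 1, and `bfsOrder(0, pref, out)` produces ACBFDE, which is option D of exercise 2.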


graph application

Earlier we introduced the properties of graphs and the ways to traverse them. In this part, we look at some applications of graphs.

Spanning tree and minimum spanning tree

Before we start to explain the minimum spanning tree, let us first review the connected components explained before.

  • For an undirected graph, if any two vertices in the graph are connected, then we call the graph a connected graph.
  • For a directed graph, if for any two vertices A and B there is both a path from A to B and a path from B to A, then the directed graph is said to be a strongly connected graph.

A connected component must be a subgraph of some graph (a subgraph may contain only some of the vertices and edges of the original graph, or it may be the original graph itself, because the definition only requires a subset, not a proper subset), and the subgraph must itself be connected. Another important condition is that it must be maximal: it contains as many vertices as possible while remaining connected, together with all the edges attached to those vertices (the emphasis is on the maximum number of vertices). We call such a subgraph a maximal connected subgraph.

  • The maximal connected subgraph of an undirected graph is called a connected component.
  • The maximal strongly connected subgraph of a directed graph is called a strongly connected component.

For example, the following directed graph is not connected:

image-20220903101036333

Among them, Figure 1 and Figure 2 both meet the above conditions and are strongly connected components: each is strongly connected and already maximal (adding any other vertex or edge would break strong connectivity). Figure 3, however, is missing the edge from A to B and is not strongly connected, so it is not a strongly connected component.

Another example is the following undirected graph, which itself is also disconnected:

image-20220822214010526

Among them, Figure 1 and Figure 3 both meet the conditions and are connected components. However, Figure 2 does not reach the maximum number of vertices and edges, so it is not a connected component.

Of course, the above are all situations where the original graph is not connected. If the original graph is a connected graph, the subgraph containing all its vertices and edges already meets the conditions, so the graph itself is a connected component; similarly, if the original graph is a strongly connected graph, it is itself a strongly connected component.

Summarized as follows:

  • If the original graph itself is not connected, then it has more than one connected component (strongly connected component).
  • If the original graph itself is connected, then its connected component (strongly connected component) is itself.
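As a small illustration of this summary, the sketch below counts the connected components of an undirected graph by starting a depth-first search from every vertex that has not been labeled yet. The adjacency-matrix representation, `MAX_V`, and the function names are assumptions made for this example; they are not the structures used earlier in the chapter.

```c
#define MAX_V 16   /* assumed upper bound on the number of vertices */

/* Labels every vertex reachable from v with the component number id. */
static void label(int n, int adj[][MAX_V], int v, int id, int *comp) {
    comp[v] = id;
    for (int w = 0; w < n; ++w)
        if (adj[v][w] && comp[w] < 0) label(n, adj, w, id, comp);
}

/*
 * Returns the number of connected components of an undirected graph.
 * comp[v] receives the index of the component that contains vertex v.
 */
int countComponents(int n, int adj[][MAX_V], int *comp) {
    int count = 0;
    for (int v = 0; v < n; ++v) comp[v] = -1;   /* -1 means "not labeled yet" */
    for (int v = 0; v < n; ++v)
        if (comp[v] < 0) label(n, adj, v, count++, comp);
    return count;
}
```

A return value of 1 means the graph is connected and its only connected component is the graph itself; a larger value means the graph is disconnected.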

We have finished reviewing maximal connected subgraphs, so let's discuss minimal connected subgraphs. "Minimal" here mainly refers to the number of edges. It must still be a connected subgraph of the original graph, but now it is required to have the largest number of vertices and the smallest number of edges, which means that removing any edge would disconnect the graph (think of it as a maximal connected subgraph with as many edges removed as possible).

For minimal connected subgraphs, we generally only discuss undirected graphs (for directed graphs there is no notion of a minimal strongly connected subgraph, since our main topic is spanning trees). We again analyze two cases separately: the original graph is connected, or it is not. First, the case where the original graph is connected:

image-20220901180909877

The original graph itself is a connected graph, so its maximal connected subgraph is itself. We now remove as many "unnecessary" edges as possible while keeping it connected, which gives a minimal connected subgraph. You can see that the two pictures on the right contain the same number of vertices as the picture on the left, but with fewer edges, and removing any further edge would disconnect them. So the two pictures on the right are both minimal connected subgraphs of the picture on the left (as before, there may be multiple solutions; the minimal connected subgraph is not unique).

We find that no matter which edges are removed, exactly N-1 edges remain in the end (where N is the number of vertices), and there is one and only one path between any two vertices. A minimal connected subgraph that contains all N vertices of the original graph is generally called a spanning tree. Why is it called a spanning tree? Because the number of nodes and edges exactly meets the definition of a tree (and there is no loop), we can rearrange it into a tree:

image-20220903103444346

Of course, this is the case when the original graph is connected. If the original graph is not connected, it has multiple connected components, and we obtain a spanning forest instead. The number of trees in the forest equals the number of connected components.

So how can we get a spanning tree of a graph in a program? We can use the two graph traversal methods explained earlier. Let's take the following figure as an example. This is an ordinary undirected connected graph:

image-20220903111255127

If we follow the depth-first traversal method and start from G, we will get the following order:

image-20220903112122707

Following the sequence we can get a spanning tree:

image-20220903112332571

Although it looks strange, this is the tree we get from that order. Because depth-first search never goes back to a vertex it has already visited, the loops and redundant edges are effectively removed, and the final result of the traversal is a spanning tree.

Similarly, let's take a look at what results we will get if we use breadth-first traversal?

image-20220903112733812

The final spanning tree is:

image-20220903113108162

In fact, we found that the spanning tree obtained under breadth-first traversal is also arranged according to each layer, which is very clear. Of course, because the order of depth-first traversal and breadth-first traversal itself is not unique, the resulting spanning tree is not unique either.
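To make the idea concrete, here is a minimal sketch of building a spanning tree with breadth-first traversal. It assumes an adjacency-matrix graph rather than the adjacency list used earlier, and `MAX_V`, `bfsSpanningTree`, and the `parent` array are names invented for this example. Each time a vertex is reached for the first time, the edge used to reach it becomes a tree edge.

```c
#define MAX_V 16   /* assumed upper bound on the number of vertices */

/*
 * Builds a BFS spanning tree of an undirected graph, starting at root.
 * parent[v] receives the tree parent of v (-1 for the root and for any
 * vertex that cannot be reached). Returns the number of tree edges,
 * which is n - 1 exactly when the graph is connected.
 */
int bfsSpanningTree(int n, int adj[][MAX_V], int root, int *parent) {
    int visited[MAX_V] = {0}, queue[MAX_V];
    int head = 0, tail = 0, edges = 0;
    for (int v = 0; v < n; ++v) parent[v] = -1;
    queue[tail++] = root;
    visited[root] = 1;
    while (head < tail) {
        int v = queue[head++];
        for (int w = 0; w < n; ++w) {
            if (adj[v][w] && !visited[w]) {
                visited[w] = 1;
                parent[w] = v;       /* edge (v, w) joins the spanning tree */
                queue[tail++] = w;
                ++edges;
            }
        }
    }
    return edges;
}
```

Edges that lead to an already visited vertex are simply skipped, which is exactly how the loops disappear. Replacing the queue with recursion would give the depth-first version.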

Now that we have covered spanning trees, let's discuss the minimum spanning tree. What does "minimum" mean here? If we add weights to the edges of an undirected graph (making it a network) and require that the sum of the weights of the spanning tree's edges be as small as possible, we call the resulting tree a minimum spanning tree (note that the minimum spanning tree is not unique, because several different trees may share the same smallest total). For example, the following is a final minimum spanning tree:

image-20220903113954010

There are two algorithms for constructing a minimum spanning tree, one is Prim's algorithm and the other is Kruskal's algorithm. Let's discuss the first one first:

Let’s take the following picture as an example:

image-20220903142138573

The core of Prim's algorithm is to start from any vertex and keep growing a single tree, each time extending along the cheapest available edge. For example, we start from vertex A:

A has edges to B and E, so there are two directions in which to extend. We simply choose the one with the smaller weight:

image-20220903142208537

At this point we have a tree consisting of A and E. In the same way, we look at all the vertices connected to A or E in the current tree, namely B, G, and H, and extend along whichever edge has the smallest weight. Here the edge between H and E is the smallest, so we extend along it:

image-20220903142245688

Now the tree consists of A, E, and H. Following the same idea, we keep extending along the cheapest edge:

image-20220903142413604

Continuing, the smallest edge is now the one between F and K:

image-20220903142558882

At this point the edges K-B, K-D, and K-H all have weight 4. H is already in the tree, and adding K-H would create a loop, so it is not considered. We can choose either K-B or K-D; the choice does not affect the subsequent result:

image-20220903142829606

The edge K-D is still the smallest at this point, so we choose it directly:

image-20220903142917096

Next, the minimum weight rises to 5. The edges with weight 5 are B-E, H-I, and B-D. Because E and D are already in the tree, B-E and B-D would create loops, so we choose H-I:

image-20220903143057702

Then we find that the edge I-G also has weight 5, so we select it directly:

image-20220903143509563

The minimum weight is now 6. We can choose H-J or I-J; either one works:

image-20220903143532060

At this point, all the vertices of the entire graph have been traversed. Now we remove the unused edges, and the result is our minimum spanning tree:

image-20220903143645249

Although it looks a bit untidy, it is fine once straightened out. The omitted edges are either the heaviest ones or the ones that would create a loop, while the remaining edges are mostly those with small weights, and the result is the minimum spanning tree. (In an exam, reasoning this way will give the right answer, but look carefully and do not overlook any edge, otherwise the result will be wrong.)
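The growth process can be condensed into a short sketch. This is an illustrative version on an adjacency matrix, not code from this chapter; `MAX_V`, `INF` (meaning "no edge"), and the function signature are assumptions. The array `cost[v]` holds the cheapest edge that currently connects vertex v to the tree, which is the "smallest direction to extend" from the walkthrough.

```c
#define MAX_V 16           /* assumed upper bound on the number of vertices */
#define INF 1000000000     /* marks "no edge" in the weight matrix */

/*
 * Prim's algorithm on an adjacency matrix w (w[i][j] == INF: no edge).
 * parent[v] receives the tree vertex that v was attached to (-1 for start).
 * Returns the total weight of the minimum spanning tree, or -1 if the
 * graph is not connected.
 */
int prim(int n, int w[][MAX_V], int start, int *parent) {
    int inTree[MAX_V] = {0}, cost[MAX_V], total = 0;
    for (int v = 0; v < n; ++v) { cost[v] = INF; parent[v] = -1; }
    cost[start] = 0;
    for (int round = 0; round < n; ++round) {
        /* pick the vertex outside the tree with the cheapest connecting edge */
        int best = -1;
        for (int v = 0; v < n; ++v)
            if (!inTree[v] && (best < 0 || cost[v] < cost[best])) best = v;
        if (cost[best] == INF) return -1;   /* nothing left is reachable */
        inTree[best] = 1;
        total += cost[best];
        /* the new tree vertex may offer cheaper edges to the outside */
        for (int v = 0; v < n; ++v)
            if (!inTree[v] && w[best][v] < cost[v]) {
                cost[v] = w[best][v];
                parent[v] = best;
            }
    }
    return total;
}
```

Vertices already in the tree are never considered again, so no loop can appear, matching the rule used above for edges such as K-H.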

Let's look at the other one next, Kruskal's algorithm. Its core idea is to actively pick the smallest edges, rather than passively extending from the current tree as above.

At the beginning, we remove all the edges and then select them back one by one (note that any edge can be selected, not only those next to already selected vertices; several separate trees may appear during the process, but they are eventually joined into one), finally forming a minimum spanning tree. Nothing is selected at the start, and we mark the selected edges in orange:

image-20220903144403449

First, we find the smallest edge, K-F. Its weight is 2, so we select it directly:

image-20220903144533239

Next is the edge F-H, with a weight of 3, which is currently the smallest:

image-20220903144828106

The minimum weight is now 4, and there are 4 edges with that weight. The edge K-H cannot be chosen, because K and H are already in the same tree and it would create a loop. The other three are all fine, so we pick any one:

image-20220903145239074

Continue to select the edge with weight 4:

image-20220903145321395

At this point the weight has reached 5, so we can choose any edge with a weight of 5, as long as it does not create a loop:

image-20220903145431925

Once G and I are connected, two separate trees appear. That is fine; they will eventually be joined into one tree. We continue selecting the other edges with a weight of 5:

image-20220903145551091

Here we select the edge A-E and then the edge H-I. Although H and I are both already in a tree, they belong to different trees, so connecting them is allowed. We then move on to the edges with a weight of 6:

image-20220903145828812

We can choose either I-J or H-J (the minimum spanning tree is not unique). All the vertices are now connected, so the minimum spanning tree is complete, and we discard all the edges that were not selected:

image-20220903143645249

In fact, whichever algorithm is used, we end up with a minimum spanning tree. A full implementation on the graph structure used in this chapter is fairly involved and is not written out here.
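Even so, the core of Kruskal's algorithm fits in a short sketch if we work on a plain edge list. Everything here is an assumption made for illustration: the `Edge` struct, `MAX_V`, and the use of a union-find (disjoint-set) array to answer the question "are these two vertices already in the same tree?", which is the loop check used in the walkthrough.

```c
#include <stdlib.h>

#define MAX_V 16   /* assumed upper bound on the number of vertices */

typedef struct { int u, v, weight; } Edge;

/* qsort comparator: ascending by weight */
static int cmpEdge(const void *a, const void *b) {
    const Edge *x = a, *y = b;
    return (x->weight > y->weight) - (x->weight < y->weight);
}

/* Finds the representative of the tree containing x (with path halving). */
static int findRoot(int *parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/*
 * Kruskal's algorithm. Sorts edges in place, copies the selected edges into
 * picked (room for n - 1 entries) and returns how many were selected;
 * fewer than n - 1 means the graph is not connected.
 */
int kruskal(int n, Edge *edges, int edgeCount, Edge *picked) {
    int parent[MAX_V], count = 0;
    for (int i = 0; i < n; ++i) parent[i] = i;   /* every vertex is its own tree */
    qsort(edges, (size_t)edgeCount, sizeof(Edge), cmpEdge);
    for (int i = 0; i < edgeCount && count < n - 1; ++i) {
        int a = findRoot(parent, edges[i].u);
        int b = findRoot(parent, edges[i].v);
        if (a == b) continue;    /* same tree: this edge would create a loop */
        parent[a] = b;           /* different trees: join them */
        picked[count++] = edges[i];
    }
    return count;
}
```

The case of H-I in the walkthrough corresponds to `a != b`: both endpoints are already in a tree, but in different ones, so the edge is accepted.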

shortest path problem

We introduced the minimum spanning tree earlier. Through two algorithms, we can select the smallest possible edges from many edges to get a tree with the smallest weight. In this section, we will continue to discuss issues related to minimum overhead.

image-20220903150609366

The subway lines are complicated. If we want to ride from one station to another, there are actually many options: we can choose a route with fewer transfers or one with a shorter distance. Different routes may pass through different numbers of stations, but when we exit, we are always charged according to the minimum number of stations between point A and point B (for example, if there are two routes from A to B, one with 11 stations and the other with 7, we are charged for only 7 stations). So how do we calculate the shortest path with so many lines?

We first discuss the simplest case, the single-source shortest path. The single-source shortest path problem asks for the shortest paths from one vertex to all other vertices, as in the following picture:

image-20220903153802247

To solve this problem, we can use Dijkstra's algorithm. Let's see how Dijkstra's algorithm lets the computer calculate the shortest paths. It has many similarities to Prim's algorithm for finding the minimum spanning tree. We start from A, and we need a table to record the results:

image-20220903195351496

The dist row records the shortest known distance from A to each vertex, and the path row records the previous vertex on that shortest path. We start with A. The two vertices directly adjacent to A are B and D: the distance to B is 2 and the distance to D is 5, so we record these first:

image-20220903195723929

Because both are reached from A, we record A in the path row. Next we pick the vertex with the shortest current distance from A, which is B. B can reach C, D, and A; we cannot go back, so A is not considered. The current shortest distance from A to C is via B, that is, the distance A->B plus B->C:

image-20220903230103368

Then we look at vertex D. Besides going directly from A to D, we can also reach D via B, so we compare the two: $\min(2 + 2, 5)$. Going via B is shorter, only 4, so we update it:

image-20220903230254335

Then we pick the next vertex closest to A, which is D. D is connected to E and J, and both can be updated directly: the shortest path to E is the shortest path from A to D plus the edge from D to E, and the same goes for J:

image-20220903230521739

Next we pick J, the next vertex in the table closest to A. J can reach H and E. As before, we check whether E is closer via D or via J: $\min(6 + 3, 8)$. Going directly from D is shorter, so no update is needed. H is then updated to the shortest path via J:

image-20220903231152767

Next is C, the next vertex closest to A. C can reach F and E. Look at E first: if reaching E via C were shorter, we would update it, but $\min(7 + 4, 8)$ shows that the path via D is still the shortest, so E remains unchanged. Then we update the value of F:

image-20220903231449081

Next is E, the next vertex closest to A. E has more connections. E's shortest path comes from D, so we do not consider D. We look in turn at the other vertices connected to it: C, F, G, H, and J (note that the comparisons here are for paths from E to these vertices; the earlier comparisons were from these vertices to E, so do not confuse the two):

  • From E to vertex C: $\min(8 + 4, 7)$, so C keeps its original plan.
  • From E to vertex F: $\min(8 + 2, 15)$, so the path via E is shorter and F is updated.
  • From E to vertex G: G has no value yet, so it is updated directly.
  • From E to vertex H: $\min(8 + 6, 13)$, so H keeps its original plan.
  • From E to vertex J: $\min(8 + 3, 6)$, so J keeps its original plan.

Finally get:

image-20220903232316607

We continue with F, the next vertex closest to A. F connects to G and E, but since F's current shortest path comes from E, we cannot go back, so we only look at G: $\min(10 + 5, 17)$. Reaching G via F is shorter, so we update G:

image-20220903232542904

Then we take the next closest vertex, H. H is connected to G and I. For G, $\min(13 + 3, 15)$ keeps the original plan. I has no value yet, so it is updated directly:

image-20220903232752582

Although the table is now completely filled in, we have not yet processed all the vertices, and there may still be shorter paths, so we keep going. We select the next vertex closest to A, which is G. It is connected to E, F, H, and I. Since G's path comes from F, we exclude F and look at the other three:

  • From G to vertex E: $\min(15 + 9, 8)$, so we clearly keep the original plan.
  • From G to vertex H: $\min(15 + 3, 13)$, so the original plan is still shorter.
  • From G to vertex I: $\min(15 + 4, 21)$, so the path via G is shorter and I is updated.

Finally get:

image-20220903233144469

Finally we look at the last vertex, I, which is connected to G and H. Since I's path comes from G, we only compare H: $\min(19 + 8, 13)$, so the original plan stands. At this point Dijkstra's algorithm ends. The final table gives the shortest distance from A to every vertex, and from the path row we can trace back the actual route.
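The table-filling procedure translates almost directly into code. The sketch below is an illustration under stated assumptions, not the article's implementation: vertices A to J are mapped to indices 0 to 9, and the edge weights in `EDGES` are reconstructed from the comparisons in the walkthrough (for example A-B = 2, B-D = 2, D-J = 2), so the real figure may contain edges that the text never mentions. `dist` and `path` play the role of the two rows of the table.

```c
#define N 10               /* vertices A..J mapped to indices 0..9 */
#define INF 1000000000     /* "no edge" / "not reached yet" */

/* Undirected edges {u, v, weight}; ASSUMPTION: inferred from the walkthrough. */
static const int EDGES[][3] = {
    {0, 1, 2}, {0, 3, 5}, {1, 2, 5}, {1, 3, 2}, {2, 4, 4}, {2, 5, 8},
    {3, 4, 4}, {3, 9, 2}, {4, 5, 2}, {4, 6, 9}, {4, 7, 6}, {4, 9, 3},
    {5, 6, 5}, {6, 7, 3}, {6, 8, 4}, {7, 8, 8}, {7, 9, 7},
};
#define EDGE_COUNT ((int)(sizeof(EDGES) / sizeof(EDGES[0])))

/* Fills the weight matrix w from the edge list above. */
void buildGraph(int w[N][N]) {
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            w[i][j] = (i == j) ? 0 : INF;
    for (int e = 0; e < EDGE_COUNT; ++e) {
        int u = EDGES[e][0], v = EDGES[e][1];
        w[u][v] = w[v][u] = EDGES[e][2];
    }
}

/*
 * Dijkstra's algorithm. dist[v] receives the shortest distance from source
 * to v, path[v] the previous vertex on that path (-1 for the source).
 */
void dijkstra(int w[N][N], int source, int *dist, int *path) {
    int done[N] = {0};
    for (int v = 0; v < N; ++v) { dist[v] = INF; path[v] = -1; }
    dist[source] = 0;
    for (int round = 0; round < N; ++round) {
        /* select the unfinished vertex that is currently closest to the source */
        int u = -1;
        for (int v = 0; v < N; ++v)
            if (!done[v] && (u < 0 || dist[v] < dist[u])) u = v;
        if (dist[u] == INF) break;       /* the rest is unreachable */
        done[u] = 1;
        /* compare "go via u" with the value already in the table */
        for (int v = 0; v < N; ++v) {
            if (done[v] || w[u][v] == INF) continue;
            if (dist[u] + w[u][v] < dist[v]) {
                dist[v] = dist[u] + w[u][v];
                path[v] = u;
            }
        }
    }
}
```

With these assumed weights and source 0 (A), the vertices are finalized in the same order as in the walkthrough (A, B, D, J, C, E, F, H, G, I), and the distances match the final values quoted above, for example 15 for G and 19 for I. The `done` array generalizes the "we cannot go back" rule: a finished vertex is never updated again.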

Of course, this only solves the single-source shortest path problem. Now let's make the problem harder: what if we want the shortest path between every pair of vertices in the graph? The simplest way is to run Dijkstra's algorithm once from every vertex, which gives the shortest distances between all pairs. That is not the best choice, though. For this kind of problem we can use Floyd's algorithm.

For example, take the following directed network (it must not contain a negative-weight cycle, otherwise shortest paths are not well defined):

image-20220904094948962

We can easily get its adjacency matrix:

image-20220904101234641

Floyd's algorithm works by repeatedly updating the original adjacency matrix. The rules are as follows:

  • For k from 1 to n (n is the number of vertices) there is a sequence of matrices A1, A2, ..., An. We start from the initial adjacency matrix and derive them one after another, beginning with A1.
  • In round k, we update every element that is not on the diagonal (the diagonal entries are all 0 and stay 0, so there is no need to look at them) and not in row k or column k. For each such element, we take its horizontal projection onto column k and its vertical projection onto row k and check whether their sum is smaller than the current value. If so, we update it to the new value. The iteration formula is: $A_k(i,j) = \min(A_{k-1}(i,j),\ A_{k-1}(i,k) + A_{k-1}(k,j))$
  • After n rounds, the final matrix holds the shortest distances.

We start from the first round, which is based on the original adjacency matrix:

image-20220904102258851

In this round, apart from the diagonal and the row and column of A, only two positions remain to be checked: B->C and C->B. We calculate them according to the rules above:

image-20220904102738649

In the same way, we continue to see C->B and update in the same way:

image-20220904103010762

The results of the final update are as follows:

image-20220904103033008

In fact, we found that the sum we calculated was equivalent to comparing the result of the detour with the result of the current direct route. In the same way, we start the second round:

image-20220904103410691

After the update is completed, the distance from C->A becomes 5:

image-20220904103549079

Let’s move on to the final round:

image-20220904103724239

At this time we also update the distance from A->B:

image-20220904103815369

The matrix we finally get stores the shortest distance between every pair of vertices. Of course, we only calculated the shortest distances here and did not record which vertex each shortest path passes through. Readers can keep a separate table recording which intermediate vertex produced each minimum distance; the walkthrough here does not track it. In fact, this algorithm is fairly easy to understand, and the program is very simple to write. Let's take the following figure as an example:

image-20220904105442929

The code is as follows:

#include <stdio.h>

#define INF 210000000
#define N 4

int min(int a, int b){
    return a > b ? b : a;
}

void floyd(int matrix[N][N], int n){
    for (int k = 0; k < n; ++k)    //n rounds in total, one for each intermediate vertex k
        for (int i = 0; i < n; ++i)   //i and j can simply start from 0 and cover everything; this does not affect the result
            for (int j = 0; j < n; ++j)
                matrix[i][j] = min(matrix[i][k] + matrix[k][j], matrix[i][j]);   //just update according to the rule
}

int main(){
    int matrix[N][N] = {{0, 1, INF, INF},
                        {4, 0, INF, 5},
                        {INF, 2, 0, INF},
                        {3, INF, 7, 0}};

    floyd(matrix, N);

    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j)
            printf("%d ", matrix[i][j]);
        putchar('\n');
    }
}

The final result is:

image-20220904110149836

After comparison, it is indeed the shortest path.
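The program above only computes distances. As for the separate table for recovering the routes, one common way to keep it is a "next hop" matrix. The sketch below is an assumption-based illustration: `floydWithPath`, `buildPath`, and the `next` matrix are names introduced here, and the input is the same 4x4 matrix as in the program above.

```c
#define INF 210000000
#define N 4

/*
 * Floyd's algorithm that also fills next[i][j]: the vertex to move to first
 * when travelling from i to j along a shortest path (-1 if j is unreachable).
 */
void floydWithPath(int dist[N][N], int next[N][N]) {
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            next[i][j] = (i != j && dist[i][j] < INF) ? j : -1;
    for (int k = 0; k < N; ++k)
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                if (dist[i][k] + dist[k][j] < dist[i][j]) {
                    dist[i][j] = dist[i][k] + dist[k][j];
                    next[i][j] = next[i][k];   /* the detour via k starts like the path to k */
                }
}

/*
 * Writes the vertices of the shortest path from `from` to `to` into out
 * (room for N entries) and returns the number written, or 0 if unreachable.
 */
int buildPath(int next[N][N], int from, int to, int *out) {
    int len = 0;
    if (from != to && next[from][to] < 0) return 0;
    out[len++] = from;
    while (from != to) {
        from = next[from][to];
        out[len++] = from;
    }
    return len;
}
```

For the matrix used above, the shortest distance from vertex 0 to vertex 2 is 13, and `buildPath` reports the route 0, 1, 3, 2 (1 + 5 + 7 = 13).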

topological sort

Let's look at topological sorting next. In real life we often run into problems like the following:

In university, some courses can only be taken after their prerequisite courses are finished. For example, before starting a data structures course you need to complete C language programming, and before starting a Java course you need to complete courses such as computer networks and principles of computer organization. Before reaching a certain stage, some prerequisites must be completed to unlock it. The same goes for quests in games: certain main quests and side quests must be completed before a new stage is unlocked.


We can regard these tasks as a vertex, and finally connect them into a directed graph:

image-20220904110937920

Because each follow-up task is always unlocked by its prerequisites, no loop can occur anywhere in the graph (with a loop there would be no way to proceed, just like the question of which came first, the chicken or the egg). A graph built this way is called a directed acyclic graph (DAG). Put simply, it is a flow chart, and we only need to follow it. A graph in which vertices represent activities or tasks is also called an AOV graph.

Topological sorting means arranging the vertices of a directed acyclic graph into a linear sequence in which every vertex appears before all the vertices it points to.

For example, the topological sorting in the above figure can be the following:

  • A,B,C,D,E,F,G,H,I,J
  • A,C,D,B,E,F,G,H,I,J
  • A,D,C,B,E,F,G,H,I,J
  • A,B,D,C,E,F,G,H,I,J

As long as we ensure that the predecessor tasks are completed before the subsequent tasks, the completion order of the predecessor tasks is not required, so the topological sorting is not unique.

image-20220904121459739

So how do we perform topological sorting on a directed acyclic graph in a program? Taking the picture above as an example, it is actually very simple, and we again use a queue. Each time, we only need to put the vertices with an in-degree of 0 into the queue (and remember to update the in-degrees of the other vertices in the graph afterwards). We start with A:

image-20220904122602140

At this time, there is the vertex A in the queue. Next, let’s take a look at the remaining vertices in the graph. Which ones are vertices with in-degree 0? We can see that D is also:

image-20220904122621668

After all the vertices that currently have an in-degree of 0 have entered the queue, we start dequeuing, and the topological sort officially begins. Each time we dequeue a vertex, we print it and check whether removing it from the graph makes the in-degree of any other vertex become 0; if so, those vertices are added to the queue. For example, after A is dequeued and removed from the graph, B becomes a vertex with an in-degree of 0, so B is put into the queue:

image-20220904122914376

Then, we continue to dequeue D. We find that after D is dequeued, E becomes a vertex with an in-degree of 0, so we add E to the queue:

image-20220904123206951

Then we continue to dequeue. After B is dequeued, no vertex's in-degree has become 0, so nothing is enqueued and we simply continue:

image-20220904123257858

We continue by dequeuing E. After E is dequeued, vertices F and C become vertices with an in-degree of 0, and both are added to the queue:

image-20220904123445483

Next we dequeue C and find that no vertex's in-degree becomes 0, so nothing is enqueued. We move on to F:

image-20220904123544940

When F is dequeued, vertex G becomes a vertex with indegree 0. At this time, G is enqueued:

image-20220904123635522

All that remains is to dequeue G, then enqueue H, and finally dequeue H:

image-20220904123742305

The final topological sequence obtained is: ADBECFGH. The idea is relatively simple. We can also use the topological sorting algorithm to detect whether a directed graph is acyclic: if the queue becomes empty before all the vertices have been output, the graph must contain a loop.
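The queue-based procedure (often called Kahn's algorithm) can be sketched as follows. This is an illustrative version on an adjacency matrix; `MAX_V` and `topoSort` are names assumed for this example, not code from earlier in the chapter. Instead of physically removing a vertex from the graph, it decrements the in-degree of each successor.

```c
#define MAX_V 16   /* assumed upper bound on the number of vertices */

/*
 * Topological sort of a directed graph; adj[u][v] != 0 means an edge u -> v.
 * Writes the order into `order` and returns how many vertices were output.
 * A return value smaller than n means the graph contains a loop.
 */
int topoSort(int n, int adj[][MAX_V], int *order) {
    int indegree[MAX_V] = {0}, queue[MAX_V];
    int head = 0, tail = 0, count = 0;
    for (int u = 0; u < n; ++u)
        for (int v = 0; v < n; ++v)
            if (adj[u][v]) ++indegree[v];
    for (int v = 0; v < n; ++v)           /* start with every in-degree-0 vertex */
        if (indegree[v] == 0) queue[tail++] = v;
    while (head < tail) {
        int u = queue[head++];
        order[count++] = u;
        for (int v = 0; v < n; ++v)       /* "remove" u: successors lose one in-edge */
            if (adj[u][v] && --indegree[v] == 0) queue[tail++] = v;
    }
    return count;
}
```

If the function returns fewer than n vertices, the queue ran dry while some vertices still had incoming edges, which is exactly the loop-detection rule described above.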

critical path calculation

From the previous section we know that a task may have predecessor tasks, but so far we have only discussed the order in which tasks are completed. If we now add a weight to each task, indicating the time it requires, then a follow-up task can begin only after all of its predecessor tasks have been completed:

image-20220904130014247

For example, A represents a certain task (event) and B represents another task; we need to spend 2 days completing A before we can start B. Activities, together with the time they take, are represented by edges. A graph whose edges represent activities is called an AOE graph, and each event may correspond to multiple activities (multiple edges). It is like a large project that starts at A, goes through various intermediate steps, and finally ends at H.

What we need to find are the activities that delay the schedule the most. For example, suppose task C can start only after A and B are both complete, where A takes 7 days and B takes 5 days. Because C must wait for both, A is the task that holds things up: B finishes first but still has to wait for A. Once we identify the activities that delay the schedule the most, we obtain a critical path, from which we get the earliest time the whole project can finish and when each task can start.

Let’s see how to calculate it, using the following figure as an example:

image-20220904132328013

We need to calculate two things: one is the earliest completion time of each event (that is, the fastest time in which the event can be reached), and the other is the latest start time of each event (that is, the latest time the event can start without delaying the project):

image-20220904132930050

We proceed in the order of the topological sort from before. First comes A. Because there is only one starting point, A can start right away, so its earliest and latest times are both 0 (note that with multiple starting points, the latest start times are not necessarily the same). Following the working sequence of the AOE graph, we then fill in B and C:

image-20220904133246766

Then there are D and E. First, D needs B and C to be completed at the same time before it can continue. In other words, it needs to choose the one that takes the longest time between B and C:

image-20220904133658962

The last one is F. There are three paths to reach F. We still choose the longest one. It takes a total of 8 days to get from D:

image-20220904133802361

Therefore, the earliest completion time of the entire project is 8 days. Let’s next look at the latest start time of the activity. Now we have to look backwards from the end point:

image-20220904134114068

First of all, the end point must be 8, because the fastest end of the construction period is 8 days. Let's continue to go backwards and look at E first. E takes 6 days to arrive, but it only takes 1 day to end, so 8 - 1 = 7, construction can start on the 7th day at the latest:

image-20220904134310369

Then comes D. It takes 2 days to get from D to F, the total is 8 days, and D's earliest time is already day 6, so D cannot wait: 8 - 2 = 6, and work must start on day 6:

image-20220904134445037

Then there is C, which is more complicated because C has two outgoing activities, one pointing to D and one pointing to F. We calculate each one separately:

  • C -> F: F's latest start time minus the activity time = 8 - 3 = 5, so by this activity alone C could start as late as day 5.
  • C -> D: D's latest start time minus the activity time = 6 - 4 = 2. Since C's earliest start time is also 2, C cannot be postponed at all.

We take the smaller of the two values. C cannot be postponed and must start on day 2, because D's requirement has to be met:

image-20220904135059487

Finally, there is B. B also has two outgoing activities, one pointing to E and the other pointing to D:

  • B -> E: E's latest start time minus the activity time = 7 - 3 = 4, so B can start on day 4 at the latest.
  • B -> D: D's latest start time minus the activity time = 6 - 2 = 4, the same as above.

Therefore, the latest start time of B is day 4:

image-20220904135338214

Of course, in the end we can also calculate A -> B and A -> C, but since there is only one starting point, the calculation must be 0. Of course, if there are multiple starting points, it needs to be calculated.

After the calculation is completed, we can get the critical path: the vertices whose earliest and latest times are the same (meaning there is no slack and time is tight). The route connecting these vertices is the critical path we are looking for: A -> C -> D -> F. All activities on the critical path are critical activities, and the length of the entire project is determined by them. Therefore, we can shorten the project by appropriately speeding up critical activities, but not by too much: if one is shortened too far, the critical path itself may change. Note also that the critical path is not unique, since several paths may be equally long.
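The two passes can be sketched in a few lines. This is an illustration under stated assumptions rather than a general implementation: events A to F are mapped to indices 0 to 5, the activity durations in `ACTIVITIES` are reconstructed from the numbers in the walkthrough (A -> B = 3 and A -> C = 2 are inferred from the earliest times of E and C), and the code relies on the events already being numbered in topological order with F as the only end point.

```c
#define N 6   /* events A..F mapped to indices 0..5 */

/* Activities {from, to, days}; ASSUMPTION: reconstructed from the walkthrough. */
static const int ACTIVITIES[][3] = {
    {0, 1, 3}, {0, 2, 2},      /* A -> B, A -> C */
    {1, 3, 2}, {1, 4, 3},      /* B -> D, B -> E */
    {2, 3, 4}, {2, 5, 3},      /* C -> D, C -> F */
    {3, 5, 2}, {4, 5, 1},      /* D -> F, E -> F */
};
#define ACTIVITY_COUNT ((int)(sizeof(ACTIVITIES) / sizeof(ACTIVITIES[0])))

/*
 * Fills earliest[] and latest[] for every event and returns the project
 * length. Every activity must go from a lower index to a higher index.
 */
int criticalPath(int *earliest, int *latest) {
    for (int v = 0; v < N; ++v) earliest[v] = 0;
    /* forward pass: an event must wait for its slowest predecessor */
    for (int u = 0; u < N; ++u)
        for (int a = 0; a < ACTIVITY_COUNT; ++a) {
            int to = ACTIVITIES[a][1], t = earliest[u] + ACTIVITIES[a][2];
            if (ACTIVITIES[a][0] == u && t > earliest[to]) earliest[to] = t;
        }
    for (int v = 0; v < N; ++v) latest[v] = earliest[N - 1];
    /* backward pass: an event must satisfy its most urgent successor */
    for (int u = N - 1; u >= 0; --u)
        for (int a = 0; a < ACTIVITY_COUNT; ++a) {
            int to = ACTIVITIES[a][1], t = latest[to] - ACTIVITIES[a][2];
            if (ACTIVITIES[a][0] == u && t < latest[u]) latest[u] = t;
        }
    return earliest[N - 1];
}
```

With these assumed durations the function returns 8, and the events whose `earliest` and `latest` values coincide are A (0), C (2), D (6), and F (8), which is the critical path A -> C -> D -> F found above. B (3 versus 4) and E (6 versus 7) each have one day of slack.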

At this point, we will stop explaining the content related to the graph structure.


Origin blog.csdn.net/qq_25928447/article/details/126694679