Graph Theory of Data Structures

Table of contents

6.1 Basic Concepts of Graphs

6.2 Storage and manipulation of graphs

6.2.1 Adjacency Matrix

6.2.2 Adjacency list

6.2.3 Cross linked list

6.2.4 Adjacency Multitable

6.3 Graph traversal

6.3.1 Depth-first search DFS

6.3.2 Breadth-first search BFS

6.4 Applications of graphs

6.4.1 Minimum spanning tree

6.4.2 Shortest path

6.4.3 Topological sort

6.4.4 Critical path


(Key content: depth-first and breadth-first search; basic concepts and properties of graphs; storage structures of graphs and their characteristics; traversal and various applications based on those storage structures; the ideas behind the related graph algorithms)

6.1 Basic Concepts of Graphs

Def: A graph G consists of a vertex set V and an edge set E, denoted G=(V,E); V is a finite non-empty set of vertices (data elements) and E is a finite set of edges. (A graph cannot be empty: it must contain at least one vertex, but the edge set may be empty, in which case there are only vertices and no edges.)

Directed graph: E is a finite set of directed edges (arcs). A directed edge is an ordered pair of vertices, denoted <v,w>, and is called the arc from v to w; v is also said to be adjacent to w

Undirected graph: E is a finite set of undirected edges

Complete graph: Any two vertices in the graph are connected by an edge. For an undirected complete graph there are n(n-1)/2 edges; for a directed complete graph there are n(n-1) arcs

Network: A graph with weights on its edges is called a network, also called a weighted graph

Vertex degree: the number of edges associated with the vertex, denoted as TD(v). In a directed graph, the degree of a vertex is the sum of the in-degree ID(v) and out-degree OD(v) of the point

Path: a sequence of vertices joined by consecutive edges; the path length is the number of edges on the path (or, in a network, the sum of their weights)

Loop: A path whose first vertex is the same as its last vertex is called a loop or a cycle (an undirected graph with n vertices and more than n-1 edges must contain a cycle). A path in which no vertex is repeated is a simple path; a cycle in which no vertex is repeated, except that the start and end coincide, is a simple cycle

Connected graph (strongly connected graph): In an undirected graph G=(V, {E}), if there is a path between every pair of vertices v and u, G is called a connected graph. In a directed graph, if for every pair of vertices v and u there is a path from v to u and a path from u to v, G is called a strongly connected graph

Connected component (strongly connected component): A maximal connected subgraph of an undirected graph G is called a connected component of G; a maximal strongly connected subgraph of a directed graph is called a strongly connected component.

A maximal connected subgraph is a connected subgraph of G that is no longer connected if any vertex of G outside it is added; a minimal connected subgraph is one that is no longer connected if any of its edges is deleted (the same is true for directed graphs).

Spanning tree, forest: Spanning tree is a minimal connected subgraph containing all vertices of undirected graph G; Spanning forest is a collection of spanning trees of each connected component for a non-connected graph.

6.2 Storage and manipulation of graphs

6.2.1 Adjacency Matrix

Def: Use a one-dimensional array to store the information of the vertices in the graph, and use a two-dimensional array to store the information of the edges in the graph. This two-dimensional array that stores the adjacency relationship between vertices is called an adjacency matrix

The adjacency matrix A of a graph G=(V,E) with n vertices is an n×n matrix: A[i][j]=1 when (vi,vj)∈E, and A[i][j]=0 otherwise. For a weighted graph, the adjacency matrix stores the weight of the corresponding edge, and ∞ indicates that there is no edge between the two vertices:

·Features:

1) The adjacency matrix of an undirected graph is symmetric, so only the upper or lower triangular matrix elements need to be stored in actual storage; the degree of vertex i = the number of 1s in the i-th row (column);

2) The adjacency matrix of a directed graph may be asymmetrical, the out-degree of a vertex = the sum of elements in row i, the in-degree of a vertex = the sum of elements in column i, and the degree of a vertex = the sum of elements in row i + the sum of elements in column i

3) In the adjacency matrix of the complete graph, the diagonal elements are 0 and the rest are 1
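As a minimal sketch of features 1) and 2), the following C functions read degrees off a 0/1 adjacency matrix. The names `N`, `out_degree`, and `in_degree` are illustrative assumptions and do not come from the original text.

```c
#define N 4   /* illustrative vertex count (an assumption) */

/* Out-degree of vertex i: the sum of row i of a 0/1 adjacency matrix. */
int out_degree(int a[N][N], int i)
{
    int j, d = 0;
    for (j = 0; j < N; j++)
        d += a[i][j];
    return d;
}

/* In-degree of vertex i: the sum of column i. */
int in_degree(int a[N][N], int i)
{
    int k, d = 0;
    for (k = 0; k < N; k++)
        d += a[k][i];
    return d;
}
```

For a directed graph the total degree of vertex i is `out_degree(a, i) + in_degree(a, i)`; for an undirected graph the matrix is symmetric, so either function alone gives the degree.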

·Adjacency matrix storage structure definition:

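#define MaxInt 32767       // a very large value, standing for ∞
#define MVNum 100          // maximum number of vertices
typedef char VerTexType;   // vertex data type, assumed to be char
typedef int ArcType;       // edge weight type, assumed to be int

typedef struct{
VerTexType vexs[MVNum];    // vertex table
ArcType arcs[MVNum][MVNum];// adjacency matrix
int vexnum, arcnum;        // current numbers of vertices and edges
}AMGraph; // Adjacency Matrix Graph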

·Creation of undirected network graph:

(1) Enter the total number of vertices and the total number of edges.

(2) Read the information of each vertex in turn and store it in the vertex table.

(3) Initialize the adjacency matrix so that each weight is initialized to a maximum value.

(4) Construct an adjacency matrix.

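#include <stdio.h>

/* Build the adjacency matrix of an undirected network.
   Uses the AMGraph type and MaxInt defined above. */
void CreateMGraph(AMGraph *G)
{
   int i, j, k, w;
   printf("Enter the number of vertices and the number of edges:\n");
   scanf("%d,%d", &G->vexnum, &G->arcnum);     /* read the vertex and edge counts */
   for (i = 0; i < G->vexnum; i++)              /* read vertex data into the vertex table */
       scanf(" %c", &G->vexs[i]);
   for (i = 0; i < G->vexnum; i++)
       for (j = 0; j < G->vexnum; j++)
           G->arcs[i][j] = MaxInt;              /* initialize the matrix: no edges yet */
   for (k = 0; k < G->arcnum; k++)              /* read arcnum edges and fill the matrix */
   {
        printf("Enter indices i, j and weight w of edge (vi,vj):\n");
        scanf("%d,%d,%d", &i, &j, &w);          /* weight w on edge (vi,vj) */
        G->arcs[i][j] = w;
        G->arcs[j][i] = G->arcs[i][j];          /* undirected graph, so the matrix is symmetric */
   }
}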

6.2.2 Adjacency list

When a graph is sparse, an adjacency matrix wastes a lot of storage space; the adjacency list combines sequential storage and linked storage, which avoids most of this waste

Def: The vertices of the graph are stored in a one-dimensional array or a singly linked list (an array makes it easier to look up a vertex). Besides its own data, each vertex stores a pointer to its first adjacent vertex, through which the edges of that vertex can be found. All adjacent vertices of a vertex form a singly linked list, called the edge list of the vertex in an undirected graph and the outgoing edge list (with the vertex as arc tail) in a directed graph. An adjacency list therefore has two kinds of nodes: vertex table nodes and edge table nodes

The structure of the adjacency list:

·Features:

1) The adjacency list of a graph is not unique, because the order of the nodes in each edge list depends on the order in which the edges are inserted. An undirected graph with n vertices and e edges needs n vertex table nodes and 2e edge table nodes, so the storage is O(n+2e), which is suitable for sparse graphs; the degree of a vertex is the number of nodes in its edge list

2) A directed graph needs O(n+e) storage. The out-degree of a vertex is the number of nodes in its outgoing edge list; to find the in-degree of a vertex, all the adjacency lists must be traversed to count the nodes whose adjacent-vertex field holds that vertex's index. For this reason, an inverse adjacency list can be used to compute the in-degree of a vertex quickly

The structure definition of the adjacency list:

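#define MAXVEX 100            /* maximum number of vertices (assumed value; the original leaves it undefined) */
typedef char VertexType;      /* vertex type, to be defined by the user */
typedef int EdgeType;         /* edge weight type, to be defined by the user */
typedef struct EdgeNode       /* edge table node */
{
  int adjvex;                   /* adjacent-vertex field: index of that vertex */
  EdgeType weight;              /* weight; not needed for an unweighted graph */
  struct EdgeNode *next;        /* link field: points to the next adjacent vertex */
}EdgeNode;

typedef struct VertexNode      /* vertex table node */
{
  VertexType data;               /* vertex field: stores the vertex information */
  EdgeNode *firstedge;           /* head pointer of the edge list */
}VertexNode, AdjList[MAXVEX];
typedef struct
{
  AdjList adjList;
  int numVertexes, numEdges;     /* current numbers of vertices and edges */
}GraphAdjList;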

Create an undirected network:

(1) Enter the total number of vertices and the total number of edges.

(2) Create the vertex table:

Read the information of each vertex in turn and store it in the vertex table.

Initialize the pointer field of each head node to NULL.

(3) Create the adjacency list:

Read the two vertices attached to each edge in turn.

Determine the indices i and j of the two vertices, and create an edge node for each of them.

Insert these edge nodes at the heads of the two edge lists corresponding to vi and vj respectively.
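The steps above can be sketched in C as follows. This is only an illustration under some assumptions: the vertices and edges are passed as arguments instead of being read from input, `MAXVEX` is set to 100, and the helper names `init_graph`, `push_front`, and `add_edge` are hypothetical.

```c
#include <stdlib.h>

#define MAXVEX 100                 /* assumed capacity */

typedef struct EdgeNode {
    int adjvex;                    /* index of the adjacent vertex */
    int weight;
    struct EdgeNode *next;
} EdgeNode;

typedef struct {
    char data;
    EdgeNode *firstedge;
} VertexNode;

typedef struct {
    VertexNode adjList[MAXVEX];
    int numVertexes, numEdges;
} GraphAdjList;

/* Step (2): fill the vertex table and set every head pointer to NULL. */
void init_graph(GraphAdjList *g, const char *labels, int n)
{
    int i;
    g->numVertexes = n;
    g->numEdges = 0;
    for (i = 0; i < n; i++) {
        g->adjList[i].data = labels[i];
        g->adjList[i].firstedge = NULL;
    }
}

/* Insert one edge node at the head of vertex `from`'s edge list. */
static int push_front(GraphAdjList *g, int from, int to, int w)
{
    EdgeNode *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->adjvex = to;
    e->weight = w;
    e->next = g->adjList[from].firstedge;
    g->adjList[from].firstedge = e;
    return 0;
}

/* Step (3): an undirected edge (vi, vj) gets a node in both edge lists. */
int add_edge(GraphAdjList *g, int i, int j, int w)
{
    if (push_front(g, i, j, w) != 0 || push_front(g, j, i, w) != 0)
        return -1;
    g->numEdges++;
    return 0;
}
```

Because each node is inserted at the head, the most recently added neighbor appears first in a list, which is one reason the adjacency list of a graph is not unique.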

6.2.3 Cross linked list

For directed graphs, the adjacency list makes the out-degree easy to compute but the in-degree hard, while the inverse adjacency list makes the in-degree easy to compute but the out-degree hard; the cross-linked list combines the two

Def: Redefine the node structures of the vertex table and the edge table. In a vertex node, firstin is the head pointer of the incoming edge list, pointing to the first node in that vertex's incoming edge list, and firstout is the head pointer of the outgoing edge list, pointing to the first node in that vertex's outgoing edge list. In an edge node, tailvex is the index in the vertex table of the arc's start point, headvex is the index of the arc's end point, headlink is the pointer field of the incoming edge list and points to the next arc with the same end point, and taillink is the pointer field of the outgoing edge list and points to the next arc with the same start point

· Cross linked list structure: 
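As a hedged sketch, the node layout described in the definition might be declared in C as follows. The field names firstin, firstout, tailvex, headvex, headlink, and taillink follow the text; the type names `ArcBox`, `VexNode`, `OLGraph` and the helper functions are assumptions made for illustration.

```c
#include <stdlib.h>

#define MAXVEX 100                  /* assumed capacity */

typedef struct ArcBox {
    int tailvex, headvex;           /* indices of the arc's start and end points */
    struct ArcBox *headlink;        /* next arc with the same end point */
    struct ArcBox *taillink;        /* next arc with the same start point */
} ArcBox;

typedef struct {
    char data;
    ArcBox *firstin;                /* first arc entering this vertex */
    ArcBox *firstout;               /* first arc leaving this vertex */
} VexNode;

typedef struct {
    VexNode xlist[MAXVEX];
    int vexnum, arcnum;
} OLGraph;

void ol_init(OLGraph *g, int n)
{
    int i;
    g->vexnum = n;
    g->arcnum = 0;
    for (i = 0; i < n; i++)
        g->xlist[i].firstin = g->xlist[i].firstout = NULL;
}

/* Add arc <tail, head>: a single node is linked into both lists. */
int ol_add_arc(OLGraph *g, int tail, int head)
{
    ArcBox *a = malloc(sizeof *a);
    if (a == NULL)
        return -1;
    a->tailvex = tail;
    a->headvex = head;
    a->taillink = g->xlist[tail].firstout;
    g->xlist[tail].firstout = a;
    a->headlink = g->xlist[head].firstin;
    g->xlist[head].firstin = a;
    g->arcnum++;
    return 0;
}

/* Out-degree: walk the outgoing list through taillink. */
int ol_out_degree(const OLGraph *g, int v)
{
    int d = 0;
    const ArcBox *a;
    for (a = g->xlist[v].firstout; a != NULL; a = a->taillink)
        d++;
    return d;
}

/* In-degree: walk the incoming list through headlink. */
int ol_in_degree(const OLGraph *g, int v)
{
    int d = 0;
    const ArcBox *a;
    for (a = g->xlist[v].firstin; a != NULL; a = a->headlink)
        d++;
    return d;
}
```

Both degrees are found by walking one short list, without scanning the lists of other vertices.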

6.2.4 Adjacency Multitable

For undirected graphs, the adjacency list gives easy access to vertex and edge information, but an operation such as deleting an edge has to find the two nodes that represent that edge, which is cumbersome; the adjacency multilist is introduced for this reason

Def: Redefine the structure of the edge table nodes in the same spirit as the cross-linked list, so that each edge is stored in a single node; the vertex nodes remain unchanged

 The structure of adjacency multi-table:
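A minimal C sketch of one possible adjacency multilist layout follows. All names here (`EBox`, `VexBox`, `AMLGraph`, `ivex`, `jvex`, `ilink`, `jlink`, and the helpers) are assumptions for illustration, since the original text does not list the fields; self-loops are rejected to keep the links simple.

```c
#include <stdlib.h>

#define MAXVEX 100               /* assumed capacity */

typedef struct EBox {
    int ivex, jvex;              /* the two endpoints of the edge */
    struct EBox *ilink;          /* next edge incident to ivex */
    struct EBox *jlink;          /* next edge incident to jvex */
} EBox;

typedef struct {
    char data;
    EBox *firstedge;             /* first edge incident to this vertex */
} VexBox;

typedef struct {
    VexBox adjmulist[MAXVEX];
    int vexnum, edgenum;
} AMLGraph;

void aml_init(AMLGraph *g, int n)
{
    int i;
    g->vexnum = n;
    g->edgenum = 0;
    for (i = 0; i < n; i++)
        g->adjmulist[i].firstedge = NULL;
}

/* Add edge (i, j) with i != j: one node is shared by both edge lists. */
int aml_add_edge(AMLGraph *g, int i, int j)
{
    EBox *e;
    if (i == j)
        return -1;
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->ivex = i;
    e->jvex = j;
    e->ilink = g->adjmulist[i].firstedge;
    g->adjmulist[i].firstedge = e;
    e->jlink = g->adjmulist[j].firstedge;
    g->adjmulist[j].firstedge = e;
    g->edgenum++;
    return 0;
}

/* Degree of v: follow ilink or jlink depending on which end of the edge v is. */
int aml_degree(const AMLGraph *g, int v)
{
    int d = 0;
    const EBox *e = g->adjmulist[v].firstedge;
    while (e != NULL) {
        d++;
        e = (e->ivex == v) ? e->ilink : e->jlink;
    }
    return d;
}
```

Since an edge lives in exactly one node, deleting it means unlinking that one node from two lists instead of locating two separate nodes.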

6.3 Graph traversal

6.3.1 Depth-first search DFS

·Basic idea: DFS is a recursive process, similar to the preorder traversal of a tree. Starting from some vertex v, visit v, then visit any unvisited vertex w adjacent to v, then any unvisited vertex adjacent to w, and so on. When the search can go no deeper, back up to the most recently visited vertex that still has unvisited adjacent vertices and continue from there, until all vertices in the graph have been visited

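·Algorithm:

#include <stdio.h>
#define TRUE 1
#define FALSE 0
typedef int Boolean;        /* Boolean type: its value is TRUE or FALSE */
Boolean visited[MVNum];     /* array of visited flags */

/* Recursive depth-first search on an adjacency matrix.
   Uses AMGraph as defined above and assumes 0/1 matrix entries;
   the graph is passed by pointer to avoid copying the matrix on every call. */
void DFS(const AMGraph *G, int i)
{
  int j;
  visited[i] = TRUE;
  printf("%c ", G->vexs[i]);   /* print the vertex; other operations are possible */
  for (j = 0; j < G->vexnum; j++)
    if (G->arcs[i][j] == 1 && !visited[j])
      DFS(G, j);               /* recurse on each unvisited adjacent vertex */
}

/* Depth-first traversal of the whole graph */
void DFSTraverse(const AMGraph *G)
{
  int i;
  for (i = 0; i < G->vexnum; i++)
    visited[i] = FALSE;        /* initially every vertex is unvisited */
  for (i = 0; i < G->vexnum; i++)
    if (!visited[i])           /* call DFS on each unvisited vertex; a connected graph needs only one call */
      DFS(G, i);
}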

If the graph is stored as an adjacency list, the traversal code is almost the same; the only difference is that the recursion walks a linked list instead of scanning a row of the array
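A hedged sketch of the adjacency-list variant is shown below. It records the visit order in an array instead of printing it, so the result is easy to check; the names `order`, `order_len`, `dfs`, and `dfs_traverse` and the trimmed-down types are assumptions for illustration.

```c
#include <stddef.h>

#define MAXVEX 100                       /* assumed capacity */

typedef struct EdgeNode {
    int adjvex;                          /* index of the adjacent vertex */
    struct EdgeNode *next;
} EdgeNode;

typedef struct {
    char data;
    EdgeNode *firstedge;
} VertexNode;

typedef struct {
    VertexNode adjList[MAXVEX];
    int numVertexes;
} GraphAdjList;

int visited[MAXVEX];
int order[MAXVEX];                       /* vertices in the order they were visited */
int order_len;

/* Visit vertex i, then recurse along its edge list. */
static void dfs(const GraphAdjList *g, int i)
{
    const EdgeNode *p;
    visited[i] = 1;
    order[order_len++] = i;
    for (p = g->adjList[i].firstedge; p != NULL; p = p->next)
        if (!visited[p->adjvex])
            dfs(g, p->adjvex);
}

/* Start a DFS from every vertex that is still unvisited,
   so disconnected graphs are covered as well. */
void dfs_traverse(const GraphAdjList *g)
{
    int i;
    order_len = 0;
    for (i = 0; i < g->numVertexes; i++)
        visited[i] = 0;
    for (i = 0; i < g->numVertexes; i++)
        if (!visited[i])
            dfs(g, i);
}
```

The inner `for` loop over `p` replaces the row scan `for (j = 0; j < n; j++)` of the matrix version; everything else is unchanged.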

Algorithm analysis: With an adjacency matrix, finding the adjacent vertices of each vertex requires scanning that vertex's entire row, so the time complexity is O(n²). With an adjacency list, finding adjacent vertices takes time proportional to the number of edge nodes, which is O(e); adding the time to visit the n head nodes, the time complexity is O(n+e).

So a dense graph is suited to depth-first traversal on an adjacency matrix, and a sparse graph is suited to depth-first traversal on an adjacency list

6.3.2 Breadth-first search BFS

Basic idea: BFS is a level-by-level search, similar to the level-order traversal of a binary tree. Starting from some vertex of the graph, visit all of its adjacent vertices V1, V2, ..., Vn in turn, then visit all unvisited vertices adjacent to them, in the order in which those vertices were visited, and repeat the process until all vertices have been visited.

Each step forward in a breadth-first search may visit a whole batch of vertices and there is no backtracking, so the algorithm is not recursive. It needs an auxiliary queue to remember the vertices whose next layer of adjacent vertices is still to be visited

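·Algorithm:

/* A minimal array-based int queue, given here only so that the routine below is complete.
   Each vertex is enqueued at most once, so MAXVEX slots are enough. */
typedef struct { int data[MAXVEX]; int front, rear; } Queue;
void InitQueue(Queue *Q)        { Q->front = Q->rear = 0; }
int  QueueEmpty(const Queue *Q) { return Q->front == Q->rear; }
void EnQueue(Queue *Q, int e)   { Q->data[Q->rear++] = e; }
void DeQueue(Queue *Q, int *e)  { *e = Q->data[Q->front++]; }

/* Breadth-first traversal of an adjacency list */
void BFSTraverse(GraphAdjList *GL)
{
  int i, u;
  EdgeNode *p;
  Queue Q;
  for (i = 0; i < GL->numVertexes; i++)
    visited[i] = FALSE;
  InitQueue(&Q);
  for (i = 0; i < GL->numVertexes; i++)
  {
    if (!visited[i])
    {
      visited[i] = TRUE;
      printf("%c ", GL->adjList[i].data);      /* print the vertex; other operations are possible */
      EnQueue(&Q, i);
      while (!QueueEmpty(&Q))
      {
        DeQueue(&Q, &u);                       /* dequeue into u so the outer loop index i is preserved */
        p = GL->adjList[u].firstedge;          /* head of the current vertex's edge list */
        while (p)
        {
          if (!visited[p->adjvex])             /* if this vertex has not been visited */
          {
            visited[p->adjvex] = TRUE;
            printf("%c ", GL->adjList[p->adjvex].data);
            EnQueue(&Q, p->adjvex);            /* enqueue this vertex */
          }
          p = p->next;                         /* move to the next adjacent vertex */
        }
      }
    }
  }
}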

·Algorithm analysis: With an adjacency matrix, BFS must loop through a whole row (n elements) of the matrix for each visited vertex, so the total time cost is O(n²). With an adjacency list, scanning the edge nodes takes time proportional to their number, which is O(e); adding the time to visit the n head nodes, the time complexity is O(n+e).

6.4 Applications of graphs

6.4.1 Minimum spanning tree

Spanning tree: a subgraph that connects all the vertices of the graph by edges and contains no cycle; a graph can have many different spanning trees.

Its characteristics are:

  1. A spanning tree is a minimal connected subgraph of a graph. If one edge is removed, it is not connected, and if an edge is added, it will inevitably form a circuit.
  2. The spanning tree has the same number of vertices as the graph
  3. The spanning tree of a connected graph with n vertices has n-1 edges; the graph with n vertices and n-1 edges is not necessarily a spanning tree
  4. The path between any two points in the spanning tree is unique

Minimum spanning tree : For a given undirected network graph, among all spanning trees of the network, the spanning tree that minimizes the sum of the weights of each edge is called the minimum spanning tree

Properties:

  1. The number of edges of the minimum spanning tree is the number of vertices minus one
  2. The minimum spanning tree may not be unique, but the sum of its edge weights is always unique and is the smallest possible
  3. When the edge weights of the graph are all distinct, the minimum spanning tree is unique; if an undirected connected graph has one fewer edge than vertices, that is, the graph itself is a tree, its minimum spanning tree is the graph itself

· MST property: Let N = (V, E) be a connected network and U a non-empty proper subset of the vertex set V. If (u, v) is a minimum-weight edge among the edges with u∈U and v∈V−U, then there must be a minimum spanning tree containing the edge (u, v).

·General algorithm:

During the construction of the spanning tree, the n vertices of the graph belong to two sets:

The set of vertices already on the spanning tree: U

The set of vertices not yet on the spanning tree: V−U

At each step, the edge with the smallest weight is selected among all the edges that connect a vertex in U with a vertex in V−U

GENERIC_MST(G) {
   T = NULL;
   while T does not yet form a spanning tree
       do find a minimum-cost edge (u, v) whose addition to T does not create a cycle;
          T = T ∪ {(u, v)};
}

The minimum spanning tree algorithms based on the above property mainly include Prim's algorithm and Kruskal's algorithm

The minimum spanning tree algorithms based on the above properties mainly include prim algorithm and kruskal algorithm

· Prim Algorithm:

➢Assume N=(V, E) is a connected network, and TE is the set of edges of the minimum spanning tree of N;

➢Initially let U={u0} (u0∈V) and TE={};

➢Among all edges (u, v)∈E with u∈U and v∈V−U, find an edge (u0, v0) with the least cost;

➢Add (u0, v0) to the set TE and add v0 to U;

➢Repeat the above operations until U=V; then T=(V, TE) is a minimum spanning tree of N;

 A simple implementation is as follows. The time complexity of Prim's algorithm is O(|V|²) and does not depend on |E|, so it is suitable for finding the minimum spanning tree of a dense graph

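void Prim(G,T) {
  T=Ø;              // initialize an empty tree
  U={w};            // add any vertex w
   while((V-U)!=Ø) {    // while the tree does not yet contain all vertices
     let (u,v) be the minimum-weight edge with u∈U and v∈(V-U);
     T=T∪{(u,v)};        // add the edge to the tree
     U=U∪{v};            // add the vertex to the tree
    }
}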
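The pseudocode above can be turned into runnable C roughly as follows. This is a sketch under stated assumptions: the network is a cost matrix with `INF` marking a missing edge, the start vertex is 0, and the names `NV`, `INF`, `prim`, `lowcost`, and `parent` are illustrative choices, not taken from the original.

```c
#include <limits.h>

#define NV 6                 /* illustrative vertex count (an assumption) */
#define INF INT_MAX          /* marks "no edge" */

/* Returns the total weight of a minimum spanning tree of a connected
   undirected network, or -1 if the graph is not connected.
   parent[v] receives the tree neighbor of v (parent[0] is -1). */
int prim(int cost[NV][NV], int parent[NV])
{
    int lowcost[NV];         /* cheapest known edge from U to each vertex outside U */
    int in_u[NV];            /* 1 if the vertex is already in U */
    int i, k, v, total = 0;

    for (v = 0; v < NV; v++) {
        lowcost[v] = cost[0][v];
        parent[v] = 0;
        in_u[v] = 0;
    }
    in_u[0] = 1;             /* U = {v0} */
    parent[0] = -1;

    for (i = 1; i < NV; i++) {
        /* pick the vertex outside U that is closest to U */
        k = -1;
        for (v = 0; v < NV; v++)
            if (!in_u[v] && lowcost[v] != INF &&
                (k == -1 || lowcost[v] < lowcost[k]))
                k = v;
        if (k == -1)
            return -1;       /* no edge leaves U */
        in_u[k] = 1;
        total += lowcost[k];
        /* the new vertex may offer cheaper edges to the vertices still outside U */
        for (v = 0; v < NV; v++)
            if (!in_u[v] && cost[k][v] < lowcost[v]) {
                lowcost[v] = cost[k][v];
                parent[v] = k;
            }
    }
    return total;
}
```

The two nested loops over the vertices give the O(|V|²) running time quoted above, independent of the number of edges.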

Kruskal algorithm: Unlike Prim's algorithm, which grows the tree from a vertex, this algorithm selects edges in increasing order of weight, skipping any edge that would form a cycle with the edges already selected.

➢Let the connected network be N=(V, E), and let the initial state of the minimum spanning tree be a non-connected graph T=(V, { }) with only n vertices and no edges, in which each vertex forms a connected component by itself;

➢Select the edge with the least cost in E. If the two vertices of the edge fall in different connected components of T (that is, adding it creates no cycle), add this edge to T; otherwise discard it and choose the next edge with the smallest cost;

➢Continue in this way until all vertices in T are in the same connected component, that is, until T has n-1 edges;

 A simple implementation is as follows. Kruskal's algorithm usually keeps the set of edges in a heap, so selecting the edge with the smallest weight takes only O(log|E|) each time. Since the vertices of each connected component of T can be regarded as an equivalence class, a union-find (disjoint-set) structure can test and merge components quickly, and the time complexity of constructing T is O(|E|log|E|); Kruskal's algorithm is therefore suitable for graphs with sparse edges and many vertices.

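void Kruskal(V,T) {
  T=V;               // initialize tree T with the vertices only
  numS=n;            // number of connected components
    while (numS>1) { // while more than one connected component remains
      take the minimum-weight edge (v,u) out of E;
      if (v and u belong to different connected components of T) {
        T=T∪{(v,u)};  // add this edge to the spanning tree
        numS--;       // one fewer connected component
      }
    }
}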
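A runnable C sketch of the same idea follows. Note two substitutions made for brevity: the edges are sorted once with `qsort` instead of being kept in a heap (the overall cost is still O(|E|log|E|)), and connected components are tracked with a small union-find array. The names `Edge`, `cmp_edge`, `find_root`, and `kruskal` are assumptions for illustration.

```c
#include <stdlib.h>

typedef struct {
    int u, v, w;                 /* endpoints and weight */
} Edge;

/* Order edges by increasing weight for qsort. */
static int cmp_edge(const void *a, const void *b)
{
    const Edge *x = a, *y = b;
    return (x->w > y->w) - (x->w < y->w);
}

/* Representative of x's component, with path halving. */
static int find_root(int *parent, int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/* Sorts edges[] in place and copies the chosen edges into tree[]
   (room for nv - 1 entries). Returns the total weight, or -1 if the
   graph is not connected or memory cannot be allocated. */
int kruskal(Edge *edges, int ne, int nv, Edge *tree)
{
    int *parent = malloc(nv * sizeof *parent);
    int i, taken = 0, total = 0;

    if (parent == NULL)
        return -1;
    for (i = 0; i < nv; i++)
        parent[i] = i;           /* every vertex is its own component */
    qsort(edges, ne, sizeof *edges, cmp_edge);

    for (i = 0; i < ne && taken < nv - 1; i++) {
        int ru = find_root(parent, edges[i].u);
        int rv = find_root(parent, edges[i].v);
        if (ru != rv) {          /* different components: no cycle is formed */
            parent[ru] = rv;     /* merge the two components */
            tree[taken++] = edges[i];
            total += edges[i].w;
        }
    }
    free(parent);
    return taken == nv - 1 ? total : -1;
}
```

The test `ru != rv` is the "different connected components" check of the pseudocode, and `taken == nv - 1` corresponds to `numS` reaching 1.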

6.4.2 Shortest path

Dijkstra's Algorithm--Solving the Single-Source Shortest Path Problem

First divide V into two sets S and T:

 S is the set of vertices whose shortest path has been determined; T=V−S is the set of vertices whose shortest path has not yet been determined.

Then add the vertices in T to S in increasing order of shortest path length, which ensures that the shortest path length from the source v0 to any vertex in S is not greater than the shortest path length from v0 to any vertex in T.

At the same time, two auxiliary arrays dist[] and path[] are set:

dist[]: Record the current shortest path length from the source point v0 to other vertices. Its initial state is: if there is an arc from v0 to vi, then dist[i] is the weight on the arc; otherwise, set dist[i] to ∞.

path[]:  path[i] represents the predecessor node of the shortest path from the source point to vertex i. At the end of the algorithm, the shortest path from the source point v0 to the vertex vi can be traced back according to its value.

Algorithm idea:

1) Initialize S={v0}, T={the remaining vertices}. The distance values of the vertices in T are stored in the auxiliary array dist[]: if <v0, vi> exists, dist[i] is its weight, otherwise it is ∞;

2) Select a vertex vj with the smallest distance value from T, satisfying dist[j]=Min{dist[i]|vi∈T}, and add it to S;

3) Update the distance values of the vertices in T: if going through vj as an intermediate vertex makes the path from v0 to vi shorter than the current path without vj, replace dist[i] with the shorter length;

4) Repeat 2) 3) for a total of n-1 operations until S= V
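The four steps can be sketched in C as follows, assuming non-negative weights, a cost matrix with `INF` for a missing arc, and the arrays dist[] and path[] described above. The names `NV`, `INF`, `in_s`, and `dijkstra` are illustrative assumptions.

```c
#include <limits.h>

#define NV 5                 /* illustrative vertex count (an assumption) */
#define INF INT_MAX          /* marks "no arc" */

/* Single-source shortest paths from v0.
   dist[i]: shortest path length found from v0 to i (INF if unreachable).
   path[i]: predecessor of i on that path (-1 for v0 and unreachable vertices). */
void dijkstra(int cost[NV][NV], int v0, int dist[NV], int path[NV])
{
    int in_s[NV];            /* 1 if the vertex is already in S */
    int i, j, k;

    /* step 1: S = {v0}, dist[] taken from the arcs leaving v0 */
    for (i = 0; i < NV; i++) {
        in_s[i] = 0;
        dist[i] = cost[v0][i];
        path[i] = (cost[v0][i] != INF) ? v0 : -1;
    }
    in_s[v0] = 1;
    dist[v0] = 0;
    path[v0] = -1;

    for (i = 1; i < NV; i++) {           /* step 4: at most n-1 rounds */
        /* step 2: choose the vertex in T with the smallest dist */
        k = -1;
        for (j = 0; j < NV; j++)
            if (!in_s[j] && dist[j] != INF &&
                (k == -1 || dist[j] < dist[k]))
                k = j;
        if (k == -1)
            break;                       /* the rest of T is unreachable */
        in_s[k] = 1;
        /* step 3: try k as an intermediate vertex for the vertices still in T */
        for (j = 0; j < NV; j++)
            if (!in_s[j] && cost[k][j] != INF &&
                dist[k] + cost[k][j] < dist[j]) {
                dist[j] = dist[k] + cost[k][j];
                path[j] = k;
            }
    }
}
```

To print the route to a vertex i, follow path[i] back until it reaches -1 and reverse the sequence.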

Floyd's Algorithm--Solving the shortest path problem between all pairs of vertices

To find the shortest paths between all pairs of vertices, one can take each vertex in turn as the source and run Dijkstra's algorithm n times, for a total complexity of n×O(n^2)=O(n^3), which is cumbersome. Floyd's algorithm, introduced below, is simpler in form, although its time complexity is also O(n^3)

Algorithm idea:
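The original leaves this part empty. In the standard formulation of Floyd's algorithm, the matrix starts as the cost matrix and the vertices 0, 1, ..., n-1 are allowed as intermediate vertices one at a time; whenever the route i→k→j is shorter than the current i→j entry, the entry is replaced. A minimal C sketch under these assumptions (the names `NV`, `INF`, `floyd`, and the `path` matrix are illustrative):

```c
#include <limits.h>

#define NV 3                 /* illustrative vertex count (an assumption) */
#define INF INT_MAX          /* marks "no arc" */

/* d[][] starts as the cost matrix (0 on the diagonal, INF where there is
   no arc) and ends as the matrix of shortest path lengths.
   path[i][j] is an intermediate vertex on the i->j path, or -1 if the
   shortest path found is the direct arc. */
void floyd(int d[NV][NV], int path[NV][NV])
{
    int i, j, k;

    for (i = 0; i < NV; i++)
        for (j = 0; j < NV; j++)
            path[i][j] = -1;

    for (k = 0; k < NV; k++)             /* allow k as an intermediate vertex */
        for (i = 0; i < NV; i++)
            for (j = 0; j < NV; j++)
                if (d[i][k] != INF && d[k][j] != INF &&
                    d[i][k] + d[k][j] < d[i][j]) {
                    d[i][j] = d[i][k] + d[k][j];
                    path[i][j] = k;
                }
}
```

The three nested loops make the O(n^3) cost explicit, and the `INF` checks avoid integer overflow when a route does not exist.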

6.4.3 Topological sort

· AOV network: A directed acyclic graph is used to represent the sub-projects of a project and the constraints between them: each vertex represents an activity, and each arc represents a precedence constraint between two activities. Such a directed graph, whose vertices represent activities, is called an AOV network (Activity On Vertex). Cycles are not allowed in an AOV network, because a cycle would mean that some activity is a prerequisite of itself, which is impossible

Def: Topological sorting is an ordering of the vertices of a directed acyclic graph such that if there is a path from vertex A to vertex B, then B appears after A in the ordering; every AOV network has one or more topological orderings

The method of topological sorting:

  1. Select a vertex with no predecessor from the AOV net and output
  2. Remove the vertex and all directed edges originating from it from the net
  3. Repeat the above steps until all vertices are output or there are no vertices without predecessors in the graph

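 ·Algorithm implementation:

/* Assumes an adjacency-list Graph (vertices[i].firstarc, p->nextarc, p->adjvex),
   a precomputed indegree[] array, an int output array print[], and stack helpers
   InitStack/Push/Pop/IsEmpty that take the stack by reference (C++ style).
   The type names Stack and ArcNode are placeholders for those definitions. */
bool TopologicalSort(Graph G) {
  Stack S;
  ArcNode *p;
  int i, v;
  InitStack(S);                     // initialize the stack that holds vertices with in-degree 0
  for (i = 0; i < G.vexnum; i++)
    if (indegree[i] == 0)
      Push(S, i);                   // push every vertex whose in-degree is 0
  int count = 0;                    // number of vertices output so far
  while (!IsEmpty(S)) {             // stack not empty: a vertex with in-degree 0 exists
      Pop(S, i);                    // pop the top vertex
      print[count++] = i;           // output vertex i
      for (p = G.vertices[i].firstarc; p; p = p->nextarc) {
         // decrease the in-degree of every vertex i points to; push those that reach 0
         v = p->adjvex;
         if (!(--indegree[v]))
            Push(S, v);             // in-degree is now 0, so push it
      }
  }//while
  if (count < G.vexnum)
      return false;                 // sorting failed: the directed graph has a cycle
  else
      return true;                  // topological sort succeeded
}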

While outputting each vertex, all edges starting from the vertex must be deleted, so the time complexity of topological sorting is O(n+e)

【Notice】: 

①If a vertex has multiple direct successors, the result of topological sorting is usually not unique; but if each vertex has been arranged in a linear orderly sequence, and each vertex has a unique predecessor-successor relationship, the result of topological sorting is unique

②Because all vertices of an AOV network have equal status, the vertices can be renumbered according to the result of a topological sort, and the adjacency matrix of the renumbered AOV network can then be a triangular matrix. For a general graph, if its adjacency matrix is a triangular matrix, a topological ordering exists; the converse does not necessarily hold.

6.4.4 Critical Path

AOE network: A directed graph is used to represent the sub-projects of a project and the constraints between them: each arc represents an activity, and each vertex represents an event marking the start or end of activities. Such a directed graph, whose edges represent activities, is called an AOE network (Activity On Edge); the edges of an AOE network carry weights.

Critical path: The path length is the sum of the durations of the activities on the path. Among all paths from the source to the sink, the path with the largest path length is called the critical path, and the activities on the critical path are called critical activities.

·Several parameters:

1) The earliest occurrence time ve(k) of event vk: the longest path length from the source v1 to vertex vk; it determines the earliest time at which all activities starting from vk can begin

  Starting from ve(source) = 0, compute recursively ve(k)=Max{ve(j)+weight<vj, vk>}, where vj ranges over the predecessors of vk

2) The latest occurrence time vl(k) of event vk: the latest time at which the event may occur without delaying the completion of the entire project, given that each subsequent event vj can still occur at its latest occurrence time vl(j)

  Starting from vl(sink) = ve(sink), compute backward vl(k)=Min{vl(j)-weight<vk, vj>}, where vj ranges over the successors of vk

3) The earliest start time e(i) of activity ai: the earliest occurrence time of the event at the starting point of the activity's arc; if <vk, vj> represents ai, then e(i)=ve(k)

4) The latest start time l(i) of activity ai: the latest occurrence time of the event at the end point of the activity's arc minus the time required for the activity; if <vk, vj> represents ai, then l(i)=vl(j)-weight<vk, vj>

5) Time margin of activity ai: the time by which ai can be delayed without increasing the total time required to complete the entire project, d(i)=l(i)-e(i); a critical activity is an activity with a time margin of 0

【Note】 :

① All activities on the critical path are critical activities, and they determine the duration of the project, so the duration of the entire project can be shortened by speeding up critical activities. However, the possible shortening is limited: once an activity has been shortened beyond a certain point, it may become a non-critical activity.

② The critical path is not unique. In a network with several critical paths, speeding up critical activities on only one critical path cannot shorten the duration of the entire project; the duration is shortened only by speeding up critical activities that lie on all critical paths

Steps to solve the critical path:

1) Starting from the source, set ve(source)=0, and calculate the earliest occurrence time ve() of the remaining vertices in topological order;

2) Starting from the sink, set vl(sink)=ve(sink), and calculate the latest occurrence time vl() of the remaining vertices in reverse topological order;

3) Calculate the earliest start time e() of all arcs from the ve() values of the vertices;

4) Calculate the latest start time l() of all arcs from the vl() values of the vertices;

5) Calculate the difference d()=l()-e() for all activities in the AOE network; the activities with d()=0 constitute the critical path;
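The five steps can be sketched in C as follows. This is an illustration under assumptions: the AOE network is a small duration matrix in which `NONE` marks a missing arc, and the names `NV`, `NONE`, `critical_path`, and `is_critical` are hypothetical.

```c
#define NV 6            /* illustrative number of events (an assumption) */
#define NONE (-1)       /* marks "no arc"; real durations are >= 0 */

/* Fills ve[] and vl[] for an AOE network given as a duration matrix.
   Returns the project length, or -1 if the network contains a cycle. */
int critical_path(int w[NV][NV], int ve[NV], int vl[NV])
{
    int indeg[NV] = {0}, topo[NV], stack[NV];
    int top = 0, count = 0, length = 0, i, j, k;

    for (i = 0; i < NV; i++)
        for (j = 0; j < NV; j++)
            if (w[i][j] != NONE)
                indeg[j]++;
    for (i = 0; i < NV; i++) {
        ve[i] = 0;
        if (indeg[i] == 0)
            stack[top++] = i;
    }

    /* step 1: ve() in topological order */
    while (top > 0) {
        k = stack[--top];
        topo[count++] = k;
        for (j = 0; j < NV; j++) {
            if (w[k][j] == NONE)
                continue;
            if (ve[k] + w[k][j] > ve[j])
                ve[j] = ve[k] + w[k][j];
            if (--indeg[j] == 0)
                stack[top++] = j;
        }
    }
    if (count < NV)
        return -1;                      /* a cycle: no topological order */

    /* step 2: vl() in reverse topological order, starting from vl = ve(sink) */
    for (i = 0; i < NV; i++)
        if (ve[i] > length)
            length = ve[i];
    for (i = 0; i < NV; i++)
        vl[i] = length;
    for (i = NV - 1; i >= 0; i--) {
        k = topo[i];
        for (j = 0; j < NV; j++)
            if (w[k][j] != NONE && vl[j] - w[k][j] < vl[k])
                vl[k] = vl[j] - w[k][j];
    }
    return length;
}

/* Steps 3-5: activity <k, j> is critical when d = l - e is 0. */
int is_critical(int w[NV][NV], const int ve[NV], const int vl[NV], int k, int j)
{
    int e, l;
    if (w[k][j] == NONE)
        return 0;
    e = ve[k];                          /* earliest start time */
    l = vl[j] - w[k][j];                /* latest start time */
    return l - e == 0;
}
```

Calling `is_critical` on every arc after `critical_path` lists the critical activities; chaining them from source to sink gives the critical path or paths.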

Reverse method to solve the critical path:

 Details and other methods: see the CSDN blog post "Data structure: quickly grasp how to manually solve the critical path" on real_vavid's blog


Origin blog.csdn.net/weixin_46516647/article/details/126456207