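Universal Implementation of BFS, DFS, Dijkstra, and A-Star Algorithms

The well-known algorithms BFS, DFS, Dijkstra, and A-Star are essentially variations of the same algorithm. I will demonstrate this with a working implementation.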

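It turns out that well-known algorithms like BFS, DFS, Dijkstra, and A-Star are essentially variations of the same algorithm.
In other words, it is possible to implement one universal data structure that switches between these algorithms without any change to the core traversal code. There are some limitations to keep in mind, but the approach is interesting to explore.
You can find all the working code for these algorithms in my GitHub repository. I recommend experimenting with the code while reading the article, as practical experience enhances learning more than theoretical understanding alone.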

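Graph Representation

Let's consider a graph with 25 nodes arranged in a 5x5 grid, where we aim to find a path from Node 0 in the top left corner to Node 24 in the bottom right corner: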

( 0  ) - ( 1  ) - ( 2  ) - ( 3  ) - ( 4  )
  |        |        |        |        |
( 5  ) - ( 6  ) - ( 7  ) - ( 8  ) - ( 9  )
  |        |        |        |        |
( 10 ) - ( 11 ) - ( 12 ) - ( 13 ) - ( 14 )
  |        |        |        |        |
( 15 ) - ( 16 ) - ( 17 ) - ( 18 ) - ( 19 )
  |        |        |        |        |
( 20 ) - ( 21 ) - ( 22 ) - ( 23 ) - ( 24 )

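Each of the algorithms above can achieve this, but each has its own limitations:
BFS and DFS operate on unweighted graphs and ignore edge weights. They will find a path, but not necessarily the cheapest one (BFS does minimize the number of edges, which is not the same as the total weight).
Dijkstra and A-Star both work on weighted graphs, but should not be used with graphs that contain negative weights. A-Star is usually faster because it uses the nodes' coordinates to estimate the remaining distance to the target and steers the search toward it.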

To account for these limitations, let's assign imaginary coordinates (X, Y) to each node:

(0, 0) - (0, 1) - (0, 2) - (0, 3) - (0, 4)
   |        |        |        |       |
(1, 0) - (1, 1) - (1, 2) - (1, 3) - (1, 4)
   |        |        |        |       |
(2, 0) - (2, 1) - (2, 2) - (2, 3) - (2, 4)
   |        |        |        |       |
(3, 0) - (3, 1) - (3, 2) - (3, 3) - (3, 4)
   |        |        |        |       |
(4, 0) - (4, 1) - (4, 2) - (4, 3) - (4, 4)

Finally, let's assign a weight to each edge in the graph:

(0, 0) -1- (0, 1) -1- (0, 2) -1- (0, 3) -2- (0, 4)
   |          |          |          |         |
   2          1          1          2         2
   |          |          |          |         |
(1, 0) -2- (1, 1) -1- (1, 2) -2- (1, 3) -1- (1, 4)
   |          |          |          |         |
   2          1          1          1         1
   |          |          |          |         |
(2, 0) -1- (2, 1) -1- (2, 2) -1- (2, 3) -2- (2, 4)
   |          |          |          |         |
   2          1          1          1         2
   |          |          |          |         |
(3, 0) -2- (3, 1) -2- (3, 2) -1- (3, 3) -2- (3, 4)
   |          |          |          |         |
   2          1          1          2         2
   |          |          |          |         |
(4, 0) -2- (4, 1) -1- (4, 2) -2- (4, 3) -2- (4, 4)

In C++, this structure can be represented as follows:

class GraphNode
{
public:
    int X;
    int Y;
};

class Graph
{
public:
    vector<vector<pair<int, int>>> Edges;
    vector<GraphNode> Nodes;
};

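The edge list is represented as an array of arrays, where the index is the number of the node each edge leaves from (the source node). Each element then holds a pair of values:

The number of the node the edge enters (the destination node).
The weight of the edge.

With this simple construct, we can visit every node in the graph and get all the necessary information about its connections: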

int toNode = graph.Edges[fromNode][neighbourIndex].first;
int weight = graph.Edges[fromNode][neighbourIndex].second;

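Now, let's create some custom connections within the graph to observe their effect on how the universal algorithm works. Since this code is not the main focus here, I will only link to the relevant methods:
Generating the list of nodes
Creating custom connections
Alternatively, all connections and weights in this graph could be generated lazily with much less code. However, that approach would not give a comprehensive picture of the actual differences in how the algorithms traverse the graph.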
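A minimal sketch of the lazy alternative, assuming a full grid in which every edge has the same weight. The helper name makeUniformGrid and its parameters are hypothetical and are not taken from the repository; GraphNode and Graph repeat the definitions shown earlier so that the block compiles on its own. Node numbering follows the article: node id = X * cols + Y.

```cpp
#include <utility>
#include <vector>

using namespace std;

class GraphNode
{
public:
    int X;
    int Y;
};

class Graph
{
public:
    vector<vector<pair<int, int>>> Edges;
    vector<GraphNode> Nodes;
};

// Hypothetical helper: builds a rows x cols grid where all edges share one weight.
Graph makeUniformGrid(int rows, int cols, int weight)
{
    Graph graph;
    graph.Nodes.resize(rows * cols);
    graph.Edges.resize(rows * cols);

    for (int x = 0; x < rows; x++)
    {
        for (int y = 0; y < cols; y++)
        {
            int id = x * cols + y;
            graph.Nodes[id].X = x;
            graph.Nodes[id].Y = y;

            // Connect to the right neighbour in both directions.
            if (y + 1 < cols)
            {
                graph.Edges[id].push_back({ id + 1, weight });
                graph.Edges[id + 1].push_back({ id, weight });
            }

            // Connect to the neighbour below in both directions.
            if (x + 1 < rows)
            {
                graph.Edges[id].push_back({ id + cols, weight });
                graph.Edges[id + cols].push_back({ id, weight });
            }
        }
    }

    return graph;
}
```

Calling makeUniformGrid(5, 5, 1) yields the 25-node grid from the first diagram. The weighted grid above would still need its individual weights assigned afterwards.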

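Universal Algorithm

At the core of the universal pathfinding algorithm lies a universal data structure, which we will call the "Queue" in this project. It is not a classic FIFO (first in, first out) data structure, however. Instead, it is a general structure that lets us queue nodes during traversal while changing the queuing mechanism depending on the algorithm in use. The interface of this "Queue" is simple: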

class pathFindingBase
{
public:
  virtual void insert(int node) = 0;
  virtual int getFirst() = 0;
  virtual bool isEmpty() = 0;
};

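Before we dive into the details of the Queue, let's examine the traversal algorithm itself.

Essentially, it closely resembles a typical A-Star or Dijkstra algorithm. First, we need to initialize a set of collections that let us:

Maintain a list of nodes that are not yet processed (White), currently being processed (Grey), and processed/visited (Black).
Track the current distance of the shortest path from the start node to each node in the collection.
Store a list of previous/next node pairs so that we can reconstruct the final path afterwards.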

const int INF = 1000000;
const int WHITE = 0;
const int GREY = 1;
const int BLACK = 2;

/// <summary>
/// Universal algorithm to apply Path search using BFS, DFS, Dijkstra, A-Star.
/// </summary>
vector<int> FindPath(Graph& graph, int start, int finish, int finishX, int finishY)
{
  int verticesNumber = graph.Nodes.size();

  // All the nodes are White colored initially
  vector<int> nodeColor(verticesNumber, WHITE);

  // Current shortest path found from Start to i 
  // is some large/INFinite number from the beginning.
  vector<int> shortestPath(verticesNumber, INF);

  // Index of the vertex/node that is predecessor 
  // of i-th vertex in a shortest path to it.
  vector<int> previousVertex(verticesNumber, -1); 

  // We should use pointers here because we want 
  // to pass the pointer to a data-structure
  // so it may receive all the updates automatically on every step.
  auto ptrShortestPath = make_shared<vector<int>>(shortestPath);
  shared_ptr<Graph> ptrGraph = make_shared<Graph>(graph);

  ...

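Next, we need to initialize our data structure. With the code provided in the GitHub repository, you simply uncomment the line you need. The code is deliberately not designed to select the data structure based on a parameter, because I want you to actively experiment with it to gain a better understanding (yes, I'm a tough guy :D).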

//
// TODO
// UNCOMMENT DATA STRUCTURE YOU WANT TO USE:

//dfsStack customQueue;
//bfsQueue customQueue;
//dijkstraQueue customQueue(ptrShortestPath);
//aStarQueue customQueue(finishX, finishY, ptrGraph, ptrShortestPath);

// END OF TODO
//

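Finally, the algorithm itself. Essentially, it combines all of the algorithms above with a few additional checks. We initialize a "customQueue" and run the algorithm until it becomes empty. While examining the neighbours of the current node, we enqueue every node that may need to be traversed next. Then we call getFirst(), which extracts the single node that should be traversed next.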

...
  customQueue.insert(start);
  nodeColor[start] = BLACK;
  ptrShortestPath->at(start) = 0;

  // Traverse nodes starting from start node.
  while (!customQueue.isEmpty()) 
  {
    int current = customQueue.getFirst();

    // If we found finish node, then let's print full path.
    if (current == finish) 
    {
      vector<int> path;

      int cur = finish;
      path.push_back(cur);

      // Recover path node by node.
      while (previousVertex[cur] != -1) 
      {
        cur = previousVertex[cur];
        path.push_back(cur);
      }

      // Since we are at the finish node, reverse list to be at start.
      reverse(path.begin(), path.end()); 
      return path;
    }

    for (int neighbourIndex = 0; 
         neighbourIndex < graph.Edges[current].size(); 
         neighbourIndex++)
    {
      int to = graph.Edges[current][neighbourIndex].first;
      int weight = graph.Edges[current][neighbourIndex].second;

      if (nodeColor[to] == WHITE) // If node is not yet visited.
      {
        nodeColor[to] = GREY; // Mark node as "in progress".
        customQueue.insert(to);
        previousVertex[to] = current;

        // Calculate cost of moving to this node.
        ptrShortestPath->at(to) = ptrShortestPath->at(current) + weight;
      }
      else // Select the most optimal route.
      {
        if (ptrShortestPath->at(to) > ptrShortestPath->at(current) + weight)
        {
          ptrShortestPath->at(to) = ptrShortestPath->at(current) + weight;
        }
      }
    }

    nodeColor[current] = BLACK;
  }

  return {};
}

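So far, the implementation does not differ significantly from other examples you may find in books or on the internet. Here is the key aspect, however: getFirst() is the method that does the real work, because it determines the exact order in which nodes are traversed.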
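To make the role of getFirst() concrete, here is a small sketch in which the same sequence of insert() calls produces different extraction orders depending on the container. The class names fifoDemo and lifoDemo and the helper extractionOrder are hypothetical and exist only for this illustration; the interface repeats pathFindingBase with a virtual destructor added.

```cpp
#include <queue>
#include <stack>
#include <vector>

using namespace std;

// Same interface as in the article, plus a virtual destructor.
class pathFindingBase
{
public:
  virtual ~pathFindingBase() = default;
  virtual void insert(int node) = 0;
  virtual int getFirst() = 0;
  virtual bool isEmpty() = 0;
};

// Hypothetical FIFO container: the oldest inserted node comes out first.
class fifoDemo : public pathFindingBase
{
  queue<int> _items;

public:
  void insert(int node) override { _items.push(node); }
  int getFirst() override { int value = _items.front(); _items.pop(); return value; }
  bool isEmpty() override { return _items.empty(); }
};

// Hypothetical LIFO container: the most recently inserted node comes out first.
class lifoDemo : public pathFindingBase
{
  stack<int> _items;

public:
  void insert(int node) override { _items.push(node); }
  int getFirst() override { int value = _items.top(); _items.pop(); return value; }
  bool isEmpty() override { return _items.empty(); }
};

// Inserts all nodes, then drains the container and records the order.
vector<int> extractionOrder(pathFindingBase& container, const vector<int>& nodes)
{
  for (int node : nodes)
  {
    container.insert(node);
  }

  vector<int> order;
  while (!container.isEmpty())
  {
    order.push_back(container.getFirst());
  }
  return order;
}
```

Feeding the nodes 1, 5, 6 into fifoDemo returns them as 1, 5, 6, while lifoDemo returns 6, 5, 1. The traversal loop never needs to know which of the two it is talking to.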

BFS Queue

Let's take a closer look at the inner workings of our queue data structure. The queue interface for BFS is the simplest one:

#include <queue>
#include "pathFindingBase.h"

class bfsQueue : public pathFindingBase
{
private:
  queue<int> _queue;

public:
  virtual void insert(int node)
  {
    _queue.push(node);
  }

  virtual int getFirst()
  {
    int value = _queue.front();
    _queue.pop();
    return value;
  }

  virtual bool isEmpty()
  {
    return _queue.empty();
  }
};

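In fact, we could simply replace the custom queue interface here with the standard C++ queue provided by the STL (Standard Template Library). The goal here, however, is universality. Now you just need to uncomment this line in the main method (FindPath) and run the algorithm: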
//bfsQueue customQueue; // UNCOMMENT TO USE BFS

As a result, BFS finds the path 24<-19<-14<-9<-8<-7<-6<-1<-0.

(0, 0) - (0, 1)
            |
         (1, 1) - (1, 2) - (1, 3) - (1, 4)
                                       |
                                    (2, 4)
                                       |
                                    (3, 4)
                                       |
                                    (4, 4)

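If we take the weights into account, the total cost of this path is 11. Remember, however, that neither BFS nor DFS considers weights. Instead, they traverse the nodes of the graph hoping to reach the desired node sooner or later.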
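The cost of a path is simply the sum of the edge weights along it. The sketch below shows one way to compute it from the adjacency layout used by Graph::Edges; the alias EdgeList and the helper pathCost are hypothetical names introduced for this example, and returning -1 for a broken path is an arbitrary convention.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using namespace std;

// Same layout as Graph::Edges: edges[from] holds (to, weight) pairs.
using EdgeList = vector<vector<pair<int, int>>>;

// Hypothetical helper: sums the weights along a path given in start-to-finish
// order. Returns -1 if two consecutive nodes are not connected by an edge.
int pathCost(const EdgeList& edges, const vector<int>& path)
{
  int total = 0;

  for (size_t i = 0; i + 1 < path.size(); i++)
  {
    bool found = false;

    for (const auto& edge : edges[path[i]])
    {
      if (edge.first == path[i + 1])
      {
        total += edge.second;
        found = true;
        break;
      }
    }

    if (!found)
    {
      return -1;
    }
  }

  return total;
}
```

Passing the vector returned by FindPath together with graph.Edges gives the figure quoted for each algorithm, which makes it easy to compare the four results side by side.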

DFS Queue

DFS does not look much different. We only replace the STL queue with a stack.

#include <stack>
#include "pathFindingBase.h"

class dfsStack : public pathFindingBase
{
private:
  stack<int> _queue;

public:
  virtual void insert(int node)
  {
    _queue.push(node);
  }

  virtual int getFirst()
  {
    int value = _queue.top();
    _queue.pop();
    return value;
  }

  virtual bool isEmpty()
  {
    return _queue.empty();
  }
};

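DFS finds the path 24<-23<-22<-21<-20<-15<-10<-5<-0 with a cost of 15 (it does not prioritize finding the cheapest route). Interestingly, it traverses in the opposite direction compared to BFS: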

(0, 0)
   | 
(1, 0) 
   |
(2, 0)
   |
(3, 0)
   | 
(4, 0) - (4, 1) - (4, 2) - (4, 3) - (4, 4)

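Dijkstra Queue

Dijkstra's algorithm is the best-known greedy search algorithm for graphs. Despite its known limitations (it cannot handle negative edge weights or negative cycles), it remains popular and efficient enough.

Note that the getFirst() method in this implementation takes a greedy approach to selecting the next node to traverse.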

#include <algorithm>
#include <memory>
#include <vector>
#include "pathFindingBase.h"

class dijkstraQueue : public pathFindingBase
{
private:
  vector<int> _queue;
  shared_ptr<vector<int>> _shortestPaths;

public:
  dijkstraQueue(shared_ptr<vector<int>> shortestPaths) : _shortestPaths(shortestPaths) { }

  virtual void insert(int node)
  {
    _queue.push_back(node);
  }

  virtual int getFirst()
  {
    int minimum = INF;
    int minimumNode = -1;

    for (int i = 0; i < _queue.size(); i++)
    {
      int to = _queue[i];
      int newDistance = _shortestPaths->at(to);

      if (minimum > newDistance) // Greedy selection: select node with minimum distance on every step
      {
        minimum = newDistance;
        minimumNode = to;
      }
    }

    if (minimumNode != -1)
    {
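      // std::remove alone does not shrink the vector; erase the leftover tail as well.
      _queue.erase(remove(_queue.begin(), _queue.end(), minimumNode), _queue.end());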
    }

    return minimumNode;
  }

  virtual bool isEmpty()
  {
    return _queue.empty();
  }
};

Dijkstra's algorithm finds the shortest, optimal path 24<-19<-18<-13<-12<-7<-6<-1<-0 with a cost of 10:

(0, 0) -1- (0, 1)
             |
             1 
             |
           (1, 1) -1- (1, 2)
                         |
                         1 
                         |
                      (2, 2) -1- (2, 3)
                                    |
                                    1 
                                    |
                                  (3, 3) -2- (3, 4)
                                               |
                                               2 
                                               |
                                             (4, 4)

A-Star

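The A-Star algorithm is particularly well suited to finding paths in Euclidean space with coordinates, such as maps. This is why it is so widely used in games. It does not rely only on a "blind" greedy search driven by minimal weights; it also takes the Euclidean distance to the target into account. As a result, it is usually much more efficient than Dijkstra's algorithm in practical scenarios.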

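#include <algorithm>
#include <cmath>
#include <memory>
#include <vector>
#include "pathFindingBase.h"

class aStarQueue : public pathFindingBase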
{
private:
  vector<int> _queue;
  shared_ptr<vector<int>> _shortestPaths;
  shared_ptr<Graph> _graph;
  int _finishX;
  int _finishY;

  /// <summary>
  /// Euclidean distance from the specified node id to the finish node.
  /// </summary>
  int calcEuristic(int id)
  {
    return sqrt(
      pow(abs(
        _finishX > _graph->Nodes[id].X ?
        _finishX - _graph->Nodes[id].X :
        _graph->Nodes[id].X - _finishX), 2) +
      pow(abs(
        _finishY > _graph->Nodes[id].Y ?
        _finishY - _graph->Nodes[id].Y :
        _graph->Nodes[id].Y - _finishY), 2));
  }

public:
  aStarQueue(int finishX, int finishY, shared_ptr<Graph> graph, shared_ptr<vector<int>> shortestPaths)
    :
    _shortestPaths(shortestPaths),
    _graph(graph)
  {
    _finishX = finishX;
    _finishY = finishY;
  }

  virtual void insert(int node)
  {
    _queue.push_back(node);
  }

  virtual int getFirst()
  {
    int minimum = INF;
    int minimumNode = -1;

    for (int i = 0; i < _queue.size(); i++)
    {
      int to = _queue[i];
      int newDistance = _shortestPaths->at(to);
      int euristic = calcEuristic(to);

      if (minimum > newDistance + euristic)
      {
        minimum = newDistance + euristic;
        minimumNode = to;
      }
    }

    if (minimumNode != -1)
    {
      _queue.erase(remove(_queue.begin(), _queue.end(), minimumNode), _queue.end());
    }

    return minimumNode;
  }

  virtual bool isEmpty()
  {
    return _queue.empty();
  }
};

As a result, we get the same path as with Dijkstra's algorithm, because it is the optimal route.
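The only difference from the Dijkstra container is the value being minimized: the cost so far plus an estimate of the remaining distance. The sketch below isolates that calculation. The function names euclidean and aStarPriority are hypothetical, and unlike calcEuristic in the class above, which truncates to int, this version keeps the result as a double.

```cpp
#include <cmath>

// Straight-line (Euclidean) distance between two grid coordinates.
double euclidean(int x1, int y1, int x2, int y2)
{
  return std::hypot(static_cast<double>(x2 - x1), static_cast<double>(y2 - y1));
}

// Hypothetical helper: the value A-Star minimizes when picking the next node,
// i.e. the cost accumulated so far plus the estimated distance to the finish.
double aStarPriority(int costSoFar, int x, int y, int finishX, int finishY)
{
  return costSoFar + euclidean(x, y, finishX, finishY);
}
```

With the finish at (4, 4), two nodes that both cost 2 to reach are ranked differently: (3, 3) scores about 3.41, while (0, 0) scores about 7.66, so the node closer to the target is expanded first. Dijkstra would treat the two as equal.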

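Downsides

However, there is a problem with our Dijkstra and A-Star implementations...
The implementation above uses a vector (a dynamic array) inside our universal data structure. Every call to getFirst() takes O(N) time to find the required node in the vector. The main loop calls getFirst() once per processed node and also scans about M neighbours per node, where M is the average number of neighbours, so the total is roughly O(N*N + N*M), compared with the O((N + N*M) * log N) that a heap-based queue would give. This causes a significant performance drop on large graphs.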

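While this example is useful for seeing that all four algorithms are not fundamentally different, the devil is in the details. Implementing all four algorithms efficiently with one universal data structure is challenging.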

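For the best performance (which is the primary concern in 99% of cases), more effort should go into optimization. For example, it makes a lot of sense to use a priority queue instead of an array for both Dijkstra and A-Star.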
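As an illustration of the priority queue idea, here is a standalone sketch of Dijkstra's algorithm built on std::priority_queue. It is not a drop-in replacement for dijkstraQueue: the traversal loop above inserts a node before its distance is written, so a heap keyed at insertion time would need the loop to be adapted. The names EdgeList and dijkstraWithHeap are hypothetical, and outdated heap entries are skipped rather than updated in place.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using namespace std;

// Same layout as Graph::Edges: edges[from] holds (to, weight) pairs.
using EdgeList = vector<vector<pair<int, int>>>;

const int INF = 1000000;

// Standalone sketch: returns the shortest distance from start to every node.
// Assumes non-negative edge weights.
vector<int> dijkstraWithHeap(const EdgeList& edges, int start)
{
  vector<int> distance(edges.size(), INF);

  // Min-heap of (distance, node) pairs: the smallest distance is on top.
  priority_queue<pair<int, int>,
                 vector<pair<int, int>>,
                 greater<pair<int, int>>> heap;

  distance[start] = 0;
  heap.push({ 0, start });

  while (!heap.empty())
  {
    pair<int, int> top = heap.top();
    heap.pop();

    int dist = top.first;
    int current = top.second;

    // Skip stale entries: a shorter route to this node was already found.
    if (dist > distance[current])
    {
      continue;
    }

    for (const auto& edge : edges[current])
    {
      int to = edge.first;
      int candidate = dist + edge.second;

      if (candidate < distance[to])
      {
        distance[to] = candidate;
        heap.push({ candidate, to });
      }
    }
  }

  return distance;
}
```

Each extraction now costs O(log N) instead of a linear scan. The price is that the heap may temporarily hold several entries for the same node, which the staleness check discards.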

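Speaking of optimizing the A-Star algorithm, it makes sense to mention a couple of links that open up a deep world of optimizations: A* Optimizations and Improvements by Lucho Suaya and JPS+: Over 100x Faster than A* by Steve Rabin.

Final Words

The purpose of this article was to show how closely all these traversal algorithms are related to one another. However, the example graph used here is definitely too simple to show the real performance differences between them. Use these examples primarily to gain a conceptual understanding rather than for production purposes.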