Backtracking Method for Algorithm Design and Analysis

1. Introduction to Backtracking

Backtracking, also known as the trial-and-error method, is a brute-force search method for finding an optimal solution. Because it is brute force, its time complexity is high, so for problems with large inputs, such as the shortest path problem, the running time is generally long. In the backtracking method, **DFS (Depth First Search)** is a very important tool.

1.1 The basic idea of DFS

(1) Starting from the current node, explore one possible choice and generate a child node;

(2) If at any point the choice made turns out not to meet the requirements, return to the parent node, choose another direction, generate a child node again, and continue exploring forward;

(3) Repeat until the optimal solution is obtained or every direction has been explored.

1.2 The basic idea of the backtracking method

(1) For the specific problem, define its solution space;

(2) Choose a structure for the solution space that is easy to search (the choice of data structure);

(3) Search the solution space, generally in DFS order;

(4) During the search, use pruning functions to speed up the algorithm. (Pruning functions: the constraint function and the bounding function, which cut off subtrees that cannot contain the optimal solution, are collectively called pruning functions.)

Solution space: As the name suggests, the set of all candidate solutions to a problem. (Most of them are still far from the optimal solution we are looking for.)

Constraints: the requirements a valid solution must meet, that is, the requirements stated in the problem.

Constraint function: A function that cuts off subtrees that do not satisfy the constraints.

Bounding function: A function that cuts off nodes that cannot lead to the optimal solution.

Expansion node: The node that is currently generating child nodes.

The types of solution spaces processed by the backtracking method are mainly divided into the following two types:

  • Subset tree: When the problem is to find a subset of a set that satisfies a certain property, the corresponding solution space tree is called a subset tree.

  • Permutation tree: When the problem is to find an ordering (permutation) of the elements of a set that satisfies a certain property, the corresponding solution space tree is called a permutation tree.

1.3 The difference between the backtracking method and DFS

DFS is an algorithm for traversing or searching data structures such as graphs and trees; it is more like a tool.

The backtracking method is a way of solving problems: it keeps generating and discarding partial solutions until the optimal solution is found or the search is finished (generating the solution space dynamically during the search is an important feature of the backtracking method). It is the guiding idea, and DFS is the tool it uses to search the solution space thoroughly.

1.4 Pruning

Pruning uses filter conditions to cut off search paths that do not need to be considered at all (paths already known not to lead to the optimal solution). This avoids unnecessary searching and speeds up the algorithm, but the correctness of the result must still be guaranteed.

Applied to the backtracking algorithm, we can judge in advance whether the current path can still produce a valid result, and if not, backtrack early. This is called feasibility pruning.

There is also optimality pruning, which records the best value found so far. If the current node cannot produce a better solution than the current best, we can backtrack early.

However, good filter conditions for pruning are hard to find. To improve efficiency through pruning, both the correctness of the result and the accuracy of the pruning must be guaranteed.

2. 0-1 Knapsack Problem: Subset Tree

2.1 Problem introduction

The 0-1 knapsack problem is a classic problem solved with a subset tree. The problem is as follows:

Xiao Ming is going to visit his classmates and plans to bring a backpack of chocolates as a gift. He wants the total value of the chocolates he packs to be as high as possible. However, Xiao Ming's strength is limited, so the bag of chocolates must not be too heavy: at most 8 kg. The available chocolates are as follows:

| No. | Brand | Weight/kg | Value |
|---|---|---|---|
| 1 | Ferrero | 4 | 45 |
| 2 | good time | 5 | 57 |
| 3 | Dove | 2 | 22 |
| 4 | Cudie (Spain) | 1 | 11 |
| 5 | self made | 6 | 67 |

2.2 Solutions

[Figure: the solution space of the knapsack problem drawn as a binary subset tree]

Because we are looking for a subset, each item has only two states, selected or not selected, so the solution space is a binary tree. In this tree, the edges at each level represent whether one item is selected. As shown in the figure above, taking the edge from node 0 on the first level to the left node 1 means selecting item 1, that is, going down the left subtree; not putting item 1 into the bag means entering the right subtree, towards the right node 1. If there are n items in total, there are n levels of edges and n+1 levels of nodes. Each leaf node on the last level represents one way of selecting items. There are 2^n leaf nodes in total, that is, 2^n solutions in the solution space, and we need to find the best one among them.

We first give a pseudocode framework for searching a subset tree with the backtracking method:

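void search(level)
{
	if (the bottom level has been reached)
		output the solution;
	else
		for (each choice at the current level)
		{
			if (the choice is feasible)
				continue searching the next level;
			undo the effect of the current choice; // backtrack
		}
}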

The backtracking method is all about brute force. From that point of view, to find the best way to fill the backpack we must compute the total value of every feasible selection (every solution) and keep the best one. We start with the first kind of chocolate, decide whether it can be loaded, recurse to the next one, and so on until we reach the boundary; there we compare, record the better solution, backtrack, and continue searching. In terms of the subset tree: we go down one subtree first (for example the left one, putting the item into the bag); when we reach a leaf node or violate the weight constraint, we return to the parent node and enter the other subtree, until the whole tree has been traversed.

After deciding whether an item fits, a marker array (x[] in the code below) records whether it is packed.

2.3 Algorithm implementation

Based on the ideas above, the algorithm for the 0-1 knapsack problem can be written as follows:

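// 0-1 knapsack problem - backtracking - subset tree
#include <iostream>

int n, bag_v, bag_w;                    // number of items, best value so far, knapsack capacity
int bag[100], x[100], w[100], val[100]; // best selection, current selection, weights, values (1-based, n <= 99)

// Recursive search: cur is the item being decided; cur_v (current value) and
// cur_w (current weight) describe the knapsack at the current node
void search(int cur, int cur_v, int cur_w)
{
    if(cur > n) // reached a leaf of the subset tree
    {
        if(cur_v > bag_v) // does this leaf beat the best value so far?
        {
            bag_v = cur_v; // record the new best value
            for(int i = 1; i <= n; i++)
                bag[i] = x[i]; // x marks the items selected on the current path; copy it into bag
        }
    }
    else
        for(int j = 0; j <= 1; j++) // try both choices at this level: j = 1 selects the item, j = 0 skips it
        {
            x[cur] = j;
            if(cur_w + x[cur]*w[cur] <= bag_w) // weight constraint satisfied, keep searching
            {
                cur_w += w[cur]*x[cur];
                cur_v += val[cur]*x[cur];
                search(cur + 1, cur_v, cur_w); // recurse to the next item
                // undo the choice before trying the next one (backtrack)
                cur_w -= w[cur]*x[cur];
                cur_v -= val[cur]*x[cur];
                x[cur] = 0;
            }
        }
}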
int main()
{
    int i;
    bag_v = 0; // initialize the best value

    // read the input
    std::cout << "Enter the maximum capacity of the knapsack:" << std::endl;
    std::cin >> bag_w;
    std::cout << "Enter the number of items:" << std::endl;
    std::cin >> n;
    std::cout << "Enter the weight of each item in order:" << std::endl;
    for(i = 1; i <= n; i++)
        std::cin >> w[i];
    std::cout << "Enter the value of each item in order:" << std::endl;
    for(i = 1; i <= n; i++)
        std::cin >> val[i];
    search(1, 0, 0);

    std::cout << "The maximum value is:" << std::endl;
    std::cout << bag_v << std::endl;
    std::cout << "The selected item numbers are:" << std::endl;

    for(i = 1; i <= n; i++)
        if(bag[i] == 1)
            std::cout << i << " ";
    std::cout << std::endl;

    return 0;
}

The output is as follows:

PS E:\Code\VSCode\Learning\build> .\main.exe
Enter the maximum capacity of the knapsack:
8
Enter the number of items:
5
Enter the weight of each item in order:
4 5 2 1 6
Enter the value of each item in order:
45 57 22 11 67
The maximum value is:
90
The selected item numbers are:
2 3 4

2.4 How to optimize

We can use an upper bound function bound(): the current value plus the maximum value that the remaining capacity could still hold, compared with the best value found so far (the current optimal solution). If bound() is not larger, continuing the search is pointless and the subtree is cut off. In the code below, the left subtree (select the current item) is entered whenever the weight constraint allows it, while the right subtree (skip the current item) is entered only if its bound exceeds the best value so far.

Because each item has only two options, selected or not selected, and there are n items in total, the worst-case time complexity is O(2^n). The recursion stack is at most n levels deep, and only a constant number of one-dimensional arrays are needed to store the item information, so the space complexity is O(n).

So how do we calculate this "maximum value that the remaining capacity could still hold"? First, we sort the items by value per unit weight in descending order, and then consider the items in that order. The search step is as follows:

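if(cur_w+w[cur]<=bag_w) // item cur fits: search the left subtree, i.e. select the current item
{
	cur_w+=w[cur]; // update the current weight of the knapsack
	cur_v+=val[cur]; // update the current total value of the knapsack
	put[cur]=1;
	search(cur+1,cur_v,cur_w); // go one level deeper
	cur_w-=w[cur]; // restore on backtracking
	cur_v-=val[cur]; // restore on backtracking
}
if(bound(cur+1,cur_v,cur_w)>bag_v) // search the right subtree (skip the current item) only if its bound beats the best value
{
	put[cur]=0;
	search(cur+1,cur_v,cur_w);
}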
  • When the loop in bound() (see the full listing below) stops with i<=n, item i no longer fits into the remaining capacity leftw. bound() then adds only the fraction of item i that would fill the bag exactly. Items cannot actually be split, so the value computed this way is an ideal maximum that cannot be reached.

  • When i>n, all remaining items fit without exceeding the weight limit, and the computed value is a maximum that can actually be reached.

This is how the upper bound function works. It is a form of optimality pruning: it decides whether the current node still has a chance of producing a better solution.

The optimized algorithm is as follows:

#include <iostream>
#include <utility>

int n, bag_v, bag_w;
int bag[100], put[100], w[100], val[100], order[100];
double perp[100];

// Sort the items by value per unit weight, descending (a simple exchange sort)
void bubblesort()
{
    int i, j;

    for(i = 1; i <= n; i++)
        perp[i] = static_cast<double>(val[i]) / w[i]; // value per unit weight
    for(i = 1; i <= n - 1; i++)
    {
        for(j = i + 1; j <= n; j++)
            if(perp[i] < perp[j]) // keep perp[], order[], val[] and w[] in step
            {
                std::swap(perp[i], perp[j]);
                std::swap(order[i], order[j]);
                std::swap(val[i], val[j]);
                std::swap(w[i], w[j]);
            }
    }
}

// Upper bound used for pruning: the current value cur_v plus the most value
// the remaining capacity could hold if items could be split
double bound(int i, int cur_v, int cur_w)
{
    double leftw = bag_w - cur_w; // remaining capacity
    double b = cur_v;             // start from the current total value
    // add whole items in decreasing order of value per unit weight
    while(i <= n && w[i] <= leftw)
    {
        leftw -= w[i];
        b += val[i];
        i++;
    }
    // fill what is left with a fraction of the next item
    if(i <= n)
        b += static_cast<double>(val[i]) / w[i] * leftw;
    return b; // the computed upper bound
}

// Recursive search: cur is the item being decided, cur_v the current value,
// cur_w the current weight
void search(int cur, int cur_v, int cur_w)
{
    if(cur > n) // reached a leaf
    {
        if(cur_v > bag_v) // better than the best value so far?
        {
            bag_v = cur_v; // record the new best value
            for(int i = 1; i <= n; i++)
                bag[order[i]] = put[i]; // put marks the current selection; store it under the original item numbers
        }
        return; // stop here so that w[n+1] is never examined
    }
    // If the left child is feasible, search the left subtree directly.
    // For the right subtree, compute the upper bound first to decide whether to cut it off.
    if(cur_w + w[cur] <= bag_w) // item cur fits: search the left subtree, i.e. select the current item
    {
        cur_w += w[cur]; // update the current weight
        cur_v += val[cur]; // update the current total value
        put[cur] = 1;
        search(cur + 1, cur_v, cur_w); // go one level deeper
        cur_w -= w[cur]; // restore on backtracking
        cur_v -= val[cur]; // restore on backtracking
    }
    if(bound(cur + 1, cur_v, cur_w) > bag_v) // search the right subtree (skip the current item) only if its bound beats the best value
    {
        put[cur] = 0;
        search(cur + 1, cur_v, cur_w);
    }
}

int main()
{
    int i;
    bag_v = 0; // initialize the best value
    // read the input
    std::cout << "Enter the maximum capacity of the knapsack:" << std::endl;
    std::cin >> bag_w;
    std::cout << "Enter the number of items:" << std::endl;
    std::cin >> n;
    std::cout << "Enter the weight of each item in order:" << std::endl;
    for(i = 1; i <= n; i++)
        std::cin >> w[i];
    std::cout << "Enter the value of each item in order:" << std::endl;
    for(i = 1; i <= n; i++)
        std::cin >> val[i];
    for(i = 1; i <= n; i++) // the new order array stores the original item numbers
        order[i] = i;
    bubblesort(); // bound() relies on the items being sorted by value per unit weight
    search(1, 0, 0);

    std::cout << "The maximum value is:" << std::endl;
    std::cout << bag_v << std::endl;
    std::cout << "The selected item numbers are:" << std::endl;

    for(i = 1; i <= n; i++)
        if(bag[i] == 1)
            std::cout << i << " ";
    std::cout << std::endl;

    return 0;
}

3. Traveling Salesman Problem (TSP): Permutation Tree

3.1 Problem introduction

Before setting off, Xiao Ming thought it over and decided to visit his high school classmates at several universities. He plans to start from his own school, pass through the schools where his classmates study, and finally return to his own school. Xiao Ming is lazy and wants to take the shortest route, and he does not want to visit any school a second time, since the schools themselves are not the point of the trip. How should he plan the trip?

At first glance this looks like the shortest path problem. However, the shortest path problem does not require passing through every point, so the two problems are different.

3.2 Solutions

The biggest difference between the permutation tree and the subset tree is that a solution in the subset tree is an unordered subset, while a solution in the permutation tree is an ordering that contains every element of the set. In the spirit of brute force, we enumerate all permutations of the elements.

[Figure: permutation tree; numbers outside { } are already placed, numbers inside { } are not]

Numbers outside { } have already been placed, and numbers inside { } have not yet been placed.

In the permutation tree, each level selects one element and appends it to the end of the sequence built so far. Therefore, for a set of n elements, the root has n child nodes, meaning that any of the n elements can take the first position; each level below has one branch fewer than the level above (because one more position has been fixed). The tree has n+1 levels in total (the last level is omitted in the figure), corresponding to n selections. There are **n!** leaf nodes, one for each of the A(n, n) = n! permutations, so the time complexity is also O(n!).

In this problem, the solution space is the set of all permutations of the cities, that is, every order in which the cities can be visited, so we can use a permutation tree to model the problem.

3.3 Algorithm framework

void backtrack(int t)
{
    if(t > n)
        output(x);
    else
    {
        for(int i = t; i <= n; i++)
        {
            swap(x[t], x[i]);
            if(constraint(t) && bound(t))
                backtrack(t+1);
            swap(x[i],x[t]);
        }
    }
}

The swap here is an exchange function. For a permutation, exchanging any two different elements gives a new permutation. constraint() and bound() are the constraint function and the bounding function (for pruning), respectively.

Why use swap instead of, say, copying the data into a new array? Because when we swap inside the array x that holds the data, the elements already placed sit at the front of the array and the elements not yet placed stay behind them. The for loop can therefore start from t, never meets an element that has already been placed, and needs no extra bookkeeping such as a marker array.

3.4 Algorithm implementation

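// Traveling salesman problem - backtracking - permutation tree
#include <iostream>

int n;
int dis[100][100], x[100], bestroad[100]; // adjacency matrix, current order, best order (1-based)
int cur_dis, bestdis;                     // length of the current partial route; best tour length (0 = none found yet)
const int INF = 99999;                    // not used below: a missing road is stored as 0

void swap(int& a, int& b)  // exchange two elements
{
	int temp;
	temp = a;
	a = b;
	b = temp;
}

void backtrack(int t)
{
	if (t == n)
	{
		// Boundary: the last city x[n] must be reachable from x[n-1] and connected
		// back to city 1, and the complete tour must be shorter than the best so far
		if (dis[x[n - 1]][x[n]] != 0 && dis[x[n]][1] != 0 && (cur_dis + dis[x[n - 1]][x[n]] + dis[x[n]][1] < bestdis || bestdis == 0))
		{
			// record the best route and the best distance
			for (int j = 1; j <= n; j++)
				bestroad[j] = x[j];
			bestdis = cur_dis + dis[x[n - 1]][x[n]] + dis[x[n]][1];
		}
		return;
	}
	for (int j = t; j <= n; j++)
	{
		// Candidate x[j] must be reachable from the previous city x[t-1], and the
		// partial route must still be shorter than the best tour found so far
		if (dis[x[t - 1]][x[j]] != 0 && (cur_dis + dis[x[t - 1]][x[j]] < bestdis || bestdis == 0))
		{
			swap(x[t], x[j]);
			cur_dis += dis[x[t - 1]][x[t]];
			backtrack(t + 1);
			// backtrack
			cur_dis -= dis[x[t - 1]][x[t]];
			swap(x[t], x[j]);
		}
	}
}

int main()
{
	int i, j, m, a, b, c;

	std::cout << "Enter the number of cities:" << std::endl;
	std::cin >> n;
	std::cout << "Enter the number of roads:" << std::endl;
	std::cin >> m;
	// initialize the adjacency matrix
	for(i = 1; i <= n; i++)
		for(j = 1; j <= n; j++)
			dis[i][j] = 0;
	std::cout << "Enter each road and its distance:" << std::endl;

	// read the distances between cities
	for(i = 1; i <= m; i++)
	{
		std::cin >> a >> b >> c;
		dis[a][b] = dis[b][a] = c; // undirected graph: record both directions
	}
	for(i = 1; i <= n; i++)
		x[i] = i;

	backtrack(2); // city 1 is fixed as the start, so the search begins at position 2
	std::cout << "Best route: ";
	for (i = 1; i <= n; i++)
		std::cout << bestroad[i] << " --> ";
	std::cout << "1" << std::endl;
	std::cout << "Shortest distance: " << bestdis << std::endl;

	return 0;
}

The output is as follows:

PS E:\Code\VSCode\Learning\build> ."E:/Code/VSCode/Learning/build/main.exe"
Enter the number of cities:
4
Enter the number of roads:
6
Enter each road and its distance:
1 2 30
1 3 6
1 4 4
2 3 5
2 4 10
3 4 20
Best route: 1 --> 3 --> 2 --> 4 --> 1
Shortest distance: 25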

Note:

  • Unlike in shortest-path code, where a missing road is usually stored as **INF**, here both a missing road and the distance from a city to itself are stored as **0**, because in either case the city cannot be the next stop and no swap is needed.

  • We use t==n instead of t>=n as the boundary condition to prevent the array subscript from going out of bounds.

4. Summary

  • As a pure brute-force search method, backtracking has a very high time complexity: about 2^n for the subset tree and about n! for the permutation tree, so it does not scale to large problems. In return, it gives the true optimal solution.

  • The subset tree and the permutation tree of the backtracking method handle two types of problems: finding the optimal subset and finding the optimal ordering.

  • Finding good pruning functions for optimization is difficult.

Reference article: Program Ape Voice: [Algorithm Learning] Let's talk about the backtracking method

Origin blog.csdn.net/crossoverpptx/article/details/131419565