Dynamic Programming for Probability Expectation and Gaussian Elimination to Solve Equations

The project for my algorithms class had a very interesting problem: using dynamic programming to compute expected values and a probability, with Gaussian elimination to solve the resulting equations. I am recording it here.

Problem:
Little Z has come to an ancient tomb to look for treasure. The tomb has many intersections connected by forks, and some intersections have traps. Little Z loses A[i] points of blood every time he passes the trap at intersection i, and the traps are permanent (that is, A[i] points are lost on every arrival at intersection i). Fortunately, some intersections have no trap. Unfortunately, Little Z has no sense of direction: he cannot tell where he has been or where he is heading, so at each intersection he picks one of its forks uniformly at random and follows it to the next intersection. Little Z starts at the tomb entrance (intersection 1), which has no trap; the treasure is hidden at intersection n, which has no trap either.

Unexpectedly, you are the guardian of this tomb and know its entire structure (its intersections, forks, and traps). You need to calculate the probability that Little Z reaches the treasure alive.

The function interface is:

double func(int n, int hp, vector<int>& damage, vector<int>& edges) {
}

Here n is the number of intersections; hp is Little Z's initial blood; damage holds n values giving the trap damage at each of the n intersections (0 means no trap; intersections 1 and n are guaranteed to have none); edges holds 2*(number of forks) values, where every consecutive pair of values is one edge, and all edges are bidirectional.

For example:
There is a triangle of intersections 1, 2, 3. Little Z has 2 points of blood and starts at 1; the treasure is at 3, and there is a trap at 2 (damage 1). The result is 0.875. Little Z succeeds if he reaches 3 before his blood drops to 0, so the only failing path is 1-2-1-2. Each move is an independent choice between two forks, so the probability of choosing wrongly three times in a row is 0.5^3 = 0.125, and the probability of reaching 3 is 1 - 0.125 = 0.875.
Example input:

3 2 3
0 1 0
1 2 
1 3
2 3

The first line gives n=3, hp=2, and the number of edges (3). Then damage[]={0,1,0}, and the edges are 1-2, 1-3, and 2-3 (all bidirectional), so edges[]={1,2,1,3,2,3}.
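
As an independent check of this example, the walk can also be followed step by step by pushing probability mass through (intersection, blood) states. This is only a sketch under my own assumptions: simulateWalk and its steps parameter are hypothetical names that do not appear in the assignment, and the result is exact only once all mass has been absorbed within the given number of steps.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the assignment): follows the random walk
// for a fixed number of steps and accumulates the mass that reaches n alive.
double simulateWalk(int n, int hp, const std::vector<int>& damage,
                    const std::vector<int>& edges, int steps) {
    // Build 1-indexed neighbor lists from the flat edge array.
    std::vector<std::vector<int>> adj(n + 1);
    for (std::size_t e = 0; e + 1 < edges.size(); e += 2) {
        adj[edges[e]].push_back(edges[e + 1]);
        adj[edges[e + 1]].push_back(edges[e]);
    }
    // mass[h][v]: probability of standing at intersection v with h blood left
    std::vector<std::vector<double>> mass(hp + 1, std::vector<double>(n + 1, 0.0));
    mass[hp][1] = 1.0;
    double success = 0.0;
    for (int s = 0; s < steps; ++s) {
        std::vector<std::vector<double>> next(hp + 1, std::vector<double>(n + 1, 0.0));
        for (int h = 1; h <= hp; ++h) {
            for (int v = 1; v <= n; ++v) {
                if (mass[h][v] == 0.0 || adj[v].empty()) continue;
                double share = mass[h][v] / adj[v].size();
                for (int u : adj[v]) {
                    int left = h - damage[u - 1];
                    if (left <= 0) continue;       // died in the trap at u
                    if (u == n) success += share;  // reached the treasure alive
                    else next[left][u] += share;
                }
            }
        }
        mass.swap(next);
    }
    return success;
}
```

For the triangle example, simulateWalk(3, 2, {0, 1, 0}, {1, 2, 1, 3, 2, 3}, 50) absorbs all mass within three steps and returns 0.875, which agrees with the hand calculation.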

Idea:
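Create a two-dimensional array dp[hp+1][n+1], where dp[i][j] is the expected number of times Little Z stands at intersection j with exactly i points of blood left. The state transition equation, written to match the code below, is:

$$\mathrm{dp}[i][j] = [\,i = hp \text{ and } j = 1\,] + \sum_{k \sim j,\; k \neq n} \frac{\mathrm{dp}[i + d_j][k]}{\deg(k)}$$

Here d_j is the trap damage at intersection j (0 if there is no trap), k ~ j means k is adjacent to j, deg(k) is the number of forks at k, and any term with i + d_j > hp counts as 0. The bracket adds 1 when i=hp and j=1, because the walk starts at intersection 1 and therefore always visits it once. The end point n is excluded from the sum because the walk is over once the end is reached, so Little Z never walks back out of it.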

When damage[j]>0, dp[i][j] depends only on layers with more blood, which are already solved and can be treated as constants, so it can be computed directly. When damage[j]=0, the dp values of all trap-free intersections in the same hp layer depend on one another (they form a system of equations), so they cannot be computed one by one. For each layer, first compute the dp values of the intersections with damage>0 and treat them as constants, then use Gaussian elimination to solve the linear system for the intersections with damage=0. That gives every dp value in this hp layer, after which the next layer (one point of blood less) can be solved.
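
Written out for a single layer i, the system handed to Gaussian elimination looks roughly as follows. The notation is introduced here for illustration and is not from the assignment: Z is the set of trap-free intersections, x_j stands for dp[i][j] with j in Z, k ~ j means k is adjacent to j, and deg(k) is the number of forks at k.

```latex
x_j \;-\; \sum_{k \in Z,\; k \sim j,\; k \neq n} \frac{x_k}{\deg(k)} \;=\; b_j ,
\qquad
b_j \;=\; [\,i = hp \text{ and } j = 1\,] \;+\; \sum_{k \notin Z,\; k \sim j} \frac{\mathrm{dp}[i][k]}{\deg(k)}
```

Every b_j is known by the time the system is solved, because the trap intersections of layer i only depend on layers with more blood. The code below stores the same equations multiplied by -1.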

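Finally, the probability that Little Z reaches the treasure alive is the sum, over all blood levels i >= 1, of the expected number of arrivals at intersection n:

$$P = \sum_{i=1}^{hp} \mathrm{dp}[i][n]$$

Because the walk stops at n, Little Z arrives there at most once, so this expected count is exactly the probability of arriving.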
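
As a worked check, here are the values for the triangle example (n=3, hp=2, trap of damage 1 at intersection 2), where every intersection has 2 forks. This is my own hand calculation following the description above, with dp[2][2] = 0 because arriving at the trap with 2 blood left would require a layer with 3 blood.

```latex
\begin{aligned}
\text{layer } i=2:\quad & \mathrm{dp}[2][2] = 0,\quad \mathrm{dp}[2][1] = 1 + \tfrac{\mathrm{dp}[2][2]}{2} = 1,\quad \mathrm{dp}[2][3] = \tfrac{\mathrm{dp}[2][1]}{2} + \tfrac{\mathrm{dp}[2][2]}{2} = 0.5 \\
\text{layer } i=1:\quad & \mathrm{dp}[1][2] = \tfrac{\mathrm{dp}[2][1]}{2} = 0.5,\quad \mathrm{dp}[1][1] = \tfrac{\mathrm{dp}[1][2]}{2} = 0.25,\quad \mathrm{dp}[1][3] = \tfrac{\mathrm{dp}[1][1]}{2} + \tfrac{\mathrm{dp}[1][2]}{2} = 0.375 \\
\text{result}:\quad & P = \mathrm{dp}[2][3] + \mathrm{dp}[1][3] = 0.5 + 0.375 = 0.875
\end{aligned}
```

The total matches the 0.875 obtained by counting the single failing path.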

Code:

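#include <cmath>
#include <utility>
#include <vector>
using namespace std;

// Forward elimination with partial (column) pivoting.
// a is a 1-indexed augmented matrix: rows 1..n, columns 1..n+1.
void upperTrangle(vector<vector<double>> &a, int n) {
    double tmp; // elimination factor
    for (int i = 1; i <= n; i++) {
        int r = i;
        for (int j = i + 1; j <= n; j++)
            if (fabs(a[j][i]) > fabs(a[r][i]))
                r = j;
        if (r != i)
            for (int j = i; j <= n + 1; j++)
                swap(a[i][j], a[r][j]); // swap with the row holding the largest pivot
        for (int j = i + 1; j <= n; j++) {
            // eliminate column i from row j
            tmp = a[j][i] / a[i][i];
            for (int k = i; k <= n + 1; k++)
                a[j][k] -= a[i][k] * tmp;
        }
    }
}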
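// Gaussian elimination with partial pivoting.
// On return, the solution for unknown i is stored in a[i][n + 1].
void Gauss(vector<vector<double>> &a, int n) {
    upperTrangle(a, n); // pivot and reduce to upper triangular form

    for (int i = n; i >= 1; i--) {
        // back substitution
        for (int j = i + 1; j <= n; j++)
            a[i][n + 1] -= a[i][j] * a[j][n + 1];
        a[i][n + 1] /= a[i][i];
    }
}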

// Returns the neighbors of intersection p.
// edges stores every bidirectional edge as a consecutive pair of endpoints.
vector<int> findAdjacent(const vector<int> &edges, int p) {
    vector<int> points;
    for (size_t i = 0; i < edges.size() / 2; ++i) {
        if (edges[2 * i] == p) {
            points.push_back(edges[2 * i + 1]);
        }
        else if (edges[2 * i + 1] == p) {
            points.push_back(edges[2 * i]);
        }
    }
    return points;
}

double func(int n, int hp, vector<int>& damage, vector<int>& edges) {
    // dp[i][j]: expected number of visits to intersection j with i blood left
    vector<vector<double>> dp(hp + 1, vector<double>(n + 1, 0.0));
    dp[hp][1] = 1; // the walk starts at intersection 1 with full blood

    // adjacentCount[v]: number of neighbors of intersection v (index 0 unused)
    vector<int> adjacentCount(n + 1, 0);
    for (int i = 1; i <= n; ++i)
        adjacentCount[i] = (int)findAdjacent(edges, i).size();

    for (int row = hp; row >= 1; --row) {
        // First compute the intersections with damage > 0; they depend only
        // on layers with more blood, which are already solved.
        for (int col = 1; col <= n; ++col) {
            if (damage[col - 1] > 0) {
                for (int i : findAdjacent(edges, col)) {
                    // only neighbors that are not the end point
                    if (i != n && row + damage[col - 1] <= hp) {
                        dp[row][col] += dp[row + damage[col - 1]][i] / (double)adjacentCount[i];
                    }
                }
            }
        }

        // Then compute the intersections with damage == 0
        vector<int> zero;
        for (int col = 1; col <= n; ++col) {
            if (damage[col - 1] == 0) {
                zero.push_back(col);
            }
        }
        int m = (int)zero.size();

        // (m+1) x (m+2) matrix: row 0 and column 0 are unused padding,
        // the rest is the augmented matrix
        vector<vector<double>> matrix(m + 1, vector<double>(m + 2, 0.0));

        // Fill the augmented matrix
        for (int i = 0; i < m; ++i) {
            matrix[i + 1][i + 1] = -1;
            matrix[i + 1][m + 1] = -dp[row][zero[i]]; // constant term

            for (int k : findAdjacent(edges, zero[i])) {
                if (k != n) {
                    if (damage[k - 1] > 0) {
                        // damage > 0: already known, goes into the constant term
                        matrix[i + 1][m + 1] -= dp[row][k] / (double)adjacentCount[k];
                    }
                    else {
                        // damage == 0: an unknown of this layer
                        for (int index = 0; index < m; ++index) {
                            if (zero[index] == k) {
                                // += so that repeated edges accumulate, like the constant term above
                                matrix[i + 1][index + 1] += 1 / (double)adjacentCount[k];
                                break;
                            }
                        }
                    }
                }
            }
        }

        // Solve with Gaussian elimination
        Gauss(matrix, m);

        // Write the solution back into dp
        for (int i = 0; i < m; ++i) {
            dp[row][zero[i]] = matrix[i + 1][m + 1];
        }
    }

    // Sum the expected arrivals at the end point over all blood levels >= 1
    double result = 0;
    for (int i = 1; i <= hp; ++i) {
        result += dp[i][n];
    }

    return result;
}
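
To make the 1-indexed matrix layout easier to see, here is a small stand-alone sketch that solves a 2x2 system with the same convention (row 0 and column 0 unused, right-hand side in column n+1). solveAugmented and exampleSystem are hypothetical names of my own, and the zero-pivot check is an addition that the code above does not have.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical stand-alone solver: rows 1..n, columns 1..n hold coefficients,
// column n+1 holds the right-hand side. Returns false if a pivot is
// numerically zero, i.e. the system looks singular.
bool solveAugmented(std::vector<std::vector<double>>& a, int n) {
    for (int i = 1; i <= n; ++i) {
        int r = i;
        for (int j = i + 1; j <= n; ++j)
            if (std::fabs(a[j][i]) > std::fabs(a[r][i])) r = j;
        if (std::fabs(a[r][i]) < 1e-12) return false;
        std::swap(a[i], a[r]); // move the row with the largest pivot up
        for (int j = i + 1; j <= n; ++j) {
            double f = a[j][i] / a[i][i];
            for (int k = i; k <= n + 1; ++k) a[j][k] -= f * a[i][k];
        }
    }
    // Back substitution: the solution ends up in column n+1.
    for (int i = n; i >= 1; --i) {
        for (int j = i + 1; j <= n; ++j) a[i][n + 1] -= a[i][j] * a[j][n + 1];
        a[i][n + 1] /= a[i][i];
    }
    return true;
}

// Example system: 2x + y = 5 and x + 3y = 10.
std::vector<std::vector<double>> exampleSystem() {
    return {
        {0, 0, 0, 0},   // unused padding row
        {0, 2, 1, 5},
        {0, 1, 3, 10},
    };
}
```

Solving exampleSystem() with n = 2 leaves x = 1 in a[1][3] and y = 3 in a[2][3].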

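The time complexity is O(hp*n^3), since each of the hp layers runs Gaussian elimination on up to n unknowns at O(n^3) per layer; the space complexity is O(hp*n) for dp plus O(n^2) for the per-layer matrix.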
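
One possible refinement: findAdjacent scans the whole edge list on every call, and func calls it for every intersection in every layer. A minimal sketch of building the neighbor lists once up front could look like this; buildAdjacency is a hypothetical name and is not part of the code above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds 1-indexed neighbor lists in a single pass,
// so later lookups no longer rescan the flat edge array.
std::vector<std::vector<int>> buildAdjacency(int n, const std::vector<int>& edges) {
    std::vector<std::vector<int>> adj(n + 1);
    for (std::size_t e = 0; e + 1 < edges.size(); e += 2) {
        int u = edges[e], v = edges[e + 1];
        adj[u].push_back(v);
        if (u != v) adj[v].push_back(u); // record a self-loop once, as findAdjacent does
    }
    return adj;
}
```

With the lists in hand, adj[v] would replace findAdjacent(edges, v) and adj[v].size() would replace adjacentCount[v].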

Origin blog.csdn.net/livingsu/article/details/106824702