Topic description: Given two sequences X={x1, x2, x3, ...xm} and Y={y1, y2, y3, ... yn}, find the longest common subsequence of X and Y.
Analysis: With brute-force search, we would have to enumerate every subsequence of X and check each one against the subsequences of Y in order to pick out the LCS. X alone has 2^m subsequences, so brute-force search takes exponential time, which is clearly impractical. Can we instead deduce the LCS of X and Y from the LCS of their prefixes?
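To make the cost of the brute-force idea concrete, here is a minimal sketch. It assumes X and Y are strings, and the names `lcs_brute_force` and `is_subsequence` are made up for this illustration, not part of the original code. It tries the subsequences of X from longest to shortest and returns the first one that is also a subsequence of Y.

```python
from itertools import combinations


def is_subsequence(s, t):
    """Return True if s can be obtained from t by deleting elements."""
    it = iter(t)
    # `ch in it` advances the iterator, so matches must occur in order.
    return all(ch in it for ch in s)


def lcs_brute_force(X, Y):
    """Return one longest common subsequence of strings X and Y (exponential time)."""
    # Try the longest candidates first so the first hit is an LCS.
    for length in range(len(X), 0, -1):
        for idx in combinations(range(len(X)), length):
            candidate = "".join(X[k] for k in idx)
            if is_subsequence(candidate, Y):
                return candidate
    return ""


if __name__ == "__main__":
    print(lcs_brute_force("ABCBDAB", "BDCABA"))
```

In the worst case the loop visits all 2^m index subsets of X, so this is only usable for very short inputs.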
Suppose Xi = {x1, x2, x3, ... , xi} is a prefix of X and Yj = {y1, y2, y3, ... , yj} is a prefix of Y, and assume we already know that the LCS length of Xi and Yj is kij. What, then, is the LCS length of X(i+1) and Y(j+1)? Call it k(i+1)(j+1). A little thought shows that there are two cases: (1) If x(i+1) = y(j+1), then k(i+1)(j+1) = kij + 1. (2) If x(i+1) != y(j+1), then k(i+1)(j+1) = max(k(i+1)j, ki(j+1)). Readers familiar with dynamic programming will notice that this has exactly the structure dynamic programming is designed for. Let's continue to analyze the problem with the problem-solving approach of dynamic programming:
Step 1. Sub-problem: To find the LCS of Xi and Yj, we must first find the LCS of X(i-1) and Y(j-1), and the LCS of X(i) and Y(j-1) and the LCS of X(i-1) and Y(j), thus forming a recursive problem
Step 2. Find the state transition formula of dynamic programming
Suppose we use an array c[i, j] to record the LCS length of Xi and Yj. Then:
(1) c[i, j] = 0, if i = 0 or j = 0
(2) c[i, j] = c[i-1, j-1] + 1, if i, j > 0 and X[i] = Y[j]
(3) c[i, j] = max(c[i-1, j], c[i, j-1]), if i, j > 0 and X[i] != Y[j]
Step 3. Write the code according to the formula
(1) From the formula in step 2, we can easily write the recursive algorithm:
# Recursively find the LCS length.
# i and j are 0-based indices; call as LCS_Length(X, Y, len(X) - 1, len(Y) - 1)
def LCS_Length(X, Y, i, j):
    # Recursive exit: one of the prefixes is empty
    if i < 0 or j < 0:
        return 0
    if X[i] == Y[j]:
        return LCS_Length(X, Y, i-1, j-1) + 1
    return max(LCS_Length(X, Y, i, j-1), LCS_Length(X, Y, i-1, j))
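The plain recursion above solves the same (i, j) sub-problems over and over, so it is still exponential in the worst case. As a stepping stone towards the bottom-up version, here is a sketch of the same recursion with memoization. The name `lcs_length_memo` is an assumption for this example; `functools.lru_cache` is the standard-library cache decorator.

```python
from functools import lru_cache


def lcs_length_memo(X, Y):
    """LCS length of X and Y using the same recursion, with cached sub-results."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Same recursive exit as LCS_Length: an empty prefix has LCS length 0.
        if i < 0 or j < 0:
            return 0
        if X[i] == Y[j]:
            return solve(i - 1, j - 1) + 1
        return max(solve(i, j - 1), solve(i - 1, j))

    return solve(len(X) - 1, len(Y) - 1)


if __name__ == "__main__":
    print(lcs_length_memo("ABCBDAB", "BDCABA"))
```

Each (i, j) pair is now computed at most once, which gives O(m*n) time. The recursion depth can still reach about m + n, so for long inputs Python's recursion limit becomes a concern.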
(2) Recursive conversion into a bottom-up dynamic programming algorithm
# Dynamic programming for LCS
def LCS_Length2(X, Y):
    m = len(X)
    n = len(Y)
    # record[i][j] holds the LCS length of X[0..i] and Y[0..j] (0-based, inclusive)
    record = [[0 for j in range(n)] for i in range(m)]
    # The outer loop starts from i = 0 and calculates record[i][j] in turn.
    # Calculation order: [0,0],[0,1],[0,2],..., [1,0],[1,1],[1,2],...
    # So when solving record[i][j], we have already saved record[i-1][j-1],
    # record[i][j-1] and record[i-1][j] (the key to solving the problem)
    for i in range(m):
        for j in range(n):
            if X[i] == Y[j]:
                if i > 0 and j > 0:
                    record[i][j] = record[i-1][j-1] + 1
                else:
                    record[i][j] = 1
            else:
                # Pay attention to the boundary conditions here, that is,
                # whether i or j is equal to 0
                if i == 0 and j == 0:
                    record[i][j] = 0
                elif i == 0:
                    record[i][j] = record[i][j-1]
                elif j == 0:
                    record[i][j] = record[i-1][j]
                else:
                    record[i][j] = max(record[i-1][j], record[i][j-1])
    # Return the table of LCS lengths
    return record
Step 4. Reconstruct the solution to the problem
After writing the code, we notice that one question is still open: the code above only gives us the length of the LCS. How do we reconstruct the LCS itself, that is, output the subsequence rather than just its length?
Let's go back to the formula in step 2. Depending on whether X[i] and Y[j] are equal, record[i, j] is derived from record[i-1, j-1], record[i, j-1] or record[i-1, j]. Can we now work backwards and decide whether X[i] belongs to the LCS by comparing record[i, j] with its neighbors record[i-1, j] and record[i, j-1]? The answer is yes: if record[i, j] is strictly greater than both of them, then X[i] = Y[j] must hold and X[i] is part of the LCS; otherwise the value was inherited from one of those neighbors and we can move to it. The code is below:
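# Print the LCS. Because each call recurses before printing, the characters
# come out in their original left-to-right order.
# Call as Print_LCS(record, X, len(X) - 1, len(Y) - 1)
def Print_LCS(record, X, i, j):
    # Recursive exit: we have walked past the top or left edge of the table
    if i < 0 or j < 0:
        return
    # Cells outside the table count as 0
    up = record[i-1][j] if i > 0 else 0
    left = record[i][j-1] if j > 0 else 0
    # The two cases where record[i][j] is inherited from a neighbor
    if record[i][j] == up:
        Print_LCS(record, X, i-1, j)
    elif record[i][j] == left:
        Print_LCS(record, X, i, j-1)
    else:
        # record[i][j] exceeds both neighbors, which only happens when
        # X[i] == Y[j], so X[i] is in the LCS: output X[i]
        Print_LCS(record, X, i-1, j-1)
        print(X[i], end='')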
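For comparison, here is a self-contained sketch that returns the LCS as a string instead of printing it. It is a variation rather than the code above: it assumes X and Y are strings, pads the table with an extra row and column of zeros so the boundary cases of the formula in step 2 need no special handling, and walks back iteratively. The name `lcs_string` is made up for this example.

```python
def lcs_string(X, Y):
    """Return one longest common subsequence of strings X and Y."""
    m, n = len(X), len(Y)
    # c[i][j] = LCS length of the first i characters of X and the first j of Y.
    # Row 0 and column 0 stay 0, matching case (1) of the formula.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    # Walk back from the bottom-right corner, collecting matched characters.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    # Characters were collected back to front.
    return "".join(reversed(out))


if __name__ == "__main__":
    print(lcs_string("ABCBDAB", "BDCABA"))
```

When several LCSs of the same length exist, which one is returned depends on how ties between the two neighbors are broken; this sketch moves up on ties.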
Problem-solving idea: analyze the problem, divide the original problem into sub-problems, and derive the solution of the original problem from the solutions of the sub-problems; this is what tells us the problem can be solved by dynamic programming. Then follow the usual steps of dynamic programming: find the state transition formula, write the code from the formula, and reconstruct the solution of the original problem.
Algorithm optimization: The bottom-up dynamic programming algorithm takes O(m*n) time and O(m*n) space (O(n^2) for both when the two sequences have the same length n). But the formula shows that solving record[i, j] only uses the two rows record[i-1] and record[i], so if we only need the LCS length we can replace the original m*n list with a 2*n list. Reconstructing the LCS itself as in Step 4 still requires the full table.
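A minimal sketch of the two-row idea follows. The name `lcs_length_two_rows` is an assumption for this example, and each row carries one extra leading zero so the j = 0 boundary needs no special case.

```python
def lcs_length_two_rows(X, Y):
    """LCS length of X and Y, keeping only two rows of the table at a time."""
    n = len(Y)
    # prev plays the role of record[i-1]; index 0 is the empty-prefix column.
    prev = [0] * (n + 1)
    for x in X:
        # curr plays the role of record[i].
        curr = [0] * (n + 1)
        for j in range(1, n + 1):
            if x == Y[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[n]


if __name__ == "__main__":
    print(lcs_length_two_rows("ABCBDAB", "BDCABA"))
```

The running time is unchanged, but the extra memory drops to two rows of length n + 1. Passing the shorter sequence as Y keeps those rows as small as possible.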