72. Edit Distance
Description:
Given two words word1 and word2, find the minimum number of operations required to convert word1 to word2.
You have the following 3 operations permitted on a word:
- Insert a character
- Delete a character
- Replace a character
Example 1:
Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation:
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')
Example 2:
Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation:
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')
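Approach:
(1) Recursive solution
Start from the last position of each string, where i and j index the last characters of s1 and s2:
- If the two last characters are equal, that pair needs no edit, so the answer equals minDistance(s1[0..i-1], s2[0..j-1]);
- If the two last characters differ, one edit is required, and there are three cases:
a. If s1[0..i-1] can be matched to s2[0..j], delete s1[i] (delete);
b. If s1[0..i] can be matched to s2[0..j-1], append s2[j] to the end of s1 (insert);
c. If s1[0..i-1] can be matched to s2[0..j-1], change s1[i] into s2[j] (replace);
The answer in this case is 1 plus the minimum over the three cases.
Source of the figure above: https://blog.csdn.net/zxzxzx0119/article/details/82054807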
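The case analysis above can be condensed into a single recurrence. The notation d(i, j) is introduced here for illustration only: it stands for the edit distance between s1[0..i] and s2[0..j], with index -1 meaning the empty prefix.

```latex
d(i, j) =
\begin{cases}
j + 1, & i = -1 \\
i + 1, & j = -1 \\
d(i-1,\, j-1), & s_1[i] = s_2[j] \\
1 + \min\bigl(d(i-1,\, j-1),\; d(i-1,\, j),\; d(i,\, j-1)\bigr), & \text{otherwise}
\end{cases}
```

The first two lines are the base cases: when one string is exhausted, the remaining characters of the other must all be inserted or deleted.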
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        l1 = len(word1)
        l2 = len(word2)

        def tryMinDistance(i, j):
            # word1 is exhausted: insert the remaining j + 1 characters of word2
            if i == -1:
                return j + 1
            # word2 is exhausted: delete the remaining i + 1 characters of word1
            elif j == -1:
                return i + 1
            elif (word1[i] == word2[j]):
                return tryMinDistance(i - 1, j - 1)
            else:
                return (1 + min(tryMinDistance(i - 1, j - 1),  # replace word1[i]
                                tryMinDistance(i - 1, j),  # delete word1[i]
                                tryMinDistance(i, j - 1)  # insert word2[j] after word1[i]
                                ))

        return tryMinDistance(l1 - 1, l2 - 1)


so = Solution()
# test 1
word1 = "horse"
word2 = "ros"
# test 2
# word1 = "intention"
# word2 = "execution"
# test 3
# word1 = "dinitrophenylhydrazine"
# word2 = "benzalphenylhydrazone"
# test 4
# word1 = ""
# word2 =""
print(so.minDistance(word1, word2))
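The plain recursion revisits the same (i, j) pairs many times, which is the motivation for caching results. The sketch below is not part of the original solution: `count_calls` and its counters are hypothetical helpers that rerun the same recursion while recording how often each state is visited.

```python
from collections import Counter


def count_calls(word1: str, word2: str):
    """Run the plain recursion and record how often each (i, j) state is visited."""
    visits = Counter()

    def go(i: int, j: int) -> int:
        visits[(i, j)] += 1
        if i == -1:
            return j + 1
        if j == -1:
            return i + 1
        if word1[i] == word2[j]:
            return go(i - 1, j - 1)
        return 1 + min(go(i - 1, j - 1), go(i - 1, j), go(i, j - 1))

    distance = go(len(word1) - 1, len(word2) - 1)
    # total number of calls vs. number of distinct (i, j) states
    return distance, sum(visits.values()), len(visits)


distance, total_calls, distinct_states = count_calls("intention", "execution")
print(distance, total_calls, distinct_states)
```

The total call count exceeds the number of distinct states, and there can be at most (len(word1) + 1) * (len(word2) + 1) distinct states, so storing each result once bounds the work.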
(2) Memoized recursion
The code below was not accepted (AC); it contains a bug.
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        l1 = len(word1)
        l2 = len(word2)
        # NOTE: this initialization is what breaks the code; see the analysis below
        dp = [[-1] * (l2)] * (l1)

        def tryMinDistance(i, j):
            if i == -1:
                return j + 1
            elif j == -1:
                return i + 1
            elif dp[i][j] != -1:
                return dp[i][j]
            elif (word1[i] == word2[j]):
                dp[i][j] = tryMinDistance(i-1, j-1)
            else:
                dp[i][j] = 1 + min(tryMinDistance(i-1, j-1),  # replace word1[i]
                                   tryMinDistance(i-1, j),  # delete word1[i]
                                   tryMinDistance(i, j-1)  # insert word2[j] after word1[i]
                                   )
            print(dp)
            return dp[i][j]

        return tryMinDistance(l1-1, l2-1)


so = Solution()
# test 1
word1 = "horse"
word2 = "ros"
# test 2
# word1 = "intention"
# word2 = "execution"
# test 3
# word1 = "dinitrophenylhydrazine"
# word2 = "benzalphenylhydrazone"
# test 4
# word1 = ""
# word2 =""
print(so.minDistance(word1, word2))
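The code above always fails on the first test case, returning 4. After a long round of analysis and testing, the problem turned out to be in the following code:
dp = [[None] * (len(word2)) for _ in range(len(word1))]
print(dp)
dp1 = [[None] * (len(word2))] * (len(word1))
print(dp1)
Both ways of initializing the list print identically, yet they lead to different results. At the time of writing I had not yet worked out how the two differ.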
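Addendum (2019-06-03):
The two initializations behave differently when the list is updated. Two examples make this concrete:
Example 1:
dp = [[1] * 2] * 4
print(dp)
dp[0][1]=5
print(dp)
Result:
[[1, 1], [1, 1], [1, 1], [1, 1]]
[[1, 5], [1, 5], [1, 5], [1, 5]]
Example 2:
dp = [[1] * 2 for _ in range(4)]
print(dp)
dp[0][1]=5
print(dp)
Result:
[[1, 1], [1, 1], [1, 1], [1, 1]]
[[1, 5], [1, 1], [1, 1], [1, 1]]
In Example 1 a single assignment changes every row; in Example 2 it changes only row 0.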
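A quick identity check explains the difference. Multiplying an outer list, as in `[row] * n`, repeats a reference to one inner list, while a list comprehension builds a new inner list on every iteration. The variable names `shared` and `independent` below are chosen only for this illustration.

```python
shared = [[1] * 2] * 4                      # four references to ONE inner list
independent = [[1] * 2 for _ in range(4)]   # four separate inner lists

print(shared[0] is shared[1])            # True: same object
print(independent[0] is independent[1])  # False: different objects

# Count distinct row objects in each table
print(len({id(row) for row in shared}))       # 1
print(len({id(row) for row in independent}))  # 4
```

In the memoized solution this means a write to dp[i][j] silently overwrites column j of every row, so later lookups return values cached for a different i.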
Accepted (AC) code:
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        l1 = len(word1)
        l2 = len(word2)
        # one independent row per index of word1; -1 marks "not computed yet"
        dp = [[-1] * (len(word2)) for _ in range(len(word1))]

        def tryMinDistance(i, j):
            if i == -1:
                return j + 1
            elif j == -1:
                return i + 1
            elif dp[i][j] != -1:
                return dp[i][j]
            elif (word1[i] == word2[j]):
                dp[i][j] = tryMinDistance(i - 1, j - 1)
            else:
                dp[i][j] = 1 + min(tryMinDistance(i - 1, j - 1),  # replace word1[i]
                                   tryMinDistance(i - 1, j),  # delete word1[i]
                                   tryMinDistance(i, j - 1)  # insert word2[j] after word1[i]
                                   )
            return dp[i][j]

        return tryMinDistance(l1 - 1, l2 - 1)


so = Solution()
# test 1
# word1 = "horse"
# word2 = "ros"
# test 2
# word1 = "intention"
# word2 = "execution"
# test 3
# word1 = "dinitrophenylhydrazine"
# word2 = "benzalphenylhydrazone"
# test 4
# word1 = ""
# word2 =""
# test 5
word1 = "sea"
word2 = "eat"
print(so.minDistance(word1, word2))
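As an alternative to managing the dp table by hand, the standard library's `functools.lru_cache` can cache the recursive calls. This is a sketch of a different way to memoize, not the code that was submitted; the function name `min_distance_cached` is made up for this example.

```python
from functools import lru_cache


def min_distance_cached(word1: str, word2: str) -> int:
    @lru_cache(maxsize=None)  # cache every (i, j) result, keyed by the arguments
    def go(i: int, j: int) -> int:
        if i == -1:
            return j + 1
        if j == -1:
            return i + 1
        if word1[i] == word2[j]:
            return go(i - 1, j - 1)
        return 1 + min(go(i - 1, j - 1),  # replace word1[i]
                       go(i - 1, j),      # delete word1[i]
                       go(i, j - 1))      # insert word2[j] after word1[i]

    return go(len(word1) - 1, len(word2) - 1)


print(min_distance_cached("horse", "ros"))  # 3
print(min_distance_cached("sea", "eat"))    # 2
```

Because no shared table is created, the row-aliasing mistake cannot occur here. The recursion depth still grows with len(word1) + len(word2), so very long inputs may hit Python's recursion limit.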
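(3) Dynamic programming solution
Source of the figure above: https://blog.csdn.net/zxzxzx0119/article/details/82054807
Two-dimensional dynamic programming fills a 2-D array from left to right and top to bottom, deriving each cell from earlier ones until the final answer is reached. Points to note:
- The dp array has size dp[chs1.length + 1][chs2.length + 1]: the empty string "" is also a valid prefix, so one extra row and one extra column are needed;
- Initialize the first row and the first column before anything else. The first row covers chs1 being the empty string "": turning it into a prefix of chs2 takes as many insertions as that prefix is long. The first column covers chs2 being the empty string "": turning a prefix of chs1 into it takes as many deletions as that prefix is long.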
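The sizing and initialization described above can be sketched on their own. The helper `init_table` is hypothetical and only sets up the base cases; it does not fill the interior of the table.

```python
from typing import List


def init_table(chs1: str, chs2: str) -> List[List[int]]:
    rows, cols = len(chs1) + 1, len(chs2) + 1  # +1 for the empty prefix
    dp = [[0] * cols for _ in range(rows)]     # independent rows
    for i in range(rows):
        dp[i][0] = i  # delete i characters of chs1 to reach ""
    for j in range(cols):
        dp[0][j] = j  # insert j characters to build chs2[:j] from ""
    return dp


for row in init_table("horse", "ros"):
    print(row)
```

For "horse" and "ros" the table has 6 rows and 4 columns; the first row prints as [0, 1, 2, 3] and the first column runs from 0 to 5.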
Accepted (AC) code:
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        l1 = len(word1)
        l2 = len(word2)
        # dp[i][j]: edit distance between word1[:i] and word2[:j]
        dp = [[0] * (l2+1) for _ in range(l1+1)]
        # print(dp)
        for i in range(l1+1):
            dp[i][0] = i
        for j in range(l2+1):
            dp[0][j] = j
        for i in range(1, l1+1):
            for j in range(1, l2+1):
                # c is the replace cost: 0 when the characters already match
                if word1[i - 1] == word2[j - 1]:
                    c = 0
                else:
                    c = 1
                dp[i][j] = min(dp[i-1][j-1]+c, min(dp[i][j-1], dp[i-1][j])+1)
        return dp[l1][l2]
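Each cell depends only on the current row and the row directly above it, so the table can be reduced to two rows. The following is a minimal sketch of that rolling-array idea, written for this note rather than taken from the accepted submission; `min_distance_two_rows` is an assumed name.

```python
def min_distance_two_rows(word1: str, word2: str) -> int:
    l1, l2 = len(word1), len(word2)
    prev = list(range(l2 + 1))  # row 0: "" -> word2[:j] costs j insertions
    for i in range(1, l1 + 1):
        curr = [i] + [0] * l2   # column 0: word1[:i] -> "" costs i deletions
        for j in range(1, l2 + 1):
            cost = 0 if word1[i - 1] == word2[j - 1] else 1
            curr[j] = min(prev[j - 1] + cost,  # replace (or keep when cost is 0)
                          curr[j - 1] + 1,     # insert
                          prev[j] + 1)         # delete
        prev = curr
    return prev[l2]


print(min_distance_two_rows("horse", "ros"))            # 3
print(min_distance_two_rows("intention", "execution"))  # 5
```

This keeps the O(l1 * l2) running time while cutting extra memory from O(l1 * l2) to O(l2).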
Reference:
【1】https://leetcode.com/problems/edit-distance/
【2】花花酱 LeetCode 72. Edit Distance - 刷题找工作 EP87
【3】LeetCode - 72. Edit Distance(编辑距离问题)(三个依赖的滚动优化)
【4】https://blog.csdn.net/zxzxzx0119/article/details/82054807