LeetCode 10, Regular Expression Matching: the "one model, three features" of dynamic programming

Introduction

LeetCode has the following problem:

10. Regular Expression Matching
Given a string s and a pattern p, implement regular expression matching with support for '.' and '*'.
'.' matches any single character.
'*' matches zero or more of the preceding element.
The match must cover the entire string s, not just part of it.
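To make the semantics concrete, here is a small sketch that uses Java's built-in regex engine as a reference. It assumes inputs made only of lowercase letters, '.', and '*' in well-formed patterns, where the built-in behaviour agrees with the rules above; the class name RegexSemanticsDemo and the helper fullMatch are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: java.util.regex serves only as a reference here.
class RegexSemanticsDemo {
    // Pattern.matches succeeds only if the whole input matches the pattern,
    // which mirrors the "cover the entire string s" requirement.
    static boolean fullMatch(String s, String p) {
        return Pattern.matches(p, s);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("aa", "a"));      // false: "a" covers only part of s
        System.out.println(fullMatch("aa", "a*"));     // true:  '*' repeats the preceding 'a'
        System.out.println(fullMatch("ab", ".*"));     // true:  '.' repeated twice
        System.out.println(fullMatch("aab", "c*a*b")); // true:  'c' matched zero times
    }
}
```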

When analyzing this problem, the natural first idea is to match one character at a time, comparing the start of p against the start of s. A letter or a '.' simply advances both strings by one character. A '*' is where things get complicated, because the outcome depends on how the rest of the string matches.

So the first method that comes to mind is backtracking. However, backtracking needs many conditions checked and is hard to get right; for that solution, refer to the official explanation of the problem.
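For orientation, a minimal sketch of the backtracking idea might look like the following. It is not the official solution's code; it assumes a well-formed pattern (every '*' has a preceding element), and the class name RegexBacktrack is hypothetical.

```java
// Hypothetical class: a sketch of the backtracking idea under the assumptions above.
class RegexBacktrack {
    static boolean isMatch(String s, String p) {
        if (p.isEmpty()) {
            return s.isEmpty(); // an empty pattern matches only the empty string
        }
        // Does the first character of s match the first element of p?
        boolean firstMatches = !s.isEmpty()
                && (p.charAt(0) == '.' || p.charAt(0) == s.charAt(0));
        if (p.length() >= 2 && p.charAt(1) == '*') {
            // Either drop "x*" (zero repetitions), or consume one character of s
            // and try the same pattern again (one more repetition).
            return isMatch(s, p.substring(2))
                    || (firstMatches && isMatch(s.substring(1), p));
        }
        // Plain character or '.': both strings advance by one.
        return firstMatches && isMatch(s.substring(1), p.substring(1));
    }

    public static void main(String[] args) {
        System.out.println(isMatch("aab", "c*a*b"));              // true
        System.out.println(isMatch("mississippi", "mis*is*p*.")); // false
    }
}
```

The two branches of the '*' case are where the same (suffix of s, suffix of p) pair can be reached along different paths, which is what makes the dynamic programming approach attractive.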

Here we introduce dynamic programming instead. Besides the common uses, such as finding the minimum distance on a "graph" or the minimum number of steps over an array ("steps"), dynamic programming can work surprisingly well on certain problems.
For example, problem 1320 from Weekly Contest 171, Minimum Distance to Type a Word Using Two Fingers, is solved with dynamic programming.
This problem is another example, and the dynamic programming solution is often the easier one to understand.

The "one model, three features" theory

Dynamic programming is a sophisticated technique, and many people have written comprehensive summaries of it. I sum up the theory as "one model, three features".

First, the "one model" is the kind of problem that dynamic programming is suited to. I define this model as the "multi-stage decision optimal solution model".

Specifically, we generally use dynamic programming to solve optimization problems. Solving the problem takes several stages of decisions, and each decision stage corresponds to a set of states. We then look for a sequence of decisions that produces the optimal value we ultimately want.

The "three features" are optimal substructure, no after-effect, and repeated sub-problems. These concepts are rather abstract, so let's go through them one by one.

  1. Optimal substructure
    Optimal substructure means that the optimal solution of a problem contains the optimal solutions of its sub-problems. Put the other way around, the optimal solution of the problem can be derived from the optimal solutions of its sub-problems. Mapped onto the dynamic programming model defined earlier, this means that the states of a later stage can be derived from the states of earlier stages.
  2. No after-effect
    No after-effect has two meanings. First, when deriving the state of a later stage, we only care about the values of the states in earlier stages, not about how those states were derived step by step. Second, once the state of a stage is determined, it is not affected by decisions made in later stages. No after-effect is a very "loose" requirement: as long as a problem fits the dynamic programming model described earlier, it basically satisfies it.
  3. Repeated sub-problems
    This concept has come up several times already. In one sentence: different decision sequences may produce the same state when they reach the same stage.

Returning to our grid problem: suppose we have an n-by-n matrix w[n][n] that stores positive integers. A piece starts in the top-left corner and must end in the bottom-right corner. We move the piece from the top-left corner to the bottom-right corner, and each move goes one cell to the right or one cell down. There are many different paths to choose from. We take the sum of the numbers along a path as the length of that path. What is the length of the shortest path from the top-left corner to the bottom-right corner?
Does this problem fit the "one model"?

  • Going from (0, 0) to (n-1, n-1) takes 2*(n-1) steps in total, which correspond to 2*(n-1) stages. Each stage has two possible decisions, move right or move down, and each stage corresponds to a set of states.

So this is a multi-stage decision optimal solution problem, which fits the dynamic programming model.

Does the problem have the "three features"?

  • We can solve this problem with a backtracking algorithm. If you write the code and draw its recursion tree, you will find duplicate nodes in the tree. A duplicate node means that several different routes lead from the top-left corner to the position that node represents, which shows that the problem has repeated sub-problems.

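  • To reach position (i, j), we can only come from (i-1, j) or (i, j-1). In other words, to compute the state for (i, j) we only need the states of (i-1, j) and (i, j-1), and we do not care which route the piece took to reach those two positions. Moreover, we only allow moves down and to the right, never backwards, so once the state of an earlier stage is determined it is not changed by decisions in later stages. So this problem satisfies "no after-effect".

  • When defining the state, we write the minimum path length from the start (0, 0) to (i, j) as min_dist(i, j). Because we can only move right or down, we can only reach (i, j) from (i-1, j) or (i, j-1). That is, the shortest path to (i, j) passes through either (i-1, j) or (i, j-1), and it must contain the shortest path to one of those two positions. In other words, min_dist(i, j) can be derived from the two states min_dist(i, j-1) and min_dist(i-1, j). This shows that the problem has "optimal substructure".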
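The repeated sub-problems claim in the list above can be checked with a small experiment. The sketch below walks the grid with naive recursion and counts how often each cell is reached; the class GridRevisitCount and its methods are hypothetical names, and the grid values are irrelevant for the count, so only n is passed in.

```java
// Hypothetical class: counts how often naive recursion reaches each grid cell.
class GridRevisitCount {
    // visits[i][j] is the number of times the recursion arrives at cell (i, j).
    static long[][] countVisits(int n) {
        long[][] visits = new long[n][n];
        walk(0, 0, n, visits);
        return visits;
    }

    // Explore every path that only moves right or down, without memoization.
    static void walk(int i, int j, int n, long[][] visits) {
        if (i >= n || j >= n) {
            return; // stepped off the grid
        }
        visits[i][j]++;
        walk(i, j + 1, n, visits); // move right
        walk(i + 1, j, n, visits); // move down
    }

    public static void main(String[] args) {
        long[][] visits = countVisits(3);
        System.out.println(visits[1][1]); // 2: reached via right-then-down and down-then-right
        System.out.println(visits[2][2]); // 6: one visit per distinct path to the corner
    }
}
```

Every count above 1 is a state that the naive recursion recomputes, and that repeated work is what a dp table avoids.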

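Dynamic programming problems are generally solved through a state transition equation.
For example: min_dist(i, j) = w[i][j] + min(min_dist(i, j-1), min_dist(i-1, j))
So the most important step in solving a dynamic programming problem is finding its state transition equation.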
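As an illustration, the min_dist equation can be turned into a table-filling loop roughly as follows. This is a sketch that assumes a non-empty square matrix; the class name MinPathSum and the sample matrix are made up for the example.

```java
// Hypothetical class: bottom-up evaluation of the min_dist transition equation.
class MinPathSum {
    // Assumes w is a non-empty n-by-n matrix.
    static int minDist(int[][] w) {
        int n = w.length;
        int[][] dist = new int[n][n]; // dist[i][j] plays the role of min_dist(i, j)
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i == 0 && j == 0) {
                    dist[i][j] = w[0][0];                    // starting cell
                } else if (i == 0) {
                    dist[i][j] = w[i][j] + dist[i][j - 1];   // first row: only from the left
                } else if (j == 0) {
                    dist[i][j] = w[i][j] + dist[i - 1][j];   // first column: only from above
                } else {
                    dist[i][j] = w[i][j] + Math.min(dist[i][j - 1], dist[i - 1][j]);
                }
            }
        }
        return dist[n - 1][n - 1];
    }

    public static void main(String[] args) {
        int[][] w = {
            {1, 3, 1},
            {1, 5, 1},
            {4, 2, 1}
        };
        System.out.println(minDist(w)); // 7, along the path 1 -> 3 -> 1 -> 1 -> 1
    }
}
```

The first row and first column need separate handling because one of the two predecessor cells does not exist there.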

LeetCode solution

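Because the problem has optimal substructure, a natural idea is to save intermediate results. Let dp(i, j) denote whether the first i characters of s can be matched by the first j characters of p, so that the original problem is expressed through matching problems on shorter strings. With m and n the lengths of s and p, this needs an (m+1) by (n+1) dp array (the extra row and column stand for the empty prefixes) to store the result for each pair of prefixes. Since each entry only records whether there is a match, a boolean is enough.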

Transition equation

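How do we find the transition equation? Start from the assumption that dp[i-1][j-1] has already been computed. Given s[i] and p[j] as well, the question is how to compute dp[i][j].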

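Knowing dp[i-1][j-1] means we know whether the preceding substrings match; what we do not yet know is the situation at the new position.
So we split into cases according to the values of the new characters p[j] and s[i]: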

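  1. Matching identical characters, i.e. p[j] == s[i]: then dp[i][j] = dp[i-1][j-1] follows directly.
  2. Matching a character against '.', i.e. p[j] == '.': again dp[i][j] = dp[i-1][j-1].
  3. The hardest case, p[j] == '*', which we discuss separately.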

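Given a *, recall that it matches zero or more of the preceding element, so we have to look at the element before it, p[j-1]. The * follows its preceding character: it is only useful if that character can match s[i]. If the preceding character cannot match s[i], the * cannot help either, and the only option is to make the preceding character disappear, that is, to match it 0 times.
So we split into two cases according to whether p[j-1] and s[i] are equal: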

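  1. If p[j-1] != s[i], then dp[i][j] = dp[i][j-2].
    Take (ab, abc*) as an example. On meeting the *, look two positions back: the prefix of s ending at s[i], ab, matches the prefix of p ending at p[j-2], ab. The trailing c* can be treated as matching c 0 times, which is the same as removing c* outright, so the result is True.
    Note, however, that (ab, abc**) is False. This case needs a further check.
  2. If p[j-1] == s[i] or p[j-1] == '.', the character before the * can match s[i], or it is the wildcard '.', so the * can certainly extend the match. The transition for this case is split into three sub-cases.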

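The transition equations for case 2 above:

  • dp[i][j] = dp[i-1][j]: the case where * matches several characters. If * has matched several copies of p[j-1], we consume nothing from p and simply move one character along in s. That is, the first i characters of s and the first i-1 characters of s have the same match result against the first j characters of p.
  • dp[i][j] = dp[i][j-1]: the case where * matches a single character. For example, with (ac, a*c), this is like removing the * from p and matching ac against ac, so only the * character is consumed.
  • dp[i][j] = dp[i][j-2]: the case where * matches nothing. As in case 1, the element is matched 0 times, which is like removing c* outright.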
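The transitions above can also be read top-down. The following sketch is a memoized recursion rather than the table-filling style used in this article; RegexMemo and dp are hypothetical names. Here i and j are prefix lengths, so the last character of the s prefix is s.charAt(i - 1). The sketch assumes a well-formed pattern and treats a '*' with no preceding element as never matching.

```java
// Hypothetical class: a top-down (memoized) reading of the transitions.
class RegexMemo {
    static boolean isMatch(String s, String p) {
        // memo[i][j] caches dp(i, j); null means "not computed yet".
        Boolean[][] memo = new Boolean[s.length() + 1][p.length() + 1];
        return dp(s.length(), p.length(), s, p, memo);
    }

    // dp(i, j): are the first i characters of s matched by the first j characters of p?
    static boolean dp(int i, int j, String s, String p, Boolean[][] memo) {
        if (j == 0) {
            return i == 0; // an empty pattern matches only the empty string
        }
        if (memo[i][j] != null) {
            return memo[i][j];
        }
        boolean result;
        char pc = p.charAt(j - 1);
        if (pc == '*') {
            if (j < 2) {
                result = false; // assumption: a leading '*' is ill-formed and never matches
            } else {
                char prev = p.charAt(j - 2);
                boolean prevMatches = i > 0 && (prev == '.' || prev == s.charAt(i - 1));
                // Zero repetitions, or one more repetition consuming a character of s.
                result = dp(i, j - 2, s, p, memo)
                        || (prevMatches && dp(i - 1, j, s, p, memo));
            }
        } else {
            // Cases 1 and 2: identical characters, or '.' against any character.
            boolean lastMatches = i > 0 && (pc == '.' || pc == s.charAt(i - 1));
            result = lastMatches && dp(i - 1, j - 1, s, p, memo);
        }
        memo[i][j] = result;
        return result;
    }
}
```

The sketch leaves out the explicit single-match term dp[i][j-1]. For well-formed patterns that term is already covered: matching exactly once is the same as taking the dp[i-1][j] branch and then the zero-repetition branch from there.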

Finally, the code is as follows:

public boolean isMatch(String s, String p) {
    if (s == null || p == null) {
        return false;
    }
    // dp[i][j]: whether the first i characters of s are matched by the first j characters of p
    boolean[][] dp = new boolean[s.length() + 1][p.length() + 1];
    dp[0][0] = true;
    // First row (empty s): loop over p's length, not s's.
    // Starting at 1 keeps dp[0][i - 1] in bounds even if p begins with '*'.
    for (int i = 1; i < p.length(); i++) {
        if (p.charAt(i) == '*' && dp[0][i - 1]) {
            dp[0][i + 1] = true; // the column index is i + 1
        }
    }
    for (int i = 0; i < s.length(); i++) {
        for (int j = 0; j < p.length(); j++) {
            // '.' matches any character; otherwise the characters must be equal
            if (p.charAt(j) == '.' || p.charAt(j) == s.charAt(i)) {
                dp[i + 1][j + 1] = dp[i][j];
            }
            // j > 0 guards p.charAt(j - 1): a leading '*' has no preceding element
            if (p.charAt(j) == '*' && j > 0) {
                if (p.charAt(j - 1) != s.charAt(i) && p.charAt(j - 1) != '.') {
                    // the preceding element cannot match s[i]: match it zero times
                    dp[i + 1][j + 1] = dp[i + 1][j - 1];
                } else {
                    /*
                    In the notation of the text:
                    dp[i][j] = dp[i][j-1]    // '*' matches a single character
                    or dp[i][j] = dp[i-1][j] // '*' matches several characters
                    or dp[i][j] = dp[i][j-2] // '*' matches nothing
                     */
                    dp[i + 1][j + 1] = (dp[i + 1][j] || dp[i][j + 1] || dp[i + 1][j - 1]);
                }
            }
        }
    }
    return dp[s.length()][p.length()];
}

Origin blog.csdn.net/No_Game_No_Life_/article/details/103969064