Question

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Note:

s could be empty and contains only lowercase letters a-z.

p could be empty and contains only lowercase letters a-z, and characters like . or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

Answer

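The task is easy to understand: decide whether the string is valid for the pattern, that is, whether the string s matches the pattern p.

Each character of the string is chosen from a-z, and the string may be empty.

The pattern is a string made up of any number of letters a-z, '.', and the two-character groups a*, b*, ..., z* and .*.

A character followed by a star stands for any number of that character, including zero: a* means any number of a's, and .* means any number of arbitrary characters.

For example: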

Input:
s = "aa" p = "a"

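We match s against p one character at a time. The first character of s matches the first character of p, but p has no second character while s still has an a left, so s does not match p and the output is false.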

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

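As in the first example, the first two characters of p match s. The third character of p together with * stands for any number of s, which here matches up to the fourth character of s. The fifth through seventh characters of s (iss) are then matched by the is* that follows in p. At that point p has only p*. left: p* has to match zero characters because the eighth character of s is i, the '.' consumes that i, and p is then exhausted while s still has ppi remaining. So the output is false here as well.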

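With these examples in mind, we can start thinking about how to solve the problem.

First, if we ignore the (character + *) groups in p, the problem is very simple: we only need to match the characters of p against the characters of s one by one.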
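To make the simple case concrete, here is a minimal sketch of matching when p contains no '*' at all. The function name matchWithoutStar is my own and does not appear in the solution below; it only assumes that p consists of letters and '.'.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the original solution): matches s against
// a pattern that contains only letters and '.', i.e. no '*'.
bool matchWithoutStar(const std::string& s, const std::string& p) {
    // Without '*', every pattern character consumes exactly one character
    // of s, so the two lengths must be equal.
    if (s.size() != p.size()) return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // '.' accepts any character; a letter must be identical.
        if (p[i] != '.' && p[i] != s[i]) return false;
    }
    return true;
}
```

For instance, matchWithoutStar("ab", "a.") holds, while matchWithoutStar("aa", "a") does not because the lengths differ.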

However, once (character + *) groups are present, the problem becomes more complicated. For example:

Input:
s = "aab"

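You will find that

p = "a*ab" or p = "a*b" or p = "a*.b"

all match s. In other words, you cannot tell in advance how many a's a* matches, or where a* stops: in p = "a*ab" the a* corresponds to one a of s, in p = "a*b" to two, and in p = "a*.b" to one. At first I thought I could decide how many a's a* matches by looking at the character that follows a*, but my reasoning got tangled and I gave up on that approach. The main reason is that a* could match any number of a's; for example with p = "a*b", if a* matches only one a, the rest of the pattern can no longer match.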
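As a quick sanity check of this observation, the three patterns can be run through the C++ standard library's regex engine. This is only a sketch for experimenting: std::regex is not allowed as a solution to the problem, and the helper name fullMatch is my own. I am assuming the default ECMAScript grammar, which treats '.' and '*' the same way for lowercase input.

```cpp
#include <regex>
#include <string>

// Hypothetical helper for experimenting only: whole-string match using
// std::regex_match with the default ECMAScript grammar.
bool fullMatch(const std::string& s, const std::string& p) {
    return std::regex_match(s, std::regex(p));
}
```

Calling fullMatch("aab", p) with each of "a*ab", "a*b" and "a*.b" should report a match, even though a* covers a different number of a's in each case.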

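So we may as well consider every possibility for a*. While matching character by character, whenever we meet a two-character group such as a*, we consider the following two cases:

① a* matches zero a's.

② a* matches one a (only possible when the current character of s is an a), and p then still uses a* against the next character of s (that is, the next step considers cases ① and ② again).

This covers every possibility for a*.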
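Applying case ② some number of times and then case ① means that a* may consume any count from zero up to the length of the leading run of a's. The sketch below lists those counts; possibleStarCounts is a hypothetical helper of mine for illustration, not part of the solution.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: for a group "x*" at the front of the pattern, list
// every number of leading characters of s that the group could consume.
std::vector<std::size_t> possibleStarCounts(const std::string& s, char x) {
    std::vector<std::size_t> counts{0};  // case 1: zero copies is always possible
    for (std::size_t k = 0; k < s.size(); ++k) {
        // Case 2 applies only while the next character still matches x.
        if (x != '.' && s[k] != x) break;
        counts.push_back(k + 1);
    }
    return counts;
}
```

For s = "aab" and x = 'a' this gives the counts 0, 1 and 2, which are exactly the choices the recursion explores.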

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".

Then we write the code according to the algorithm above:

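    #include <string>
    using std::string;

    bool isMatch(string s, string p) {
        int s_len = s.length();
        int p_len = p.length();
        // An empty pattern matches only an empty string.
        if (p_len == 0)
            return s_len == 0;
        // Does the first character of s match the first character of p?
        bool match = (s_len != 0) && (s[0] == p[0] || p[0] == '.');
        if (p_len >= 2 && p[1] == '*')
        {
            // Case 1: x* matches zero characters. Case 2: x* matches one
            // character and stays in p for the next step.
            return isMatch(s, p.substr(2)) || (match && isMatch(s.substr(1), p));
        }
        // No '*' follows: match one character and advance both strings.
        // (A plain return here means every path returns a value.)
        return match && isMatch(s.substr(1), p.substr(1));
    }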
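To check the recursion against the five examples from the question, a small driver like the following can be used. The recursion is repeated here (taking const references) only so that the block compiles on its own; countPassingExamples is a hypothetical name of mine.

```cpp
#include <string>

// Same recursion as above, repeated so this block is self-contained.
bool isMatch(const std::string& s, const std::string& p) {
    if (p.empty()) return s.empty();
    bool first = !s.empty() && (p[0] == s[0] || p[0] == '.');
    if (p.size() >= 2 && p[1] == '*') {
        return isMatch(s, p.substr(2)) || (first && isMatch(s.substr(1), p));
    }
    return first && isMatch(s.substr(1), p.substr(1));
}

// Hypothetical driver: counts how many of the five examples from the
// question give the expected result.
int countPassingExamples() {
    struct Example { const char* s; const char* p; bool expected; };
    const Example examples[] = {
        {"aa", "a", false},
        {"aa", "a*", true},
        {"ab", ".*", true},
        {"aab", "c*a*b", true},
        {"mississippi", "mis*is*p*.", false},
    };
    int passed = 0;
    for (const Example& e : examples) {
        if (isMatch(e.s, e.p) == e.expected) ++passed;
    }
    return passed;
}
```

If the recursion is right, countPassingExamples() should report that all five examples pass.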

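The idea behind the code is straightforward: match records whether the current first characters of s and p match, and if (p_len >= 2 && p[1] == '*') checks whether a group like a* is at the front of p. If it is, we handle the two cases described above; if not, we match one character at a time.

Finally, pay attention to the boundaries, especially the combinations in which s or p is empty.

If p is empty and s is non-empty, we already know there is no match; if both p and s are empty, the match succeeds.

If both p and s are non-empty, matching can continue, because every recursive call shortens at least one of the strings.

If s is empty and p is non-empty, p may contain groups like a*, it may still have plain a-z or '.' characters left, or it may be a mix of the two.

If a group like a* is at the front of p, we drop it from p, since it amounts to zero a's.

If a plain character or '.' is at the front instead, the match fails.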
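The boundary cases above can be exercised with a variant of the same recursion that walks indices instead of calling substr. This is only a sketch under my own naming (matchFrom and isMatchByIndex are hypothetical); it is not the code from this post, just the same two cases expressed without copying substrings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical index-based variant: i and j are the current positions in
// s and p, so "s is empty" becomes i == s.size().
bool matchFrom(const std::string& s, std::size_t i,
               const std::string& p, std::size_t j) {
    // p is used up: the match succeeds only if s is used up too.
    if (j == p.size()) return i == s.size();
    bool first = i < s.size() && (p[j] == s[i] || p[j] == '.');
    if (j + 1 < p.size() && p[j + 1] == '*') {
        // When s is already used up, first is false, so only the
        // "zero copies" branch (dropping x* from p) can succeed.
        return matchFrom(s, i, p, j + 2) || (first && matchFrom(s, i + 1, p, j));
    }
    // A plain character or '.' needs a character of s to consume.
    return first && matchFrom(s, i + 1, p, j + 1);
}

bool isMatchByIndex(const std::string& s, const std::string& p) {
    return matchFrom(s, 0, p, 0);
}
```

With this, isMatchByIndex("", "a*") succeeds because the group is dropped, while isMatchByIndex("", "a") and isMatchByIndex("", "a*b") fail because a plain character is left with nothing to match.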
