LeetCode Problem 10: Regular Expression Matching

I recently solved this problem for a lab report. Since we were required to present a problem in class, I chose this one to explain. This article mainly records my own thinking while solving and organizing the problem, for your reference.

Problem description

Given a string s and a pattern p, implement regular expression matching with support for '.' and '*'.

'.' matches any single character.
'*' matches zero or more of the preceding element.
A match must cover the entire string s, not just part of it.
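
To make these rules concrete before any optimization, here is a minimal recursive sketch of the matching semantics. It is not the dynamic programming approach this article develops, and the class name RecursiveMatcher is made up for illustration. It assumes p is well formed, meaning every '*' has an element in front of it.

```java
class RecursiveMatcher {
    // Returns true if the whole of s matches the whole of p.
    static boolean isMatch(String s, String p) {
        if (p.isEmpty()) {
            // An empty pattern only matches an empty string.
            return s.isEmpty();
        }
        // Does the first pattern character accept the first string character?
        boolean firstMatches = !s.isEmpty()
                && (p.charAt(0) == '.' || p.charAt(0) == s.charAt(0));
        if (p.length() >= 2 && p.charAt(1) == '*') {
            // "x*" matches zero copies of x (skip it),
            // or one copy of x followed by whatever "x*..." still matches.
            return isMatch(s, p.substring(2))
                    || (firstMatches && isMatch(s.substring(1), p));
        }
        // Ordinary character or '.': consume one character from each side.
        return firstMatches && isMatch(s.substring(1), p.substring(1));
    }
}
```

This version can revisit the same pair of suffixes many times, which is the repeated work a table-based solution avoids.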

Examples

Example 1:

Input: s = "aa" p = "a"
Output: false
Explanation: "a" cannot match the entire string of "aa".

Example 2:

Input: s = "aa" p = "a*"
Output: true
Explanation: Because '*' means that zero or more of the preceding element can be matched, and here the preceding element is 'a'. Therefore, the string "aa" can be considered as 'a' repeated once.

Example 3:

Input: s = "ab" p = ".*"
Output: true
Explanation: ".*" means that zero or more ('*') of any character ('.') can be matched.

Example 4:

Input: s = "aab" p = "c*a*b"
Output: true
Explanation: Because '*' means zero or more, here 'c' appears zero times and 'a' is repeated once. So the string "aab" can be matched.

Example 5:

Input: s = "mississippi" p = "mis*is*p*."
Output: false

Constraints:

0 <= s.length <= 20
0 <= p.length <= 30
s may be empty and contains only lowercase letters a-z.
p may be empty and contains only lowercase letters a-z and the characters '.' and '*'.
It is guaranteed that every occurrence of the character '*' is preceded by a valid character to match.

Problem link:

https://leetcode-cn.com/problems/regular-expression-matching

Solution explanation

At first glance, this problem calls for dynamic programming. The specific idea is analyzed below.

First, we create a boolean array dp[m + 1][n + 1], where dp[i][j] indicates whether s[0:i] (the first i characters of s) and p[0:j] (the first j characters of p) match, and m and n are the lengths of s and p respectively. The +1 leaves room for the empty prefixes, which gives the array its initial value: two empty strings always match, so dp[0][0] = true. In the formulas below, s[i] and p[j] denote the i-th and j-th characters counted from 1; in the code they correspond to s.charAt(i-1) and p.charAt(j-1).
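
As a small illustration of why the extra row matters, the sketch below fills only row 0 of the table, the row for the empty string. It relies on the rule from the problem statement that '*' may stand for zero copies of the preceding element, so the empty string matches p[0:j] exactly when p[0:j] is a sequence of "x*" groups. The class and method names are hypothetical, and the loop assumes a well-formed pattern.

```java
class EmptyPrefixRow {
    // Returns row 0 of the table: row[j] is true when the empty string
    // matches the first j characters of p.
    static boolean[] firstRow(String p) {
        int n = p.length();
        boolean[] row = new boolean[n + 1];
        row[0] = true; // empty string vs. empty pattern
        // A valid pattern never starts with '*', so j can start at 2.
        for (int j = 2; j <= n; j++) {
            if (p.charAt(j - 1) == '*') {
                // Drop the "x*" group: same answer as two columns to the left.
                row[j] = row[j - 2];
            }
        }
        return row;
    }
}
```

For example, with p = "a*b*c" the empty string matches the prefixes "", "a*" and "a*b*", but not "a", "a*b" or "a*b*c".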

The problem asks whether the two strings match under these rules, where s consists only of lowercase English letters and p contains lowercase English letters plus two symbols: ' . ' and ' * '.
We can therefore divide the analysis into three cases according to the type of the character p[j]:

  1. p[j] is an ordinary English character
  2. p[j] is ' . '
  3. p[j] is '*'

1. p[j] is an ordinary English character

This is the most general case. Suppose we have reached position i in s and position j in p, as shown in the figure below:
(Figure: Example 1)
In the example above, two conditions must hold for s[0:i] and p[0:j] to match: first, s[0:i-1] and p[0:j-1] must match; second, s[i] and p[j] must match.
From this it can be concluded that:

dp[i][j] = dp[i - 1][j - 1] && s[i] == p[j]

2. p[j] is ' . '

This case is really a special version of the previous one. The reason is that ' . ' can match any single lowercase letter; the two key words are "any" and "one". That is, when p[j] is ' . ', whatever s[i] is, s[i] == p[j] is always considered satisfied under the rules of the problem. An example:
(Figure: Example 2)
According to the rules, the figure above satisfies: ' b ' == ' . '
From this it can be concluded that:

dp[i][j] = dp[i-1][j-1]
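
Cases 1 and 2 can be folded into one check. The sketch below handles only patterns that contain no '*', so every cell depends on its upper-left neighbour alone. The names SingleCharStep, matches and isMatchNoStar are made up for this illustration. Note the index shift: s[i] in the formulas is s.charAt(i - 1) in Java.

```java
class SingleCharStep {
    // True when pattern character pc (a letter or '.') accepts string character sc.
    static boolean matches(char sc, char pc) {
        return pc == '.' || sc == pc;
    }

    // Matching for patterns WITHOUT '*': only cases 1 and 2 apply.
    static boolean isMatchNoStar(String s, String p) {
        int m = s.length();
        int n = p.length();
        boolean[][] dp = new boolean[m + 1][n + 1];
        dp[0][0] = true; // two empty strings match
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                // Both prefixes shrink by one character at the same time.
                dp[i][j] = dp[i - 1][j - 1]
                        && matches(s.charAt(i - 1), p.charAt(j - 1));
            }
        }
        return dp[m][n];
    }
}
```

Without '*', only cells on the diagonal can ever be true, so the two strings must have the same length to match.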

3. p[j] is '*'

This case is the most complicated one. '*' can represent any non-negative number of copies of the previous character, and we don't know in advance how many. That is what makes this case complex.
(Figure: Example 3)
Let's look at the example above first. By inspection, '*' here represents 0 copies of the previous character, so "a*" as a whole is equivalent to the empty string "", as if "a*" had been removed from p.

So when do we consider ' * ' to represent 0 previous characters?

Obviously, if s[i] does not match p[j-1] (that is, p[j-1] is neither ' . ' nor equal to s[i]), then at the current position '*' must represent 0 copies of the previous character. That is:

s[i] != p[j-1] => dp[i][j] = dp[i][j-2]

Then, when s[i] == p[j-1], we must also consider the case where ' * ' represents one or more copies of the previous character. Let's look at an extreme example, where ' * ' represents multiple ' e ' characters:
(Figure: Example 4)
Since we don't know exactly how many copies it represents, we assume here that it represents at least one. What follows from this assumption?

Right: let i move one unit to the left, so that

dp[i][j] = dp[i - 1][j].

After moving to the left, you will encounter this situation again, so you will move to the left again, and finally the following situation will appear:
![Example 5](https://img-blog.csdnimg.cn/2020113022345886.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L0JXUTIwMTk=,size_16,color_FFFFFF,t_70)
Does this situation seem familiar?
Yes, this case belongs to the case where ' * ' represents 0 previous letters, which happens to be the case of s[i] != p[j-1] we mentioned before. at this time:

dp[i][j] = dp[i][j-2]

That is to say, however many copies of the previous letter ' * ' represents, we eventually fall back to the case where it represents 0.
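
The "move i to the left until the zero case is reached" idea can be seen in isolation with a deliberately tiny sketch. It assumes the whole pattern is a single element followed by '*' (for example "e*"), so the column two places to the left is the column for the empty pattern. The class name StarStep and the method starColumn are hypothetical.

```java
class StarStep {
    // col[i] is true when the first i characters of s match the pattern elem + "*".
    static boolean[] starColumn(String s, char elem) {
        int m = s.length();
        boolean[] col = new boolean[m + 1];
        // i = 0: '*' stands for zero copies, and the empty string matches the empty pattern.
        col[0] = true;
        for (int i = 1; i <= m; i++) {
            // Zero copies: compare against the empty pattern, which a non-empty prefix never matches.
            boolean zeroCopies = false;
            // One more copy: s[i] must match elem, then move i one step to the left.
            boolean oneMore = (elem == '.' || elem == s.charAt(i - 1)) && col[i - 1];
            col[i] = zeroCopies || oneMore;
        }
        return col;
    }
}
```

For s = "eeb" and elem = 'e', the prefixes "", "e" and "ee" match "e*", while "eeb" does not, because the chain of left moves is broken at 'b'.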

In summary:

If p[j] is an ordinary character, then:
dp[i][j] = dp[i-1][j-1] && s[i] == p[j]
If p[j] is ' . ', then:
dp[i][j] = dp[i-1][j-1]
If p[j] is '*', then:
if s[i] matches p[j-1] (p[j-1] is ' . ' or equal to s[i]):
dp[i][j] = dp[i][j-2] or dp[i-1][j]
otherwise:
dp[i][j] = dp[i][j-2]
From this we can write the following state transition equation:
$$
dp[i][j] =
\begin{cases}
\begin{cases}
dp[i-1][j-1], & \text{matches}(s[i], p[j]) \\
\text{false}, & \text{otherwise}
\end{cases}
& \text{if } p[j] \neq \text{'*'} \\[2ex]
\begin{cases}
dp[i-1][j] \text{ or } dp[i][j-2], & \text{matches}(s[i], p[j-1]) \\
dp[i][j-2], & \text{otherwise}
\end{cases}
& \text{otherwise}
\end{cases}
$$

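The code is as follows:

class Solution {
    public boolean isMatch(String s, String p) {
        int m = s.length();
        int n = p.length();

        // dp[i][j]: do the first i characters of s match the first j characters of p?
        boolean[][] dp = new boolean[m + 1][n + 1];
        dp[0][0] = true; // two empty strings match
        for (int i = 0; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (p.charAt(j - 1) == '*') {
                    // '*' stands for zero copies of the preceding element ...
                    dp[i][j] = dp[i][j - 2];
                    // ... or for one more copy, if s[i] matches that element
                    if (ismatch(s, p, i, j - 1)) {
                        dp[i][j] = dp[i][j] || dp[i - 1][j];
                    }
                } else {
                    // ordinary character or '.': both prefixes shrink by one
                    if (ismatch(s, p, i, j)) {
                        dp[i][j] = dp[i - 1][j - 1];
                    }
                }
            }
        }
        return dp[m][n];
    }

    // Does the i-th character of s match the j-th character of p (both counted from 1)?
    public boolean ismatch(String s, String p, int i, int j) {
        if (i == 0) {
            // the empty prefix of s has no character to match
            return false;
        }
        if (p.charAt(j - 1) == '.') {
            return true;
        }
        return s.charAt(i - 1) == p.charAt(j - 1);
    }
}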
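
One way to check the recurrence by hand is to print the whole table instead of only the last cell. The sketch below is a variant of the same dynamic programming idea that returns the table; the class name TableDump and the helpers buildTable and accepts are hypothetical names chosen for this illustration, and the code assumes a pattern that satisfies the problem's guarantee about '*'.

```java
class TableDump {
    // True when pattern character pc (a letter or '.') accepts string character sc.
    static boolean accepts(char pc, char sc) {
        return pc == '.' || pc == sc;
    }

    // Fills and returns the whole table; dp[i][j] is true when the first
    // i characters of s match the first j characters of p.
    static boolean[][] buildTable(String s, String p) {
        int m = s.length();
        int n = p.length();
        boolean[][] dp = new boolean[m + 1][n + 1];
        dp[0][0] = true;
        for (int i = 0; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                char pc = p.charAt(j - 1);
                if (pc == '*') {
                    // Assumes '*' always has a preceding element, so j >= 2 here.
                    dp[i][j] = dp[i][j - 2];
                    if (i > 0 && accepts(p.charAt(j - 2), s.charAt(i - 1))) {
                        dp[i][j] = dp[i][j] || dp[i - 1][j];
                    }
                } else if (i > 0 && accepts(pc, s.charAt(i - 1))) {
                    dp[i][j] = dp[i - 1][j - 1];
                }
            }
        }
        return dp;
    }

    static boolean isMatch(String s, String p) {
        return buildTable(s, p)[s.length()][p.length()];
    }

    public static void main(String[] args) {
        // Print one row per prefix of s, one column per prefix of p.
        boolean[][] dp = buildTable("aab", "c*a*b");
        for (boolean[] row : dp) {
            StringBuilder line = new StringBuilder();
            for (boolean cell : row) {
                line.append(cell ? 'T' : 'F').append(' ');
            }
            System.out.println(line.toString().trim());
        }
    }
}
```

Reading the printed rows from top to bottom shows how each true cell is reached either from the upper-left neighbour (ordinary character or '.') or from the cell above or two columns to the left ('*').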

Origin blog.csdn.net/BWQ2019/article/details/110406497