Revisiting Data Structures and Algorithms: Manacher's Algorithm

Preface

A palindromic substring, as the name suggests, is a substring that reads the same forward and backward. Palindromic substrings come up often in algorithm design and analysis, for example in the longest palindromic substring problems POJ 3974 and LeetCode 5.

The center expansion algorithm is a simple and intuitive method that finds palindromic substrings by expanding outward to both sides from each character. However, its time complexity is O(n^2), and dynamic programming is no better: it also takes O(n^2) time and needs O(n^2) extra space.

Manacher's algorithm solves the longest palindromic substring problem efficiently. It reaches linear time complexity by exploiting the symmetry of palindromes to avoid repeated comparisons.

1. Classic Algorithm

1.1 Center expansion method

The center expansion method treats each position as a center and expands to both sides to find palindromic substrings. Because a palindrome may have odd or even length, there are two kinds of center:

  • An odd-length palindrome is centered on a character
  • An even-length palindrome is centered on the gap to the right of a character

You can handle these two cases separately, or you can insert a special symbol such as # before and after every character so that every palindrome in the padded string has odd length.

The code below uses # padding. The # is also added before the first character and after the last one, so that the selected palindrome is never of the form b#b. Otherwise b#b and #b# would both have length 3 in the padded string, but after removing the # the former is bb and the latter is b, so comparing padded lengths would pick the wrong candidate. With # at both ends, every maximal palindrome in the padded string starts and ends with #, and this problem does not occur.
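To make the padding concrete, here is a minimal sketch of the transformation on its own. The class name PaddingDemo and the helper pad are illustrative names, not part of the original code, and the sketch builds the string with a StringBuilder instead of String.join.

```java
// Illustrative sketch: PaddingDemo and pad are hypothetical names.
final class PaddingDemo {
    // Surround every character with '#', e.g. "abba" becomes "#a#b#b#a#".
    static String pad(String s) {
        StringBuilder sb = new StringBuilder("#");
        for (char c : s.toCharArray()) {
            sb.append(c).append('#');
        }
        return sb.toString();
    }

    // Character k of the original string sits at padded index 2k + 1.
    static int paddedIndex(int k) {
        return 2 * k + 1;
    }
}
```

For a string of length n the padded string has length 2n + 1, which is always odd. For example, pad("abba") is "#a#b#b#a#", and the second character of "abba" (k = 1) is found at padded index 3.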

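public String longestPalindrome(String s) {
    // [start, end) is the best palindrome found so far, as indices into the padded string
    int start = 0, end = 0;
    // Pad with '#' so that every palindrome has odd length, e.g. "abba" -> "#a#b#b#a#"
    String joinedStr = "#" + String.join("#", s.split("")) + "#";
    for (int i = 0; i < joinedStr.length(); i++) {
        int len = expandAroundCenter(joinedStr, i);
        if (len > end - start) {
            start = i - len / 2;
            end = i + len / 2 + 1;
        }
    }
    // Padded index 2k + 1 holds s[k], so halving maps the range back to s
    return s.substring(start / 2, end / 2);
}

// Returns the length of the longest palindrome in s centered on index center
public int expandAroundCenter(String s, int center) {
    int left = center, right = center, ans = 0;
    while (left >= 0 && right < s.length() && s.charAt(left) == s.charAt(right)) {
        ans = right - left + 1;
        left--;
        right++;
    }
    return ans;
}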
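For comparison, the other option mentioned above, handling odd and even centers separately without any padding, might look like the following. This is only a sketch: the class name TwoCenterExpansion and the helper expand are illustrative names, not from the original code.

```java
// Illustrative sketch: TwoCenterExpansion and expand are hypothetical names.
final class TwoCenterExpansion {
    static String longestPalindrome(String s) {
        if (s.isEmpty()) {
            return "";
        }
        int start = 0, maxLen = 1;
        for (int i = 0; i < s.length(); i++) {
            int odd = expand(s, i, i);       // centered on s[i]
            int even = expand(s, i, i + 1);  // centered on the gap to the right of s[i]
            int len = Math.max(odd, even);
            if (len > maxLen) {
                maxLen = len;
                // Integer division makes this formula valid for both parities.
                start = i - (len - 1) / 2;
            }
        }
        return s.substring(start, start + maxLen);
    }

    // Length of the longest palindrome that grows outward from (left, right).
    static int expand(String s, int left, int right) {
        while (left >= 0 && right < s.length() && s.charAt(left) == s.charAt(right)) {
            left--;
            right++;
        }
        return right - left - 1;
    }
}
```

The trade-off is two expansion calls per position and slightly more index arithmetic, in exchange for not building the padded string.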

1.2 Dynamic programming method

The idea of dynamic programming is to use a two-dimensional array dp to record whether each substring of the string is a palindrome. The three key ingredients of the dp are as follows:

  • State definition: dp[i][j] indicates whether the substring s[i...j] is a palindrome

  • Initial state: dp[i][i] represents a single character of the string, which is always a palindrome

  • State transition: whether s[i...j] is a palindrome depends on whether s[i+1...j-1] is a palindrome and on whether s[i] == s[j]:

    \begin{equation}
    dp[i][j]=\begin{cases}
    dp[i + 1][j - 1] \land s[i] == s[j] & \text{if } j - i > 1\\
    s[i] == s[j] & \text{if } j - i == 1
    \end{cases}
    \end{equation}

    Because dp[i][j] depends on dp[i+1][j-1], i must be traversed in reverse order, so that row i+1 is computed before row i.

public String longestPalindrome(String s) {
    // dp[i][j] is true when s[i...j] is a palindrome
    boolean[][] dp = new boolean[s.length()][s.length()];
    int start = 0, maxLen = 1;
    // Every single character is a palindrome
    for (int i = 0; i < s.length(); i++) {
        dp[i][i] = true;
    }
    // Traverse i in reverse so that dp[i + 1][j - 1] is ready before dp[i][j]
    for (int i = s.length() - 1; i >= 0; i--) {
        for (int j = i + 1; j < s.length(); j++) {
            if (s.charAt(i) == s.charAt(j)) {
                if (j - i == 1 || dp[i + 1][j - 1]) {
                    dp[i][j] = true;
                    if (j - i + 1 > maxLen) {
                        maxLen = j - i + 1;
                        start = i;
                    }
                }
            }
        }
    }
    return s.substring(start, start + maxLen);
}
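Reversing i is not the only valid fill order. Another common choice, shown here as a sketch rather than as the author's method, is to iterate by substring length, so that every shorter substring is settled before a longer one needs it. The class name LengthOrderedDp is an illustrative name.

```java
// Illustrative sketch: LengthOrderedDp is a hypothetical name.
final class LengthOrderedDp {
    static String longestPalindrome(String s) {
        int n = s.length();
        if (n == 0) {
            return "";
        }
        boolean[][] dp = new boolean[n][n];
        int start = 0, maxLen = 1;
        for (int i = 0; i < n; i++) {
            dp[i][i] = true; // single characters
        }
        // Fill by increasing length: dp[i + 1][j - 1] has length len - 2,
        // so it was computed in an earlier pass (or is a base case).
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i + len - 1 < n; i++) {
                int j = i + len - 1;
                boolean endsMatch = s.charAt(i) == s.charAt(j);
                dp[i][j] = endsMatch && (len == 2 || dp[i + 1][j - 1]);
                if (dp[i][j] && len > maxLen) {
                    maxLen = len;
                    start = i;
                }
            }
        }
        return s.substring(start, start + maxLen);
    }
}
```

Both orders take O(n^2) time and space; which one reads better is mostly a matter of taste.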

2. Manacher's algorithm

2.1 Principle and steps

Manacher's algorithm is an efficient algorithm that exploits symmetry for solving palindrome string problems. Its main purpose is to find the longest palindrome substring in a string. Here is how the Manacher algorithm works in detail:

  1. Preprocessing: To simplify the problem, first preprocess the original string by inserting a special character (usually '#') between every pair of characters and at both ends, so that odd-length and even-length palindromes are handled uniformly.
  2. Auxiliary array and pointers: Define an auxiliary array P that stores the maximum palindrome radius centered on each character, together with two auxiliary pointers C and R:
    • C is the center of the palindrome that currently reaches furthest to the right.
    • R is the right boundary of that palindrome.
  3. Traverse the string: Visit each character of the preprocessed string from left to right.
  4. Fill the auxiliary array: If the current position i lies inside the right boundary R, symmetry provides a starting value for P[i] without any character comparisons:
    • Compute the mirror position of i with respect to C: mirror = 2C - i;
    • If the palindrome centered at mirror fits inside the rightmost palindrome (P[mirror] <= R - i), start from P[i] = P[mirror];
    • Otherwise only the part up to R is known to match, so start from P[i] = R - i. Both cases combine into P[i] = min(R - i, P[mirror]).
  5. Extend the palindrome radius: Starting from the value filled in above, try to extend the palindrome further:
    • Compare the characters at distance P[i] + 1 to the left and to the right of position i;
    • If they are equal, increase the radius by 1;
    • Keep expanding until the characters differ or a string boundary is reached.
  6. Update C and R: If the right boundary of the expanded palindrome exceeds R, update C and R:
    • Set C to the current position i;
    • Set R to the right boundary of the expanded palindrome, i + P[i].
  7. Record the longest palindromic substring: During the traversal, record and update the length and center of the longest palindrome found so far.
  8. Return the result: After the traversal, recover the answer from the recorded length and center.

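Through the above steps, Manacher's algorithm efficiently finds the longest palindromic substring of a given string. Its time complexity is O(n), where n is the length of the string: R never moves left, and every successful character comparison in step 5 pushes the palindrome past the current R, so the total number of successful comparisons is bounded by the length of the preprocessed string.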
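To see the steps above at work on a small input, the following sketch computes only the radius array P for a string. The class name ManacherRadii and the method radii are illustrative names chosen for this example.

```java
import java.util.Arrays;

// Illustrative sketch: ManacherRadii and radii are hypothetical names.
final class ManacherRadii {
    // Returns the palindrome radius of every position in the '#'-padded string.
    static int[] radii(String s) {
        StringBuilder sb = new StringBuilder("#");
        for (char ch : s.toCharArray()) {
            sb.append(ch).append('#');
        }
        String t = sb.toString();

        int[] p = new int[t.length()];
        int c = 0, r = 0; // center and right boundary of the rightmost palindrome
        for (int i = 0; i < t.length(); i++) {
            if (i < r) {
                // Step 4: reuse the mirror value, capped at the known boundary.
                p[i] = Math.min(r - i, p[2 * c - i]);
            }
            // Step 5: extend while both sides match and stay in bounds.
            while (i - 1 - p[i] >= 0 && i + 1 + p[i] < t.length()
                    && t.charAt(i - 1 - p[i]) == t.charAt(i + 1 + p[i])) {
                p[i]++;
            }
            // Step 6: move the rightmost palindrome if this one reaches further.
            if (i + p[i] > r) {
                c = i;
                r = i + p[i];
            }
        }
        return p;
    }

    static String describe(String s) {
        return Arrays.toString(radii(s));
    }
}
```

For "aba" the padded string is "#a#b#a#" and the radii are [0, 1, 0, 3, 0, 1, 0]. The value 3 at the center 'b' equals the length of "aba" in the original string, which is why the radius in the padded string can be used directly as a length.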

2.2 Java implementation

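public String longestPalindrome(String s) {
    // Preprocess the string by inserting the special character '#' around every character
    String processedStr = "#" + String.join("#", s.split("")) + "#";

    int[] P = new int[processedStr.length()];  // Auxiliary array: maximum palindrome radius centered on each character
    int C = 0;  // Center of the current rightmost palindrome
    int R = 0;  // Right boundary of that palindrome

    int maxLen = 0;  // Length of the longest palindromic substring
    int centerIndex = 0;  // Center (in processedStr) of the longest palindromic substring

    for (int i = 1; i < processedStr.length() - 1; i++) {
        int mirror = 2 * C - i;  // Mirror position of i with respect to C

        // If i lies inside R, symmetry gives a starting value for P[i]
        if (i < R) {
            P[i] = Math.min(R - i, P[mirror]);
        }

        // Extend the radius while the characters on both sides are equal
        while (i + 1 + P[i] < processedStr.length() && i - 1 - P[i] >= 0
               && processedStr.charAt(i + 1 + P[i]) == processedStr.charAt(i - 1 - P[i])) {
            P[i]++;
        }

        // Update the center and right boundary of the rightmost palindrome
        if (i + P[i] > R) {
            C = i;
            R = i + P[i];
        }

        // Update the length and center of the longest palindrome
        if (P[i] > maxLen) {
            maxLen = P[i];
            centerIndex = i;
        }
    }

    // Map the center and radius back to start and end positions in the original string
    int start = (centerIndex - maxLen) / 2;
    int end = start + maxLen;

    return s.substring(start, end);
}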
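The last step, start = (centerIndex - maxLen) / 2, is easy to get wrong, so here is a small sketch that isolates it. The class name IndexMappingDemo and the method extract are illustrative names. The reasoning is that the palindrome in the padded string spans centerIndex - maxLen to centerIndex + maxLen, its leftmost character is a '#' at an even index 2a, and the first real character right after it is s[a].

```java
// Illustrative sketch: IndexMappingDemo and extract are hypothetical names.
final class IndexMappingDemo {
    // center: index in the '#'-padded string
    // radius: palindrome radius at that center, which equals the length in s
    static String extract(String s, int center, int radius) {
        int start = (center - radius) / 2;
        return s.substring(start, start + radius);
    }
}
```

For "babad" (padded "#b#a#b#a#d#"), center 3 with radius 3 gives start 0 and the substring "bab". For "cbbd" (padded "#c#b#b#d#"), the center is the '#' at index 4 with radius 2, which gives start 1 and the substring "bb".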

3. LeetCode in practice

3.1 Longest palindromic substring

https://leetcode.cn/problems/longest-palindromic-substring/

Given a string s, find the longest palindromic substring in s.

A string is called a palindrome if it reads the same in reverse order as in the original order.

Example 1:

Input: s = "babad"
Output: "bab"
Explanation: "aba" is also a valid answer.

Example 2:

Input: s = "cbbd"
Output: "bb"

Constraints:

  • 1 <= s.length <= 1000
  • s consists only of digits and English letters

This is the classic problem, so the implementation from Section 2.2 can be used as is.

3.2 Palindromic substrings

https://leetcode.cn/problems/palindromic-substrings/

Given a string s, count and return the number of palindromic substrings in it.

A palindrome is a string that reads the same forward and backward.

A substring is a sequence of consecutive characters in a string.

Substrings with different starting or ending positions are considered different substrings, even if they consist of the same characters.

Example 1:

Input: s = "abc"
Output: 3
Explanation: three palindromic substrings: "a", "b", "c"

Example 2:

Input: s = "aaa"
Output: 6
Explanation: six palindromic substrings: "a", "a", "a", "aa", "aa", "aaa"

Constraints:

  • 1 <= s.length <= 1000
  • s consists of lowercase English letters

Manacher's algorithm applies here as well. There is no need to record the length and position of the longest palindromic substring. A center with radius r in the padded string corresponds to a palindrome of length r in the original string, and shrinking it by one character on each side yields palindromes of length r - 2, r - 4, and so on, so each center contributes (r + 1) / 2 palindromic substrings (integer division). Summing this over all centers gives the answer.

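public int countSubstrings(String s) {
    int ans = 0;
    // Preprocess the string by inserting the special character '#' around every character
    String joinedStr = "#" + String.join("#", s.split("")) + "#";
    int joinedStrLen = joinedStr.length();
    // Auxiliary array: maximum palindrome radius centered on each character, excluding the center itself
    int[] p = new int[joinedStrLen];
    // Center of the current rightmost palindrome
    int center = 0;
    // Right boundary of the current rightmost palindrome
    int right = 0;

    for (int i = 0; i < joinedStrLen; i++) {
        // If i lies inside right, symmetry gives a starting value for p[i]; this is the key step of Manacher's algorithm
        if (i < right) {
            // Mirror position of i with respect to center
            int mirror = 2 * center - i;
            // Start from the smaller of the two radii
            p[i] = Math.min(right - i, p[mirror]);
        }

        // Extend the radius while the characters on both sides are equal, with bounds checks
        while (i - 1 - p[i] >= 0 && i + 1 + p[i] < joinedStrLen
               && joinedStr.charAt(i - 1 - p[i]) == joinedStr.charAt(i + 1 + p[i])) {
            p[i]++;
        }
        // Update center and right
        if (i + p[i] > right) {
            center = i;
            right = i + p[i];
        }

        // A center with radius p[i] contributes (p[i] + 1) / 2 palindromic substrings
        ans += (p[i] + 1) / 2;
    }
    return ans;
}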
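When testing countSubstrings, a slower reference implementation is handy for cross-checking results on small inputs. The following sketch counts palindromic substrings by plain O(n^2) center expansion over the 2n - 1 possible centers. The class name BruteForceCount is an illustrative name, and this is a test aid under that assumption, not a replacement for the linear-time version.

```java
// Illustrative sketch: BruteForceCount is a hypothetical name.
final class BruteForceCount {
    static int countSubstrings(String s) {
        int count = 0;
        // Centers 0, 2, 4, ... are characters; centers 1, 3, 5, ... are gaps.
        for (int center = 0; center < 2 * s.length() - 1; center++) {
            int left = center / 2;
            int right = left + center % 2;
            // Each successful expansion is one more palindromic substring.
            while (left >= 0 && right < s.length() && s.charAt(left) == s.charAt(right)) {
                count++;
                left--;
                right++;
            }
        }
        return count;
    }
}
```

On the two examples above it returns 3 for "abc" and 6 for "aaa", matching the expected outputs.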

Origin blog.csdn.net/qq_23091073/article/details/132178425