Manacher's algorithm for the longest palindromic substring

Disclaimer: This is an original article by the blogger, released under the CC 4.0 BY-SA license. When reproducing it, please attach the original source link and this statement.
Article link: https://blog.csdn.net/wangjiangrong/article/details/91449637

While working through LeetCode problems, I came across a method called Manacher's algorithm, which finds the longest palindromic substring in O(n) time. Since it was an algorithm I had only heard of, I looked it up, implemented it myself, and am recording the result here.

Problem link: https://leetcode-cn.com/problems/longest-palindromic-substring/

Palindromic substring: informally, a string that reads the same forwards and backwards, e.g. aba, abba, asdfdsa.

The problem asks: given a string such as asdfdsaba, output its longest palindromic substring, which here is asdfdsa.

There are many ways to solve this problem. The simplest is to traverse the string and, from each position, expand to the left and right to find the longest palindromic substring centered there, then take the longest of all of them (center expansion). There is a pitfall here. For aba, expanding around b gives the palindrome aba. For abba, however, expanding around either b gives abb or bba, which are not palindromes, so the result is wrong. To handle this case, we first insert a special character such as '#' around every character and then apply the operation above, i.e. abba becomes #a#b#b#a#.
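
The padding and center expansion described above can be sketched as follows. This is only a minimal illustration, not the final code of this article: the names CenterExpansionSketch, InsertSeparators, and LongestPalindrome are made up for the example, and the running time is O(n^2) in the worst case.

```csharp
using System;
using System.Text;

public static class CenterExpansionSketch
{
    // Turns "abba" into "#a#b#b#a#" so that every palindrome has a single center.
    public static string InsertSeparators(string s)
    {
        var sb = new StringBuilder(s.Length * 2 + 1);
        foreach (char c in s)
        {
            sb.Append('#').Append(c);
        }
        sb.Append('#');
        return sb.ToString();
    }

    // Expands around every position of the padded string and keeps the best result.
    public static string LongestPalindrome(string s)
    {
        if (s.Length <= 1) return s;

        string t = InsertSeparators(s);
        int bestCenter = 0, bestRadius = 0;
        for (int center = 0; center < t.Length; center++)
        {
            int radius = 0;
            while (center - radius - 1 >= 0
                   && center + radius + 1 < t.Length
                   && t[center - radius - 1] == t[center + radius + 1])
            {
                radius++;
            }
            if (radius > bestRadius)
            {
                bestRadius = radius;
                bestCenter = center;
            }
        }
        // A radius in the padded string equals the palindrome's length in the original string.
        return s.Substring((bestCenter - bestRadius) / 2, bestRadius);
    }
}
```

With this sketch, LongestPalindrome("abba") returns abba, because the middle '#' of #a#b#b#a# serves as the center that the unpadded string lacks.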

Next comes the Manacher algorithm. My understanding is that it optimizes the method above: during the traversal it does not need to expand left and right from every position, and only does so in certain special cases, which greatly improves efficiency. So understanding when expansion is needed and when it is not amounts to understanding Manacher's algorithm.

Manacher algorithm

First we have to understand a few concepts:

Palindrome center: the center point of a palindrome. In #a#b#a# the center is b; in #a#b#b#a# the center is the third #.

Palindrome radius: half the length of a palindrome, not counting the center (some definitions do count the center; it makes no real difference and only affects some calculations later in the code). For example, the radius of aba is 1 and the radius of abba is 2.

Palindrome boundaries: the left boundary is the position of the leftmost character of the palindrome, and the right boundary is the position of its rightmost character.

For example, take the string qawasdsaz. For the palindromic substring awa, the center is w at index 2, the radius is 1, the left boundary is index 1, and the right boundary is index 3. For the palindromic substring asdsa, the center is d at index 5, the radius is 2, the left boundary is index 3, and the right boundary is index 7.

Observation: right boundary = center + radius, and left boundary = center - radius.

Taking each character in turn as a center, we compute the radius of the longest palindromic substring centered there. These radii form a new array, the palindromic radius array. For the string qawasdsaz above: index 0 is the center of the palindrome q with radius 0, ..., index 2 is the center of awa with radius 1, ..., index 5 is the center of asdsa with radius 2, and so on, giving the radius array [0, 0, 1, 0, 0, 2, 0, 0, 0]. The maximum value in the array is the radius of the longest palindromic substring, and its index is the center of that substring in the string, which is enough to recover the longest palindromic substring.
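
To make the radius array concrete, here is a small sketch that builds it by plain center expansion and then applies the boundary formulas. The names RadiusArraySketch, Compute, and LongestOddPalindrome are hypothetical, and the string is not padded with '#', so only odd-length palindromes are found.

```csharp
using System;

public static class RadiusArraySketch
{
    // Radius of the longest odd-length palindrome centered at each index,
    // computed by expanding from every position.
    public static int[] Compute(string s)
    {
        var radii = new int[s.Length];
        for (int center = 0; center < s.Length; center++)
        {
            int r = 0;
            while (center - r - 1 >= 0
                   && center + r + 1 < s.Length
                   && s[center - r - 1] == s[center + r + 1])
            {
                r++;
            }
            radii[center] = r;
        }
        return radii;
    }

    // Picks the index with the largest radius and cuts out the palindrome
    // using left = center - radius and right = center + radius.
    public static string LongestOddPalindrome(string s)
    {
        if (s.Length == 0) return s;

        int[] radii = Compute(s);
        int center = 0;
        for (int i = 1; i < radii.Length; i++)
        {
            if (radii[i] > radii[center]) center = i;
        }
        int left = center - radii[center];
        int right = center + radii[center];
        return s.Substring(left, right - left + 1);
    }
}
```

For qawasdsaz, Compute returns [0, 0, 1, 0, 0, 2, 0, 0, 0] and LongestOddPalindrome returns asdsa, matching the example above.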

Process

The most important question now is how to obtain the radius array for a string. (Center expansion computes the palindrome centered at every single position; Manacher's algorithm optimizes on that basis and avoids a lot of the computation.)

1. Before traversing, we insert a special symbol (e.g. '#') around every character of the string, to avoid the errors caused by palindromes such as abba.

2. Then we start traversing the string, computing for each character the radius of the longest palindrome centered on it and recording it in a new array, radiusArray.

3. We define a few variables to store state: center, the center of the current palindromic substring, and right, the right boundary of the current palindromic substring.

4. Suppose that at some point in the traversal, index i becomes the center, center = i. By center expansion we compute the longest palindromic substring s centered at i and its radius r, so radiusArray[center] = r and the right boundary right is i + r.

5. Later in the traversal we reach index i + j. If i + j <= right, then i + j lies inside the substring s. The mirror of i + j about center is i - j, and because i - j is to the left of i + j, we already know its palindrome radius, radiusArray[i - j]. We also know that the left boundary of s is center - radiusArray[center], and that the left boundary of the palindrome at the mirror point i - j is i - j - radiusArray[i - j]. What we want is the value of radiusArray[i + j].

With these values known, we can compare the two left boundaries and, in some cases, determine the radius at i + j without center expansion. By the palindrome property, the characters from center - radiusArray[center] to center, read in reverse, equal the characters from center to center + radiusArray[center].

  • If i - j - radiusArray[i - j] > center - radiusArray[center], i.e. the left boundary of the mirror's palindrome lies strictly inside s, then since the two halves of s mirror each other, the palindrome at i + j equals the palindrome at i - j: radiusArray[i + j] = radiusArray[i - j]. No center expansion is needed, and center and right stay unchanged. (One may wonder why the palindrome at i + j cannot be longer or shorter than the one at i - j. Both lie strictly inside s, so by symmetry any lengthening or shortening at i + j would show up as the same change at i - j.)
  • If i - j - radiusArray[i - j] < center - radiusArray[center], i.e. the left boundary of the mirror's palindrome lies outside s, then radiusArray[i + j] = right - (i + j). Again no center expansion is needed, and center and right stay unchanged. (The stretch from center - radiusArray[center] to i - j mirrors the stretch from i + j to right, where right == center + radiusArray[center], so the palindrome at i + j reaches at least right. But the character at center - radiusArray[center] - 1 differs from the character at right + 1, otherwise the radius of s would be 1 larger. Since the palindrome at i - j does extend past the left boundary of s, the palindrome at i + j cannot extend past right, so it stops exactly at right.)
  • If i - j - radiusArray[i - j] == center - radiusArray[center], i.e. the left boundary of the mirror's palindrome coincides with the left boundary of s, the palindrome at i + j cannot be determined directly: it may stop at right, or it may be longer. We have to obtain a new substring s by center expansion and update center and right. (Because the two left boundaries coincide, we know that the stretch from i + j to right is part of the right half of the palindrome at i + j, but that palindrome may extend further.)

6. If i + j > right, we obtain a new substring s directly by center expansion and update center and right.

7. When the traversal ends, the maximum value in radiusArray is the radius of the longest palindromic substring, and its index is the center of that substring in the padded string. Suppose radiusArray[i] is the maximum, with value r. Then in the original string (without '#') the longest palindromic substring starts at index (i - r) / 2, has length r, and ends just before index (i + r) / 2.
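
The comparison of the two left boundaries in step 5 and the index mapping in step 7 can be isolated into two small helpers. This is a sketch under my own naming (MirrorCaseSketch, TryRadiusFromMirror, and ToOriginalRange are not part of the implementation in this article), and it assumes the caller has already checked that center < i <= right.

```csharp
using System;

public static class MirrorCaseSketch
{
    // Tries to derive the radius at index i from its mirror without comparing characters.
    // Returns true in the first two cases of step 5. Returns false when the two left
    // boundaries coincide: radius is then only a lower bound and expansion is still needed.
    public static bool TryRadiusFromMirror(int[] radiusArray, int center, int right, int i, out int radius)
    {
        int mirror = 2 * center - i;                    // i - j in the text
        int mirrorLeft = mirror - radiusArray[mirror];  // left boundary of the mirror's palindrome
        int left = 2 * center - right;                  // left boundary of s

        if (mirrorLeft > left)
        {
            radius = radiusArray[mirror];
            return true;
        }
        if (mirrorLeft < left)
        {
            radius = right - i;
            return true;
        }
        radius = right - i;
        return false;
    }

    // Maps a center and radius in the '#'-padded string to a start index and length
    // in the original string.
    public static (int Start, int Length) ToOriginalRange(int paddedCenter, int paddedRadius)
    {
        return ((paddedCenter - paddedRadius) / 2, paddedRadius);
    }
}
```

As a quick check, take qbabcbaz without padding: its radius array starts [0, 0, 1, 0, 2, ...], and the palindrome abcba has center 4 and right 6. For i = 5 the mirror is 3 and the first case gives radius 0. For i = 6 the mirror is 2, whose palindrome bab sticks out past the left boundary, so the second case gives radius right - i = 0.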

So when the string contains one or more long palindromes, most of the radii in the right half of each can be computed directly, which saves a great deal of computation. Here is the implementation:

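// Find the longest palindromic substring
public string LongestPalindrome(string s)
{
    if (s.Length <= 1)
    {
        return s;
    }

    // Insert the special character '#'
    System.Text.StringBuilder sb = new System.Text.StringBuilder();
    for (int i = 0, len = s.Length; i < len; i++)
    {
        sb.Append('#');
        sb.Append(s[i]);
    }
    sb.Append('#');
    string newString = sb.ToString();

    // Palindromic radius array
    int[] radiusArray = new int[newString.Length];
    // center: center of the current palindrome; right: its right boundary; left: its left boundary;
    // symmetryIndex: mirror of the current index about center;
    // symmetryIndexLeft: left boundary of the palindrome at the mirror index
    int center = -1, right = -1, left = -1, symmetryIndex = -1, symmetryIndexLeft = -1;

    // maxRadius: largest radius found so far; maxIndex: the index where it occurs
    int maxRadius = -1, maxIndex = -1;

    // Traverse; stop early once the best radius is at least the remaining length.
    // The last two characters can be skipped: their radii are at most 1, which index 1 already reaches.
    for (int i = 0, len = newString.Length; i < len - 2 && maxRadius < (len - i); i++)
    {
        if (i > right)
        {
            // Outside the current palindrome: expand from scratch
            right = GetRight(newString, i, i);
            center = i;
            radiusArray[i] = right - i;
        }
        else
        {
            symmetryIndex = center - (i - center);
            symmetryIndexLeft = symmetryIndex - radiusArray[symmetryIndex];
            left = center - (right - center);

            if (symmetryIndexLeft > left)
            {
                // The mirror's palindrome lies strictly inside: copy its radius
                radiusArray[i] = radiusArray[symmetryIndex];
            }
            else if (symmetryIndexLeft < left)
            {
                // The mirror's palindrome sticks out on the left: stop exactly at right
                radiusArray[i] = right - i;
            }
            else
            {
                // The boundaries coincide: the palindrome at i reaches at least right,
                // so continue expanding from there instead of re-checking known characters
                right = GetRight(newString, i, right);
                center = i;
                radiusArray[i] = right - i;
            }
        }

        // Record the maximum
        if (radiusArray[i] > maxRadius)
        {
            maxRadius = radiusArray[i];
            maxIndex = i;
        }
    }

    return s.Substring((maxIndex - maxRadius) / 2, maxRadius);
}

// Center expansion: returns the right boundary of the palindrome centered at index.
// knownRight is a position the palindrome is already known to reach (pass index when nothing is known).
public int GetRight(string s, int index, int knownRight)
{
    int right = knownRight;
    for (int i = knownRight - index + 1, len = s.Length; index + i < len && index - i >= 0; i++)
    {
        if (s[index - i] != s[index + i])
        {
            return right;
        }
        right = index + i;
    }
    return right;
}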
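
For comparison, many write-ups of Manacher's algorithm fold the three cases into a single lower bound, min(radius[mirror], right - i), followed by an expansion loop that simply stops at once when no growth is possible. The sketch below shows that variant. It is not the code above, and the names ManacherMinSketch, radius, bestCenter, and bestRadius are my own.

```csharp
using System;
using System.Text;

public static class ManacherMinSketch
{
    public static string LongestPalindrome(string s)
    {
        if (s.Length <= 1) return s;

        // Pad with '#': "abba" becomes "#a#b#b#a#".
        var sb = new StringBuilder(s.Length * 2 + 1);
        foreach (char c in s)
        {
            sb.Append('#').Append(c);
        }
        sb.Append('#');
        string t = sb.ToString();

        int[] radius = new int[t.Length];
        int center = 0, right = 0;
        int bestCenter = 0, bestRadius = 0;

        for (int i = 0; i < t.Length; i++)
        {
            // Inside the current palindrome, the mirror index gives a safe lower bound.
            int r = i < right ? Math.Min(radius[2 * center - i], right - i) : 0;

            // Expand from the lower bound; in the first two cases this loop exits immediately.
            while (i - r - 1 >= 0 && i + r + 1 < t.Length && t[i - r - 1] == t[i + r + 1])
            {
                r++;
            }
            radius[i] = r;

            // Keep track of the palindrome that reaches furthest to the right.
            if (i + r > right)
            {
                center = i;
                right = i + r;
            }
            if (r > bestRadius)
            {
                bestRadius = r;
                bestCenter = i;
            }
        }
        return s.Substring((bestCenter - bestRadius) / 2, bestRadius);
    }
}
```

This form does one extra character comparison per index in the cases where the explicit three-way check would skip expansion, in exchange for shorter code. The overall running time remains O(n) because every successful comparison moves right forward.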