Manacher's algorithm -- Java O(n) time and O(n) space solution


  • 0
    F

    As others pointed out, the original problem is equivalent to finding the longest prefix palindrome of the input string (i.e., the longest palindrome that is also a prefix of the input string). The final output string can be constructed by prepending the reverse of the remaining substring (i.e., the part excluding the prefix palindrome in the input string) to the input string.

    There are multiple ways for obtaining the longest prefix palindrome of the given string, one example would be the KMP-based algorithm applied to the concatenation of the input string and its reverse. Here I will introduce another one, based on Manacher's algorithm, which can find the longest palindromes centered at any index of the given string (note the prefix palindrome will be a special case).

    In essence, both KMP-based and Manacher's algorithms take advantage of known information to avoid unnecessary comparisons, which leads to huge improvement in performance compared to the naive algorithms. For the KMP case, the known information is the length of the longest prefix that is also a suffix of the substring ending at each index. For the Manacher case, the known information will be the length of the longest palindrome centered at each index. The ideas will be detailed in the following paragraphs.

    Assume the input string is s. We will first transform it into a new string by adding "boundary" at its beginning and end and in between two consecutive characters. The "boundary" can be any sentinel character that is not present in s (for example, it can be \u0000, the null character). The reason we do this transformation is to deal with the two cases equally when the length of s is even or odd. It can be shown that the length of palindromic substrings found in this new string will always be odd.

    Let arr be the character array corresponding to the transformed string above, span be the integer array where span[i] represents the half length of the longest palindrome centered at arr[i] (note since the total length of such a palindrome is always odd, the total length will be 2 * span[i] + 1). Our goal is to fill out the span array. The algorithm proceeds as usual by scanning each position in arr. Naively to find the half length of the longest palindrome centered at each position, we have to compare characters that are mirror images of each other about that position, which yields the O(n^2) solution. To make use of known information, we will maintain a reference palindrome, which satisfies the following two conditions:

    1. Its center position c is less than i which is the center position currently being scanned. Note this implies the centered position of the reference palindrome must have been scanned.
    2. Compared to other palindromes satisfying condition 1, the position of its rightmost character, r, is largest. Note r = c + span[c].

    With the reference palindrome in place, there are two general cases depending on whether i is contained within the reference palindrome or not. If it is outside the reference palindrome, meaning i > r, the information from the reference palindrome won't help so we are forced to go with the naive way described above. If however, it is contained within the reference palindrome, meaning c < i <= r, there will be a mirror position in the reference palindrome corresponding to i, which after simple calculation will be 2c - i. The key here is that this position is to the left of c, i.e., 2c - i < c < i, therefore it must have been scanned so its half length information (span[2c-i]) will be readily available.

    The situation will be further divided into two cases, depending on whether the longest palindrome centered at 2c - i is totally contained within the reference palindrome or not (here totally means the leftmost character of the palindrome centered at 2c - i is to the right of the leftmost character of the reference palindrome). If it is, meaning c - span[c] < (2c - i) - span[2c - i], or i + span[2c - i] < r, then the half length at position i will be the same as 2c - i, i.e., span[i] = span[2c-i]. This is because comparing characters that are mirror images of each other about position i (these characters from one group) will be exactly the same as comparing those that are mirror images of each other about position 2c - i (these characters form another group), since these two groups are mirror images of each other about position c. Things are a little different for the other case, when i + span[2c - i] >= r. We can only be sure about characters contained within the reference palindrome. This implies span[i] >= (2c - i) - (2c - r) = r - i, that is the half length of the longest palindrome centered at position i is at least the same as the part contained in the reference palindrome of the one centered at position 2c - i. To get the remaining part, we are again forced to apply the naive method above.

    Once we find the longest palindrome centered at i, we then decide whether to update the reference palindrome or not. Also if this palindrome is a prefix palindrome (meaning i = span[i]), we also need to determine whether to update the center position mc of the longest prefix palindrome.

    Here is the Java program based on the above ideas. The program runs in linear time despite of the nested for loop, by noting that the rightmost position of the reference palindrome, r, is always increasing, therefore the initial value of the parameter k is always increasing. Also note that since only half of the prefix palindrome contains valid characters from the original string s, the effective length of the longest prefix palindrome of s will actually be mc.

    public String shortestPalindrome(String s) {
        char[] arr = new char[s.length() * 2 + 1];
        int[] span = new int[arr.length];
        int c = 0, r = 0, mc = 0;
        
        for (int i = 1 ; i < arr.length; i += 2) arr[i] = s.charAt(i >> 1);
        
        for (int i = 1; i < arr.length; i++) {
        	if (i < r && i + span[c * 2 - i] < r) {
        	    span[i] = span[c * 2 - i];
        	    continue;
        	}
        		
        	if (i < r) span[i] = r - i;
        		
        	int k = i + span[i] + 1, j = i * 2 - k;
        	for (; j >= 0 && k < arr.length && arr[j] == arr[k]; j--, k++) span[i]++;
        		
        	if (r < i + span[i]) r = i + span[c = i];
        	if (i == span[i] && mc < span[i]) mc = i;
        }
        
        return new StringBuilder(s.substring(mc)).reverse().append(s).toString();
    }
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.