Accepted KMP solution in java for reference


  • 24
    Z
    public String strStr(String haystack, String needle) {
    	//KMP algorithms
    	if(needle.equals("")) return haystack;
    	if(haystack.equals("")) return null;
    	char[] arr = needle.toCharArray();
    	int[] next = makeNext(arr);
    
    	for(int i = 0, j = 0, end = haystack.length(); i < end;){
    		if(j == -1 || haystack.charAt(i) == arr[j]){
    			j++;
    			i++;
    			if(j == arr.length) return haystack.substring(i - arr.length);
    		}
    		if(i < end && haystack.charAt(i) != arr[j]) j = next[j];
    	}
        return null;
    }
    
    private int[] makeNext(char[] arr){
    	int len = arr.length;
    	int[] next = new int[len];
    
    	next[0] = -1;
    	for(int i = 0, j = -1; i + 1 < len;){
    		if(j == -1 || arr[i] == arr[j]){
    			next[i+1] = j+1;
    			if(arr[i+1] == arr[j+1]) next[i+1] = next[j+1];
    			i++;
    			j++;
    		}
    		if(arr[i] != arr[j]) j = next[j];
    	}
    
    	return next;
    }

  • 0
    A

    if(arr[i+1] == arr[j+1]) next[i+1] = next[j+1];

    This is redundant.

    Thanks for sharing though :)


  • 0
    W

    you must remove this line: if(arr[i+1] == arr[j+1]) next[i+1] = next[j+1];
    With this line, the next array is wrong.


  • 2
    Y

    @angelvivienne @wangqi0316
    if(arr[i+1] == arr[j+1]) next[i+1] = next[j+1];
    This line is the most important part for KMP alg.
    when arr[i+1] failed, if same, j+1 will failed too. so need to call next[j+1].
    In theory , need check next[next[j+1]] for optimization.


  • 1

    Thanks for sharing! I did this using KMP this time. But I checked prefix function but not sure why it's even slower than brute force.
    For example, t="ababa", f = [0, 0, 1, 2, 3]. I guess it's due to the test case of Leetcode?

        public int strStr(String s, String t) {
            int[] f = computePrefixFunc(t);
            int i, j;
            for (i = 0, j = 0; i < s.length() && j < t.length(); i++) { // never back up i
                while (j > 0 && s.charAt(i) != t.charAt(j)) j = f[j - 1]; // back up j recursively till next char match
                if (s.charAt(i) == t.charAt(j)) j++; // if matched move j, otherwise give up current i and move on
            
            return j == t.length() ? i - j : -1;
        }
        
        // Similar to strStr except compare t against itself
        private int[] computePrefixFunc(String t) {
            int[] f = new int[t.length()];
            for (int i = 1, j = 0; i < t.length(); i++) { // now i = #matched
                while (j > 0 && t.charAt(i) != t.charAt(j)) j = f[j - 1];
                if (t.charAt(i) == t.charAt(j)) j++;
                f[i] = j;
            }
            return f;
        }
    

  • 0
    S

    @cdai Hi cdai, could you explain the line while (j > 0 && t.charAt(i) != t.charAt(j)) j = f[j - 1];
    What's the meaning of j?


  • 0
    J

    Here is my KMP implementation, which is almost identical to the pseudo code on CLRS.

    public class Solution {
        public int strStr(String haystack, String needle) {
            if (needle.length() == 0) return 0;
            if (needle.length() > haystack.length() || haystack.length() == 0) return -1;
            char[] ndl = needle.toCharArray();
            char[] hay = haystack.toCharArray();
            int[] pai = new int[ndl.length];
            pai[0] = -1;
            int k = -1;
            for (int i = 1; i < ndl.length; i++) {
                while (k > -1 && ndl[k + 1] != ndl[i]) {
                    k = pai[k];
                }
                if (ndl[k + 1] == ndl[i]) {
                    k++;
                }
                pai[i] = k;
    
            }
            k = -1;
            for (int i = 0; i < hay.length; i++) {
                while (k > -1 && ndl[k + 1] != hay[i]) {
                    k = pai[k];
                }
                if (ndl[k + 1] == hay[i]) {
                    k++;
                    if (k == ndl.length - 1) {
                        return i - k;
                    }
                }
            }
            return -1;
        }
    }
    

    The code performs worse than brute force method, its because the brute force is O(N) in its best case, while KMP is worst case O(N) for matching, but it has an extra O(M) for preprocessing.

    Following is the same code but more concise and less readable.

    public class Solution {
        public int strStr(String haystack, String needle) {
            if (needle.length() == 0) return 0;
            if (needle.length() > haystack.length() || haystack.length() == 0) return -1;
            char[] ndl = needle.toCharArray();
            char[] hay = haystack.toCharArray();
            int[] pai = new int[ndl.length];
            pai[0] = -1;
            for (int i = 1, k = -1; i < ndl.length; i++) {
                while (k > -1 && ndl[k + 1] != ndl[i]) k = pai[k];
                pai[i] = ndl[k + 1] == ndl[i] ? ++k : k;
            }
            for (int i = 0, k = -1; i < hay.length; i++) {
                while (k > -1 && ndl[k + 1] != hay[i]) k = pai[k];
                if (ndl[k + 1] == hay[i] && ++k == ndl.length - 1) return i - k;
            }
            return -1;
        }
    }
    

  • 1

    @cdai A short String case may cause KMP is slower than brute force since you made a next[] before iteration.

    A longer String and more repeated long substring can show the great improvement by KMP. :)


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.