Implement Princeton KMP version by Python, but got Time exceeded error


  • 0

    I implemented a Python version of KMP followed by Princeton Prof Robert Sedgewick's discussion. (Ref : https://www.youtube.com/watch?v=n-7n-FDEWzc&index=41&list=PLxc4gS-_A5VDvP_9W8JJ04zk6m1qTolzG)

    However, I got Time Limit Exceeds Error. I am new here, so I am not sure I understand TLE error correctly : Per leetcode explanation, TLE indicates that the solution is correct but it is extremely inefficient when test case with huge input size.

    If I understand TLE correctly, this means that my solution works, but some part of my solution should be inefficient and need to be enhanced? I guess due to extremely large space cost when precomputing the DFA table?

    Call for help :( Thanks!

    class Solution(object):

    def dfa(self, haystack, needle):
    
    	R = 256 # R : the alphabet size
    	M = len(needle) # M : the patten length
    
    	# create a dictionary to encode every unique char from needle with a unique int value for later dfa retrieval.
    	global needle_dict
    	needle_dict = dict(zip(list(set(needle).union(set(haystack))), range(len(list(set(needle).union(set(haystack)))))))
    	# create dfa table, where the # of dfa columns is M, the # of dfa rows is R.
    	dfa = [[0 for i in range(M)] for j in range(R)]
    
    	# At state 0  find the first char in the needle and set this first char's corresponding state as 1
    	dfa[needle_dict[needle[0]]][0] = 1
    
    	X = 0 # X to record previous state
    
    	for state in range(1, M): # dfa column
    		for row in range(0, R): # dfa row
    			dfa[row][state] = dfa[row][X] # copy mismatch cases
    		dfa[needle_dict[needle[state]]][state] = state + 1 # update the match case for present state
    		X = dfa[needle_dict[needle[state]]][X] # update X
    
    	return dfa
    
    def strStr(self, haystack, needle):
    	if len(needle) == 0:
    		return 0
    	if len(haystack) == 0:
    		return -1
    	# precompute dfa[][]from pattern
    	dfa_table = self.dfa(haystack, needle)
    
    	i, state = 0, 0
    	while (i < len(haystack) and state < len(needle)):
    		state = dfa_table[needle_dict[haystack[i]]][state]
    		i += 1
    
    	if state == len(needle):
    		return i - len(needle) 
    	else:
    		return -1

  • 0

    First, TLE means your code is too slow which doesn't means your code is correct but too slow. Your code is stopped because it reached a time limitation. So your code has no chance to give an answer and the correctness is unknown.
    Second, your code looks so different from other KMP program. I have to admit I don't understand your code, maybe due to my poor English :confounded: I see no need to build a dfa table or needle_dict. You can save the match and mismatch cases in a single row of a list simply.


  • 0
    H

    Dr. Sedgewick mentioned in his KMP lecture that there is an improved version of KMP for large alphabets, which uses an NFA instead of a DFA.

    You may want to take a look at other posts using KMP. Then you can notice that most of them use a while loop to back up and check prev. states. Since we are not sure to which state we back up before checking them, this is an NFA implementation. And the improved version also avoids the intricate automaton construction.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.