40ms trie solution in Python


  • 0
    R

    In most of the DP solutions, they keep testing if the substring matches. Obviously, it involves many redundant computations. While using a trie can make the testing on all the possible substrings in one pass.

    _TERM = object()
    
    class Solution(object):
        def wordBreak(self, s, wordDict):
            """
            :type s: str
            :type wordDict: List[str]
            :rtype: bool
            """
            trie = {}
            refs = {id(trie): trie}
            for word in wordDict:
                it = trie
                for c in word:
                    it.setdefault(c, {})
                    it = it[c]
                    refs[id(it)] = it
                it[_TERM] = True
    
            ids = {id(trie)}
            for c in s:
                if not ids:
                    break
                forwards = set()
                for i in ids:
                    it = refs[i]
                    if c in it:
                        it = it[c]
                        if _TERM in it:
                            forwards.add(id(trie))
                        forwards.add(id(it))
                ids = forwards
            return any(_TERM in refs[i] for i in ids)
    

    The basic idea is to walk all the way down the trie, if the iterator hits a terminal node, it should create another new iterator at the root of trie, representing a word being fully matched. Once all the characters in the input string being iterated, the result is determined by whether any of the iterators stopped at a terminal node.

    Note that it's important to use a set to eliminate duplicate iterators to make this solution efficient. Otherwise, in the worst case, where many words in the dictionary are the substring of other words, it will degrade into a brute force solution. But when using a set, it's equivalent to the DP solution while the substring testing is more efficient.

    The runtime of this solution is determined by the dictionary. If there is no word in the dictionary is a substring of another one, the runtime is in linear time. In the worst case, if every word except the longest one is a substring of another one, the runtime will be Theta(len(s)*len(wordList))


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.