Short and efficient solution using rolling hash


  • 0
    I

    Instead of using tries, use rolling hash to calculate prefix's hashes: wikipedia. It's the same idea used in Rabin-Karp string search algorithm. It might sound complicated at first, but once you get used to it, you can implement faster than tries. My implementation was in python, but you can extend it to your prefered programming language easily.

    R = 26 # radix size - 26 lowercase letters
    M = 997 # prime number - you can choose a bigger one
    
    def hash_word(word):
        # avoid starting hash_val with 0 or else 'a', 'aa', 'aaa', etc. will end up having the same hash value
        hash_val = 31
        for ch in word:
            num = ord(ch) - ord('a')
            hash_val = (hash_val * R + num) % M
        return hash_val
    
    
    class Solution(object):
        def replaceWords(self, dictionary, sentence):
            # for each hash: list of root words matching that hash
            hash_vals = {}  
            for root in dictionary:
                hash_val = hash_word(root)
                if hash_val not in hash_vals:
                    hash_vals[hash_val] = []
                hash_vals[hash_val].append(root)
            result = []
            # scan successor words using rolling hashes
            for successor in sentence.split():
                rolling_hash = 31
                for i, ch in enumerate(successor):
                    num = ord(ch) - ord('a')
                    rolling_hash = (rolling_hash * R + num) % M
                    if rolling_hash in hash_vals and successor[:i + 1] in hash_vals[rolling_hash]:
                        result.append(successor[:i + 1])
                        break
                else: # hasn't found the prefix, append the full word
                    result.append(successor)
            # format output from list to string
            return ' '.join(result)
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.