Insights / Optimizing / Hacking :-)


  • 4

    I found two simple optimizations that helped me get my solution accepted:

    • Filter out all dictionary words that don't have the same length as the target. They can never have the same abbreviation, because an abbreviation tells you exactly how long the abbreviated word is.
      If after this no dictionary words are left, just abbreviate the target by a single number.
    • Filter out all dictionary words that don't match the target at any position. For example, "leetcode" and "codeleet" have no common abbreviation (other than "8"), as they don't have the same character at any of their eight positions. The reason is that an abbreviation tells you exactly where its letters are, for example "3t2d1" tells you that the fourth letter is a "t" and the seventh letter is a "d".
      If after this no dictionary words are left, just abbreviate the target by its first letter and a single number for the rest (if any).

    The two by far largest dictionaries in the test suite have 1000 words each, and with the above optimizations, they get reduced to 347 and 5 words, respectively. And for example the test case "chinaandusaarefre", ["china","are","and","usa","friends","us","en","ar"] gets reduced to the trivial "chinaandusaarefre", [] (that one was bad for me, because my solution computes and sorts all abbreviations of the target word, but thanks to the optimizations I don't have to do that at all in this case).

    Here's some Python code taking advantage of these optimizations:

    def minAbbreviation(self, target, dictionary):
    
        # Throw out wrong-length words
        dictionary = [word for word in dictionary
                      if len(word) == len(target)]
        if not dictionary:
            return str(len(target))
    
        # Throw out words not matching the target in any position.
        dictionary = [word for word in dictionary
                      if any(map(operator.eq, word, target))]
        if not dictionary:
            return target if len(target) < 2 else target[0] + str(len(target) - 1)
    
        # Regular solution
        ...
    

    And here's the effect of the optimizations on the test suite's 45 test cases:

       1 => 1    dictionary words of length 5
       3 => 2    dictionary words of length 5
       1 => 0    dictionary words of length 5
       1 => 0    dictionary words of length 5
       2 => 1    dictionary words of length 5
       3 => 3    dictionary words of length 6
       4 => 0    dictionary words of length 5
       2 => 2    dictionary words of length 7
       1 => 1    dictionary words of length 11
       5 => 5    dictionary words of length 5
       2 => 2    dictionary words of length 5
       5 => 2    dictionary words of length 5
       5 => 2    dictionary words of length 5
       5 => 2    dictionary words of length 5
       5 => 0    dictionary words of length 4
       5 => 0    dictionary words of length 4
       5 => 0    dictionary words of length 8
       3 => 3    dictionary words of length 3
       6 => 0    dictionary words of length 7
       6 => 0    dictionary words of length 8
       7 => 0    dictionary words of length 8
       8 => 0    dictionary words of length 9
       4 => 0    dictionary words of length 8
       4 => 0    dictionary words of length 4
       4 => 0    dictionary words of length 14
       6 => 0    dictionary words of length 16
      12 => 0    dictionary words of length 8
       8 => 0    dictionary words of length 17
       7 => 6    dictionary words of length 5
       6 => 6    dictionary words of length 8
      13 => 12   dictionary words of length 3
       0 => 0    dictionary words of length 21
      12 => 8    dictionary words of length 4
       9 => 8    dictionary words of length 9
      18 => 14   dictionary words of length 4
       7 => 6    dictionary words of length 3
      12 => 11   dictionary words of length 4
      12 => 11   dictionary words of length 4
      16 => 0    dictionary words of length 1
      12 => 11   dictionary words of length 8
       6 => 3    dictionary words of length 4
      10 => 10   dictionary words of length 6
       9 => 9    dictionary words of length 11
    1000 => 347  dictionary words of length 10
    1000 => 5    dictionary words of length 10
    

    I found out those numbers by submitting this solution, which "hacks" the judge a bit. It collects the optimization statistics and prints them during the last test case, where it intentionally returns something wrong so the judge shows me the output:

    class Solution(object):
        optimizations = []
        def minAbbreviation(self, target, dictionary):
            self.optimizations.append((len(dictionary),
                                       sum(len(word) == len(target) and any(map(operator.eq, word, target))
                                           for word in dictionary),
                                       len(target)))
            if len(self.optimizations) == 45:
                for opt in self.optimizations:
                    print '%4d => %-4d dictionary words of length %d' % opt
                return None
    
            (and then here follows my actual solution)
    

  • 0
    M

    @StefanPochmann Stefan, thanks for the tips. Are you planning to post your full solution? I am particularly interested to see an elegant/short implementation for "isValidWordAbbreviation (or problem 408)". Mine is lengthy and looks silly..


  • 0

    @mleetcode Yes, I just did :-). It's ok, I guess, but I'm not super happy with it. That's one reason why I posted these optimizations separately.


  • 0
    M

    @StefanPochmann Thanks! :)


  • 0
    L

    To my best of knowledge, this problem can be solved in O(m*2^m + mn) time by using DP. However, I am too lazy to write such a solution. :-)


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.