Easy dynamic programming solution with explained base case and recursion


  • 2
    A

    Inspired by a neat technique for finding self-repeating pattern (https://discuss.leetcode.com/topic/71442/short-python).

    1. DP for substring. For convenience, just use substring as key of memoization. strEncoding[str] = shortest encoding of str.

    2. Base case (in fact this 'base' is still recursive):
      self-repeating: repeatingIdx = (s + s).find(s, 1), repeatingIdx < len(s).
      Find index of s in s + s starting from index 1.
      E.g., 'abbbabbbcabbbabbbc' return 9, return 2[strEncoding(abbbabbbc)].

    3. Recursion by Iterating string splitting point:
      E.g., 'abbbabbbc' -> strEncoding(abbbabbb) + strEncoding(c)

    class Solution(object):
        def encode(self, s):
            strEncoding = dict()
            return self.getEncoding(s, strEncoding)
    
        def getEncoding(self, s, strEncoding):
            encoding = s
            # Base case: self-repeating, such as 'abbbabbbcabbbabbbc' -> 2[strEncoding(abbbabbbc)]
            # Find index of s in s + s starting from index 1. 
            # E.g., 'abbbabbbcabbbabbbc' return 9. 'abcabcabcabc' return 3.
            repeatingIdx = (s + s).find(s, 1)
            if repeatingIdx < len(s):
                repeatingSubstr = s[: repeatingIdx]
                numRepeating = len(s) / len(repeatingSubstr)
                if repeatingSubstr not in strEncoding:
                    strEncoding[repeatingSubstr] = self.getEncoding(repeatingSubstr, strEncoding)
                repeatingSubstrEncoding = strEncoding[repeatingSubstr]
                candidateEncoding = str(numRepeating) + '[' + repeatingSubstrEncoding + ']'
                if len(candidateEncoding) < len(encoding):
                    encoding = candidateEncoding
            # If not self-repating, split string into substrings, 
            # such as 'abbbabbbc' -> strEncoding(abbbabbb) + strEncoding(c)
            else:
                for i in range(1, len(s)):
                    leftSubstr = s[: i]
                    rightSubstr = s[i :]
                    if leftSubstr not in strEncoding:
                        strEncoding[leftSubstr] = self.getEncoding(leftSubstr, strEncoding)
                    leftSubstrEncoding = strEncoding[leftSubstr]
                    if rightSubstr not in strEncoding:
                        strEncoding[rightSubstr] = self.getEncoding(rightSubstr, strEncoding)
                    rightSubstrEncoding = strEncoding[rightSubstr]
                    if len(leftSubstrEncoding) + len(rightSubstrEncoding) < len(encoding):
                        encoding = leftSubstrEncoding + rightSubstrEncoding
            return encoding
    

  • 0
    A

    Just in case that someone is curious about the complexity of find(). It is O(n) with KMP algorithm for string matching.

    Similarly, by making use of the prefixTable from the KMP algorithm, we can also check whether a string is self-repeating such as 'abbbabbbcabbbabbbc' or 'abcabcabc' with:

        def isSelfRepeating(self, str):
            table = getPrefixTable(str)
            return table[-1] % (len(str) - table[-1]) == 0 and table[-1] > 0
    
        def getPrefixTable(self, needle):
            prefixIdx = 0
            table = [0] * len(needle)
            for suffixIdx in range(1, len(needle)):
                # Importantly, while s[i] fails in matching current char, i = table[i - 1].
                while prefixIdx > 0 and needle[prefixIdx] != needle[suffixIdx]:
                    prefixIdx = table[prefixIdx - 1]
                if needle[prefixIdx] == needle[suffixIdx]:
                    prefixIdx += 1
                table[suffixIdx] = prefixIdx
            return table
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.