Short Python


  • 20

    Either don't encode s at all, or encode it as one part k[...] or encode it as multiple parts (in which case we can somewhere split it into two subproblems). Whatever is shortest. Uses @rsrs3's nice trick of searching s in s + s.

    def encode(self, s, memo={}):
        if s not in memo:
            n = len(s)
            i = (s + s).find(s, 1)
            one = '%d[%s]' % (n / i, self.encode(s[:i])) if i < n else s
            multi = [self.encode(s[:i]) + self.encode(s[i:]) for i in xrange(1, n)]
            memo[s] = min([s, one] + multi, key=len)
        return memo[s]

  • 2
    A

    Curious how i = (s + s).find(s, 1) accomplishes finding the optimal repeating x for a "k[x]" encoding of s. Any explanation of this?

    J.K. should've read @rsrs3 's post, nice solution though :)


  • 1

    great solution! I may have a naive question but do you mean by ...

    multi = [self.encode(s[:i], memo) + self.encode(s[i:], memo) for i in xrange(1, n)]

    also would you mind explaining the complexity? Thanks


  • 0

    @amaximov said in Short Python:

    Curious how i = (s + s).find(s, 1) accomplishes finding the optimal repeating x for a "k[x]" encoding of s. Any explanation of this?

    J.K. should've read @rsrs3 's post, nice solution though :)

    I had a post to prove why condition (L = (s + s).find(s, 1)) < s.size() is equivalent to s has the shortest repetitive substring of length L. (let N = s.size())

    Basically, if you consider function f(i) = s[i%N] from int to char which obviously has a positive period N, then the key is that L = (s + s).find(s, 1)) actually indicates L is the minimum positive period of f. Note that we always have i = L*(i/L) + i%L for any i, so

    1. f(i) = f(i%L) since L*(i/L) is an integral multiple of L.

    Also, N = L*(N/L)+N%L, so f(i) = f(i+N) = f(i+N%L), which means N%L (<L) is also a period of f. But L is the minimum by string::find(s,1) definition (i.e., first occurrence of match after 0), so we must have

    1. N%L = 0.

    Combining equations 1 and 2 proves that s.subtr(0,L) must be the shortest repetitive substring of s.


  • 0
    V

    What would be the time complexity of the program? With memoization is it O(n^3)?


  • 0
    Y

    @vignesh6v I am thinking it is O(n^3) because there are O(n^2) possible substrings, each of which takes O(n) time to construct. Would like to hear others thoughts though.


  • 0
    S

    I think worst case complexity is O(n^3).

            multi = [self.encode(s[:i]) + self.encode(s[i:]) for i in xrange(1, n)]
    

    This line basically runs over every substring of s. There are total of O(n^2) substrings, so this loop runs for O(n^2).

            i = (s + s).find(s, 1)
    

    Is O(n) which in worst case happens for every substring of s (O(n^2))

    Memoization prevents exponential run time, some substrings may repeat but memo will quickly end the recursion

    So all in all O(n^3) complexity....correct me if wrong.


  • 0
    L

    @roc571 no need to add memo to each recursive call. The dictionary is created when the function encode is created, and the default value of the memo argument is the reference/id/pointer/address to that dictionary.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.