Short Python


  • 18

    Either don't encode s at all, or encode it as one part k[...] or encode it as multiple parts (in which case we can somewhere split it into two subproblems). Whatever is shortest. Uses @rsrs3's nice trick of searching s in s + s.

    def encode(self, s, memo={}):
        if s not in memo:
            n = len(s)
            i = (s + s).find(s, 1)
            one = '%d[%s]' % (n / i, self.encode(s[:i])) if i < n else s
            multi = [self.encode(s[:i]) + self.encode(s[i:]) for i in xrange(1, n)]
            memo[s] = min([s, one] + multi, key=len)
        return memo[s]

  • 2
    A

    Curious how i = (s + s).find(s, 1) accomplishes finding the optimal repeating x for a "k[x]" encoding of s. Any explanation of this?

    J.K. should've read @rsrs3 's post, nice solution though :)


  • 0

    great solution! I may have a naive question but do you mean by ...

    multi = [self.encode(s[:i], memo) + self.encode(s[i:], memo) for i in xrange(1, n)]

    also would you mind explaining the complexity? Thanks


  • 0

    @amaximov said in Short Python:

    Curious how i = (s + s).find(s, 1) accomplishes finding the optimal repeating x for a "k[x]" encoding of s. Any explanation of this?

    J.K. should've read @rsrs3 's post, nice solution though :)

    I had a post to prove why condition (L = (s + s).find(s, 1)) < s.size() is equivalent to s has the shortest repetitive substring of length L. (let N = s.size())

    Basically, if you consider function f(i) = s[i%N] from int to char which obviously has a positive period N, then the key is that L = (s + s).find(s, 1)) actually indicates L is the minimum positive period of f. Note that we always have i = L*(i/L) + i%L for any i, so

    1. f(i) = f(i%L) since L*(i/L) is an integral multiple of L.

    Also, N = L*(N/L)+N%L, so f(i) = f(i+N) = f(i+N%L), which means N%L (<L) is also a period of f. But L is the minimum by string::find(s,1) definition (i.e., first occurrence of match after 0), so we must have

    1. N%L = 0.

    Combining equations 1 and 2 proves that s.subtr(0,L) must be the shortest repetitive substring of s.


  • 0
    V

    What would be the time complexity of the program? With memoization is it O(n^3)?


  • 0
    Y

    @vignesh6v I am thinking it is O(n^3) because there are O(n^2) possible substrings, each of which takes O(n) time to construct. Would like to hear others thoughts though.


  • 0
    S

    I think worst case complexity is O(n^3).

            multi = [self.encode(s[:i]) + self.encode(s[i:]) for i in xrange(1, n)]
    

    This line basically runs over every substring of s. There are total of O(n^2) substrings, so this loop runs for O(n^2).

            i = (s + s).find(s, 1)
    

    Is O(n) which in worst case happens for every substring of s (O(n^2))

    Memoization prevents exponential run time, some substrings may repeat but memo will quickly end the recursion

    So all in all O(n^3) complexity....correct me if wrong.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.