O(log n) space using recursive generators


  • 7

    There's a paper called A Space-Efficient Algorithm for Calculating the Digit Distribution in the Kolakoski Sequence about an O(log n) space and O(n) time algorithm. I didn't read the paper but wrote my code based on Figure 1 on page 3. I use one generator per level, and to get things going, I hardcode the first three digits caused by the first two digits of the level above (which I therefore ignore).

    def magicalString(self, n):
        def gen():
            for x in 1, 2, 2:
                yield x
            for i, x in enumerate(gen()):
                if i > 1:
                    for _ in range(x):
                        yield i % 2 + 1
        return sum(x & 1 for x in itertools.islice(gen(), n))
    

    I think it's O(log n) space and O(n) time. Here are test results for different n, counting the numbers of generators and yields to measure space and time:

         n      generators  yields
                 (=space)   (=time)
             1         1         1
            10         4        25
           100        10       295
          1000        16      3001
         10000        22     30028
        100000        27    299935
       1000000        33   2999958
      10000000        39  29999888
     100000000        44 300001534
    

    The number of generators is very close to log1.5(n), which makes sense because apparently there are about equally many ones and twos in any prefix of the sequence, so on average one digit in a generator causes 1.5 digits in the generator using it.

    The number of yields is very close to 3n, which also makes sense. The outermost generator yields n times, the generator it uses yields about (2/3)n times, the next inner generator yields about (2/3)2n times, and so on. So the total number of yields is about:
    n ⋅ ∑i=0..∞ (2/3)i = n ⋅ 1/(1-2/3) = 3n

    Here's the testing code:

    import itertools
    
    class Solution(object):
        def magicalString(self, n):
            def gen():
                global generators, yields
                generators += 1
                for x in 1, 2, 2:
                    yields += 1
                    yield x
                for i, x in enumerate(gen()):
                    if i > 1:
                        for _ in range(x):
                            yields += 1
                            yield i % 2 + 1
            return sum(x & 1 for x in itertools.islice(gen(), n))
    
    print '     n      generators  yields'
    print '             (=space)   (=time)'
    for e in range(9):
        n = 10**e
        generators = yields = 0
        Solution().magicalString(n)
        print '%10d' * 3 % (n, generators, yields)
    

  • 0
    L

    @StefanPochmann Thanks for providing this good solution. But it turns out to be slower than other methods. Is that because of the yield or the recursion?


  • 0

    @louis_wang Would be good if you told us how much slower it is. Anyway, I ran it myself now, it gets accepted in about 640ms. The runtime distribution graph shows a hill around 220ms. Python baseline (judge time overhead) is about 40ms. So my solution takes about 600/180 = 3.333... times as long as those others. That's close to the factor 3 I described when I said "The number of yields is very close to 3n". Don't know what's responsible for the remaining small speed difference, really hard to compare such very different codes.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.