5-line C++ solution no KMP with detailed time complexity analysis O(n^(1+1/loglog(n))) or o(n^(1+e))


  • 0

    As you may know using KMP to pre-calculate the longest length L of proper prefix which is also a suffix costs O(n) time and O(n) space, and the answer for this problem will simply be L && n%(n-L)==0.

    Without using KMP, we can implement a more straightforward algorithm with a little more time spending. As many has posted the solution to check whether s has a proper prefix which is also a suffix with length L such that n%(n-L)==0 simply using straightforward string comparison.

        bool repeatedSubstringPattern(string s) {
          for (int i = 1, n = s.size(), sqt = sqrt(n); i <= sqt; ++i)
            if (n%i == 0)
              for (int L : {n-i, n-n/i}) 
                // O(n) space or char-wise comparison for O(1) space
                if (L && s.substr(n-L) == s.substr(0, L)) return true;
    
          return false;        
        }
    

    The interesting question is what is the "a little more" time spending? The loop index i goes up to sqrt(n) to check substrings of length n-n%i only for n%i==0, so the time complexity will be

    • O(√n + Σd|n (n-d)) ≤ O(√n + Σd|n n) = O(nd(n))

    where d(n) is the divisor function of n (number of divisors). So we got the upper asymptotic bound.

    For lower asymptotic bound, note that any proper divisor d ≤ n/2, so

    • O(√n + Σd|n (n-d)) ≥ O(Σd|n n/2) = O(nd(n)).

    Function d(n) has upper asymptotic bound O(n1/loglog(n)), so the overall time complexity of the problem is

    • O(n1+1/loglog(n)) or o(n1+e) for any small e > 0,

    which means even though the KMP-less method is slower than KMP, but it is just infinitely tiny slower for worst case :-)


  • 0

    You used O(n) as the cost of the substring comparisons, right? But their cost isn't n, it's L. If you consider that, does it lead to an even better complexity?


  • 0

    @StefanPochmann Yes, actually, I tried to use a tighter bound L for cost of s.substr(n-L) == s.substr(0, L), which will lead to overall time complexity of

    • O(nd(n) - σ(n)), where σ(n) = Σd|n d is the order 1 divisor function (sum of divisors).

    This is indeed tighter than O(nd(n)). But I don't have a compact growth rate estimate for nd(n) - σ(n).

    My intuitive guess is that nd(n) - σ(n) will be pretty close to nd(n) since the largest proper divisor of n is at most n/2. I tried n = 2k which leads to nd(n) - σ(n) = O(nlog(n)).


  • 0

    Ah, never mind, I hadn't thought it through. L is at least n/2, so it doesn't matter.


  • 0
    A

    @zzg_zzm Unfortunately this is not O(1) space, since every substr() call creates a copy of the original string, which is O(n) space at least.
    To achieve O(1) space, you have to use two pointers to do the comparison.


  • 0

    @ayuanx You are right. I have updated the post with comments. For O(1) space, it needs to do directly the char-wise comparison

    for (int j = 0; j < L && s[j] == s[j+n-L]; ) if (++j == L) return true;
    

    to save the substr() call in s.substr(n-L) == s.substr(0, L) since the algorithm does not have to make copies of chars.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.