A note about the complexity analysis of the binary search solutions


  • 0
    A

    Before I goes into this, take a look of the binary search solutions, and understand them before you proceed reading.
    All of these solutions (with minor variations) goes like this:

    • Get the low bound value by choosing the max number from the array.
    • Get the high bound value by summing all the numbers of the array.
    • Perform a binary search between those values, and check if the solution is a valid split.

    The thing is, this binary search complexity is dependant on the actual values within the array, and not on the count of elements of the array.
    So, now you may be thinking, the complexity of the binary search phase (bin search + array check) should be: O (log(high_bound - low_bound) * n) , right?
    Well, not exactly.
    Let M = high_bound - low_bound.
    In computer science, a complexity analysis is dependent on the length of the input. So actually if M is of length D bits, M = O(2^D).
    So, the run time of the binary search phase is actually: O(log(M)*n) = O(log(2^D)n) = O(Dn).
    In the real world, this algorithm is still quite efficient (since the max number of bits in an integer is usually 32/64), but you should still be aware of how a correct complexity analysis is performed.

    Further reading: Pseudo-polynomial time


  • 0

    I don't think anybody thinks the binary search phase is O(log(sum - max)). It's O(log(sum - max) ⋅ n), where n is the number of elements.


  • 0
    A

    @StefanPochmann
    I meant by "binary search phase" to be only the binary search (without the array check after each search).
    I fixed my explanation to be more clear - now includes the array check time.


  • 0

    @apelz First you say it's "not exactly" O(log(high_bound - low_bound) * n) and then you say it "is actually: O(log(M)*n)", but that's the same. So is it or is it not?


  • 0
    A

    @StefanPochmann it depends how you view "high_bound"/"low_bound" - if you view them as length of bits in the input (as I defined), so it's the same.
    The point is, you can't say (formally) the complexity of the algorithm is: O(log(high_bound - low_bound) * n) because it is meaningless.
    The correct way to formulate the complexity is in terms of length of input, meaning D and n, as I defined them.
    This is a subtle point, but still I thought worth mentioning.


  • 0

    @apelz said in A note about the complexity analysis of the binary search solutions:

    you can't say (formally) the complexity of the algorithm is: O(log(high_bound - low_bound) * n)

    We definitely can. It's correct. And meaningful.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.