O(n*log2(n)) time and O(1) space

  • 5

    The value space S is {1,2,3..n}. Our array nums has cardinality n+1, so by the pigeonhole principle we must have at least one duplicate.

    This solution is inspired on binary search. We're going to look at log2(n) order statistics of S, starting with the expected median and then recursing left or right.

    In S, given median m, we have m values smaller than or equal to m and m - 1 values bigger than m. In nums we have a single value v that is repeated twice or more. By definition, either v <= m or v > m. So in nums we'll either have more than m values smaller than or equal to m, or more than m - 1 values greater than m. We can apply this recursively.

    For example, for S = {1,2,3,4,5} and nums = [3, 2, 4, 4, 1, 5]:

    median = 3
    3 items are smaller than or equal to 3 (as in S)
    3 items are greater than 3 (one more than in S)

    in S we have 2 values > 3, in m we have 3. The repeated value will be on the right. Now we recurse:

    Next order statistic = (3 + 5) / 2 = 4
    5 items are smaller than or equal to 4 (one more than in S)
    1 item is greater than 4 (as in S)

    We can't recurse any further, because there is no more value between 3 and 4 in S. Therefor, the current value of m must be the repeated value v.

    In code:

    class Solution(object):
        def findDuplicate(self, nums):
            :type nums: List[int]
            :rtype: int
            l, h, m = 0, len(nums) - 1, len(nums) >> 1
            while h - l > 1:
                if sum(1 for i in nums if i <= m) > m:
                    h = m
                    l = m
                m = (h + l + 1) >> 1
            return m

    This ran in 64ms, has time complexity O(n*log2(n)) and space complexity O(1) (I'm not nitpicking about Python versions; I assume the comprehension in sum to be a generator). I saw the cycle detection solution as posted earlier, which is faster and really very elegant. But if it took Knuth 24 hours to come up with a O(n) solution, I'm not feeling bad about coming up with this O(nlogn) cutie :-)

  • 0

    If n is even then numbers greater than the median will be m (not m-1), for example for 1,2,3,4 - The median is 2 and numbers greater than 2 are 2 (3 and 4)

  • 0

    Actually, the median of an even number of values is usually defined as the mean of the middle two. I need an order statistic here though, and I use the higher of the two middle values for convenience (because (n+1) >> 1 evaluates to the middle element if n is uneven).

Log in to reply

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.