Solution by awice


  • 0

    Approach #1: Brute Force [Accepted]

    Intuition

    We'll find the median of the concatenation of both arrays.

    Algorithm

    Concatenate both arrays into one large array arr. Recall that we can find a median after sorting by choosing the middle element if the array's length is odd, and the average of the middle elements if the array's length is even.

    Python

    class Solution(object):
        def findMedianSortedArrays(self, nums1, nums2):
            def median(arr):
                arr.sort()
                N = len(arr)
                if N % 2:
                    return arr[N/2]
                else:
                    return (arr[N/2 - 1] + arr[N/2]) / 2.0
    
            return median(nums1 + nums2)
    

    Complexity Analysis

    • Time Complexity: Let $$M, N$$ be the lengths of the given arrays. Concatenating them takes $$O(M+N)$$ work, and sorting takes $$O( (M+N) \log (M+N) )$$ work. The second factor dominates, so the total complexity is $$O( (M+N) \log (M+N) )$$.

    • Space Complexity: We need $$O(M + N)$$ space to store the new array. Sorting is in-place, so it doesn't require additional space.


    Approach #2: Naive Divide And Conquer [Accepted]

    Intuition

    We want to use the fact that the given arrays are sorted to help us find a more efficient algorithm, since we did not use that fact in our brute force solution.

    If we erased some small elements from the start of the arrays, we could erase an equal number of large elements from the end of the arrays. As long as we don't erase elements that contribute to the median, then the answer for the remaining arrays will be the same.

    Algorithm

    Let's firm up our intuition. Suppose we are currently considering intervals [i1, j1] of nums1 and [i2, j2] of nums2.

    What do we learn when we know nums1[m1] <= nums2[m2]? Because the arrays are sorted, the m1-1 elements in nums1 occurring before nums1[m1] are smaller than nums1[m1], and the len(nums2) - m2 - 1 elements in nums2 occurring after nums2[m2] are larger than nums2[m2]. We could erase an equal number of these elements and it wouldn't change the median.

    If we chose m1, m2 to be the midpoints of [i1, j1], [i2, j2], then we would be erasing about half of min(j1-i1, j2-i2) points at a time. However, if say i1 and j1 are close together, this wouldn't be sufficient.

    Fortunately, we have another insight. Say one array is size 1000, and the other is size 100. At the very beginning, we could remove roughly 900 elements from the first array - 450 small elements and 450 big ones. This is because adding 100 elements from the second array would not move the relative position of the median in the first array by more than 50 places. This is what our radius code does.

    Then, because of our radius calculation, we are guaranteed that j1 - i1 and j2 - i2 are close together, so we will be reducing both intervals by around half each time.

    Algorithm

    class Solution(object):
        def findMedianSortedArrays(self, A, B):
            if len(A) < len(B):
                A, B = B, A
    
            radius = max(0, (len(A) - len(B)) // 2 - 1)
            i1, j1, i2, j2 = radius, len(A)-1 - radius, 0, len(B)-1
    
            while i1 +3 < j1 and i2 +3 < j2:
                m1 = i1+j1 >> 1
                m2 = i2+j2 >> 1
                if A[m1] <= B[m2]:
                    delta = min(m1 - i1, j2 - m2) - 1
                    i1 += delta
                    j2 -= delta
                else:
                    delta = min(m2 - i2, j1 - m1) - 1
                    i2 += delta
                    j1 -= delta
    
            def median(nums):
                nums.sort()
                N = len(nums)
                return (nums[N // 2] + nums[(N - 1) // 2]) / 2.0
    
            return median(A[i1: j1+1] + B[i2: j2+1])
    

    Complexity Analysis

    • Time Complexity: Let $$M, N$$ be the lengths of the given arrays. The width of our interval is min(M, N), and we decrease it by roughly half each time, which takes $$\log(\min(M, N))$$ steps. Taking the median at the end of a few integers is negligible. In total, the complexity is $$\log(\min(M, N))$$.

    • Space Complexity: We need $$O(1)$$ additional space to store the pointers we are using.


    Approach #3: Binary Search by Position [Accepted]

    Intuition

    What makes taking medians hard is that the answer changes slightly depending on if there are an odd or even number of elements.

    Instead of looking at numbers, let's look at positions. An array of length N has 2N+1 positions: a position at each number, between adjacent numbers, before all numbers, and after all numbers. For example, [1, 2, 3] has 7 positions: [*, 1, *, 2, *, 3 *].

    The motivation for this is that we can now succinctly compare unequal halves of an array cut at p. Given some position p in array A, the left-neighbor of p is A[(p-1) // 2], and the right-neighbor of p is A[p // 2].

    Investigation

    With this idea in hand, let's investigate further. Some position q of B breaks B into two halves: elements at or to the left of the left-neighbor, and elements at or to the right of the right-neighbor. The corresponding position p = len(A) + len(B) - q of A will break A into halves, and the total number of elements in the left halves of A and B will be equal to the total number of elements in the right halves of A and B.

    Indeed, there are (q-1)//2 + 1 elements in the left half of B, and len(B) - (q // 2) elements on the right half. Similarly, there are (len(A) + len(B) - q - 1)//2 + 1 elements in the left half of A, and len(A) - (len(A) + len(B) - q) // 2 in the right half of A. With some algebra and case work, we can prove these are equal.

    Algorithm

    Returning to the problem, if we have broken A and B by some cut, into halves of an equal number of elements, represented by neighbors L1, R1 and L2, R2, this cut represents the median if $$\max(L_1, L_2) \leq \max(R_1, R_2)$$.

    Without loss of generality, say len(B) $$\leq$$ len(A). We will binary search for the correct cut of B (with corresponding cut for A) that represents the median. The interval under consideration for what the cut could be is [lo, hi]. If L1 > R2, then we needed to increase the right-neighbor of B's cut - ie., lo = q+1, and vice-versa.

    class Solution(object):
        def findMedianSortedArrays(self, A, B):
            if len(A) < len(B):
                A, B = B, A
    
            lo, hi = 0, len(B) * 2
            while lo <= hi:
                q = lo + hi >> 1
                p = len(A) + len(B) - q
                L1 = A[p-1 >> 1] if p else float('-inf')
                R1 = A[p >> 1] if p < 2*len(A) else float('inf')
                L2 = B[q-1 >> 1] if q else float('-inf')
                R2 = B[q >> 1] if q < 2*len(B) else float('inf')
                if L1 > R2:
                    lo = q + 1
                elif L2 > R1:
                    hi = q - 1
                else:
                    return (max(L1, L2) + min(R1, R2)) / 2.0
    

    Complexity Analysis

    • Time Complexity: Let $$M, N$$ be the lengths of the given arrays. The width of our interval is 2 * min(M, N), and we decrease it by roughly half each time, which takes $$\log(\min(M, N))$$ steps. Taking the median at the end of a few integers is negligible. In total, the complexity is $$\log(\min(M, N))$$.

    • Space Complexity: We need $$O(1)$$ additional space to store the pointers we are using.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.