Intuitive Python O(log (m+n)) solution, by kth smallest in the two sorted arrays, 252ms


  • 52
    C

    The idea is in the comment:

    def findMedianSortedArrays(self, A, B):
        l = len(A) + len(B)
        if l % 2 == 1:
            return self.kth(A, B, l // 2)
        else:
            return (self.kth(A, B, l // 2) + self.kth(A, B, l // 2 - 1)) / 2.   
        
    def kth(self, a, b, k):
        if not a:
            return b[k]
        if not b:
            return a[k]
        ia, ib = len(a) // 2 , len(b) // 2
        ma, mb = a[ia], b[ib]
        
        # when k is bigger than the sum of a and b's median indices 
        if ia + ib < k:
            # if a's median is bigger than b's, b's first half doesn't include k
            if ma > mb:
                return self.kth(a, b[ib + 1:], k - ib - 1)
            else:
                return self.kth(a[ia + 1:], b, k - ia - 1)
        # when k is smaller than the sum of a and b's indices
        else:
            # if a's median is bigger than b's, a's second half doesn't include k
            if ma > mb:
                return self.kth(a[:ia], b, k)
            else:
                return self.kth(a, b[:ib], k)

  • 0
    B

    I am learning python and here's my accepted iterative version using the kth smallest logic.

    ~175 ms

    Comments welcome.

    class Solution:
    def findMedianSortedArrays(self, a, b):
    	#finding kthe smallest when k = (a+b)//2 depending on whether its odd or even
    	n = len(a)+len(b)
    	if n&1:
    		return self.kthSmallest(a,b,n//2+1)
    	else:
    		return (self.kthSmallest(a,b,n//2+1) + self.kthSmallest(a,b,n//2))/2.0
    		
    def kthSmallest(self,a,b,k):
    	if len(a)+len(b) < k:
    		return None
    	i=0
    	j=0
    	flag = ''
    	# true for a and false for b
    	while k>0:
    		if i >= len(a):
    			j+=1
    			flag = False
    		elif j >= len(b):
    			i+=1
    			flag = True
    		elif a[i] <= b[j]:
    			i+=1
    			flag = True
    		elif a[i] > b[j]:
    			j+=1
    			flag = False
    		k-=1
    
    	if flag:
    		return a[i-1]
    	else:
    		return b[j-1]
    

  • 0
    B

    Seems your solution is O(m+n)?


  • 0
    B

    Awesome solution!


  • 0
    B

    I am afraid no. Its actually o(k).


  • 0
    R

    since k = (m+n) / 2. your answer is o(m+n), it does not meet the criterion.


  • 1
    R

    Your answer is very enlightening, gracefully taking advantage of devide and conquer method.
    The time complexity is o(log(n) +log(m)). Great job.


  • 6
    B

    The slice operation for list is o(k). So as you are using a[:ia] & a[ia+1:], these operations are really o(n)/o(m). Thus you cannot use slice operations. Passing start/end index is the solution. Please refer to this link: https://wiki.python.org/moin/TimeComplexity


  • 4
    D

    My solution without slicing.

    class Solution:
    # @param {integer[]} nums1
    # @param {integer[]} nums2
    # @return {float}
    def find(self, nums1, s1, e1, nums2, s2, e2, k):
        if e1 - s1 < 0:
            return nums2[k + s2]
        if e2 - s2 < 0:
            return nums1[k + s1]
        if k < 1:
            return min(nums1[k + s1], nums2[k + s2])
        ia, ib = (s1 + e1) // 2 , (s2 + e2) // 2
        ma, mb = nums1[ia], nums2[ib]
        if (ia - s1) + (ib - s2) < k:
            if ma > mb:
                return self.find(nums1, s1, e1, nums2, ib + 1, e2, k - (ib - s2) - 1)
            else:
                return self.find(nums1, ia + 1, e1, nums2, s2, e2, k - (ia - s1) - 1)
        else:
            if ma > mb:
                return self.find(nums1, s1, ia - 1, nums2, s2, e2, k)
            else:
                return self.find(nums1, s1, e1, nums2, s2, ib - 1, k)
    
    def findMedianSortedArrays(self, nums1, nums2):
        l = len(nums1) + len(nums2)
        if l % 2 == 1:
            return self.find(nums1, 0, len(nums1) - 1, nums2, 0, len(nums2) - 1, l // 2)
        else:
            return (self.find(nums1, 0, len(nums1) - 1, nums2, 0, len(nums2) - 1, l // 2) + self.find(nums1, 0, len(nums1) - 1, nums2, 0, len(nums2) - 1, l // 2 - 1)) / 2.0
    

  • 0
    A

    Here's use the start and end position to avoid slice

    def findMedianSortedArrays(self, num1, num2):
        # start from 0
        def kth(num1, num2, k, s1, e1, s2, e2):
            if s1 >= e1:
                return num2[s2+k]
            if s2 >= e2:
                return num1[s1+k]
            if k == 0:
                return min(num1[s1], num2[s2])
            
            m = (e1+s1)/2
            n = (e2+s2)/2
            
            if (m-s1)+(n-s2) < k:
                if num1[m] < num2[n]:
                    return kth(num1, num2, k-(m-s1)-1, m+1, e1, s2, e2)
                else:
                    return kth(num1, num2, k-(n-s2)-1, s1, e1, n+1, e2)
            else: # m+n >= k, m+n cover k+1 element so we can reduce larger half
                if num1[m] < num2[n]:
                    return kth(num1, num2, k, s1, e1, s2, n)
                else:
                    return kth(num1, num2, k, s1, m, s2, e2)
    
        m = len(num1)
        n = len(num2)
        length = m+n
        if length % 2 == 0:
            return (kth(num1, num2, length/2-1, 0, m, 0, n) + kth(num1, num2, length/2, 0, m, 0, n))/2.0
        return kth(num1, num2, length/2, 0, m, 0, n)

  • 0
    I

    Can you explain the logic behind the kth smallest function?


  • 0
    X

    @IWantToPass function "kth(A, B, k)" is meant to find the kth smallest number in sorted([A.append(B)])

    If you think in this way, the logic becomes really clear, right?


  • 0
    P

    Hey I think this solution is really smart but it's not very easy to understand or intuitive. There are small details of choices that's hard to understand. For example,

    1. why did you choose to use k = (l //2) and ( l//2 -1) instead of (l //2) and ( l//2 +1) when len(A)+len(B) is even
    2. the indexing below is very smart, I wouldn't be able to come up with it at the first try until I realize the recursion will be infinite if I dont have the plus one.
      return self.kth(a, b[ib + 1:], k - ib - 1)
      return self.kth(a[ia + 1:], b, k - ia - 1)

    I'm just trying to understand it better so I don't forget about it during interviews.
    Thank you for explaining in advance!!


  • 6
    O

    @pjerryhu I am also trying to figure this also ha! It's also not that intuitive but seems vary elegant to me. To answer your questions.

    The reason for using (l //2) and ( l//2 +1) is simply because the sub function kth is getting the element at index k in his context, which would be the k+1-th small element in A+B. You can see it from when l = len(A)+len(B) is odd, he is getting the element at index l/2.

    Some notes from my understanding, please feel free to correct me.

    ia, ib = len(a) / 2 , len(b) / 2
    a[ia] = ma, b[ib] = mb
    supposed all the elements that are at left of ma are la, right of ma are ra , same for lb, rb

    • if ia + ib < k: means the k-th element still exist in some larger part of the array
      • if ma < mb => la < ma < mb: solution can't exist in la
      • if ma > mb => lb < mb < ma: solution can't exist in lb
      • since we are deleted some smaller part, original we are seeking for the k-th, now we are seeking for the k - (len(smaller part)) th in the remaining two array
    • if ia + ib > k: means the k-th element still exist in some smaller part of the array
      • if ma < mb => ma < mb < rb: solution can't exist in rb
      • if ma > mb => mb < ma < ra: solution can't exist in la
      • since we are deleted some larger part, we are still finding the k-th element in the array

    So personally I don't think the two indexing part you mentioned are supposed to ensure the recursion terminate. It's simply how the algorithm goes.


  • 0
    P

    @chungyushao Hey, a small note on your first point, OP used (l//2) and (l//2-1) instead of plus one. But your explanation still holds.

    Secondly, your understanding is very solid. Thank you for sharing that. I think the genius part is that OP used the plus one and minus one in the case where (ia+ib < k) i.e. index+1 and k-index-1. This ensures even if the "la" and "lb" in your case is null, it's still shrinking the size of the arrays.


  • 0

    This solution is elegant, but I have trouble understanding why he can ignore the case when (ma + mb == k). I know that in this case, the "else" part of the condition statement will execute. Yet, in that piece of code, it will eliminate the median in both A and B. How can he predict that k is not at the median of a or b?

    Oh. When I typed here, I realized it. As long as both A and B have elements, k is not possible to be at the median of either A or B. Ha! Smart!


  • -1
    A

    The time complexity is o(log(n) +log(m)) , not o(log(m+n))
    log(n) +log(m) = log(mn) != log(m+n)


  • 0

    @Andy_Guo said in Intuitive Python O(log (m+n)) solution, by kth smallest in the two sorted arrays, 252ms:

    o(log(n) +log(m)) , not o(log(m+n))

    Those are the same. Also, @roger3 did say it's o(log(n) +log(m)).

    Btw, why are you guys talking about little-o instead of big-O?


  • 0
    B

    I still don't understand why your code can pass the test.
    You make k equal to (m+n)/2 and iterate for k times, I think the time complexity is o(m+n).
    Am I wrong ?


  • 0
    G

    @ManuelP But why O(log(m+n)) is the same with O(log(m)+log(n))??? I am a little confused here.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.