A little more than just Binary Search


  • 1

    Binary search the correct index (from which are the papers counted to H-index) came first in my head:

    class Solution(object):
        def hIndex(self, citations):
            # Binary Search using number of papers
            n = len(citations)
            l = 0; r = n
            while l < r:
                m = (l+r)/2
                if citations[m] == n-m: return n-m
                elif citations[m] < n-m: l = m+1
                else: r = m
            return n-l
    

    That one takes 124ms.
    Then I thought, wait, we can also binary guess the correct H-index by doing this:

    class Solution(object):
        def hIndex(self, citations):
            # Binary Search on paper's highest citation: binary guess the h-index
            n = len(citations)
            if n == 0: return 0
            l = 0; r = min(citations[-1], n)
            while l < r:
                m = l + r - (l+r)/2
                if citations[n-m] == m: return m
                elif citations[n-m] < m: r = m-1
                else: l = m
            return l
    

    The second one took 120ms.
    Now, do you see how we can optimize the solution a little bit more? For example, when there are many papers with low-index (The second solution does this better than the first one)? Or, when there are very few papers but with high H-index (The first solution handles this better than the second one)?

    Yes, you're right: combining them! Here is the final solution (which is a little faster than the 2 first ones with 112ms and beats 95% submissions, but when you add more variations to the dataset, it will make a lot of difference):

    class Solution(object):
        def hIndex(self, citations):
            n = len(citations)
            if n == 0: return 0
            
            if citations[-1] > n:
                # Binary Search on number of papers: binary search the index
                l = 0; r = n
                while l < r:
                    m = (l+r)/2
                    if citations[m] == n-m: return n-m
                    elif citations[m] < n-m: l = m+1
                    else: r = m
                return n-l
            
            else:
                # Binary Search on paper's highest citation: binary guess the h-index
                l = 0; r = citations[-1]
                while l < r:
                    m = l + r - (l+r)/2
                    if citations[n-m] == m: return m
                    elif citations[n-m] < m: r = m-1
                    else: l = m
                return l
    

    0_1469395339958_Screen Shot 2016-07-24 at 2.11.39 PM.png


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.