Clean iterative solution with two binary searches (with explanation)


  • 269
    S

    The problem can be simply broken down as two binary searches for the begining and end of the range, respectively:

    First let's find the left boundary of the range. We initialize the range to [i=0, j=n-1]. In each step, calculate the middle element [mid = (i+j)/2]. Now according to the relative value of A[mid] to target, there are three possibilities:

    1. If A[mid] < target, then the range must begins on the right of mid (hence i = mid+1 for the next iteration)
    2. If A[mid] > target, it means the range must begins on the left of mid (j = mid-1)
    3. If A[mid] = target, then the range must begins on the left of or at mid (j= mid)

    Since we would move the search range to the same side for case 2 and 3, we might as well merge them as one single case so that less code is needed:

    2*. If A[mid] >= target, j = mid;

    Surprisingly, 1 and 2* are the only logic you need to put in loop while (i < j). When the while loop terminates, the value of i/j is where the start of the range is. Why?

    No matter what the sequence originally is, as we narrow down the search range, eventually we will be at a situation where there are only two elements in the search range. Suppose our target is 5, then we have only 7 possible cases:

    case 1: [5 7] (A[i] = target < A[j])
    case 2: [5 3] (A[i] = target > A[j])
    case 3: [5 5] (A[i] = target = A[j])
    case 4: [3 5] (A[j] = target > A[i])
    case 5: [3 7] (A[i] < target < A[j])
    case 6: [3 4] (A[i] < A[j] < target)
    case 7: [6 7] (target < A[i] < A[j])
    

    For case 1, 2 and 3, if we follow the above rule, since mid = i => A[mid] = target in these cases, then we would set j = mid. Now the loop terminates and i and j both point to the first 5.

    For case 4, since A[mid] < target, then set i = mid+1. The loop terminates and both i and j point to 5.

    For all other cases, by the time the loop terminates, A[i] is not equal to 5. So we can easily know 5 is not in the sequence if the comparison fails.

    In conclusion, when the loop terminates, if A[i]==target, then i is the left boundary of the range; otherwise, just return -1;

    For the right of the range, we can use a similar idea. Again we can come up with several rules:

    1. If A[mid] > target, then the range must begins on the left of mid (j = mid-1)
    2. If A[mid] < target, then the range must begins on the right of mid (hence i = mid+1 for the next iteration)
    3. If A[mid] = target, then the range must begins on the right of or at mid (i= mid)

    Again, we can merge condition 2 and 3 into:

    2* If A[mid] <= target, then i = mid;
    

    However, the terminate condition on longer works this time. Consider the following case:

    [5 7], target = 5
    

    Now A[mid] = 5, then according to rule 2, we set i = mid. This practically does nothing because i is already equal to mid. As a result, the search range is not moved at all!

    The solution is by using a small trick: instead of calculating mid as mid = (i+j)/2, we now do:

    mid = (i+j)/2+1
    

    Why does this trick work? When we use mid = (i+j)/2, the mid is rounded to the lowest integer. In other words, mid is always biased towards the left. This means we could have i == mid when j - i == mid, but we NEVER have j == mid. So in order to keep the search range moving, you must make sure the new i is set to something different than mid, otherwise we are at the risk that i gets stuck. But for the new j, it is okay if we set it to mid, since it was not equal to mid anyways. Our two rules in search of the left boundary happen to satisfy these requirements, so it works perfectly in that situation. Similarly, when we search for the right boundary, we must make sure i won't get stuck when we set the new i to i = mid. The easiest way to achieve this is by making mid biased to the right, i.e. mid = (i+j)/2+1.

    All this reasoning boils down to the following simple code:

    vector<int> searchRange(int A[], int n, int target) {
        int i = 0, j = n - 1;
        vector<int> ret(2, -1);
        // Search for the left one
        while (i < j)
        {
            int mid = (i + j) /2;
            if (A[mid] < target) i = mid + 1;
            else j = mid;
        }
        if (A[i]!=target) return ret;
        else ret[0] = i;
        
        // Search for the right one
        j = n-1;  // We don't have to set i to 0 the second time.
        while (i < j)
        {
            int mid = (i + j) /2 + 1;	// Make mid biased to the right
            if (A[mid] > target) j = mid - 1;  
            else i = mid;				// So that this won't make the search range stuck.
        }
        ret[1] = j;
        return ret; 
    }

  • 1
    N

    The explanation and answer are both wonderful !


  • 10
    C

    int mid = (i + j + 1) /2;

    for the second search may be a little better.


  • 0
    S

    @cfjmonkey. Yes you are right. Thank you for pointing it out!


  • 1
    B

    The explanation is detailed and fantastic, learned a lot!
    However, it would be better to compute mid with i + (j - i) / 2 to avoid possible int overflows. According to Wikipedia, this is a bug that has existed in Java SDK at Arrays.binarySearch() from 1.2 to 5.0.


  • 0
    L

    It confused me why int mid = (i + j + 1) /2; for the right index searching is better. How is it possible make a bias to right can improve time efficiency?


  • 2
    S

    @lym5523 Of course the right-biased mid does NOT improve time efficiency. Why would you think it was used for that purpose? It is NOT "better" either. I chose to do that simply because given the way the logic is implemented, the mid HAS to be biased to the right or the loop will enter an infinite cycle. I prefer this implementation because the first and second half look mathematically symmetrical, not because it has an advantage in running time.


  • 0
    V

    Your condition 2* for finding left boundary is wrong in explanation. If you combine conditions 2 & 3 it should have been If A[mid] >= target, j = mid;


  • 0
    S

    @venkata_sundara Thank you for pointing that out. I'll correct it accordingly.


  • 9
    S

    use the standard library you can do this like this :

       class Solution {
                public:
            vector<int> searchRange(vector<int>& nums, int target) {
                
                auto lit=lower_bound(nums.begin(),nums.end(),target);
                auto rit=upper_bound(nums.begin(),nums.end(),target);
                --rit;
                if(*lit!=target||(*rit)!=target)
                    return {-1,-1};
                return {lit-nums.begin(),rit-nums.begin()};
            }
        };

  • 1
    S

    I'd definitely go for this solution in a real-world application; it is concise, uses STL and runs in O(logN) time. However, it is likely that this solution is not acceptable as-is by the interviewer, since it does not show your ability (or lack thereof) to implement a binary search. He may ask you to implement your own lower_bound / upper_bound function instead.


  • 0
    S

    why do you have to set ret(2,-1) instead of ret(-1,-1) ?


  • 0
    S

    (2, -1) means TWO -1's in the definition of vector. You can use {-1,-1} also but note that you need to use curly braces.


  • 0
    S

    many thanks! seems that I need to revise for CPP first lol


  • 10

    Great solution and explanation! The trick mid = (i + j) / 2 + 1 is damn clever :-)

    Since the C++ function signature of this problem has been updated, I rewrite the code using your idea as follows. BTW, I use the suggested revision of the trick to be (i + j + 1) / 2 of @dfjmonkey :-)

    class Solution {
    public:
        vector<int> searchRange(vector<int>& nums, int target) {
            int l = left(nums, target);
            if (l == -1) return {-1, -1};
            return {l, right(nums, target)};
        }
    private:
        int left(vector<int>& nums, int target) {
            int n = nums.size(), l = 0, r = n - 1;
            while (l < r) {
                int m = l + ((r - l) >> 1);
                if (nums[m] < target) l = m + 1;
                else r = m;
            }
            return nums[l] == target ? l : -1;
        }
        int right(vector<int>& nums, int target) {
            int n = nums.size(), l = 0, r = n - 1;
            while (l < r) {
                int m = l + ((r - l + 1) >> 1); 
                if (nums[m] > target) r = m - 1;
                else l = m;
            }
            return r;
        }
    }; 
    

    Update: we may unify the left and right boundaries using just a single function. In Stefan's post, only the function for the left boundary is used. If we want to find the right boundary of target, we just find the left boundary of target + 1 and subtract it by 1. The code is rewritten below.

    class Solution {
    public:
        vector<int> searchRange(vector<int>& nums, int target) {
            int n = nums.size();
            int l = left(nums, target);
            if (l < 0 || l >= n || nums[l] != target) return {-1, -1};
            return {l, left(nums, target + 1) - 1};
        }
    private:
        int left(vector<int>& nums, int target) {
            int n = nums.size(), l = 0, r = n; 
            while (l < r) {
                int mid = (l + r) / 2;
                if (nums[mid] < target) l = mid + 1;
                else r = mid;
            }
            return l;
        }
    };
    

  • 0

    Nice lower_bound and upper_bound functions. Well, the problem is what @stellari describes :-)


  • 9
    S

    I think your case 2 is incorrect:

    "case 2: [5 3] (A[i] = target > A[j])"

    This case cannot occur(3 cannot appear after 5) because the array is sorted. Please let me know if I have misunderstood.


  • 1
    Z

    Actually, I think to get j, it should be mid = (i+j+1)/2 instead of (i+j)/2+1.


  • 0
    X

    If target is bigger than all in nums, lower_bound() and upper_bound() returns nums.end().

    It's not right to use *lit and *rit directly.

    If target is smaller than all in nums, upper_bound() returns nums.begin().

    Again, it's not right to --rit.

    Here is the improvement:

    vector<int> searchRange(vector<int>& nums, int target) {
            auto low_it = std::lower_bound(nums.begin(), nums.end(), target);
            auto upper_it = std::upper_bound(nums.begin(), nums.end(), target);
            if (low_it != nums.end() && *low_it == target)
                return{ low_it - nums.begin(), upper_it - nums.begin()-1};
            else
                return{ -1, -1 };
    }
    

  • 0
    L

    why set n nums.size() not nums.size()-1?


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.