AC O(N) solution in Java using buckets with explanation


  • 261
    L

    As a followup question, it naturally also requires maintaining a window of size k. When t == 0, it reduces to the previous question so we just reuse the solution.

    Since there is now a constraint on the range of the values of the elements to be considered duplicates, it reminds us of doing a range check which is implemented in tree data structure and would take O(LogN) if a balanced tree structure is used, or doing a bucket check which is constant time. We shall just discuss the idea using bucket here.

    Bucketing means we map a range of values to the a bucket. For example, if the bucket size is 3, we consider 0, 1, 2 all map to the same bucket. However, if t == 3, (0, 3) is a considered duplicates but does not map to the same bucket. This is fine since we are checking the buckets immediately before and after as well. So, as a rule of thumb, just make sure the size of the bucket is reasonable such that elements having the same bucket is immediately considered duplicates or duplicates must lie within adjacent buckets. So this actually gives us a range of possible bucket size, i.e. t and t + 1. We just choose it to be t and a bucket mapping to be num / t.

    Another complication is that negative ints are allowed. A simple num / t just shrinks everything towards 0. Therefore, we can just reposition every element to start from Integer.MIN_VALUE.

     public class Solution {
        public boolean containsNearbyAlmostDuplicate(int[] nums, int k, int t) {
            if (k < 1 || t < 0) return false;
            Map<Long, Long> map = new HashMap<>();
            for (int i = 0; i < nums.length; i++) {
                long remappedNum = (long) nums[i] - Integer.MIN_VALUE;
                long bucket = remappedNum / ((long) t + 1);
                if (map.containsKey(bucket)
                        || (map.containsKey(bucket - 1) && remappedNum - map.get(bucket - 1) <= t)
                            || (map.containsKey(bucket + 1) && map.get(bucket + 1) - remappedNum <= t))
                                return true;
                if (map.entrySet().size() >= k) {
                    long lastBucket = ((long) nums[i - k] - Integer.MIN_VALUE) / ((long) t + 1);
                    map.remove(lastBucket);
                }
                map.put(bucket, remappedNum);
            }
            return false;
        }
    }
    

    Edits:

    Actually, we can use t + 1 as the bucket size to get rid of the case when t == 0. It simplifies the code. The above code is therefore the updated version.


  • 0
    W

    Hello,your corner case is so brilliant


  • 1
    P

    Is this bucket size must only be t or t + 1?
    if t+2, then one bucket may contain 2 numbers;
    if t-1, then bucket[i] and bucket[i+2] may contain a pair of number, whose distance is t


  • 0
    L

    You are absolutely correct. That was how I reasoned about the bucket size too. :D


  • 0
    W

    Very smart solution!!!


  • -20
    W

    the time complexity should still be O(NlogK)?
    the map put / erase operation is O(logK), totally O(N) times.
    right?


  • 0
    B

    time complexity of put / erase operation in hashmap can be considered as constant time.


  • 8
    S

    May I ask why it's necessary to reposition every element to start from Integer.MIN_VALUE?


  • 0

    It's a hashmap, where these operations should be O(1).


  • 1
    L

    I suppose it is not necessary if you use a different way to do the assignment of buckets. In this case, the elements are assigned to a bucket by a simple division of a bucket size. Hence, it will, for example, assign both -2 and +2 to bucket 0 if the bucket size is > 2. If we want that elements having the same bucket 0 are immediately considered duplicates, such assignment would be undesirable. Hence, a repositioning is used to cope with this. And starting from MIN_VALUE is just because of the constraints on the input range.


  • 0
    G
    This post is deleted!

  • 0
    A

    Feels like returning false whenever t < 0 isn't correct.
    Consider the test case: [-2, -1], 1, -1
    Should return true since -2 - (-1) = -1 which is <= t and difference in indices is <= k

    Am I missing something?


  • 3
    A

    This solution is excellent.
    I think we can replace
    if (map.entrySet().size() >= k)
    with
    if (i >= k) to make it more concise.


  • 0
    S

    Excellent Solution!


  • 0
    L

    Smart. This problem should add one more tag "Math".


  • 9
    W

    May I know how do you deal with duplicate keys? According to my understanding, hashMap will replace the previous value, for example, hashmap.put(1, 4), and hashmap.put(1, 6) when you call hashmap.get(1), it will return the most recently added value which is 6. So 4 is lost in this case. Will this cause problem when you say

    if (map.containsKey(bucket)
                        || (map.containsKey(bucket - 1) && remappedNum - map.get(bucket - 1) <= t)
                            || (map.containsKey(bucket + 1) && map.get(bucket + 1) - remappedNum <= t))
                                return true;
    

    Maybe I am wrong about it, but I just don't understand, can anybody explain this to me? Thank you so much :)


  • 1
    H

    In my understanding, -1 is not allowed, since we treat all difference as positive


  • 0
    F

    brilliant solution and nice description, thanks


  • 0
    C

    When there are duplicate keys, the function will return true immediately.


  • 1
    Y

    how could I say? brilliant!
    I have thought about bucket sort, but I stopped when I realized even in the same range could be put into different bucket. I did not know how to deal with that case.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.