Why no one posted O(N) solution in C++ using bucket sort?


  • 5
    M

    I couldn't find the C++ solution using bucket sort in the discussions. So I want to share my accepted version. Hopefully it could help those who only use C++ like me and really want to see people could improve upon it since it now runs 36ms.
    Here is the explanation of this code. The logic is borrowed from https://leetcode.com/discuss/38206/ac-o-n-solution-in-java-using-buckets-with-explanation by lx223. My initial effort using the bucket sort on this problem failed miserably because I did not consider the situation that nums[i] and nums[j] could be put into neighboring buckets even if their difference is not greater than t. So I searched the forum and saw lx223's brilliant design/implementation. As the Java version, I used a unordered_map "buckets" to store the most recent k nums,
    where the key is the bucket index calculated from nums[i] , and the value is simply the nums[i]. Therefore, whenever we encounter a new nums[i] there are 3 possibilities we can find a match: 1. the same bucket index of nums[i]: "idx" is already in buckets, since we define the bucket width to be w=t+1, this means a match has been found. 2. If the neighboring index idx-1 is in "buckets" and it's corresponding value nums[j] satisfies the constraint: abs(nums[i]-nums[j])<=t, then a match is also found. 3. The same condition check as #2 for the neighbor idx+1. Finally, whenever the size of buckets reached the limit k, we need to remove the oldest idx from buckets for the new one. However C++ unordered_map does not keep the insertion order of its keys, so I used a list to keep track of the oldest idx which is always the first node in the list. BTW: the using of long long is to handle the test cases where the t and values of nums are INT_MAX.

    class Solution {
    public:
        bool containsNearbyAlmostDuplicate(vector<int>& nums, int k, int t) {
            int len=(int) nums.size();
            if(len<2||k<1||t<0) return false;
            
            int mn=nums[0]];
            for(int i=1; i<len; ++i) mn=min(mn, nums[i]);
            
            long long w=t+1;
            int numBuckets=k;
            unordered_map<int, int> buckets;
            list<int> idxList;
            for(int i=0; i<len; ++i) {
                int idx=((long long)nums[i]-mn)/w;
                
                if(buckets.find(idx)!=buckets.end()) return true;
                else if(buckets.find(idx-1)!=buckets.end()
                        &&abs((long long)nums[i]-buckets[idx-1])<=t) return true;
                else if(buckets.find(idx+1)!=buckets.end()
                         &&abs((long long)nums[i]-buckets[idx+1])<=t) return true;
                         
                if((int)buckets.size()==numBuckets) {
                    buckets.erase(idxList.front());
                    idxList.pop_front();
                }
                
                buckets[idx]=nums[i];
                idxList.push_back(idx);
            }
            return false;
        }
    };

  • 1
    D

    @maruchan76 sorry only read the title. it looks like "bucket sort" focuses on "isolation", that is to say, bucket one's elements must be smaller than bucket two's elements. However, "bucket" can be used in another context because of three properties: 1st) elements within the same bucket must have gap < bucket width; 2nd) elements within neighbor buckets might have gap < bucket width; 3rd) elements within non-contiguous buckets must have gap > bucket width.

    So it seems this problem requires us to use the second "bucket" instead of "bucket sort".


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.