Java O(n) Solution - Bucket Sort


  • 195
    M

    Idea is simple. Build a array of list to be buckets with length 1 to sort.

    public List<Integer> topKFrequent(int[] nums, int k) {
    
    	List<Integer>[] bucket = new List[nums.length + 1];
    	Map<Integer, Integer> frequencyMap = new HashMap<Integer, Integer>();
    
    	for (int n : nums) {
    		frequencyMap.put(n, frequencyMap.getOrDefault(n, 0) + 1);
    	}
    
    	for (int key : frequencyMap.keySet()) {
    		int frequency = frequencyMap.get(key);
    		if (bucket[frequency] == null) {
    			bucket[frequency] = new ArrayList<>();
    		}
    		bucket[frequency].add(key);
    	}
    
    	List<Integer> res = new ArrayList<>();
    
    	for (int pos = bucket.length - 1; pos >= 0 && res.size() < k; pos--) {
    		if (bucket[pos] != null) {
    			res.addAll(bucket[pos]);
    		}
    	}
    	return res;
    }

  • 7

    Nice solution, I initially thought about using PriorityQueue, but build PriorityQueue needs n(logn) time.


  • 36

    A Python version:

    def topKFrequent(self, nums, k):
        bucket = [[] for _ in nums]
        for num, freq in collections.Counter(nums).items():
            bucket[-freq].append(num)
        return list(itertools.chain(*bucket))[:k]

  • 1
    S

    Maybe you can calculate max frequency and reduce size of array:

    public List<Integer> topKFrequent(int[] nums, int k) {
    	Map<Integer, Integer> frequencyMap = new HashMap<>();
    	int maxFrequency = 0;
    
    	for (int n : nums) {
    		int frequency = frequencyMap.getOrDefault(n, 0) + 1;
    		frequencyMap.put(n, frequency);
    		maxFrequency = Math.max(maxFrequency, frequency);
    	}
    
    	// here i is the frequency and bucket.get(i) is the numbers that having this frequency
    	List<List<Integer>> bucket = new ArrayList<>(maxFrequency);
    	while (maxFrequency-- >= 0) {
    		bucket.add(new ArrayList<>());
    	}
    
    	for (Map.Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
    		int frequency = entry.getValue();
    		bucket.get(frequency).add(entry.getKey());
    	}
    
    	List<Integer> res = new ArrayList<>();
    	for (int pos = bucket.size() - 1; pos >= 0 && res.size() < k; pos--) {
    		res.addAll(bucket.get(pos));
    	}
    	return res;
    }

  • 75
    R

    Thanks for sharing, only one nitpick:

    Think about the case when K=2,
    and you have 1 number that has max frequency, say 10 times.
    and you have 10 numbers that has 2nd max frequency, say 9 times.
    With your algo, the returned list will contain 11 numbers instead of 2.

    Any easy fix:
    return res.subList(0,k);

    (It seems the above scenario is not covered by the existing test cases.)


  • 5
    M

    You can create a minHeap, and size() == k. The time complexity is nlgk which is smaller than nlgn


  • -3
    R

    return map(lambda i: i[0], Counter(nums).most_common(k)) # just for fun


  • 1
    B

    I have the same question. Thank you!


  • 0
    C

    @ran3, your comments make sense~


  • 0
    M

    I think the problem is not clear about how to deal with even. Technically, all the 10 numbers equally have 2nd max frequency, I have to return them all.

    Generally speaking, the fact is: top ten scores of an exam does not indicate there have to be exactly ten students.


  • 1

    your so short code is really cool !


  • 16
    S

    I think res.addAll(bucket[pos]) is not entirely correct,because res’s size can be greater than k, what do you think?


  • 0
    S

    Since it says return the k most frequent "elements", I guess we need to return k elements. So I think "return res.subList(0,k)" makes more sense.


  • 0
    P

    frequencyMap.put and bucket.add would have the time complexity of log(n) so this solution would be not better than n*log(n). Am I missing something? Could someone explain? Thanks.


  • 0
    P

    bucket may have many empty list element in the case nums consists of high frequency numbers. How about:

    def topKFrequent(self, nums, k):
    r = sorted(collections.Counter(nums).items(), cmp = lambda x, y: y[1] - x[1])
    return [x[0] for x in r][:k]


  • 0
    P

    Sorry. I tried to format the code but didn't know how. :(


  • 0
    A

    Lets see if the array is 1,1,1,2,2,3,3 and k == 2

    your code should return [1,2,3]. In fact the correct answer is [1,2] or [1, 3]. Am i right?


  • 0
    G

    I think u r right..


  • 3
    G

    The solution will failed for case:
    [1,1,1,1,2,2,2,2]
    1

    Your code gives [1,2] while the expected solution is [1].
    We shall not simply use addAll() for the res.


  • 0
    R

    I agree with you. If the number of of elements of each bucket is greater than k,the result isnot raisonable. But the solution is still accepted. I think the test case is not compelet.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.