C++ O(n log(n-k)) unordered_map and priority_queue(maxheap) solution


  • 49
    S
    class Solution {
    public:
        vector<int> topKFrequent(vector<int>& nums, int k) {
            unordered_map<int,int> map;
            for(int num : nums){
                map[num]++;
            }
            
            vector<int> res;
            // pair<first, second>: first is frequency,  second is number
            priority_queue<pair<int,int>> pq; 
            for(auto it = map.begin(); it != map.end(); it++){
                pq.push(make_pair(it->second, it->first));
                if(pq.size() > (int)map.size() - k){
                    res.push_back(pq.top().second);
                    pq.pop();
                }
            }
            return res;
        }
    };

  • -1
    R
    This post is deleted!

  • 2
    E

    Same Idea in Java

    public List<Integer> topKFrequent(int[] nums, int k) {
        HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
        for(int num: nums){
            map.put(num, map.containsKey(num)? map.get(num) + 1 : 1);
        }
        PriorityQueue<Map.Entry<Integer, Integer>> pque = 
            new PriorityQueue<Map.Entry<Integer, Integer>>((o1, o2) -> o2.getValue() - o1.getValue());
        pque.addAll(map.entrySet());
        List<Integer> ret = new ArrayList<Integer>();
        for(int i = 0; i < k; i++){
            ret.add(pque.poll().getKey());
        }
        return ret;
    }

  • 0
    C

    I thought when we using priority_queue to sort elements, we have to overload the compare operators if the elements are not the basic data types. You used 'pair' to contain freq and num, then why didn't you overload the function? THX!


  • 5
    S

    If you do not specify your own comparator, sorting is done by a built-in function compare. From first member to last member in pair. compare works for many types of values, ordering them in one particular way: increasing numeric order for numbers; lexicographic order (aka dictionary order) for strings, symbols, and keywords; shortest-to-longest order by Clojure vectors, with lexicographic ordering among equal length vectors.


  • 0
    Q

    Can u plz explain how the complexity is O(nlog(n-k)) ?


  • 1
    S

    the size of priority_queue is (n - k), every operation in priority_queue takes O(log(n-k)), multiply n operations. The result is O(n log(n-k)).


  • 0
    A

    If you keep the largest ones in the priority queue and evacuate the smaller ones when the size reaches k, and collect the large ones in the end, then the complexity would become O(n log k), am I right?


  • 0
    Q

    Got it . Thanks !!!


  • 0
    R

    I don't understand why and how the size of the priority queue is (n-k) operations. Could someone please explain?


  • 0
    L

    really neat solution, thanks!


  • 0
    Z

    Could you please analyze the time complex for building the first unordered_map?
    I thought it was O(nlogn) to build a map with n elements for the worst case.
    Thanks!


  • 0
    S

    I think the time complexity is O(n*logk) . Am I wrong ?


  • 3
    C

    I think if you build a min heap, the time complexity is O(nlgk). But if you build a max heap, the time complexity is O(nlg(n-k)).


  • 1
    C

    I think if you use hash table to implement the map, building a map with n elements takes almost O(n). If you use BST to implement the map, it's O(n*lgn). Please correct me if I'm wrong.


  • 0
    X
    This post is deleted!

  • 1
    X

    @ccpromise Can't agree more! Here is my min-heap solution.

    class Solution {
    private:
        struct bigger{
            bool operator()(pair<int, int> &one, pair<int, int>two){
                return one.second > two.second;
            }
        };
    public:
        vector<int> topKFrequent(vector<int>& nums, int k) {
            /* Use unordered_map and priority_queue(minheap) solution.
             *
             * Use unordered_map to avoid red-black hash implement, which takes almost O(n*lgn).
             * We build a min heap with size k, so the time complexity is O(nlgk).
             */
            unordered_map<int, int> num_count;
            for(const auto &n: nums){
                num_count[n] += 1;
            }
    
            priority_queue<pair<int, int>, vector<pair<int, int>>, bigger> frequent_heap;
            // Build the min-heap with size k.
            for(auto it = num_count.begin(); it != num_count.end(); it++){
                if(frequent_heap.size() < k){
                    frequent_heap.push(*it);
                }
                else if(it->second >= frequent_heap.top().second){
                    frequent_heap.pop();
                    frequent_heap.push(*it);
                }
            }
    
            vector<int> ans;
            while(!frequent_heap.empty()){
                auto top = frequent_heap.top();
                frequent_heap.pop();
                ans.push_back(top.first);
            }
            return ans;
        }
    };
    

  • 1
    R

    In fact , we can use an min-heap to get O(n log k) time complexity.

    class Solution {
    public:
        vector<int> topKFrequent(vector<int>& nums, int k) {
            using mapIter = unordered_map<int, size_t>::iterator;
            unordered_map<int,size_t> freqCount; // num -> frequency
            for(int num : nums){ ++freqCount[num]; }
            priority_queue<mapIter, vector<mapIter>, function<bool(mapIter, mapIter)>> topkHeap(
                [](mapIter lhs, mapIter rhs)
                {
                    return lhs->second > rhs->second; // build an min-heap!
                },
                vector<mapIter>()
            ); // frequency -> num
            for(mapIter iter = freqCount.begin(); iter != freqCount.end(); ++iter)
            {
                topkHeap.push(iter);
                if(topkHeap.size() > k){ topkHeap.pop(); } // remove the min value currently
            }
            // now heap hold the elements that elements less than them are all removed.
            vector<int> result(k);
            for(int i = k-1; i >= 0; --i)
            {
                mapIter iter = topkHeap.top();
                result[i] = iter->first;
                topkHeap.pop();
            }
            return result;
        }
    };
    

  • 0
    C

    How about that we need the return array is sorted by frequency of number?
    Do you think about how to implement this?


  • 0
    R

    @couragecc If you ues heap, it will be done naturally.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.