Clean 16ms C++ O(N) Space O(KlogN) Time Solution using Priority queue


  • 50
    F

    Here, N = min(k, n).
    K = min(k, mn)
    where m, n is the size of two arrays and k is the k in the problem.

    class Solution {
    public:
        vector<pair<int, int>> kSmallestPairs(vector<int>& nums1, vector<int>& nums2, int k) {
            vector<pair<int,int>> result;
            if (nums1.empty() || nums2.empty() || k <= 0)
                return result;
            auto comp = [&nums1, &nums2](pair<int, int> a, pair<int, int> b) {
                return nums1[a.first] + nums2[a.second] > nums1[b.first] + nums2[b.second];};
            priority_queue<pair<int, int>, vector<pair<int, int>>, decltype(comp)> min_heap(comp);
            min_heap.emplace(0, 0);
            while(k-- > 0 && min_heap.size())
            {
                auto idx_pair = min_heap.top(); min_heap.pop();
                result.emplace_back(nums1[idx_pair.first], nums2[idx_pair.second]);
                if (idx_pair.first + 1 < nums1.size())
                    min_heap.emplace(idx_pair.first + 1, idx_pair.second);
                if (idx_pair.first == 0 && idx_pair.second + 1 < nums2.size())
                    min_heap.emplace(idx_pair.first, idx_pair.second + 1);
            }
            return result;
        }
    };
    

  • 0

    Can you please explain why you do the following?

    if (idx_pair.first + 1 < nums1.size())
          min_heap.emplace(idx_pair.first + 1, idx_pair.second);
    if (idx_pair.first == 0 && idx_pair.second + 1 < nums2.size())
          min_heap.emplace(idx_pair.first, idx_pair.second + 1);
    

    This is the same as what StefanPochmann did here (See solution 5).


  • 29
    F

    Sure. Suppose you are at pair 0,0 (index 0 and index 0, not value), which is the current minimum.
    Then you know the only two possible followers (immediate larger ones) are 0,1 and 1,0. Any other indices, say 1,1, 1,2, 3,3 have to be larger right?
    So every time you get a current minimum i,j , you want to push i+1,j and i,j+1 into heap. You don't need to worry about others yet.

    The problem here is, if you are at 2,3, you will push 3,3 and 2,4; then later, you are at 3,2. Then you will push 3,3, 4,2. so you pushed 3,3 twice.

    But it is easy to realize, if you are at 2,3, and you haven't seen 3,2 yet (meaning 3,2 is either still in the queue but not popped yet or not even pushed into queue), you don't need to worry about 3,3 at this moment, because 3,2 is guaranteed to be no greater than 3,3. So you can wait till you see 3,2.

    This basically means every time you see i,j you just need to push i+1, j into queue. i, j+1 can be pushed into queue later when you see i - 1, j + 1. The only exception to this is, when i == 0, there is no i-1, j+1 anymore, so you want to push both i+1, j and i, j+1 when i == 0.

    Hope it is clear now.


  • 0

    @fentoyal Thanks for the explanation. This makes it much clearer.
    I had previously tried pushing in (i+1, j) and (i, j+1) for each (i, j) I popped.

    Is there any formal bound to how much this optimization helps with complexity?


  • 2
    F

    @babhishek21
    It is not only for performance concern.
    Because if you don't do it this way, you will have to keep a dictionary of the elements in the queue to avoid pushing same pair of indices twice.
    Avoid pushing same pair of indices twice is not to increase performance, it is to make otherwise wrong code correct.


  • 0

    @fentoyal I forgot to say this, I was going to implement priority_queue::minHeap on a Set or HashSet like interface. So no duplicates.


  • 0
    F

    @babhishek21
    I see, if this is the case, your code can be easier than mine. You can ignore this check. There won't be much performance penalty if you just try pushing every value twice -- it just involves several checks and no extra memory access, which costs much more time than logic comparison.


  • 1
    C

    Could you please explain this?

    auto comp = [&nums1, &nums2](pair<int, int> a, pair<int, int> b) { return nums1[a.first] + nums2[a.second] > nums1[b.first] + nums2[b.second];};
    priority_queue<pair<int, int>, vector<pair<int, int>>, decltype(comp)> min_heap(comp);
    

    Sorry, I am not familiar with usage of auto and decltype


  • 6

    @chenweisomebody Using a standard priority_queue with each element of the template class pair<int, int> would use std::less<> as the comparison class. This is equivalent to a Max-Heap based on strict-weak ordering (i.e. the first element of the first pair must not be smaller than any other pair. Additionally the second element of the pair is used to tie break.

    Using std::less<pair<int, int>>(const pair<int, int> &a, const pair<int, int> &b) would essentially result in the following Heap property:

    (a.first != b.first) ? a.first > b.first : a.second >= b.second
    

    But that is not the Heap property we want. The property that we require is:

    nums1[a.first] + nums2[a.second] < nums1[ab.first] + nums2[b.second]
    

    So how do we go on about achieving that? We use our own custom comparison function. The line:

    auto comp = [&nums1, &nums2](pair<int, int> a, pair<int, int> b) {
        return nums1[a.first] + nums2[a.second] > nums1[b.first] + nums2[b.second];
    };
    

    is essentially a C++11 lambda operator. Lambdas are easy to use one-line (not really :stuck_out_tongue_closed_eyes: ) expressions which perform basic operations (in this case return a boolean to maintain the Heap property). They are stored as high level objects.

    Passing lambdas as comparison functions is done is a different way than the standard priority_queue declaration. decltype() is used to get the expected return type of an expression (in this case the lambda), so that thepriority_queue constructor can decide if such a comparison function is possibly valid or not.

    If you read the standards manual on using priority_queue and using lambdas instead of standard templated comparator objects/functions, you would realise the need to use decltype() on the lambda object. This behaviour is best described in this StackOverflow question.


  • 0
    F

    @chenweisomebody
    @babhishek21 has given a very clear and detailed explanation.
    In case you still have trouble understanding it. I put it in a simpler language:
    priority_queue needs to know "how priority is defined", in other words "for the values you provide, what means larger and what means smaller?".
    My "auto comp = ... " tells compiler "nums1[a.first] + nums2[a.second] > nums1[b.first] + nums2[b.second]" means "b" (the one with smaller sum ) has a higher priority, so it is min_heap;
    But to feed this functional object (lambda) to priority_queue, you need know its type. Unfortunately, each lambda function has its own type, so there is no built-in type like int, string, you can directly put into the template's parameter list. So you have to use a trick "decltype(comp)" to get its real type. decltype(comp) is nothing but "return the type of comp", like "decltype(123)" is essentially "int".


  • 0
    A

    may i know why the space complexity is O(N) N = min(k, max(m, n)) ? Thank you!


  • 0
    F

    @avici.zhu
    This is not a tight bound. I think a more accurate bound should be O(min(k, n)).
    I didn't think much about whether it should be n or max(m, n), but I now think it should be n instead of max(m, n)
    Is this what concerned you?


  • 0

    Very nice solution in C++, thanks for sharing!


  • 0
    I

    @fentoyal could you explain more about why the space is n not max(m,n)....and I now think it may be min(k, m*n)??


  • 0
    V
    This post is deleted!

  • 0
    G

    @fentoyal

    How is "3,2 is guaranteed to be no less than 3, 3. Shouldnt it be the other way round?"


  • 0
    F

    @gg2529 Yes, you are right. good catch. I corrected it.


  • 1
    T

    Similar idea with you!

    vector<pair<int, int>> kSmallestPairs(vector<int>& nums1, vector<int>& nums2, int k) {
            const int N2 = nums2.size();
            auto cmppair = [&nums2](const pair<int, int> &a, const pair<int, int> &b){
                return a.first + nums2[a.second] > b.first + nums2[b.second]; };
            priority_queue<pair<int, int>, vector<pair<int, int>>, decltype(cmppair)> smallRootQue(cmppair);
            vector<pair<int, int>> res;
            if(N2 == 0) return res;
            for(auto n1:nums1)
                smallRootQue.emplace(n1, 0);
            for(int i=0; i<k && !smallRootQue.empty(); ++i){
                auto ctop = smallRootQue.top();
                smallRootQue.pop();
                res.emplace_back(ctop.first, nums2[ctop.second]);
                if(ctop.second + 1 < N2) smallRootQue.emplace(ctop.first, ctop.second+1);
            }
            return res;
        }

  • 0
    Y

    @tju_xu I had similar idea. This solution keeps the heap size no more than N1 and time complexity O(k * log N1). Actually, if N1 > N2, you can use a heap of size N2 for better performance.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.