Slow 1-liner to Fast solutions


  • 103

    Several solutions from naive to more elaborate. I found it helpful to visualize the input as an m×n matrix of sums, for example for nums1=[1,7,11], and nums2=[2,4,6]:

          2   4   6
       +------------
     1 |  3   5   7
     7 |  9  11  13
    11 | 13  15  17
    

    Of course the smallest pair overall is in the top left corner, the one with sum 3. We don't even need to look anywhere else. After including that pair in the output, the next-smaller pair must be the next on the right (sum=5) or the next below (sum=9). We can keep a "horizon" of possible candidates, implemented as a heap / priority-queue, and roughly speaking we'll grow from the top left corner towards the right/bottom. That's what my solution 5 does. Solution 4 is similar, not quite as efficient but a lot shorter and my favorite.

    Solution 1: Brute Force (accepted in 560 ms)

    Just produce all pairs, sort them by sum, and return the first k.

    def kSmallestPairs(self, nums1, nums2, k):
        return sorted(itertools.product(nums1, nums2), key=sum)[:k]
    

    Solution 2: Clean Brute Force (accepted in 532 ms)

    The above produces tuples and while the judge doesn't care, it's cleaner to make them lists as requested:

    def kSmallestPairs(self, nums1, nums2, k):
        return map(list, sorted(itertools.product(nums1, nums2), key=sum)[:k])
    

    Solution 3: Less Brute Force (accepted in 296 ms)

    Still going through all pairs, but only with a generator and heapq.nsmallest, which uses a heap of size k. So this only takes O(k) extra memory and O(mn log k) time.

    def kSmallestPairs(self, nums1, nums2, k):
        return map(list, heapq.nsmallest(k, itertools.product(nums1, nums2), key=sum))
    

    Or (accepted in 368 ms):

    def kSmallestPairs(self, nums1, nums2, k):
        return heapq.nsmallest(k, ([u, v] for u in nums1 for v in nums2), key=sum)
    

    Solution 4: Efficient (accepted in 112 ms)

    The brute force solutions computed the whole matrix (see visualization above). This solution doesn't. It turns each row into a generator of triples [u+v, u, v], only computing the next when asked for one. And then merges these generators with a heap. Takes O(m + k*log(m)) time and O(m) extra space.

    def kSmallestPairs(self, nums1, nums2, k):
        streams = map(lambda u: ([u+v, u, v] for v in nums2), nums1)
        stream = heapq.merge(*streams)
        return [suv[1:] for suv in itertools.islice(stream, k)]
    

    Solution 5: More efficient (accepted in 104 ms)

    The previous solution right away considered (the first pair of) all matrix rows (see visualization above). This one doesn't. It starts off only with the very first pair at the top-left corner of the matrix, and expands from there as needed. Whenever a pair is chosen into the output result, the next pair in the row gets added to the priority queue of current options. Also, if the chosen pair is the first one in its row, then the first pair in the next row is added to the queue.

    def kSmallestPairs(self, nums1, nums2, k):
        queue = []
        def push(i, j):
            if i < len(nums1) and j < len(nums2):
                heapq.heappush(queue, [nums1[i] + nums2[j], i, j])
        push(0, 0)
        pairs = []
        while queue and len(pairs) < k:
            _, i, j = heapq.heappop(queue)
            pairs.append([nums1[i], nums2[j]])
            push(i, j + 1)
            if j == 0:
                push(i + 1, 0)
        return pairs

  • 0

    Are you sure the time complexity for solution 3 is O(mn log k)?


  • 0

    I don't understand how the code below could remove duliplicates.

            if j == 0:
                push(i + 1, 0)
    

    I used a dictionary to remember the position when doing BFS.


  • 3

    @jedihy

    Are you sure the time complexity for solution 3 is O(mn log k)?

    Fairly sure, yes. What do you think it is?

    I don't understand how the code below could remove duliplicates.

            if j == 0:
                push(i + 1, 0)
    

    I used a dictionary to remember the position when doing BFS.

    If we don't do anything against duplicates, then for example cell (5, 5) could get reached from (4, 5) or from (5, 4), right? I just made a decision. Only (5, 4) leads to (5, 5). I only expand in rows and in the first column (where j=0).


  • 1

    @StefanPochmann I know understand your way of removing duplicates. But for my first problem, since heapq.nsmallest is equivalent to sorted(iterable, key=key)[:n], you sort the m*n pairs and return the first k elements. Then, the time should be O(mn + mnlogmn + k) = O(mnlogmn)?


  • 1

    @jedihy said in Slow 1-liner to Fast solutions:

    heapq.nsmallest is equivalent to sorted(iterable, key=key)[:n]

    Their results are equivalent. Not their time/space complexities. Check out the nsmallest source code and you'll see it uses a heap of the requested size.

    I added that information above and also replaced my two-liner with a one-liner. Thanks for reminding me that nsmallest takes a key :-).


  • 0

    wow~so cool~


  • 2
    G

    @jedihy Default, the code iterate the first row, but it is possible before the ending, the next row has a smaller value than the current row(must be the first column, j == 0).
    In an other word, for each time you meet a new row, expand to the next row, waiting for its pop from a heap.


  • 0
    G

    Thanks!! I was thinking of return to i+1, j==0 but could not get your solution 5.


  • 9
    G

    JAVA solution with the idea of solution 5,
    83 ms

    public class Solution {
        
        public List<int[]> kSmallestPairs(int[] nums1, int[] nums2, int k) {
            List<int[]> list = new ArrayList();
            int m = nums1.length;
            int n = nums2.length;
            if(m == 0 || n == 0 || k < 1) return list;
            
            PriorityQueue<Data> heap = new PriorityQueue<Data>((a, b) -> (a.val - b.val));
            
            heap.offer(new Data(0, 0, nums1[0] + nums2[0]));
            
            while(!heap.isEmpty() && k > 0){
                Data d = heap.poll();
                
                int[] pair = {nums1[d.i], nums2[d.j]};
                list.add(pair);
                k--;
                
                //add the next column item
                if(d.j < n - 1){
                    heap.offer(new Data(d.i, d.j + 1, nums1[d.i] + nums2[d.j + 1]));
                }
                // always store the next row (smallest) 
                if(d.j == 0 && d.i < m - 1){
                    heap.offer(new Data(d.i + 1, 0, nums1[d.i + 1] + nums2[0]));
                }
            }
            return list;
        }
            
            class Data{
                int i; 
                int j;
                int val;
                public Data(int i, int j, int val){
                    this.i = i;
                    this.j = j;
                    this.val = val;
                }
            }
    }

  • 0
    A

    @gogoflg2016253

    Can you explain why the first pair in the next row is added to the heap whenever you add the first pair in the current row ?


  • 0
    This post is deleted!

  • 0
    S

    @StefanPochmann A minor suggestion.
    If input is nums1 = [], nums2 = [1], k = 5, from the problem description, it it unclear what we should return.
    An empty list? a list equal to [1]? raise an exception? Maybe its better to define this corner case clearly in description


  • 0
    L
    if j == 0:
      push(i + 1, 0)
    

    Great trick to eliminate the extra space to track visited coordinates!


  • 0

    @Soba It's pretty clear from the problem description's third example.


  • 0
    P

    @abi93k That's how you avoid visit the same (i,j) while you can visit all i,j .


  • 0
    B

    Is the time complexity of solution 5 is O(klgk)?


  • 0

    @XiaoboZhang1991 what makes you think it's k log k?


  • 0

    @agave The while loop runs k times and there is at most 2k elements in the heap.


  • 0
    D

    thx.. I learned another way of using heap


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.