Two C++ solutions: hashtable & sort+binary search. Time & space complexity analyzed.


  • 13
    M

    Let m=nums1.size(), and n=nums2.size()

    Solution 1: hashtable (using unordered_map).

    • time complexity: max(O(m), O(n))
    • space complexity: choose one O(m) or O(n) <--- So choose the
      smaller one if you can

    vector<int> intersect(vector<int>& nums1, vector<int>& nums2) {
        if(nums1.size() > nums2.size()) return intersect(nums2, nums1);
        vector<int> ret;
        unordered_map<int,int> map1;
        for(int num:nums1) map1[num]++;
        for(int num:nums2) {
            if(map1.find(num)!=map1.end() && map1[num]>0) {
                ret.push_back(num);
                map1[num]--;
            }
        }
        return ret;
    }
    

    Solution 2: sort + binary search

    • time complexity: max(O(mlgm), O(nlgn), O(mlgn)) or max(O(mlgm),
      O(nlgn), O(nlgm))
    • O(mlgm) <-- sort first array
    • O(nlgn) <--- sort second array
    • O(mlgn) <--- for each element in nums1, do binary search in nums2
    • O(nlgm) <--- for each element in nums2, do binary search in nums1
    • space complexity: depends on the space complexity used in your
      sorting algorithm, bounded by max(O(m), O(n))

    vector<int> intersect(vector<int>& nums1, vector<int>& nums2) {
        vector<int> ret;
        if(nums1.empty() || nums2.empty()) return ret;
        sort(nums1.begin(), nums1.end());
        sort(nums2.begin(), nums2.end());
        int j=0;
        for(int i=0; i<nums1.size(); ) {
            int index = lower_bound(nums2, nums1[i]);
            int count2 = 0;
            while(index<nums2.size() && nums2[index]==nums1[i]) {
                count2++; 
                index++;
            }
            int count1 = 0;
            while(nums1[j]==nums1[i]) {
                count1++;
                j++;
            }
            ret.insert(ret.end(),min(count1,count2),nums1[i]);
            i=j;
        } 
        return ret;
    }
    
    int lower_bound(const vector<int>& nums, int target) {
        int l=0, r=nums.size()-1;
        while(l<r) {
            int m=l+(r-l)/2;
            if(nums[m]<target) {l=m+1;}
            else {r=m;}
        }
        return r;
    }
    

    So if two arrays are already sorted, and say m is much smaller than n,
    we should choose the algorithm that for each element
    in nums1, do binary search in nums2,
    so that the complexity is O(mlgn).
    In this case, if memory is limited and nums2 is stored
    in disk, partition it and send portions of nums2 piece
    by piece. keep a pointer for nums1 indicating the
    current position, and it should be working fine~


  • 0
    X
    This post is deleted!

  • 0
    X

    Nice solution! I tried to sort only one array and search each element of other unordered arrays, but it does not work.


  • 0
    X

    I assume the nums2 on disks are not sorted. So I would prefer make nums1 a hashmap, then read chunks of nums2, query nums1's hashmap, when meet a same value, reduce 1 record in nums1's hashmap.


  • 0
    H

    @morrischen2008 if Both of array are sorted, the space complexity could be constant, time complexity is O(m+n). Just use two pointers, one for each array, and do pingpong operation.


  • 0

    @hualiang2 Agree, with you, there is no need to cost O(mlogn) or O(nlogm) to search the element from the one from the other among the two arrays. Once both of them are sorted, usingtwo pointers to iterate through the two array become a more obvious solution. below is my solution of quick sort (used built sort), and two pointers:

    public class Solution {
    public int[] intersect(int[] nums1, int[] nums2) {
        ArrayList<Integer> in_sec = new ArrayList<Integer>();
        
        int n1_p = 0;
        int n2_p = 0;
        int n1_len = nums1.length;
        int n2_len = nums2.length;
        int[] in_sec_arr;
        int idx = 0;
        
        Arrays.sort(nums1);
        Arrays.sort(nums2);
        
        while(n1_p<n1_len && n2_p < n2_len){
            while(n1_p<n1_len && n2_p < n2_len && nums1[n1_p] < nums2[n2_p]){ n1_p++; }
            while(n1_p<n1_len && n2_p < n2_len && nums2[n2_p] < nums1[n1_p]){ n2_p++; }
            
            while(n1_p<n1_len && n2_p < n2_len && nums1[n1_p] == nums2[n2_p]){
                in_sec.add(nums1[n1_p]);
                n1_p++;
                n2_p++;
            }
        }
        
        in_sec_arr = new int[in_sec.size()];
        
        for (int val: in_sec){
            in_sec_arr[idx] = val;
            idx++;
        }
        
        return in_sec_arr;
    }
    }

  • 0

    @xuehaohu And some thoughts on the follow up questions:

    1. if two arrays are sorted, usingg two pointers, become easy. ( which would help skip sort steps ,and directly get into the two pointers to find intersection)

    2. if num1/s size is smaller than nums2 size. First, Hashmap solution is time of O(m) (n is the length of the array which was used to do the count, lets say first one: nums1, and m is the length of the other, which is longer) and space of O(3n) (since the first array us smaller) . and Sort and two pointers solution, are time of max(O(mlogm), O(m+n) ,O(n) ) (as O(mlogm > nlogn since m>n)). and space of O(2n) (arraylist:O(n) and finally constructing the int array O(n) )

    So in terms of time: it is a comparsion between: O(m)and max( O(mlogm) , O(m+n) , +O(n) ), and obviously first solution is better, which is using hashmap to do element count for the shorter array.
    And in terms of space complexity, it is a comparsion between: O(3n) and O(2n) , so the second solution is slightly better as both of them are in same magnitude, there is no much difference.
    So based on the two comparison above, the first solution is better.

    1. if the longer array cannot be loaded into memory all at one time, it is easy to handle it by first solution, which is find the intersection of the first array with each chunk of the second array loaded into memory, then finally the intersection is still right because:
      nums1 ∩ nums2 = nums1 ∩ nums2_chunk1 ∩ nums2_chunk2 ∩ nums2_chunk3 ...
      where nums2 = nums2_chunk1 U nums2_chunk2 U nums2_chunk3 ...

  • 0
    B

    @morrischen2008 your solution is fantastic. quick question. why the hash table solution is not complexity O(m+n) ? My intuitive guess was that making up one table takes m insert operations and looking up the element takes n operations. Please kindly suggest your answer. Thanks!


  • 0
    O

    who can tell me why not use find(nums1.begin(),nums1.end(),num)?


  • 0
    C

    Using a unordered_multiset, hope it helps!

    vector<int> intersectFirst(vector<int>& nums1, vector<int>& nums2)
    {
        unordered_multiset<int> hash(nums1.begin(), nums1.end());
        
        vector<int> result;
        int n2 = nums2.size();
        for(int i = 0; i < n2; ++i)
        {
            auto iter = hash.find(nums2[i]);
            if(iter != hash.end())
            {
                result.push_back(nums2[i]);
                hash.erase(iter);
            }
        }
        
        return result;
    }
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.