Easiest 2ms Java Solution


  • 92

    This is the 1D version of Number of Islands II. For more explanation, check out this 2D solution.

    1. n points = n islands = n trees = n roots.
    2. With each edge added, check which islands e[0] and e[1] belong to.
    3. If e[0] and e[1] are in the same island, do nothing.
    4. Otherwise, union the two islands and reduce the island count by 1.
    5. Bonus: path compression can cut the runtime by roughly 50%.

    Hope it helps!

    public int countComponents(int n, int[][] edges) {
        int[] roots = new int[n];
        for(int i = 0; i < n; i++) roots[i] = i; 
    
        for(int[] e : edges) {
            int root1 = find(roots, e[0]);
            int root2 = find(roots, e[1]);
            if(root1 != root2) {      
                roots[root1] = root2;  // union
                n--;
            }
        }
        return n;
    }
    
    public int find(int[] roots, int id) {
        while(roots[id] != id) {
            roots[id] = roots[roots[id]];  // optional: path compression
            id = roots[id];
        }
        return id;
    }
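    As a quick sanity check (a hypothetical standalone driver, not part of the original solution), the two methods above can be exercised on a small graph:

```java
public class CountComponentsDemo {
    // the same union-find solution as above, made static for a standalone run
    static int countComponents(int n, int[][] edges) {
        int[] roots = new int[n];
        for (int i = 0; i < n; i++) roots[i] = i;

        for (int[] e : edges) {
            int root1 = find(roots, e[0]);
            int root2 = find(roots, e[1]);
            if (root1 != root2) {
                roots[root1] = root2;  // union
                n--;
            }
        }
        return n;
    }

    static int find(int[] roots, int id) {
        while (roots[id] != id) {
            roots[id] = roots[roots[id]];  // optional: path compression
            id = roots[id];
        }
        return id;
    }

    public static void main(String[] args) {
        // 5 nodes, components {0,1,2} and {3,4}
        System.out.println(countComponents(5, new int[][]{{0, 1}, {1, 2}, {3, 4}})); // prints 2
        // no edges: every node is its own component
        System.out.println(countComponents(3, new int[][]{})); // prints 3
    }
}
```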
    

  • 1

    Path compression, not pass compression.


  • 0

    Haha. Thanks Stefan, just updated.


  • 0

    What is the time complexity? Is it O(E*E)?


  • 2

    The complexity of M quick-union operations with path compression on N objects should be N + M lg N. In this problem M = 2E and N = V, so O(V + 2E lg V). Correct me if this is wrong @yavinci


  • 0

    @Ximin.Z I am wondering whether the find() method is O(1) in this case?


  • 0

    A smart way of using union-find when we already know there will be n points in total.


  • 0

    Exactly the same idea! Union-find, a clever and fast data structure!


  • 1

    Hi @yavinci, thanks for sharing your code. The path compression part compresses each node one level up (to its grandparent), not directly to its root, right? I think this is good enough, but I just wanted to check the intention behind that code snippet with you.


  • 1

    @yavinci Just another improvement to your path-compression union-find algorithm: add a weight (the tree size) to each root. With path compression alone, the cost of M operations on N objects is roughly O(N + M lg N); with weighted quick union plus path compression, it becomes nearly linear, about O(N + M lg* N). Here M is the number of union operations on the array and N is the number of nodes.

    // requires: import java.util.Arrays;
    public int countComponents(int n, int[][] edges) {
        if (edges == null || edges.length == 0) return n;

        // roots[i] is the parent of node i; initially every node is its own root
        int[] roots = new int[n];
        // size[i] is the number of nodes in the tree rooted at i (the weight)
        int[] size = new int[n];
        for (int i = 0; i < n; i++) roots[i] = i;
        Arrays.fill(size, 1);

        for (int[] e : edges) {
            // find the root of each endpoint, compressing the path along the way
            int root1 = find(roots, e[0]);
            int root2 = find(roots, e[1]);
            if (root1 != root2) {
                // attach the smaller tree under the larger one,
                // and let the winning root absorb the loser's size
                if (size[root1] < size[root2]) {
                    roots[root1] = root2;
                    size[root2] += size[root1];
                } else {
                    roots[root2] = root1;
                    size[root1] += size[root2];
                }
                n--;
            }
        }
        return n;
    }

    private int find(int[] roots, int node) {
        while (roots[node] != node) {
            roots[node] = roots[roots[node]];  // path compression by halving
            node = roots[node];
        }
        return node;
    }
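    A self-contained sanity check of the weighted variant (hypothetical class name, with the methods made static for a standalone run). Note the size bookkeeping: the root that wins the union must absorb the loser's size, otherwise the weights drift out of sync with the actual tree sizes.

```java
import java.util.Arrays;

public class WeightedUnionFindDemo {
    static int countComponents(int n, int[][] edges) {
        if (edges == null || edges.length == 0) return n;
        int[] roots = new int[n];
        int[] size = new int[n];
        for (int i = 0; i < n; i++) roots[i] = i;
        Arrays.fill(size, 1);

        for (int[] e : edges) {
            int root1 = find(roots, e[0]);
            int root2 = find(roots, e[1]);
            if (root1 != root2) {
                if (size[root1] < size[root2]) {
                    roots[root1] = root2;       // smaller tree joins larger
                    size[root2] += size[root1]; // winner absorbs loser's size
                } else {
                    roots[root2] = root1;
                    size[root1] += size[root2];
                }
                n--;
            }
        }
        return n;
    }

    static int find(int[] roots, int node) {
        while (roots[node] != node) {
            roots[node] = roots[roots[node]]; // path compression by halving
            node = roots[node];
        }
        return node;
    }

    public static void main(String[] args) {
        // two components: {0,1,2,3} and {4,5}
        System.out.println(countComponents(6, new int[][]{{0, 1}, {1, 2}, {2, 3}, {4, 5}})); // prints 2
    }
}
```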
    

  • 0

    @yavinci

    Could you explain what exactly path compression does?


  • 0

    @dreamer92

    Here, if we don't write roots[id] = roots[roots[id]] (which is the path compression), then every call to find for id traverses the parent chain all the way up to the root. By adding this line, we shorten the path: each visited node is re-pointed to its grandparent, so subsequent find calls have less distance to cover. Hence the name "path compression".

    If it's still not clear then take a sample graph and see how parent array is different with and without path compression.

    HTH.
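    To make this concrete, here is a small sketch (hypothetical class name) showing how a single find call on the chain 4 -> 3 -> 2 -> 1 -> 0 rewires the parent array with the halving trick:

```java
import java.util.Arrays;

public class PathCompressionDemo {
    static int find(int[] roots, int id) {
        while (roots[id] != id) {
            roots[id] = roots[roots[id]]; // halving: point id at its grandparent
            id = roots[id];
        }
        return id;
    }

    public static void main(String[] args) {
        // parent array encoding the chain 4 -> 3 -> 2 -> 1 -> 0
        int[] roots = {0, 0, 1, 2, 3};
        find(roots, 4);
        // 4 was re-pointed to its grandparent 2, and 2 to its grandparent 0
        System.out.println(Arrays.toString(roots)); // prints [0, 0, 0, 2, 2]
    }
}
```

    Note that 4 now points at 2, which is not the root, but the chain is half as long; the next find(4) will finish (and compress) even faster.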


  • 0

    @dilip7 So we only set the node to its grandparent? What if its grandparent is not the root of this node?


  • 0

    @BVBSC Yes, even if the grandparent is not the root of the tree, you still reduce the number of nodes you have to visit on the way to the root.

    I think the confusion is that you assume we store the root of the current node directly, which is not always the case unless the grandparent is indeed the root.

    HTH.


  • 1

    @Ximin.Z I do not quite agree with your conclusion.

    Discussion of Complexity

    I am not currently capable of coming up with a formal proof, but for now I have two bones to pick with your conclusion:

    • M = 2E does not seem valid to me: even if every edge triggered a union operation, which is the worst-case scenario, we would still have M = E per your definition of M combined with the code by op. (Each union makes two find calls, but that is a constant factor, not a doubling of M.)
    • The lg part does not seem justified or illustrated: what is the base of your lg, and how does it arise with this halving-style path compression? The difference is not asymptotically significant, but it does affect the understanding.

    I am only putting two cases here:

    Best Case

    Consider such a set of edges: (1,0), (2,0), (3,0), (4,0), ...
    It is easy to verify the cost in this scenario (note that each edge triggers one valid union, and path compression does not matter here, since every tree has height at most 1):

    Also note that each union's contribution to the total cost is the sum of its two find calls.

    Edge    Graph       Contribution to cost
    (1,0)   1->0        1 + 1
    (2,0)   2->0<-1     1 + 1
    (3,0)   ...         1 + 1
    (4,0)   ...         1 + 1

    You can verify this by drawing the star-shaped graph on paper. It is not hard to generalize that the best-case performance is O(N + 2M) (constant factors ignored; the N term comes from initializing the roots array).

    Worst Case

    First, assume there is no path compression. Consider such a case:

    Edge    Graph             Contribution to cost
    (0,1)   0->1              1 + 1
    (0,2)   0->1->2           1 + 1
    (0,3)   0->1->2->3        2 + 1
    (0,4)   0->1->2->3->4     3 + 1

    It is not hard to generalize that in this case, for M unions, the total cost is O(N + M + sum(1 to M)) = O(N + M^2).

    What about turning path compression by halving on?
    Consider the case below. You will have to draw it on paper to actually see the point; the graph is too awkward to render here.

    Edge    Contribution to cost (reminder: two find calls)
    (0,1)   1 + 1
    (0,2)   1 + 1
    (0,3)   2 + 1
    (0,4)   2 + 1
    (1,5)   3 + 1
    (1,6)   3 + 1
    (0,7)   4 + 1
    (2,8)   4 + 1

    I do not have a theoretical analysis for this case, but it seems to generalize that the total cost, with the aid of path compression by halving, would be O(N + M + 2 * sum(1 to M/2)), which is still O(N + M^2) (though the portion of the cost incurred by the M unions is roughly halved).

    So in conclusion, I do not think it is justified to say that the cost here is O(N + MlgN). I am tempted to say O(N + M^2), but for now I do not have a formal proof.

    If we did path compression the textbook way (every visited node re-pointed directly to the root) rather than by halving, I believe the total cost could be significantly reduced.
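    The tables above can also be checked empirically. Below is a hypothetical instrumented harness (class and method names are mine) that counts parent-pointer hops across all find calls for the chain case (0,1), (0,2), (0,3), ..., with and without halving:

```java
public class CostDemo {
    static int hops; // total parent-pointer traversals across all find calls

    static int find(int[] roots, int id, boolean compress) {
        while (roots[id] != id) {
            if (compress) roots[id] = roots[roots[id]]; // halving
            id = roots[id];
            hops++;
        }
        return id;
    }

    // run all unions and return the total number of hops performed
    static int totalCost(int n, int[][] edges, boolean compress) {
        int[] roots = new int[n];
        for (int i = 0; i < n; i++) roots[i] = i;
        hops = 0;
        for (int[] e : edges) {
            int r1 = find(roots, e[0], compress);
            int r2 = find(roots, e[1], compress);
            if (r1 != r2) roots[r1] = r2;
        }
        return hops;
    }

    public static void main(String[] args) {
        // the "chain" worst case from the table: (0,1), (0,2), (0,3), ...
        int n = 1000;
        int[][] edges = new int[n - 1][2];
        for (int i = 1; i < n; i++) edges[i - 1] = new int[]{0, i};
        System.out.println("no compression: " + totalCost(n, edges, false));
        System.out.println("halving:        " + totalCost(n, edges, true));
    }
}
```

    Without compression the chain case costs sum(0 to N-2) hops, i.e. quadratic; with halving the measured count drops sharply, which is worth keeping in mind when reasoning about the worst case above.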

