@Ximin.Z I do not quite agree with your conclusion.Discussion of Complexity
I am not currently capable of coming up with a formal proof or explanation, but for now, I have two bones to pick with your conclusion:M = 2E does not seem valid to me: even if each edge triggered a union operation, which is a worst case scenario, we would still have M = E per your definition of M, combined with the code by op. the lg part does not seem justified or illustrated for me: what is the base of your lg? With path compression. The value difference is not asymptotically significant, but does cause a difference in understanding.
I am only putting two cases here:
Consider such a set of edge: (1,0), (2,0), (3,0), (4,0)....
It would be easy to verify the cost in such a scenario(note that each edge triggers one valid union in this case, and also, path compression does not matter):
Also note that each union's contribution to the total cost is the sum of the two find calls.Edge Graph contribution to Cost (1,0) 1->0 1 + 1 (2,0) 2->0<-1 1 + 1 (3,0) ... 1 + 1 (4,0) ... 1 + 1
You can verify by drawing the graph on paper yourself, and I can't do it here since the graph goes beyond one dimension upon iteration. It is not hard to generate in this case that the best case performance is O(N + 2M) (constant factor ignored).Worst Case
First, let's assume without the path compression. Consider such a case:Edge Graph contribution to Cost (0,1) 0->1 1 + 1 (0,2) 0->1->2 1 + 1 (0,3) 0->1->2->3 2 + 1 (0,4) 0->1->2->3->4 3 + 1
It is not hard to generalize in this case that for M unions, the total cost is gonna be O(N + M + sum(1 to M)) = O(N + M^2).
What about putting path compression by halving on?
Consider this case below. You have to draw something on paper to actually see the point of this case. The graph is just impossible to put on here.
I do not have a theoretical for this case, but it is likely to generalize here that the total cost, with the aid of path compression by halving, would be O(N + M + 2 * sum(1 to M/2)), which would still be O(N + M^2). (the portion of cost done by M unions are reduced by half though).
So in conclusion, I do not think it is justified to say that the cost here is O(N + MlgN) here. I am tempted to say O(N + M^2) but I do not for now have a formal proof.
If we do path compression in the textbook way (every one visited compressed to root) rather than by halving, I believe the total cost could be able to be significantly reduced.