I think the key to understanding this greedy algorithm lies in two observations:

After sorting the pairs by their ends:

There is always a longest chain that starts with pairs[0].
Suppose instead that a longest chain C starts at pairs[k] for some k > 0. Because of the sorting, we have pairs[k][1] ≥ pairs[0][1]. We can form a new chain C' by replacing pairs[k] with pairs[0]; C' has the same length as C, so it remains only to check that C' is still a chain with no overlap. This is easy: pairs[0] ends no later than pairs[k], so any pair that can follow pairs[k] in a chain can also follow pairs[0], and no new overlap is introduced;
Ties do not matter: you do not have to specify how pairs with equal ends are ordered in the sort.
Consider the example (1,2), (1,4), (3,4): we start with (1,2), skip (1,4), and then add (3,4). If the sorted order were instead (1,2), (3,4), (1,4), we would start with (1,2), add (3,4), and then skip (1,4). In both orderings the pair (1,4), i.e., the tied pair with the smaller start, is skipped anyway, just at different steps. You can also prove the statement directly: when two pairs have the same end, an optimal solution never needs the one with the smaller start over the one with the larger start. This is intuitive: the same end with a smaller start means a longer interval and more overlap, so why bother.

I think the algorithm's strategy then justifies itself in the light of these two observations and needs no further proof. Hope this helps.
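For concreteness, here is a minimal sketch of the greedy strategy described above, assuming the LeetCode-style setup where a pair (a, b) can be followed by (c, d) only if b < c. The function name `find_longest_chain` is my own choice, not from the original problem statement:

```python
def find_longest_chain(pairs):
    """Return the length of the longest chain formed from the given pairs."""
    # Sort by each pair's end. Per the second observation, the relative
    # order of pairs with equal ends does not matter.
    ordered = sorted(pairs, key=lambda p: p[1])
    count = 0
    current_end = float("-inf")
    for start, end in ordered:
        # A pair extends the chain only if its start is strictly greater
        # than the end of the last chosen pair (no overlap).
        if start > current_end:
            count += 1
            current_end = end
    return count
```

For example, on (1,2), (1,4), (3,4) this picks (1,2) and (3,4) and returns 2, skipping (1,4) exactly as argued above.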