Two O(n) solutions

  • 88

    I am sharing two of my solutions, one is based on the longest path, and the other is related to Tree DP.

    Longest Path

    It is easy to see that the root of an MHT has to be the middle point (or two middle points) of the longest path of the tree.
    Though multiple longest paths can appear in an unrooted tree, they must share the same middle point(s).

    Computing the longest path of a unrooted tree can be done, in O(n) time, by tree dp, or simply 2 tree traversals (dfs or bfs).
    The following is some thought of the latter.

    Randomly select a node x as the root, do a dfs/bfs to find the node y that has the longest distance from x.
    Then y must be one of the endpoints on some longest path.
    Let y the new root, and do another dfs/bfs. Find the node z that has the longest distance from y.

    Now, the path from y to z is the longest one, and thus its middle point(s) is the answer. Java Solution

    Tree DP

    Alternatively, one can solve this problem directly by tree dp.
    Let dp[i] be the height of the tree when the tree root is i.
    We compute dp[0] ... dp[n - 1] by tree dp in a dfs manner.

    Arbitrarily pick a node, say node 0, as the root, and do a dfs.
    When we reach a node u, and let T be the subtree by removing all u's descendant (see the right figure below).
    We maintain a variable acc that keeps track of the length of the longest path in T with one endpoint being u.
    Then dp[u] = max(height[u], acc)
    Note, acc is 0 for the root of the tree.

                 |                 |
                 .                 .
                /|\               /|\
               * u *             * u *
               / | \
              *  v  *

    . denotes a single node, and * denotes a subtree (possibly empty).

    Now it remains to calculate the new acc for any of u's child, v.
    It is easy to see that the new acc is the max of the following

    1. acc + 1 --- extend the previous path by edge uv;
    2. max(height[v'] + 2), where v != v' --- see below for an example.
               / |
              v' v

    In fact, the second case can be computed in O(1) time instead of spending a time proportional to the degree of u.
    Otherwise, the runtime can be quadratic when the degree of some node is Omega(n).
    The trick here is to maintain two heights of each node, the largest height (the conventional height), and the second largest height
    (the height of the node after removing the branch w.r.t. the largest height).

    Therefore, after the dfs, all dp[i]'s are computed, and the problem can be answered trivially.
    The total runtime is still O(n). Java Solution

  • 1

    Thanks! Your thought helps me!

  • 0

    That's brilliant work you've done!

  • 0

    Hi lixx2100,

    Your solutions are brilliant, and I've seen your code on Github, they're awesome too. However, I have some doubt when solving this. Have you got StackOverflowError when you tried your dfs solution? I think my solution is similar to yours, and I keep getting StackOverflowError. Could you please take a look at ? Was it bcause of this that you divided your dfs into two parts? I'll really appreciate it if you could give some thought.

    Thank you very much!

  • 0

    Hi Magicyuli,

    I replied in your thread. Let me know what do you think. :-)

  • 0

    Nice solutions! And what a good idea to manage code with github! Do you mind I copy the format of your to my repository?

  • 0

    Hi JavaXu, I am glad you love it. Yes, free feel to copy the format.

  • 1

    Thanks for the detailed explanation. I would like to offer my Python solution in case anyone is interested :)

    class Node(object):
        def __init__(self):
            self.outgoing_nodes = set()
    class Solution(object):
        def build_graph(self, edges):
            self.graph = { i: Node() for i in xrange(self.n) }
            for edge in edges:
                x, y = edge
        def bfs_max_dist_node(self, root):
            visited = set()
            dist = { root: 0 }
            queue = [root]
            max_dist = 0
            chosen = root
            while queue:
                now = queue.pop(0)
                for node in self.graph[now].outgoing_nodes:
                    if node in visited:
                    dist[node] = dist[now] + 1
                    if dist[node] > max_dist:
                        max_dist = dist[node]
                        chosen = node
            return (max_dist, chosen)
        def find_endpoint1(self):
            max_dist, chosen = self.bfs_max_dist_node(0)
            return chosen
        def find_path(self, src, target):
            path = [src]
            data = { 'success': False, 'final_path': None }
            def dfs(now, prev):
                if now == target:
                    data['success'] = True
                    data['final_path'] = path[:]
                if data['success']:
                for node in self.graph[now].outgoing_nodes:
                    if node == prev:
                    dfs(node, now)
            dfs(src, -1)
            return data['final_path']
        def get_mid(self, path):
            l = len(path)
            print path, l
            if l % 2 == 0:
                return [path[l/2-1], path[l/2]]
                return [path[l/2]]
        def findMinHeightTrees(self, n, edges):
            :type n: int
            :type edges: List[List[int]]
            :rtype: List[int]
            if n == 0:
                return []
            self.n = n
            # Build the graph
            # BFS and find the endpoint #1 of longest path
            ed1 = self.find_endpoint1()
            # calc height, root: #1, btw find ed2
            _, ed2 = self.bfs_max_dist_node(ed1)
            # DFS (optionally with simulated stack)  path between #1 and #2
            path = self.find_path(ed1, ed2)
            # find middle point(s)
            midpoints = self.get_mid(path)
            return midpoints

  • 0

    I have two questions about the Tree DP function. When you do dfs first time, I mean the process you want to get the height1 and height2 of each node, if it will check the same node several times? I think if we use a hashset to save the points that we have checked will be better.
    Another question is that your Tree DP is based on the same order of dfs, is it right? I just want to see if I understand your ideas.
    Brilliant idea!

  • 0

    Hi jialegavin,

    Both dfs functions will visit each node exactly once via the help of the parent parameter. Of course, one can use a hashset to achieve the same thing, but that is more commonly used in graph traversal.

    For the second question, yes, and it is interesting to see that most Tree-DP can be solved elegantly via a DFS.

  • 0

    Oh, I see why it works. Because it is undirected graph, if one node is checked twice then there will be a cycle, which is impossible. Thanks for your replying.

  • 0

    Cloud you please try the DFS? I've written it but it showed StackOverFlow. Don't what's wrong.

  • 0

    I tried to recall my memory. For the first approach, my initial attempt was a DFS, but it caused the StackOverflowException, then I switched to BFS. This doesn't necessarily mean your code is wrong. It just indicates that the stack size is small, and in this case, one might want to explicitly use a stack to achieve the DFS order, or, like my case, switch to BFS.

    See here for an interesting discussion on the same topic as well.

  • 0

    My solution is more complex than that one. I used the backtracking. End when reach the leaf: graph[i].size() == 1 and visited[graph[i].get(0)] == true.

    I have changed the ArrayList to the Stack, still StackOverFlow..

    Anyway ,thank you for your answer.

  • 4

    Python implementation of the first idea:

    def findMinHeightTrees(self, n, edges):
        neighbors = collections.defaultdict(set)
        for v, w in edges:
        def maxpath(v, visited):
            paths = [maxpath(w, visited) for w in neighbors[v] if w not in visited]
            path = max(paths or [[]], key=len)
            return path
        path = maxpath(0, set())
        path = maxpath(path[0], set())
        m = len(path)
        return path[(m-1)/2:m/2+1]

  • 0

    "It is easy to see that the root of an MHT has to be the middle point (or two middle points) of the longest path of the tree." Can you please elaborate no this, how can this be proved ?

  • 1

    For convenience, let's consider the case where the longest path consists of odd number vertices. (Even case can be done via a similar analysis.) Define the longest path as x ... o ... y, where x and y are the two endpoints and o is the middle vertex.

    For a contradiction, if the root of MHT is not o but some other vertex o', then at least one of the following two statements must be true.

    1. path xo' is strictly longer than path xo
    2. path yo' is strictly longer than path yo

    This means o' is worse than o, i.e., o' cannot be the root of MHT, which is a contradiction.

  • 0

    Nice. Thank You!

  • 1

    DP works well on a directed acyclic graph but this undirected is hard to be solved by DP without using the tricks that I still do not understand.

  • 0

    @jedihy In fact, a tree is always acyclic. If you fix the root, then the directions of all edges can be oriented accordingly with respect to the root.

Log in to reply

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.