In depth explanation of the non-stack solution


  • 0
    F

    First thanks @dietpepsi for bringing up the non-stack solution. For those of you who are interested in a more rigorous explanation, here it is.

    The original post explained the solution in terms of "indegree" and "outdegree". A quick examination revealed that it can equivalently be done by considering the total number of non-null nodes and null nodes. Assume we have obtained the "nodes" array by splitting the given string "preorder", now we would like to scan this "nodes" array to verify if the preorder serialization is correct. The two criteria on which our verification process relies are as follows.

    Suppose x and y are the total numbers of non-null and null nodes, respectively, scanned so far, then for a valid preorder serialization, we should have:

    1. At any point during the scanning (before reaching the last element), x - y > -1;
    2. At the end of the scanning, x - y = -1.

    Of course these are enough for us to generate the solution. But you might be wondering the validity of these statements, so I will go ahead and give the proof of the conclusions we drew above.

    Let me first prove that the second statement is true, which is recast as follows: For any binary tree that is preorder traversal serialized correctly, the difference between the total number of non-null (labeled as x) and null (labeled as y) nodes is -1.

    To proceed, let's do some simple example:

    • empty tree --> "#", we have x = 0, y = 1 and x - y = -1.
    • root-only tree --> "1", "#", "#", we have x = 1, y = 2, x - y = -1. (The root is taken as "1" for example)

    Now let's suppose this is true for any binary tree with total node number less than or equal n (n >= 1). Then for any binary tree with total node number (n+1), it can be divided into three parts: the root, the left subtree and the right subtree. Let n1 and n2 be the total number of nodes for the left and right subtrees, respectively, then n1 >= 1, n2 >= 1 and n1 + n2 + 1 = n + 1. These equations imply that 1 <= n1 <= n and 1 <= n2 <= n. If x1 and y1 are the total number of non-null and null nodes for the left subtree, x2 and y2 are the total number of non-null and null nodes for the right subtree, then from our assumption, we expect: x1 - y1 = -1, x2 - y2 = -1, which leads to: (x1 + x2 + 1) - (y1 + y2) = -1. Note that (x1 + x2 + 1) and (y1 + y2) are the total number of non-null and null nodes of the binary tree with total node number (n+1), by mathematical induction, we know that our conclusion is true for all binary trees. (One point here is that the actual minimum increment of the total number of nodes (include null and non-null) in the binary tree is 2, not 1, i.e., you cannot add a single node to construct a new valid binary tree. Also when n > 1, the root is not null node.)

    With the second statement validated, we will now turn to the first one. Again suppose x and y are the total numbers of non-null and null nodes scanned so far, x'and y' are the total numbers of non-null and null nodes for the rest nodes, then we should have: (x + x') - (y + y') = -1, or (x - y) = -1 - (x' - y'). In order for x - y > -1, we need to make sure that x' - y' < 0. How do we prove that?

    Well, we need to figure out what contributes to the rest of the nodes from the point of view of the original binary tree. If the node currently being scanned is not null, then the rest nodes will be constituted by the two subtrees of the current node, as well as any untraversed subtrees as we go towards the root of the binary tree. If the current node is null, then we will only have the untraversed subtrees towards the root. The bottom line is that in both cases, the rest nodes can be divided into segments, where each segment forms a subtree in the original binary tree. Now let's say xi and yi are the non-null and null nodes for each of these subtrees, according to statement 2, we have (xi - yi) = -1, and if we sum them up, we should obtain: Σ(xi - yi) = Σ(xi) - Σ(yi) = x' - y' < 0, so the first statement is substantiated.

    Here is a quick summary of the solution: we simply scan nodes from the "nodes" array and keep track of the difference between numbers of non-null and null nodes so far. If at any point the difference hits -1 and we have more nodes to scan, it's not a valid preorder serialization. Otherwise we come to the end of the scan, then we just need to check if the difference is -1.

    Now you might think we are done with the problem, but are we?

    If you look at our criteria above more carefully, you'll find out that these are only the necessary conditions for a serialization to be valid. In other words, we only know that if a preorder serialization is valid, then the two criteria should be met, so whenever any of the conditions is violated, we can conclude the serialization is invalid. What happens if there is no violation during the scanning process (i.e., we come to the end smoothly with a node difference '-1')? Can we say the serialization is valid (which is implicitly assumed in the code above)?

    To answer the questions, we need to show that the two criteria are also the sufficient conditions for a valid preorder serialization. Based on the description of the original problem, I shall define "sufficient conditions" the same as that we should be able to construct at least one valid binary tree, such that when it is serialized according to preorder traversal, the exact string of "preorder" will be produced, provided that no violation occurs during the scanning process of the "nodes" array. Here is a quick proof based on mathematical induction.

    First let me restate the problem: given any array of strings called "nodes" with each element in the array representing either a number or the "#" symbol, if there is no violation of the aforementioned criteria occurs during the scanning process, then we should be able to construct at least one valid binary tree from the "nodes" array, such that a preorder traversal serialization of the binary tree will produce the exact string of "preorder".

    Again let's look at an array of length 1, in which case can only be ["#"]. Apparently we can construct an empty binary tree from it. Now suppose it is true for an array with length n (n >= 1), then for any array with length (n+1), if there is no violation occurs during the scanning process, we should have:

    1. The first element of the array is not "#" (i.e. the root is not null);
    2. x - y = -1, where x and y are the total numbers of elements in the whole array representing a number and the "#" symbol, respectively.
    3. x' - y' >= 0, where x' and y' are the total numbers of elements accumulated so far representing a number and the "#" symbol, respectively, at any point before the scanning is done.

    Now let's look at conclusion 3. Here I claim there will exist at least one index i (0 < i < n) such that we have x' - y' = 0 for the scanned subarray [0, i] (both inclusion), otherwise we will expect x' - y' >= 1 right before we reach the last element and eventually we expect x - y >= 0, which contradicts with conclusion 2. Now according to conclusion 1, if we consider only subarray [1, i], let x'' and y'' be the total numbers of elements in it representing a number and the "#" symbol, respectively, then x'' - y'' = -1. Similarly for subarray [i+1, n], x''' - y''' = -1, if x''' and y''' are defined in the similar fashion because (x' + x''') - (y' + y''') = -1 and x' - y' = 0. Again from our presumption, we can construct at least one valid binary tree for subarray [1, i] (called T1) and subarray [i+1, n] (called T2), respectively. If we treat the first element of the original array as a root, set T1 as its left subtree, T2 as its right subtree, then we should be able to construct a valid binary tree for the original array, which indicates our assumption is also true for an array with length (n+1), thus shows the correctness of the "sufficient conditions" claim.


  • 0
    F

    Here is a complete java program (have to put it in an answer due to number of characters limitation):

    public boolean isValidSerialization(String preorder) {
        int d = 0;
    	String[] nodes = preorder.split(",", 0);
    	
    	for (int i = 0; i < nodes.length; i++) {
    		d += ("#".equals(nodes[i]) ? -1 : 1);
    		if (d == -1 && i < nodes.length - 1) return false;
    	}
    	
        return d == -1;
    }
    

    Next we may further figure out comparable criteria for inorder and postorder traversal serializations following a similar line. You are welcome to share any thoughts on that!


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.