# Find Duplicate Subtrees

• Given a binary tree, return all duplicate subtrees.

For example,

``````        1
/ \
2   3
/   / \
4   2   4
/
4
``````

The following are two duplicate subtrees:

``````      2
/
4
``````

and

``````    4
``````

Therefore, return `[ [2,4], [4] ]`.

• When you say a subtree.. is it the complete tree under a node which will go upto the leaves or just any intermediate subtree?

If it is the first case then we can traverse through the tree and at every node store the inorder traversal as well as postorder(or preorder.. does not matter).. if any two distinct nodes in the tree have the same inorder and preorder traversal values then have a duplicate subtree's starting at those nodes...

• @birinder said in Find Duplicate Subtrees:

is it the complete tree under a node which will go upto the leaves or just any intermediate subtree?

It will have to go down to the leaves.

• @1337c0d3r I think then the logic that I gave should work.. the challenge might be to calculate inorder and preorder for each node's subtree efficiently... i will code this soon and post my solution.

• ``````public class getDuplicates(TreeNode node){
if(root==null)
return null;
HashMap<String, HashMap<String, ArrayList<Node>>>() map = new HashMap<String, HashMap<String, ArrayList<Node>>>();
setTraversal(node, map);
/// in the end every ArrayList in the inner HashMap of map will contain the nodes containing the duplicate subtrees
}
public String [] setTraversal(TreeNode root, HashMap<String, HashMap<String, ArrayList<Node>>>() map) {
if(root==null)
return new String[2];

String[] left= setTraversal(root.left);
String[] right= setTraversal(root.right);
String [] response = new String[2];

response[0] = root.val+left[0]+right[0];
response[1] = left[1]+root.val+right[1];

HashMap<String, ArrayList<Node>> innerMap;
if(map.containsKey(response[0])){
innerMap=map.get(response[0]);
}else{
innerMap =  new HashMap<String, ArrayList<Node>>();
map.put(response[0],innerMap);
}
ArrayList<Node> list;
if(innerMap.containsKey(response[1]){
list=innerMap.get(response[1]);
}else{
list= new ArrayList<Node>();
innerMap.put(response[1], list);
}
list.add(root);
return response;
}
``````

• wouldn't it be enough to store all subtrees in a HashSet?

``````class Ideone {
public static Set<TreeNode> findDuplicate(TreeNode root,Set<TreeNode> set){

Set<TreeNode> duplicates = new HashSet<>();
if(root == null)
return duplicates;

duplicates.addAll(findDuplicate(root.left,set));
duplicates.addAll(findDuplicate(root.right,set));

if(set.contains(root))
duplicates.add(root);
else
set.add(root);

return duplicates;
}

static class TreeNode {

private TreeNode left;
private TreeNode right;

private int value;
Integer hashCode = null;

public TreeNode(int v){
value = v;
}

public void setLeft(TreeNode t){
left = t;
hashCode = null;
}

public void setRight(TreeNode t){
right = t;
hashCode = null;
}

public TreeNode getLeft(){
return left;
}

public TreeNode getRight(){
return right;
}

public String toString(){
return " { " + value + " , " + left + " , " + right + " }";
}

public boolean equals(Object o){
if(!(o instanceof TreeNode))
return false;

TreeNode node = (TreeNode)o;
System.out.println("comparing\n" + this + "\nwith\n" + node );
if(node.value != value || ((node.left == null) != (left == null)) ||
((node.right == null) != (right == null)) )
return false;

if(node.left!= null && !node.left.equals(left))
return false;

if(node.right!= null && !node.right.equals(right))
return false;

return true;
}

public int hashCode(){
if(hashCode != null)
return hashCode;
hashCode = value;
if(left != null){
hashCode = 17 * hashCode + left.hashCode();
}
if(right != null){
hashCode = 19 * hashCode + right.hashCode();
}
return hashCode;
}
}
}``````

• How about creating lists from leaf to root every time we hit a leaf. We can then compare these for each levels.

ex.
list 1 : 4 2 1
list 2 : 4 2 3 1
list 3 : 4 3 1

we can use this to figure out the duplicates as 4 & 4,2.
we could of course optimize it.

• @Ab4188 Looks like you'll only detect duplication between root-leaf paths. If there is a subtree duplication, how do you merge two paths together to detect that?

• @birinder I'd point out that if you want to hash the tree rooted at a node. You can do so with one pre-order traversal as long as you print out null as well. That should simplify your code significantly :)

• As the post @xidui mentioned, we can use hash subtree during postorder traversal. The hash function is just seralization of the subtree as Problem. 297

``````def findDuplicateSubtree(root):
#d is the set storing hashed subtree
#lst is the set storing duplicate subtree
global d, lst
d = set()
lst = set()
def recur(node):
global d, lst
if not node:
return "#"
left = recur(node.left)
right = recur(node.right)
hashed = left + "," + str(node.val) + "," + right
if hashed in d:
lst.add(hashed)
d.add(hashed)
return hashed
return [l.split(",") for l in lst]
``````

If we need to conform the stated output format, we can remove the trailing "#"s.

• //Approach:
Using post-order traversal. For each node retrieve an object of type Data for its left subtree and right subtree.Data object contains the following attribuets:
An array list of nodes in the pre-order traversal of a tree, A HashMap()where the key is the hashcode for a subtree's post-order traversal and the value is the corresponding
traversal in a array list, a nested list of lists containing duplicated trees found so far. Once the left (l) and right subtrees (r) of a node (n) have been explored do the following:
For each subtree in l's HashMap, check if that subtree is also in r's HashMap. If true, remove that subtree from r's HashMap and add it to an array list of duplicated trees. If false,
add it to a HashMap which will keep track of unduplicated trees. Return a new data object with the updated array list of duplicated trees, HashMap of unduplicated trees from the left
and right returns, and the array list of the post order traversal rooted at n.
Time Complexity: O(n^2) where n is the size of the tree.

``````public static class Data
{
List<Integer> tree; //Serialized version of post-order traversal
HashMap<Integer,List<Integer>> mp;
List<List<Integer>> duplcs;

public Data(List<Integer> t){
tree = t;
mp = new HashMap<Integer,List<Integer>>();
duplcs = new ArrayList<List<Integer>>();
}
}

public static class Node{
int value;
Node left;
Node right;
}

public List<List<Integer>> duplicates(Node n){
Data d = getDupicates(n);
return d. duplcs;
}

private static Data getDuplicates(Node n){
if(n == null){
return null;
}
Data l = getDuplicates(n.left);
Data r = getDuplicates(n.right);
Data d;
StringBuilder t;

if(l == null && r == null){
List<Intger> ls = new ArrayList<Integer>();
ls.add(n.val);
d = new Data(ls);
d.mp.put(ls.hashCode(),ls);
}
else if (l != null && r == null){
List<Integer> t = l.tree;
t.add(n.val);

d = new Data(l.tree);
l.mp.put(t.hashCode(),t);
d.mp = l.mp;
d.duplcs = l.duplcs;
}
else if (l == null && r != null){
List<Integer> rTree = r.tree;
rTree.add(n.val);
r.mp.put(rTree.hashCode(),rTree);
d = new Data(rTree);
d.mp = r.mp;
d.duplcs = r.duplcs;
}else{
List<Integer> tree = new ArrayList<Integer>();
tree.addAll(l.tree);
tree.addAll(r.tree);
tree.addAll(n.val);
Map<Integer,List<Integer>> mp = new HashMap<Integer,List<Integer>>();
List<List<Integer>> duplcs = new ArrayList<List<Integer>>();
for(Map.Entry<Integer,List<Integer>> e: l.mp.entrySet()){
int key = e.getKey();
if(r.mp.contains(key)){
duplcs.add(e.getValue());
r.mp.remove(key);
}else{
mp.put(key,e.getValue());

}
}
for(Map.Entry<Integer,List<Integer>> e: r.mp.entrySet()){
mp.put(e.getKey(),e.getValue());
}
duplcs.addAll(l.duplcs);
duplcs.addAll(r.duplcs);
d = new Data(tree);
d.mp = mp;
d.mp.put(tree.hashCode(),tree);

}
return d;
}
``````

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.