A simple take on this can be explained as follows:

Let's start with a generic binary tree:
A / \ B C / \ D E / F \ H G

We notice that in the above diagram, we have 7 Nodes (AG) that are part of the tree and one unlinked node (H).

Let's assume we need to find the LCA of nodes D and E, we can find this out by running a DFS for D and tracking it's parents and a DFS for E and doing the same. Parent's are tracked in a bottomup apprach. That is, after we find the child, we add it to the list and then it's parent and all the way back to the root. (keep reading)

If use 2 arrays,
path_to_p
andpath_to_q
to track the ancestors of D and E, we get something likepath_to_p = [D,B,A]
andpath_to_q = [E,B,A]
. Now one can immediately notice that the value of the 1st index of both arrays are the same, and hence it is the LCA.
Under the hood we can implement this with the help of two pointers;p_p
that refers to 0th index ofpath_to_p
andp_q
to the 0th index forpath_to_q
and iterate over the lenght of one of the arrays and keep comparing adjacent values.
p_p, q_p = 0,0
for i in range(len(path_to_p)):
if path_to_p[p_p] == path_to_q[q_p]:
return path_to_p[p_p]
p_p += 1
q_p += 1

Let's take another example, LCA of D and C. After running DFS for both nodes, we find out
path_to_p = [D,B,A]
andpath_to_q =[C,A]
. We notice two things. 1) the lenght of the two lists are not the same. 2) One can say with confidence that A is the LCA. 
Based on our observations above, we can see that if offset the pointer of the largest array by some amount, we can faithfully just iterate over the length of the smaller array and we should reach our desired result! That offset can be found taking a difference in lengths of the two lists and assigning that offset to the pointer of the bigger of the two, instead of assigning it a 0, as shown :
p_p, q_p == 0,0 if len(path_to_p) > len(path_to_q): p_p = len(path_to_p)  len(path_to_q) elif len(path_to_p) < len(path_to_q): q_p = len(path_to_q)  len(path_to_p) for i in range(min(len(path_to_p),len(path_to_q)): if path_to_p[p_p] == path_to_q[q_p]: return path_to_p[p_p] p_p += 1 q_p += 1

Now let's handle the edge case, node H. It is a valid node, but isn't part of the tree. Hence, if we do a DFS for it, we will not be able to find and therefor, the array that stores it's ancestors is empty. Hence, if one of the two arrays are empty we can return back,
None
.
Let's see the code?!
code :
# Definition for a binary tree node.
# class TreeNode(object):
# def __init__(self, x):
# self.val = x
# self.left = None
# self.right = None
class Solution(object):
def lowestCommonAncestor(self, root, p, q):
"""
:type root: TreeNode
:type p: TreeNode
:type q: TreeNode
:rtype: TreeNode
"""
path_to_p, path_to_q = [], []
p_p, q_p = 0, 0
def helper(node, search, path): #runs DFS for search node
if node == None:
return False
if node == search:
path.append(node)
return True
if helper(node.left, search, path):
path.append(node)
return True
if helper(node.right, search, path):
path.append(node)
return True
helper(root, p, path_to_p) #DFS for p
helper(root, q, path_to_q) #DFS for q
if len(path_to_p) > len(path_to_q):
p_p = len(path_to_p)  len(path_to_q)
elif len(path_to_p) < len(path_to_q):
q_p = len(path_to_q)  len(path_to_p)
for i in range(0, min(len(path_to_q), len(path_to_p))):
if path_to_p[p_p] == path_to_q[q_p]:
return path_to_p[p_p]
p_p += 1
q_p += 1
return None