From what I can tell, the running time is O(height of BST) + O(k), optimal means lower bound?
Why the hint says "The optimal runtime complexity is O(height of BST)." ?

Add property
size
to BST node. Looks like:class TreeNode(object): def __init__(self, x): self.val = x self.left = None self.right = None self.size = 1 # the size of subtree whose root is current node
When
insert/delete
, updatesize
too. Then we can get kth smallest element according tosize
.Visit from
root
of theBST
: if
root.size == k
, returnroot.val
.  if
root.size > k
, search thek
smallest element ofroot.left
 if
root.size < k
, search thek  root.size
smallest element ofroot.right
At most we search
height of BST
times, so it'sO(height of BST)
.
 if

I have the same thought as shiyanhui.
The additional size property is to be maintained when inserting or deleting the node.
Simply when you find the position to insert/delete, you update the size property(+1 / 1) along the parent path all the way to the root.
I think this pretty much completes the interview question.