Lecture 12: Tree Isomorphism and QuickSelect
- Divide and conquer
- Variable partitioning
- Randomized and de-randomized algorithms
Russell began by talking about binary tree "rotations" for transforming one binary tree into another. If you don't care about the order of a node's children, these transformations can be seen as isomorphisms, specifically tree isomorphisms.
Suppose you have two trees \( T_1, T_2 \) and you want to know whether they are isomorphic. Method: First compute the size \( s(u) \) of the sub-tree rooted at each node \( u \). Then run the following program:
    TI(r1, r2):
        if s(r1) != s(r2):
            return False
        else if s(r1) <= 1:
            return True
        else:
            a1, b1 <- children of r1
            a2, b2 <- children of r2
            return (TI(a1, a2) AND TI(b1, b2)) OR (TI(a1, b2) AND TI(b1, a2))
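The pseudocode above can be sketched in Python as follows. This is a minimal sketch, not the version shown in lecture: the `Node` class, the `compute_sizes` and `size` helpers, and the convention that a missing child has size 0 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type: a missing child is None.
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    size: int = 1  # filled in by compute_sizes


def compute_sizes(node: Optional[Node]) -> int:
    """Store s(u), the size of the subtree rooted at u, on every node."""
    if node is None:
        return 0
    node.size = 1 + compute_sizes(node.left) + compute_sizes(node.right)
    return node.size


def size(node: Optional[Node]) -> int:
    # Assumption: an empty subtree has size 0.
    return 0 if node is None else node.size


def tree_isomorphic(r1: Optional[Node], r2: Optional[Node]) -> bool:
    """TI(r1, r2): isomorphism when the order of children does not matter."""
    if size(r1) != size(r2):
        return False
    if size(r1) <= 1:
        return True
    a1, b1 = r1.left, r1.right
    a2, b2 = r2.left, r2.right
    # Try the children in matching order, then in swapped order.
    return (tree_isomorphic(a1, a2) and tree_isomorphic(b1, b2)) or (
        tree_isomorphic(a1, b2) and tree_isomorphic(b1, a2)
    )


# t2 is t1 with children swapped at the root and at one inner node.
t1 = Node(Node(Node(), None), Node())
t2 = Node(Node(), Node(None, Node()))
compute_sizes(t1)
compute_sizes(t2)
print(tree_isomorphic(t1, t2))
```

Because Python's `and` and `or` short-circuit, the swapped pairing is only tried when the matching pairing fails.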
You might think you'd have \( T(n) = 4 T(n / 2) + O(1) \), for which the Master Theorem gives a total time of \( O(n ^ {\log_2 4}) = O (n^2) \). (But we're not really dividing the problem in half each time.) Russell showed on the board that if the trees have linear shapes, you can prove the overall complexity is \( O(n) \).
Note \( n = L_1 + R_1 + 1 \). Here \( n \) is the size of the subtree rooted at a node, and \( L_1 \) and \( R_1 \) are the sizes of its left and right subtrees. If the children are of the same size, the timing recurrence looks like \( T(n) \leq 4 T(\frac{n-1}{2}) + c \). If the children are of different sizes, there's no ambiguity in choosing which way to recurse (the size check rejects the mismatched pairing in constant time), so the timing recurrence looks like \( T(n) \leq T(L_1) + T(R_1) + c \).
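We will guess \( T(n) \leq c' n^2 \) and prove it by induction on \( n \). The base case (a tree with only one node) is immediately true: just choose \( c' > c \).
Inductive step, case 1 (children of the same size): \[ T(n) \leq 4 T((n - 1)/2) + c \leq 4 c' \left(\frac{n-1}{2}\right)^2 + c = c' (n-1)^2 + c \leq c' n^2 \] The last step holds because \( c'(2n - 1) \geq c \) for \( n \geq 1 \).
Inductive step, case 2 (children of different sizes): \[ T(n) \leq T(L_1) + T(R_1) + c \leq c' (L_1^2 + R_1^2) + c \leq c' (L_1^2 + R_1^2 + 1) \leq c' (L_1 + R_1 + 1)^2 = c' n^2 \]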
Russell made the point that it usually doesn't make sense to use \( O \) notation in an induction proof. This is because induction proofs deal with a fixed \( n \), but \( O \) notation describes how something grows as \( n \rightarrow \infty \).
Selection
\( i \)-Select: Find the \( i \)-th largest element of a list. Russell described QuickSelect: http://en.wikipedia.org/wiki/Quickselect. This has \( O(n^2) \) worst-case performance.
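The proof of linear expected time uses a probabilistic argument. With probability \( 1/2 \) the random pivot falls in the middle half of the sorted order, leaving a subproblem of size at most \( \frac{3}{4} n \); otherwise we bound the subproblem size by \( n \). \[ ET(n) \leq \frac{1}{2} \left( ET\left(\tfrac{3}{4} n\right) + O(n) \right) + \frac{1}{2} \left( ET(n) + O(n) \right) \] Solving for \( ET(n) \) gives \( ET(n) \leq ET(\frac{3}{4} n) + O(n) \), a geometric sum, so \( ET(n) = O(n) \).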
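For concreteness, here is a minimal sketch of randomized QuickSelect in Python. It is not necessarily the variant presented in lecture. It assumes 1-based ranks counted from the largest element, matching the problem statement above, and it partitions into new lists rather than in place to keep the code short. The function name `quickselect` and its signature are illustrative choices.

```python
import random


def quickselect(items, i):
    """Return the i-th largest element of items (i = 1 is the maximum)."""
    if not 1 <= i <= len(items):
        raise ValueError("i must be between 1 and len(items)")
    items = list(items)  # work on a copy so the caller's list is untouched
    while True:
        pivot = random.choice(items)  # uniformly random pivot
        larger = [x for x in items if x > pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if i <= len(larger):
            # The answer is among the elements larger than the pivot.
            items = larger
        elif i <= len(larger) + equal_count:
            # The pivot (or a duplicate of it) has rank i.
            return pivot
        else:
            # Skip everything >= pivot and continue among the smaller elements.
            i -= len(larger) + equal_count
            items = [x for x in items if x < pivot]


print(quickselect([7, 2, 9, 4, 1], 2))  # second largest: 7
```

Each round does linear work on the current list and discards at least the pivot, so the loop terminates. An unlucky sequence of pivots still gives the quadratic worst case.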