Let I2 be the set of tuples (a0,a1,a2,…) of nonnegative integers, such that all but finitely many ar are nonzero. We impose the lexicographic total ordering on I2: given a=(a0,a1,a2,…) and b=(b0,b1,b2,…) in I2, take the largest r such that ar≠br, and define a>b if and only if ar>br.
We have described a bijection between trees of depth at most 2 and tuples in I2. The theorem (for depth 2) now follows from the following two lemmas, which are left as an exercise to the reader.
Lemma. Any legal move changes a tuple a∈I2 to another tuple b such that b<a for our total ordering.
Lemma. The set I2 is well-ordered: it contains no infinite descending chain.
We will construct recursively a well-ordered set Id which classifies the isomorphism types of trees of depth at most d. Specifically, we will construct Id+1 as a set of tuples of nonnegative integers, where instead of indexing by N we index by Id. Precisely, Id+1 is the set of functions a:Id→N (we notate such a function i↦ai) such that ai=0 for all but finitely many i∈Id.
Again, we impose on Id+1 the lexicographic total ordering: given any a,b∈Id+1, choose i∈Id maximal such that a(i)≠b(i); and take a>b if and only if a(i)>b(i). (Only finitely many i satisfy a(i)≠b(i), so there is no difficulty in choosing the maximal one.)
Just like before, there is a bijection between elements of Id+1 and isomorphism types of trees of depth at most d+1: given a tree T, produce the tuple which records for each i∈Id the number of copies of Ti that are immediate subtrees of T.
The result now follows from the following two lemmas, the first of which we again leave as an exercise.
Lemma. Any legal move changes a tuple a∈Id+1 to another tuple b such that b<a for our total ordering.
Lemma. The set Id+1 is well-ordered: it contains no infinite descending chain.
Proof. Suppose a(1)≥a(2)≥a(3)≥⋯ is an infinite descending chain in Id+1. We want to show that eventually the chain stabilizes: there exists N such that a(n)=a(N) for n≥N.
(A remark about notation: a(N)∈Id+1 is a tuple in our infinite descending chain of tuples. For i∈Id, the i-th entry of this tuple is a(N)i.)
Let j∈Id be an index; we say that the sequence (a(n)) stabilizes to the right of index j if there exists a positive integer N such that
a(n)i=a(N)i
whenever i≥j and n>N.
Since Id is well-ordered, we can choose j minimal such that our sequence stabilizes to the right of j; also choose N as above, so that (a(n))i=(a(N))i whenever i≥j and n>N.
The tuple a(N) has only finitely many nonzero entries. If a(N)k=0 for all k<j, then the sequence stabilizes at a(N) and we are done. Otherwise, choose k maximal such that k<j and a(N)k>0. (There are only finitely many k with a(N)k>0, so again there is no harm in choosing the maximum.)
I claim that in fact the sequence stabilizes to the right of index k, contradicting our choice of j. Consider an index i≥k. If i≥j then we know (a(n))i=(a(N))i for n>N. If k<i<j then a(N)i=0; and by the lexicographic ordering on Id+1 we know that a(n)i=0 for n>N as well. So we only need to worry about i=k. Again by lexicographic ordering, the sequence
a(N)k,a(N+1)k,a(N+2)k,…
is a nonincreasing sequence of natural numbers. Any such sequence in N eventually stabilizes. So our sequence (a(n)) stabilizes to the right of k as well, contradicting our choice of j. ◼
In closing I'll just mention that the sets Id are examples of ordinals; interpreting them as such gives rise to the proofs linked to at the very beginning of this post.