## Efficiency of repeated squaring: proof

My last post proposed a claim:

The binary algorithm is the most efficient way to build $n$ using only doubling and incrementing steps. That is, any other way to build $n$ by doubling and incrementing uses an equal or greater number of steps than the binary algorithm.

Someone posted a very nice, relatively short proof in the comments, which was quite different from the proof I had in mind. Maybe I’ll write about it in another post, but for now you can go read it for yourself.

In this post I’d like to present the proof I had in mind, which has a more constructive/computational flavor. Let’s use the digit $1$ to represent an increment step, and the digit $2$ to represent a doubling step. We can use a sequence of these digits to compactly represent a sequence of steps. Each sequence $s$ of $1$’s and $2$’s thus corresponds to a natural number, namely, the one we get if we start from zero and execute all the steps from left to right. Let’s denote this number by $N(s)$. So, for example, $N(1221121) = 13$, since starting from $0$, we first increment (yielding 1), then double twice (yielding 4), increment twice (6), double (12), and finally increment (13). Also, denote the length of $s$ by $\ell(s)$. The length of $s$ is the same as the number of steps it represents. For example, $\ell(1221121) = 7$.

Now, it turns out that $1221121$ is not the most efficient way to generate $13$. In looking at this and other examples of such non-optimal sequences, what stuck out to me is that they all seemed to contain consecutive increment operations. For example, in the case of $121121$, consecutive $1$’s are used in the middle to go from $4$ to $6$, after doubling $2$ to get $4$. But if we had just incremented once first (going from $2$ to $3$), then doubling would get us to $6$ right away; the effect of the one increment is “magnified” by the subsequent doubling. Formally, we can say that $211$ (a double followed by two increments) can always be replaced by $12$ (an increment followed by a double), because $2x + 1 + 1 = 2x+2 = 2(x+1)$.

What if we keep doing this operation—replacing $211$ with $12$—as much as we can? It turns out that by doing this we can always transform any sequence into an equivalent one the same length or shorter which has no consecutive $1$’s. This is the idea behind the first lemma.

Lemma 1. Given any sequence $s$, there exists a sequence $s'$ such that

• $N(s') = N(s)$,
• $\ell(s') \leq \ell(s)$, and
• $s'$ has no consecutive $1$’s.

In other words, for any sequence we can always find an equivalent, shorter (or at least not longer) sequence with no consecutive $1$’s.

Proof. By induction on the number of $1$ symbols in $s$.

• Base case: if $s$ has no consecutive $1$’s, then $s = s'$ satisfies the conditions of the lemma, and certainly an $s$ with zero $1$’s has no consecutive $1$’s.
• Let $k \geq 0$ and suppose we know that for any $s$ with exactly $k$ $1$’s there exists an equivalent $s'$ with the stated properties. Now suppose we have a sequence $s$ with exactly $(k+1)$ $1$’s. If none are adjacent then we are done, since $s$ itself has the required properties. Otherwise, consider the leftmost pair of consecutive $1$’s. There are two cases to consider:

• If $s$ begins with two $1$’s, $s = 11\dots$ this means that the procedure for computing $N(s)$ starts by incrementing twice from $0$ to reach $2$. Let $s'$ be the sequence obtained from $s$ by replacing the initial $11$ with $12$. An increment followed by a double also yields $2$, and the rest of $s'$ is identical to $s$, so $N(s) = N(s')$. But $s'$ has one fewer $1$ than $s$, so the induction hypothesis tells us there must be some equivalent $s''$ no longer than $s'$ with no consecutive $1$’s. This $s''$ is the one we are looking for; we need only observe that $N(s) = N(s') = N(s'')$ and $\ell(s'') \leq \ell(s') = \ell(s)$.

• Otherwise, the leftmost pair of $1$’s must occur immdiately following a $2$. In that case, as argued before, we can replace $211$ by $12$, which yields an equivalent but shorter sequence and reduces the number of $1$’s by one. Again, the induction hypothesis now implies that we can find an equivalent sequence with no repeated $1$’s which is no longer.

Let’s look at an example. Suppose we start with the most inefficient way possible to represent $13$, namely, $13$ consecutive increment steps. Then the lemma above says we can find an equivalent, shorter sequence with no consecutive $1$’s. Moreover, the proof is actually constructive, that is, it does not merely assert that such a sequence exists, but gives us a concrete way to compute it. The proof outlines a recursive rewriting process which we can visualize as follows:

The red underline shows the part that is going to be rewritten at each step. Notice how the length either stays the same or decreases at each step, and how the number of $1$’s decreases at each step. In fact, either a $1$ gets deleted and the number of $2$’s stays the same, or a $1$ is replaced by a $2$ when there are two $1$’s at the leftmost position. This is the only way new $2$’s are generated. So $2$’s are “born” at the left end from a pair of $1$’s, and then they spend the rest of their life slowly “migrating” to the right.

Let’s try starting with a different representation of $13$, say, $121112111$:

Curiously, it seems that the process ends with the same sequence ($121221$) even though we started with a different sequence that did not occur anywhere during the process generated by starting with thirteen $1$’s. Could this just be a coincidence?

Well, the fact that I’m even asking the question kind of gives away the answer: it’s no coincidence. In fact, for a given $n$, if you start with any sequence representing $n$ and run this process, you will always end with the same sequence at the end. And not only will this particular process always yield the same sequence, but there is only one possible sequence it could yield:

Lemma 2. For every natural number $n$, there is a unique sequence $s$ such that $s$ has no consecutive $1$’s and $N(s) = n$.

Proof. By (strong) induction on $n$.

• In the base case ($n = 0$), there is only one sequence which represents $0$ at all, namely, the empty sequence (and it has no consecutive $1$’s).

• Now pick some $n \geq 0$ and assume that consecutive-$1$-less representations are unique for all $n' < n$. Suppose $s_1$ and $s_2$ are two sequences with no consecutive $1$’s such that $N(s_1) = N(s_2) = n$. Our goal is to show that in fact $s_1 = s_2$.

• If $s_1$ and $s_2$ both end with the same symbol, then removing it yields two consecutive-$1$-less sequences representing some $n' < n$ (either $n - 1$ or $n / 2$ depending on the symbol removed); by the induction hypothesis they must be the same and hence $s_1 = s_2$ as well.

• Otherwise, suppose without loss of generality that $s_1$ ends with $1$ and $s_2$ ends with $2$; we will show that this case is actually impossible. $N(s_2)$ must be even, since it ends with a doubling step. If $s_1$ ends with a doubling step followed by an increment step (or if it consists solely of an increment), then it would be odd, but this is impossible since $N(s_1) = N(s_2)$ and $N(s_2)$ is even. Hence $s_1$ must end in consecutive $1$’s; but this is also impossible since we assumed $s_1$ had no consecutive $1$’s.

Finally, putting the pieces together: notice that the sequence generated from the binary representation of $n$ has no consecutive $1$’s, since each bit always generates a $2$, which is optionally followed by a $1$. Since by Lemma 2 we now know that such representations are unique, the process explained in the proof of Lemma 1 must actually result in this same unique sequence corresponding to the binary representation of $n$. Since this process results in a sequence which is no longer than the starting sequence, this means that every sequence of steps representing $n$ must be at least as long as the binary sequence. Hence, the binary sequence is the most efficient!

Associate Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.
This entry was posted in computation, proof and tagged , , , , , , , , . Bookmark the permalink.