Quickly recognizing primes less than 1000: memorizing exceptional composites

In my previous post I wrote about a procedure for testing the primality of any number less than 1000:

  1. Test for divisibility by all primes up to 13, and also 19. (In practice I test for 2 and 5 first, which is pretty much automatic; then for 3 and 11, which both involve adding and subtracting digits; then 7 and 19, which both involve multiplying the final digit by two and either subtracting (7) or adding (19); and last I test for 13.)

  2. Check if the number is one of a memorized list of 18 composites less than 1000 which are not divisible by either 19 or any prime \leq 13. (In practice, of course, I do this check first, before doing any divisibility tests.)

How fast is it?

Using this method, presented with a random number from 1-1000 not divisible by 2 or 5 (I exclude those because I can pretty much tell they are not prime instantaneously, and I think it makes for a more interesting measurement to exclude them), right now it takes me on average 15 seconds to determine whether such a number is prime or not. Of course the variance is huge: numbers that are divisible by 3 I can identify in a second; for numbers that are actually prime, it can take me something like 40 seconds to run through all the divisibility tests in my head. I expect with practice I’ll get faster.

I’m still interested in exploring other methods as well, but figured this is something relatively simple that can provide a good baseline for comparison!

Memorizing exceptional composites

For the rest of the post I want to talk about how I memorized the list of exceptional composites. Here’s the list again, this time with prime factorizations given:

\begin{array}{rcl}289 &=& 17^2 \\ 391 &=& 17 \cdot 23 \\ 493 &=& 17 \cdot 29 \\ 527 &=& 17 \cdot 31 \\ 529 &=& 23^2 \\ 629 &=& 17 \cdot 37 \\ 667 &=& 23 \cdot 29 \\ 697 &=& 17 \cdot 41 \\ 713 &=& 23 \cdot 31 \\ 731 &=& 17 \cdot 43 \\ 799 &=& 17 \cdot 47 \\ 841 &=& 29^2 \\ 851 &=& 23 \cdot 37 \\ 899 &=& 29 \cdot 31 \\ 901 &=& 17 \cdot 53 \\ 943 &=& 23 \cdot 41 \\ 961 &=& 31^2 \\ 989 &=& 23 \cdot 43 \end{array}

That looks like a lot of stuff to memorize. And it would be, if you tried to memorize it as a bunch of individual, disconnected facts. But fortunately we can do better! Human brains are good at remembering sequences and stories. So we’re going to look at the numbers in order and tell ourselves stories about them. The better we get to know them and their relationships the easier they are to remember!

289, 391, 493

There is only one exceptional composite in each of the 200’s, 300’s, and 400’s, namely, 289, 391, and 493. What strikes one immediately about these numbers is that each is exactly 102 more than the previous. Is this a coincidence?

Of course not! It is actually due to the fact that 102 = 6 \cdot 17. Each of these numbers is a product of 17 with some other prime, and the second prime increases by 6 every time. That is, 289 = 17 \cdot 17, then 391 = 17 \cdot (17 + 6) = 17 \cdot 23, and 493 = 17 \cdot (23 + 6) = 17 \cdot 29. Of course adding 6 to a prime doesn’t always get us another prime—but it works surprisingly often for smaller primes. And every prime besides 2 and 3 is either one more than a multiple of 6, or one less. So if we start with 5 and 7 and keep adding 6, we will definitely hit all the primes.

This sequence of multiples of 17 starts from 17 \cdot 5, and if we continue it further we see that it contains several more numbers from our exceptional set as well:

\begin{array}{rcl}85 &=& 17 \cdot 5 \\ 187 &=& 17 \cdot 11 \\ \mathbf{289} &=& 17 \cdot 17 \\ \mathbf{391} &=& 17 \cdot 23 \\ \mathbf{493} &=& 17 \cdot 29 \\ 595 &=& 17 \cdot 35 \\ \mathbf{697} &=& 17 \cdot 41 \\ \mathbf{799} &=& 17 \cdot 47 \\ \mathbf{901} &=& 17 \cdot 53 \end{array}

What about if we start with 17 \cdot 7 and keep adding 102?

\begin{array}{rcl}119 &=& 17 \cdot 7 \\ 221 &=& 17 \cdot 13 \\ 323 &=& 17 \cdot 19 \\ 425 &=& 17 \cdot 25 \\ \mathbf{527} &=& 17 \cdot 31 \\ \mathbf{629} &=& 17 \cdot 37 \\ \mathbf{731} &=& 17 \cdot 43 \\ 833 &=& 17 \cdot 49 \end{array}

This sequence yields three of our exceptional composites, and quite a few others which in theory we can rule out with our divisibility tests but are probably worth knowing anyway (119, 221, 323, 833).

527, 529, 629, 667, 697

There are only two exceptional composites in the 500’s, and they are twins: 527 and 529. 527 we have already seen: it shows up in the sequence of multiples of 17 that begins with 17 \cdot 7. On the other hand 529 is 23^2.

The next exceptional composite is 629, the next multiple of 17 in the sequence after 527. Of course it is also exactly 100 bigger than 529. I personally find the sequence 527, 529, 629 to be quite memorable.

Next is 667 = 23 \cdot 29, which is the closest integer to two-thirds of 1000. If you know that 676 = 26^2, then 667 is also easy to remember for two reasons: it has the same digits as 676 but with the 7 and 6 reversed, and it is exactly 9 less than 676, and hence it is 26^2 - 3^2 = (26 - 3) \cdot (26 + 3).

The last exceptional composite in the 600’s is 697, which is from the sequence of multiples of 17 that started with 289, 391, 493 (595 is skipped because it is obviously a multiple of 5).

713, 731, 799, 841, 851, 899

Next come a pair of twins, a 99, then another pair of twins, then another 99! 713 and 731 are twins because they have the same digits, with the 1 and 3 reversed. 731 we have already seen: it is from the same 17-sequence as 527 and 629. 713 is 729 - 16 = 27^2 - 4^2 = (27-4)\cdot (27+4). 799 is again from the 17-sequence beginning with 289.

841 and 851 are twins because they have the same digits except the 4 and the 5, which are consecutive. 841 is 29^2, and 851 is 900 - 49 = 30^2 - 7^2 = (30 - 7) \cdot (30 + 7). Finally we have 899 which is 900 - 1 = 30^2 - 1^2 = (30 - 1) \cdot (30 + 1).

901, 943, 961, 989

I haven’t thought of a nice story to tell about these—I think of the last three as sort of “sporadic”, but there’s only three of them so it’s not that hard. Someone else could probably come up with a nice mnemonic.

901 in any case is not too hard to remember because it’s a twin with 899, and it’s also the end of the 17-sequence that started with 289.

943 is 23 \cdot 41. It’s 1024 - 81 = 32^2 - 9^2, but unlike some of the other differences of squares I’ve highlighted, I doubt this will actually help me remember it.

961 is 31^2. I think it’s cute that 169, 196, and 961 are all perfect squares.

Last but not least, 989 = 23 \cdot 43. If you happen to know that 33^2 = 1089 (I sure don’t!), then this is easy to remember as 33^2 - 10^2 = (33-10) \cdot (33+10).

Once again

Repetition helps too, so let’s recite: it starts with 289 = 17^2, then continues by 102s: 391, 493. After that the twins 527, 529, followed by 629; then 667 and 697. Then two sets of twins each with its 99: 713, 731, 799; 841, 851, 899; then 901 to come after 899, and then the three sporadic values: 943, 961, 989!

Posted in arithmetic, computation, primes | Tagged , , , , , , ,

Quickly recognizing primes less than 1000: divisibility tests

I took a little hiatus from writing here since I attended the International Conference on Functional Programming, and since then have been catching up on teaching stuff and writing a bit on my other blog. I gave a talk at the conference which will probably be of interest to readers of this blog—I hope to write about it soon!

In any case, today I want to return to the problem of quickly recognizing small primes. In my previous post we considered “small” to mean “less than 100”. Today we’ll kick it up a notch and consider recognizing primes less than 1000. I want to start by considering some simple approaches and see how far we can push them. In future posts we’ll consider some fancier things.

First, some divisibility tests! We already know how to test for divisibility by 2, 3, and 5. Let’s see rules for 7, 11, and 13.

  • To test for divisibility by 7, take the last digit, chop it off, and subtract double that digit from the rest of the number. Keep doing this until you get something which obviously either is or isn’t divisible by 7. For example, if we take 2952, we first chop off the final 2; double it is 4, and subtracting 4 from 295 leaves 291. Subtracting twice 1 from 29 yields 27, which is not divisible by 7; hence neither is 2952.

    As an optimization, we can always reduce things mod 7. For example, if we see the digit 7, we can just throw it away; if we see an 8 or 9 we can treat it as 1 or 2, respectively. And if we see a 3, the rule would tell us to subtract 6, but if it’s easier we can add 1 instead, since subtracting 6 and adding 1 are the same mod 7. With a bit of practice this can be done quite quickly.

    For an explanation of why this works, and several other fun methods for testing divisibility by 7, see this post by Mark Dominus.

  • To test a 3-digit number for divisibility by 11, just add the first and last digits and then subtract the middle digit. The original number is divisible by 11 if and only if the result is.

    This is especially obvious with numbers like 187, where the sum of the first and last digits is equal to the middle digit. (Subtracting the middle digit would leave 0, which is divisible by 11.) But it also applies in cases like 517: we have 5 + 7 - 1 = 12 - 1 = 11.

    The reason this works is that 10 is equivalent to -1 \pmod{11}, so a \cdot 10^2 + b \cdot 10 + c \equiv a \cdot (-1)^2 + b \cdot (-1) + c \equiv a - b + c \pmod{11}. This also suggests how to generalize to more than just 3-digit numbers: just alternately add and subtract digits.

  • To test for divisibility by 13, chop off the last digit, multiply it by 4, and add it to the remaining number. Keep doing this until you end up with something that you know either is or isn’t divisible by 13.

    Here reducing mod 13 can be even more helpful. For example, if the last digit is a 7, the rule says to add 28 to what’s left. But 28 is only 2 more than 2 \times 13 = 26, so adding 28 is equivalent to just adding 2.

    Why does this work? Suppose the final digit of our number is b and the rest of the number is a. That is, our number is of the form 10a + b, and we want to know whether this is equivalent to 0 \pmod{13}. But now note that 10a + b \equiv 0 \pmod{13} if and only if 40a + 4b \equiv 0 \pmod{13}. Why? From left to right, we are just multiplying both sides by 4; from right to left, we are allowed to divide by 4 since 4 is relatively prime to 13. So why did we choose to multiply by 4? It’s because it lets us get rid of the 10: 40 is the smallest multiple of 10 which is one away from a multiple of 13. Hence 10a + b \equiv 0 \pmod{13} iff 40a + 4b \equiv 0 \pmod{13} iff a + 4b \equiv 0 \pmod{13}.

    (Challenge: can you go back now and prove the test for divisibility by 7?)

At this point we might ask: if we take a number less than 1000 and test it for divisibility by 2, 3, 5, 7, 11, and 13, what’s left? In other words, what are the composite numbers under 1000 that we haven’t found yet? It turns out there are 27 of them: 289, 323, 361, 391, 437, 493, 527, 529, 551, 589, 629, 667, 697, 703, 713, 731, 779, 799, 817, 841, 851, 893, 899, 901, 943, 961, and 989. I’ll let you work out the factorizations; of course each one is a product of two primes which are at least 17.

So we could try to memorize this list and call it a day. Then the procedure becomes: given a number less than 1000, (1) test it for divisibility by all primes up to 13, and (2) check if it is one of the 27 composite numbers we have memorized. If it passes both tests, then it is prime. This sounds doable, though honestly I’m not super excited about memorizing a list of 27 composites.

There are a few more things we could do, though. First of all, notice that the divisibility test for 19 is super easy, since 19 is one less than 2 times 10: chop off the last digit, double it, and add it to the rest. Keep doing this until… you know the drill. This is just like the test for 7, but we add instead of subtract.

OK, so what if we test for all primes up to 13 and also 19? Then there are only 18 composites left that we have to memorize: 289, 391, 493, 527, 529, 629, 667, 697, 713, 731, 799, 841, 851, 899, 901, 943, 961, and 989. This is looking a bit better, and I am already noticing lots of patterns that would help with memorization: 529 and 629; 713 and 731; 899 and 901… oh, and 289 \to 391 \to 493 (since 102 is 6 \times 17). (…and it turns out that before publishing this post I couldn’t help myself and went ahead and memorized the list. It wasn’t that hard. I’ll say more about it in a future post!)

We could also test for divisibility by 17, of course. Unfortunately it is a bit more annoying: the smallest multiple of 10 which is one away from a multiple of 17 is 50, which is one less than 3 \times 17 = 51. So, to test for divisibility by 17, we chop off the last digit, multiply by 5, and subtract. This seems distinctly harder to do in my head than the other tests, because it seems to actually require dealing with two-digit numbers. If we do this, though, we are down to only 9 composites to memorize, which is not bad at all: 529, 667, 713, 841, 851, 899, 943, 961, 989.

Posted in arithmetic, computation, primes | Tagged , , , , | 3 Comments

Quickly recognizing primes less than 100

Recently, Mark Dominus wrote about trying to memorize all the prime numbers under 1000. This is a cool idea, but it made me start thinking about alternative: instead of memorizing primes, could we memorize a procedure for determining whether a number under 1000 is prime or composite? And can we make it clever enough to be done relatively quickly? This does tie into my other recent posts about primality testing, but to be clear, it’s also quite different: I am not talking about a general method for determining primality, but the fastest method we can devise for mentally determining, by hook or by crook, whether a given number under 1000 is prime. Hopefully there are rules we can come up with which are valid for numbers less than 1000—and thus make them easier to test—even though they aren’t valid for bigger numbers in general.

As a warmup, today I’ll write about how I determine whether a number less than 100 is prime: I don’t have them memorized, but can very quickly decide whether such a number is prime or not—and you can, too! This is a situation where doing a little mathematical analysis beforehand goes a long way.

Since 10^2 = 100, every composite number less than 100 has at least one factor which is less than 10. This means that every composite less than 100 is divisible by 2, 3, 5, or 7. Multiples of 2, 3, and 5 are relatively easy to recognize, and as we’ll see, 7 is not hard to deal with either.

  1. Any even number or multiple of 5 (i.e. numbers ending with an even number or with a 5) is clearly composite. (Other than 2 and 5 themselves, of course.)

  2. Multiples of 3 are composite. There are many numbers which I think of as “obvious” multiples of three: 9, 21, 27, and 33 because I have had their factorizations memorized since third grade; 81 because it’s a power of three so I have it memorized too; and 39, 63, 69, 93, and 99 because they consist of two digits each of which is a multiple of three. As you probably know, there is also a simple test for determining divisibility by three: just add the digits and see whether the result is divisible by three. This test identifies a few more “nonobvious” multiples of three which you might otherwise think are prime: 51 = 3 \cdot 17, 57 = 3 \cdot 19, and 87 = 3 \cdot 29.

  3. What’s left? There are only three numbers less than 100 which are composite and not divisible by 2, 3, or 5:

    • 49 = 7 \cdot 7 and 77 = 7 \cdot 11 are easy to spot.
    • 91 = 7 \cdot 13 is the one I always used to forget about—I call it a “fool’s prime” because it “looks” prime but isn’t. It’s worth just memorizing the fact that it is composite (and its factorization).

So, to sum up, faced with a number less than 100 that I want to test for primality, I can quickly rule it out if it is divisible by 2, 3, or 5, or if it is a multiple of 7 I recognize (49, 77, or 91). And that’s it! Anything else has to be prime.

In a future post I plan to write about how feasible it is to come up with a similar procedure to identify all primes under 1000.

Posted in arithmetic, computation, primes | Tagged , , , | 18 Comments

Primality testing: recap

Whew, this is developing into one of the longest post series I’ve ever written (with quite a few tangents and detours along the way). I thought it would be worth taking a step back for a minute to recap what we’ve covered and where this is going.

Next up: first of all, there’s still quite a bit more to say about the Fermat primality test. After that I plan to present some better/more efficient tests as well (at least Miller-Rabin and Baille-PSW, possibly others). And knowing me there will be at least three more tangents along the way!

Posted in computation, number theory, primes | Tagged , , , | 2 Comments

Efficiency of repeated squaring: another proof

In my previous post I proved that the “binary algorithm” (corresponding to the binary expansion of a number n) is the most efficient way to build n using only doubling and incrementing steps.

Today I want to explain another nice proof, written in a comment by an anonymous1 commenter. So although this proof is not originally due to me, I thought it deserved to be written up more fully and not squished into a comment, and I’ve changed it in a few minor ways which I think make it easier to understand (although perhaps not everyone will agree!).

Let \alpha(n) denote the minimum number of doubling and incrementing steps needed to generate n, and let \beta(n) denote the number of steps used by the binary algorithm. Note that \alpha(n) \leq \beta(n) for all n: if the binary algorithm uses \beta(n) steps, then the optimal number of steps can’t be any higher.

Now, suppose that the binary algorithm (call it algorithm B) isn’t the most efficient algorithm (our goal will be to derive a contradiction from this assumption). That means there exist values of n for which \alpha(n) < \beta(n). Let m be the smallest such n, so we must have \alpha(k) = \beta(k) for all k < m.

First, note that m > 0: B uses zero steps for n = 0 which is obviously optimal. Now, let’s think about the parity of m:

  • If m is odd, then the last step of any algorithm to compute m has to be incrementing, since that’s the only way to get an odd number—if the last step is doubling then the result would be even. Removing this last incrementing step from the sequence generated by algorithm B results in a sequence of \beta(m) - 1 steps which yields m-1. Since \alpha(m) < \beta(m), there must be some other sequence of length \alpha(m) that yields m, but since m is odd it must also end in an increment, so likewise we can delete the final increment step to get a sequence of \alpha(m) - 1 steps which yields m-1. But \alpha(m) - 1 < \beta(m) - 1, so algorithm B is not optimal for (m-1)—contradicting our assumption that m is the smallest number for which B is not optimal.

    Put more succinctly: if we can generate an odd number m more efficiently than algorithm B, then we can also generate m-1 more efficiently, so the smallest non-optimal m can’t be odd.

  • So suppose m is even, say m = 2k. We know the last step of B is doubling in this case, since the binary representation of m ends in a 0. Let A be a sequence of length \alpha(m) that generates m.

    • If the last step of A is also doubling, then removing the doubling step would give us a way to generate k = m/2 more efficiently than algorithm B (since algorithm B’s sequence for m/2 is also the same as the sequence for m without the final doubling step), again contradicting the minimality of m.
    • So suppose the last step of A is incrementing. Since the binary sequence for 2k is the same as the sequence for k followed by a doubling step, \beta(2k) = 1 + \beta(k). This in turn is equal to 1 + \alpha(k), since we assumed that \alpha(k) = \beta(k) for any k < m. So we have \beta(2k) = 1 + \alpha(k).

      On the other hand, since the last step of the optimal sequence A for m = 2k is an increment, we have \alpha(2k) = 1 + \alpha(2k-1) (since A is an optimal sequence for 2k if and only if A without the final increment is an optimal sequence for 2k-1). This is equal to 1 + \beta(2k-1), since \alpha and \beta are equal on everything less than k. Since 2k-1 is odd, the binary algorithm sequence for 2k-1 ends with a double followed by an increment, hence 1 + \beta(2k-1) = 2 + \beta(2k-2) = 3 + \beta(k-1) = 3 + \alpha(k-1).

      Putting this all together, we have 1 + \alpha(k) = \beta(2k) > \alpha(2k) = 3 + \alpha(k-1), which means \alpha(k) > 2 + \alpha(k-1). But this is absurd: there’s no way the optimal sequence for k takes two more steps than the optimal sequence for k-1, because we could just add an increment.

So we have shown that all these cases lead to absurdity: the conclusion is that there can’t be any such n where \alpha(n) < \beta(n): the binary algorithm is optimal for every n!

  1. Well, unless their given name is actually John Doe.

Posted in computation, proof | Tagged , , , , , , , , | 3 Comments

Efficiency of repeated squaring: proof

My last post proposed a claim:

The binary algorithm is the most efficient way to build n using only doubling and incrementing steps. That is, any other way to build n by doubling and incrementing uses an equal or greater number of steps than the binary algorithm.

Someone posted a very nice, relatively short proof in the comments, which was quite different from the proof I had in mind. Maybe I’ll write about it in another post, but for now you can go read it for yourself.

In this post I’d like to present the proof I had in mind, which has a more constructive/computational flavor. Let’s use the digit 1 to represent an increment step, and the digit 2 to represent a doubling step. We can use a sequence of these digits to compactly represent a sequence of steps. Each sequence s of 1’s and 2’s thus corresponds to a natural number, namely, the one we get if we start from zero and execute all the steps from left to right. Let’s denote this number by N(s). So, for example, N(1221121) = 13, since starting from 0, we first increment (yielding 1), then double twice (yielding 4), increment twice (6), double (12), and finally increment (13). Also, denote the length of s by \ell(s). The length of s is the same as the number of steps it represents. For example, \ell(1221121) = 7.

Now, it turns out that 1221121 is not the most efficient way to generate 13. In looking at this and other examples of such non-optimal sequences, what stuck out to me is that they all seemed to contain consecutive increment operations. For example, in the case of 121121, consecutive 1’s are used in the middle to go from 4 to 6, after doubling 2 to get 4. But if we had just incremented once first (going from 2 to 3), then doubling would get us to 6 right away; the effect of the one increment is “magnified” by the subsequent doubling. Formally, we can say that 211 (a double followed by two increments) can always be replaced by 12 (an increment followed by a double), because 2x + 1 + 1 = 2x+2 = 2(x+1).

What if we keep doing this operation—replacing 211 with 12—as much as we can? It turns out that by doing this we can always transform any sequence into an equivalent one the same length or shorter which has no consecutive 1’s. This is the idea behind the first lemma.

Lemma 1. Given any sequence s, there exists a sequence s' such that

  • N(s') = N(s),
  • \ell(s') \leq \ell(s), and
  • s' has no consecutive 1’s.

In other words, for any sequence we can always find an equivalent, shorter (or at least not longer) sequence with no consecutive 1’s.

Proof. By induction on the number of 1 symbols in s.

  • Base case: if s has no consecutive 1’s, then s = s' satisfies the conditions of the lemma, and certainly an s with zero 1’s has no consecutive 1’s.
  • Let k \geq 0 and suppose we know that for any s with exactly k 1’s there exists an equivalent s' with the stated properties. Now suppose we have a sequence s with exactly (k+1) 1’s. If none are adjacent then we are done, since s itself has the required properties. Otherwise, consider the leftmost pair of consecutive 1’s. There are two cases to consider:

    • If s begins with two 1’s, s = 11\dots this means that the procedure for computing N(s) starts by incrementing twice from 0 to reach 2. Let s' be the sequence obtained from s by replacing the initial 11 with 12. An increment followed by a double also yields 2, and the rest of s' is identical to s, so N(s) = N(s'). But s' has one fewer 1 than s, so the induction hypothesis tells us there must be some equivalent s'' no longer than s' with no consecutive 1’s. This s'' is the one we are looking for; we need only observe that N(s) = N(s') = N(s'') and \ell(s'') \leq \ell(s') = \ell(s).

    • Otherwise, the leftmost pair of 1’s must occur immdiately following a 2. In that case, as argued before, we can replace 211 by 12, which yields an equivalent but shorter sequence and reduces the number of 1’s by one. Again, the induction hypothesis now implies that we can find an equivalent sequence with no repeated 1’s which is no longer.

Let’s look at an example. Suppose we start with the most inefficient way possible to represent 13, namely, 13 consecutive increment steps. Then the lemma above says we can find an equivalent, shorter sequence with no consecutive 1’s. Moreover, the proof is actually constructive, that is, it does not merely assert that such a sequence exists, but gives us a concrete way to compute it. The proof outlines a recursive rewriting process which we can visualize as follows:

The red underline shows the part that is going to be rewritten at each step. Notice how the length either stays the same or decreases at each step, and how the number of 1’s decreases at each step. In fact, either a 1 gets deleted and the number of 2’s stays the same, or a 1 is replaced by a 2 when there are two 1’s at the leftmost position. This is the only way new 2’s are generated. So 2’s are “born” at the left end from a pair of 1’s, and then they spend the rest of their life slowly “migrating” to the right.

Let’s try starting with a different representation of 13, say, 121112111:

Curiously, it seems that the process ends with the same sequence (121221) even though we started with a different sequence that did not occur anywhere during the process generated by starting with thirteen 1’s. Could this just be a coincidence?

Well, the fact that I’m even asking the question kind of gives away the answer: it’s no coincidence. In fact, for a given n, if you start with any sequence representing n and run this process, you will always end with the same sequence at the end. And not only will this particular process always yield the same sequence, but there is only one possible sequence it could yield:

Lemma 2. For every natural number n, there is a unique sequence s such that s has no consecutive 1’s and N(s) = n.

Proof. By (strong) induction on n.

  • In the base case (n = 0), there is only one sequence which represents 0 at all, namely, the empty sequence (and it has no consecutive 1’s).

  • Now pick some n \geq 0 and assume that consecutive-1-less representations are unique for all n' < n. Suppose s_1 and s_2 are two sequences with no consecutive 1’s such that N(s_1) = N(s_2) = n. Our goal is to show that in fact s_1 = s_2.

    • If s_1 and s_2 both end with the same symbol, then removing it yields two consecutive-1-less sequences representing some n' < n (either n - 1 or n / 2 depending on the symbol removed); by the induction hypothesis they must be the same and hence s_1 = s_2 as well.

    • Otherwise, suppose without loss of generality that s_1 ends with 1 and s_2 ends with 2; we will show that this case is actually impossible. N(s_2) must be even, since it ends with a doubling step. If s_1 ends with a doubling step followed by an increment step (or if it consists solely of an increment), then it would be odd, but this is impossible since N(s_1) = N(s_2) and N(s_2) is even. Hence s_1 must end in consecutive 1’s; but this is also impossible since we assumed s_1 had no consecutive 1’s.

Finally, putting the pieces together: notice that the sequence generated from the binary representation of n has no consecutive 1’s, since each bit always generates a 2, which is optionally followed by a 1. Since by Lemma 2 we now know that such representations are unique, the process explained in the proof of Lemma 1 must actually result in this same unique sequence corresponding to the binary representation of n. Since this process results in a sequence which is no longer than the starting sequence, this means that every sequence of steps representing n must be at least as long as the binary sequence. Hence, the binary sequence is the most efficient!

Posted in computation, proof | Tagged , , , , , , , , | 2 Comments

Efficiency of repeated squaring

As you probably realized if you read both, my recent post without words connects directly to my previous post on exponentiation by repeated squaring Each section shows the sequence of operations used by the repeated squaring algorithm to build up a certain exponent. For example, here’s the diagram for 13:

Each row has two open rectangles exactly the same length as the previous row; this represents squaring, that is, multiplying the exponent by two. Some rows also have an extra dark square at the end, which represents multiplying by a, that is, adding 1 to the exponent. You can read off the binary representation of the final exponent by reading from top to bottom: a filled-in square represents a 1, and no filled-in square represents a 0. In the case of 13 above, we can see that the binary representation of 13 is 1101.

Commenter Steven G made a very interesting guess, that the images represented the most efficient way to form each integer using only doubling and adding 1. This seems plausible, but I was not at all sure. There are lots of ways to build a given integer by doubling and adding 1. For example, we can get 3 by adding 1 three times; or by adding 1, then doubling, then adding 1. We can get 6 by adding 1, doubling, adding 1, and doubling; or by adding 1, doubling twice, and then adding 1 twice. For certain numbers, might there be some weird clever way to build them more efficiently than the algorithm corresponding to their binary representation?

Claim: the binary algorithm is the most efficient way to build n using only doubling and incrementing steps. That is, any other way to build n by doubling and incrementing uses an equal or greater number of steps than the binary algorithm.

Let’s make this precise:

  • “One step” refers to either a doubling step or an incrementing step.

  • The binary algorithm is as follows: start with 0; reading the binary representation of n from left to right, double the current number and add 1 (two steps) every time you encounter a 1 bit, and only double (one step) every time you encounter a 0 bit— except that for the initial 1 bit you simply add one, instead of doing a useless doubling first (it’s pointless to double zero).

Can you prove the claim? I think I have a nice proof, which I’ll share in a future post, but I’m interested to see what others come up with.

Posted in computation | Tagged , , , , , , , | 7 Comments