Four formats for Fermat

In my previous post I mentioned Fermat’s Little Theorem, a beautiful, fundamental result in number theory that underlies lots of things like public-key cryptography and primality testing. (It’s called “little” to distinguish it from his (in)famous Last Theorem.) There are several different forms in which it is commonly presented, so I wanted to start by introducing them and showing how they are related.

Statement 1

Let’s start with the statement that looks the least general:

If p is prime and a is an integer where 0 < a < p, then a^{p-1} \equiv 1 \pmod p.

(Recall that x \equiv y \pmod p means that x and y have the same remainder when you divide them by p.) For example, 7 is prime, and we can check that for each a \in \{1, \dots, 6\}, if you raise a to the 6th power, you get a number which is one more than a multiple of 7:

\displaystyle \begin{array}{ccrcr} 1^6 &=& 1 &=& 0 \cdot 7 + 1 \\[0.2em] 2^6 &=& 64 &=& 9 \cdot 7 + 1 \\[0.2em] 3^6 &=& 729 &=& 104 \cdot 7 + 1 \\[0.2em] 4^6 &=& 4096 &=& 585 \cdot 7 + 1 \\[0.2em] 5^6 &=& 15625 &=& 2232 \cdot 7 + 1 \\[0.2em] 6^6 &=& 46656 &=& 6665 \cdot 7 + 1 \end{array}
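The table above can also be checked mechanically. Here is a minimal Python sketch (the function name `satisfies_statement_1` is made up for illustration) that tests the congruence for every a between 0 and p, using Python's built-in three-argument `pow` for modular exponentiation:

```python
def satisfies_statement_1(p: int) -> bool:
    """Return True if a^(p-1) leaves remainder 1 mod p for every 0 < a < p."""
    # pow(a, e, p) computes a^e mod p without building the huge power.
    return all(pow(a, p - 1, p) == 1 for a in range(1, p))


for p in [2, 3, 5, 7, 11, 13]:
    print(p, satisfies_statement_1(p))

# A composite modulus, for contrast: 3^8 is divisible by 9, so the check fails.
print(9, satisfies_statement_1(9))
```

This is only a finite spot check for a handful of small primes, of course, not a proof.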

Statement 2

Here’s a second variant of the theorem that looks slightly more general than the first:

If p is a prime and a is any integer not divisible by p, then a^{p-1} \equiv 1 \pmod p.

This looks more general because a can be any integer not divisible by p, not just an integer between 0 and p. As an example, let a = 10. Then 10^6 = 1000000 = 7 \cdot 142857 + 1.

We can see that (2) is more general than (1), since if 0 < a < p then it is certainly the case that a is not divisible by p. Hence (2) implies (1). But actually, it turns out that (1) implies (2) as well!

Here’s a proof: let’s assume (1) and use it to show (2). In order to show (2), we have to show that a^{p-1} \equiv 1 \pmod p whenever p is prime and a is any integer not divisible by p. So let p be an arbitrary prime and a an arbitrary integer not divisible by p. Then by the Euclidean division theorem, we can write a in the form a = qp + b, where q is the quotient when dividing a by p, and 0 \leq b < p is the remainder. Note that b can’t actually be 0, since we assumed a is not divisible by p. Hence 0 < b < p, so (1) applies and we can conclude that b^{p-1} \equiv 1 \pmod p. But notice that a \equiv b \pmod p (since a is b more than a multiple of p), and hence a^{p-1} \equiv b^{p-1} \equiv 1 \pmod p as well.
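To make the reduction step in the proof concrete, here is a small Python sketch (the helper name `reduce_and_check` is hypothetical). It writes a = qp + b using `divmod` and confirms that a and b have the same (p-1)th power mod p:

```python
def reduce_and_check(a: int, p: int) -> tuple[int, int, int]:
    """Write a = q*p + b with 0 <= b < p, then return (q, b, a^(p-1) mod p)."""
    q, b = divmod(a, p)  # for p > 0, Python guarantees 0 <= b < p
    assert a == q * p + b
    # a and b are congruent mod p, so their (p-1)th powers agree mod p.
    assert pow(a, p - 1, p) == pow(b, p - 1, p)
    return q, b, pow(a, p - 1, p)


print(reduce_and_check(10, 7))  # 10 = 1*7 + 3
print(reduce_and_check(-4, 7))  # -4 = (-1)*7 + 3
```

Note that the same reduction works for negative a, since the remainder b is still between 0 and p.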

So although (2) “looks” more general than (1), the two statements are in fact logically equivalent.

Statement 3

Here’s another version which seems to be yet more general, since it drops the restriction that a can’t be divisible by p:

If p is prime and a is any integer, then a^p \equiv a \pmod p.

Notice, however, that the conclusion is different: a^p \equiv a \pmod p rather than a^{p-1} \equiv 1 \pmod p.

As an example, let p = 7 and a = 10 again. Then 10^7 = 10000000 = 1428570 \cdot 7 + 10, that is, 10^7 is exactly 10 more than a multiple of 7, so 10^7 \equiv 10 \pmod 7. As another example, if a = 14, then 14^7 = 105413504 \equiv 14 \pmod 7 since both are divisible by 7.
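Here is a quick Python sketch for experimenting with this form of the theorem. The function name and the default search bound of 50 are arbitrary choices made for illustration; the check simply asks whether a^p - a is a multiple of p over a range of integers a, including negative ones and multiples of p:

```python
def satisfies_statement_3(p: int, bound: int = 50) -> bool:
    """Check that a^p - a is divisible by p for every -bound <= a <= bound."""
    return all((a**p - a) % p == 0 for a in range(-bound, bound + 1))


print(satisfies_statement_3(7))   # 7 is prime
print(satisfies_statement_3(6))   # 6 is not: 2^6 - 2 = 62 is not a multiple of 6
print((10**7 - 10) // 7)          # the quotient from the example above
```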

Once again, although this seems more general, it turns out to be equivalent to (1) and (2).

First of all, to see that (2) implies (3), suppose p is prime and a any integer. If a is divisible by p, then a \equiv 0 \pmod p and clearly a^p \equiv 0 \equiv a \pmod p. On the other hand, if a is not divisible by p, then (2) applies and we may conclude that a^{p-1} \equiv 1 \pmod p; multiplying both sides of this congruence by a yields a^p \equiv a \pmod p.

Now, to see that (3) implies (2), let p be a prime and a any integer not divisible by p. Then (3) says that a^p \equiv a \pmod p; we wish to show that a^{p-1} \equiv 1 \pmod p. However, since a is not divisible by p we know that a has a multiplicative inverse \pmod p, that is, there is some b such that ab \equiv 1 \pmod p. (I have written about this fact before; it is a consequence of Bézout’s Identity.) If we take a^p \equiv a \pmod p and multiply both sides by b, we get to cancel one a from each side, yielding a^{p-1} \equiv 1 \pmod p as desired.
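The cancellation step can be tried out numerically. The sketch below assumes Python 3.8 or later, where `pow(a, -1, p)` returns a multiplicative inverse of a mod p; the function name `cancel_with_inverse` is made up for this example:

```python
def cancel_with_inverse(a: int, p: int) -> tuple[int, int, int]:
    """Return (b, a^p * b mod p, a * b mod p), where b is an inverse of a mod p."""
    b = pow(a, -1, p)             # raises ValueError if a has no inverse mod p
    assert (a * b) % p == 1
    lhs = (pow(a, p, p) * b) % p  # a^p * b, which is congruent to a^(p-1)
    rhs = (a * b) % p             # a * b, which is congruent to 1
    assert lhs == pow(a, p - 1, p)
    return b, lhs, rhs


print(cancel_with_inverse(10, 7))
```

For a = 10 and p = 7 the inverse is b = 5, since 10 \cdot 5 = 50 = 7 \cdot 7 + 1.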

Statement 4

The final form is the most general yet: it even drops the restriction that p be prime.

If n \geq 1 and a is any integer relatively prime to n, then a^{\varphi(n)} \equiv 1 \pmod n.

where \varphi(n) is the Euler totient function, i.e. the number of positive integers less than or equal to n which are relatively prime to n. For example, \varphi(12) = 4 since there are four positive integers less than 12 which have no factors (other than 1) in common with 12: namely, 1, 5, 7, and 11.

We can see that (4) implies (2), since when n = p is prime, \varphi(p) = p-1 (since every integer in \{1, \dots, p-1\} is relatively prime to p). None of (1), (2), or (3) directly imply (4)—so it is, in fact, a bit more general—but we can generalize some of the proofs of these other facts to prove (4).
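Here is a brute-force Python sketch of the totient function and of the congruence in (4). The names `totient` and `euler_holds` are just for illustration, and counting with `gcd` is far from the fastest way to compute \varphi(n), but it follows the definition directly:

```python
from math import gcd


def totient(n: int) -> int:
    """Count the integers k in {1, ..., n} with gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def euler_holds(a: int, n: int) -> bool:
    """Check whether a^phi(n) is congruent to 1 mod n."""
    return pow(a, totient(n), n) == 1 % n  # 1 % n also covers n = 1


print(totient(12), totient(7))
print(euler_holds(5, 12))  # 5 is relatively prime to 12
print(euler_holds(2, 12))  # 2 shares a factor with 12, and the congruence fails
```

The last line shows that the congruence can fail when a and n have a common factor: 2^4 = 16 leaves remainder 4, not 1, when divided by 12.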

Posted in number theory, primes, proof

New baby, and primality testing

I have neglected writing on this blog for a while, and here is why:

Yes, there is a new small human in my house! So I won’t be writing here regularly for the near future, but do hope to still write occasionally as the mood and opportunity strike.

Recently I realized that I really didn’t know much of anything about fast primality testing algorithms. Of course, I have written about the Lucas-Lehmer test, but that is a special-purpose algorithm for testing primality of numbers with a very special form. So I have learned about a few general-purpose primality tests, including the Rabin-Miller test and the Baillie-PSW test. It turns out they are really fascinating, and not as hard to understand as I was expecting. So I may spend some time writing about them here.

As a first step in that direction, here is (one version of) Fermat’s Little Theorem (FLT):

Let p be a prime and a some positive integer not divisible by p. Then a^{p-1} \equiv 1 \pmod p, that is, a^{p-1} is one more than a multiple of p.

Have you seen this theorem before? If not, play around with some small examples to see if you believe it and why you think it might be true. If you have seen it before, do you remember a proof? Or can you come up with one? (No peeking!) There are many beautiful proofs; I will write about a few.

Posted in meta, number theory, primes

From primitive roots to Euclid’s orchard

Commenter Snowball pointed out the similarity between Euclid’s Orchard

…and this picture of primitive roots I made a year ago:

At first I didn’t see the connection, but Snowball was absolutely right. Once I understood it, I made this little animation to illustrate the connection more clearly:

(Some of the colors flicker a bit; I’m not sure why.)

Posted in pattern, pictures, posts without words

A few words about PWW #20

A couple commenters quickly figured out what my previous post without words was about. The dots making up the image are at integer grid points (m,n), with the center at (0,0). There is a dot at (m,n) if and only if m and n are relatively prime, that is, \gcd(m,n) = 1. Here is a slightly smaller version so it’s easier to see what is going on:

I learned from Lucas A. Brown that this is sometimes known as “Euclid’s Orchard”. Imagine that there is a tall, straight tree growing from each grid point other than the origin. If you stand at the origin, then the trees you can see are exactly those at grid points (m,n) with \gcd(m,n) = 1. This is because if a tree is at (dm,dn) for some d > 1, then it is blocked from your sight by the tree at (m,n): both lie exactly along the line from the origin with slope n/m. But if a tree is at some point with relatively prime coordinates (m,n), then it will be the first thing you see when you look along the line with slope exactly n/m.
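If you want to regenerate a rough version of the picture yourself, here is a minimal Python sketch that draws the orchard as text. The function name `orchard` and the choice of `*` for a visible tree and `.` for everything else are arbitrary:

```python
from math import gcd


def orchard(radius: int) -> str:
    """Text picture of the grid points (m, n) with gcd(m, n) = 1."""
    rows = []
    for n in range(radius, -radius - 1, -1):  # top row has the largest n
        row = "".join(
            "*" if gcd(m, n) == 1 else "."    # math.gcd ignores signs; gcd(0, 0) = 0
            for m in range(-radius, radius + 1)
        )
        rows.append(row)
    return "\n".join(rows)


print(orchard(10))
```

Since `gcd(0, 0)` is 0, the origin itself is drawn as a blank spot, which matches the idea that you are standing there rather than a tree.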

(…well, actually, all of the above is only really true if we assume the trees are infinitely skinny! Otherwise trees will end up blocking other trees which are almost, but not quite, in line with them. So try not to breathe while standing at the origin, OK? You might knock over some of the infinitely skinny trees.)

Here’s the 9 \times 9 portion of the grid surrounding the origin, with the lines of sight drawn in along with the trees you can’t see because they are exactly in line with some closer tree. (I’ve made the trees skinny enough so that they don’t accidentally block any other lines of sight, but if we expanded the grid we’d have to make the trees even skinnier.)

Now, what about the colors of the dots? Commenter Snowball guessed this correctly: each point is colored according to the number of steps the Euclidean algorithm needs to reach 1. Darker colors correspond to more steps. It is interesting to note that there seems to be (eight symmetric copies of) one particularly dark radial stripe, indicated below:

In fact, the slope of this stripe is exactly \varphi = (1 + \sqrt 5)/2! This corresponds to the fact (first proved by Gabriel Lamé in 1844) that consecutive Fibonacci numbers are worst-case inputs to the Euclidean algorithm—that is, it takes more steps for the Euclidean algorithm to compute \gcd(F_{n+1}, F_n) = 1 than for any other inputs of equal or smaller magnitude. Since the ratio of consecutive Fibonacci numbers tends to \varphi, the dots with the darkest color relative to their neighbors all lie approximately along the line with slope \varphi. What’s interesting to me is that lots of other dots that lie close to this line are also relatively dark. Why does this happen?
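The Fibonacci worst case is easy to explore by brute force. The Python sketch below counts division steps until the remainder reaches 0, which is one possible convention and may differ by a constant from the count used to color the picture; the names `euclid_steps` and `worst_pairs` are made up for illustration:

```python
def euclid_steps(m: int, n: int) -> int:
    """Count the division steps the Euclidean algorithm takes on (m, n)."""
    steps = 0
    while n != 0:
        m, n = n, m % n
        steps += 1
    return steps


def worst_pairs(limit: int) -> tuple[int, list[tuple[int, int]]]:
    """Among pairs limit >= m > n >= 1, find those needing the most steps."""
    best, pairs = 0, []
    for m in range(2, limit + 1):
        for n in range(1, m):
            s = euclid_steps(m, n)
            if s > best:
                best, pairs = s, [(m, n)]
            elif s == best:
                pairs.append((m, n))
    return best, pairs


print(worst_pairs(13))
print(worst_pairs(21))
```

With the limit set to a Fibonacci number such as 13, the only pair reaching the maximum is the pair of consecutive Fibonacci numbers (13, 8), which takes 5 steps.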

Posted in pattern, pictures, posts without words

Post without words #20


The curious powers of 1 + sqrt 2: recurrences

In my previous post, we found an answer to the question:

What’s the 99th digit to the right of the decimal point in the decimal expansion of (1 + \sqrt 2)^{500}?

However, the solution depended on having the clever idea to add (1 + \sqrt 2)^n + (1 - \sqrt 2)^n. But there are other ways to come to similar conclusions, and in fact this is not the way I originally solved it.

The first thing I did when attacking the problem was to work out some small powers of 1 + \sqrt 2 by hand:

\displaystyle \begin{array}{rcl} (1 + \sqrt 2)^2 &=& 1 + 2 \sqrt 2 + 2 = 3 + 2 \sqrt 2 \\[1em] (1 + \sqrt 2)^3 &=& (3 + 2 \sqrt 2)(1 + \sqrt 2) = 7 + 5 \sqrt 2 \\[1em] (1 + \sqrt 2)^4 &=& (7 + 5 \sqrt 2) (1 + \sqrt 2) = 17 + 12 \sqrt 2 \end{array}

and so on. It quickly becomes clear (if you have not already seen this kind of thing before) that (1 + \sqrt 2)^n will always be of the form a + b \sqrt 2. Let’s define a_n and b_n to be the coefficients of the nth power of (1 + \sqrt 2), that is, (1 + \sqrt 2)^n = a_n + b_n \sqrt 2. Now the natural question is: what, if anything, can we say about the coefficients a_n and b_n? Quite a lot, as it turns out!

We can start by working out what happens when we multiply (1 + \sqrt 2)^n = (a_n + b_n \sqrt 2) by another copy of (1 + \sqrt 2):

\displaystyle (1 + \sqrt 2)^{n+1} = (a_n + b_n \sqrt 2)(1 + \sqrt 2) = (a_n + 2b_n) + (a_n + b_n) \sqrt 2

But (1 + \sqrt 2)^{n+1} = a_{n+1} + b_{n+1} \sqrt 2 by definition, so this means that a_{n+1} = a_n + 2b_n and b_{n+1} = a_n + b_n. As for base cases, we also know that (1 + \sqrt 2)^0 = 1 + 0\sqrt 2, so a_0 = 1 and b_0 = 0. From this point it is easy to quickly make a table of some of the values of a_n and b_n:

\displaystyle \begin{array}{ccc} n & a_n & b_n \\ \hline 0 & 1 & 0 \\ 1 & 1 & 1 \\ 2 & 3 & 2 \\ 3 & 7 & 5 \\ 4 & 17 & 12 \\ 5 & 41 & 29 \\ 6 & 99 & 70 \\ 7 & 239 & 169 \\ 8 & 577 & 408 \\ 9 & 1393 & 985 \end{array}

Each entry in the b_n column is the sum of the a_n and b_n from the previous row; each a_n is the sum of the previous a_n and twice the previous b_n. You might enjoy playing around with these sequences to see if you notice any patterns. It turns out that there is an equivalent way to define the a_n and b_n separately, such that each a_n only depends on previous values of a_n, and likewise each b_n only depends on previous b_n. I’ll explain how to do that next time, but leave it as a challenge for you in the meantime!
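If you would rather let a computer extend the table, here is a short Python sketch of the recurrence a_{n+1} = a_n + 2b_n, b_{n+1} = a_n + b_n with base case a_0 = 1, b_0 = 0 (the function name `powers_table` is just for illustration):

```python
def powers_table(count: int) -> list[tuple[int, int, int]]:
    """Rows (n, a_n, b_n) with (1 + sqrt 2)^n = a_n + b_n * sqrt 2, for n < count."""
    a, b = 1, 0  # (1 + sqrt 2)^0 = 1 + 0 * sqrt 2
    rows = []
    for n in range(count):
        rows.append((n, a, b))
        # Both new values are computed from the old a and b at the same time.
        a, b = a + 2 * b, a + b
    return rows


for n, a, b in powers_table(10):
    print(f"{n:2d} {a:5d} {b:5d}")
```

Python integers have arbitrary precision, so the same loop works unchanged for much larger n.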

Posted in number theory, puzzles

The curious powers of 1 + sqrt 2: a clever solution

Recall that we are trying to answer the question:

What’s the 99th digit to the right of the decimal point in the decimal expansion of (1 + \sqrt 2)^{500}?

In my previous post, we computed (1 + \sqrt 2)^n for some small n and conjectured that the answer is 9, since these powers seem to be alternately just under and just over an integer. Today, I’ll explain a clever solution, which I learned from Colin Wright (several commenters also posted similar approaches).

First, let’s think about expanding (1 + \sqrt 2)^n using the Binomial Theorem:

\displaystyle (1 + \sqrt 2)^n = 1 + n \sqrt 2 + \binom{n}{2} (\sqrt 2)^2 + \binom{n}{3} (\sqrt 2)^3 + \dots + (\sqrt 2)^n.

We get a sum of powers of \sqrt 2 with various coefficients. Notice that when \sqrt 2 is raised to an even power, we get an integer: (\sqrt 2)^2 = 2, (\sqrt 2)^4 = 2^2 = 4, and so on. The odd powers give us irrational things. So if we could find some way to “cancel out” the odd, irrational powers, we would be left with a sum of a bunch of integers.

Here is where we can pull a clever trick: consider (1 - \sqrt 2)^n. If we expand it by the Binomial Theorem, we find

\displaystyle \begin{array}{rcl} (1 - \sqrt 2)^n &=& \displaystyle 1 + n (-\sqrt 2) + \binom{n}{2} (-\sqrt 2)^2 + \binom{n}{3} (-\sqrt 2)^3 + \dots + (-\sqrt 2)^n \\[1.5em] &=& \displaystyle 1 - n \sqrt 2 + \binom{n}{2} (\sqrt 2)^2 - \binom{n}{3} (\sqrt 2)^3 + \dots \pm (\sqrt 2)^n \end{array}

but this is the same as the expansion of (1 + \sqrt 2)^n, with alternating signs: the odd terms—which are exactly the irrational ones—are negative, and the even terms are positive. So if we add these two expressions, the odd terms will cancel out, leaving us with two copies of all the even terms:

\displaystyle (1 + \sqrt 2)^n + (1 - \sqrt 2)^n = 2 \left(1 + \binom{n}{2} (\sqrt 2)^2 + \binom{n}{4} (\sqrt 2)^4 + \dots \right).

For now, we don’t care about the value of the sum on the right—the important thing to note is that it is an integer, since it is a sum of integers multiplied by even powers of \sqrt 2, which are just powers of two.
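As a sanity check on the cancellation, the following Python sketch computes the right-hand side exactly with integers, using (\sqrt 2)^{2k} = 2^k, and compares it with a floating-point evaluation of the left-hand side for small n. The function name `even_terms_sum` is hypothetical, and the floating-point column is only approximate:

```python
from math import comb, sqrt


def even_terms_sum(n: int) -> int:
    """2 * (1 + C(n,2)*2 + C(n,4)*2^2 + ...), an exact integer."""
    return 2 * sum(comb(n, 2 * k) * 2**k for k in range(n // 2 + 1))


for n in range(1, 11):
    floating = (1 + sqrt(2)) ** n + (1 - sqrt(2)) ** n
    print(n, even_terms_sum(n), round(floating, 6))
```

The two columns should agree up to floating-point rounding, even though each of the two powers on the left is irrational on its own.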

We are almost done. Notice that \sqrt 2 \approx 1.4142 \dots, so 1 - \sqrt 2 \approx -0.4142 \dots. Since this has an absolute value less than 1, its powers will get increasingly close to zero; since it is negative, its powers will alternate between being positive and negative. Hence,

\displaystyle (1 + \sqrt 2)^n + (1 - \sqrt 2)^n

is an integer, and (1 - \sqrt 2)^n is very small, so (1 + \sqrt 2)^n must be very close to that integer. When n is even, (1 - \sqrt 2)^n is positive, so (1 + \sqrt 2)^n must be slightly less than an integer; conversely, when n is odd we conclude that (1 + \sqrt 2)^n is slightly greater than an integer.

To complete the solution to this particular problem, we have to make sure that (1 - \sqrt 2)^{500} is small enough that we can say for sure the 99th digit after the decimal point of (1 + \sqrt 2)^{500} is still 9. That is, we need to prove that, say, (1 - \sqrt 2)^{500} < 10^{-100}. This will be true if we can show that |1 - \sqrt 2|^5 < 10^{-1} (just raise both sides to the 100th power), and in turn, taking the base 10 logarithm of both sides, this will be true if 5 \log_{10} |1 - \sqrt 2| < -1. At this point we can simply confirm by computation that 5 \log_{10} |1 - \sqrt 2| \approx -1.91\dots < -1. The fact that we get -1.91\dots means that not just 99, but actually the first 191 digits after the decimal point of (1 + \sqrt 2)^{500} are 9. (It turns out that the 192nd digit is a 5.)
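The digit count can also be confirmed with exact integer arithmetic, avoiding floating point entirely. The sketch below is one possible approach, not the method used in the solution above: it writes (1 + \sqrt 2)^n = a + b \sqrt 2 with integers a and b, observes that only b \sqrt 2 = \sqrt{2b^2} contributes digits after the decimal point, and extracts those digits with `math.isqrt` (Python 3.8 or later). The function name `fractional_digits` is made up for illustration:

```python
from math import isqrt


def fractional_digits(n: int, k: int) -> str:
    """First k digits after the decimal point of (1 + sqrt 2)^n, computed exactly."""
    a, b = 1, 0
    for _ in range(n):
        a, b = a + 2 * b, a + b  # (1 + sqrt 2)^n = a + b * sqrt 2
    # a is an integer, so the fractional part comes from b * sqrt 2 = sqrt(2 * b^2).
    scaled = isqrt(2 * b * b * 10 ** (2 * k))  # floor(b * sqrt 2 * 10^k)
    return str(scaled % 10**k).zfill(k)


digits = fractional_digits(500, 200)
leading_nines = len(digits) - len(digits.lstrip("9"))
print(leading_nines, digits[leading_nines])
```

Running this reports 191 leading nines followed by a 5, in agreement with the logarithm estimate.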

The rabbit hole goes much deeper than this, however!

Posted in number theory, puzzles