I was a little unsatisfied with my proof yesterday since I don’t think I did a very good job explaining how enters into things. When sinuheancelmo asked a question which seemed to show confusion on exactly that point, I figured it would be worth spending a post to clarify things.
The main issue is that there are two slightly different ways we could have defined the sequence of numbers , and you will see both definitions in the literature. Here is Method 1:
.
.
is prime iff
.
And Method 2:
.
.
is prime iff
.
Do you see the difference? In Method 1, the are defined without ever taking the remainder
. So the
will just keep getting bigger and bigger. Then we define the test by saying that
is prime if
is divisible by
, that is, if
. In Method 2, on the other hand, we build
into the definition of the
, so they never get larger than
. Then as the final test we can just require that
is actually equal to zero.
In other words, the only difference between the two methods is how often we compute the remainder : once at the very end (method 1), or continuously as we go along (method 2). Method 1 is mathematically simpler, but you would never actually implement the test that way, since the
get astronomically large. Method 2 corresponds to what you would actually implement, but it’s slightly more annoying to deal with all those
’s when proving things about it.
The two methods are mathematically equivalent, because taking remainders “commutes with addition and multiplication”, that is,
(thought question: why can’t we just say ?) and similarly for multiplication. That is, it doesn’t matter whether you take the remainder before or after adding or multiplying. Since the definition of
only involves multiplying and adding (well, subtracting, but that’s just adding a negative number), we are justified in either taking the remainder as we go along or waiting until the very end—we will get the same result in either case.
Because I originally defined the according to Method 2, I had to carefully state the theorem from yesterday in terms of the remainder
, and wave my hands about mod commuting with addition and multiplication. If we instead define
according to Method 1, then the statement of the theorem and its proof do not have to mention
at all: we just say
and prove it by induction, in the same way that we did yesterday, but without the handwaving about taking remainders. In other words, what actually gives us is the
defined according to Method 1. If we build
into the definition of the
, then the theorem has to take this into account.
I think from now on we will just use the Method 1 definition of in the proof, but keep in mind that to actually implement it you would want to use Method 2.