The reason I asked the question, however, is to see the pattern that emerges if we multiply out these expressions, which commenter Buddha Buck explained.
The first line is the result of multiplying out ; the second is ; and the last is of course . After expanding everything as much as possible, we get one term for the product of every possible combination of the probabilities, with a positive sign for an even number of probabilities, and negative for an odd number.
Do you see why this happens? Each term in the result arises from choosing one of the two possible terms from each parenthesized expression. Taking as a specific example, we get if we choose from , and from , and from , yielding .
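To see the sign pattern concretely, here is a small Haskell sketch that compares the product form with its full expansion. The names `noneProduct`, `noneExpanded`, and `example` are made up for illustration, and the three probabilities are arbitrary values, not ones taken from the post.

```haskell
import Data.List (subsequences)
import Data.Ratio ((%))

-- Product form: the probability of liking none of the things,
-- assuming the individual probabilities are independent.
noneProduct :: [Rational] -> Rational
noneProduct ps = product [1 - p | p <- ps]

-- Expanded form: one term per subset of the probabilities, signed by
-- the parity of the subset's size (the empty subset contributes +1).
noneExpanded :: [Rational] -> Rational
noneExpanded ps =
  sum [(-1) ^ length s * product s | s <- subsequences ps]

-- Hypothetical probabilities, chosen only for illustration.
example :: [Rational]
example = [1 % 2, 1 % 3, 1 % 4]
```

Both `noneProduct example` and `noneExpanded example` evaluate to `1 % 4`. Using `Rational` rather than `Double` keeps the comparison exact.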
And now for a set of follow-up questions. Suppose that $U$ is the set of all people in our population, $A$ is the set of people who like anchovies, and $B$ is the set of people who like books. Suppose we also know how many people like both anchovies and books, that is, we know $|A \cap B|$. (The notation $|X|$ denotes the size of a set $X$, and $X \cap Y$ denotes the intersection of two sets, that is, the set of elements the two sets have in common.)
How many people like neither anchovies nor books?
Now suppose $C$ is the set of people who like carpets. Suppose we also know the sizes of the sets $A \cap C$, $B \cap C$, and $A \cap B \cap C$. How many people like none of the things?
And so on?
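If you want to check your answer to the two-set question, here is one possible sketch in Haskell. The population and the memberships are invented for illustration, and the lists stand in for sets (no repeated elements).

```haskell
import Data.List (intersect, union, (\\))

-- Hypothetical toy population; the members and their tastes are made up.
universe, anchovies, books :: [Int]
universe  = [1 .. 10]
anchovies = [1, 2, 3, 4]
books     = [3, 4, 5, 6, 7]

-- Direct count: remove everyone who likes at least one of the things.
neitherDirect :: Int
neitherDirect = length (universe \\ (anchovies `union` books))

-- Count using only the four known sizes: the whole population, the
-- two individual sets, and their intersection.
neitherBySizes :: Int
neitherBySizes =
  length universe - length anchovies - length books
    + length (anchovies `intersect` books)
```

Both counts come out to 3 here. The intersection is added back because the people who like both things were subtracted twice.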
Suppose we also know that these probabilities are all independent, that is, they do not influence each other at all. Whether a person likes anchovies has nothing to do with whether they like to read or whether they like carpets, and so on. When probabilities are independent, it makes sense to multiply them: for example, the probability that someone likes both anchovies and carpets is the product .
What is the probability that a randomly chosen person likes neither anchovies nor books?
What is the probability that a randomly chosen person likes none of these three things?
Can you generalize? What if we had four, five, or more independent probabilities for things people might like; what would be the probability that a random person didn’t like any of the things?
Recall that the Euler totient function, $\varphi(n)$, counts how many numbers from $1$ to $n$ are relatively prime to (that is, share no prime factors in common with) $n$.
My current goal is to show that if we know the prime factorization of $n$, we can calculate $\varphi(n)$ much more quickly than just by counting.
This rests on the fact that $\varphi$ is multiplicative, that is, $\varphi(mn) = \varphi(m)\varphi(n)$ whenever $m$ and $n$ are relatively prime. In particular this means we can break down $\varphi$ based on a number’s prime factorization:
$\varphi(p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r}) = \varphi(p_1^{k_1}) \, \varphi(p_2^{k_2}) \cdots \varphi(p_r^{k_r})$
First I displayed some visualizations that give us some intuition for why this is true. Then in another post I presented a formal proof based on that intuition.
So the only thing left is to figure out how to compute $\varphi(p^k)$. That’s what we’ll do today! Then we’ll see the whole method of computing $\varphi(n)$ in action.
As a warmup, what is $\varphi(p)$ when $p$ is prime? Since $p$ is prime, by definition it has no prime factors other than itself, and no smaller number can share a prime factor with it. Hence every number from $1$ to $p - 1$ is relatively prime to $p$, and $\varphi(p) = p - 1$.
How about $\varphi(p^2)$? Here’s an example with $p = 5$. The blue squares are the numbers relatively prime to $25$ (the ones we want to count), and the white squares are the ones which share a prime factor with $25$.
In general, the only numbers that share a prime factor with $p^2$ are numbers that are divisible by $p$. In the example above, there are five white squares; in general, counting $p^2$ itself, there are exactly $p$ of these numbers: $p, 2p, 3p, \dots, p^2$. So all of the numbers from $1$ to $p^2$ are relatively prime to $p^2$ except for those $p$ multiples of $p$, for a total of $\varphi(p^2) = p^2 - p$. Put another way, exactly one out of every $p$ numbers is divisible by $p$, so each consecutive group of $p$ numbers yields $p - 1$ which are relatively prime to $p^2$. There are $p$ such groups up to $p^2$, so the total is $p(p - 1)$. In the example with $p = 5$, you can verify that there are $5 \cdot 4 = 20$ blue squares.
Finally, what about bigger powers, $\varphi(p^k)$? Well, once again, the only numbers which share any common factor with $p^k$ are exactly the multiples of $p$. There’s one multiple of $p$ (and $p - 1$ non-multiples of $p$) in every group of $p$ numbers. How many such groups are there up to $p^k$? There are $p^{k-1}$ of them. Hence $\varphi(p^k) = p^{k-1}(p - 1)$.
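The prime-power count is easy to spot-check by machine. The following is a minimal sketch; `totientBrute` and `totientPrimePower` are names invented here, and the second function simply assumes its first argument is prime without checking.

```haskell
-- Brute-force definition: count the numbers from 1 to n that are
-- relatively prime to n.
totientBrute :: Int -> Int
totientBrute n = length [x | x <- [1 .. n], gcd x n == 1]

-- Closed form for a prime power p^k with k >= 1: there are p^(k-1)
-- groups of p consecutive numbers, each contributing p - 1.
-- Assumes p is prime; this is not checked.
totientPrimePower :: Int -> Int -> Int
totientPrimePower p k = p ^ (k - 1) * (p - 1)
```

For instance, `totientPrimePower 5 2` and `totientBrute 25` both give 20, matching the count of blue squares in the example.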
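A previous post contained an example that required computing $\varphi(151389)$; now we can see how to do it. First, we factor the number as $151389 = 3^5 \cdot 7 \cdot 89$. Then, using the fact that $\varphi$ is multiplicative, we can break it down into three separate applications of $\varphi$:
$\varphi(151389) = \varphi(3^5) \cdot \varphi(7) \cdot \varphi(89)$.
Finally, we can evaluate $\varphi$ for each of these prime powers: $\varphi(3^5) = 3^4 \cdot 2 = 162$, $\varphi(7) = 6$, and $\varphi(89) = 88$.
Hence $\varphi(151389) = 162 \cdot 6 \cdot 88 = 85536$. For this relatively small number we can actually double-check this result using a computer to count the answer by brute force: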
>>> length [n | n <- [1 .. 151389], gcd n 151389 == 1]
85536
Math works!
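For comparison with the brute-force count, the whole method (factor, then apply the prime-power formula to each factor) can be sketched in a few lines of Haskell. This is only one possible implementation: `factorize` and `totient` are names chosen here, and trial division is a stand-in for whatever factoring method you prefer.

```haskell
-- Prime factorization by trial division, as (prime, exponent) pairs.
-- A simple sketch: fine for small inputs, slow for large ones.
factorize :: Integer -> [(Integer, Int)]
factorize = go 2
  where
    go d n
      | n < 2 = []
      | d * d > n = [(n, 1)]          -- whatever is left must be prime
      | n `mod` d == 0 =
          let (k, rest) = strip d n 0
           in (d, k) : go (d + 1) rest
      | otherwise = go (d + 1) n
    -- Divide out every copy of d, counting how many there were.
    strip d n k
      | n `mod` d == 0 = strip d (n `div` d) (k + 1)
      | otherwise = (k, n)

-- Totient via multiplicativity plus the prime-power formula.
totient :: Integer -> Integer
totient n = product [p ^ (k - 1) * (p - 1) | (p, k) <- factorize n]
```

Here `factorize 151389` yields `[(3,5),(7,1),(89,1)]` and `totient 151389` yields `85536`, agreeing with the brute-force count above.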
The dark blue squares are those counted by , and the light blue rows and columns are those counted by and . We noted what these pictures are telling us: because we know from the Chinese Remainder Theorem that the numbers from to correspond exactly to pairs of numbers , showing that is multiplicative really comes down to showing the following fact:
In other words, the dark blue cells show up precisely at the intersections of light blue rows and columns. Let’s prove this!
First, if shares no common factors with the product , then it can’t possibly share any common factors with or individually. This is because any divisor of is also a divisor of , and likewise for ; so if did share any common factor with, say, , then it would automatically be a common factor with . The converse is also true: if shares no common factors with and also shares no common factors with , then it can’t share any common factors with . That is,
For the second piece of the puzzle, note that by definition, for some integer . That is, is what is left over after we subtract some number of copies of from . If shares a common factor with , then so does , since we could factor out of both and . Likewise, if we add to both sides to get , we can see that if shares a common factor with , then so does —since once again we could factor out of both and . Thus we have shown that
or, taking the contrapositive of this statement,
Now let’s put the puzzle pieces together:
which is exactly what we set out to prove.
The Chinese Remainder Theorem tells us that when and are relatively prime, there is a 1-1 matchup between the numbers and the pairs of numbers such that and . But we now know we can make an even finer distinction: which are relatively prime to are in 1-1 correspondence with pairs where is relatively prime to and is relatively prime to . Hence, .
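The fact proved above can also be checked exhaustively for small cases. In the sketch below, the variable names `m`, `n`, and `x` are assumptions (two moduli and a number from 0 up to their product minus one), as are the function names.

```haskell
-- Is x relatively prime to the product m * n?
coprimeToProduct :: Int -> Int -> Int -> Bool
coprimeToProduct m n x = gcd x (m * n) == 1

-- Are the two remainders relatively prime to m and n respectively?
coprimeComponentwise :: Int -> Int -> Int -> Bool
coprimeComponentwise m n x =
  gcd (x `mod` m) m == 1 && gcd (x `mod` n) n == 1

-- Check that the two conditions agree for every x from 0 to mn - 1.
conditionsAgree :: Int -> Int -> Bool
conditionsAgree m n =
  all (\x -> coprimeToProduct m n x == coprimeComponentwise m n x)
      [0 .. m * n - 1]
```

For example, `conditionsAgree 8 9` is `True`. A finite check like this is no substitute for the proof, but it is a quick way to catch a misstated claim.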
Let’s get some intuition for this by looking at some Chinese remainder theorem grids and highlighting the numbers which are relatively prime to . So, for example, the first grid below is ; I’ve filled it in with consecutive numbers from to along the diagonal, and highlighted all the numbers that share no common factors with . (I’ve also labelled the rows and columns with smaller numbers counting from .) Below that I’ve also chosen to show similar , , and grids.
And here’s a slightly bigger example ():
There are some striking patterns here. The placement of the highlighted cells is clearly not random: they are precisely confined to certain rows and columns. (Incidentally, this is exactly what I was trying to show in my last post without words, though I think the colors may have been somewhat distracting.) Which rows and columns are they? Let’s have a look. Below I’ve drawn the grid again, this time highlighting the rows and columns where cells relatively prime to occur:
Notice that the highlighted rows (1, 3, 5, 7) are exactly the numbers relatively prime to 8, and the highlighted columns (1, 2, 4, 5, 7, 8) are the ones relatively prime to 9. The number of dark blue intersection points is just the number of highlighted rows times the number of highlighted columns: there are four highlighted rows and six highlighted columns, making a total of 24 dark blue squares. That is, $\varphi(72) = \varphi(8) \cdot \varphi(9) = 4 \cdot 6 = 24$.
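The row and column counts for the 8 by 9 grid can be reproduced with a few list comprehensions. This is just a sketch; the names `highlightedRows`, `highlightedCols`, and `darkBlue` are invented to match the picture's colors.

```haskell
-- Rows and columns of the 8 x 9 grid whose labels are relatively
-- prime to the grid's height (8) and width (9) respectively.
highlightedRows, highlightedCols :: [Int]
highlightedRows = [r | r <- [0 .. 7], gcd r 8 == 1]
highlightedCols = [c | c <- [0 .. 8], gcd c 9 == 1]

-- Numbers from 0 to 71 that share no common factor with 72.
darkBlue :: [Int]
darkBlue = [x | x <- [0 .. 71], gcd x 72 == 1]
```

Here `highlightedRows` is `[1,3,5,7]`, `highlightedCols` is `[1,2,4,5,7,8]`, and `length darkBlue` is 24, which is 4 times 6.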
You can check that this is true for the other examples I showed above as well. So can we turn this into a conjecture? Remember that when we label an grid with consecutive numbers along the diagonal, the number ends up at the grid coordinates (see this post). We’re guessing that cells containing numbers relatively prime to end up precisely where the row number is relatively prime to and the column number is relatively prime to , so we can translate this into the more formal statement
We’ll prove this next time!
Of course, we could just count: that is, to compute $\varphi(n)$, list all the numbers from $1$ to $n$, and compute the GCD of each number with $n$ (using the Euclidean algorithm); count the ones for which the GCD is $1$. This works, of course, but it is possible to compute $\varphi(n)$ much faster than this.
The key to computing $\varphi(n)$ more quickly is that it has the special property of being multiplicative: if $m$ and $n$ are relatively prime, then $\varphi(mn) = \varphi(m)\varphi(n)$ (don’t take my word for it, though; we’ll prove it!).
Recall that we can break any number down as a product of distinct prime powers,
For example, . Of course, any powers of two different primes are relatively prime to each other (they are only divisible by their respective primes), so if is multiplicative it means we can break down like this:
Then all we have left to do is figure out how to compute $\varphi(p^k)$ for some power of a prime $p$. That is, we’ve reduced a harder problem (“how many numbers are relatively prime to an arbitrary integer $n$?”) to a more specific and hopefully easier one (“how many numbers are relatively prime to an integer of the form $p^k$ for some prime number $p$?”). Some questions we still have to answer:
How do we know that $\varphi(mn) = \varphi(m)\varphi(n)$ when $m$ and $n$ are relatively prime? (Hint: look at my previous post and the one before that…)
How do we find $\varphi(p^k)$? You might like to try figuring this out too. As a warmup, what is $\varphi(p)$? Then what about $\varphi(p^2)$? Can you generalize?
is a bijection between the set and the set of pairs (remember that the notation means the set ). In other words, if we draw an grid and trace out a diagonal path, wrapping around when we get to an edge, we will hit every grid square exactly once before returning to the start. In other other words, the set of equations , always has a unique solution (that is, unique ) when and are relatively prime.
Let’s prove it! First, let’s prove that the function is one-to-one, that is, any two different values of are sent to different pairs . In other words, if and are not equal, then and will also not be equal. In symbols:
Notice how this is phrased negatively, in terms of things being not equal to each other. It is usually much easier to prove the contrapositive of this statement, namely
That is, if and map to the same remainders , then and must in fact be the same. This is logically equivalent (you should try to convince yourself of this if you’re unsure!) but much easier to get a handle on.
So, suppose we have numbers and , both non-negative and less than , and suppose they give the same results and , that is, and . Consider their difference . Since and have the same remainder when divided by , it must be the case that is evenly divisible by ; likewise, is evenly divisible by . But since and are relatively prime, that is, they share no common factors, they cannot overlap at all in ’s prime factorization; so in fact must be divisible by their product, . In theory, then, could be one of the values . Which is it? Well, remember that , and . Subtracting two nonnegative integers less than could never produce any multiple of other than . So in fact , which means .
So the function is one-to-one. But it maps between two finite sets of the same size—namely, and , both of which have exactly elements. If a function between two finite sets of the same size is one-to-one, it must actually be a bijection. That is, if we have the same number of things on each side, and a way to send each item on the left to a distinct item on the right, then we in fact have a way to match up every element on both sides with exactly one on the other side.
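The one-to-one property is easy to test on small cases. The sketch below uses assumed names: `m` and `n` for the two moduli, `remainders` for the map from a number to its pair of remainders, and `isOneToOne` for the check.

```haskell
import Data.List (nub)

-- The map in question: x goes to its pair of remainders.
remainders :: Int -> Int -> Int -> (Int, Int)
remainders m n x = (x `mod` m, x `mod` n)

-- True when no two values of x in [0 .. mn - 1] share a pair of
-- remainders, i.e. the mn images are all distinct.
isOneToOne :: Int -> Int -> Bool
isOneToOne m n = length (nub images) == m * n
  where
    images = map (remainders m n) [0 .. m * n - 1]
```

With relatively prime moduli, such as `isOneToOne 3 5`, the result is `True`; with a shared factor, such as `isOneToOne 4 6`, it is `False`, because 0 and 12 both map to `(0,0)`.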
As a final note, this proof certainly has simplicity going for it, but you may find it a bit unsatisfactory: although it tells us that the function has an inverse, it doesn’t tell us how to compute it. That is, given a particular pair of remainders where and , how do we find ? It can certainly be done. But for now I will let you look it up, and perhaps I’ll write about it at some point in the future!
And then in the next post I explained how I made the images: starting in the upper left corner of a grid, put consecutive numbers along a diagonal line, wrapping around both from bottom to top and right to left. The question is then why some grids get completely filled in, like the one above, and why some have holes, like this:
Commenter ZL got the right answer: the grid will be completely filled in if and only if the width and height of the grid are relatively prime.
In fact, this is (a special case of^{1}) the Chinese remainder theorem (as commenter Jon Awbrey foresaw). In more traditional terms, the Chinese remainder theorem says that if we have a system of two modular equations
then as long as and are relatively prime, there is a unique solution for in the range .
Here’s another way to think about this. Suppose I have a number in the range . If I tell you only , that is, the remainder when dividing by , you can’t tell for sure what is; there may be multiple possibilities for . Likewise, if I only tell you , you don’t know . But if I tell you both—as long as and are relatively prime—it’s enough information to precisely pin down the value of ; there is only one value of which has a certain remainder modulo and at the same time a certain remainder modulo .
Yet another way to state the theorem is to consider the function
which sends an integer to a pair of integers where the first is in the range and the second is in the range . If we use the notation then we can write this concisely as
The Chinese Remainder Theorem just says that this function is actually a bijection, that is, it is onto (every pair has some solution such that ) and one-to-one (the for a given is unique).
But this function is exactly what the grid diagrams are visualizing! We start at in the upper left, and every time we step to the next value of , we take a step down; so in general the number will be on row (if we number the rows starting with row at the top)—except that we wrap around when we get to the bottom. So if there are rows, that means will be on row . Similarly, we take a step to the right with each next value of , and wrap around; so will be in column . Hence each will end up at grid coordinates , which is exactly the function . And being a bijection is exactly the same thing as the grid being completely filled in: every value of corresponds to a different grid square, and every grid square has a value of .
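Here is one way the grid construction might be sketched in Haskell. This is not the code used to draw the images; `diagonalGrid` and `isFilled` are hypothetical names, and a cell the diagonal walk never reaches is represented as `Nothing`.

```haskell
-- Build a grid with the given number of rows and columns by walking
-- the wrapped diagonal: x is placed at (x mod rows, x mod cols).
-- A cell the walk never reaches stays Nothing (a "hole").
diagonalGrid :: Int -> Int -> [[Maybe Int]]
diagonalGrid rows cols =
  [[lookup (i, j) placed | j <- [0 .. cols - 1]] | i <- [0 .. rows - 1]]
  where
    -- Stop after lcm rows cols steps, when the walk is back at the start.
    placed =
      [((x `mod` rows, x `mod` cols), x) | x <- [0 .. lcm rows cols - 1]]

-- The grid is completely filled in when it has no holes.
isFilled :: Int -> Int -> Bool
isFilled rows cols = all (all (/= Nothing)) (diagonalGrid rows cols)
```

For a 2 by 3 grid, `diagonalGrid 2 3` gives `[[Just 0,Just 4,Just 2],[Just 3,Just 1,Just 5]]`, with no holes. By contrast, `isFilled 4 6` is `False`, since the walk returns to the start after only 12 of the 24 cells.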
In my next post I’ll go over a simple proof of the theorem.
The full Chinese Remainder Theorem actually says something about the situation where you have an arbitrary number of these equations, not just two; but the version with only two equations has all the essential features. (Indeed, once you believe the version with two equations, you can “bootstrap” your way to dealing with an arbitrary number just by applying the two-equation version repeatedly.)
By contrast, it is very easy to decide whether it is possible to write 33 as a sum of two squares, . Since squares can only be positive, any values of and greater than are not going to work, since they would make the sum too big. So there are only a few pairs of values to check: it’s enough to just check all pairs with (quick: how many such pairs are there?), which can even be done by hand in a few minutes. None of them work, so this exhaustive search of all the possibilities proves that it is not possible to find and such that .
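The exhaustive search is short enough to hand to a computer as well. In this sketch, `twoSquares` is a made-up name, and the search range (pairs with the first value no larger than the second, both between 0 and a given bound) is an assumption about which pairs to check; a bound of 5 suffices for 33 because 6 squared is already 36.

```haskell
-- All pairs (a, b) with 0 <= a <= b <= bound and a^2 + b^2 == target.
twoSquares :: Int -> Int -> [(Int, Int)]
twoSquares bound target =
  [(a, b) | a <- [0 .. bound], b <- [a .. bound], a * a + b * b == target]
```

Evaluating `twoSquares 5 33` returns the empty list, while a number that is a sum of two squares does show up: `twoSquares 5 25` returns `[(0,5),(3,4)]`.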
But a sum of three cubes, , is an entirely different matter! We can’t put any a priori bound on the size of the values , , and , because the cube of a negative number is negative—if we choose at least one of them to be positive and at least one of them to be negative, they could in theory be very large but “cancel out” to yield 33. And that’s exactly what Dr. Booker found (using approximately 23 years’ worth of computer time spread over one month!):
Whoa.
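Checking such a solution is far easier than finding it. The sketch below uses the three values widely reported for Booker's solution; they are an outside addition, not something recoverable from the text of this post, and the names `bookerTriple` and `sumOfCubes` are invented here.

```haskell
-- The three values widely reported for Booker's solution
-- (an assumption brought in from outside this post).
-- Integer is arbitrary precision, so the cubes cannot overflow.
bookerTriple :: (Integer, Integer, Integer)
bookerTriple =
  (8866128975287528, -8778405442862239, -2736111468807040)

-- Sum of the cubes of the three components.
sumOfCubes :: (Integer, Integer, Integer) -> Integer
sumOfCubes (a, b, c) = a ^ 3 + b ^ 3 + c ^ 3
```

Evaluating `sumOfCubes bookerTriple` in GHCi lets you confirm the cancellation for yourself. Each cube has nearly fifty digits, so a fixed-width type such as `Int` would silently overflow here.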