MaBloWriMo 28: Equivalence relations are partitions

Today we’ll take a brief break from group theory to prove a fact about equivalence relations, namely, that they are the same as partitions. A partition is a pretty intuitive concept: you take a big set, and cut it up into smaller sets so that every element of the big set belongs to exactly one of the small sets. That is, no elements are left out, and none of the small sets overlap with each other at all.

More formally, a partition of a set S is a finite collection of nonempty sets P_1, \dots, P_n such that

  1. The union of all the P_i is the original set: \bigcup_i P_i = S.
  2. The P_i do not overlap, i.e. they are pairwise disjoint: if i \neq j then P_i \cap P_j = \varnothing.

So how does this relate to equivalence relations? Suppose we have a set X and an equivalence relation \sim on X. Then the equivalence class of an element x \in X is defined as the set of all elements which are equivalent to x. We write

[x] = \{ y \mid y \in X, x \sim y \}

to denote the equivalence class of x. It turns out that if \sim is an equivalence relation, the equivalence classes form a partition of X. Intuitively, each equivalence class consists of a bunch of things which are all equivalent to each other, and these necessarily include everything (since everything is at least equivalent to itself) and don’t overlap (because if two equivalence classes did overlap, they would collapse into one big class due to transitivity). Let’s prove it.

  1. First we’ll prove a small lemma: if y \sim z then [y] = [z]. As is usual, to prove two sets are equal, we’ll prove that each is a subset of the other. First, suppose y' is any element in [y]. Then by definition y \sim y'. By symmetry, y' \sim y, and then by transitivity (since we assumed y \sim z), y' \sim z. By symmetry again z \sim y', so by definition y' \in [z]. So every element of [y] is also an element of [z]. An entirely analogous argument shows that [z] is a subset of [y] as well. Thus [y] = [z].
  2. Now we can prove that the set of distinct equivalence classes forms a partition of X. First, every x \in X is at the very least included in its own equivalence class [x], which must contain x since x \sim x by reflexivity. So the equivalence classes don’t miss any elements, that is, the union of all the equivalence classes is the entire set X.
  3. We also have to prove that distinct equivalence classes don’t overlap. Let P_i and P_j be two different equivalence classes. We want to show they have no elements in common. Suppose, on the contrary, that there is some x which is in both [y] and [z]. Then by definition y \sim x and z \sim x. So by symmetry x \sim z, and then by transitivity y \sim z. By our lemma above, this means [y] = [z], contradicting our assumption that they are different.
Posted in algebra, proof | Tagged , , , , , , | 2 Comments

MaBloWriMo 27: From subgroups to equivalence relations

Again, let G be a group and H a subgroup of G. Then we can define a binary relation on elements of G, called \sim_H, as follows:

x \sim_H y if and only if there is some h \in H such that x = yh.

That is, for any two elements x, y \in G, either x \sim_H y, or not: yes, if you can get from y to x by combining (on the right) with some element in H, and otherwise, no. Note that given any two elements x, y \in G, it is always possible to get from y to x by combining with some element of G: in particular, x = y(y^{-1}x). But this might not be an element of H.

Now, an equivalence relation on a set X is a relation x \sim y with the following three properties:

  1. Reflexivity: every x \in X is related to itself, that is, x \sim x.
  2. Symmetry: If x \sim y, then also y \sim x.
  3. Transitivity: If x \sim y and y \sim z, then x \sim z.

The usual equality relation satisfies these properties: things are always equal to themselves; if x = y then y = x; and if x = y and y = z then x = z. The notion of an equivalence relation is a way to talk about more general kinds of equality.

Let’s prove that \sim_H is an equivalence relation. This is really cool because it turns out that the three properties of an equivalence relation each follow from one of the three properties of a group!

  1. Reflexivity. We have to show that any x \in G is related to itself, that is, x \sim_H x. By definition this means there is some h \in H such that x = xh. Well, that’s easy: since H is a group, it has to contain the identity element e, and x = xe.
  2. Symmetry. Suppose x \sim_H y, that is, x = yh for some h \in H. Then we have to show y \sim_H x, that is, there is some h' \in H (which could be different from h) such that y = xh'. Well, since H is a group, it has to contain inverses. We can combine both sides of x = yh with h^{-1} to obtain xh^{-1} = y—so the h' we are looking for is precisely h^{-1}.
  3. Transitivity. Suppose x \sim_H y and y \sim_H z. That means there are h_1, h_2 \in H for which x = y h_1 and y = z h_2. We want to show that x \sim_H z, that is, x = zh for some h \in H. Substituting for y, we find that x = y h_1 = (z h_2) h_1 = z (h_2 h_1) (note how we used the third property of a group, associativity). So the h we are looking for is just h_2 h_1, which has to be in H since H is closed under the binary operation.

So for a given subgroup H \leq G, this relation defines a sort of “equality with respect to H” on the elements of G (whatever that means!). As for an example—consider again the subgroup \{0,4\} \leq \mathbb{Z}_8. Which elements of \mathbb{Z}_8 are related to each other under \sim_{\{0,4\}}? What do you notice?

Posted in algebra, proof | Tagged , , , , , , | Leave a comment

MaBloWriMo 26: Left cosets

Let G be a group and H a subgroup of G. Then for each element a \in G we can define a left coset of H by

aH = \{ ah \mid h \in H \}.

That is, aH is the set we get by combining a (on the left) with every element of H. For example, given the subgroup \{0,4\} \leq \mathbb{Z}_8 (this was the other subgroup of \mathbb{Z}_8—did you find it?), the left coset corresponding to 1 \in \mathbb{Z}_8 is 1 +_8 \{0,4\} = \{1 +_8 0, 1 +_8 4\} = \{1,5\}. A few observations:

  • The cosets corresponding to different elements of G might be the same. For example, the left coset of \{0,4\} corresponding to 5 \in \mathbb{Z}_8 is \{5 +_8 0, 5 +_8 4\} = \{5, 1\} = \{1, 5\}, just like the coset for 1.
  • Can you find the other possible (left) cosets of \{0,4\} in \mathbb{Z}_8? What do you notice?
  • As you may guess, there are also things called right cosets, denoted Ha, where we combine with an element on the right. \mathbb{Z}_8 is not such a good example anymore, since in \mathbb{Z}_8 the binary operation is commutative, that is, a +_8 b = b +_8 a, so left and right cosets are the same thing. In general, though, the binary operation of a group does not have to be commutative.
  • There is nothing special about left cosets as opposed to right cosets. In our proof of Lagrange’s Theorem we will use left cosets, but we could equally well replace all the left cosets by right cosets (and flip a few other things around) to get a different but equally valid proof.
  • As an interesting aside, when the left and right cosets of a subgroup coincide, we say that the subgroup is normal. (Hence every subgroup of a group with a commutative binary operation is normal; but this can also happen even when the binary operation is not commutative.) It turns out that these normal subgroups are very important. Normal subgroups of a group are kind of like the divisors of an integer; you can “divide” a group by one of its normal subgroups to get a “quotient group”. And yes, there are special groups called simple groups which don’t have any normal subgroups, and are kind of like prime numbers—there is a suitable sense in which every finite group can be uniquely decomposed into a “product” of simple groups, just like integers can be uniquely decomposed into a product of prime factors. But this is getting way off on a tangent! (I told you this proof would hint at some very cool, deeper group theory.)

Just one more observation for today. For any a \in G, consider the function f_a(b) = a \odot b which combines its input with a (on the left). This function is injective, that is, one-to-one: if f_a(b) = f_a(c), then by definition ab = ac, and combining both sides with a^{-1} on the left, a^{-1}ab = a^{-1}ac, hence b = c. (In a group we can always cancel things from both sides of an equation—though only from the end! For example, from abc = xby it does not follow that ac = xy.) Conversely, this means that if b \neq c, then ab \neq ac.

When we form the left coset aH, we are applying the function f_a to every element of H. The fact that this function is injective means it can’t “collapse” multiple elements of H into the same element in the result. This shows that the coset aH has to have the same size as H: there is exactly one element in aH for each element of H, and they all have to be different.

Posted in algebra, proof | Tagged , , , , , | Leave a comment

MaBloWriMo 25: Subgroups

So in the remainder of the month, we’ll prove that in any group G, the order of each element g \in G must evenly divide the order (size) of the group. I said in an earlier post that this is called Lagrange’s Theorem; actually, my memory was a tad off and it turns out that Lagrange’s Theorem is slightly more general than this, but implies it as a fairly straightforward corollary.

Today, we need to define the concept of a subgroup. If G is a group, then H is a subgroup of G (written H \leq G) when

  1. The set of elements of H is a subset of the set of elements of G, and
  2. H is also a group, under the same binary operation as G.

If H \leq G then you can think of H as a group “hiding inside” a bigger group G.

Given some subset H of the elements of G, to see whether H forms a subgroup we need to check just three things:1

  1. H is nonempty.
  2. H is closed under the binary operation. This is a rather special property: if you pick any old subset of the elements of G, chances are that combining two elements from your subset might result in something outside the subset (though of course it will still be in G).
  3. The inverse of any element of H is also in H. This is a special property too, for the same reason.

We don’t need to recheck associativity, since H uses the same binary operation as G. Note also that we don’t need to check whether H has an identity element, since this is already implied by (1), (2) and (3): since H is nonempty by (1), choose some a \in H. Then by (3), a^{-1} \in H too, then by (2), a a^{-1} = e is also in H. (Previously I’ve been using \odot to denote the binary operation of a group, but this is going to get tedious fast: from now on I will just omit writing an explicit symbol at all, and write ab instead of a \odot b. This is standard group theory notation.)

Let’s see some examples. Remember the example group \mathbb{Z}_8, which consists of the numbers 0 through 7, with a binary operation of addition \pmod 8. Let’s first consider the subset \{0,1,2,3\}. Is this is a subgroup of \mathbb{Z}_8? No, it isn’t: it’s not closed under the binary operation. For example, 2 +_8 3 = 5 which is not in the subset.

\{0,2,4,6\}, on the other hand, is indeed a subgroup of \mathbb{Z}_8. It is obviously nonempty. We can check that combining any two elements of the set according to the binary operation lands us back in the set (adding two even numbers \pmod 8 always results in an even number again), and the inverse of each element is in the set (0 and 4 are their own inverses, and 2 and 6 are inverses).

Any group is trivially a subgroup of itself (the “subset” in the definition does not have to be a strict subset). Also, the group with a single element is a subgroup of any group. So we have found three subgroups of \mathbb{Z}_8. There is one more—can you find it?

Now we can state Lagrange’s Theorem: if H is a subgroup of G, then the order of H evenly divides the order of G.

We’ll spend the rest of the month proving this. Here’s an outline for the rest of the proof, one blog post for each item below. (The proof will not actually take quite as long as I thought!) Supposing H is a subgroup of G:

  1. Define some subsets of G called the left cosets of H, and show that they all have the same size as H.
  2. Define (and prove) a certain equivalence relation on elements of G, defined in terms of H.
  3. Show that the equivalence classes of any equivalence relation form a partition.
  4. Show that the equivalence classes for the equivalence relation we defined are exactly the left cosets of H. Conclude that since every coset of H is the same size, and they partition G, their common size (that is, the size of H) must evenly divide the size of G. This will conclude the proof of Lagrange’s Theorem.
  5. In the final post, we will define the cyclic subgroup generated by an element g \in G and show that the order of this subgroup is the same as the order of g—hence by Lagrange’s Theorem the order of g must divide the order of G.


  1. This is usually called the two-step subgroup test (there are, of course, three conditions, but the condition that H be nonempty is usually so trivial that it doesn’t count). There are other ways to check whether some subset is a subgroup, most notably the one-step subgroup test which, besides H being nonempty, requires only that ab^{-1} \in H for every a, b \in H. It is a nice exercise in basic group theory to prove that this is equivalent.
Posted in algebra, proof | Tagged , , , , | Leave a comment

MaBloWriMo 24: Bezout’s identity

A few days ago we made use of Bézout’s Identity, which states that if a and b have a greatest common divisor d, then there exist integers x and y such that ax + by = d. For completeness, let’s prove it.

Consider the set of all linear combinations of a and b, that is,

\{ax + by \mid x, y \in \mathbb{Z} \},

and suppose d = as + bt is the smallest positive integer in this set. For example, if a = 10 and b = 6, then you can check that, for example, 16 = a + b, and 8 = 2a - 2b, and 4 = a - b are all in this set, as is, for example, -4 = -a + b, but the smallest positive integer you can get is 2 = 2a - 3b. We will prove that in fact, d is the greatest common divisor of a and b.

Consider dividing a by d. This will result in some remainder r such that 0 \leq r < d. I claim the remainder is also of the form ax + by for some integers x and y: note that a = a1 + b0 is of this form, and d is of this form by definition, and we get the remainder by subtracting some number of copies of d from a. Subtracting two numbers of the form ax + by works by subtracting coefficients, yielding another number of the same form again. But d is supposed to be the smallest positive number of this form, and r is less than d—which means r has to be zero, that is, d evenly divides a. The same argument shows that d evenly divides b as well. So d is a common divisor of a and b.

To see that d is the greatest common divisor, suppose c also divides a and b. Then since d = ax + by, we can see that c must divide d as well—so it must be less than or equal to d.

Voila! This proof doesn’t show us how to actually compute some appropriate x and y given a and b—that can be done using the extended Euclidean algorithm; perhaps I’ll write about that some other time. But this proof will do for today.

For the remainder of the month, as suggested by commented janhrcek, we’ll prove the thing I hinted at in an earlier post: namely, that the order of any group element is always a divisor of the order of the group. This is a really cool proof that hints at some much deeper group theory.

Posted in algebra, arithmetic, computation, famous numbers, iteration, modular arithmetic, number theory, primes | Tagged , , , , , , | 2 Comments

MaBloWriMo 23: contradiction!

So, where are we? We assumed that s_{n-2} is divisible by M_n, but M_n is not prime. We picked a divisor q of M_n and used it to define a group X^*, and yesterday we showed that \omega has order 2^n in X^*. Today we’ll use this to derive a contradiction.

Recall that we picked q so that q^2 \leq M_n—we can always pick a divisor of M_n that is less than (or equal to) the square root of M_n. We then defined X in terms of q as

X = \{ a + b\sqrt 3 \mid a, b \in \mathbb{Z}; 0 \leq a, b < q \}.

So how many elements are in the set X? That’s not too hard: there are q choices for the coefficient a, and q choices for the coefficient b; each pair of choices gives a different element of X, which therefore contains q^2 elements.

So what about the order of X^*? We got X^* by throwing away elements from X without an inverse. At least we know that 0 doesn’t have an inverse. There might be more, depending on q, but at least we can say that |X^*| \leq q^2 - 1.

\omega is in the group X^*, and we showed that the order of an element cannot exceed the order of the group. But, check this out:

|X^*| \leq q^2 - 1 \leq M_n - 1 < M_n + 1 = 2^n = |\omega|

The order of the group |X^*| is less than the order of |\omega|! This is a contradiction. So, our assumption that M_n has a nontrivial divisor q must be wrong—M_n is prime!

It took 23 posts, but we have finally proved one direction of the Lucas-Lehmer test: if computing s_{n-2} yields something divisible by M_n, then M_n is definitely prime.

And now I have to decide what to do with the remaining week. Of course, there is another direction to prove: we have only shown that the Lucas-Lehmer test correctly identifies primes (if s_{n-2} is divisible by M_n, then M_n is prime), but it is possible to also prove the converse, that the Lucas-Lehmer test identifies all the Mersenne primes (if M_n is prime, then s_{n-2} will be divisible by M_n). I am still trying to figure out how difficult this proof is. I don’t think we’ll be able to fit it in the last 7 days of the month, but it still might be worth starting it and finishing it on a more relaxed schedule.

Of course, I’m also open to questions, suggestions, etc.!

Posted in algebra, arithmetic, computation, famous numbers, iteration, modular arithmetic, number theory, primes | Tagged , , , , , , | 5 Comments

MaBloWriMo 22: the order of omega, part II

Yesterday, from the assumption that s_{n-2} is divisible by M_n, we deduced the equations

\omega^{2^{n-1}} = q-1


\omega^{2^n} = 1

which hold in the group X^*. So what do these tell us about the order of \omega? Well, first of all, the second equation tells us that the order of \omega must be a divisor of 2^n, and the only divisors of 2^n are other powers of 2. So the order of \omega must be 2^k for some k \leq n.

Now suppose the order of \omega is 2^k, so \omega^{2^k} = 1. But then if we square both sides we get \omega^{2^{k+1}} = 1. Squaring again gives \omega^{2^{k+2}} = 1, and so on. So once we hit 1, we are stuck there: raising \omega to all bigger powers of 2 will also yield 1.

But now look at the first equation: \omega^{2^{n-1}} = q-1. Remember that the order of \omega has to be a power of 2. From this equation we can see that the order can’t be 2^{n-1}. Could it be a smaller power of two? In fact, no, it can’t, by the argument in the previous paragraph: once you hit a power of two that yields 1, all the higher powers also have to yield 1. So if \omega raised to any smaller power of 2 were the identity, then \omega^{2^{n-1}} would also have to be the identity—but it isn’t.

The inescapable conclusion is that the only possibility for the order of \omega is exactly 2^n.

So, how does that help? Hint: think about the order of the group X^*… the triumphant conclusion tomorrow!

Posted in algebra, arithmetic, computation, famous numbers, iteration, modular arithmetic, number theory, primes | Tagged , , , , , , | 1 Comment