Book review: The Mathematics of Various Entertaining Subjects, Volume 3

I have a bunch of books in the queue to review—I hope to begin writing these more regularly again!

[Disclosure of Material Connection: Princeton Press kindly provided me with a free review copy of this book. I was not required to write a positive review. The opinions expressed are my own.]

The Mathematics of Various Entertaining Subjects, Volume 3: The Magic of Mathematics
Jennifer Beineke and Jason Rosenhouse, eds.
Princeton University Press, 2019

The MOVES conference takes place every two years in New York. MOVES is an acronym for “The Mathematics of Various Entertaining Subjects”, and the conference is a celebration of math that isn’t necessarily considered an Important Research Topic, and doesn’t necessarily have Important Applications—but simply math that is fun for its own sake. (Although in hindsight, math that starts out as Just For Fun often seems to end up with important applications too—for example, think of graph theory or probability theory.) The most recent conference took place just a few months ago, in August 2019; the next one will be in August 2021 (you can already register if you like to plan that far ahead!).

This book is basically the conference proceedings from 2017—a collection of papers that were presented at the conference, published all together in book form. So it’s important to state at the outset that although the topics are entertaining, this really is a collection of research papers. Overall this is definitely not a book written for a general audience! I had to work hard to understand some of the papers, and some of them lost me completely.

However, there’s some great stuff in here that rewards patient study. Some of my favorites that are more generally accessible include:

  • A chapter on “Wiggly Games and Burnside’s Lemma” that does a great job explaining Burnside’s Lemma—a classic result about counting things with symmetry, at the intersection of combinatorics and group theory—via applications to counting the number of possible tiles in several different games.

  • “Solving Puzzles Backwards” has some nice puzzles and a discussion of elegant ways to approach their solutions.

  • “Should we Call Them Flexa-Bands?” has some interesting reflections on the topology of different types of flexagons.

Some other things I particularly enjoyed but which are not so accessible without some background include a chapter on the computational complexity of losing at checkers, a chapter on “Kings, sages, hats, and codes” that I wish I understood better, and a chapter on the combinatorics of Legos.

There’s so much other stuff in there on such wildly varying topics that it’s impossible to summarize. In any case, definitely recommended if you are a professional mathematician looking for some fun yet still technically meaty reading; definitely not recommended if you’re looking for a casual read of a popular math book. And if you’re somewhere in between—that is, you’re not a professional mathematician but you aspire to read and understand things on that level—this could honestly be a great place to start!

Posted in books, review | Tagged , , , , | Leave a comment

A combinatorial proof: PIE a la mode!

Continuing from my last post in this series, we’re trying to show that S = n!, where S is defined as

\displaystyle S = \sum_{i=0}^n (-1)^i \binom{n}{i} (k+(n-i))^n

which is what we get when we start with a sequence of n+1 consecutive nth powers and repeatedly take successive differences.

Recall that we defined M_a as the set of all functions from a set of size n (visualized as n blue dots) to a set of size k + n (visualized as k yellow dots on top of n blue dots) such that the blue dot numbered a is missing. I also explained in my previous post that the functions with at least one blue dot missing from the output are exactly the “bad” functions, that is, the functions which do not correspond to a one-to-one matching between the blue dots on the left and the blue dots on the right.

As an example, the function pictured above is an element of M_1, as well as an element of M_3. (That means it’s also an element of the intersection M_1 \cap M_3—this will be important later!)

Let F be the set of all functions from n to k+n, and let P be the set of “good” functions, that is, the subset of F consisting of matchings (aka Permutations—I couldn’t use M for Matchings because M is already taken!) between the blue sets. We already know that the number of matchings between two sets of size n, that is, |P|, is equal to n!. However, let’s see if we can count them a different way.

Every function is either “good” or “bad”, so we can describe the set of good functions as what’s left over when we remove all the bad ones:

\displaystyle P = F - \bigcup_{a=1}^n M_a

(Notice how we can’t just write P = F - M_1 - M_2 - \dots - M_n, because the M_a sets overlap! But if we union all of them we’ll get each “bad” function once.)

In other words, we want to count the functions that aren’t in any of the M_a. But this is exactly what the Principle of Inclusion-Exclusion (PIE) is for! PIE tells us that the size of this set is

\displaystyle |P| = |F| - \left|\bigcup_{a=1}^n M_a \right| = \sum_{T \subseteq \{1 \dots n\}} (-1)^{|T|}\left| \bigcap_{a \in T} M_a \right|,

that is, we take all possible intersections of some of the M_a, and either add or subtract the size of each intersection depending on whether the number of sets being intersected is even or odd.

We’re getting close! To simplify this more we’ll need to figure out what those intersections \bigcap_{a \in T} M_a look like.

Intersections

What does M_a \cap M_b look like? The members of M_a \cap M_b are exactly those functions which are in both M_a and M_b, so M_a \cap M_b contains all the functions that are missing both a and b (and possibly other elements). Likewise, M_a \cap M_b \cap M_c contains all the functions that are missing (at least) a, b, and c; and so on.

Last time we argued that |M_a| = (k + (n-1))^n, since functions from n to k+n that are missing a can be put into a 1-1 matching with arbitrary functions from n to k + (n-1), just by deleting or inserting element a:

So what about an intersection—how big is M_a \cap M_b (assuming a \neq b)? By a similar argument, it must be (k + (n-2))^n, since we can match up each function in M_a \cap M_b with a function from n to k+(n-2): just delete or insert both elements a and b, like this:

Generalizing, if we have a subset T \subseteq \{1, \dots, n\} and intersect all the M_a for a \in T, we get the set of functions whose output is missing all the elements of T, and we can match them up with functions from n to k + (n-|T|). In formal notation,

\displaystyle \left| \bigcap_{a \in T} M_a \right| = (k + (n-|T|))^n

Substituting this into the previous expression for the number of blue matchings |P|, we get

\displaystyle |P| = \sum_{T \subseteq \{1 \dots n\}} (-1)^{|T|}(k + (n-|T|))^n

Counting subsets

Notice that the value of (-1)^{|T|}(k + (n-|T|))^n depends only on the size of the subset T and not on its specific elements. This makes sense: the number of functions missing some particular number of elements is the same no matter which specific elements we pick to be missing.

So for each particular size |T| = i, we are adding up a bunch of copies of the same value (-1)^i (k + (n-i))^n—as many copies as there are different subsets of size i. The number of subsets T of size i is \binom n i, the number of ways of choosing exactly i things out of n. Therefore, if we add things up size by size instead of subset by subset, we get

\begin{array}{rcl} |P| &=& \displaystyle \sum_{\text{all possible sizes } i} (\text{number of subsets of size } i) \cdot \left[(-1)^{i}(k + (n-i))^n \right] \\[1em] &=& \displaystyle \sum_{i=0}^n \binom n i (-1)^{i}(k + (n-i))^n\end{array}

But this is exactly the expression for S that we came up with earlier! And since we already know |P| = n! this means that S = n! too.

And that’s essentially it for the proof! I think there’s still more to say about the big picture, though. In a future post I’ll wrap things up and offer some reflections on why I find this interesting and where else it might lead.

Posted in arithmetic, combinatorics, proof | Tagged , , , , , | Leave a comment

A few words about PWW #27

The images in my last post were particular realizations of the famous Sieve of Eratosthenes. The basic idea of the sieve is to repeatedly do the following:

  • Circle the next number bigger than 1 that is not yet crossed out, call it p.
  • Cross out every pth number after p, that is, all the multiples of p. These are not prime since they are multiples of p.

This is an efficient way to find all the primes up to a given limit. Note that it doesn’t require doing any division or factoring, just adding. Here’s the image of the 10 \times 10 sieve again:

Some questions for you to ponder:

  • Why can we always cross out multiples of each prime p using parallel straight lines?
  • When will the lines be vertical? When will they be diagonal?
  • Is there only one way to cross out multiples of each p with straight lines, or could we have chosen different lines?
  • Why are the primes on the top row circled, while the rest of the primes are in boxes? What’s the difference?

And just for fun here’s the sieve diagram for n = 29, one of my favorites. Click here for a larger version.

Posted in pattern, pictures, posts without words, primes | Tagged , , | 4 Comments

Post without words #27

Image | Posted on by | Tagged , , | 7 Comments

Order of operations considered harmful

[The title is a half-joking reference to Edsger Dijkstra’s classic paper, Go To Statement Considered Harmful; see here for more context.]

Everyone is probably familiar with the so-called “order of operations”, which is

a collection of rules that reflect conventions about which procedures to perform first in order to evaluate a given mathematical expression.

(the above quote is from the Wikipedia page on Order of Operations). If you grew up in the US, like me, you might have memorized the acronym PEMDAS; I have recently learned that people in other parts of the world use other acronyms, like BEDMAS. In any case, these mnemonics help you remember the order in which you should conventionally perform various arithmetic operations, right?

For example:

\begin{array}{rl} & 1 + 3 \times (5 - 3)^4/2 \\ = & 1 + 3 \times 2^4 / 2 \\ = & 1 + 3 \times 16 / 2 \\ = & 1 + 48 / 2 \\ = & 1 + 24 \\ = & 25.\end{array}

Makes sense, right? We did the stuff in parentheses first, then the exponent, then the multiplication and division (from left to right), then the addition.

OK, pop quiz: what is the value of

3 + 0 \times 37^{984}?

Of course it is 3, you say. Aha, but did you follow the order of operations? I thought not. Supposedly the order of operations tells you that you have to perform the exponentiation before the multiplication, but I am willing to bet that you skipped straight over the exponentiation and did the multiplication first! Really, you should be ashamed of yourself.

Another pop quiz: what is the value of

2 \times 7 + 3 \times 5 + 4 \times 6?

Well, let’s see:

\begin{array}{rl}& 2 \times 7 + 3 \times 5 + 4 \times 6 \\[0.5em] =& 14 + 3 \times 5 + 4 \times 6 \\[0.5em] =& 14 + 15 + 4 \times 6 \\[0.5em] =& 29 + 4 \times 6 \\[0.5em] =& 29 + 24 \\[0.5em] =& 53\end{array}

Easy peasy. But wait, did you notice what I did there? I did one of the additions before I did the last multiplication! According to the “order of operations” this is not allowed; you are supposed to perform multiplication before addition, right??

One more quiz: solve for y in the equation

20 = 2 + 3y.

Of course we can proceed by subtracting 2 from both sides, resulting in 18 = 3y, and then dividing both sides by 3, finding that y = 6 is the unique solution.

How did we know not to add the 2 and the 3, resulting in the bogus equation 20 = 5y? Because of the order of operations, of course… but wait, what does the order of operations even mean here? Did you notice that we never actually performed the multiplication or addition at all?

My point is this: casting the “order of operations” in performative terms—as in, the order we should perform the operations—is grossly misleading.

You might think all my examples can be explained away easily enough—and I agree—but if you take literally the idea of the order of operations telling us what order to perform the operations (as many students will, and do), they don’t make sense. In fact, I would argue that saying the order of operations is about the order of performing the operations is worse than misleading, it is just plain wrong.

So what is it, really?

I made the title of this post provocative on purpose, but of course I am not actually arguing against the order of operations in and of itself. We certainly do need to agree on whether 1 + 2 \times 3 should be 7 or 9. But we need a better way of explaining it than saying it is the “order in which we perform the operations”.

Any mathematical expression is fundamentally a tree, where each operation is a node in the tree with the things it operates on as subtrees. For example, consider the example expression 1 + 3 \times (5 - 3)^4/2 from the beginning of the post. As a tree, it looks like this:

This tells us that at the very top level, the expression consists of an addition, specifically, the addition of the number 1 and some other expression; that other expression is a division (quick check: do you see why the division should go here, and not the multiplication?), and so on.

However, pretty much all the writing systems we humans have developed are linear, that is, they consist of a sequence of symbols one after the other. But when you go to write down a tree as a linear sequence of symbols you run into problems. For example, which tree does 3 \mathbin{\triangle} 5 \mathbin{\square} 2 represent?

Without further information there’s no way to tell; the expression 3 \mathbin{\triangle} 5 \mathbin{\square} 2 is ambiguous.

There are two ways to resolve such ambiguity. One is just to add parentheses around every operation. For example, when fully parenthesized, the example expression from before looks like this:

(1 + ((3 \times ((5 - 3)^4))/2))

With one set of parentheses for every tree node, this is an unambiguous way to represent a tree as a linear sequence of symbols. For example, in the case of 3\mathbin{\triangle} 5 \mathbin{\square} 2, we would be forced to write either (3\mathbin{\triangle} (5 \mathbin{\square} 2)) or ((3\mathbin{\triangle} 5) \mathbin{\square} 2), fully specifying which tree we mean.

But, of course, fully parenthesized expressions are quite tedious to read and write. This leads to the second method for resolving ambiguity: come up with some conventions that specify how to resolve ambiguity when it arises. For example, if we had a convention that says \square has a higher precedence than (i.e. “comes before”, i.e. “binds tighter than”) \triangle, then 3 \mathbin{\triangle} 5 \mathbin{\square} 2 is no longer ambiguous: it must mean the left-hand tree (with the \triangle at the top), and if we wanted the other tree we would have to use explicit parentheses, as in (3 \mathbin{\triangle} 5) \mathbin{\square} 2.

Of course, this is exactly what the “order of operations” is: a set of conventions that tells us how to interpret otherwise ambiguous linear expressions as unambiguous trees. In particular, the operations that we usually talk of being “performed first” really just have higher precedence than the other operations. Think of the operations as “magnets” trying to attract things to them; higher precedence means stronger magnets.

I wouldn’t necessarily phrase it this way to a student, though. I have never taught elementary or middle school math, or whichever level it is where this is introduced, but I think if I did I would just tell them:

The order of operations tells us where to put parentheses.

The nice thing is that if you think about adding parentheses instead of performing operations, then the order of operations really does tell you what order to do things: first, add parentheses around any exponentiations; then add parentheses around any multiplications and divisions from left to right; finally, add parentheses around any additions and subtractions from left to right. Of course, if the expression already has any parentheses to begin with, you should leave them alone. This also explains what is “different” about the parentheses/brackets at the beginning of PEMDAS/BEDMAS: you can’t “perform” parentheses like you can “perform” the other operations. But from this new point of view, you don’t even need to include them in the mnemonic: they are different because the whole point is to add more of them. Of course, as students become familiar with this process, at some point they no longer need to actually write out all the parentheses, they can just do it in their heads.

What next?

To be honest I am not sure exactly what the takeaway should be here. I do think we are doing students a disservice by teaching them the misleading idea that the “order of operations” is about the order in which to perform the operations, and sadly it seems this idea is quite widespread. But I am not exactly sure what should be done about it. If you are a mathematics educator—either one who teaches students about the order of operations, or one who has witnessed the effects of their understanding on later subjects—I’d love to hear your responses and ideas!

Posted in arithmetic, teaching | Tagged , , , , , , | 7 Comments

A combinatorial proof: counting bad functions

In a previous post we derived the following expression:

\displaystyle S = \sum_{i=0}^n (-1)^i \binom{n}{i} (k+(n-i))^n.

We are trying to show that S = n!, in order to show that starting with a sequence of n+1 consecutive nth powers and repeatedly taking successive differences will always result in n!. Last time we also made a picture of a typical function from a set of size n to a set of size k+n, which looks like this:

Here’s the plan of action from this point:

  • Start with all the functions from a set of size n to a set of size k+n (i.e. all possible pictures like the above).
  • Subtract off the “bad” functions which don’t correspond to matchings between the blue dots.
  • We will be left with only matchings, of which we know there are n!.
  • Find that the right expression to count the functions via this procedure is exactly S.
  • Rejoice!

Bad functions

What makes a function “bad”? Based on the discussion in my previous post, it seems that there are two things that can make a function “bad”, that is, prevent it from being a matching between the blue sets:

  • An arrow can point to one of the orange dots. This means there won’t be enough arrows left to cover all of the blue dots.
  • Two arrows can “collide”, that is, point to the same blue dot on the right-hand side (if two arrows point to the same orange dot then it’s already bad because of the first reason). This means it can’t be a valid matching, because a matching has to match each blue dot on the left with exactly one blue dot on the right and vice versa.

The example function above has both these problems:

From another point of view, however, there is really only one thing that makes a function bad, which encompasses both points above: a function is bad if and only if there is at least one “missing” blue dot, that is, a blue dot on the right-hand side which is not pointed to by an arrow. On the one hand, if there is a missing blue dot, then the function obviously can’t be a matching, since the missing blue dot is unmatched. On the other hand, if there are no missing dots, then each dot must be matched with a unique dot on the left-hand side, since there are the same number of blue dots on both sides. (In fact, if a blue dot is missing, it will always be caused by one of the two problems listed above—there will either be an arrow pointing to an orange dot, or two arrows that collide.) Here is the same function again, but with the missing elements marked as “bad” by fading them out a bit:

Our goal is now to “subtract off” these bad functions, that is, the functions with at least one missing blue dot.

Counting bad functions

Let’s number the blue dots from 1 to n (with 1 at the top), and define M_i as the set of all functions from n to k+n where blue dot i is missing (M is for “Missing”).

So, for example, the function from before is an element of M_3, since element 3 is missing:

How big is M_i? Well, if we just leave out the missing blue dot i (and renumber the blue dots with numbers greater than i), we are left with a function from n to k+(n-1); conversely, every function from n to k+(n-1) can be made into a function from n to k+n with blue dot i missing, just by inserting a new dot numbered i (and again renumbering the dots with numbers \geq i).

So, since we can put the function in M_i in one-to-one correspondence with all the functions from a set of size n to a set of size k+(n-1), the size of M_i is just the number of such functions, that is, (k+(n-1))^n.

At this point, however, we run into a snag: as you can see, the example function we’ve been using is also an element of M_1, since 1 is missing too. And this is why we can’t simply subtract each of the M_i and call it a day: the sets overlap, so if we subtract all of them we would be subtracting functions multiple times.

Hmm, subsets that overlap, and we want to count the things that are in none of the subsets… this is starting to sound familiar! We need some PIE, of course. In my next post we’ll start putting some of the pieces together!

Posted in arithmetic, combinatorics, proof | Tagged , , , , , | 1 Comment

A combinatorial proof: functions and matchings

We’re trying to prove the following equality (see my previous post for a recap of the story so far):

\displaystyle \sum_{i=0}^n (-1)^{n-i} \binom{n}{i} (k+i)^n = n!

In particular we’re trying to show that the two sides of this equation correspond to two different ways to count the same set of mathematical objects. We already know that the right-hand side counts the number of permutations of a set of size n, or put another way, the number of one-to-one matchings between two different sets of size n. Somehow, then, we need to show that the left side of the equation is just a complicated way to count the same thing.

Cosmetic surgery

Let’s call the left-hand side of the equation S, that is,

\displaystyle S = \sum_{i=0}^n (-1)^{n-i} \binom{n}{i} (k+i)^n.

The first thing I want to do is switch the direction of the summation. That is, everywhere we currently have i I want to replace it with n-i, and vice versa. So 0 will switch places with n, and 1 will switch with n-1, and so on. Ultimately we will still have the same values of i, just in the reverse order. Of course, the order in which we add up a bunch of things doesn’t matter (since addition is commutative), so this won’t change the value of S at all.

\displaystyle S = \sum_{i=0}^n (-1)^i \binom{n}{n-i} (k+(n-i))^n

Finally, note that \binom{n}{n-i} = \binom{n}{i} (there are the same number of ways to choose n-i things out of n as there are to choose i things to exclude), so we can write

\displaystyle S = \sum_{i=0}^n (-1)^i \binom{n}{i} (k+(n-i))^n.

Functions

When i = 0, S involves the term (k+n)^n. Let’s think about what that counts. I explained in a previous post that a^b is the number of functions from a set of size b to a set of size a (as a reminder, this is because the function gets to independently choose where to map each input value—so there are b choices for each of the a inputs, for a total of b \cdot b \cdot \dots \cdot b = b^a choices). So let’s draw a typical function from a set of size n to a set of size k+n:

For this example I chose n = 5 and k = 3. The blue dots on the left represent the set of size n. The dots on the right represent the set of size k+n: I made n of them blue to emphasize that they “correspond” to the blue dots on the left, and the remaining k “extra” dots are orange.

The arrows indicate what the function does. To be a valid function, it has to be defined for each element of the input set; that is, in the picture, there has to be exactly one arrow coming out of each dot on the left. However, the arrows can go anywhere. Note that in the example above, one arrow points to an orange dot, and two other arrows point the same blue dot (we could say they “collide”).

Some of these functions are actually matchings between the blue dots on the left and right, that is, every arrow points to a blue dot, and none of the arrows collide. For example:

We already know that the right-hand side of the equation, n!, is the number of such matchings. And we’re going to show that the left-hand side of the equation is what you get if you start with all the possible functions from n to n+k, and then subtract the ones that aren’t matchings. Of course, these should give you the same result! If you like, between now and my next post, you can try to work out the details for yourself.

Posted in arithmetic, combinatorics, proof | Tagged , , , , , | 5 Comments