Opt Art: From Mathematical Optimization to Visual Design
Robert Bosch
Princeton University Press, 2019
I recently finished reading Robert Bosch’s new book, Opt Art. It was a quick read, both because it’s not actually that long and because it was fascinating and beautiful and I didn’t want to put it down!
The central theme of the book is using linear optimization (aka “linear programming”) to design and generate art. The resulting art can be simply beautiful for its own sake, or can also give us insight into underlying mathematics.
Linear optimization is something I knew about in a general sense, but after reading Bosch’s book I understand it much better—both the details of how the simplex algorithm works, and especially the various ways linear optimization can be applied. I think Bosch does a fantastic job explaining things in a way that gives real insight but doesn’t get bogged down in too much detail. (In a few places I personally wish there had been a few more details—but it’s quite possible that adding more detail would have made the book better for me but worse for a bunch of other people, i.e. it would not be a global optimum!)
Another thing the book explains really well is how the Travelling Salesman Problem (TSP) can be solved using linear optimization. I had no idea there was a connection between the two topics. I’m sure the connection is explained in great detail in the TSP book by William Cook, which I read 7 years ago, but for some reason when I read that I guess it didn’t really click. But from reading Bosch’s book I feel like I now know enough to put together the details and implement a basic TSP solver myself if I wanted to (maybe I will)!
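In fact, I couldn’t resist sketching what a bare-bones version might look like. To be clear, this is my own sketch, not code from the book: it assumes the Python PuLP library for the integer-programming part, and it uses the compact Miller–Tucker–Zemlin subtour constraints, whereas serious solvers of the kind Cook describes add subtour-elimination constraints lazily as they are violated.

```python
# Toy TSP solver via integer linear programming (a sketch, assuming PuLP).
import itertools
import math
import random

import pulp

random.seed(1)
n = 8
points = [(random.random(), random.random()) for _ in range(n)]
dist = {(i, j): math.dist(points[i], points[j])
        for i, j in itertools.permutations(range(n), 2)}

prob = pulp.LpProblem("tsp", pulp.LpMinimize)
# x[i, j] = 1 if the tour travels directly from city i to city j.
x = pulp.LpVariable.dicts("x", dist.keys(), cat="Binary")
# u[i] = position of city i in the tour, used to rule out subtours (MTZ).
u = pulp.LpVariable.dicts("u", range(1, n), lowBound=1, upBound=n - 1)

prob += pulp.lpSum(dist[e] * x[e] for e in dist)  # minimize total length
for i in range(n):  # each city has exactly one outgoing and one incoming edge
    prob += pulp.lpSum(x[i, j] for j in range(n) if j != i) == 1
    prob += pulp.lpSum(x[j, i] for j in range(n) if j != i) == 1
for i, j in dist:   # Miller-Tucker-Zemlin constraints forbid disconnected loops
    if i != 0 and j != 0:
        prob += u[i] - u[j] + n * x[i, j] <= n - 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
tour, city = [0], 0
for _ in range(n):
    city = next(j for j in range(n) if j != city and pulp.value(x[city, j]) > 0.5)
    tour.append(city)
print(tour, pulp.value(prob.objective))
```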
I’m definitely inspired to use some of Bosch’s techniques to make my own artwork—if I do, I will obviously post about it here!
Robert Anderson noted that poker players sometimes use the second hand of a watch to introduce some randomness into their strategy. I assumed this would be something like getting a random bit based on whether the number of seconds is even or odd, but Pete McAllister chimed in to say that it is usually something more like dividing a minute into chunks, and making a decision based on which chunk the current second is in. For example, if you want to make one choice 20 percent of the time and another choice 80 percent of the time, you could just make the first choice if the second hand is between 0–12 seconds, and the other choice otherwise.
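Just to make the chunking trick concrete, here is a tiny sketch of my own (the function name and the use of the local clock are my choices, not anything Pete described):

```python
# Divide the minute into chunks proportional to the desired probabilities,
# then decide based on where the second hand currently is.
import time

def watch_decision(p_first=0.2):
    """Return True with (roughly) probability p_first, using the seconds."""
    seconds = time.localtime().tm_sec   # 0-59
    return seconds < 60 * p_first       # seconds 0-11 -> True when p_first=0.2

print(watch_decision())  # True about 20% of the time, assuming the moments
                         # you consult your watch aren't correlated with it
```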
In game theory this is called a “mixed” strategy, and this kind of strategy can arise naturally as the Nash equilibrium of certain kinds of games, so it’s not surprising to me that it would show up in high-level poker. I found conflicting advice about this online; some people were claiming that you should not use randomness when playing poker, but I did find a website that talked about implementing this kind of mixed strategy using the second hand of a watch, and it seemed to be a website with pretty high-level poker advice.
In any case, if you have a phone or a watch with you, this does suggest some strategies for generating random numbers: for example, look at the last digit of the seconds to get a random number from 0–9, or whether it is even or odd to get a bit. Or you could just take the number of seconds directly as a random number between 0–59. Of course this only works once and then you have to wait a while before you can do it again. Also, it turns out that my phone doesn’t show seconds by default. Taking the ones digit of the minutes as a random number from 0–9 should work too, but the tens digit of the minutes seems like it’s “not random enough”, in the sense that it might be correlated with whatever it is that I’m doing.
Of course, a phone or watch counts as an “aid”, but most people tend to carry around something like this all the time, so it’s relatively practical. On the other hand, if you’re going to use a phone anyway, you should just use an app for generating random numbers.
Naren Sundar commented that hair is pretty random, but admitted that it would be hard to measure.
Frederik suggested spitting, or throwing your shoe in the air and seeing which way the toe points when it lands. I like the shoe idea, but on the other hand it’s somewhat obtrusive to take your shoe off and throw it in the air every time you want a random bit! And what if you’re not wearing shoes? I’m also afraid I might throw my shoe in the same way every time; I’m not sure how random it would be in practice.
Kaligule suggested taking whatever song is currently running through your head, stopping it at a random point, and getting a random bit by seeing whether the number of consonants in the next word is even or odd.
This is a cool idea, and is the only proposal that really meets my criterion of generating randomness “without aids”. I think for some people it could work quite well. “Stopping at a random point” is somewhat problematic—you might be biased to stop at certain points more than others—but it’s pretty hard to know how many consonants are in a word before you count, so I’m not sure this would really bias the results that much.
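Here’s a quick sketch of the idea, just to make it concrete (the function name, and the choice to count “y” as a consonant, are mine):

```python
# Take the next word of whatever song is in your head and use the parity
# of its consonant count as one random bit.
def song_bit(word: str) -> int:
    consonants = sum(c.isalpha() and c.lower() not in "aeiou" for c in word)
    return consonants % 2

# e.g. stopping on the word "before":
print(song_bit("before"))  # 3 consonants (b, f, r) -> bit 1
```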
Unfortunately, however, it won’t work for me because, although I do always have some kind of music running through my head, it often has no lyrics! Kaligule suggested using whether the melody goes up or down, but this is obvious (unlike number of consonants in a word) and too easy to “cheat”, i.e. pick a stopping point that gives me the bit I “want”.
This suggested another idea to me, however: just pre-generate some random data and put some up-front effort into memorizing it. Whenever you need some randomness, use the next part of the random sequence you memorized. When you use it up, generate another and memorize that instead. This leaves a number of questions:
How do you reliably keep track of where you are in the sequence? I don’t actually have a good answer to this. I think in practice I would get confused and forget whether I had already used a certain part or not. Though maybe this doesn’t really matter that much.
What format would be most effective, and how do you go about memorizing it? Some ideas:
My first idea is to generate a sequence of random bits, and then write a story where sequential words have even or odd numbers of letters corresponding to the bits in your sequence. Unfortunately, this seems like a relatively inefficient way to memorize data, but writing a story that corresponds to a given sequence of bits does sound like a fun exercise in constrained writing.
Alternatively, one could simply generate a random sequence of digits (or hexadecimal digits) and memorize them using whatever sort of memorization technique you like (e.g. a memory palace). This is less fun but probably more effective. Memorizing a story sounds like it would be easier, but I don’t think it is, especially since you would have to memorize it word-for-word and you only get one bit per word memorized, as opposed to something like e.g. four bits per hexadecimal digit.
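For concreteness, here is a sketch of how one might generate the raw material (using Python’s secrets module; the amounts and chunk sizes are arbitrary choices of mine):

```python
# Generate cryptographically secure hex digits worth memorizing.
# (The memory-palace part is up to you.)
import secrets

digits = secrets.token_hex(16)   # 32 hex digits = 128 random bits
print(digits)                    # memorize, e.g., in chunks of 4 digits

# Each hex digit carries 4 bits, versus 1 bit per memorized story word:
bits = bin(int(digits, 16))[2:].zfill(4 * len(digits))
print(bits[:12])                 # peel off bits as needed
```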
I have generated some random hexadecimal digits but haven’t gotten around to trying to memorize them yet. If I do I will definitely report on the experience. In the meantime, I’m also open to more ideas!
[Disclosure of Material Connection: Princeton Press kindly provided me with a free review copy of this book. I was not required to write a positive review. The opinions expressed are my own.]
The Mathematics of Various Entertaining Subjects, Volume 3: The Magic of Mathematics
Jennifer Beineke and Jason Rosenhouse, eds.
Princeton University Press, 2019
The MOVES conference takes place every two years in New York. MOVES is an acronym for “The Mathematics of Various Entertaining Subjects”, and the conference is a celebration of math that isn’t necessarily considered an Important Research Topic, and doesn’t necessarily have Important Applications—but simply math that is fun for its own sake. (Although in hindsight, math that starts out as Just For Fun often seems to end up with important applications too—for example, think of graph theory or probability theory.) The most recent conference took place just a few months ago, in August 2019; the next one will be in August 2021 (you can already register if you like to plan that far ahead!).
This book is basically the conference proceedings from 2017—a collection of papers that were presented at the conference, published all together in book form. So it’s important to state at the outset that although the topics are entertaining, this really is a collection of research papers. Overall this is definitely not a book written for a general audience! I had to work hard to understand some of the papers, and some of them lost me completely.
However, there’s some great stuff in here that rewards patient study. Some of my favorites that are more generally accessible include:
A chapter on “Wiggly Games and Burnside’s Lemma” that does a great job explaining Burnside’s Lemma—a classic result about counting things with symmetry, at the intersection of combinatorics and group theory—via applications to counting the number of possible tiles in several different games. (See the small worked example after this list.)
“Solving Puzzles Backwards” has some nice puzzles and a discussion of elegant ways to approach their solutions.
“Should we Call Them Flexa-Bands?” has some interesting reflections on the topology of different types of flexagons.
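To make Burnside’s Lemma concrete: the number of distinct objects up to symmetry is the average, over all symmetries, of the number of objects each symmetry fixes. Here is a tiny computation of my own devising (not an example from the chapter), counting 2×2 tiles whose cells are colored black or white, where two tiles count as the same if one is a rotation of the other:

```python
from itertools import product

def rotate(t):
    # Cells are listed clockwise from the top-left corner, so a 90-degree
    # clockwise rotation of the square is just a cyclic shift of the tuple.
    return (t[3], t[0], t[1], t[2])

tiles = list(product([0, 1], repeat=4))  # all 16 colorings
total_fixed = 0
for k in range(4):            # the four rotations: 0, 90, 180, 270 degrees
    for t in tiles:
        r = t
        for _ in range(k):
            r = rotate(r)
        total_fixed += (r == t)          # count tiles fixed by this rotation
print(total_fixed // 4)       # Burnside: (16 + 2 + 4 + 2) / 4 = 6 distinct tiles
```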
Some other things I particularly enjoyed but which are not so accessible without some background include a chapter on the computational complexity of losing at checkers, a chapter on “Kings, sages, hats, and codes” that I wish I understood better, and a chapter on the combinatorics of Legos.
There’s so much other stuff in there on such wildly varying topics that it’s impossible to summarize. In any case, definitely recommended if you are a professional mathematician looking for some fun yet still technically meaty reading; definitely not recommended if you’re looking for a casual read of a popular math book. And if you’re somewhere in between—that is, you’re not a professional mathematician but you aspire to read and understand things on that level—this could honestly be a great place to start!
We are trying to prove the identity

$\displaystyle \sum_{k=0}^{n} (-1)^k \binom{n}{k} (m-k)^n = n!,$

which is what we get when we start with a sequence of consecutive $n$th powers and repeatedly take successive differences.
Recall that we defined $M_i$ as the set of all functions from a set of size $n$ (visualized as blue dots) to a set of size $m$ (visualized as yellow dots on top of blue dots) such that the blue dot numbered $i$ is missing. I also explained in my previous post that the functions with at least one blue dot missing from the output are exactly the “bad” functions, that is, the functions which do not correspond to a one-to-one matching between the blue dots on the left and the blue dots on the right.
As an example, the function pictured above is an element of $M_i$ for one of its missing blue dots $i$, as well as an element of $M_j$ for another missing blue dot $j$. (That means it’s also an element of the intersection $M_i \cap M_j$—this will be important later!)
Let $F$ be the set of all functions from the set of size $n$ to the set of size $m$, and let $P$ be the set of “good” functions, that is, the subset of $F$ consisting of matchings (aka Permutations—I couldn’t use $M$ for Matchings because $M$ is already taken!) between the blue sets. We already know that the number of matchings between two sets of size $n$, that is, $|P|$, is equal to $n!$. However, let’s see if we can count them a different way.
Every function is either “good” or “bad”, so we can describe the set of good functions as what’s left over when we remove all the bad ones:

$P = F - (M_1 \cup M_2 \cup \dots \cup M_n)$
(Notice how we can’t just write $|P| = |F| - |M_1| - |M_2| - \dots - |M_n|$, because the sets $M_i$ overlap! But if we union all of them we’ll get each “bad” function once.)
In other words, we want to count the functions that aren’t in any of the $M_i$. But this is exactly what the Principle of Inclusion-Exclusion (PIE) is for! PIE tells us that the size of this set is

$|P| = \displaystyle\sum_{J \subseteq \{1, \dots, n\}} (-1)^{|J|} \left| \bigcap_{i \in J} M_i \right|,$

that is, we take all possible intersections of some of the $M_i$, and either add or subtract the size of each intersection depending on whether the number of sets being intersected is even or odd. (When $J$ is the empty set, the intersection of no sets at all is by convention all of $F$, which contributes the starting value $|F|$.)
We’re getting close! To simplify this more we’ll need to figure out what those intersections look like.
What does $M_i \cap M_j$ look like? The members of $M_i \cap M_j$ are exactly those functions which are in both $M_i$ and $M_j$, so $M_i \cap M_j$ contains all the functions that are missing both $i$ and $j$ (and possibly other elements). Likewise, $M_i \cap M_j \cap M_k$ contains all the functions that are missing (at least) $i$, $j$, and $k$; and so on.
Last time we argued that $|M_i| = (m-1)^n$, since functions from the set of size $n$ to the set of size $m$ that are missing $i$ can be put into a 1-1 matching with arbitrary functions from the set of size $n$ to a set of size $m-1$, just by deleting or inserting element $i$:
So what about an intersection—how big is $|M_i \cap M_j|$ (assuming $i \neq j$)? By a similar argument, it must be $(m-2)^n$, since we can match up each function in $M_i \cap M_j$ with a function from the set of size $n$ to a set of size $m-2$: just delete or insert both elements $i$ and $j$, like this:
Generalizing, if we have a subset $J \subseteq \{1, \dots, n\}$ and intersect all the $M_i$ for $i \in J$, we get the set of functions whose output is missing all the elements of $J$, and we can match them up with functions from the set of size $n$ to a set of size $m - |J|$. In formal notation,

$\left| \bigcap_{i \in J} M_i \right| = (m - |J|)^n.$
Substituting this into the previous expression for the number of blue matchings $|P|$, we get

$|P| = \displaystyle\sum_{J \subseteq \{1, \dots, n\}} (-1)^{|J|} (m - |J|)^n.$
Notice that the value of $(m - |J|)^n$ depends only on the size of the subset $J$ and not on its specific elements. This makes sense: the number of functions missing some particular number of elements is the same no matter which specific elements we pick to be missing.
So for each particular size $k$, we are adding up a bunch of copies of the same value $(m-k)^n$—as many copies as there are different subsets of size $k$. The number of subsets of size $k$ is $\binom{n}{k}$, the number of ways of choosing exactly $k$ things out of $n$. Therefore, if we add things up size by size instead of subset by subset, we get

$|P| = \displaystyle\sum_{k=0}^{n} (-1)^k \binom{n}{k} (m-k)^n.$
But this is exactly the expression for $S$ that we came up with earlier! And since we already know $|P| = n!$, this means that $S = n!$ too.
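As a quick sanity check (mine, not part of the original argument), it’s easy to verify the identity numerically for small values:

```python
# Check that the alternating sum equals n! for various n and m >= n.
from math import comb, factorial

for n in range(7):
    for m in range(n, n + 4):
        s = sum((-1) ** k * comb(n, k) * (m - k) ** n for k in range(n + 1))
        assert s == factorial(n), (n, m, s)
print("identity verified for small n and m")
```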
And that’s essentially it for the proof! I think there’s still more to say about the big picture, though. In a future post I’ll wrap things up and offer some reflections on why I find this interesting and where else it might lead.
This is an efficient way to find all the primes up to a given limit. Note that it doesn’t require doing any division or factoring, just adding. Here’s the image of the sieve again:
Some questions for you to ponder:
And just for fun here’s the sieve diagram for one of my favorite numbers. Click here for a larger version.
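The sieve diagrams themselves aren’t reproduced here, but for concreteness, here is a sketch of an addition-only sieve in this spirit, namely the classic Sieve of Eratosthenes (whether this matches the diagrammed sieve exactly is my assumption; the key point is that it never divides or factors anything):

```python
def primes_up_to(limit):
    is_composite = [False] * (limit + 1)
    primes = []
    for p in range(2, limit + 1):
        if not is_composite[p]:
            primes.append(p)
            multiple = p + p          # cross off 2p, 3p, ... purely by adding
            while multiple <= limit:
                is_composite[multiple] = True
                multiple += p
    return primes

print(primes_up_to(50))  # [2, 3, 5, 7, 11, ..., 47]
```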
Everyone is probably familiar with the so-called “order of operations”, which is
a collection of rules that reflect conventions about which procedures to perform first in order to evaluate a given mathematical expression.
(the above quote is from the Wikipedia page on Order of Operations). If you grew up in the US, like me, you might have memorized the acronym PEMDAS; I have recently learned that people in other parts of the world use other acronyms, like BEDMAS. In any case, these mnemonics help you remember the order in which you should conventionally perform various arithmetic operations, right?
For example:

$1 + 2 \times (3 + 4)^2 \div 14 = 1 + 2 \times 7^2 \div 14 = 1 + 2 \times 49 \div 14 = 1 + 98 \div 14 = 1 + 7 = 8$
Makes sense, right? We did the stuff in parentheses first, then the exponent, then the multiplication and division (from left to right), then the addition.
OK, pop quiz: what is the value of

$3 \times 5 + 2^4$?
Of course it is $31$, you say. Aha, but did you follow the order of operations? I thought not. Supposedly the order of operations tells you that you have to perform the exponentiation before the multiplication, but I am willing to bet that you skipped straight over the exponentiation and did the multiplication first! Really, you should be ashamed of yourself.
Another pop quiz: what is the value of

$2 \times 3 + 4 \times (5 + 6)$?
Well, let’s see: $2 \times 3 = 6$; then $5 + 6 = 11$; then $4 \times 11 = 44$; and finally $6 + 44 = 50$.
Easy peasy. But wait, did you notice what I did there? I did one of the additions before I did the last multiplication! According to the “order of operations” this is not allowed; you are supposed to perform multiplication before addition, right??
One more quiz: solve for $x$ in the equation

$3x + 2 = 17$.
Of course we can proceed by subtracting $2$ from both sides, resulting in $3x = 15$, and then dividing both sides by $3$, finding that $x = 5$ is the unique solution.
How did we know not to add the $3$ and the $2$, resulting in the bogus equation $5x = 17$? Because of the order of operations, of course… but wait, what does the order of operations even mean here? Did you notice that we never actually performed the multiplication or addition at all?
My point is this: casting the “order of operations” in performative terms—as in, the order we should perform the operations—is grossly misleading.
You might think all my examples can be explained away easily enough—and I agree—but if you take literally the idea of the order of operations telling us what order to perform the operations (as many students will, and do), they don’t make sense. In fact, I would argue that saying the order of operations is about the order of performing the operations is worse than misleading, it is just plain wrong.
I made the title of this post provocative on purpose, but of course I am not actually arguing against the order of operations in and of itself. We certainly do need to agree on whether $1 + 2 \times 3$ should be $9$ or $7$. But we need a better way of explaining it than saying it is the “order in which we perform the operations”.
Any mathematical expression is fundamentally a tree, where each operation is a node in the tree with the things it operates on as subtrees. For example, consider the example expression $1 + 2 \times (3 + 4)^2 \div 14$ from the beginning of the post. As a tree, it looks like this:
This tells us that at the very top level, the expression consists of an addition, specifically, the addition of the number $1$ and some other expression; that other expression is a division (quick check: do you see why the division should go here, and not the multiplication?), and so on.
However, pretty much all the writing systems we humans have developed are linear, that is, they consist of a sequence of symbols one after the other. But when you go to write down a tree as a linear sequence of symbols you run into problems. For example, which tree does $1 + 2 \times 3$ represent?
Without further information there’s no way to tell; the expression is ambiguous.
There are two ways to resolve such ambiguity. One is just to add parentheses around every operation. For example, when fully parenthesized, the example expression from before looks like this:

$(1 + ((2 \times ((3 + 4)^2)) \div 14))$
With one set of parentheses for every tree node, this is an unambiguous way to represent a tree as a linear sequence of symbols. For example, in the case of $1 + 2 \times 3$, we would be forced to write either $(1 + (2 \times 3))$ or $((1 + 2) \times 3)$, fully specifying which tree we mean.
But, of course, fully parenthesized expressions are quite tedious to read and write. This leads to the second method for resolving ambiguity: come up with some conventions that specify how to resolve ambiguity when it arises. For example, if we had a convention that says $\times$ has a higher precedence than (i.e. “comes before”, i.e. “binds tighter than”) $+$, then $1 + 2 \times 3$ is no longer ambiguous: it must mean the left-hand tree (with the $+$ at the top), and if we wanted the other tree we would have to use explicit parentheses, as in $(1 + 2) \times 3$.
Of course, this is exactly what the “order of operations” is: a set of conventions that tells us how to interpret otherwise ambiguous linear expressions as unambiguous trees. In particular, the operations that we usually talk of being “performed first” really just have higher precedence than the other operations. Think of the operations as “magnets” trying to attract things to them; higher precedence means stronger magnets.
I wouldn’t necessarily phrase it this way to a student, though. I have never taught elementary or middle school math, or whichever level it is where this is introduced, but I think if I did I would just tell them:
The order of operations tells us where to put parentheses.
The nice thing is that if you think about adding parentheses instead of performing operations, then the order of operations really does tell you what order to do things: first, add parentheses around any exponentiations; then add parentheses around any multiplications and divisions from left to right; finally, add parentheses around any additions and subtractions from left to right. Of course, if the expression already has any parentheses to begin with, you should leave them alone. This also explains what is “different” about the parentheses/brackets at the beginning of PEMDAS/BEDMAS: you can’t “perform” parentheses like you can “perform” the other operations. But from this new point of view, you don’t even need to include them in the mnemonic: they are different because the whole point is to add more of them. Of course, as students become familiar with this process, at some point they no longer need to actually write out all the parentheses, they can just do it in their heads.
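To make the tree-building view concrete, here is a small sketch of mine (not part of the original argument): a precedence-climbing parser that never evaluates anything, and instead just reports where the parentheses go (using * and / for multiplication and division):

```python
import re

PREC = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}

def tokenize(s):
    return re.findall(r"\d+|[-+*/^()]", s)

def parse(tokens, min_prec=1):
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens)
        tokens.pop(0)  # drop the matching ")"
    else:
        left = tok
    while tokens and tokens[0] in PREC and PREC[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Left-associative: the right operand may only grab strictly
        # higher-precedence operators ("stronger magnets").
        right = parse(tokens, PREC[op] + 1)
        left = f"({left} {op} {right})"
    return left

print(parse(tokenize("1 + 2 * (3 + 4) ^ 2 / 14")))
# prints: (1 + ((2 * ((3 + 4) ^ 2)) / 14))
```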
To be honest I am not sure exactly what the takeaway should be here. I do think we are doing students a disservice by teaching them the misleading idea that the “order of operations” is about the order in which to perform the operations, and sadly it seems this idea is quite widespread. But I am not exactly sure what should be done about it. If you are a mathematics educator—either one who teaches students about the order of operations, or one who has witnessed the effects of their understanding on later subjects—I’d love to hear your responses and ideas!
We are trying to show that $S = n!$, in order to show that starting with a sequence of consecutive $n$th powers and repeatedly taking successive differences will always result in $n!$. Last time we also made a picture of a typical function from a set of size $n$ to a set of size $m$, which looks like this:
Here’s the plan of action from this point: figure out exactly what makes a function “bad”, count the bad functions, and then subtract that count from the total number of functions, being careful not to subtract anything twice.
What makes a function “bad”? Based on the discussion in my previous post, it seems that there are two things that can make a function “bad”, that is, prevent it from being a matching between the blue sets: an arrow might point to an orange dot, or two arrows might point to the same blue dot (that is, they “collide”).
The example function above has both these problems: one of its arrows points to an orange dot, and two of its arrows collide at the same blue dot.
From another point of view, however, there is really only one thing that makes a function bad, which encompasses both points above: a function is bad if and only if there is at least one “missing” blue dot, that is, a blue dot on the right-hand side which is not pointed to by an arrow. On the one hand, if there is a missing blue dot, then the function obviously can’t be a matching, since the missing blue dot is unmatched. On the other hand, if there are no missing blue dots, then each blue dot on the right must be matched with a unique dot on the left-hand side, since there are the same number of blue dots on both sides. (In fact, if a blue dot is missing, it will always be caused by one of the two problems listed above—there will either be an arrow pointing to an orange dot, or two arrows that collide.) Here is the same function again, but with the missing elements marked as “bad” by fading them out a bit:
Our goal is now to “subtract off” these bad functions, that is, the functions with at least one missing blue dot.
Let’s number the blue dots from $1$ to $n$ (with $1$ at the top), and define $M_i$ as the set of all functions from the set of size $n$ to the set of size $m$ where blue dot $i$ is missing ($M$ is for “Missing”).
So, for example, the function from before is an element of $M_i$, where $i$ is one of its missing blue dots:
How big is $M_i$? Well, if we just leave out the missing blue dot $i$ (and renumber the blue dots with numbers greater than $i$), we are left with a function from a set of size $n$ to a set of size $m-1$; conversely, every function from a set of size $n$ to a set of size $m-1$ can be made into a function from the set of size $n$ to the set of size $m$ with blue dot $i$ missing, just by inserting a new dot numbered $i$ (and again renumbering the dots with numbers $\geq i$).
So, since we can put the functions in $M_i$ in one-to-one correspondence with all the functions from a set of size $n$ to a set of size $m-1$, the size of $M_i$ is just the number of such functions, that is, $(m-1)^n$.
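This bijective argument is easy to confirm by brute force for small values (my check, not part of the original post):

```python
# Enumerate all functions from {0,...,n-1} to {0,...,m-1} and count those
# whose image misses a chosen element i; compare against (m-1)^n.
from itertools import product

def count_missing(n, m, i=0):
    return sum(i not in f for f in product(range(m), repeat=n))

n, m = 3, 5
print(count_missing(n, m), (m - 1) ** n)  # both 64
```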
At this point, however, we run into a snag: as you can see, the example function we’ve been using is also an element of $M_j$ for a second missing blue dot $j$. And this is why we can’t simply subtract each of the $|M_i|$ and call it a day: the sets $M_i$ overlap, so if we subtract all of them we would be subtracting functions multiple times.
Hmm, subsets that overlap, and we want to count the things that are in none of the subsets… this is starting to sound familiar! We need some PIE, of course. In my next post we’ll start putting some of the pieces together!
In particular we’re trying to show that the two sides of this equation correspond to two different ways to count the same set of mathematical objects. We already know that the right-hand side counts the number of permutations of a set of size $n$, or put another way, the number of one-to-one matchings between two different sets of size $n$. Somehow, then, we need to show that the left side of the equation is just a complicated way to count the same thing.
Let’s call the left-hand side of the equation $S$, that is,

$S = \displaystyle\sum_{k=0}^{n} (-1)^k \binom{n}{k} (m-k)^n.$
The first thing I want to do is switch the direction of the summation. That is, everywhere we currently have $k$ I want to replace it with $n-k$, and vice versa. So $k = 0$ will switch places with $k = n$, and $k = 1$ will switch with $k = n-1$, and so on. Ultimately we will still have the same values of $k$, just in the reverse order. Of course, the order in which we add up a bunch of things doesn’t matter (since addition is commutative), so this won’t change the value of $S$ at all.
Finally, note that $\binom{n}{n-k} = \binom{n}{k}$ (there are the same number of ways to choose $n-k$ things out of $n$ as there are to choose the $k$ things to exclude), so we can write

$S = \displaystyle\sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} (m-n+k)^n.$
When $k = n$, $S$ involves the term $m^n$. Let’s think about what that counts. I explained in a previous post that $m^n$ is the number of functions from a set of size $n$ to a set of size $m$ (as a reminder, this is because the function gets to independently choose where to map each input value—so there are $m$ choices for each of the $n$ inputs, for a total of $m^n$ choices). So let’s draw a typical function from a set of size $n$ to a set of size $m$:
For this example I chose particular small values of $n$ and $m$. The blue dots on the left represent the set of size $n$. The dots on the right represent the set of size $m$: I made $n$ of them blue to emphasize that they “correspond” to the blue dots on the left, and the remaining $m - n$ “extra” dots are orange.
The arrows indicate what the function does. To be a valid function, it has to be defined for each element of the input set; that is, in the picture, there has to be exactly one arrow coming out of each dot on the left. However, the arrows can go anywhere. Note that in the example above, one arrow points to an orange dot, and two other arrows point to the same blue dot (we could say they “collide”).
Some of these functions are actually matchings between the blue dots on the left and right, that is, every arrow points to a blue dot, and none of the arrows collide. For example:
We already know that the right-hand side of the equation, $n!$, is the number of such matchings. And we’re going to show that the left-hand side of the equation is what you get if you start with all the possible functions from the set of size $n$ to the set of size $m$, and then subtract the ones that aren’t matchings. Of course, these should give you the same result! If you like, between now and my next post, you can try to work out the details for yourself.
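If you’d rather let a computer work out the details, here is a brute-force sketch of mine:

```python
# Enumerate every function from a set of size n to a set of size m, where
# the first n elements of the codomain play the role of the "blue" dots,
# and count the functions that hit all n blue dots. The count is n!.
from itertools import product
from math import factorial

def count_matchings(n, m):
    blue = set(range(n))
    return sum(blue <= set(f) for f in product(range(m), repeat=n))

for n, m in [(1, 3), (2, 4), (3, 3), (3, 5)]:
    print(n, m, count_matchings(n, m), factorial(n))
```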
Here’s a nice Numberphile interview with Andrew Booker about the new discovery. They also talk about Hilbert’s tenth problem, undecidability, the reasons for doing computer searches like this, the role of science communication (such as Numberphile) in spurring discovery, and other things.
See my previous blog post for why this is interesting!