The chromatic number of the plane, part 1

About a week ago, Aubrey de Grey published a paper titled “The chromatic number of the plane is at least 5”, which is a really cool result. It’s been widely reported already, so I’m actually a bit late to the party. But in my usual way, I’d like to take a more leisurely approach to explaining the math in more detail. In this post, I’ll explain what the problem is; in upcoming posts I’ll explain what we know about the problem, and how de Grey’s paper fits in.

Graph colorings and chromatic numbers

Suppose we have a graph G with vertices connected by (bidirectional) edges. One natural task is to assign a color to each vertex of G in such a way that for every edge, the two vertices at the ends of the edge have different colors. Put another way, no two vertices connected by an edge should have the same color. This is called a (vertex) coloring of G.

For example, here is a graph with a valid coloring (using four colors):

If you look at any edge in this graph, you can see that the two vertices at the ends of the edge have different colors.

As an example of why we might care about this, suppose we have a graph where the vertices represent people, and two people are connected by an edge if they are enemies. Then a vertex coloring corresponds to a way to assign everyone to a team (one team for each color) such that enemies are never on the same team! Here’s another example: suppose vertices represent courses, and there is an edge between two courses if there is at least one student who is signed up for both courses. Then a vertex coloring is a way to schedule courses—each “color” represents a different time slot—such that no student has a conflict. Or if vertices represent countries, and there is an edge between two vertices whenever those countries share a border, a coloring of the graph is a way to literally color a map so bordering countries never have the same color. (Famously, for any (non-weird) map you can do this using only four colors; this is something I’d like to write more about someday.) These examples are just the tip of the iceberg: this is a natural problem that comes up all the time in practice.

Given a graph G, coming up with a vertex coloring is trivial: just make every vertex a different color! But this is silly; typically, it is better to use fewer colors, so although giving every vertex a different color results in a valid vertex coloring, it is not a very good one. The natural question to ask is: given a graph G, what is the smallest possible number of colors needed to construct a vertex coloring? This smallest possible number is called the chromatic number of G (and is often written \chi(G)).

For example, consider the following three graphs:

The graph on the left has chromatic number 2: we can’t use only one color since then the edges would have vertices of the same color on both ends, but we can get away with only two colors since there’s nothing to prevent the vertices on the ends from having the same color as each other, as long as they have a different color than the middle vertex. On the other hand, the graph in the middle has chromatic number 3: since each vertex is adjacent to both other vertices, we actually need to give every vertex its own distinct color.

The graph on the right is the same as the example graph I showed before, with a coloring using four vertices. So is the chromatic number four? Not so fast! At this point we only know the chromatic number is at most four. I will leave you the following exercises:

  1. Explain why it is not possible to color the vertices of this graph using only two colors.
  2. Either show a way to color the vertices using only three colors, or explain why it is not possible.

The chromatic number of the plane

It’s a bit of a stretch, but we can generalize this to ask about the “chromatic number” not of a graph, but of the plane. In particular, we want to assign a color to every point on the plane such that any two points which are exactly 1 unit apart have different colors. Notice that the plane contains infinitely many points (uncountably many, in fact), so we certainly can’t hope to color them by simply listing all the points and giving a color to each! In essence, we are looking for a function c : \mathbb{R}^2 \to \text{Color} such that for any points p_1 = (x_1, y_1) and p_2 = (x_2, y_2), if the distance between p_1 and p_2 (computed using the distance formula, aka Pythagorean Theorem) is exactly 1, then c(p_1) \neq c(p_2).

You can think of a function \mathbb{R}^2 \to \text{Color} as an (infinite) “picture” or “painting”. Imagine looking at a painting and holding up a 1-inch ruler next to it. No matter where you place the ruler and no matter which way you turn it, the two points at opposite ends of the ruler always have different colors! What could such a painting look like? And how many colors does it need?

The minimum number of colors needed is known as the chromatic number of the plane, abbreviated CNP. The problem of determining CNP (known as the Hadwiger-Nelson problem) was first posed around 1950, and we have known for a long time that the answer is somewhere between 4 and 7, inclusive. (I will explain how we know this in later posts!) De Grey’s contribution was to show that the answer can’t be 4; hence we now know that the answer has to be 5, 6, or 7—but we still don’t know which!

Posted in geometry, proof | Tagged , , , , | 1 Comment

More on sums of palindromes

In my previous post I reported on a recent proof that every positive integer can be written as the sum of three palindromes. The first thing to report in this follow-up post is that Lewis Baxter sent me the Python code which implements their algorithm—and even better, pointed me to a web page which will run the algorithm for you! Go ahead and try it, it’s fun—and it’s quite fast, even for really huge numbers. For example, just for fun I put in my office phone number, and it turns out that

5014501377 = 4010220104 + 963303369 + 40977904.

Good to know!

Commenter Anthony asked if the proof worked for numbers in any base, which was an important point I forgot to mention: what the paper actually proved is that every positive integer can be written as the sum of three palindromes in any base b \geq 5.

The theorem is not true for numbers in base 2. Notice first that any binary number starts with 1, so any binary palindrome must end with 1 as well, that is, it must be odd—except 0, which we define to be a palindrome as well. But a sum of three odd numbers is odd, so for the theorem to true, every even number would have to be the sum of two palindromes in base 2 (plus zero to make three palindromes). But the even binary number 10110000 is a counterexample, since it can be checked (e.g. by writing a computer program to try all possibilities) that it is not the sum of two binary palindromes. The following code does exactly that:

ghci> :set -XBinaryLiterals
ghci> import Text.Printf
ghci> let n = 0b10110000
ghci> let isBinPal x = printf "%b" x == reverse (printf "%b" x :: String)
ghci> [ k | k <- [1..n], isBinPal k, isBinPal (n-k) ]

The [] on the last line is an empty list indicating that there are no binary palindromes which add to 10110000.

This leaves some questions: what about bases 3 and 4? And how many palindromes do you need in base 2? At the time the paper was published, the answers to these questions were not known. It was conjectured that four palindromes are enough in base 2. It was also conjectured that three palindromes are enough for bases 3 and 4—just like in every other base \geq 5; the proof just didn’t cover those cases.

In his email, Lewis Baxter also pointed me to an even more recent paper by Aayush Rajasekaran, Jeffrey Shallit, and Tim Smith which uses a completely different method to resolve the remaining questions: it turns out that the conjectures were correct; we now know that every positive integer can indeed be written as a sum of four binary palindromes, and only three palindromes in base 3 or base 4! Jeffrey Shallit has a nice blog post explaining their method here. It’s not clear to me whether their method can be used to produce an actual algorithm for finding base-2, -3, or -4 palindromes that sum to a given number, though it certainly should be possible.

Posted in arithmetic, computation, links | Tagged , , , | 7 Comments

Every positive integer is a sum of three palindromes

I recently learned from John Cook about a new paper by Javier Cilleruelo, Florian Luca, and Lewis Baxter proving that every positive integer can be written as a sum of three palindromes. A palindrome is a number that is the same when written forward or backward, like 1423241.

The really cool thing is that the proof essentially consists of a (very long, very detailed) algorithm, which, for any positive integer n, explicitly finds three palindromes which sum to n! For example, when n = 314159265358979323846 (the first 21 decimal digits of \pi), their algorithm gives the three palindromes

\begin{array}{rcl} p_1 &=& 210100100111001001012 \\ p_2 &=& 98639929400492993689 \\ p_3 &=& 5419235847485329145 \end{array}

You can easily verify (try it!) that these are all palindromes, and that they add up to 314159265358979323846.

ghci> p1 = 210100100111001001012
ghci> p2 = 98639929400492993689
ghci> p3 = 5419235847485329145
ghci> isPalindrome x = show x == reverse (show x)
ghci> isPalindrome p1 && isPalindrome p2 && isPalindrome p3

ghci> p1 + p2 + p3

The reason this is so cool is that in many cases, the proof of a theorem of the form “For every integer n, there exists a thing” might be nonconstructive—that is, it shows that for every n it can’t be the case that there doesn’t exist a thing, which “proves” the theorem but doesn’t actually help us find the things which we suposedly know must exist. But this proof is very much constructive: not only does it show that every positive integer must be expressible as a sum of three palindromes, it actually backs it up with explicit evidence by giving us a way to find the palindromes.

Apparently the authors have written a Python program which implements the algorithm, and I have asked for a copy; I’m excited to play with it! I’d implement it myself, except that the algorithm has so many cases and sub-cases and sub-sub-cases and special adjustments to the sub-sub-cases, etc. that it would probably take me a week.

Posted in arithmetic, computation, links | Tagged , , , , , | 12 Comments

Why drawing orthogons is hard

We’re nearing the end of this little diversion on orthogons. We now know that orthogons are in 1-1 correspondence with orthobraces, and we can efficiently generate orthobraces. The only thing left is to find a way to turn orthobraces into drawings.

I described a way to turn orthobraces into drawings in a previous post, but the drawings it produced were terrible—the edges get smaller and smaller as the algorithm proceeds. For example, given the string XXXXXXVVXV, the algorithm produced this drawing:

But we’d really like to be able to produce a drawing like this instead:

What kind of drawings do we want in general? Here are my criteria:

  1. No two edges should intersect or touch (other than consecutive edges touching at a vertex).
  2. The length of each edge should be a positive integer.
  3. The sum of all the edge lengths should be as small as possible.

These criteria seem reasonable, to me at least, though one can easily imagine making other choices. For example, some might wish to consider polygons with edges that intersect at a vertex, such as the one shown below.

We could also imagine minimizing the area, rather than the perimeter, of the polygon.1 In any case, the idea is that a polygon with these properties should be as small as possible while remaining non-intersecting and with all edges having a common measurement. (The edges all having integer lengths also means that we will be able to create these drawings by gluing together squares.) Let’s call orthogon drawings with these properties good.

Given these criteria, we can ask questions such as the following:

  1. Given an orthobrace, does a good drawing of it always exist?
  2. Assuming good drawing(s) do exist, can we come up with an algorithm to generate them?

It is easy to show that (1) is true: first, use the naive orthogon drawing algorithm to produce a lopsided drawing with tiny edges. Now, if the smallest edge has length 1/2^k, scale the whole polygon by a factor of 2^k. This results in a polygon with no self-intersections and all integer edges. This proves that at least one such polygon exists; among all such polygons there must be at least one with minimal perimeter.

However, this does not actually give us an algorithm for producing good drawings! That last step of the proof is the problem: just because there must exist some drawing with minimal perimeter doesn’t help us actually find one. In the end, coming up with an algorithm to produce good drawings turns out to be surprisingly tricky, and illustrates a very important principle in mathematics and computer science. Faced with some large problem/task/proof/computation, what we would ideally like to be able to do is to split it into subproblems/subtasks/lemmas/helper functions, then solve/complete/prove/compute each of those subthings independently, and somehow combine the results to come up with an overall result. For example, to compute the product of two matrices, we just have to break it down into a bunch of dot products between two vectors, doing that for each combination of a row and a column, and then put all the results together into a big result matrix. Working with smaller independent pieces and then combining results is a very powerful technique, since it lets us solve problems, do computations, prove theorems, etc. which are much larger than what we would be able to accomplish by attacking them directly.

However, it seems like this approach fails for drawing orthogons! We might imagine something like this: take the orthobrace and break it up into a bunch of substrings; come up with some kind of drawing for each one; then combine the drawings into a drawing for the whole thing. Unfortunately, this does not work, since the length of an edge can be influenced by other features of the orthogon which are far away. For example, look at this polygon:

The top edge has to be very long in order for the polygon to not intersect itself. How do we know how long it has to be? We cannot infer it just by looking at neighboring edges. It is not even due solely to some particular edges on the other side of the polygon. It is in some sense a “global” property of the whole polygon: the way in which all the edges and vertices interact produces the need for the top edge to be long.

Intuitively, it is this “global” nature of deciding on good edge lengths which makes this problem so difficult. In order to decide how long to make each edge, we have to look at all the vertices and edges… and the length we choose for one edge may influence the length of some other edges, which may in turn force us to choose a different length for the original edge! How can we ever make any progress? And even worse, how can we ever hope to know whether some choices we have made lead to a polygon with minimal perimeter?

In my next (and final!) post on orthogons, I will finally explain how we can leverage some amazingly sophisticated software to produce good orthogon drawings!

  1. Can you come up with an example where the minimum-area and minimum-perimeter solutions are not the same?

Posted in computation, geometry | Tagged , , , , , , | 2 Comments

Efficiently listing orthobraces

In my last couple posts, we talked about a simple yet inefficient method for listing all orthobraces of a particular size. So how do we generate them efficiently? It turns out that it can be done: in 2011, Karim, Sawada, Alamgir, and Husnine published a paper entitled Generating Bracelets with Fixed Content which gives us exactly what we need to generate orthobraces efficiently. Unfortunately, the algorithm is rather complicated, and the proof that it works is more complicated still! (Perhaps I will write about it someday, but that would be another many-post series in and of itself!) But I implemented it here (in Haskell), and used it to generate the pictures in my original post.

In one sense, I don’t know why I’m writing about this: it’s more complicated than I’m willing to explain at the moment, and actually it makes little practical difference anyway, because the naive code from my previous post can generate, say, all 3260 orthobraces of length 20 in 1.3 seconds—plenty fast enough for practical purposes, and we can’t reasonably expect to draw or look at more than 3000 of these things. I just think it’s cool that a simple question like this turns out to have a very nontrivial answer, and someone has answered it—but only quite recently! (Incidentally, for comparison, the efficient code generates all 3260 orthobraces of length 20 basically instantaneously, in 0.03 seconds.)

In an upcoming post or two, I’ll explain the remaining missing piece: how do we go from an orthobrace to a picture of an orthogon?

Posted in combinatorics, computation | Tagged , , , , , , , , , , , | 1 Comment

Haskell code to naively list orthobraces

Let’s see some simple Haskell code to generate orthobraces, by generating all sequences and throwing away ones we’ve already generated.

First, some library imports we’ll need.

> import Data.List
> import qualified Data.Set as S

Here’s a function to generate all sequences with n+4 Xs and n Vs.

> -- @orthoseqs n@ generates all length 2n+4 *sequences* with n+4 X's and
> -- n V's.
> orthoseqs :: Int -> [[Char]]
> orthoseqs n = go (n+4) n

orthoseqs works by calling a recursive helper function go, which keeps track of how many Xs and how many Vs still need to be generated.

>   where
>     go 0 v = [replicate v 'V']
>     go x 0 = [replicate x 'X']

If the number of Xs or Vs remaining is 0, we just output the right number of copies of the other letter.

>     go x v = map ('X':) (go (x-1) v) ++ map ('V':) (go x (v-1))

Otherwise, we recursively generate all strings with x-1 X’s and v V’s, and put an X on the front of each; and similarly for strings with a V at the beginning.

Let’s try it:

ghci> orthoseqs 0

ghci> orthoseqs 1

ghci> length (orthoseqs 2)

ghci> length (orthoseqs 3)

This looks right and seems to match up with the numbers we computed last time.

Now, to throw out sequences which are equivalent up to rotation and/or reversal, the easiest way is to write a function canonicalize which computes a canonical representative from each equivalence class: in particular, given a sequence, it computes the lexicographically smallest (that’s a fancy way of saying “comes first in dictionary order”) sequence which is equivalent to it.

> canonicalize :: Ord a => [a] -> [a]
> canonicalize as = minimum (rots as ++ rots (reverse as))

canonicalize just finds the minimum sequence among all rotations of as and rotations of reverse as.

>   where
>     rots xs = map (take n) . take n . tails $ xs ++ xs
>       where
>         n = length xs

To generate all rotations, we do a little list manipulation: xs ++ xs puts two copies of xs together end to end; tails produces something like this:

ghci> xs = [1,4,5]
ghci> tails (xs ++ xs)

Then we take just the first n results of tails and take only the first n elements of each, like so:

ghci> map (take 3) . take 3 . tails $ xs ++ xs

Now we can write orthobraces fairly simply: generate all sequences, canonicalize each (resulting in a bunch of duplicates), then put them into a set (which gets rid of duplicates) and list the elements of the set.

> -- @orthobraces n@ generates all length 2n+4 *bracelets* with n+4 X's
> -- and n V's.  That is, one representative is generated for each
> -- bracelet, up to reflection and rotation.
> orthobraces :: Int -> [[Char]]
> orthobraces = S.toList . S.fromList . map canonicalize . orthoseqs

Let’s try it:

ghci> orthobraces 0

ghci> orthobraces 1

ghci> orthobraces 2

ghci> orthobraces 3

ghci> map (length . orthobraces) [0..6]

It seems to work! As we already discussed last time, this is not a particularly efficient way to generate orthobraces, since we end up spending most of our time throwing away duplicates (and what’s worse, even the proportion of time spent throwing away duplicates increases as n increases). But it’s rather simple to write and test, and can serve as a good reference implementation to compare other implementations against.

Posted in combinatorics, computation | Tagged , , , , , , , , | 1 Comment

Listing orthobraces

I took a bit of a break to finish writing a paper for submission to the International Conference on Functional Programming—which I should write about here! All in good time. (I tend to accumulate things to write about faster than I can write them… a blessing and a curse!) For now, back to your irregularly scheduled orthogons.

Recall from my previous post that a bracelet is a string considered unique only up to rotation and reflection. We also gave the name orthobrace to a bracelet consisting of X’s and V’s, with exactly four more X’s than V’s, and we saw that they are in one-to-one correspondence with orthogons. If we can find a way to list all orthobraces of a certain size (and if we can find a reasonable way to turn them into drawings!) we can make a list of all the orthogons of a certain size.

In my previous post I left you with a few questions.

  1. How many sequences are there with n+4 X’s and n V’s?

    (Here I really do mean just sequence, not bracelet, so e.g. XXXV and XXVX are the same bracelet but different sequences.)

    If we have four X’s and no V’s then there is of course only one. With five X’s and one V, we have a choice of five different places where we could put the V: before the first X, after the first X, after the second X, … and so on. With six X’s and two V’s, we get to choose any two out of six possible places to put the V’s.

    In general, the number of sequences with n+4 X’s and n V’s is the number of ways to choose n places out of the total 2n+4 to put the V’s. We can make a table:

    \begin{array}{rcl}\displaystyle\binom 4 0 &=& 1 \\[1em] \displaystyle\binom{6}{1} &=& 6 \\[1em] \displaystyle\binom{8}{2} &=& 28 \\[1em] \displaystyle\binom{10}{3} &=& 120 \\[1em] \displaystyle\binom{12}{4} &=& 495 \\[1em] \displaystyle\binom{14}{5} &=& 2002 \\[1em] \displaystyle\binom{16}{6} &=& 8008 \end{array}

    As you can see, the number of sequences grows quite quickly; in fact (though I won’t prove this) it grows exponentially relative to n.

  2. What about orthobraces, where we consider sequences equivalent up to rotation and reversal, and just want one representative from each equivalence class? If you try listing them and counting how many there are of each size, you get a sequence of numbers like

    1, 1, 4, 8, 29, 79, 280 \dots

    which does not grow quite so quickly (though it turns out it still grows exponentially).

  3. Gesh made a nice argument in a comment on my previous post which gives us an idea of how quickly we should expect this to grow relative to how many sequences there are. In particular, we expect that every sequence of length 2n+4 will have roughly 4n+8 different sequences which it is equivalent to—2n+4 rotations plus all their reversals. This is only a rough estimate, since if the sequence has any sort of symmetry, some of the rotations and reflections may be identical. But symmetry is special—an average random string will not be symmetric, and hence most of the time all the rotations and reflections will be distinct. So we would expect there to be about 4n+8 times as many sequences as bracelets—a little less to account for some of the rotations and reflections not always being distinct. And if we take ratios, here’s what we get:

    \begin{array}{rcl|c} & \text{ratio} & & 4n+8 \\ 1/1 &=& 1 & 8 \\ 6/1 &=& 6 & 12 \\ 28/4 &=& 7 & 16\\ 120/8 &=& 15 & 20 \\ 495/29 &\approx& 17 & 24 \\ 2002/79 &\approx& 25 & 28 \\ 8008/280 &\approx& 29 & 32 \end{array}

    On the left hand side of the table is the ratio of the number of sequences of size 2n+4 to the number of orthobraces; on the right hand side is 4n+8. As you can see, the ratios do indeed seem to be approaching 4n+8 from below.

  4. So, what about generating all orthobraces of a given size by listing all sequences and throwing away any sequences which are equivalent to one we’ve already generated? This can be done, but as we can see from the above analysis, we would spend most of our time throwing out sequences which are equivalent to ones we’ve already generated.

So generating orthobraces this way isn’t actually such a great idea, although it does have the benefit of being quite simple conceptually. In another post I’ll show some simple code to carry out this “naive” algorithm, and then talk about an alternative!

Posted in combinatorics, computation | Tagged , , , , , | 2 Comments