The chromatic number of the plane, part 3: a new lower bound

In my previous post I explained how we know that the chromatic number of the plane is at least 4. If we can construct a unit distance graph (a graph whose edges all have length 1) which needs at least c colors, than we know the plane also needs at least c colors. We were able to construct a unit distance graph that needs at least 4 colors (the Moser spindle), and hence the chromatic number of the plane is at least 4.

It turns out that de Grey’s new proof follows exactly this same pattern: he constructs a unit distance graph which needs at least 5 colors. So why hasn’t anyone else done this before now? The Moser spindle (a unit distance graph with chromatic number 4) has been around since 1961; why did it take almost 60 years for someone to come up with a unit distance graph with chromatic number 5?

The reason is that the graph is very big! de Grey’s graph has 1581 vertices. Although it is itself constructed out of copies of smaller pieces, which are themselves constructed out of pieces, etc.—so it’s not quite as hard to analyze as any random 1581-vertex graph—proving that it requires 5 colors still required a computer. Even the process of finding the graph in the first place required a computer search. So that’s why no one had done it in the last 60 years.

A lot of people have been working on improving and extending this result. The current record-holder for smallest unit-distance graph with chromatic number 5 has 610 vertices, and was found by Marijn Heule using a computer search. Here’s a picture of it, provided by Marijn:

A 610-vertex unit-distance graph with chromatic number 5
A 610-vertex unit-distance graph with chromatic number 5

It’s unknown whether we will be able to find a proof that can be constructed and understood by humans without the help of a computer. It’s also unknown whether these techniques and the associated technology will be able to extend far enough to find a unit distance graph with chromatic number 6 (if such a thing exists—though it seems many mathematicians believe it should).

Posted in geometry, proof | Tagged , , , , , | 1 Comment

The chromatic number of the plane, part 2: lower bounds

In a previous post I explained the Hadwiger-Nelson problem—to determine the chromatic number of the plane—and I claimed that we now know the answer is either 5, 6, or 7. In the following few posts I want to explain how we know this! The proofs are not too hard to understand (with the possible exception of Aubrey de Grey’s new proof, but I’ll get to that later) and they are really cool.

In this post I’ll talk about lower bounds. That is, how do we know that the chromatic number of the plane (CNP) is at least 5?

CNP > 1

Let’s start with a smaller number. Can we argue that the CNP is at least 2? Sure, this is easy: if the entire plane is colored with only one color, then obviously any two points separated by a distance of 1 unit have the same color. So one color is not enough, and hence CNP \geq 2.

CNP > 2

Is two colors enough? If you play around with it a bit, most likely you will quickly get an intuition that this won’t work either. Here’s an intuitive argument why we need at least three colors. Suppose some region of the plane is colored, say, blue.

(It doesn’t have to be a circle, but let’s just think about a circular-ish region for simplicity’s sake.) The diameter of the region has to be smaller than 1, because otherwise there would be two points within the region that are 1 unit apart. But that means the entire area outside the region has to be, say, red.

But if the diameter of the blue region is less than 1 and the area outside the region is all red, then intuitively, we can find two red points which are a distance of 1 apart, as illustrated in the above picture.

This isn’t really a formal argument at all (what about really weirdly-shaped regions)? But we can give a different argument which is completely formal yet still quite simple. Consider the graph consisting of three vertices connected in a triangle.

As mentioned in my previous post, the chromatic number of this graph is 3—each vertex needs its own color since each vertex is adjacent to both other vertices. But this actually tells us something about the plane, since this graph can be drawn as an equilateral triangle with all its edges having length 1. If we pick any three points in the plane arranged in a unit equilateral triangle, they must all have different colors; hence at least three colors are needed to color the plane.

CNP > 3

Generalizing from the previous proof, we can see a general plan of attack to prove that three colors are not enough to color the plane. We want to find some graph G such that

  • the chromatic number of G is 4, and
  • G is a unit-distance graph, that is, it can be drawn in such a way that every edge has length 1.

If we can find such a graph, it immediately shows that we need at least four colors for the plane, since we can pick a set of points in the plane in exactly the same configuration as the vertices of the graph; since the vertices of the graph need four colors, so do the chosen points of the plane.

Well, such a graph G exists! In fact, there are many such graphs, but the simplest one is called the Moser spindle. (It’s named after William and Leo Moser, who published it in 1961; I have no idea why it is called a “spindle”.) Here it is:

As you can see, it’s made of four equilateral triangles, with two pairs glued together along an edge, a shared vertex at the top, and an extra edge of length exactly 1 connecting the bottom vertices. You can check for yourself that this is a unit-distance graph. But what is its chromatic number? Well, suppose we try coloring it with only three colors. Suppose the top vertex is red. That vertex is connected to two equilateral triangles; of course the other two vertices of each triangle must have the other two colors (say, green and blue). Then the two bottom vertices must again be colored red, since each is part of an equilateral triangle with one green and one blue vertex. But this isn’t allowed: the two bottom vertices are connected by an edge, so they can’t be the same color.

So, three colors aren’t enough to color the Moser spindle, and hence three colors aren’t enough to color the plane either!

CNP > 4?

So the above proofs are not that hard to grasp, and we have known that CNP > 3 since 1961 or so. So how did Aubrey de Grey prove that CNP > 4, and why did it take almost 60 years to go from 3 to 4? I’ll explain that in my next post!

Posted in geometry, proof | Tagged , , , , , | 3 Comments

Iterating squared digit sum

Another fun fact I learned from John Cook. Let s(n) be the function which takes a positive integer and outputs the sum of the squares of its digits. For example, s(149) = 1^2 + 4^2 + 9^2 = 1 + 16 + 81 = 98. Since the output is itself another positive integer, we can feed it back into the function again: s(98) = 9^2 + 8^2 = 81 + 64 = 145. Let’s see what happens when we continue iterating in this fashion:

149 \to 98 \to 145 \to 42 \to 20 \to 4 \to 16 \to 37 \to 58 \to 89 \to 145 \to \dots

In this case, we hit 145 again, after which point the values will start to repeat. So in this case, iterating s enters a loop of length 8, where it will get stuck forever.

Let’s try another starting value, say, 496.

496 \to 133 \to 19 \to 82 \to 68 \to 100 \to 1 \to 1 \to \dots

In this case we hit 1, which is a fixed point of s, that is, s(1) = 1^2 = 1, so we simply get stuck on 1 forever.

The amazing thing is that these are the only two things that can happen—for any positive integer, iterating s will eventually either reach 1, or enter the loop of eight numbers (145,42,20,4,16,37,58,89). Arthur Porges proved this in his article, A Set of Eight Numbers, published in 1945 in The American Mathematical Monthly (Porges 1945).

Let’s see a proof, inspired by Porges’s approach. First, suppose n \geq 1000, and let d be the number of digits in n, so 10^{d-1} \leq n. How big could s(n) be? Since each digit contributes to the sum independently, and the digit with the biggest square is 9, s(n) attains its maximum possible value when n consists of all 9’s, in which case s(n) = 9^2 + 9^2 + \dots + 9^2 = 81d.

I claim that 81d < 10^{d-1} when d \geq 4. We can directly check that this is true when d = 4, since 81 \cdot 4 = 324 < 10^3 = 1000. And every time we increase d by 1, the left-hand side increases by 81, but the right-hand side is multiplied by 10. At that rate the left-hand side can never hope to catch up!

Putting this together, we see that when n has four or more digits, s(n) \leq 81d < 10^{d-1} \leq n, which by transitivity means s(n) < n. In other words, if n has four or more digits, s necessarily makes it smaller. So if we start with some huge 500-digit number and start iterating s, the results will get smaller and smaller until we finally get to a number with fewer than four digits. (A fun exercise for you: if we start with a 500-digit number, what is the maximum number of iterations of s we need to reach a number with fewer than four digits?)

So what happens then? Actually, the bulk of Porges’s article is taken up with analyzing the situation for numbers of three or fewer digits. He proves various lemmas which cleverly reduce the number of cases that actually need to be checked by hand. But you see, in 1945 he didn’t have any computers to help! 1945 is right around the time when the first electronic, digital computers were being developed; it would be at least another ten or twenty years before any would have been generally available for a mathematician to use in checking a conjecture like this.

In 2018, on the other hand, it’s much faster to just write a program to test all the numbers with three or fewer digits than it is to follow Porges’s arguments. Here’s some Haskell code which does just that:

loop :: [Integer]
loop = [145, 42, 20, 4, 16, 37, 58, 89]

s :: Integer -> Integer
s = sum . map (^2) . map (read . (:[])) . show

ok :: Integer -> Bool
ok = any (\m -> m == 1 || m `elem` loop) . iterate s

The ok function checks whether iterating s ever produces either the number 1 or some number in the loop. Now we just have to run ok on all the numbers we want to check:

>>> all ok [1..1000]

And voila!


Porges, Arthur. 1945. “A Set of Eight Numbers.” The American Mathematical Monthly 52 (7). Mathematical Association of America:379–82.

Posted in arithmetic, computation, proof | Tagged , , , , , | 12 Comments

The chromatic number of the plane, part 1

About a week ago, Aubrey de Grey published a paper titled “The chromatic number of the plane is at least 5”, which is a really cool result. It’s been widely reported already, so I’m actually a bit late to the party. But in my usual way, I’d like to take a more leisurely approach to explaining the math in more detail. In this post, I’ll explain what the problem is; in upcoming posts I’ll explain what we know about the problem, and how de Grey’s paper fits in.

Graph colorings and chromatic numbers

Suppose we have a graph G with vertices connected by (bidirectional) edges. One natural task is to assign a color to each vertex of G in such a way that for every edge, the two vertices at the ends of the edge have different colors. Put another way, no two vertices connected by an edge should have the same color. This is called a (vertex) coloring of G.

For example, here is a graph with a valid coloring (using four colors):

If you look at any edge in this graph, you can see that the two vertices at the ends of the edge have different colors.

As an example of why we might care about this, suppose we have a graph where the vertices represent people, and two people are connected by an edge if they are enemies. Then a vertex coloring corresponds to a way to assign everyone to a team (one team for each color) such that enemies are never on the same team! Here’s another example: suppose vertices represent courses, and there is an edge between two courses if there is at least one student who is signed up for both courses. Then a vertex coloring is a way to schedule courses—each “color” represents a different time slot—such that no student has a conflict. Or if vertices represent countries, and there is an edge between two vertices whenever those countries share a border, a coloring of the graph is a way to literally color a map so bordering countries never have the same color. (Famously, for any (non-weird) map you can do this using only four colors; this is something I’d like to write more about someday.) These examples are just the tip of the iceberg: this is a natural problem that comes up all the time in practice.

Given a graph G, coming up with a vertex coloring is trivial: just make every vertex a different color! But this is silly; typically, it is better to use fewer colors, so although giving every vertex a different color results in a valid vertex coloring, it is not a very good one. The natural question to ask is: given a graph G, what is the smallest possible number of colors needed to construct a vertex coloring? This smallest possible number is called the chromatic number of G (and is often written \chi(G)).

For example, consider the following three graphs:

The graph on the left has chromatic number 2: we can’t use only one color since then the edges would have vertices of the same color on both ends, but we can get away with only two colors since there’s nothing to prevent the vertices on the ends from having the same color as each other, as long as they have a different color than the middle vertex. On the other hand, the graph in the middle has chromatic number 3: since each vertex is adjacent to both other vertices, we actually need to give every vertex its own distinct color.

The graph on the right is the same as the example graph I showed before, with a coloring using four vertices. So is the chromatic number four? Not so fast! At this point we only know the chromatic number is at most four. I will leave you the following exercises:

  1. Explain why it is not possible to color the vertices of this graph using only two colors.
  2. Either show a way to color the vertices using only three colors, or explain why it is not possible.

The chromatic number of the plane

It’s a bit of a stretch, but we can generalize this to ask about the “chromatic number” not of a graph, but of the plane. In particular, we want to assign a color to every point on the plane such that any two points which are exactly 1 unit apart have different colors. Notice that the plane contains infinitely many points (uncountably many, in fact), so we certainly can’t hope to color them by simply listing all the points and giving a color to each! In essence, we are looking for a function c : \mathbb{R}^2 \to \text{Color} such that for any points p_1 = (x_1, y_1) and p_2 = (x_2, y_2), if the distance between p_1 and p_2 (computed using the distance formula, aka Pythagorean Theorem) is exactly 1, then c(p_1) \neq c(p_2).

You can think of a function \mathbb{R}^2 \to \text{Color} as an (infinite) “picture” or “painting”. Imagine looking at a painting and holding up a 1-inch ruler next to it. No matter where you place the ruler and no matter which way you turn it, the two points at opposite ends of the ruler always have different colors! What could such a painting look like? And how many colors does it need?

The minimum number of colors needed is known as the chromatic number of the plane, abbreviated CNP. The problem of determining CNP (known as the Hadwiger-Nelson problem) was first posed around 1950, and we have known for a long time that the answer is somewhere between 4 and 7, inclusive. (I will explain how we know this in later posts!) De Grey’s contribution was to show that the answer can’t be 4; hence we now know that the answer has to be 5, 6, or 7—but we still don’t know which!

Posted in geometry, proof | Tagged , , , , | 4 Comments

More on sums of palindromes

In my previous post I reported on a recent proof that every positive integer can be written as the sum of three palindromes. The first thing to report in this follow-up post is that Lewis Baxter sent me the Python code which implements their algorithm—and even better, pointed me to a web page which will run the algorithm for you! Go ahead and try it, it’s fun—and it’s quite fast, even for really huge numbers. For example, just for fun I put in my office phone number, and it turns out that

5014501377 = 4010220104 + 963303369 + 40977904.

Good to know!

Commenter Anthony asked if the proof worked for numbers in any base, which was an important point I forgot to mention: what the paper actually proved is that every positive integer can be written as the sum of three palindromes in any base b \geq 5.

The theorem is not true for numbers in base 2. Notice first that any binary number starts with 1, so any binary palindrome must end with 1 as well, that is, it must be odd—except 0, which we define to be a palindrome as well. But a sum of three odd numbers is odd, so for the theorem to true, every even number would have to be the sum of two palindromes in base 2 (plus zero to make three palindromes). But the even binary number 10110000 is a counterexample, since it can be checked (e.g. by writing a computer program to try all possibilities) that it is not the sum of two binary palindromes. The following code does exactly that:

ghci> :set -XBinaryLiterals
ghci> import Text.Printf
ghci> let n = 0b10110000
ghci> let isBinPal x = printf "%b" x == reverse (printf "%b" x :: String)
ghci> [ k | k <- [1..n], isBinPal k, isBinPal (n-k) ]

The [] on the last line is an empty list indicating that there are no binary palindromes which add to 10110000.

This leaves some questions: what about bases 3 and 4? And how many palindromes do you need in base 2? At the time the paper was published, the answers to these questions were not known. It was conjectured that four palindromes are enough in base 2. It was also conjectured that three palindromes are enough for bases 3 and 4—just like in every other base \geq 5; the proof just didn’t cover those cases.

In his email, Lewis Baxter also pointed me to an even more recent paper by Aayush Rajasekaran, Jeffrey Shallit, and Tim Smith which uses a completely different method to resolve the remaining questions: it turns out that the conjectures were correct; we now know that every positive integer can indeed be written as a sum of four binary palindromes, and only three palindromes in base 3 or base 4! Jeffrey Shallit has a nice blog post explaining their method here. It’s not clear to me whether their method can be used to produce an actual algorithm for finding base-2, -3, or -4 palindromes that sum to a given number, though it certainly should be possible.

Posted in arithmetic, computation, links | Tagged , , , | 7 Comments

Every positive integer is a sum of three palindromes

I recently learned from John Cook about a new paper by Javier Cilleruelo, Florian Luca, and Lewis Baxter proving that every positive integer can be written as a sum of three palindromes. A palindrome is a number that is the same when written forward or backward, like 1423241.

The really cool thing is that the proof essentially consists of a (very long, very detailed) algorithm, which, for any positive integer n, explicitly finds three palindromes which sum to n! For example, when n = 314159265358979323846 (the first 21 decimal digits of \pi), their algorithm gives the three palindromes

\begin{array}{rcl} p_1 &=& 210100100111001001012 \\ p_2 &=& 98639929400492993689 \\ p_3 &=& 5419235847485329145 \end{array}

You can easily verify (try it!) that these are all palindromes, and that they add up to 314159265358979323846.

ghci> p1 = 210100100111001001012
ghci> p2 = 98639929400492993689
ghci> p3 = 5419235847485329145
ghci> isPalindrome x = show x == reverse (show x)
ghci> isPalindrome p1 && isPalindrome p2 && isPalindrome p3

ghci> p1 + p2 + p3

The reason this is so cool is that in many cases, the proof of a theorem of the form “For every integer n, there exists a thing” might be nonconstructive—that is, it shows that for every n it can’t be the case that there doesn’t exist a thing, which “proves” the theorem but doesn’t actually help us find the things which we suposedly know must exist. But this proof is very much constructive: not only does it show that every positive integer must be expressible as a sum of three palindromes, it actually backs it up with explicit evidence by giving us a way to find the palindromes.

Apparently the authors have written a Python program which implements the algorithm, and I have asked for a copy; I’m excited to play with it! I’d implement it myself, except that the algorithm has so many cases and sub-cases and sub-sub-cases and special adjustments to the sub-sub-cases, etc. that it would probably take me a week.

Posted in arithmetic, computation, links | Tagged , , , , , | 14 Comments

Why drawing orthogons is hard

We’re nearing the end of this little diversion on orthogons. We now know that orthogons are in 1-1 correspondence with orthobraces, and we can efficiently generate orthobraces. The only thing left is to find a way to turn orthobraces into drawings.

I described a way to turn orthobraces into drawings in a previous post, but the drawings it produced were terrible—the edges get smaller and smaller as the algorithm proceeds. For example, given the string XXXXXXVVXV, the algorithm produced this drawing:

But we’d really like to be able to produce a drawing like this instead:

What kind of drawings do we want in general? Here are my criteria:

  1. No two edges should intersect or touch (other than consecutive edges touching at a vertex).
  2. The length of each edge should be a positive integer.
  3. The sum of all the edge lengths should be as small as possible.

These criteria seem reasonable, to me at least, though one can easily imagine making other choices. For example, some might wish to consider polygons with edges that intersect at a vertex, such as the one shown below.

We could also imagine minimizing the area, rather than the perimeter, of the polygon.1 In any case, the idea is that a polygon with these properties should be as small as possible while remaining non-intersecting and with all edges having a common measurement. (The edges all having integer lengths also means that we will be able to create these drawings by gluing together squares.) Let’s call orthogon drawings with these properties good.

Given these criteria, we can ask questions such as the following:

  1. Given an orthobrace, does a good drawing of it always exist?
  2. Assuming good drawing(s) do exist, can we come up with an algorithm to generate them?

It is easy to show that (1) is true: first, use the naive orthogon drawing algorithm to produce a lopsided drawing with tiny edges. Now, if the smallest edge has length 1/2^k, scale the whole polygon by a factor of 2^k. This results in a polygon with no self-intersections and all integer edges. This proves that at least one such polygon exists; among all such polygons there must be at least one with minimal perimeter.

However, this does not actually give us an algorithm for producing good drawings! That last step of the proof is the problem: just because there must exist some drawing with minimal perimeter doesn’t help us actually find one. In the end, coming up with an algorithm to produce good drawings turns out to be surprisingly tricky, and illustrates a very important principle in mathematics and computer science. Faced with some large problem/task/proof/computation, what we would ideally like to be able to do is to split it into subproblems/subtasks/lemmas/helper functions, then solve/complete/prove/compute each of those subthings independently, and somehow combine the results to come up with an overall result. For example, to compute the product of two matrices, we just have to break it down into a bunch of dot products between two vectors, doing that for each combination of a row and a column, and then put all the results together into a big result matrix. Working with smaller independent pieces and then combining results is a very powerful technique, since it lets us solve problems, do computations, prove theorems, etc. which are much larger than what we would be able to accomplish by attacking them directly.

However, it seems like this approach fails for drawing orthogons! We might imagine something like this: take the orthobrace and break it up into a bunch of substrings; come up with some kind of drawing for each one; then combine the drawings into a drawing for the whole thing. Unfortunately, this does not work, since the length of an edge can be influenced by other features of the orthogon which are far away. For example, look at this polygon:

The top edge has to be very long in order for the polygon to not intersect itself. How do we know how long it has to be? We cannot infer it just by looking at neighboring edges. It is not even due solely to some particular edges on the other side of the polygon. It is in some sense a “global” property of the whole polygon: the way in which all the edges and vertices interact produces the need for the top edge to be long.

Intuitively, it is this “global” nature of deciding on good edge lengths which makes this problem so difficult. In order to decide how long to make each edge, we have to look at all the vertices and edges… and the length we choose for one edge may influence the length of some other edges, which may in turn force us to choose a different length for the original edge! How can we ever make any progress? And even worse, how can we ever hope to know whether some choices we have made lead to a polygon with minimal perimeter?

In my next (and final!) post on orthogons, I will finally explain how we can leverage some amazingly sophisticated software to produce good orthogon drawings!

  1. Can you come up with an example where the minimum-area and minimum-perimeter solutions are not the same?

Posted in computation, geometry | Tagged , , , , , , | 3 Comments