Recall that we defined a “good” orthogon drawing as one which satisfies three criteria:
We know that good drawings always exist, but our proof of this fact gives us no way to find one. It’s also worth noting that good drawings aren’t unique, even up to rotation and reflection. Remember these two equivalent drawings of the same orthogon?
It turns out that they are both good drawings; as you can verify, each has a perimeter of 14 units. (I’m quite certain no smaller perimeter is possible, though I’m not sure how to prove it off the top of my head.)
So, given a list of vertices (convex or concave), how can we come up with a good drawing? Essentially this boils down to picking a length for each edge. The problem, as I explained in my previous post, is that this seems to require reasoning about all the edges globally. I thought about this off and on for a long time. Each new idea I had seemed to run up against this same local-global problem. Finally, I had an epiphany: I realized that the problem of making a good orthogon drawing can be formulated as an input appropriate for an SMT solver.
What is an SMT solver, you ask? SMT stands for “satisfiability modulo theories”. The basic idea is that you give an SMT solver a proposition, and it tries to satisfy it, that is, find values for the variables which make the proposition true. Propositions are built using first-order logic, that is, things like “and”, “or”, “not”, implication, as well as “for all” (∀) and “there exists” (∃). The “modulo theories” part means that the solver also supports various theories, that is, collections of extra functions and relations over certain sets of values along with axioms explaining how they work. For example, a solver might support the theory of integers with addition, multiplication, and the ≤ relation, as well as many other specialized theories.
Sometimes solvers can even do optimization: that is, find not just any solution, but a solution which gives the biggest (or smallest) value for some other function. And this is exactly what we need: we can express all the requirements of an orthogon as a proposition, and then ask the solver to find a solution which minimizes the perimeter length. SMT solvers are really good at solving these sorts of “global optimization” problems.
So, how exactly does this work? Suppose we are given an orthobrace, like XXXXXXVVXV, and we want to turn it into a drawing. First, let’s give names to the coordinates of the vertices: (x_0, y_0), (x_1, y_1), …, (x_{n-1}, y_{n-1}). (Note these have to be integers, which enforces the constraint that all edge lengths are integers.) Our job is now to specify some constraints on the x_i and y_i which encode all the rules for a valid orthogon.
We might as well start by assuming the first edge, from (x_0, y_0) to (x_1, y_1), travels in the positive x direction. We can encode this with two constraints: y_1 = y_0 and x_1 > x_0. The first means the edge is horizontal; the second expresses the constraint that the second endpoint is to the right of the first (and since x_0 and x_1 are integers, the edge must be at least 1 unit long).
We then look at the first corner to see whether it is convex or concave, and turn appropriately. Let’s assume we are traveling counterclockwise around the orthogon; hence the first corner being convex means we turn left. That means the next edge travels “north”, i.e. in the positive y direction, which we encode as x_2 = x_1 and y_2 > y_1.
And so on. We go through each edge (including the last one, from (x_{n-1}, y_{n-1}) back to (x_0, y_0)), keeping track of which direction we’re facing, and generate two constraints for each.
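To make the bookkeeping concrete, here is one way to track the edge directions in Haskell. (This is just a sketch; the data type and function names are my own, with X marking convex and V concave corners as in the orthobraces above.)

```haskell
-- The four axis directions, in counterclockwise order.
data Dir = East | North | West | South
  deriving (Show, Eq, Enum)

-- At a convex corner (X) we turn left; at a concave corner (V) we
-- turn right (assuming we travel counterclockwise around the orthogon).
turn :: Char -> Dir -> Dir
turn 'X' d = toEnum ((fromEnum d + 1) `mod` 4)
turn 'V' d = toEnum ((fromEnum d + 3) `mod` 4)
turn _   d = d

-- The direction of every edge: the first edge heads East, and the
-- corner at each subsequent vertex turns us appropriately.
edgeDirs :: String -> [Dir]
edgeDirs brace = scanl (flip turn) East (drop 1 brace)
```

For example, edgeDirs "XXXX" is [East, North, West, South], a plain rectangle; from each edge’s direction we can then emit its two constraints.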
There’s another very important criterion we have to encode, namely, the requirement that no non-adjacent edges touch each other at all. We simply list all possible pairs of non-adjacent edges, and for each pair we encode the constraint that they do not touch each other. I will leave this as a puzzle for you (I will reveal the solution next time): given two edges, each specified by the integer coordinates of its two endpoints, how can we logically encode the constraint that the edges do not touch at all? You are allowed to use “and” and “or”, addition, comparisons like <, ≤, or =, and the functions max and min (which take two integers and tell you which is bigger or smaller, respectively). This is tricky in the general case, but made much easier here by the assumption that the edges are always either horizontal or vertical. It turns out it is possible without even using multiplication.
Finally, we write down an expression giving the perimeter of the orthogon (how?) and ask an SMT solver to optimize it subject to the constraints. I used the Z3 solver via the sbv Haskell library. It outputs specific values for all the x_i and y_i which satisfy the constraints and give a minimum perimeter; from there it’s a simple matter to draw them by connecting the points.
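To give a flavor of the encoding, here is roughly what it might look like using the sbv library, specialized by hand to the tiny orthobrace XXXX (a plain rectangle). This is only a sketch: the real program generates the constraints from the orthobrace, I have left out the non-touching constraints so as not to spoil the puzzle, and it assumes Z3 is installed.

```haskell
import Data.SBV

-- The orthobrace XXXX: four vertices, first edge heading east, turning
-- left at each convex corner. Ask Z3 for integer coordinates that
-- satisfy the direction constraints and minimize the perimeter.
rectangle :: IO OptimizeResult
rectangle = optimize Lexicographic $ do
  [x0, x1, x2, x3] <- sIntegers ["x0", "x1", "x2", "x3"]
  [y0, y1, y2, y3] <- sIntegers ["y0", "y1", "y2", "y3"]
  constrain $ y1 .== y0 .&& x1 .> x0   -- edge 0 heads east
  constrain $ x2 .== x1 .&& y2 .> y1   -- edge 1 heads north
  constrain $ y3 .== y2 .&& x3 .< x2   -- edge 2 heads west
  constrain $ x0 .== x3 .&& y0 .< y3   -- edge 3 heads south
  -- Knowing each edge's direction, its length is just a difference:
  minimize "perimeter" $ (x1 - x0) + (y2 - y1) + (x2 - x3) + (y3 - y0)
```

The optimum here is, of course, a unit square with perimeter 4.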
To see this in action, just for fun, let’s turn the orthobrace XVXXVXVVXVVXXVVXVVXXXXXXVVXXVX
(which I made up randomly) into a drawing. This has 30 vertices (and 30 edges); hence we are asking the solver to find values for 60 variables. We give it two constraints per edge plus one (more complex) constraint for every pair of non-adjacent edges, of which there are 405 (each of the 30 edges is non-adjacent to 27 others, and 30 · 27 = 810 counts each pair of edges twice); that’s a total of 465 constraints. Z3 is able to solve this in 15 seconds or so, yielding this masterpiece (somehow it kind of reminds me of Space Invaders):
Of course, the original point of the hexagonal 7-coloring in my last two posts is that it establishes an upper bound of 7 for the CNP (although it turns out it’s also just a really cool pattern). Again, there is a balancing act here: we have to give each hexagon a diameter of slightly less than 1, so no two points in the same hexagon will be 1 unit apart; but we also have to make the hexagons big enough that same-colored hexagons are more than 1 unit from each other. This is indeed possible, since same-colored hexagons always have two “layers” of other hexagons in between them. Denis Moskowitz made a really nice graphic illustrating this:
In a comment on my previous post, Will Orrick pointed out that if you tile 3-dimensional space with cubes and color them with seven colors so that each cube is touching six others with all different colors, then take a diagonal slice through that space, you get this!
This is the same as the 7-colored hexagonal tiling I showed before, but with extra triangles in between the hexagons (and the colors of the triangles follow a pattern similar to the hexagons). I could stare at this all day! Here’s a version with numbers if you find it helpful. (If you support me on Patreon you can get automatic access to bigger versions of all the images I post—though to be honest if there’s a particular image you want a bigger version of, you can just ask nicely!)
So, we know CNP is either 5, 6, or 7. So which is it? No one is really sure. With some unsolved problems, there is widespread agreement in the mathematical community on what the right answer “should be”, it’s just that no one has managed to prove it. That isn’t the case here at all. If you ask different mathematicians you will probably get different opinions on which number is correct. Some mathematicians even think the “right” answer might depend on which axioms we choose as a foundation of mathematics!—in particular that the answer might change depending on whether you allow the axiom of choice (a topic for another post, perhaps).
This is a really remarkable tiling. Here are a few special properties I know of:
First of all, I hope you realized that the pattern can be extended infinitely to cover the whole plane with a hexagonal tiling. One way to convince yourself of this is to look at the horizontal strips of hexagons labelled 0 through 6. Each row of hexagons will just have 0 through 6 repeating forever; and each row is a copy of the row below it, offset by two and a half hexagons.
Of course, there are seven different colors used (also indicated by the numbers 0 through 6).
Instead of seeing it as made up of a bunch of repeating strips, you can also see it as made of a bunch of repeating parallelograms:
Each parallelogram has four zeros at its corners, and contains one of every other color/number in its interior. (There’s actually nothing special about zero; you can make parallelograms using any number you choose for the corners.)
Every hexagon touches six other hexagons, exactly one with each other color/number. For example, a hexagon labelled 1 will touch six hexagons labelled 0, 2, 3, 4, 5, and 6. (Can you see how to prove this? Think about how the whole tiling is built out of copies of the strip 0 1 2 3 4 5 6.) This explains what’s so special about having seven colors!
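This property is easy to spot-check with a bit of code. Here is a Haskell sketch using axial coordinates for the hexagons, with the labelling rule (q + 3r) mod 7; the coordinates and the labelling rule are my own choices, so the picture above may use a different (but equivalent) convention.

```haskell
import Data.List (sort)

-- The label of the hexagon at axial coordinates (q, r).
label :: Int -> Int -> Int
label q r = (q + 3 * r) `mod` 7

-- Coordinate offsets of the six neighbors of a hexagon.
neighbors :: [(Int, Int)]
neighbors = [(1,0), (-1,0), (0,1), (0,-1), (1,-1), (-1,1)]

-- The neighbors of (q, r) carry exactly the six other labels.
allColorsAround :: Int -> Int -> Bool
allColorsAround q r =
  sort [label (q + dq) (r + dr) | (dq, dr) <- neighbors]
    == [c | c <- [0..6], c /= label q r]
```

The six neighbor offsets change the label by 1, 6, 3, 4, 5, and 2 (mod 7), exactly the six nonzero residues; that observation is the whole proof in miniature.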
If you pick any color and look at all the hexagons of that color, they are always arranged in the same pattern, at the vertices of an equilateral triangular grid. For example, below I have arbitrarily chosen to highlight all the number 3 hexagons:
This means that if you pick any two colors/numbers, there is always some translation that will move all the hexagons of one color to the places that used to be occupied by the hexagons of the other color.
Here’s what it looks like if we draw all these triangular grids overlaid on top of each other (with brighter colors just for fun). I think it’s remarkable that seven equilateral triangular grids fit together so precisely!
In fact, since each hexagon touches one of each other color, for any two colors we can move all the hexagons of one color to those of the other by translating just one “hexagon unit”—that is, each hexagon will move to a hexagon next to it. For example, if we want to move all the number 3 hexagons so they match up with where the number 5 hexagons used to be, just move everything one hexagon down and to the right.
I’ll reproduce the original image here so you can refer to it while thinking about this and the following properties:
Each of the six directions we can translate corresponds to adding a different number mod seven. For example, it’s easy to see that moving to the left corresponds to adding 1 (adding 1 to 6 wraps back around to 0 because we take the remainder when dividing by 7). In other words, pick any hexagon and look at its number; the hexagon to its left will have a number which is 1 bigger (mod 7). Moving down and to the right is adding 2: for example, starting at the 0 in the middle and following the line of hexagons down and to the right, we find 0, 2, 4, 6, 1, 3, 5, where each number is two more than the previous (mod 7). Down and to the left is adding 3; and so on. I’ll let you work out the other three directions.
As pointed out in a comment by Will, rotating corresponds to multiplication mod 7! For example, rotating 60 degrees counterclockwise around 0 corresponds to multiplying by 3 (mod 7). In counterclockwise order, the six numbers we find around 0 are 1, 3, 2, 6, 4, 5. We can check that 1 · 3 = 3, and 3 · 3 = 9 ≡ 2, 2 · 3 = 6, 6 · 3 = 18 ≡ 4, and so on. Rotating the other direction is multiplying by 5. Rotating by 120 degrees is multiplying by 2 or 4; rotating by 180 degrees is multiplying by 6.
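We can check the rotation fact in the same axial-coordinate setup: rotating 60 degrees counterclockwise about the origin sends (q, r) to (-r, q + r). (The labelling rule (q + 3r) mod 7 is my own convention, so which direction of rotation gives multiplication by 3 may differ from the picture.)

```haskell
-- My labelling convention for the hexagon at axial coordinates (q, r).
label :: Int -> Int -> Int
label q r = (q + 3 * r) `mod` 7

-- One 60-degree counterclockwise rotation about the origin.
rot :: (Int, Int) -> (Int, Int)
rot (q, r) = (-r, q + r)

-- Rotating multiplies the label by 3 (mod 7), since
-- label(-r, q+r) = -r + 3(q+r) = 3q + 2r = 3(q + 3r) - 7r.
rotatesByThree :: Int -> Int -> Bool
rotatesByThree q r =
  let (q', r') = rot (q, r)
  in  label q' r' == (3 * label q r) `mod` 7
```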
Pick a hexagon; say it contains the number n. We already know that none of the hexagons it touches contain n. But in fact, none of the hexagons those hexagons touch contain n either (except for the original hexagon we chose). This is because each hexagon touches exactly one copy of every other color/number. So, in other words, each hexagon is surrounded by two layers of hexagons, none of which share the same color with the central hexagon. For example, if we pick a zero hexagon and go two layers out from there, we can see that the zero in the middle is the only zero (and there are exactly 3 copies of each other number):
Let me know in the comments if you see any other patterns that I missed!
I’ve been writing this blog for almost twelve years (!). Over that time I’ve published more than 400 posts totaling almost 200,000 words. Writing The Math Less Traveled is not my job and I’ve never gotten paid for it; I just love to explore beautiful ideas and find creative ways to explain them to others.
Normally this is the place where I would say how The Math Less Traveled won’t be able to continue to exist without your support, and how I need to feed my family, and so on. But the honest truth is that my family is doing just fine, and I will probably continue writing The Math Less Traveled no matter what! It would just be nice to be rewarded for my hard work, and to at least be able to offset the cost of the frequent writing sessions in my favorite local coffee shop (where I’m writing this right now), buy some relevant books now and then, and upgrade my WordPress plan to have ads removed. I’m also excited to use Patreon as a platform to build a closer, more collaborative relationship with my readers: I will consult patrons at the $5/month level for things like feedback on drafts, research help, and ideas for topics, posts, and visualizations.
If you find my writing valuable, I hope you’ll consider becoming a patron, even at just a few dollars a month. Of course, there are also some cool rewards, like high-resolution versions of any graphics I make for the blog, and access to patron-only content (likely full of my random mathematical musings and stories about math conversations with my kids).
Thanks for reading regardless, and here’s to the next twelve years! Next up: how to color the plane using only seven colors (aka pretty pictures made by mathematically-minded bees!).
But what about an upper bound? Can we say for sure that the CNP has to be smaller than some limit? Indeed, we can: we know that CNP ≤ 7. Today I want to explain how we can show that CNP ≤ 9; in another post I’ll show the proof that CNP ≤ 7.
Proving an upper bound for the chromatic number of the plane is very different from proving a lower bound. A lower bound is all about showing that you can’t color the plane with a certain number of colors. An upper bound, on the other hand, is all about showing that you can. We know that CNP ≤ 7 because there is a specific, valid way to color the plane using only seven colors! It might be possible to do it with fewer colors, using a very clever pattern, which is why 7 is only an upper bound. But at least we know it can be done with 7.
Let’s start with something simpler. Imagine coloring the whole plane with a repeating pattern like this:
Can this be a valid coloring of the plane? (Of course, we already know it can’t, because it only uses four colors, and CNP > 4; but let’s see if we can give a more direct argument for why this is not valid.) First of all, the squares can’t be too big: we don’t want any pair of points inside the same square to be 1 unit apart. This means the diagonal of each square has to be less than 1.
So if the diagonals of the squares are less than 1, the sides of the squares have to be less than 1/√2. But squares of the same color also have to be far enough apart that no two points in same-colored squares are 1 unit apart. In this scenario, same-colored squares are separated by only one other square; if the sides of the squares are less than 1/√2, then same-colored squares are less than 1/√2 < 1 apart, so they are too close.
This idea can be easily salvaged, however: we just need more colors!
Now we are using nine colors, with a repeating 3 × 3 pattern instead of a 2 × 2 one. Let’s see if we can make this work. Once again, the squares can’t be too big (lest two points inside the same square be 1 unit apart), but same-color squares can’t be too close (lest two points in different same-color squares be 1 unit apart). Call the side length of the squares s. The first condition once again gives us s < 1/√2. What about the second condition? The shortest distance between two same-colored squares is now 2s, if you travel straight east-west or north-south. (You can confirm for yourself that any other line between same-colored squares is longer.) So we must have 2s > 1.
Can we pick a value of s that makes s < 1/√2 and 2s > 1 both true? Yes! Rearranging a bit, this just says 1/2 < s < 1/√2. Since 1/2 < 1/√2, this gives us a valid range of values for s; any such value for s gives us a valid 9-coloring of the plane.
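We can also spot-check this numerically. Here is a quick Haskell sketch, picking s = 0.6 from the valid range; the grid of sample points and angles is an arbitrary choice of mine.

```haskell
-- Side length of the squares: any s with 1/2 < s < 1/sqrt 2 will do.
s :: Double
s = 0.6

-- The color of a point is determined by which square it is in, with
-- both the column index and the row index cycling with period 3.
color :: (Double, Double) -> (Int, Int)
color (x, y) = (floor (x / s) `mod` 3, floor (y / s) `mod` 3)

-- No sampled pair of points at distance exactly 1 shares a color.
check :: Bool
check = and
  [ color (x, y) /= color (x + cos t, y + sin t)
  | x <- [0, 0.05 .. 3], y <- [0, 0.05 .. 3], t <- [0, 0.1 .. 6.28]
  ]
```

Of course a finite check proves nothing by itself, but it is a reassuring sanity test of the inequalities above.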
Hence, it is possible to color the plane using 9 colors, such that no two points at distance 1 from each other have the same color! We don’t know if this is the best way to color the plane (in fact, it isn’t!), but at least we know it is a way. So this shows that CNP ≤ 9.
In my next post I’ll explain how we know that CNP ≤ 7: the proof is similar in spirit, but more complex (and, of course, more efficient) due to its use of hexagons instead of squares.
As Denis noted in his comment, the argument from my previous post happily applies in any base, not just base 10! That is, if we consider a d-digit number in base b, the largest it can be is if it consists of d copies of the largest possible digit, which is b - 1. In that case the sum of squares of its digits would be d(b-1)^2. For example, the largest sum of squares we could get with a six-digit number in base 5 is for 444444, which gives us a sum of squares of 6 · 4^2 = 96. But the claim is that this will be less than b^(d-1) as long as d ≥ 4 and b ≥ 2—which means that taking the sum of squares of the base-b digits of any number with four or more such digits will always result in a shorter base-b number. We can check that this is true when d = 4 and b = 2, since 4 · 1^2 = 4 < 8 = 2^3; and intuitively, increasing either d or b will make the right-hand side increase more than the left-hand side, so it will always be true for d ≥ 4 and b ≥ 2. (This is an informal, hand-wavy argument—if you think you know of a way to prove it formally I would love to hear about it!)
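I don’t have a formal proof either, but the claimed inequality is at least easy to check mechanically over a range of values (the bounds 12 and 50 below are arbitrary choices of mine):

```haskell
-- The claim: the digit-square-sum of a d-digit number in base b is at
-- most d * (b-1)^2, which should be less than b^(d-1), the smallest
-- d-digit number, whenever d >= 4 and b >= 2.
claim :: Integer -> Integer -> Bool
claim d b = d * (b - 1) ^ 2 < b ^ (d - 1)

checkClaim :: Bool
checkClaim = and [claim d b | d <- [4 .. 12], b <- [2 .. 50]]
```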
The upshot is that for any base b, we only have to try numbers with up to three digits to look for loops, since any bigger numbers will reduce until they have at most three digits. (One might wonder whether it will ever suffice to consider only two-digit numbers, when b gets large enough. The answer, perhaps surprisingly, is no. We would be looking for a base b such that 3(b-1)^2 < b^2—that is, a base for which three-digit numbers always reduce to two-digit numbers. As you can verify for yourself, the only solutions to this inequality are b ≤ 2! So four digits is the magic cutoff for every base b ≥ 3. (I’m not going to consider negative bases here; maybe in another post.))
So I wrote a program which, for each base b, finds all the loops that happen on base-b numbers from 1 to b^3 - 1. It turns out that only a few bases have a single nontrivial loop, like base 10 (perhaps this is not surprising). The ones I have found are as follows. (After base 10 we quickly run out of digits; I use lowercase letters a to z to stand for digits 10 through 35, and then uppercase letters A to Z to stand for digits 36 through 61.)
Base 20 is particularly magnificent: there is a single nontrivial loop, and it has length 26!
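My full search program (linked below) is more elaborate, but the core of such a search can be sketched in a few lines of Haskell (the function names are my own):

```haskell
import Data.List (unfoldr)

-- Sum of the squares of the base-b digits of n.
sqDigitSum :: Integer -> Integer -> Integer
sqDigitSum _ 0 = 0
sqDigitSum b n = sum . map (^ 2) $
  unfoldr (\m -> if m == 0 then Nothing else Just (m `mod` b, m `div` b)) n

-- The loop that n eventually falls into: iterate until a repeated
-- value appears, then return the cycle in the order it is traversed.
loopOf :: Integer -> Integer -> [Integer]
loopOf b = go []
  where
    go seen m
      | m `elem` seen = dropWhile (/= m) (reverse seen)
      | otherwise     = go (m : seen) (sqDigitSum b m)
```

Collecting loopOf b n for every n from 1 to b^3 - 1 (and removing duplicates) yields all the loops for base b; the fixed point 1 shows up as the one-element loop [1].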
I let my program run all the way up to base 150 (which took about 3.5 minutes on my admittedly not-so-fast laptop). Here’s some trivia:
Since each number involved in a loop has at most three digits, we can think of them as coordinates in a 3d space; I wonder what the trajectories of the loops would look like! Someone should do this. A data file containing all loops up to base 150 can be found here (note that after base 61 I just start writing the numbers in base 10 again).
The program I used to search can be found here. I think my program is fairly efficient but it could probably be sped up in various ways to search much farther.
It turns out that de Grey’s new proof follows exactly this same pattern: he constructs a unit distance graph which needs at least 5 colors. So why hasn’t anyone else done this before now? The Moser spindle (a unit distance graph with chromatic number 4) has been around since 1961; why did it take almost 60 years for someone to come up with a unit distance graph with chromatic number 5?
The reason is that the graph is very big! de Grey’s graph has 1581 vertices. It is itself constructed out of copies of smaller pieces, which are themselves constructed out of smaller pieces, etc., so it’s not quite as hard to analyze as a random 1581-vertex graph—but proving that it requires 5 colors still required a computer. Even the process of finding the graph in the first place required a computer search. So that’s why no one had done it in the last 60 years.
A lot of people have been working on improving and extending this result. The current record-holder for smallest unit-distance graph with chromatic number 5 has 610 vertices, and was found by Marijn Heule using a computer search. Here’s a picture of it, provided by Marijn:
It’s unknown whether we will be able to find a proof that can be constructed and understood by humans without the help of a computer. It’s also unknown whether these techniques and the associated technology will be able to extend far enough to find a unit distance graph with chromatic number 6 (if such a thing exists—though it seems many mathematicians believe it should).
In this post I’ll talk about lower bounds. That is, how do we know that the chromatic number of the plane (CNP) is at least 5?
Let’s start with a smaller number. Can we argue that the CNP is at least 2? Sure, this is easy: if the entire plane is colored with only one color, then obviously any two points separated by a distance of 1 unit have the same color. So one color is not enough, and hence CNP ≥ 2.
Is two colors enough? If you play around with it a bit, most likely you will quickly get an intuition that this won’t work either. Here’s an intuitive argument why we need at least three colors. Suppose some region of the plane is colored, say, blue.
(It doesn’t have to be a circle, but let’s just think about a circular-ish region for simplicity’s sake.) The diameter of the region has to be smaller than 1, because otherwise there would be two points within the region that are 1 unit apart. But that means the entire area outside the region has to be, say, red.
But if the diameter of the blue region is less than 1 and the area outside the region is all red, then intuitively, we can find two red points which are a distance of 1 apart, as illustrated in the above picture.
This isn’t really a formal argument at all (what about really weirdly-shaped regions?). But we can give a different argument which is completely formal yet still quite simple. Consider the graph consisting of three vertices connected in a triangle.
As mentioned in my previous post, the chromatic number of this graph is 3—each vertex needs its own color, since each vertex is adjacent to both other vertices. But this actually tells us something about the plane, since this graph can be drawn as an equilateral triangle with all its edges having length 1. If we pick any three points in the plane arranged in a unit equilateral triangle, they must all have different colors; hence at least three colors are needed to color the plane.
Generalizing from the previous proof, we can see a general plan of attack to prove that three colors are not enough to color the plane. We want to find some graph such that (1) it can be drawn in the plane with every edge having length exactly 1 (that is, it is a unit-distance graph), and (2) it cannot be colored with fewer than four colors.
If we can find such a graph, it immediately shows that we need at least four colors for the plane, since we can pick a set of points in the plane in exactly the same configuration as the vertices of the graph; since the vertices of the graph need four colors, so do the chosen points of the plane.
Well, such a graph exists! In fact, there are many such graphs, but the simplest one is called the Moser spindle. (It’s named after William and Leo Moser, who published it in 1961; I have no idea why it is called a “spindle”.) Here it is:
As you can see, it’s made of four equilateral triangles, with two pairs glued together along an edge, a shared vertex at the top, and an extra edge of length exactly 1 connecting the bottom vertices. You can check for yourself that this is a unit-distance graph. But what is its chromatic number? Well, suppose we try coloring it with only three colors. Suppose the top vertex is red. That vertex is connected to two equilateral triangles; of course the other two vertices of each triangle must have the other two colors (say, green and blue). Then the two bottom vertices must again be colored red, since each is part of an equilateral triangle with one green and one blue vertex. But this isn’t allowed: the two bottom vertices are connected by an edge, so they can’t be the same color.
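The case analysis above can also be verified by brute force. Here is a Haskell sketch; the vertex numbering is my own (0 is the shared top vertex, 1–3 and 4–6 are the two rhombi, with 3 and 6 the bottom vertices joined by the extra edge):

```haskell
import Control.Monad (replicateM)

-- The eleven edges of the Moser spindle.
spindle :: [(Int, Int)]
spindle = [ (0,1), (0,2), (1,2), (1,3), (2,3)   -- first rhombus
          , (0,4), (0,5), (4,5), (4,6), (5,6)   -- second rhombus
          , (3,6) ]                             -- the extra unit edge

-- Is an assignment of colors to vertices 0..6 proper?
proper :: [Int] -> Bool
proper cs = and [cs !! u /= cs !! v | (u, v) <- spindle]

-- Can the spindle be properly colored with k colors?
colorable :: Int -> Bool
colorable k = any proper (replicateM 7 [1 .. k])
```

Sure enough, colorable 3 is False while colorable 4 is True, so the chromatic number is exactly 4.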
So, three colors aren’t enough to color the Moser spindle, and hence three colors aren’t enough to color the plane either!
So the above proofs are not that hard to grasp, and we have known that CNP > 3 since 1961 or so. So how did Aubrey de Grey prove that CNP > 4, and why did it take almost 60 years to go from 3 to 4? I’ll explain that in my next post!
In this case, we hit a value we have already seen, after which point the values will start to repeat. So in this case, iterating s enters a loop of length 8, where it will get stuck forever.
Let’s try another starting value.
In this case we hit 1, which is a fixed point of s, that is, s(1) = 1, so we simply get stuck on 1 forever.
The amazing thing is that these are the only two things that can happen—for any positive integer, iterating s will eventually either reach 1, or enter the loop of eight numbers 4, 16, 37, 58, 89, 145, 42, 20. Arthur Porges proved this in his article, A Set of Eight Numbers, published in 1945 in The American Mathematical Monthly (Porges 1945).
Let’s see a proof, inspired by Porges’s approach. First, suppose n is a positive integer, and let d be the number of digits in n, so 10^(d-1) ≤ n. How big could s(n) be? Since each digit contributes to the sum independently, and the digit with the biggest square is 9, s(n) attains its maximum possible value when n consists of all 9’s, in which case s(n) = 81d.
I claim that 81d < 10^(d-1) when d ≥ 4. We can directly check that this is true when d = 4, since 81 · 4 = 324 < 1000. And every time we increase d by 1, the left-hand side increases by 81, but the right-hand side is multiplied by 10. At that rate the left-hand side can never hope to catch up!
Putting this together, we see that when n has four or more digits, s(n) ≤ 81d < 10^(d-1) ≤ n, which by transitivity means s(n) < n. In other words, if n has four or more digits, s necessarily makes it smaller. So if we start with some huge 500-digit number and start iterating s, the results will get smaller and smaller until we finally get to a number with fewer than four digits. (A fun exercise for you: if we start with a 500-digit number, what is the maximum number of iterations of s we need to reach a number with fewer than four digits?)
So what happens then? Actually, the bulk of Porges’s article is taken up with analyzing the situation for numbers of three or fewer digits. He proves various lemmas which cleverly reduce the number of cases that actually need to be checked by hand. But you see, in 1945 he didn’t have any computers to help! 1945 is right around the time when the first electronic, digital computers were being developed; it would be at least another ten or twenty years before any would have been generally available for a mathematician to use in checking a conjecture like this.
In 2018, on the other hand, it’s much faster to just write a program to test all the numbers with three or fewer digits than it is to follow Porges’s arguments. Here’s some Haskell code which does just that:
-- The loop of eight numbers.
loop :: [Integer]
loop = [145, 42, 20, 4, 16, 37, 58, 89]

-- s computes the sum of the squares of the (base-10) digits.
s :: Integer -> Integer
s = sum . map (^2) . map (read . (:[])) . show

-- ok checks whether iterating s ever reaches 1 or lands in the loop.
-- (iterate s is an infinite list, but any stops at the first hit.)
ok :: Integer -> Bool
ok = any (\m -> m == 1 || m `elem` loop) . iterate s
The ok function checks whether iterating s ever produces either the number 1 or some number in the loop. Now we just have to run ok on all the numbers we want to check:
>>> all ok [1..1000]
True
And voilà!
Porges, Arthur. 1945. “A Set of Eight Numbers.” The American Mathematical Monthly 52 (7). Mathematical Association of America:379–82. http://www.jstor.org/stable/2304639.