Modular exponentiation by repeated squaring

In my last post we saw how to quickly compute powers of the form a^{2^k} by repeatedly squaring: (a^2)^2 = a^4; then (a^4)^2 = a^8; and so on. This is much more efficient than computing powers by repeated multiplication: for example, we need only three multiplications to compute a^8 by squaring, but we would need seven multiplications to compute it as a \cdot a \cdot \dots \cdot a.

The question I left you with is whether we can use a similar technique to compute other powers which are not themselves powers of two. The idea is that in addition to squaring, we can also multiply by another copy of a at strategic points. For example, suppose we want to compute a^{26}. We can do it like this:

  • a^2 \cdot a = a^3 (square and multiply by another a)
  • (a^3)^2 = a^6 (square)
  • (a^6)^2 \cdot a = a^{13} (square and multiply by a)
  • (a^{13})^2 = a^{26} (square)

So how do we decide when to multiply by an extra copy of a? And can we get any exponent this way?

It becomes easier to answer these questions if we think recursively. Instead of starting with a and building up to a^n, let’s start with our goal of computing a^n and see how to break it down into subproblems we can solve.

So suppose we’re trying to compute a^n. There are two cases, depending on whether n is even or odd. If n is even, all we have to do is compute a^{n/2}; then we can square it to get a^n. What about if n is odd? Then we can compute a^{(n-1)/2}; squaring it gets us a^{n-1}; then multiplying by one more copy of a gets us a^n.

For example, let’s rethink our example with a^{26} in this way. 26 is even, so our goal is to compute a^{26/2} = a^{13}; we can then square that to get a^{26}. So how do we compute a^{13}? Since 13 is odd, we can get it by squaring a^6, then multiplying by a one more time. To compute a^6, we want to square a^3; finally, to compute a^3 we square a^1 and multiply by another a. The base case of the recursion is when n=1, at which point we can stop: a^1 is just a. In fact, we can use n=0 as an even simpler base case: a^0 = 1. (Do you see how we will still get the right answer for a^1 even if n=1 is not a base case?)

a^n = \begin{cases} 1 & \text{if } n = 0 \\ (a^{n/2})^2 & \text{if } n \text{ is even} \\ (a^{(n-1)/2})^2 \cdot a & \text{if } n \text{ is odd} \end{cases}
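This recursive definition translates almost directly into code. Here's a minimal sketch in Python (the name `power` is mine, just for illustration):

```python
def power(a, n):
    """Compute a**n by recursive repeated squaring."""
    if n == 0:
        return 1                      # base case: a^0 = 1
    half = power(a, n // 2)           # note n // 2 == (n - 1) // 2 when n is odd
    if n % 2 == 0:
        return half * half            # n even: (a^(n/2))^2
    else:
        return half * half * a        # n odd: (a^((n-1)/2))^2 * a

assert power(2, 26) == 2 ** 26
```

Notice that integer division gives us `(n-1)//2` for free when n is odd, so both cases can share the same recursive call.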

As you may have already noticed, we can think of this in terms of the binary expansion of n. Whether n is even or odd is determined by its final bit: n is even when the final bit is 0, and odd when the final bit is 1. Computing n/2 (or (n-1)/2 when n is odd) corresponds to chopping off the final bit of n. So what we’re really doing is chopping off one bit of n at a time, squaring at each step and multiplying by an extra copy of a when we see a 1 bit. For example, 26 in binary is 11010_2. Since the final bit is 0, this tells us that we want to compute a^{1101_2} and then square it. Since the final bit is now 1, this in turn means that we want to compute a^{110_2}, square it, and multiply by a; and so on. So here is an equivalent way to write our algorithm as a loop:

result = 1
for each bit of n from left to right:
    result = result * result
    if the current bit is 1:
        result = result * a
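Here's the same loop as actual Python (a sketch; `bin(n)[2:]` gives the bits of n as a string, most significant first):

```python
def power_bits(a, n):
    """Left-to-right binary exponentiation: scan the bits of n
    from most significant to least significant."""
    result = 1
    for bit in bin(n)[2:]:            # e.g. '11010' for n = 26
        result = result * result      # square at every step
        if bit == '1':
            result = result * a       # extra factor of a on each 1 bit
    return result

assert power_bits(3, 26) == 3 ** 26
```

The first squaring (of 1) and the first multiplication are trivial, but writing the loop uniformly like this keeps it simple.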

Here’s a visualization of the operations needed to compute a^{26} using the two different methods:

Obviously, using the repeated squaring approach requires far fewer multiplications! But how many does it take, exactly? The loop repeats once for every bit of n, and each time through the loop, we do either one or two multiplications. The number of bits in n is \lfloor \log_2 n \rfloor + 1; so in the worst case we do twice that many multiplications, that is, 2 \lfloor \log_2 n \rfloor + 2. (This worst case happens precisely when n is one less than a power of two, that is, when its binary expansion consists of all 1’s.)
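We can check this bound by counting the loop's operations directly (a quick sketch; like the analysis above, the count includes the trivial first squaring and multiplication by 1):

```python
import math

def mult_count(n):
    """Multiplications used by the square-and-multiply loop:
    one squaring per bit, plus one extra multiply per 1 bit."""
    bits = bin(n)[2:]
    return len(bits) + bits.count('1')

# worst case: n with all bits equal to 1, e.g. n = 2**10 - 1
n = 2 ** 10 - 1
assert mult_count(n) == 2 * math.floor(math.log2(n)) + 2
```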

Remember our example from last time? Suppose n = 10^{10}. Assuming that we can do 10^8 multiplications per second, computing a^n by repeated multiplication would take a minute and a half. So how about by repeated squaring? In that case it will take at worst

2 \lfloor \log_2 10^{10} \rfloor + 2 = 2 \lfloor 10 \log_2 10 \rfloor + 2 = 2 \cdot 33 + 2 = 68

multiplications. Wow! Computing a^{10^{10}} by repeatedly multiplying by a takes ten billion multiplication operations, but computing it by repeated squaring takes at most sixty-eight! (Actually it will take even fewer than that, since the binary expansion of 10^{10} does not consist of all 1’s—can you figure out exactly how many multiplications it will take?) I need hardly point out that if we can do 10^8 multiplications per second, doing 68 will take hardly any time at all (about 2/3 of a microsecond; in that amount of time, light travels only about 200 meters).

Obviously this makes a huge difference. And n = 10^{10} is actually rather small—although a minute and a half is a long time compared to a third of a microsecond, waiting a minute and a half for a computation is quite doable. But what about something like n = 10^{512}? Computing a^n by repeated multiplication would require 10^{512} multiplications; at 10^8 per second this would take 10^{504} seconds, which is vastly, unimaginably longer than the estimated age of the universe (which is “only” about 4 \times 10^{17} seconds). Even if you expanded every microsecond in the history of the universe to take an entire age of the universe—and then repeated this process 20 times—you still wouldn’t have enough time! But what about with repeated squaring? Well,

2 \lfloor \log_2 10^{512} \rfloor + 2 = 2 \lfloor 512 \log_2 10 \rfloor + 2 = 2 \cdot 1700 + 2 = 3402.

Three thousand four hundred multiplications, at 10^8 per second, would take… about 34 microseconds! 34 microseconds is, shall we say… a tiny bit faster than ages within ages within ages of the universe. This idea of doing repeated squaring—or, more generally, any algorithm where we get to repeatedly halve the size of the problem—is incredibly powerful!

(One thing I have been sweeping under the rug a bit is that not all multiplications take the same amount of time, so if you just want to compute a^n, the multiplications will take longer and longer as the numbers become larger; saying that we can do 10^8 multiplications per second is only true if the numbers involved are less than, say, 2^{64}. Not only that, but the results might be gigantic: for example, as far as we know there isn’t enough space in the entire universe to even write down the result of 2^{10^{512}}, even if we could somehow write one digit on each atom! Everything I’ve said is justified, however, by the fact that we actually want to compute something like a^{n-1} \bmod n: if we reduce \pmod n at each step, then all the multiplications really do take the same amount of time, and we don’t have to worry about the result getting astronomically large.)

Posted in computation, number theory | 4 Comments

Modular exponentiation

In my previous post I explained the Fermat primality test:

  • Input: n
  • Repeat k times:
    • Randomly choose 1 < a < n-1.
    • If a^{n-1} \not \equiv 1 \pmod n, stop and output COMPOSITE.
  • Output PROBABLY PRIME.

In future posts I’ll discuss how well this works, things to worry about, and so on. But first, I realized there’s one important piece to discuss first: How do we compute a^{n-1} \bmod n? This is obviously a key component of the above algorithm—and many related algorithms—and the fact that it can actually be done quite efficiently is what makes these algorithms practically useful. But if you’ve never thought about this before, it’s probably not obvious how to compute a^{n-1} \bmod n efficiently.

The obvious, “naive” way is to just do what it literally says: multiply a by itself n-1 times, and then reduce the result \pmod n. However, this has two very big problems; we’ll talk about each and then see how to fix them.

Problem 1: a^{n-1} might be very big!

For example, suppose we want to test n = 13499 to see whether it is prime. (This is a pretty “small” number as far as testing for primality goes!) Say we choose a = 10006 (I actually generated a at random using my computer). But then a^{n-1} = 10006^{13498} has almost 54 thousand digits! (I computed this as \lfloor \log_{10} 10006^{13498} \rfloor + 1 = \lfloor 13498 \log_{10} 10006 \rfloor + 1; since \log_{10} 10006 is very close to 4, this is very close to 4 \times 13498 \approx 4 \times 13500 = 54000.) It takes about 10 bits to store every three decimal digits (since 2^{10} = 1024 is about the same as 10^3 = 1000), so a 54 thousand-digit number would require about 54000 \text{ digits} \times \frac{10 \text{ bits}}{3 \text{ digits}} \times \frac{1 \text{ byte}}{8 \text{ bits}} = 22500 \text{ bytes} to store, about the size of a small image. Such a number would take a while to compute, probably longer than simply trying all the possible divisors of 13499. And if we want to test numbers n with hundreds of digits, we would be completely out of luck.
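You can check this size estimate for yourself in Python, which has arbitrary-precision integers (a quick sketch):

```python
import math

# number of digits of 10006**13498, computed via logarithms
# (no need to actually build the 54-thousand-digit number)
digits = math.floor(13498 * math.log10(10006)) + 1
assert abs(digits - 54000) < 10       # indeed almost 54 thousand digits
```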

Of course, it’s not really a^{n-1} we want, but a^{n-1} \bmod n. Thankfully, we don’t actually have to compute a^{n-1} and then reduce it \pmod n at the very end. The key is that taking remainders “commutes” with multiplication. That is, it doesn’t matter whether you multiply first or take remainders first:

xy \equiv (x \bmod n) (y \bmod n) \pmod n

So instead of waiting until the very end to reduce \pmod n, we can reduce \pmod n after each multiplication. For example, we could compute a_2 = a^2 \bmod n, and then a_3 = a_2 \cdot a \bmod n, and so on. Much better! Now instead of ending up with a monstrosity with thousands, millions, etc. of digits, the intermediate numbers we have to deal with never get bigger than about n^2, which is quite reasonable.

As a simple example, just to show that this really works, let’s pick a = 7, n = 11. Directly computing 7^{10} yields 282475249, which leaves a remainder of 1 when divided by 11. On the other hand, we could first compute a_2 = 7\cdot 7 \bmod 11 = 49 \bmod 11 = 5, then a_3 = 5 \cdot 7 \bmod 11 = 35 \bmod 11 = 2, and so on: a_4 = 3, a_5 = 10, a_6 = 4, a_7 = 6, a_8 = 9, a_9 = 8, and a_{10} = 1. At each step we multiply by 7 and then reduce \pmod {11}. The numbers we have to deal with are never bigger than 70. And sure enough, we still end up with 1.
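We can verify this example (and the general pattern) with a short Python sketch; the function name is mine, and Python's built-in three-argument pow does the same thing much faster:

```python
def power_mod_naive(a, n, m):
    """Compute a**n mod m by repeated multiplication, reducing
    mod m after every step so that the intermediate values never
    grow beyond roughly m**2."""
    result = 1
    for _ in range(n):
        result = (result * a) % m     # multiply, then reduce immediately
    return result

assert power_mod_naive(7, 10, 11) == pow(7, 10, 11) == 1
```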

However, there’s still another problem:

Problem 2: computing a^{n-1} naively takes too many steps!

How many multiplication operations does it take to compute a^{n-1}? Well, a^{n-1} = \underbrace{a \cdot a \cdot a \cdot \dots \cdot a}_{n-1}, so this takes n-2 multiplication steps, right? First we compute a^2, then a^3, then a^4 … we multiply by a (and reduce \pmod n) at each step.

However, if n is big this could take quite a while! The number of multiplications grows linearly with n, that is, it grows exponentially with the size (number of bits/digits) of n—which is no better than trial division!

For example, assume we can do 10^8 multiplications per second (this is actually fairly realistic). Testing n = 10^{10} + 19—only an 11-digit number—requires computing a^{n-1}, which would take 10^{10} + 17 multiplications. At only 10^8 multiplications per second, this would take about 100 seconds—and we have to repeat this k times! If n actually had hundreds of digits instead of just 11, then this would take way longer than the estimated age of the universe.

But there is a better way! Let’s start with something simple: how many multiplications does it take to compute a^4? Again, you might think it takes three (a \cdot a = a^2, a^2 \cdot a = a^3, a^3 \cdot a = a^4) but it can be done with only two multiplications; can you see how?

The secret is that we don’t have to multiply by a every time. Once we have computed bigger powers of a, we can use them to build up even bigger powers faster. In the specific example of computing a^4, we first compute a^2 = a \cdot a; but now that we have a^2, we need only one more multiplication: we just square a^2 to compute a^2 \cdot a^2 = a^4. With yet one more multiplication, we could then square a^4, getting a^4 \cdot a^4 = a^8, and so on.
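For instance, three squarings take us all the way from a to a^8 (a quick sketch):

```python
a = 3
power_of_a = a
for k in range(1, 4):
    power_of_a = power_of_a * power_of_a   # a^(2^(k-1)) -> a^(2^k)
# three squarings: a -> a^2 -> a^4 -> a^8
assert power_of_a == a ** 8
```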

So that quickly gets us exponents that are themselves powers of two. What about other exponents? This post is long enough so I think I’ll leave this for the next post. But if you’ve never seen this before I encourage you to think about how this would work. For example, how would you most efficiently compute a^{11}? Can you come up with an algorithm for computing a^n in general?

Posted in computation, number theory | 2 Comments

The wizard’s rational puzzle (mind your p’s and q’s!)

You have been abducted by a sadistic math wizard (don’t you hate it when that happens?). He ushers you into a plain but cozy-looking room, with a hardwood floor, a few exotic-looking rugs, and wood paneling on the walls.

He hands you a small polished metal cube. “Inside this cube”, he explains, “is a positive rational number. If you can correctly tell me its numerator and denominator in lowest terms, I will let you go free. If you guess wrong, though…” He pauses to indulge in an evil chuckle. “Let’s just say that my sharks are hoping you will fail.” He strides out and slams the door behind him.

Of course you try to open the cube, but you can’t find any way to do it: you don’t see any hinges, or a lid, or even any kind of seam at all, just smooth metal. You try throwing it on the floor to see if it will shatter, but you only succeed in denting the floor. You next try the door, but of course it is locked, and seems extremely solid.

You sigh and start to look around the room. You quickly notice that a few of the wooden panels have various holes, knobs, and inscriptions on them. You go to examine them more closely.

On the west wall is a panel labelled “RECIPROCATOR”. To one side there is a square hole labelled with the letter x; next to it is another hole labelled 1/x. The holes look like they might be just about the same size as the metal cube, so you hold your cube up next to the hole labelled x to compare their sizes. Suddenly, with a whirring sound, something starts to pull the cube out of your hand, into the hole; you desperately try to hang on, but the mechanism is too strong. You experience a brief moment of panic as the cube slips from your fingers, but instead of disappearing, it stops as soon as it is flush with the panel. There are more whirring sounds from inside the wall, and a few moments later, your cube slowly emerges from the hole again, along with a new cube from the hole labelled 1/x. They look identical; you are extremely glad at this moment of the decision you made, eleven years ago, to carry a permanent marker in your pocket at all times Just In Case. You carefully label the original cube and the new cube so you will not get them mixed up, since you are starting to get the feeling that you might end up with a lot of these.

On the north wall is a panel labelled “ARITHMETIZER”. There are three square holes: one labelled x, one labelled y, and one labelled x + y. However, you notice a little knob below the + symbol; by turning it you are able to change the + into - or \times. You turn it back to + and, sure enough, when you put your two cubes next to the holes labelled x and y, they are sucked into the machine briefly, and after some whirring sounds they spit back out, along with a third cube (which you carefully label) ejected from the third hole.

The east wall has no panels, but it does have a window with a stunning view of the forest you were hiking in just a few hours ago. You realize, belatedly, that the stories that crazy old man at the lodge told about the “haunted math forest” may have had some truth to them after all.

On the south wall is a panel labelled “COMPARATOR”, with two holes labelled x and y. To the right of those holes, under a blank space on the panel, is a label that reads x < y. You try putting in your original cube and the one that came out of the ARITHMETIZER. After the usual whirring, part of the panel over the label x < y slides open for a few seconds, revealing a giant letter T. You swap the two cubes and put them back in, and sure enough, this time the panel slides open to reveal a giant letter F.

In the middle of the room is a small table with a chair and a lamp. On the desk, it looks like the wizard has thoughtfully provided you with a quill, a pot of ink, and a stack of blank paper. You have the first few ticklings of some ideas, so you sit down at the desk, look thoughtfully at the ceiling for a few seconds, then begin to write.

Can you escape? Or will you be feeding the sharks? What’s your best strategy?

Steel cube photograph by Zai Divecha

Posted in arithmetic, challenges, logic, programming, puzzles | 15 Comments

The Fermat primality test

After several long tangents writing about orthogons and the chromatic number of the plane, I’m finally getting back to writing about primality testing. All along in this series, my ultimate goal has been to present some general primality testing algorithms and the math behind them, and now we’re finally ready to see our first (sort of!). As a reminder, and as a guide for anyone reading this without reading the previous posts, here’s the story so far:

So let’s see our first primality testing machine! This one is actually very simple. Remember that Fermat’s Little Theorem (the first version) says:

If p is prime and a is an integer where 0 < a < p, then a^{p-1} \equiv 1 \pmod p.

We can turn this directly into a test for primality, as follows: given some number n that we want to test for primality, pick an integer a between 0 and n (say, randomly), and compute a^{n-1} \pmod n. If the result is not equal to 1, then n is definitely not prime, since it would contradict Fermat’s Little Theorem. In that case we can immediately stop and report that n is composite (though note that we have not found any factors!).

(In actual practice, we don’t bother trying a = 1 or a = n-1; we only pick from 1 < a < n-1. Can you see why it’s useless to test a = 1 or a = n-1?)

For example, suppose we want to test n = 8. Let’s pick a = 2. We compute a^{n-1} = 2^7 = 128 \equiv 0 \pmod 8; hence n = 8 is not prime (but you probably knew that already). In this particular example a is actually a factor of n, but it need not be. For example, let n = 15 and pick a = 7; then computing 7^{14} \equiv 4 \pmod {15} proves that n is composite, even though 7 and 15 share no common factors.

So what if a^{n-1} is equivalent to 1 \pmod n? Unfortunately, Fermat’s Little Theorem is not an “if and only if” statement! It is quite possible to have a^{n-1} \equiv 1 \pmod n even when n is composite. So if we do get a result of 1, we simply can’t conclude anything about n. For example, with n = 15 again, if we happened to pick a = 4 instead of a = 7, then we get 4^{14} \equiv 1 \pmod {15} even though 15 isn’t prime.

So suppose we pick some a and get a^{n-1} \equiv 1 \pmod n. What can we do? Well, just try another a! If we get a^{n-1} \not\equiv 1 \pmod n for the new a, stop and report that n is composite. Otherwise, pick another a, and so on.

In general, we can iterate some fixed number of times k. If we ever find an a such that a^{n-1} \not\equiv 1 \pmod n, then we can report that n is definitely not prime. Otherwise, if we get through testing k different values of a and they all yield 1, then we can report that n is probably prime.
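Here is one way the whole test might look in Python (a sketch; `random.randrange` picks the witnesses, and I use Python's built-in three-argument pow for the modular exponentiation):

```python
import random

def fermat_test(n, k=20):
    """Fermat primality test: False means n is definitely composite,
    True means n is probably prime."""
    if n < 4:
        return n in (2, 3)            # handle tiny cases directly
    for _ in range(k):
        a = random.randrange(2, n - 1)    # pick 1 < a < n-1 at random
        if pow(a, n - 1, n) != 1:         # Fermat's Little Theorem fails
            return False                  # definitely composite
    return True                           # probably prime

assert fermat_test(13499)    # 13499 is prime
```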

So this is better than nothing, but it’s not quite a primality machine, because it can’t tell us for sure that a number is prime. And it leaves a lot more questions: could we make k big enough so that we could know for sure whether n is prime? How big would k have to be? What about for composite numbers; how fast do we expect this to be? Are there other ways to build on this basic idea to get better (faster, more certain) primality tests? I’ll write about all this and more in future posts!

Posted in computation, number theory, primes | 5 Comments

SMT solutions

In my last post I described the general approach I used to draw orthogons using an SMT solver, but left some of the details as exercises. In this post I’ll explain the solutions I came up with.

Forbidding touching edges

One of the most important constraints to encode is that two non-adjacent edges of an orthogon shouldn’t touch at all. How can we encode this constraint in a way that the SMT solver will accept? In particular, given two horizontal or vertical edges (x_{11},y_{11})\text{---}(x_{12},y_{12}) and (x_{21},y_{21})\text{---}(x_{22},y_{22}), we want to come up with a logical expression—using “and” and “or”, addition, comparisons like >, <, \geq or \leq, and the functions \mathrm{max} and \mathrm{min}—which is true if and only if the two edges don’t touch.

It turns out that we can simplify matters by thinking about how to test whether two axis-aligned rectangles intersect (though this might not seem simpler at first—trust me!).

By axis-aligned I mean that the edges of the rectangles are horizontal and vertical, as in the picture above. The rectangles in the bottom-right do not intersect; all the other pairs of rectangles do intersect (including the bottom-left pair, which only touch along an edge).

One way we can simplify things yet again is to think about each dimension independently: two rectangles intersect if and only if they overlap along the x-axis and along the y-axis. So imagine projecting the two rectangles down onto the x-axis; this results in two intervals.

One thing we have to be careful about up front is that we don’t know which order the interval endpoints will be given in. Given an interval with endpoints at x_1 and x_2, we can define x_1' = \min(x_1,x_2) and x_2' = \max(x_1,x_2). Then we know that x_1' is the smaller (leftmost) endpoint and x_2' is the bigger (rightmost) endpoint.

So now suppose we have two intervals, (x_{1L}, x_{1R}) and (x_{2L}, x_{2R}), where we have already “sorted” the endpoints using the above \min/\max technique, so we know x_{1L} \leq x_{1R} and x_{2L} \leq x_{2R}. How can we tell if these intersect? One approach is to try writing down a bunch of cases: the intervals intersect if x_{1L} is in between the endpoints of the second interval, OR if x_{1R} is in between them, OR if x_{2L}… but this quickly gets tedious, and there is a better way: it is simpler to think about when the intervals don’t intersect. If they don’t intersect, as in the bottom-right example in the above diagram, there must be a gap between the end of the first (leftmost) interval and the start of the next. Put another way, the right end of the left interval must be strictly less than the left end of the right interval.

But how can we tell which interval is the “left” interval and which is the “right” interval? These designations might not even make sense if the intervals do intersect. The key is to realize that we don’t actually care about which interval is where, but only about the relative order of their endpoints: the “right end of the left interval” is really just the leftmost right end, that is, \min(x_{1R}, x_{2R}); similarly the left end of the right interval is just the rightmost left end, that is, \max(x_{1L}, x_{2L}). Putting this together, the intervals do not intersect if and only if

\min(x_{1R}, x_{2R}) < \max(x_{1L}, x_{2L}).

Believe it or not, we’re almost done! We just need a couple more observations:

  • We can similarly project the rectangles onto the y-axis and analyze whether those intervals intersect. Two rectangles intersect if and only if their projections intersect along the x-axis and the y-axis. Conversely, two rectangles don’t intersect if and only if their projections don’t intersect along the x-axis or the y-axis.
  • But we wanted to test line segments, not rectangles, right? Well, line segments can be thought of as just infinitely thin rectangles! Nothing in our analysis breaks if we allow this. When we project onto an axis we might get a degenerate interval consisting of a single point (where the left and right endpoints are the same), but this is perfectly OK—everything still works.

Putting this all together, given two horizontal or vertical line segments (x_{11},y_{11})\text{---}(x_{12},y_{12}) and (x_{21},y_{21})\text{---}(x_{22},y_{22}):

  • We first define x_{1L} = \min(x_{11}, x_{12}), x_{1R} = \max(x_{11}, x_{12}), and so on for x_{2L}, x_{2R}, y_{1L}, y_{1R}, y_{2L}, and y_{2R}.
  • Then the segments don’t intersect if they don’t intersect along either axis:

    (\min(x_{1R}, x_{2R}) < \max(x_{1L}, x_{2L})) \lor (\min(y_{1R}, y_{2R}) < \max(y_{1L}, y_{2L}))
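As a sanity check, the whole test fits in a few lines of Python (a sketch, with names of my own choosing; each segment is a pair of endpoints):

```python
def segments_dont_touch(s1, s2):
    """For two axis-aligned segments, True iff they share no point.
    Each segment is given as a pair of endpoints ((x1, y1), (x2, y2))."""
    (x11, y11), (x12, y12) = s1
    (x21, y21), (x22, y22) = s2
    # "sort" the endpoints along each axis
    x1L, x1R = min(x11, x12), max(x11, x12)
    x2L, x2R = min(x21, x22), max(x21, x22)
    y1L, y1R = min(y11, y12), max(y11, y12)
    y2L, y2R = min(y21, y22), max(y21, y22)
    # disjoint iff the projections are disjoint on at least one axis
    return (min(x1R, x2R) < max(x1L, x2L)) or (min(y1R, y2R) < max(y1L, y2L))

# crossing segments touch...
assert not segments_dont_touch(((0, 1), (4, 1)), ((2, 0), (2, 3)))
# ...but parallel segments on different rows do not
assert segments_dont_touch(((0, 0), (4, 0)), ((0, 1), (4, 1)))
```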

Of course this all depends crucially on the assumption that the segments are horizontal or vertical. (Can you see where it goes wrong if this assumption is false? That is, can you come up with an example involving non-axis-aligned segments where this procedure says they intersect when they actually don’t, or vice versa?)

Computing orthogon perimeters

I also mentioned writing down an expression for the perimeter of the generated orthogon, so we can tell the SMT solver to minimize it. Of course, given a segment (x_1,y_1)\text{---}(x_2,y_2) we can just write down the formula for its length as \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}, and then add all these up. But this is no good since the x_i and y_i are explicitly required to be integers, and we’d rather not make the SMT solver deal with arithmetic on real numbers, which we would have to if we wanted it to take square roots. (It can deal with floating-point numbers, but they’re best avoided unless really necessary.) And intuitively, dealing with real numbers is definitely overkill here, because the length of each edge is going to be an integer. So we’d like to find a way to talk about the length without using square root.

Once again, the assumption that the edges are always horizontal or vertical comes to our rescue. Suppose the edge is horizontal: in that case, (y_2 - y_1) will be zero, and the distance formula simplifies to \sqrt{(x_2 - x_1)^2 + 0^2} = \sqrt{(x_2 - x_1)^2} = |x_2 - x_1|. Likewise, if the segment is vertical, its length is |y_2 - y_1|. So in general, we could express the length of the segment by writing a condition that checks whether y_1 = y_2 and yields either |x_2 - x_1| or |y_2 - y_1| depending on the answer. But we can be slightly more clever than this: we don’t need any conditions if we simply write the length as

|x_2 - x_1| + |y_2 - y_1|

One of these will always be zero, leaving the other as the correct length.
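In code, then, the length of a horizontal or vertical edge is simply (a sketch):

```python
def edge_length(p, q):
    """Length of an axis-aligned segment from p to q, using the
    formula |x2 - x1| + |y2 - y1|; for a horizontal or vertical
    segment one of the two terms is always zero."""
    (x1, y1), (x2, y2) = p, q
    return abs(x2 - x1) + abs(y2 - y1)

assert edge_length((0, 0), (0, 7)) == 7   # vertical edge
assert edge_length((3, 2), (8, 2)) == 5   # horizontal edge
```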

You might be familiar with the above formula: it is known as the Manhattan or taxicab distance between (x_1,y_1) and (x_2,y_2)—so called because it corresponds to the distance between two points when you can only travel horizontally or vertically (such as in a city, like Manhattan, with a grid of streets). To use a fancier term, it’s the distance between the points in the L_1 metric (the usual distance formula is called the L_2 metric). The above discussion amounts to the observation that the L_1 and L_2 metrics are equal for horizontal or vertical segments, which makes sense: if you want to travel between two locations on the same street, you’ll travel the same distance no matter whether you take a taxi or a crow.

Posted in computation, geometry

Drawing orthogons with an SMT solver

I’m long overdue to finish up my post series on orthogons as promised. First, a quick recap:

Recall that we defined a “good” orthogon drawing as one which satisfies three criteria:

  1. No two edges intersect or touch, other than adjacent edges at their shared vertex.
  2. The length of each edge is a positive integer.
  3. The sum of all edge lengths is as small as possible.

We know that good drawings always exist, but our proof of this fact gives us no way to find one. It’s also worth noting that good drawings aren’t unique, even up to rotation and reflection. Remember these two equivalent drawings of the same orthogon?

It turns out that they are both good drawings; as you can verify, each has a perimeter of 14 units. (I’m quite certain no smaller perimeter is possible, though I’m not sure how to prove it off the top of my head.)

So, given a list of vertices (convex or concave), how can we come up with a good drawing? Essentially this boils down to picking a length for each edge. The problem, as I explained in my previous post, is that this seems to require reasoning about all the edges globally. I thought about this off and on for a long time. Each new idea I had seemed to run up against this same local-global problem. Finally, I had an epiphany: I realized that the problem of making a good orthogon drawing can be formulated as an input appropriate for an SMT solver.

What is an SMT solver, you ask? SMT stands for “satisfiability modulo theories”. The basic idea is that you give an SMT solver a proposition, and it tries to satisfy it, that is, find values for the variables which make the proposition true. Propositions are built using first-order logic, that is, things like “and”, “or”, “not”, implication, as well as “for all” (\forall) and “there exists” (\exists). The “modulo theories” part means that the solver also supports various theories, that is, collections of extra functions and relations over certain sets of values along with axioms explaining how they work. For example, a solver might support the theory of integers with addition, multiplication, and the \geq relation, as well as many other specialized theories.

Sometimes solvers can even do optimization: that is, find not just any solution, but a solution which gives the biggest (or smallest) value for some other function. And this is exactly what we need: we can express all the requirements of an orthogon as a proposition, and then ask the solver to find a solution which minimizes the perimeter length. SMT solvers are really good at solving these sorts of “global optimization” problems.

So, how exactly does this work? Suppose we are given an orthobrace, like XXXXXXVVXV, and we want to turn it into a drawing. First, let’s give names to the coordinates of the vertices: (x_0, y_0), (x_1, y_1), \dots, (x_{n-1}, y_{n-1}). (Note these have to be integers, which enforces the constraint that all edge lengths are integers.) Our job is to now specify some constraints on the x_i and y_i which encode all the rules for a valid orthogon.

We might as well start by assuming the first edge, from (x_0, y_0) to (x_1, y_1), travels in the positive x direction. We can encode this with two constraints:

  • y_1 = y_0
  • x_1 > x_0

y_1 = y_0 means the edge is horizontal; x_1 > x_0 expresses the constraint that the second endpoint is to the right of the first (and since x_0 and x_1 are integers, the edge must be at least 1 unit long).

We then look at the first corner to see whether it is convex or concave, and turn appropriately. Let’s assume we are traveling counterclockwise around the orthogon; hence the first corner being convex means we turn left. That means the next edge travels “north”, i.e. in the positive y direction:

  • x_2 = x_1
  • y_2 > y_1

And so on. We go through each edge (including the last one from (x_{n-1}, y_{n-1}) back to (x_0, y_0)), keeping track of which direction we’re facing, and generate two constraints for each.
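To make the "keeping track of which direction we're facing" part concrete, here's a small sketch in Python (rather than the actual SMT encoding; the names are my own, and I assume the i-th character of the orthobrace gives the turn at the end of the i-th edge):

```python
def edge_directions(orthobrace):
    """Direction of travel along each edge when traversing the
    orthogon counterclockwise: 'X' (convex) turns left, 'V'
    (concave) turns right.  Going left cycles E -> N -> W -> S."""
    dirs = ['E', 'N', 'W', 'S']           # counterclockwise order
    d = 0                                 # first edge travels east
    result = []
    for corner in orthobrace:
        result.append(dirs[d])
        d = (d + 1) % 4 if corner == 'X' else (d - 1) % 4
    return result

# a square: four convex corners, one edge in each direction
assert edge_directions("XXXX") == ['E', 'N', 'W', 'S']
```

Each direction then yields the two constraints described above: for example, an 'E' edge from (x_i, y_i) to (x_{i+1}, y_{i+1}) gives y_{i+1} = y_i and x_{i+1} > x_i.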

There’s another very important criterion we have to encode, namely, the requirement that no non-adjacent edges touch each other at all. We simply list all possible pairs of non-adjacent edges, and for each pair we encode the constraint that they do not touch each other. I will leave this as a puzzle for you (I will reveal the solution next time): given two edges (x_{11},y_{11})\text{---}(x_{12},y_{12}) and (x_{21},y_{21})\text{---}(x_{22},y_{22}), how can we logically encode the constraint that the edges do not touch at all? You are allowed to use “and” and “or”, addition, comparisons like >, <, \geq or \leq, and the functions \mathrm{max} and \mathrm{min} (which take two integers and tell you which is bigger or smaller, respectively). This is tricky in the general case, but made much easier here by the assumption that the edges are always either horizontal or vertical. It turns out it is possible without even using multiplication.

Finally, we write down an expression giving the perimeter of the orthogon (how?) and ask an SMT solver to optimize it subject to the constraints. I used the Z3 solver via the sbv Haskell library. It outputs specific values for all the x_i and y_i which satisfy the constraints and give a minimum perimeter; from there it’s a simple matter to draw them by connecting the points.

To see this in action, just for fun, let’s turn the orthobrace XVXXVXVVXVVXXVVXVVXXXXXXVVXXVX (which I made up randomly) into a drawing. This has 30 vertices (and 30 edges); hence we are asking the solver to find values for 60 variables. We give it two constraints per edge plus one (more complex) constraint for every pair of non-touching edges, of which there are 405 (each of the 30 edges is non-adjacent to 27 others, so 30 \times 27 = 810 counts each pair of edges twice); that’s a total of 465 constraints. Z3 is able to solve this in 15 seconds or so, yielding this masterpiece (somehow it kind of reminds me of Space Invaders):

Posted in computation, geometry | 1 Comment

Chromatic number of the plane roundup

I’ve had fun writing about the Hadwiger-Nelson problem to determine the chromatic number of the plane, but I think this will be my last post on the topic for now!

More 7-colorings

Of course, the original point of the hexagonal 7-coloring in my last two posts is that it establishes an upper bound of CNP \leq 7 (although it turns out it’s also just a really cool pattern). Again, there is a balancing act here: we have to give each hexagon a diameter of < 1, so no two points in the same hexagon will be 1 unit apart; but we also have to make the hexagons big enough that same-colored hexagons are more than 1 unit from each other. This is indeed possible, since same-colored hexagons always have two “layers” of other hexagons in between them. Denis Moskowitz made a really nice graphic illustrating this:

In a comment on my previous post, Will Orrick pointed out that if you tile 3-dimensional space with cubes and color them with seven colors so that each cube is touching six others with all different colors, then take a diagonal slice through that space, you get this!

This is the same as the 7-colored hexagonal tiling I showed before, but with extra triangles in between the hexagons (and the colors of the triangles follow a pattern similar to the hexagons). I could stare at this all day! Here’s a version with numbers if you find it helpful. (If you support me on Patreon you can get automatic access to bigger versions of all the images I post—though to be honest if there’s a particular image you want a bigger version of, you can just ask nicely!)

What is CNP?

So, we know CNP is either 5, 6, or 7. So which is it? No one is really sure. With some unsolved problems, there is widespread agreement in the mathematical community on what the right answer “should be”, it’s just that no one has managed to prove it. That isn’t the case here at all. If you ask different mathematicians you will probably get different opinions on which number is correct. Some mathematicians even think the “right” answer might depend on which axioms we choose as a foundation of mathematics!—in particular that the answer might change depending on whether you allow the axiom of choice (a topic for another post, perhaps).

Posted in geometry | 7 Comments