Orthogons and orthobraces

One of these days soon I will get back to writing about primality tests, but for now I am having fun getting sidetracked on orthogons!

In a previous post I gave rules for when two orthogons will be considered the same:

  • Two orthogons are the same if one is the mirror image of the other.
  • Two orthogons are the same if one can be turned into the other just by changing the length of some of the edges.

I just realized that I forgot a rule!

  • Two orthogons are the same if one is a rotation of the other.

You might not have even noticed this omission, because it would be strange not to have this rule: usually we think of two polygons as being the same if one is just a rotation of the other. And if rotating an orthogon were to result in a distinct orthogon, then there would be way too many distinct orthogons (uncountably infinitely many, in fact).

So, for example, these four orthogons are all the same:

Of course this notion of sameness should also be reflexive (any orthogon is the same as itself) and transitive (if A is the same as B and B is the same as C, then A is the same as C), so that it is an equivalence relation.

Now, in my previous post I finished the proof showing that orthogons are in 1-1 correspondence with sequences of X’s and V’s, containing exactly 4 more X’s than V’s, when the sequences are also considered the same up to rotation and reversal. Sequences considered the same up to rotation and reversal are sometimes called bracelets. They are very similar to the necklaces we saw in a recent proof of Fermat’s Little Theorem; the only difference is that two sequences which are the reverse of each other are considered distinct as necklaces, but the same as bracelets. Let’s call bracelets of X’s and V’s, with exactly four more X’s than V’s, orthobraces. (Not to be confused with orthodontics, which are also called braces, but are definitely not the same when rotated.)

So we have reduced the problem of listing all the orthogons with a given number of vertices to two subproblems:

  1. List all orthobraces of a given length.
  2. Turn each orthobrace into a drawing of an orthogon.

I want to start by talking about the first subproblem. To get a sense for why listing bracelets is an interesting problem, here are some questions you can ponder:

  • Start by considering normal sequences (so, e.g., XXXXXV, VXXXXX, and XXXVXX are all different sequences). How many sequences are there with four X’s and no Vs? How many with five X’s and one V? Six X’s and two V’s? In general, how many sequences are there with n+4 X’s and n V’s?

  • Try 1listing small orthobraces. How many do you get for each number of V’s? You can try to come up with a formula if you really want, but this actually turns out to be quite difficult; I suggest just trying to systematically list orthobraces and see how many you get. For example, there is only one orthobrace when v = 0, namely XXXX. There is also only one, XXXXXV, for v = 1. This is the only one since any sequence with five X’s and one V is just a rotation of this one. (The particular sequence we pick to represent the orthobrace doesn’t matter.)

  • Compare how many bracelets there are of each size with how many sequences there are. What seems to be the general trend?

  • An obvious algorithm for listing all bracelets of a given size is to list all the sequences (this is very easy), and throw away any sequences which are equivalent to one we’ve already generated when considered as bracelets (i.e. up to rotation and reversal). Based on your observations above, how feasible is this method?

Posted in combinatorics, geometry | Tagged , , , | 1 Comment

Properties of orthogons II

In my previous post I proved three out of the four properties of orthogons I originally stated. Now let’s prove the final property:

Every sequence of an even number of X’s and V’s, with exactly four more X’s than V’s, corresponds to the vertex sequence of some orthogonal polygon.

From the previous properties we know that any orthogon gives rise to such a vertex sequence; but how do we know the correspondence goes two ways? Could there be some vertex sequence that for some reason is not possible to realize geometrically?

Of course the answer is no—and here’s one proof. (There may well be better ones; I welcome your ideas! Commenter Gesh also gave a similar proof in the comments to my last post.) The proof is by induction on the number of V’s in the sequence.

First, if the sequence contains no occurrences of V, then it must be XXXX (since there are four more X’s than V’s), and we know this is the vertex sequence of a square.

So now suppose there is at least one V (and hence at least five X’s). There must be an occurrence of either VX or XV somewhere in the sequence. If we delete a VX or XV from the sequence, the resulting sequence still has four more X’s than V’s, and has one fewer V, so the induction hypothesis tells us that it is the vertex sequence of some orthogon. We can take this orthogon and turn it into an orthogon corresponding to the original sequence with the VX or XV re-inserted, by inserting a “step” into the appropriate edge, like so:

Since an orthogon is finite and all its edges must have some positive length, it is always possible to make this “step” small enough that it does not cause any problems with the rest of the orthogon. (For example, there might be another parallel edge very near to the edge in which we are inserting the step; we have to make the step small enough so that it does not intersect with this other edge.) Concretely, we could, for example, make the first step have height 1/2, the second one 1/4, and so on, with the nth step having height 1/2^n.

This proof not only shows that there must exist an orthogon corresponding to each valid vertex sequence, it actually gives an algorithm for constructing one. (There is only a small problem: the resulting drawings look terrible! But one thing at a time…)

As a concrete example, let’s consider the string XXXXXXVVXV.

  • Let’s remove the first XV, resulting in XXXXXVXV. We have to recursively produce a drawing of this orthogon before we can re-insert the XV.
    • We remove another XV, leaving XXXXXV.
      • Finally we remove the last XV, leaving XXXX. This is the base case, and corresponds to a square:

      • We re-insert XV to give XXXXXV. This corresponds to inserting a step into one of the edges; we’ll make it 1/2 unit tall, like so:

    • Re-inserting XV again yields XXXXXVXV, which corresponds to adding a step (of size 1/4) to the edge just before the convex vertex of the previously added step:

  • Finally, to re-insert the final XV, we add a size-1/8 step in the middle of the step we just added:

So, in summary, We now know there is a 1-1 correspondence between orthogons and sequences of V’s and X’s with four more X’s than V’s (where we also consider two sequences equivalent if one is a cyclic rotation of the other, or if one is the reverse of the other.) If we can find a way to enumerate such sequences, we can use them to generate all possible orthogons. I’ll talk about generating these sequences in an upcoming post. Also, as I mentioned before, although this algorithm proves that a drawing always exists for any suitable sequence of X’s and V’s, drawings actually produced with this algorithm do not look very good! The exponentially decreasing steps guarantee nothing will ever intersect, but after inserting five or six of them they basically get too small to see. Producing good-looking drawings turns out to be a very interesting challenge, which I will explain in another post.

Posted in combinatorics, geometry | Tagged , , , | 1 Comment

Properties of orthogons I

First things first: from now on, when talking about polygons with only right angles, instead of calling them “orthogonal polygons” I’m going to start calling them “orthogons”, which sounds cool, is much less clunky than “orthogonal polygons”, and doesn’t seem to already be in use.1 This name was recently suggested to me by Mark Dominus (who was the one who got me thinking about orthogons in the first place).

In my previous post, I posed several properties of orthogons and invited you to figure out why they are true. Here are my proofs.

  1. Every orthogon has an even number of vertices.

    Since every vertex is a right angle, the edges of an orthogon must alternate between horizontal and vertical. Hence there must be an even number of vertices; otherwise there would be two vertical edges or two horizontal edges meeting at a vertex, which does not make sense.

  2. Orthogons only have two kinds of vertices: “right turns” and “left turns” (let’s suppose we always travel clockwise around the polygon).

    Since writing this, I realized that it is probably easier to think of the vertices as either convex (the corner points towards the exterior of the polygon) or concave (the corner points towards the interior). I’ll use the letter X to stand for conveX vertices, and V for concaVe.

    This way, we don’t have to worry about whether we traverse the polygon in clockwise or counterclockwise order (and besides, it can be confusing to figure out what “clockwise” vs “counterclockwise” even means when the polygon doubles back on itself a lot, as in the example shown below).

    If you switch directions, left turns become right turns and vice versa. But describing vertices as convex or concave is independent of the polygon’s orientation. You can verify for yourself that the orthogon shown above has the vertex sequence V^9 X^{13}, that is, nine concave vertices in a row followed by 13 convex vertices.

    In any case, the reason there are only two types of vertices is simple: if you are travelling along an edge and come to a right-angle vertex, there are only two choices: turn 90 degrees to the left, or to the right.

  3. Every (closed, non-self-intersecting) orthogon has four more right turns than left turns.

    Again, the way I originally stated this, with left and right turns, is annoying: if you travel around an orthogon in the opposite direction, then it instead has four more left turns than right turns. We can state this in a much nicer way which is independent of direction: an orthgon always has four more convex vertices than concave.

    Obviously this is true for a square, the one possible orthogon with four vertices: there are four convex vertices and no concave vertices. We can see that it also holds for the example orthogon shown above, with nine concave vertices and thirteen convex. Intuitively, the reason it is true in general is that every concave vertex is a turn in “the wrong direction” which needs to be “cancelled” by a convex vertex; on top of that, there need to be four convex vertices to make the whole thing “close up”.

    Let’s see how to prove this more formally. We start from the fact that if you sum the internal angles of any polygon with n vertices, you get \pi(n-2) (that is, n-2 times 180 degrees). When n = 3 this is just the familiar fact that the angles of a triangle sum to \pi. But then for bigger n, we can always cut up an n-gon into n-2 triangles, and adding up the angles of all the triangles is the same as adding up all the internal angles of the whole polygon.

    Now, suppose an orthogon has x convex vertices and v concave vertices, and hence x + v vertices in total. Convex vertices have an internal angle of \pi/2, and concave have an internal angle of 3\pi/2. Thus,

    \displaystyle x(\pi/2) + v(3\pi/2) = \pi(x+v-2)

    If we divide both sides by \pi and multiply by 2, we get

    \displaystyle x + 3v = 2x+2v-4

    and then a bit of algebraic rearranging yields x - v = 4.

    Incidentally, this now gives us another way to see that there must be an even number of vertices: x + v = x + (x - 4) = 2x - 4, which is clearly even.

The final property I stated—namely, that every viable sequence of V’s and X’s describes some orthogon—will have to wait for another post. I’ll give you a hint: the best way to prove this is to give an algorithm that takes a sequence of V’s and X’s as input and produces an actual drawing of an orthogon as output. (But for now it doesn’t matter if the drawing is horrible, just whether one exists!)

  1. Wiktionary suggests that “orthogon” can refer to a right triangle or to a “rectangular figure” but I have never heard it used to refer to those things. Perhaps it is more common in other English-speaking countries?

Posted in combinatorics, geometry, proof | Tagged , , , , , , , | 8 Comments

Orthogonal polygons

It’s time to say more about PWW #21, in which I exhibited things like this:

Quite a few commenters figured out what was going on, and mentioned several nice (equivalent) ways to think about it. Primarily, the idea is to draw all possible orthogonal polygons, that is, polygons with only right angles, organized by the total number of vertices. (So, for example, the picture above shows all orthogonal polygons with exactly ten vertices.) However, we have to be careful what we mean by the phrase “all possible”: there would be an infinite number of such polygons if we think about things like the precise lengths of edges. So we have to say when two polygons are considered the same, and when they are distinct. My rules are as follows:

  • Two polygons are the same if one is the mirror image of the other.
  • Two polygons are the same if one can be turned into the other just by changing the length of some of the edges.

So, for example, I will consider these three polygons are all the same:

(It’s easy to see why the first two are the same. Can you see why the other one is the same too?)

I want to explain some of the mathematics behind generating these. In order to get there, I will start by stating some propositions. Can you see why each of these statements is true?

  1. Every orthogonal polygon has an even number of vertices.
  2. Orthongal polygons only have two kinds of vertices: “right turns” and “left turns” (let’s suppose we always travel clockwise around the polygon).
  3. Every (closed, non-self-intersecting) orthogonal polygon has four more right turns than left turns.
  4. Every sequence of an even number of R’s and L’s, with exactly four more R’s than L’s, corresponds to the vertex sequence of some orthogonal polygon.

The fourth statement may seem trivial but it is worth a bit of thought: how do you know that you can always draw a non-self-intersecting orthogonal polygon for any valid sequence of left and right turns?

Posted in combinatorics, geometry | Tagged , , | 5 Comments

Fast and slow machines

In my previous post, I presented three hypothetical machines which take a positive integer n as input and give us something else as output:

  1. a factorization machine gives us the complete prime factorization of n;
  2. a factor machine gives us one prime factor of n;
  3. a primality machine just tells us whether or not n is prime (that is, it gives us an answer of either true or false).

I mentioned that we know how to make primality machines that are much faster than factorization machines, or even than factor machines. Before I finally get around to explaining how to build such fast primality machines, it’s worth explaining what I mean when I talk about a machine being “fast” or “slow”.

Of course this whole time when I have been talking about machines, what I really have in mind are algorithms, i.e. specific sets of steps to take some given input and produce a desired output. In other words, computer programs. (It’s a fascinating and wonderful fact that computers are universal machines, in the sense that they can simulate any other machine; instead of having one special machine to write email, and another special machine to read Wikipedia, and yet another special machine to find prime numbers, you can just have one single, general-purpose machine that can do all of these things. But this is a subject for another blog post!)

The obvious, naive way to create a factor machine is as follows:

  • For each positive integer d from 2 up to n-1:
    • Try dividing n by d.
    • If d evenly divides n, that is, the remainder is 0, then stop and output d.
  • If every d from 2 to n-1 is tried without finding one that evenly divides n, then n must be prime, so output n.

This is called trial division. (Can you see why it will always return a prime divisor of n?) There are several ways this can be optimized, but we’ll get back to those later. For now let’s think about how long this takes.

When measuring how long an algorithm takes, it is useless to measure the actual running time, in seconds, on some particular computer. For one thing, the amount of time the algorithm takes to run is highly dependent on the particular computer executing it, what other programs are running on the same computer, the weather, time of day, etc., so it is quite difficult to compare to other algorithms. For another thing, the algorithm may take different amounts of time for different inputs. When mathematically analyzing how long an algorithm takes, what we really care about is how the running time scales as a function of the size of the input. This tells us something about how big the inputs can get before the algorithm takes an infeasibly long time to run.

So, what about trial division? The input is an integer n; the size of the input is the number of digits needed to write n. (It is more typical to measure the size in terms of the number of bits needed to write it in binary, but it turns out that the base doesn’t really matter, so we’ll stick to base 10 for now.) Let m be the number of digits needed to write n, so m \approx \log_{10} n, and n \approx 10^m. How many steps does the algorithm take to run? We do one division operation for each d from 2 up to n-1, so it essentially takes time proportional to n. Since n \approx 10^m, this means that the time needed to run scales exponentially with the size of the input: every time we add one more digit to the input, we multiply the time needed by a factor of 10.

At this point you may protest that the version of trial division I gave above is hopelessly naive; and indeed, we can optimize it. For example, we don’t have to try every d: if we try dividing by 2 first, after that point we can just try dividing by all the odd numbers up to n, since if 2 doesn’t divide n then no other even number can either. This essentially halves the necessarily running time; but 10^m/2 is still proportional to 10^m.

A more important optimization is to only test values of d up to \sqrt{n}, instead of going all the way up to n. If n is the product of two divisors, n = xy, then one of them (say, x) must be \leq \sqrt{n}, and the other must be \geq \sqrt n. So if we haven’t found any divisors by the time we get to \sqrt n then we certainly aren’t going to find any above \sqrt n.

So now the algorithm only takes time proportional to \sqrt n, which seems like it is a big improvement. To be sure, it is an improvement; but if we relate it back to the number of digits m, we find \sqrt{n} \approx \sqrt{10^m} = (\sqrt{10})^m \approx 3.16^m, so the running time still grows exponentially in the number of digits m, just with a smaller base.

In general, exponential algorithms are fairly useless—for even modest-sized inputs, the running time can be astronomical. For example, suppose an algorithm takes exactly 3.16^m seconds to run on an input of size m. When m = 1 it takes 3.16 seconds; when m = 2, 10 seconds. So far, so good. For m = 4 it takes a couple minutes (100 seconds); for m = 6, it takes 15 minutes; for m = 8, a few hours; for m = 10, a whole day. By the time we get up to an input of size 20, the algorithm takes over 300 years. For an input of size 30, it takes over 30 million years. We only have to get up to size 35 or so before the algorithm would take longer than the estimated age of the known universe.

In some sense, we don’t know how to do better than this for factor machines and factorization machines. There are factorization algorithms which are indeed much faster than trial division (with fancy names like the Generalized Number Field Sieve). Using such algorithms it’s feasible to factor numbers with up to hundreds of digits instead of just a few—but nevertheless, the running time of these algorithms still scales essentially exponentially with the size of the input. (There is actually quite a lot of technical detail I am sweeping under the rug here, but I don’t really want to get into it so I hope you will forgive me.)

Primality testing, on the other hand, is a completely different story! Trial division is the only algorithm most people know for primality testing, but over the next few posts I will explain a few other algorithms (and prove that they work!) which are much faster.

Posted in computation, number theory, primes | Tagged , , , , , , , , , , | 1 Comment

New Mersenne prime

With impeccable timing, just in the middle of my series about primality testing, a new Mersenne prime has been announced, a little under two years after the previous one. In particular, it has been shown that 2^{77,232,917}-1 is prime; this is now the largest known prime number.

If you want to understand what computers are actually doing when they check a Mersenne number for primality, I wrote a whole series about it two years ago: visit this list of my post series and search for “Lucas-Lehmer”.

Posted in number theory, primes | Tagged , , , , | Leave a comment

A tale of three machines

The Fundamental Theorem of Arithmetic tells us that any positive integer can be factored into a product of prime factors.1 Given a positive integer n, this leads naturally to several questions:

  1. What is the prime factorization of n?

This is the most obvious question we could ask. The Fundamental Theorem tells us such a prime factorization must exist (and must be unique up to the order of the factors), so we can simply ask what it is. For example, if n = 101500, then the answer would be n = 2^2 \cdot 5^3 \cdot 7 \cdot 29. What we really want is some sort of machine (let’s call it a factorization machine) where we can put a positive integer in one end, the machine makes whirring, griding, and beeping sounds for a while, and then a factorization pops out the other end:

Another question we could ask about n would be:

  1. What is one prime factor of n?

An answer to this question is a similar sort of machine, except instead of getting the entire prime factorization out, we just get one prime factor; let’s call this a factor machine. For example, it might work like this:

These first two machines are very closely related. If we have a factorization machine we can easily use it to build a factor machine: just run n through the factorization machine, pick one of the factors to return, and throw the rest away.

Slightly less obviously, we can also use a factor machine to build a factorization machine: run n through the factor machine to get a prime factor p_0. Now we can compute n_1 = n/p_0 (by definition, p_0 must evenly divide n); to get the rest of the factorization of n we just need to factor n_1. So we run n_1 through the factor machine to get a prime factor p_1; we compute n_2 = n_1/p_1; and so on. We are done when putting n_i into the factor machine returns n_i itself; that means n_i = p_i is prime. Finally, we return the factorization p_0 \cdot p_1 \cdot \dots \cdot p_i.

For example, if we run 101500 through the factor machine and get 7, then we compute 101500/7 = 14500. Next we run 14500 through the factor machine again and get, say, 5, and then compute the remaining part that needs to be factored, 14500/5 = 2900; and so on.

Finally, here’s a third question we could ask:

  1. Is n prime?

In answer to this we could imagine another similar machine which outputs a simple yes/no answer. Let’s call this a primality machine:

Given a factor machine, we can easily use it to build a primality machine: take the input n and run it through the factor machine. If the answer is n, then n is prime so output “YES”; otherwise, n is not prime (since it is divisible by the output factor), so output “NO”.

But what about the converse? If we have a working primality machine, can we use it to build a factor machine? Given some n, if the primality machine says YES then n is prime, so the factor machine should output n. But what if the primality machine says NO? On the face of it, the mere fact of knowing that n is composite does not help us find a factor. But what is the primality machine doing on the inside? If it knows that n is not prime, it must have figured out a way to factor n, right? That is, somewhere in the internals of the machine, there must have been a factor that it simply isn’t telling us about:

If this is true, it means we could rip open the guts of the primality machine and use it to build a factor machine. Or, put more simply, this would be saying that the only way to make a primality machine is to build it out of a factor machine. This seems intuitively reasonable: how could you know that n is composite without factoring it?

The punchline is that this is not true!! That is,

It is possible to make working primality machines that truly do not know anything about any factors of n.

Not only that, but we know how to make primality machines that run much faster than the fastest known factor machines!

Let that sink in for a minute. It is really quite surprising that we can find out whether n has any prime divisors without actually finding out what they are, and that moreover it is much faster if you don’t care what the prime factors are. Think of it as a sort of “computational loophole” in our universe.

This is no mere curiosity; it turns out that this “loophole” is actually the basis of almost all of modern cryptography. When your browser makes an encrypted connection to a website in order to securely transmit your credit card information, it is relying on this loophole: your browser needs to pick some numbers and make sure that they are prime, which it can do quickly; but for anyone intercepting the messages to steal your credit card, they would have to factor another number, which (as far as we know) cannot be done quickly. (This is really cool and a topic for another post.)

It gets stranger, though: you may have noticed that I keep using phrases such as “fastest known”, “seems”, and “as far as we know”. It turns out that no one has been able to prove that making faster factor machines can’t be done! So it’s possible that the “loophole” isn’t a loophole at all; perhaps with a clever enough idea it will turn out to be possible to factor numbers just as fast as we can test whether they are prime. But most people don’t believe that.

In some upcoming posts I plan to talk a bit more about what we actually mean when we are talking about these machines being “fast” or “slow”, and then finally get around to explaining some different ways of building fast primality machines.

  1. Possibly a product of no prime factors, which accounts for 1.

Posted in computation, number theory, primes | Tagged , , , , | 11 Comments