The Recamán sequence

I recently learned about a really interesting sequence of integers, called the Recamán sequence (it’s sequence A005132 in the Online Encyclopedia of Integer Sequences). It is very simple to define, but the resulting complexity shows how powerful self-reference is (for both good and evil). Here’s the definition. The first term of the sequence is a_0 = 0, and each term a_n differs from a_{n-1} by n. Now, if a_n were always just n more than a_{n-1}, we would have the triangular numbers:

\displaystyle 0, 1, 3, 6, 10, 15, 21, \dots

Note how a_1 = 1 is one more than a_0 = 0; a_2 = 3 is 2 more than a_1; a_3 = 6 is 3 more than a_2; and so on.

But that’s not quite how the Recamán sequence is defined. In the Recamán sequence, a_n “wants” to be n less than a_{n-1}: it will be so if a_{n-1} - n is nonnegative and has not already appeared in the sequence. Otherwise, it “settles” for being n more than a_{n-1} (as with the triangular numbers).

So a_0 = 0. Then:

  • a_1 has to differ from a_0 by 1—that is, it must be -1 or 1. But it can’t be negative, so 1 it is.
  • How about a_2? It must be 2 away from a_1 = 1, so either -1 or 3—but again, it can’t be negative, so it is 3.
  • Now, a_3 must be 3 away from a_2, so it must be 0 or 6. 0 is nonnegative, but it has already appeared in the sequence as a_0, so we choose 6.
  • So far, this is looking a lot like the triangular numbers! But let’s see what happens with a_4. It must be 6 \pm 4, that is, either 2 or 10. And here something different happens: 2 is positive and has not appeared yet in the sequence, so a_4 = 2.

Continuing this pattern, we get

\displaystyle 0, 1, 3, 6, 2, 7, 13, 20, 12, 21, 11, 22, 10, 23, 9, 24, 8, 25, 43, \dots

From the definition, you might initially think that no number will ever be repeated: we explicitly avoid picking numbers that have already occurred in the sequence, right? Well, we don’t pick a_{n-1} - n if it has already occurred, but in that case we definitely pick a_{n-1} + n whether it has already occurred or not—and in fact sometimes it has. You don’t have to continue the sequence very much farther before you find, for example, a_{20} = a_{24} = 42.

Here’s a scatterplot of the first 100 terms, with the y-axis scaled by 1/2 to make it easier to see:

From the graph we can see that the numbers tend to form long parallel alternating runs where the top numbers are increasing by 1 and the bottom decreasing. For example, we can see the first of these starting at 13:

\displaystyle 13, 20, 12, 21, 11, 22, 10, 23, 9, 24, 8, 25, \dots

This makes sense: we are alternately adding n, then subtracting n+1, then adding n+2, then subtracting n+3, and so on. The run will be broken when we hit a number that has already occurred: in this case, the number after 25 is not 7, because that has already occurred, so instead it jumps up again to 43. After 43, it jumps up one more time (since 24 was already in the sequence) and we get a very short parallel sequence before it falls back down and starts another alternating sequence 41, 18, 42, 17, 43, 16, \dots.

So far, this looks regular-ish—one might hope to discover some regular patterns that could, for example, lead to a closed formula. But our hopes are dashed when we look further out in the sequence:

It doesn’t look very regular-ish anymore! And this is what I find fascinating about the Recamán sequence: the self-reference in its definition (we choose a_{n-1} - n only when it has not already appeared) throws a giant monkey wrench of chaos into the works, so that it is very difficult to find any patterns or prove anything definite about it, even though it is still highly structured. To see what I mean, here’s a graph of the first 5000 terms:

There is most definitely structure—for example, all the terms seem to fall along these curving “bands” radiating out from the origin. I imagine one might even be able to say something approximate about the shape of those curves. But there is still obviously lots of chaos—it’s hard to discern any regular patterns.

So we know that the Recamán sequence is not 1-1: some numbers can appear multiple times. But is it onto? That is, does every number appear at least once, somewhere in the sequence? It is conjectured that this is true, but no one knows—it is an open question! According to the OEIS entry, after looking at the first 10^{15} terms, the smallest number that is still missing is 852655; every number smaller than that has appeared somewhere in the first 10^{15} terms of the sequence. Crazily though, 852655 is still missing after looking at the first 10^{230} terms!!! (That’s a staggeringly large number of terms; it is far more than the estimated number of atoms in the universe.) So perhaps the Recamán sequence is not onto after all—is there something special about 852655 so that it never apears? Or does it eventually appear very very far into the sequence? No one knows, and to be honest, I think the latter actually seems more likely. But it just underscores how difficult it would be to prove this.

About Brent

Associate Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.
This entry was posted in arithmetic, recursion, sequences and tagged , , , , . Bookmark the permalink.

5 Responses to The Recamán sequence

  1. Julian R. says:

    Wow!

    How is it possible to calculate 10^230 terms?
    I assume the self reference makes it so, that time taken to calculate more terms is at least linear in the number of terms. And 10^230 seems just way too large.

    (I know nothing about Numerical methods or scientific computing)

  2. Brent says:

    No, there is no possible way to actually calculate 10^{230} terms! Even if you could calculate 10^{15} terms per second (which is definitely an upper bound given today’s fastest supercomputers), it would still take over 10^{200} years. I do not actually know how the computation was done; I got this information from a comment on the OEIS entry. But I assume that it was making use of the structure of the sequence to skip a lot of work. For example, consider the sequence starting at a_6 = 13. At this point we know the sequence will bounce up and down in two parallel trajectories until the bottom trajectory reaches 8, since 7 is the largest number we have seen so far. So we can jump straight ahead to a_{18} = 43 without explicitly computing all the numbers in between—we know exactly what is going to happen. You can probably do much more sophisticated things to skip even more work.

  3. Laura Spencer says:

    It looks like an isosceles right triangle, until I remembered you changed the scale.

  4. Jan Van lent says:

    It is a fun exercise to implement this using a data structure that stores ranges of consecutive integers using just the start and end values. This is like run-length compression on the differences. It seems that that for n terms of the sequence, the number of ranges, i.e., the number of “gaps”, behaves as const*sqrt(n).

  5. ignotus2013 says:

    Thank you Brent for your blog about my sequence. It is my tiny contribution to the formidable building of mathematics. Your exposition of the sequence is clearer than many I have seen. I too believe (and hope) 852655 is somewhere in the sequence, and so every other number.

Comments are closed.