I recently learned about a really interesting sequence of integers, called the *Recamán sequence* (it’s sequence A005132 in the Online Encyclopedia of Integer Sequences). It is very simple to define, but the resulting complexity shows how powerful self-reference is (for both good and evil). Here’s the definition. The first term of the sequence, $R_0$, is $0$, and each term $R_n$ differs from $R_{n-1}$ by $n$. Now, if $R_n$ were always just $n$ *more* than $R_{n-1}$, we would have the triangular numbers:

$$0, 1, 3, 6, 10, 15, 21, \dots$$

Note how $1$ is one more than $0$; $3$ is $2$ more than $1$; $6$ is $3$ more than $3$; and so on.

But that’s not quite how the Recamán sequence is defined. In the Recamán sequence, $R_n$ “wants” to be $n$ *less* than $R_{n-1}$: it will be so if $R_{n-1} - n$ is nonnegative and *has not already appeared in the sequence*. Otherwise, it “settles” for being $n$ *more* than $R_{n-1}$ (as with the triangular numbers).
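
Written as a recurrence, with $R_n$ denoting the $n$th term (counting from $R_0$), the rule above reads roughly as follows:

$$R_0 = 0, \qquad R_n = \begin{cases} R_{n-1} - n & \text{if } R_{n-1} - n \ge 0 \text{ and } R_{n-1} - n \notin \{R_0, \dots, R_{n-1}\}, \\ R_{n-1} + n & \text{otherwise.} \end{cases}$$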

So $R_0 = 0$. Then:

- $R_1$ has to differ from $R_0$ by $1$; that is, it must be $-1$ or $1$. But it can’t be negative, so $1$ it is.
- How about $R_2$? It must be $2$ away from $1$, so either $-1$ or $3$; but again, it can’t be negative, so it is $3$.
- Now, $R_3$ must be $3$ away from $3$, so it must be $0$ or $6$. $0$ is nonnegative, but it has already appeared in the sequence as $R_0$, so we choose $6$.
- So far, this is looking a lot like the triangular numbers! But let’s see what happens with $R_4$. It must be $6 \pm 4$, that is, either $2$ or $10$. And here something different happens: $2$ is positive and has not appeared yet in the sequence, so $R_4 = 2$.
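
The rule walked through above translates almost directly into code. Here is a minimal Python sketch; the function name `recaman` and the set called `seen` are just illustrative choices, not anything canonical.

```python
from itertools import islice
from typing import Iterator


def recaman() -> Iterator[int]:
    """Yield the terms of the Recamán sequence, starting from the 0th term."""
    seen = {0}      # every value that has appeared so far
    current = 0     # the most recent term
    yield current
    n = 1
    while True:
        down = current - n
        # Prefer stepping down by n; otherwise settle for stepping up by n.
        if down >= 0 and down not in seen:
            current = down
        else:
            current = current + n
        seen.add(current)
        yield current
        n += 1


first_terms = list(islice(recaman(), 5))
print(first_terms)  # [0, 1, 3, 6, 2]
```

Using a set for `seen` keeps the "has this value appeared before?" check cheap, at the cost of remembering every value produced so far.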

Continuing this pattern, we get

$$0, 1, 3, 6, 2, 7, 13, 20, 12, 21, 11, 22, 10, 23, 9, 24, 8, 25, 43, 62, \dots$$

From the definition, you might initially think that no number will ever be repeated: we explicitly avoid picking numbers that have already occurred in the sequence, right? Well, we don’t pick $R_{n-1} - n$ if it has already occurred, but in that case we definitely pick $R_{n-1} + n$ whether it has already occurred or not, and in fact sometimes it has. You don’t have to continue the sequence very much farther before you find, for example, $R_{20} = R_{24} = 42$.
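
To check the claim about repeats, one can search for the first time an "up" step lands on a value that has already been used. This is only a sketch: the function name `first_repeat` and the dictionary `first_index` are made up for this example.

```python
from typing import Dict, Tuple


def first_repeat() -> Tuple[int, int, int]:
    """Return (i, j, value): the first index j whose value already occurred at index i < j."""
    first_index: Dict[int, int] = {0: 0}   # value -> index where it first appeared
    current = 0
    n = 1
    while True:
        down = current - n
        if down >= 0 and down not in first_index:
            current = down                 # a down step is new by construction
        else:
            current = current + n          # an up step is taken even if the value is old
            if current in first_index:
                return first_index[current], n, current
        first_index[current] = n
        n += 1


i, j, value = first_repeat()
print(i, j, value)  # 20 24 42
```

Only the "up" branch needs the repeat check, since a "down" step is taken only when its target has not appeared before.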

Here’s a scatterplot of the first terms, with the -axis scaled by 1/2 to make it easier to see:

From the graph we can see that the numbers tend to form long parallel alternating runs where the top numbers are increasing by $1$ and the bottom numbers are decreasing by $1$. For example, we can see the first of these starting at $R_7 = 20$:

$$20, 12, 21, 11, 22, 10, 23, 9, 24, 8, 25$$

This makes sense: we are alternately adding $n$, then subtracting $n+1$, then adding $n+2$, then subtracting $n+3$, and so on, so each term is $1$ away from the term two steps before it. The run will be broken when we hit a number that has already occurred: in this case, the number after $25$ is not $7$, because that has already occurred, so instead it jumps up again to $43$. After $43$, it jumps up one more time (since $24$ was already in the sequence) and we get a very short parallel sequence $62, 42, 63, 41$ before it falls back down and starts another alternating sequence $18, 42, 17, 43, \dots$.
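
One way to see these alternating runs without a plot is to list the signed steps between consecutive terms. The sketch below does that for the first 25 terms; the helper name `recaman_terms` is an invention for this example.

```python
from typing import List


def recaman_terms(count: int) -> List[int]:
    """Return the first `count` terms of the Recamán sequence as a list."""
    if count <= 0:
        return []
    terms = [0]
    seen = {0}
    for n in range(1, count):
        down = terms[-1] - n
        nxt = down if down >= 0 and down not in seen else terms[-1] + n
        seen.add(nxt)
        terms.append(nxt)
    return terms


terms = recaman_terms(25)
# steps[k] is the signed difference between term k+1 and term k.
steps = [b - a for a, b in zip(terms, terms[1:])]
print(steps[7:17])  # [-8, 9, -10, 11, -12, 13, -14, 15, -16, 17]
```

The strictly alternating signs in that slice are the run from $20$ down to $8$ and back up to $25$; two consecutive positive steps mark the places where a run breaks.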

So far, this looks regular-ish—one might hope to discover some regular patterns that could, for example, lead to a closed formula. But our hopes are dashed when we look further out in the sequence:

It doesn’t look very regular-ish anymore! And this is what I find fascinating about the Recamán sequence: the *self-reference* in its definition (we choose $R_{n-1} - n$ only when it has not already appeared) throws a giant monkey wrench of chaos into the works, so that it is very difficult to find any patterns or prove anything definite about it, even though it is still highly structured. To see what I mean, here’s a graph of the first 5000 terms:

There is most definitely structure—for example, all the terms seem to fall along these curving “bands” radiating out from the origin. I imagine one might even be able to say something approximate about the shape of those curves. But there is still obviously lots of chaos—it’s hard to discern any regular patterns.

So we know that the Recamán sequence is not 1-1: some numbers can appear multiple times. But is it *onto*? That is, does every nonnegative integer appear *at least once*, somewhere in the sequence? It is *conjectured* that this is true, but *no one knows*; it is an open question! According to the OEIS entry, after looking at the first $10^{15}$ terms, the smallest number that is still missing is $852655$; every number smaller than that has appeared somewhere in the first $10^{15}$ terms of the sequence. Crazily though, $852655$ is *still* missing after looking at the first $10^{230}$ terms!!! (That’s a staggeringly large number of terms; it is *far* more than the estimated number of atoms in the universe.) So perhaps the Recamán sequence is not onto after all: is there something special about $852655$ so that it *never* appears? Or does it eventually appear *very very far* into the sequence? No one knows, and to be honest, I think the latter actually seems more likely. But it just underscores how difficult it would be to prove this.
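
On a vastly smaller scale, the "smallest missing number" question is easy to explore by brute force. The sketch below is only that, a direct term-by-term search with a made-up function name `smallest_missing`; it is nowhere near able to reach the term counts quoted above.

```python
def smallest_missing(num_terms: int) -> int:
    """Smallest nonnegative integer absent from the first `num_terms` terms."""
    seen = set()
    current = 0
    for n in range(num_terms):
        if n > 0:
            down = current - n
            current = down if down >= 0 and down not in seen else current + n
        seen.add(current)
    # Walk upward from 0 until we find a value that never showed up.
    candidate = 0
    while candidate in seen:
        candidate += 1
    return candidate


print(smallest_missing(25))  # 4
```

Both memory and time grow with the number of terms here, which is exactly why this direct approach cannot get anywhere near the numbers above.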

Wow!

How is it possible to calculate 10^230 terms?

I assume the self-reference means that the time taken to calculate more terms is at least linear in the number of terms. And 10^230 seems just way too large.

(I know nothing about Numerical methods or scientific computing)

No, there is no possible way to actually calculate $10^{230}$ terms one by one! Even if you could calculate terms per second (which is definitely an upper bound given today’s fastest supercomputers), it would still take over years. I do not actually know how the computation was done; I got this information from a comment on the OEIS entry. But I assume that it was making use of the structure of the sequence to skip a lot of work. For example, consider the sequence starting at $R_7 = 20$. At this point we know the sequence will bounce up and down in two parallel trajectories until the bottom trajectory reaches $8$, since $7$ is the largest number below the run that has already appeared. So we can jump straight ahead to $R_{16} = 8$ without explicitly computing all the numbers in between: we know exactly what is going to happen. You can probably do much more sophisticated things to skip even more work.
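
For a sense of scale in the brute-force estimate, here is the arithmetic under an assumed rate of $10^{18}$ terms per second (an illustrative, roughly exascale figure chosen for this example) and about $3.16 \times 10^{7}$ seconds per year:

$$\frac{10^{230}\ \text{terms}}{10^{18}\ \text{terms/s}} = 10^{212}\ \text{s} \approx \frac{10^{212}}{3.16 \times 10^{7}}\ \text{years} \approx 3 \times 10^{204}\ \text{years}.$$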

It looks like an isosceles right triangle, until I remembered you changed the scale.

It is a fun exercise to implement this using a data structure that stores ranges of consecutive integers using just the start and end values. This is like run-length compression on the differences. It seems that for n terms of the sequence, the number of ranges, i.e., the number of “gaps”, behaves as const*sqrt(n).
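
A minimal sketch of such a range-based structure might look like the following. It is one possible design, not necessarily the one the comment above has in mind: the names `RangeSet` and `count_ranges` are hypothetical, and the ranges are kept in two parallel sorted lists searched with `bisect`.

```python
from bisect import bisect_right
from typing import List


class RangeSet:
    """Set of integers stored as sorted, disjoint, non-adjacent inclusive ranges."""

    def __init__(self) -> None:
        self.starts: List[int] = []
        self.ends: List[int] = []

    def __contains__(self, x: int) -> bool:
        i = bisect_right(self.starts, x) - 1    # last range starting at or before x
        return i >= 0 and x <= self.ends[i]

    def add(self, x: int) -> None:
        if x in self:
            return
        i = bisect_right(self.starts, x)        # first range starting after x
        joins_left = i > 0 and self.ends[i - 1] == x - 1
        joins_right = i < len(self.starts) and self.starts[i] == x + 1
        if joins_left and joins_right:          # x bridges two ranges: merge them
            self.ends[i - 1] = self.ends[i]
            del self.starts[i], self.ends[i]
        elif joins_left:
            self.ends[i - 1] = x
        elif joins_right:
            self.starts[i] = x
        else:                                   # x starts a new one-element range
            self.starts.insert(i, x)
            self.ends.insert(i, x)

    def __len__(self) -> int:
        return len(self.starts)                 # number of ranges, not of elements


def count_ranges(num_terms: int) -> int:
    """Number of ranges needed to store the first `num_terms` Recamán terms."""
    seen = RangeSet()
    seen.add(0)
    current = 0
    for n in range(1, num_terms):
        down = current - n
        current = down if down >= 0 and down not in seen else current + n
        seen.add(current)
    return len(seen)


print(count_ranges(25))  # 6
```

Printing `count_ranges(n)` for a few growing values of `n` is a quick way to eyeball the suggested square-root growth, though this sketch makes no claim about whether it holds.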

Thank you Brent for your blog about my sequence. It is my tiny contribution to the formidable building of mathematics. Your exposition of the sequence is clearer than many I have seen. I too believe (and hope) 852655 is somewhere in the sequence, and so every other number.