The birthday candle problem: solution

Recall the birthday candle problem I wrote about in a previous post:

A birthday cake has n lit candles. At each step you pick a number 1 \leq k \leq n uniformly at random and blow out k candles. If any candles remain lit, the process repeats (using a new value of n). As a function of n, what is the expected number of rounds needed to blow out all the candles?

Two commenters came up with the right answer; congratulations! Here’s my solution. Since it’s more fun this way, I’m going to explain not just the solution itself but also the process I went through in finding it.

First, let b_n denote the expected number of steps to blow out all the candles starting from n lit candles. As a base case, b_0 = 0, that is, you don’t need any steps if no candles are lit in the first place. Now, let’s think about an expression for b_n when n \geq 1. If you blow out some candles, leaving r, then you expect to need b_r more steps in addition to the one step you just took. Each r from 0 to n-1 is equally likely, so we can average all these cases to find the expected value of b_n, that is,

\displaystyle b_n = \frac{1}{n} \sum_{0 \leq r < n} (1 + b_r)

We can simplify this a bit: the sum involves adding up n copies of the number 1 in addition to b_r, which yields n, which we then immediately divide by n again. So in fact

\displaystyle b_n = 1 + \frac{1}{n} \sum_{0 \leq r < n} b_r.

That is, you always need 1 step, plus whatever is needed on average for what remains.

At this point, instead of analyzing b_n further, I just computed (by hand!) b_n for small values of n:

\begin{array}{rcl} b_0 &=& 0 \\ b_1 &=& 1 \\ b_2 &=& \displaystyle 1 + \frac{1}{2}(0 + 1) = \frac{3}{2} \\[1em] b_3 &=& \displaystyle 1 + \frac{1}{3}\left(0 + 1 + \frac{3}{2}\right) = 1 + \frac{5}{6} = \frac{11}{6} \\[1em] b_4 &=& \displaystyle 1 + \frac{1}{4}\left(0 + 1 + \frac{3}{2} + \frac{11}{6}\right) = 1 + \frac{1}{4} \cdot \frac{26}{6} = \frac{50}{24} \end{array}

I left b_4 unreduced since it seemed clear at this point that b_n can be written as a fraction with n! in the denominator, which seemed nice.

At this point I actually just looked up the sequence of denominators 0, 1, 3, 11, 50 in the wonderful Online Encyclopedia of Integer Sequences, and found sequence A000254, “unsigned Stirling numbers of the first kind”. Sounds neat! I scanned through the page and this comment jumped out at me: “The numerator of the fraction when we sum (without simplification) the terms in the harmonic sequence. (1+1/2=2/2+1/2= 3/2; 3/2+1/3=9/6+2/6= 11/6; 11/6+1/4=44/24+6/24= 50/24; ). The denominator of this fraction is n!.” At this point my mouth dropped open with surprise, since the definition of b_n does not, on the face of it, look anything like the definition of the harmonic numbers! The harmonic numbers H_n are partial sums of fractions 1/n, that is,

\begin{array}{rcl} H_1 &=& 1/1 \\ H_2 &=& 1/1 + 1/2 \\ H_3 &=& 1/1 + 1/2 + 1/3 \end{array}

and so on. Is it really possible that b_n = H_n? Computing the first few H_n, it sure seems to be so: 1/1 + 1/2 = 3/2, and 3/2 + 1/3 = 9/6 + 2/6 = 11/6, and so on. But this might just be a coincidence for small values of n; can we prove that H_n = b_n for all n?

Sometimes, just getting an idea of what you should try to prove makes all the difference. Once I had this clue, I was able to prove by induction that indeed b_n = H_n. (And perhaps at this point you might like to stop and try to prove it yourself!) Clearly b_0 = H_0 = 0. In the inductive case,

\begin{array}{rcl} b_{n+1} &=& \displaystyle 1 + \frac{1}{n+1} \sum_{0 \leq r < n+1} b_r \\[1.5em] &=& \displaystyle 1 + \frac{b_n}{n+1} + \frac{1}{n+1} \sum_{0 \leq r < n} b_r \\[1.5em] &=& \displaystyle 1 + \frac{b_n}{n+1} + \left(\frac{n}{n+1}\right) \frac{1}{n} \sum_{0 \leq r < n} b_r \\ &=& \displaystyle 1 + \frac{b_n}{n+1} + \frac{n}{n+1} (b_n - 1) \\[1em] &=& \displaystyle b_n + \frac{1}{n+1} \\[1em] &=& \displaystyle H_n + \frac{1}{n+1} \\ &=& H_{n+1} \end{array}

(Typically when I write a calculational proof like this, I like to typeset it with the reason for each step written in between the lines; but that is difficult to do on this blog, and you will probably have more fun figuring out the reasoning yourself!)

Thus, the expected number of steps needed to blow out n candles is the nth harmonic number, H_n = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n}. Incidentally, we have of course discovered an interesting harmonic number identity: just substitute H_n for b_n in our original recurrence!

After arriving at this result, I tried to see whether there is a more intuitive way to understand it. Here’s what I came up with. Number the candles from 1 to n, such that we always blow out the higest-numbered candles first, and candle number 1 is the last to be blown out. Say that each step of blowing out k candles is “charged” to the lowest-numbered candle to be blown out; the remaining k-1 candles are blown out “for free”. We then consider how much it “costs” to blow out a particular candle. Candle 1 definitely costs 1 step: it will always be charged for the step on which it is blown out. Candle 2 costs, on average, 1/2: either it is the lowest-numbered candle to be blown out, leaving the first candle still lit, in which case it incurs the cost for that step; or else it is blown out for free if the first candle is also blown out in that step. Since these are equally likely, the expected cost is 1/2. Likewise, there are three equally likely scenarios for candle 3; in one case it incurs a cost of 1, and in the other two cases it is blown out for free, yielding an expected cost of 1/3, and so on. Since the expectation of a sum is the sum of expectations (i.e. expectation is linear), the expected total cost is the sum of the individual expected costs, 1 + 1/2 + 1/3 + \dots. I am confident this can be made into a perfectly rigorous argument, which would actually constitute an alternative, “combinatorial” proof that b_n = H_n.

As a final aside, according to a classic result \lim_{n \to \infty} H_n = \ln n + \gamma, where \gamma \approx 0.5772156649\dots is the Euler-Mascheroni constant. So we can also say that the expected number of rounds to blow out n candles is approximately the natural logarithm of n. Apparently, the fact that the expectation is logarithmic in the number of candles (as opposed to linear) can be surprising to people. It was intuitively obvious to me, but that’s probably just because I’ve done a lot of computer science; this problem would fit very well into an analysis of algorithms course.

Posted in combinatorics, probability, solutions | Tagged , , , | 5 Comments

Cosmic Call at The Universe of Discourse

I’ve really been enjoying a series of posts by Mark Dominus on his blog, The Universe of Discourse. The Cosmic Call was a series of radio messages sent in 1999 and 2003 aimed at various nearby stars (in case there are any intelligent aliens there to receive the message).  It is fascinating to consider the question: how do you communicate with other intelligent beings who may be very different from us?  The messages were encoded as simple bitmap images and tried to build up from very simple arithmetic to more complicated things.

Mark has already covered the first five parts of the (23-part) message, with more to come.  You should see if you can decipher each part before reading the explanation! The first part is shown below.

Posted in links | Tagged , , , | 1 Comment

The birthday candle problem

After a 1.5-month epic journey1, I am finally settling into my new position at Hendrix College. Here’s a fun problem I just heard from my new colleage Mark Goadrich:

A birthday cake has n lit candles. At each step you pick a number 1 \leq k \leq n uniformly at random and blow out k candles. If any candles remain lit, the process repeats (using a new value of n). As a function of n, what is the expected number of rounds needed to blow out all the candles?

For example, suppose the cake starts out with 14 lit candles. You roll a fair 14-sided die2 to choose a number between 1 and 14. Suppose you roll a 6. You then blow out 6 candles, leaving 8 lit candles. You now roll a fair 8-sided die; suppose you roll a 3. You blow out 3 candles, leaving 5. So you roll a fair 5-sided die. Maybe this time you actually roll a 5, so you blow out all the remaining candles and you are done. In this particular case, the whole process took three rounds; the question is to determine how many rounds it will take on average.

If one starts with n lit candles, it seems clear that the average number of rounds will be less than n (it would be very unlucky to roll a 1 every single time) but greater than 1 (it is pretty lucky to roll an n on the first try, although not nearly as much so as rolling all 1’s).

Have fun thinking about it. There is, in fact, a nice analytical solution, which I will explain in a future post!


  1. US states visited: 9 (Massachusetts, Vermont, New York, Michigan, Wisconsin, Iowa, Kansas, Missouri, Arkansas); countries visited: 4 (US, Canada, Netherlands, Germany); plane flights: 10; hours of driving: 40; beds: 13; unexpected two-week detours to stay with in-laws: 1; number of days late by which the moving company delivered our belongings: 25.

  2. You can make a fair 2n-sided die for any n \geq 3 out of a bipyramid with an n-gonal base.

Posted in challenges, combinatorics, probability | Tagged , , , | 5 Comments

Mystery curve, animated

As a follow-on to my previous post, here’s an animation (17MB) showing how the “mystery curve” arises as a sum of circular motions:

Recall that the equation for the curve is f(t) = e^{it} + \frac{1}{2} e^{6it} + \frac{i}{3} e^{-14it}. The big blue circle corresponds to the e^{it} term—it is a circle of radius 1 and makes one complete revolution before the animation restarts. The medium orange circle corresponds to \frac{1}{2} e^{6it}: it has a radius of 1/2 and rotates 6 times as fast as the blue circle. The small green circle corresponds to \frac{i}{3} e^{-14t}. It rotates 14 times as fast as the blue circle, but in the opposite direction. It has a radius of 1/3, and starts out 1/4 rotation out of phase with the others, since multiplying by i corresponds to a 1/4 rotation in the complex plane. (Notice that whenever the blue and orange circles are pointing in exactly the same direction, the green circle is perpendicular to them.)

One interesting thing to note is that addition of complex numbers is commutative—which means that we could just as well put the fast green circle in the middle, and the big blue circle next, and the orange circle on the outside, or any other ordering. The red point would always trace exactly the same curve.

Edited to add: can you see how the numbers 1, 6, and -14 result in 5-fold symmetry? Hint: how many times per day do the hands of a clock line up? (Farris proves this analytically in his book, but it wasn’t until staring at this animation that I feel like I really “got it”.)

Posted in complex numbers, geometry, programming | Tagged , , , , , , , | 6 Comments

Random cyclic curves

Princeton Press just sent me a review copy of a new book by Frank Farris called Creating Symmetry: The Artful Mathematics of Wallpaper Patterns. It looks amazing and I’m super excited to read it. Apparently John Cook has been reading it as well, and posted some Python code for generating this curve, which shows up towards the beginning of the book:

Mike Croucher also posted an interactive version using Jupyter notebook, where you can play with sliders to control the parameters of the curve and watch it evolve.

This is a plot of the parametric equation

f(t) = e^{it} + \frac{1}{2} e^{6it} + \frac{i}{3} e^{-14it}

in the complex plane. In general, Farris considers parametric equations of the form

f(t) = \sum_j a_j e^{n_j it},

where a_j are complex numbers, n_j are integers, and as usual i = \sqrt{-1}. All such equations correspond to cyclic plots in the complex plane; he analyzes what sorts of symmetry they will have based on the parameters n_j.

He also spends some time talking about the aesthetics of picking values for a_j and n_j that result in beautiful curves. Instead of making carefully considered choices in this kind of situations, I often like to employ randomness to just generate a bunch of different instantiations and see what comes up (although there is still a certain amount of art in choosing the distributions for the random parameters). So, here are 50 random curves. Each one has

  • Either 2, 3, or 4 terms
  • Exponents chosen uniformly at random from [-30, 30]
  • Coefficients chosen uniformly at random from the square in the complex plane from -1 - i to 1 + i.

I really like seeing these all together. They are not all great individually, but as a group, it’s fun seeing their differences, similarities, and idiosyncracies.

Don’t ask me what the parameters are for an individual curve because the program I used to generate them does not save the parameters anywhere! The code is here if you want to see it; of course it is written in Haskell and uses the diagrams framework.

Posted in complex numbers, geometry, programming | Tagged , , , , , | 24 Comments

The blog lives! and Secrets of Creation Trilogy

I haven’t written here in a really long time! But the blog has not been abandoned, just on hiatus. I have recently started blogging again so you can expect more posts in the near future! In the meanwhile, here’s what I’ve been up to:

I also want to help spread the word that Matthew Watkins’ wonderful Secrets of Creation Trilogy has been republished in a new edition by Liberalis Books. I previously reviewed the first and second books in the trilogy (tl;dr: they are fabulous). I have now read the third book in the trilogy (depicted above), which is just as fabulous, though I never reviewed it here. And if you don’t believe me you can see what others (including Sir Roger Penrose and Clifford Pickover) have said about it. I highly recommend that you check them out if you haven’t already!

Posted in Uncategorized | 2 Comments

Permutation flower

Permutation flower

Image | Posted on by | Tagged , , | 1 Comment