Recall the birthday candle problem I wrote about in a previous post:

A birthday cake has lit candles. At each step you pick a number uniformly at random and blow out candles. If any candles remain lit, the process repeats (using a new value of ). As a function of , what is the expected number of rounds needed to blow out all the candles?

Two commenters came up with the right answer; congratulations! Here’s my solution. Since it’s more fun this way, I’m going to explain not just the solution itself but also the *process* I went through in finding it.

First, let denote the expected number of steps to blow out all the candles starting from lit candles. As a base case, , that is, you don’t need any steps if no candles are lit in the first place. Now, let’s think about an expression for when . If you blow out some candles, leaving , then you expect to need more steps in addition to the one step you just took. Each from to is equally likely, so we can average all these cases to find the expected value of , that is,

We can simplify this a bit: the sum involves adding up copies of the number in addition to , which yields , which we then immediately divide by again. So in fact

.

That is, you always need step, plus whatever is needed on average for what remains.

At this point, instead of analyzing further, I just computed (by hand!) for small values of :

I left unreduced since it seemed clear at this point that can be written as a fraction with in the denominator, which seemed nice.

At this point I actually just looked up the sequence of denominators , , , , in the wonderful Online Encyclopedia of Integer Sequences, and found sequence A000254, “unsigned Stirling numbers of the first kind”. Sounds neat! I scanned through the page and this comment jumped out at me: *“The numerator of the fraction when we sum (without simplification) the terms in the harmonic sequence. (; ; ; ). The denominator of this fraction is .”* At this point my mouth dropped open with surprise, since the definition of does not, on the face of it, look anything like the definition of the harmonic numbers! The harmonic numbers are partial sums of fractions , that is,

and so on. Is it really possible that ? Computing the first few , it sure seems to be so: , and , and so on. But this might just be a coincidence for small values of ; can we prove that for all ?

Sometimes, just getting an idea of what you should try to prove makes all the difference. Once I had this clue, I was able to prove by induction that indeed . (And perhaps at this point you might like to stop and try to prove it yourself!) Clearly . In the inductive case,

(Typically when I write a calculational proof like this, I like to typeset it with the reason for each step written in between the lines; but that is difficult to do on this blog, and you will probably have more fun figuring out the reasoning yourself!)

Thus, the expected number of steps needed to blow out candles is the th harmonic number, . Incidentally, we have of course discovered an interesting harmonic number identity: just substitute for in our original recurrence!

After arriving at this result, I tried to see whether there is a more intuitive way to understand it. Here’s what I came up with. Number the candles from to , such that we always blow out the higest-numbered candles first, and candle number is the last to be blown out. Say that each step of blowing out candles is “charged” to the lowest-numbered candle to be blown out; the remaining candles are blown out “for free”. We then consider how much it “costs” to blow out a particular candle. Candle 1 definitely costs 1 step: it will always be charged for the step on which it is blown out. Candle 2 costs, on average, : either it is the lowest-numbered candle to be blown out, leaving the first candle still lit, in which case it incurs the cost for that step; or else it is blown out for free if the first candle is also blown out in that step. Since these are equally likely, the expected cost is . Likewise, there are three equally likely scenarios for candle ; in one case it incurs a cost of , and in the other two cases it is blown out for free, yielding an expected cost of , and so on. Since the expectation of a sum is the sum of expectations (*i.e.* expectation is linear), the expected total cost is the sum of the individual expected costs, . I am confident this can be made into a perfectly rigorous argument, which would actually constitute an alternative, “combinatorial” proof that .

As a final aside, according to a classic result , where is the Euler-Mascheroni constant. So we can also say that the expected number of rounds to blow out candles is approximately the natural logarithm of . Apparently, the fact that the expectation is *logarithmic* in the number of candles (as opposed to linear) can be surprising to people. It was intuitively obvious to me, but that’s probably just because I’ve done a lot of computer science; this problem would fit very well into an analysis of algorithms course.