The birthday candle problem: solution

Recall the birthday candle problem I wrote about in a previous post:

A birthday cake has n lit candles. At each step you pick a number 1 \leq k \leq n uniformly at random and blow out k candles. If any candles remain lit, the process repeats (using a new value of n). As a function of n, what is the expected number of rounds needed to blow out all the candles?

Two commenters came up with the right answer; congratulations! Here’s my solution. Since it’s more fun this way, I’m going to explain not just the solution itself but also the process I went through in finding it.

First, let b_n denote the expected number of steps to blow out all the candles starting from n lit candles. As a base case, b_0 = 0, that is, you don’t need any steps if no candles are lit in the first place. Now, let’s think about an expression for b_n when n \geq 1. If you blow out some candles, leaving r, then you expect to need b_r more steps in addition to the one step you just took. Each r from 0 to n-1 is equally likely, so we can average all these cases to find the expected value of b_n, that is,

\displaystyle b_n = \frac{1}{n} \sum_{0 \leq r < n} (1 + b_r)

We can simplify this a bit: the sum involves adding up n copies of the number 1 in addition to b_r, which yields n, which we then immediately divide by n again. So in fact

\displaystyle b_n = 1 + \frac{1}{n} \sum_{0 \leq r < n} b_r.

That is, you always need 1 step, plus whatever is needed on average for what remains.

At this point, instead of analyzing b_n further, I just computed (by hand!) b_n for small values of n:

\begin{array}{rcl} b_0 &=& 0 \\ b_1 &=& 1 \\ b_2 &=& \displaystyle 1 + \frac{1}{2}(0 + 1) = \frac{3}{2} \\[1em] b_3 &=& \displaystyle 1 + \frac{1}{3}\left(0 + 1 + \frac{3}{2}\right) = 1 + \frac{5}{6} = \frac{11}{6} \\[1em] b_4 &=& \displaystyle 1 + \frac{1}{4}\left(0 + 1 + \frac{3}{2} + \frac{11}{6}\right) = 1 + \frac{1}{4} \cdot \frac{26}{6} = \frac{50}{24} \end{array}

I left b_4 unreduced since it seemed clear at this point that b_n can be written as a fraction with n! in the denominator, which seemed nice.

At this point I actually just looked up the sequence of denominators 0, 1, 3, 11, 50 in the wonderful Online Encyclopedia of Integer Sequences, and found sequence A000254, “unsigned Stirling numbers of the first kind”. Sounds neat! I scanned through the page and this comment jumped out at me: “The numerator of the fraction when we sum (without simplification) the terms in the harmonic sequence. (1+1/2=2/2+1/2= 3/2; 3/2+1/3=9/6+2/6= 11/6; 11/6+1/4=44/24+6/24= 50/24; ). The denominator of this fraction is n!.” At this point my mouth dropped open with surprise, since the definition of b_n does not, on the face of it, look anything like the definition of the harmonic numbers! The harmonic numbers H_n are partial sums of fractions 1/n, that is,

\begin{array}{rcl} H_1 &=& 1/1 \\ H_2 &=& 1/1 + 1/2 \\ H_3 &=& 1/1 + 1/2 + 1/3 \end{array}

and so on. Is it really possible that b_n = H_n? Computing the first few H_n, it sure seems to be so: 1/1 + 1/2 = 3/2, and 3/2 + 1/3 = 9/6 + 2/6 = 11/6, and so on. But this might just be a coincidence for small values of n; can we prove that H_n = b_n for all n?

Sometimes, just getting an idea of what you should try to prove makes all the difference. Once I had this clue, I was able to prove by induction that indeed b_n = H_n. (And perhaps at this point you might like to stop and try to prove it yourself!) Clearly b_0 = H_0 = 0. In the inductive case,

\begin{array}{rcl} b_{n+1} &=& \displaystyle 1 + \frac{1}{n+1} \sum_{0 \leq r < n+1} b_r \\[1.5em] &=& \displaystyle 1 + \frac{b_n}{n+1} + \frac{1}{n+1} \sum_{0 \leq r < n} b_r \\[1.5em] &=& \displaystyle 1 + \frac{b_n}{n+1} + \left(\frac{n}{n+1}\right) \frac{1}{n} \sum_{0 \leq r < n} b_r \\ &=& \displaystyle 1 + \frac{b_n}{n+1} + \frac{n}{n+1} (b_n - 1) \\[1em] &=& \displaystyle b_n + \frac{1}{n+1} \\[1em] &=& \displaystyle H_n + \frac{1}{n+1} \\ &=& H_{n+1} \end{array}

(Typically when I write a calculational proof like this, I like to typeset it with the reason for each step written in between the lines; but that is difficult to do on this blog, and you will probably have more fun figuring out the reasoning yourself!)

Thus, the expected number of steps needed to blow out n candles is the nth harmonic number, H_n = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n}. Incidentally, we have of course discovered an interesting harmonic number identity: just substitute H_n for b_n in our original recurrence!

After arriving at this result, I tried to see whether there is a more intuitive way to understand it. Here’s what I came up with. Number the candles from 1 to n, such that we always blow out the higest-numbered candles first, and candle number 1 is the last to be blown out. Say that each step of blowing out k candles is “charged” to the lowest-numbered candle to be blown out; the remaining k-1 candles are blown out “for free”. We then consider how much it “costs” to blow out a particular candle. Candle 1 definitely costs 1 step: it will always be charged for the step on which it is blown out. Candle 2 costs, on average, 1/2: either it is the lowest-numbered candle to be blown out, leaving the first candle still lit, in which case it incurs the cost for that step; or else it is blown out for free if the first candle is also blown out in that step. Since these are equally likely, the expected cost is 1/2. Likewise, there are three equally likely scenarios for candle 3; in one case it incurs a cost of 1, and in the other two cases it is blown out for free, yielding an expected cost of 1/3, and so on. Since the expectation of a sum is the sum of expectations (i.e. expectation is linear), the expected total cost is the sum of the individual expected costs, 1 + 1/2 + 1/3 + \dots. I am confident this can be made into a perfectly rigorous argument, which would actually constitute an alternative, “combinatorial” proof that b_n = H_n.

As a final aside, according to a classic result \lim_{n \to \infty} H_n = \ln n + \gamma, where \gamma \approx 0.5772156649\dots is the Euler-Mascheroni constant. So we can also say that the expected number of rounds to blow out n candles is approximately the natural logarithm of n. Apparently, the fact that the expectation is logarithmic in the number of candles (as opposed to linear) can be surprising to people. It was intuitively obvious to me, but that’s probably just because I’ve done a lot of computer science; this problem would fit very well into an analysis of algorithms course.

About Brent

Associate Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.
This entry was posted in combinatorics, probability, solutions and tagged , , , . Bookmark the permalink.

5 Responses to The birthday candle problem: solution

  1. Jamie Hope says:

    Thanks! That’s extremely similar to the path I took to find the answer. My initial guess was that it would be something like the base-2 logarithm, since the expected value of the dice roll is going to be roughly n/2 each time through. Like you, it was a Sloane sequence that gave me the nudge toward harmonics, though I had reduced the fractions and found A124078 by searching for 0,1,3,11,25,137.

    Interestingly, if you use n+1-sided dice and allow for blowing out zero candles at each step, then the answer ends up being H_{n} + 1 — the expected value only goes up by 1.

    • Brent says:

      Yes, my initial guess was the base-2 logarithm as well.

      Wow, the situation with n+1-sided dice is fascinating! Thanks for mentioning it.

  2. Matt Daws says:

    Here’s a more direct proof via simple algebraic manipulation… As b_n = 1 + \frac{1}{n}(b_{n-1}+\cdots+b_0) also b_{n-1} = 1 + \frac{1}{n-1}(b_{n-2}+\cdots+b_0) it follows that b_n = 1 + \frac{n-1}{n}(b_{n-1}-1) + \frac{1}{n}b_{n-1} which simplifies to b_n = \frac{1}{n} + b_{n-1} as we want.

    • Brent says:

      Nice! If you look closely I think you can see that this is essentially the same proof that I gave; mine just shows more intermediate steps.

  3. Jan Van lent says:

    The coupon collector problem is a similar problem that also involves harmonic numbers. The expected number of tries per coupon is H_n.
    https://en.wikipedia.org/wiki/Coupon_collector%27s_problem

Comments are closed.