Probabilistic PIE

Commenters Sylvain B., xpil, and Christian Luca all gave correct answers to the challenge from my previous post. If the probability that someone likes X is $p$, then the probability they don’t like X is $(1-p)$. Therefore the probability that someone doesn’t like anchovies and doesn’t like books is $(1-a)(1-b)$. (If liking anchovies and books are independent, then not liking those things is also independent.) Likewise, the probability that someone doesn’t like all three is $(1-a)(1-b)(1-c)$.

The reason I asked the question, however, is to see the pattern that emerges if we multiply out these expressions, which commenter Buddha Buck explained.

$1 - a - b + ab$

$1 - a - b - c + ab + ac + bc - abc$

$1 - a - b - c - d + ab + ac + ad + bc + bd + cd - abc - abd - acd - bcd + abcd$

The first line is the result of multiplying out $(1-a)(1-b)$; the second is $(1-a)(1-b)(1-c)$; and the last is of course $(1-a)(1-b)(1-c)(1-d)$. After expanding everything as much as possible, we get one term for the product of every possible combination of the probabilities, with a positive sign for an even number of probabilities, and negative for an odd number.

Do you see why this happens? Each term in the result arises from choosing one of the two possible terms from each parenthesized expression. Taking $(1-a)(1-b)(1-c)$ as a specific example, we get $bc$ if we choose $1$ from $(1-a)$, and $-b$ from $(1-b)$, and $-c$ from $(1-c)$, yielding $1 \cdot (-b) \cdot (-c) = bc$.

And now for a set of follow-up questions. Suppose that $P$ is the set of all people in our population, $A$ is the set of people who like anchovies, and $B$ is the set of people who like books. Suppose we also know how many people like both anchovies and books, that is, we know $|A \cap B|$. (The notation $|S|$ denotes the size of a set $S$, and $\cap$ denotes the intersection of two sets, that is, the set of elements the two sets have in common.)

• How many people like neither anchovies nor books?

• Now suppose $C$ is the set of people who like carpets. Suppose we also know the sizes of the sets $A \cap C$, $B \cap C$, and $A \cap B \cap C$. How many people like none of the things?

• And so on?

Associate Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.
This entry was posted in pattern, probability, solutions and tagged , . Bookmark the permalink.

7 Responses to Probabilistic PIE

1. The number of people who like neither anchovies nor books would be: P – |A $\cap B|. The number of people who like neither anchovies, books, nor carpets would be P – (|A$\cap C| + |B $\cap C| + |A$\cap B $\cap C|). The number of people who do not like any of the enumerated things considered, which we can call n could be denoted by: P – (|A_1$\cap A_2| + |A_2 $\cap A_3| + |A_3$\cap A_4| + …|A_n-1 $\cap A_n| + |A_1$\cap A_2 $\cap A_3$\cap A_4…$\cap A_n). | • Brent says: Not quite. For example, |P| – |A \cap B| is the number of people who don’t like both, which is not the same as people who like neither. In particular it includes people who like one but not the other. • Buddha Buck says: Do you guys mean ‘\cup’, not ‘\cap’, in your LaTeX? • Brent says: \cup’ corresponds to union and \cap’ is intersection. I meant \cap’ because I was responding to what Christian wrote. And I assume Christian did not typo \cap’ for `\cup’ thirteen times by accident. =) 2. Stuart LoPresti says: If we wish to know how many people like neither anchovies nor books, this would be the size of the intersection of the complements of A and B; that is,$|A' \cap B'|$. Applying De Morgan’s law, this is equal to$\latex |(A \cup B)’|\$. Assuming we know |P|, |A|, and |B|, (which is unfortunately not clear to me in the given statement) then this would be

$|P|-|A \cup B| = |P|-|A|-|B|+|A \cap B|.$

We add $|A \cap B|$ because if we simply subtract |A| and |B| from |P|, we end up subtracting the intersection twice, and so need to counteract this to obtain an accurate value.

Similarly, if we wish to know how many people like neither anchovies nor books nor carpet, this is the size of the intersection of the complements of A, B, and C; that is, $|A' \cap B' \cap C'|$. Again applying De Morgan’s Law, this is equal to $|(A \cup B \cup C)'|$, which, assuming we also know |C|, is equal to

$|P|-|A \cup B \cup C|=|P|-|A|-|B|-|C|+|A \cap B|+|A \cap C|+|B \cap C|-|A \cap B \cap C|.$

Here we add the sizes of the pairwise intersections of sets for the same reason as in the previous paragraph, but in doing so overshoot on $A \cap B \cap C$; the size of this subset has been subtracted three times from |P| and then added back three times with $|A \cap B|$, $|A \cap C|$, and $|B \cap C|$. Thus we need to subtract $|A \cap B \cap C|$ one last time to obtain an accurate value.

In general, let P be the set of all people in our population and let ${A_i}_{i=1}^n$ be a finite collection of subsets of P containing people who like thing i. Then, following the pattern and assuming we know the sizes of P and of every finite intersection of sets $A_i$, the number of people who like none of the things should be equal to

$|P|-|\bigcup_i A_i|=|P|-\sum_i |A_i|+\sum_{i \neq j} |A_i \cap A_j|-\sum_{i \neq j \neq k}|A_i \cap A_j \cap k|+\dots$

This is in direct correlation with the result outlined in the blog post (where $\prod_i^n (1-a_i) = 1+\sum_i a_i-\sum_{i \neq j} a_ia_j+\sum_{i \neq j \neq k} a_i a_j a_k - \dots$), suggesting a relationship between multiplication of probabilities and intersection of sets.