Binomial Theorem#

The study of Discrete Probability Functions necessarily starts with a classic. We have four main discrete probability functions we will study:

  • Binomial

  • Geometric

  • Hypergeometric

  • Negative Binomial

The binomial and geometric distributions are used widely in mathematics and apply equally well to probability problem solving. A simple change in focus makes the Binomial Distribution hugely helpful: with help from the Binomial Theorem, we can solve some quite difficult problems with ease.

The Binomial Theorem has been studied for centuries, but its relationship to probability theory is a more recent development. If we learned about Pascal’s Triangle in high school, we were most likely exploring its classic use: expanding a binomial like

\[(x+y)^5\]

Pascal’s Triangle and Algebra#

We were taught to sketch a few rows of Pascal’s Triangle using a nifty pattern:

\[\begin{split}\begin{array}{ccccccccccccc}&&&&&&1&&&&&&\\&&&&&1&&1&&&&&\\&&&&1&&2&&1&&&&\\&&&1&&3&&3&&1&&&\\&&1&&4&&6&&4&&1&&\\&1&&5&&10&&10&&5&&1&\\1&&6&&15&&20&&15&&6&&1 \end{array}\end{split}\]
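The pattern is that each entry is the sum of the two entries directly above it. A minimal sketch of that rule in R follows; the helper name `pascal_rows` is hypothetical and introduced only for illustration.

```r
# Build rows 0..n_max of Pascal's Triangle using the adjacent-sum pattern.
# pascal_rows is an illustrative helper, not part of the course notes.
pascal_rows <- function(n_max) {
    rows <- vector("list", n_max + 1)
    rows[[1]] <- 1                      # row 0 is stored at list position 1
    for (i in seq_len(n_max)) {
        prev <- rows[[i]]
        # Pad with a zero on each side and add:
        # every entry becomes the sum of its two parents
        rows[[i + 1]] <- c(0, prev) + c(prev, 0)
    }
    rows
}

rows <- pascal_rows(6)
rows[[6]]   # row 5: 1 5 10 10 5 1
rows[[7]]   # row 6: 1 6 15 20 15 6 1
```

Because R lists are indexed from 1, row \(n\) of the triangle sits at position \(n+1\) of the list.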

Knowing that the powers of \(x\) descend while the powers of \(y\) ascend in a standard way, we could write:

\[(x+y)^5=x^5+5x^4 y+10x^3y^2+10x^2y^3+5xy^4+y^5\]

Pascal’s Triangle gave us the coefficients we needed to know so that we could avoid doing all the multiplications and simplifications required.
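As a quick check, the coefficients of \((x+y)^5\) can be read off with R's built-in `choose()` and the expansion spot-checked numerically. The sample values \(x=2\) and \(y=3\) below are arbitrary choices made for this sketch.

```r
# Row 5 of Pascal's Triangle via the built-in choose()
coeffs <- choose(5, 0:5)
coeffs                                  # 1 5 10 10 5 1

# Spot-check the expansion at arbitrary sample values (assumed for illustration)
x <- 2
y <- 3
expanded <- sum(coeffs * x^(5:0) * y^(0:5))   # powers of x descend, powers of y ascend

all.equal(expanded, (x + y)^5)          # TRUE: both sides equal 3125
```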

Pascal’s Triangle and Probability#

How do we handle repeated draws with replacement? The answer is connected to Pascal’s Triangle and binomial expansions. Consider coin flips.

Example: 3 Coin Flips#

What is the probability of flipping a coin three times and getting exactly \(2\) “heads”?

For these coin flips, let \(H\) be the event of heads with event \(T\) being tails. Let’s create an ordered list covering all possibilities where we place in bold the arrangements that include exactly 2 results of heads.

\[\begin{split}\begin{array}{c} \text{HHH}\\\textbf{HHT}\\\textbf{HTH}\\\text{HTT}\\ \textbf{THH}\\\text{THT}\\\text{TTH}\\\text{TTT}\\ \end{array}\end{split}\]

We see from our ordered list that the probability of 2 heads in three fair coin flips is given by:

\[P(HH)=\frac{|HH|}{|S|}=\frac{3}{8}\]
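The same count can be reproduced by brute force. The sketch below lists every ordered outcome with `expand.grid()` and counts those with exactly two heads; the column names `first`, `second`, and `third` are labels chosen here for readability.

```r
# List all 2^3 ordered outcomes of three coin flips
flips <- expand.grid(first  = c("H", "T"),
                     second = c("H", "T"),
                     third  = c("H", "T"),
                     stringsAsFactors = FALSE)

# Count the heads in each outcome (one row per outcome)
heads <- rowSums(flips == "H")

# Favorable outcomes divided by total outcomes
sum(heads == 2) / nrow(flips)           # 0.375, i.e. 3/8
```

Dividing a count of favorable outcomes by the total is valid here only because all eight outcomes of a fair coin are equally likely.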

Let’s compare all the probabilities to the third row of Pascal’s Triangle:

\[\begin{split}\begin{align*} P(HHH)&=\frac{1}{8}\\ P(HH)&=\frac{3}{8}\\ P(H)&=\frac{3}{8}\\ P(\text{0 }H)&=\frac{1}{8}\\ \end{align*}\end{split}\]

Writing out the first 5 rows of Pascal’s Triangle (rows 0 through 4, counting the top row as row 0), we see the following:

\[\begin{split}\begin{array}{ccccccccc}&&&&1&&&&\\&&&1&&1&&&\\&&1&&2&&1&&\\&1&&3&&3&&1&\\1&&4&&6&&4&&1 \end{array}\end{split}\]

Comparing the third row of Pascal’s Triangle to the probabilities we calculated above, we find the following intriguing patterns:

  • The third row entries are the numerators for the probabilities shown above.

  • The sum of the third row entries is 8, the denominator of those fractions.

These observations turn out to be true for every row of Pascal’s Triangle: for \(n\) fair coin flips, the entries of row \(n\) are the numerators of the probabilities, and their sum, \(2^n\), is the denominator.
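A short R sketch can confirm both observations for several rows at once. It assumes a fair coin, so that all arrangements are equally likely, and the upper limit of 10 rows is an arbitrary choice.

```r
# Check that each row sums to 2^n for rows 0 through 10 (limit chosen arbitrarily)
row_sums_ok <- TRUE
for (n in 0:10) {
    row <- choose(n, 0:n)               # row n of Pascal's Triangle
    if (sum(row) != 2^n) row_sums_ok <- FALSE
}
row_sums_ok                             # TRUE

# For n = 3 fair flips: P(k heads) = (row entry) / (row sum), for k = 0, 1, 2, 3
probs3 <- choose(3, 0:3) / 2^3
probs3                                  # 0.125 0.375 0.375 0.125
```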

Pascal’s Triangle and Binomial Coefficients#

Let’s write out Pascal’s Triangle in a way that lends itself to probability problem solving:

\[\begin{split}\begin{array}{ccccccccc}&&&&1&&&&\\&&&\binom{1}{0}&&\binom{1}{1}&&&\\&&\binom{2}{0}&&\binom{2}{1}&&\binom{2}{2}&&\\&\binom{3}{0}&&\binom{3}{1}&&\binom{3}{2}&&\binom{3}{3}&\\\binom{4}{0}&&\binom{4}{1}&&\binom{4}{2}&&\binom{4}{3}&&\binom{4}{4} \end{array}\end{split}\]

The rows now relate to Bernoulli trials. The 4th row corresponds to a trial repeated 4 times. The lower number in each coefficient is the number of successes observed during those trials.

Bernoulli Trials#

A Bernoulli trial is a probability experiment with exactly two outcomes, success or failure, and a fixed chance of success (e.g. a coin flip). We are interested in repeating such a trial several times, with each repetition independent of the others. We can use the Binomial Theorem as a problem-solving tool for repeated Bernoulli trials. Suppose we conduct \(n\) trials, with \(0\leq x\leq 1\) the probability of success and \(y=1-x\) the probability of failure:

\[\begin{split}\begin{align*} (x+y)^n&=\binom{n}{0}x^0y^n+\binom{n}{1}x^1y^{n-1}+\cdots+\binom{n}{n}x^ny^0\\&=\sum_{k=0}^n \binom{n}{k}x^k y^{n-k} \end{align*}\end{split}\]

Note the following:

  • The sum of the terms from \(0\) to \(n\) is equal to \(1\) since \((x+y)^n=(x+1-x)^n=1^n=1\)

  • We can set the probability of success \(x\) as needed provided \(0\leq x\leq 1\).

  • Both R and a TI-84 graphing calculator can evaluate and sum the terms.
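As a sketch of how R can evaluate and sum the terms, the block below computes every term of the expansion directly and compares it with R's built-in binomial probability function `dbinom()`. The values \(n=4\) and \(x=0.3\) are illustrative assumptions, not taken from the examples in this section.

```r
n <- 4        # number of trials (illustrative value)
x <- 0.3      # probability of success (illustrative value)
y <- 1 - x    # probability of failure

# Each term of the expansion, for k = 0, 1, ..., n successes
k <- 0:n
terms <- choose(n, k) * x^k * y^(n - k)

sum(terms)                                        # 1, up to floating-point rounding

# The built-in binomial probability function gives the same terms
all.equal(terms, dbinom(k, size = n, prob = x))   # TRUE
```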

Example: 3 Coin Flips Again#

This time, let’s use the Binomial Theorem to solve the question: Given 3 fair coin flips, what is the probability that 2 of them are heads?

We have:

\[\begin{split}\begin{align}n&=3\\k&=2\\x&=\frac{1}{2}\end{align}\end{split}\]

where

  • \(n\) is the number of trials,

  • \(k\) is the number of successes, and

  • \(x\) is the probability of success.

Also note that the probability of failure is \(y=1-x=\frac{1}{2}\). Thus, we can solve the problem immediately:

\[P(k=2)=\binom{3}{2}\left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)=3\cdot\frac{1}{8}=\frac{3}{8}\]

Example: More Coin Flips#

What if we change the question as follows:

If a fair coin is flipped 10 times, what is the probability that we see 6 or more heads?

The overall probability is the sum of several terms:

  • Probability of exactly 6 heads,

  • Probability of exactly 7 heads,

  • Probability of exactly 8 heads,

  • Probability of exactly 9 heads, and

  • Probability of exactly 10 heads.

Mathematically, it’s quite easy to write down the solution:

\[P(k\geq 6)=\sum_{k=6}^{10}\binom{10}{k}\left(\frac{1}{2}\right)^k\left(\frac{1}{2}\right)^{10-k}\]
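For a fair coin the sum can also be evaluated by hand. Every term carries the same factor \(\left(\frac{1}{2}\right)^k\left(\frac{1}{2}\right)^{10-k}=\frac{1}{2^{10}}\), so only the binomial coefficients need to be added:

\[\begin{split}\begin{align*} P(k\geq 6)&=\frac{1}{2^{10}}\left[\binom{10}{6}+\binom{10}{7}+\binom{10}{8}+\binom{10}{9}+\binom{10}{10}\right]\\&=\frac{210+120+45+10+1}{1024}\\&=\frac{386}{1024}=\frac{193}{512}\approx 0.377 \end{align*}\end{split}\]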

Example: Unfair Coin Flips#

We ask the same question as directly above, but we set the probability of success to \(\frac{2}{5}\) to represent an unfair coin, one weighted toward tails. Mathematically, the change is trivial: only the probabilities in each term change.

\[P(k\geq 6)=\sum_{k=6}^{10}\binom{10}{k}\left(\frac{2}{5}\right)^k\left(\frac{3}{5}\right)^{10-k}\]

Calculations with the Binomial Theorem#

We would like to be able to use R to evaluate these probabilities. First, let’s copy-paste the combinations and permutations formulae from our [course notes](file:///C:/Users/robbs/Documents/Conda/books/probstat/_build/html/P2a.html#wrapping-up).

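# Combinations: number of ways to choose k items from n, ignoring order
combin <- function(n, k) {
    return(factorial(n) / (factorial(k) * factorial(n - k)))
}
# Permutations: number of ordered arrangements of k items chosen from n
perm <- function(n, k) {
    return(combin(n, k) * factorial(k))
}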
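It may be worth checking the hand-written formula against R's built-in `choose()`. The sketch below redefines `combin` so that it runs on its own, and it also shows one limitation of the factorial formula: for large \(n\) the factorials overflow, while `choose()` still returns a value. The test value \(n=200\) is an arbitrary choice.

```r
# Same formula as in the course notes, repeated so this block runs on its own
combin <- function(n, k) {
    factorial(n) / (factorial(k) * factorial(n - k))
}

# The hand-written formula agrees with the built-in choose()
all.equal(combin(10, 6), choose(10, 6))       # TRUE (both are 210)

# factorial(200) overflows to Inf, so the formula no longer gives a finite number
# (R also emits warnings here, suppressed for this demonstration)
is.finite(suppressWarnings(combin(200, 2)))   # FALSE

# choose() avoids the large factorials entirely
choose(200, 2)                                # 19900
```

For the small values of \(n\) used in this section either function works.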

We are now ready to evaluate these probabilities quickly in R.

Three flips, 2 successes#

We found that mathematically

\[P(k=2)=\binom{3}{2}\left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)\]

Let’s do that in R.

combin(3,2) * 0.5^2 * 0.5
0.375
# comparing to algebraic answer
3/8
0.375
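R also has a built-in function for exactly this kind of term. As a cross-check, `dbinom()` returns the probability of a given number of successes in a fixed number of trials:

```r
# P(exactly 2 successes in 3 trials, success probability 0.5)
p_two_heads <- dbinom(2, size = 3, prob = 0.5)
p_two_heads                             # 0.375
```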

Ten flips, 6 or more successes#

We are evaluating the following:

\[P(k\geq 6)=\sum_{k=6}^{10}\binom{10}{k}\left(\frac{1}{2}\right)^k\left(\frac{1}{2}\right)^{10-k}\]

The summation in R will be carried out using a FOR loop. We calculate and store each term, and sum after the loop has completed all its work. Note the following:

  • tab is our “tabulation” vector, empty to begin with,

  • lo is our lower limit for the summation,

  • hi is our upper limit for the summation,

  • \(x\) is our probability of success, and

  • \(y\) is our probability of failure.

tab <- c()            ## Empty vector to store all the terms 
n = 10                ## Number of trials
lo = 6                ## LEAST Number of successes
hi = 10                ## MOST Number of successes
x = 1/2               ## Probability of success
y = 1-x               ## Probability of failure
k = 1                 ## Indexing variable for tab vector

for (t in lo:hi){
    tab[k] <- combin(n,t) * x^t * y^(n-t)         # Calculate the term and save in the vector "tab"
    k <- k + 1
}
sum(tab)

tab
0.376953125
  1. 0.205078125
  2. 0.1171875
  3. 0.0439453125
  4. 0.009765625
  5. 0.0009765625

The list of five probabilities appears when we print out the complete tab vector. The sum is shown above that output.
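The loop makes each step explicit, but R can also do the whole calculation with vectorized calls. The following is an alternative sketch using the built-in `dbinom()` and `pbinom()` functions rather than the `combin` helper:

```r
n <- 10       # number of trials
x <- 1/2      # probability of success

# All five terms, k = 6, ..., 10, computed in one vectorized call
terms <- dbinom(6:10, size = n, prob = x)
total <- sum(terms)
total                                   # about 0.377

# Same tail probability from the cumulative distribution: P(k >= 6) = P(k > 5)
tail_prob <- pbinom(5, size = n, prob = x, lower.tail = FALSE)
all.equal(total, tail_prob)             # TRUE
```

With `lower.tail = FALSE`, `pbinom()` returns the probability of strictly more than the given number of successes, which is why its first argument is 5 rather than 6.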

Ten flips, unfair coin#

This requires little adjustment: we set the probability of success differently. Here, we set

\[x=\frac{2}{5}\]

tab <- c()            ## Empty vector to store all the terms 
n = 10                ## Number of trials
lo = 6                ## LEAST Number of successes
hi = 10                ## MOST Number of successes
x = 2/5               ## Probability of success
y = 1-x               ## Probability of failure
k = 1                 ## Indexing variable for tab vector

for (t in lo:hi){
    tab[k] <- combin(n,t) * x^t * y^(n-t)         # Calculate the term and save in the vector "tab"
    k <- k + 1
}
sum(tab)

tab
0.1662386176
  1. 0.111476736
  2. 0.042467328
  3. 0.010616832
  4. 0.001572864
  5. 0.0001048576
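Since the fair and unfair calculations differ only in the value of \(x\), the repeated code could be wrapped in a small function. This is one possible sketch; the name `binom_range` and its argument names are hypothetical, chosen here to mirror the variables used above.

```r
# binom_range is a hypothetical helper name used only for this sketch.
# It returns P(lo <= k <= hi) for n trials with success probability x.
binom_range <- function(n, lo, hi, x) {
    t <- lo:hi                                   # every success count in the range
    sum(choose(n, t) * x^t * (1 - x)^(n - t))    # add up the binomial terms
}

fair   <- binom_range(10, 6, 10, 1/2)   # fair coin, about 0.377
unfair <- binom_range(10, 6, 10, 2/5)   # unfair coin, about 0.166
c(fair = fair, unfair = unfair)
```

Vectorizing over `t` replaces the loop and the indexing variable, so the function body is a single sum.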