Conditional probability is so powerful. It lets us say that the likelihood of someone developing breast cancer differs based on family history, lifestyle, genetics, or whether they are a man or a woman. There is math behind those statements!
Before we dive in, if you have not gone over my post on Simple Probability yet, please do so.
Mathematically, it is very hard to talk about Independence without knowing what a Conditional is, or about Conditionals without knowing Independence. So we will start by assuming we know what Independence is so we can talk about Conditionals. Then, once we understand Conditionals, we will define Independence.
Assume We Know Independence
We will do this with an instructive example: flipping fair coins. Imagine that we have a fair coin whose two outcomes are equally likely. We could use simple probability to describe the following.
\(\text{H: flip results in “Heads”}\)
\(\text{T: flip results in “Tails”}\)
\(n(S)=2\)
\(n(H)=1\)
\(n(T)=1\)
\(P(H)=\frac{n(H)}{n(S)}=\frac{1}{2}=0.5\)
\(P(T)=\frac{n(T)}{n(S)}=\frac{1}{2}=0.5\)
Now, these probabilities will always hold. While unlikely, it is possible for someone to flip this imagined fair coin 100 times and have every single one of those flips result in heads. If we were to flip that coin one additional time, the likelihood of heads would still be 0.5. We would then call these coin flips Independent: the result of the next coin flip does not Depend on any previous result.
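If you would like to see this on a computer, here is a minimal Python sketch. It assumes a simulated fair coin (Python's `random` module with a fixed seed) and the function name `next_flip_heads_rate` is just something I made up for illustration. It looks only at flips that come right after a Heads and checks how often they are Heads too.

```python
import random

def next_flip_heads_rate(n_flips, seed=0):
    """Estimate P(next flip is H | current flip is H) by simulation."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True = Heads
    # Collect every flip that immediately follows a Heads.
    after_heads = [nxt for cur, nxt in zip(flips, flips[1:]) if cur]
    return sum(after_heads) / len(after_heads)

rate = next_flip_heads_rate(100_000)
print(round(rate, 3))  # should land close to 0.5
```

A simulation only approximates the probability, so expect a value near 0.5 rather than exactly 0.5.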
Now let’s just consider flipping this coin twice, noting the first result and then the second. There would be four outcomes:
\(HH\)
\(HT\)
\(TH\)
\(TT\)
We could determine the likelihood of the following events using simple probability, since we are talking about a fair coin and each of these four outcomes is equally likely.
\(P(\text{“Two Heads”}) = \frac{1}{4}=0.25\)
\(P(\text{“Two Tails”}) = \frac{1}{4}=0.25\)
\(P(\text{“One of each”}) = \frac{2}{4}=0.5\)
\(P(\text{“at least one Heads”}) = \frac{3}{4}=0.75\)
\(P(\text{“at least one Tails”}) = \frac{3}{4}=0.75\)
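The counting above can be sketched in a few lines of Python. This is just one way to do it, assuming we list the four outcomes as strings; the helper name `prob` is my own and not from any library.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two flips: HH, HT, TH, TT.
outcomes = ["".join(p) for p in product("HT", repeat=2)]

def prob(event):
    """Simple probability: favorable outcomes over total outcomes."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

two_heads = prob(lambda o: o == "HH")
two_tails = prob(lambda o: o == "TT")
one_of_each = prob(lambda o: set(o) == {"H", "T"})
at_least_one_heads = prob(lambda o: "H" in o)
at_least_one_tails = prob(lambda o: "T" in o)
```

Using `Fraction` keeps the answers exact, so `one_of_each` comes out as 1/2 rather than a rounded decimal.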
Now we will not use simple probability to determine these probabilities. Instead, we will use a tree and multiplication of probabilities. All we will use is \(P(H)\) and \(P(T)\).
\(P(H)=0.5\)
\(P(T)=0.5\)
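So, multiplying along each branch of the tree
\(P(HH)=P(H)P(H)=0.5 \times 0.5=0.25\)
\(P(HT)=P(H)P(T)=0.5 \times 0.5=0.25\)
\(P(TH)=P(T)P(H)=0.5 \times 0.5=0.25\)
\(P(TT)=P(T)P(T)=0.5 \times 0.5=0.25\)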
And
\(P(\text{“Two Heads”}) =0.25\)
\(P(\text{“Two Tails”}) =0.25\)
\(P(\text{“One of each”})= 0.25+0.25=0.5\)
\(P(\text{“at least one Heads”}) = 0.25+0.25+0.25=0.75\)
\(P(\text{“at least one Tails”}) = 0.25+0.25+0.25=0.75\)
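Here is a rough Python sketch of the same tree idea, assuming the flips are independent so that we may multiply along each path. The dictionary names `p` and `leaf` are only for illustration.

```python
from itertools import product

p = {"H": 0.5, "T": 0.5}  # single-flip probabilities

# Each leaf of the tree is one path; its probability is the product
# of the branch probabilities along that path.
leaf = {a + b: p[a] * p[b] for a, b in product(p, repeat=2)}

# Events are sums over the leaves they contain.
two_heads = leaf["HH"]
one_of_each = leaf["HT"] + leaf["TH"]
at_least_one_heads = leaf["HH"] + leaf["HT"] + leaf["TH"]
```

Notice that nothing here counts outcomes. The only inputs are \(P(H)\) and \(P(T)\), which is why the same approach still works when the outcomes are not equally likely.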
We will revisit Independence after we cover Conditionals.
Introducing Conditional
Again, this will be done with an example. Imagine now that we have a bag with two coins in it. One coin will be fair and the other will not.
\(\text{F: fair coin}\)
\(P(H)=0.5\)
\(P(T)=0.5\)
\(\text{F’: unfair coin}\)
\(P(H)=0.80\)
\(P(T)=0.20\)
So say our task is to first choose a coin out of this bag. We will give equal likelihood to selecting each coin. We will then flip that coin and note the result.
\(P(F)=0.5\)
\(P(F’)=0.5\)
Multiplying across, we have all the possible intersections of Fair/Unfair coin selections and coin flips. Note that they total up to one. Now I would like to label this same tree diagram differently, with some notation you might not have seen before.
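For anyone who wants to check the multiplication, here is a small Python sketch of the two-coin tree. The dictionary layout and the names `p_coin`, `p_flip`, and `joint` are my own assumptions; the probabilities are the ones given above.

```python
from fractions import Fraction

# First branch: which coin is drawn from the bag.
p_coin = {"F": Fraction(1, 2), "F'": Fraction(1, 2)}

# Second branch: the flip result for the coin that was drawn.
p_flip = {
    "F": {"H": Fraction(1, 2), "T": Fraction(1, 2)},
    "F'": {"H": Fraction(4, 5), "T": Fraction(1, 5)},
}

# Multiply across each path to get the intersection at its leaf.
joint = {
    (coin, side): p_coin[coin] * p_flip[coin][side]
    for coin in p_coin
    for side in p_flip[coin]
}

for (coin, side), pr in joint.items():
    print(f"P({coin} and {side}) = {float(pr)}")
```

The four leaves come out to 1/4, 1/4, 2/5, and 1/10, and `sum(joint.values())` is exactly 1.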
Just by understanding this example up to now, we have laid out what a conditional is. Let’s take a closer look at the first one.
\(\text{“The likelihood of H occurring given that F is known to have already occurred”}\)
\(P(H|F)\)
or more simply
\(\text{“Probability of H given F”}\)
The tree helps us define how to calculate this conditional probability statement.
\[P(F)P(H|F) = P(F \cap H)\]
\[P(H|F)=\frac{P(F \cap H)}{P(F)}\]
More into the Conditional
Switching to just A and B, there are two ways to look at what we are doing. We can consider that the conditional is a means to the end of getting intersections or that the intersection is the means to get to the conditional.
For Intersections
We can get the intersection by first knowing A and then B given A
\(P(A \cap B)=P(A)P(B|A)\)
Or we can get the intersection by first knowing B and then A given B
\(P(A \cap B)=P(B)P(A|B)\)
For Conditionals
A given B is obtained by dividing the probability of the intersection of the two events by the probability of the given event, B.
\(P(A|B)=\frac{P(A \cap B)}{P(B)}\)
B given A is obtained by dividing the probability of the intersection of the two events by the probability of the given event, A.
\(P(B|A)=\frac{P(A \cap B)}{P(A)}\)
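To see both directions with numbers, here is a hedged Python sketch that reuses the two-coin bag, taking A to be "the fair coin was chosen" and B to be "the flip is Heads". The function name `conditional` is hypothetical, and I am assuming we can get \(P(B)\) by adding the two Heads leaves of the tree (0.25 for the fair coin plus 0.40 for the unfair coin).

```python
from fractions import Fraction

def conditional(p_intersection, p_given):
    """P(X|Y) = P(X and Y) / P(Y); requires P(Y) > 0."""
    if p_given == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return p_intersection / p_given

# Illustrative values from the two-coin bag: A = fair coin chosen, B = Heads.
p_a = Fraction(1, 2)
p_a_and_b = Fraction(1, 4)                 # fair coin and Heads
p_b = Fraction(1, 4) + Fraction(2, 5)      # Heads via either coin

p_b_given_a = conditional(p_a_and_b, p_a)  # 1/2
p_a_given_b = conditional(p_a_and_b, p_b)  # 5/13

# Both factorizations rebuild the same intersection.
via_a = p_a * p_b_given_a
via_b = p_b * p_a_given_b
```

Here \(P(B|A)\) and \(P(A|B)\) are different numbers, 1/2 and 5/13, yet both products land on the same intersection of 1/4.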
Focus On Conditional With An Example
I alluded to breast cancer earlier on, so let’s look at an example. These numbers do not specifically come from any real-life data. I just did a very cursory search for relative rates of breast cancer among men and women on the internet.
\(\text{M: male}\)
\(\text{F: female}\)
\(\text{C: breast cancer}\)
\(n(S)=2000\)
\(n(M)=960\)
\(n(F)=1040\)
\(n(C)=118\)
\(n(M \cap C)=3\)
\(n(M \cap C’)=957\)
\(n(F \cap C)=115\)
\(n(F \cap C’)=925\)
We will present this information in a Venn diagram. The left half represents males, the right half represents females, and inside the oval is breast cancer.
I would now like to take a look at a few probabilities to further drive home the meaning behind a conditional.
I first want to take a look at the difference between \(F \cap C\) and \(C|F\).
First \(F \cap C\)
\(P(F \cap C)=\frac{n(F \cap C)}{n(S)}=\frac{115}{2000}=0.0575\)
This is the number of people who have breast cancer and are female, divided by the total number of people in our sample space, regardless of gender or cancer status.
Now \(C|F\)
\(P(C|F)=\frac{P(F \cap C)}{P(F)}=\frac{\frac{115}{2000}}{\frac{1040}{2000}}=\frac{115}{1040}\approx0.1106\)
I’d like to focus on \(\frac{115}{1040}\). Note that we are interested in the likelihood of breast cancer given that we know the person is female. We can look at the Venn diagram and completely ignore the male half. When viewed this way, we can consider the denominator to be a new sample space of all women. And the numerator is just the number of people in this new sample space that have breast cancer.
Let’s do the same for the men.
First \(M \cap C\)
\(P(M \cap C)=\frac{n(M \cap C)}{n(S)}=\frac{3}{2000}=0.0015\)
Now \(C|M\)
\(P(C|M)=\frac{P(M \cap C)}{P(M)}=\frac{\frac{3}{2000}}{\frac{960}{2000}}=\frac{3}{960}\approx0.0031\)
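The four calculations above can be reproduced straight from the counts. This is only a sketch: the dictionary `n` and the helper names are mine, and the counts are the made-up ones from this example, not real data.

```python
from fractions import Fraction

# Counts from the example (illustrative, not real data).
n = {
    ("M", "C"): 3,
    ("M", "C'"): 957,
    ("F", "C"): 115,
    ("F", "C'"): 925,
}
n_s = sum(n.values())  # 2000

def n_gender(g):
    """Number of people of gender g, with or without breast cancer."""
    return n[(g, "C")] + n[(g, "C'")]

def p_intersection(g):
    """P(g and C): the denominator is the whole sample space."""
    return Fraction(n[(g, "C")], n_s)

def p_cancer_given(g):
    """P(C|g): the denominator shrinks to the people of gender g."""
    return Fraction(n[(g, "C")], n_gender(g))

for g in ("F", "M"):
    print(g, float(p_intersection(g)), round(float(p_cancer_given(g)), 4))
```

The only difference between the two functions is the denominator, which is exactly the "new sample space" idea.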
Hopefully, this sheds some light on what conditional probability is.
Independence
I’m big on examples, picking on men and women again. Let’s assume we have data from a survey of 1000 men and women asking if they like video games, yes or no. Again these numbers are not based on any real survey. What I did was cook the numbers so that they would be independent, Muahahaha.
\(\text{M: male}\)
\(\text{F: female}\)
\(\text{G: likes video games}\)
\(n(S)=1000\)
\(n(M)=470\)
\(n(F)=530\)
\(n(M \cap G)=188\)
\(n(F \cap G)=212\)
Let’s look at a Venn diagram presented slightly differently than the previous example. Well most would probably call it a table, but it is still a Venn diagram.
Now I want to look at the difference between \(P(G)\) and the two conditionals \(P(G|M)\) and \(P(G|F)\). What do you think the differences will be, knowing I cooked these to be independent?
\(P(G)=\frac{n(G)}{n(S)}=\frac{400}{1000}=0.4\)
\(P(G|M)=\frac{n(G \cap M)}{n(M)}=\frac{188}{470}=0.4\)
\(P(G|F)=\frac{n(G \cap F)}{n(F)}=\frac{212}{530}=0.4\)
So
\(P(G)=P(G|M)=P(G|F)\)
And the same will be true for
\(P(G’)=P(G’|M)=P(G’|F)=0.6\)
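If you want to confirm that I really did cook the numbers, here is a short Python sketch using the survey counts above. The variable names are my own, and the complements are obtained as one minus each probability.

```python
from fractions import Fraction

# Survey counts from the example (made-up numbers).
n_s = 1000
n_m, n_f = 470, 530
n_m_and_g, n_f_and_g = 188, 212
n_g = n_m_and_g + n_f_and_g  # 400 people like video games

p_g = Fraction(n_g, n_s)
p_g_given_m = Fraction(n_m_and_g, n_m)
p_g_given_f = Fraction(n_f_and_g, n_f)

# Complements: 1 minus the probability, inside the same sample space.
p_not_g = 1 - p_g
p_not_g_given_m = 1 - p_g_given_m
p_not_g_given_f = 1 - p_g_given_f

print(p_g, p_g_given_m, p_g_given_f)              # 2/5 2/5 2/5
print(p_not_g, p_not_g_given_m, p_not_g_given_f)  # 3/5 3/5 3/5
```

Knowing the gender does not move the probability at all, which is what independence looks like in data.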
Let’s take a look at a tree of this situation. Note that there is no particular rule of how to set trees up, but most do set them up “logically”. Logically, we would know the gender of someone before we would know if they were a gamer or not, but it could be the other way around.
But
\(P(G)=P(G|M)=P(G|F)\)
And
\(P(G’)=P(G’|M)=P(G’|F)\)
So each conditional on the tree can be replaced by its unconditional probability, which suggests
\(P(G \cap M)=P(G)P(M)\)
\(P(G \cap F)=P(G)P(F)\)
\(\text{ } \vdots \)
And that is the exact condition of independence.
Switching to A and B
\(\text{Two events are independent if the probability of their intersection is equal to the product of their probabilities}\)
\(P(A \cap B)=P(A)P(B)\)
So assuming that A and B are independent, let’s investigate the following.
\(P(A|B)=\frac{P(A \cap B)}{P(B)}\)
But A and B are independent
So
\(P(A|B)=\frac{P(A)P(B)}{P(B)}\)
Simplifying
\(P(A|B)=\frac{P(A)}{1}\)
\(P(A|B)=P(A)\)
I will note that if two events are independent, then any pairing of those events and their complements is independent as well.
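As a quick check of that claim, here is a Python sketch using the video game survey. It assumes the survey has only two genders, so that \(M’\) is \(F\), and it fills in the "does not like video games" counts by subtraction. The helper names `p_row`, `p_col`, and `independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

n_s = 1000
# Cell counts of the survey table; M' is F here, G' is "does not like".
cells = {
    ("M", "G"): 188, ("M", "G'"): 470 - 188,
    ("F", "G"): 212, ("F", "G'"): 530 - 212,
}

def p_row(r):
    """Probability of gender r, summing across its row."""
    return Fraction(sum(v for (row, _), v in cells.items() if row == r), n_s)

def p_col(c):
    """Probability of answer c, summing down its column."""
    return Fraction(sum(v for (_, col), v in cells.items() if col == c), n_s)

def independent(r, c):
    """Product rule: P(r and c) == P(r) * P(c)."""
    return Fraction(cells[(r, c)], n_s) == p_row(r) * p_col(c)

results = {(r, c): independent(r, c) for r, c in product("MF", ["G", "G'"])}
```

All four entries of `results` are `True`, so every pairing of gender and answer passes the product rule for this data.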
Final Thoughts
Go get your Conditional Probability On!