Why look back at basic probability?
Bayes' theorem lies at the heart of much of modern machine learning. Although it's relatively simple to understand, you do need some grounding in probability theory. This blog post is all about getting you up close and personal with probability theory so I can tell you all about Bayes in a later post.
The very basics
Think of some event that might occur in the future, say winning the lottery, buying a new car, or England winning the World Cup. We can estimate the probability of these events happening. Call the event A and the probability of the event occurring P(A). If the event is certain to occur, then P(A) = 1; if it's certain not to occur, then P(A) = 0; and in all cases \(0 \leq P(A) \leq 1\).
We'll consider the probability of several events I'm going to call A, B, C, etc. These can be any events at all, including aliens landing, Elvis making a comeback, or getting a pay raise at the end of the year.
The complementary rule
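As a brief sketch of the standard complement rule in this post's notation (writing 'not A' for the event that A does not occur, which is a label assumed here): either A happens or it doesn't, so the two probabilities must sum to 1.

\[P(not \ A) = 1 - P(A)\]

For example, if the probability of rain tomorrow is 0.3, the probability of no rain is 1 - 0.3 = 0.7.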
Independence
Independence is a huge issue in probability modeling and it can lead to big errors if not handled correctly. On the face of it, it's a simple idea, but there are subtleties.
Two events are independent if one does not affect or influence the other in any way (alternatively, one event does not give any information about the other). For example, the odds of Joe Biden winning the 2020 Presidential election do not depend on the odds of New Zealand opening its borders to international travelers. Looking at things the other way, the odds of me winning the lottery are dependent on my purchasing a ticket (I have to buy a ticket to stand any chance of winning) - these are dependent events. I'm sure you can think of many other examples.
Independent and dependent events are treated very differently mathematically, and the big mistake comes when events that are not independent are treated as if they were. For example, an organization might run many opinion polls in an election. The errors in the polls will not be independent of one another because the organization may well have a systemic bias that affects all their polls. There are similar problems in epidemiology; if you and I live together, my probability of catching an infectious disease is not independent of your probability of catching an infectious disease. The most famous example of confusing independent and dependent events comes from the sub-prime mortgage scandals of 2008 onwards. The analysts who developed the sub-prime mortgage default models assumed that mortgage defaults were independent of one another. Unfortunately for all of us, that wasn't the case in 2008. Economic conditions led to many defaults, which in turn led to broader financial problems, which in turn led to more defaults. In 2008 and onwards, sub-prime mortgage defaults were dependent on one another.
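One way to make the idea concrete is the standard product test: two events are independent exactly when the probability of both occurring equals the product of their individual probabilities. The sketch below checks this by enumerating two fair dice. The dice, the event names, and the helper functions are all illustrative assumptions, not anything from a real model.

```python
from fractions import Fraction
from itertools import product

# Sample space: every ordered outcome of two fair six-sided dice (36 in total).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event (a predicate on one outcome),
    assuming all outcomes are equally likely."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def first_is_six(o):
    return o[0] == 6

def second_is_six(o):
    return o[1] == 6

def total_at_least_ten(o):
    return o[0] + o[1] >= 10

def independent(e1, e2):
    """Product test: P(e1 and e2) == P(e1) * P(e2)."""
    p_both = prob(lambda o: e1(o) and e2(o))
    return p_both == prob(e1) * prob(e2)

# The two dice don't influence each other.
print(independent(first_is_six, second_is_six))       # True
# A six on the first die makes a high total more likely.
print(independent(first_is_six, total_at_least_ten))  # False
```

The second check fails because the chance of a six on the first die and a total of at least ten is 1/12, not the 1/36 you would get by multiplying the individual probabilities.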
Disjoint (mutually exclusive) events
Two events are disjoint if they're mutually exclusive, in other words, if they can't both happen. For example, only one of Joe Biden or Donald Trump can win the election - they can't both be President. In notation I'll explain later: \( P(A \ and \ B) = P(A \cap B) = 0\).
Probability of A and B occurring (intersection) - the multiplication rule
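As a sketch of the standard multiplication rule in this post's notation: some sources write 'and' and some write '\(\cap\)', and \(P(B | A)\) is read as "the probability of B given that A has occurred".

\[P(A \ and \ B) = P(A \cap B) = P(A)P(B | A)\]

For independent events, knowing that A occurred tells you nothing about B, so \(P(B | A) = P(B)\) and the rule simplifies to:

\[P(A \cap B) = P(A)P(B)\]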
Probability of A or B occurring (union) - the addition rule
What's the probability of A or B occurring? Some sources write 'or' and some write '\(\cup\)'. Here's the rule:
\[P(A \ or \ B) = P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
\[= P(A) + P(B) - P(A)P(B | A)\]
For disjoint events, the addition rule simplifies to:
\[P(A \ or \ B) = P(A \cup B) = P(A) + P(B) \]
because from before we have:
\[P(A \cap B) = 0\]
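A quick way to check the addition rule is to count outcomes directly. This is a minimal sketch using a single fair die; the events and the `prob` helper are assumptions chosen for illustration.

```python
from fractions import Fraction

# Sample space for one fair six-sided die.
sample_space = set(range(1, 7))

def prob(event):
    """Exact probability of an event (a set of outcomes),
    assuming all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is greater than 3
C = {1}        # the roll is 1, which is disjoint from A

# General addition rule: subtract the overlap so it isn't counted twice.
union_direct = prob(A | B)
union_by_rule = prob(A) + prob(B) - prob(A & B)
print(union_direct, union_by_rule)  # 2/3 2/3

# Disjoint events: the overlap has probability 0, so the rule is a plain sum.
print(prob(A & C))                       # 0
print(prob(A | C), prob(A) + prob(C))    # 2/3 2/3
```

Without the subtraction, the outcomes 4 and 6 would be counted once for A and again for B.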
Conditional probability - the conditional rule
- What's the probability I win the lottery given that I've bought a ticket?
- What's the probability I will get a degree if I go to college?
- What's the probability I will have an accident if I'm driving and if it's snowing and if it's dark?
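Questions like these are usually answered with the standard definition of conditional probability, \(P(A | B) = P(A \cap B) / P(B)\), which is only defined when \(P(B) > 0\). Here is a minimal sketch using a fair die. The events and the helper names `prob` and `cond_prob` are illustrative assumptions.

```python
from fractions import Fraction

# Sample space for one fair six-sided die.
sample_space = set(range(1, 7))

def prob(event):
    """Exact probability of an event (a set of outcomes),
    assuming all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b). Undefined when P(b) is 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return prob(a & b) / p_b

six = {6}
even = {2, 4, 6}

print(prob(six))             # 1/6
print(cond_prob(six, even))  # 1/3
```

Learning that the roll is even shrinks the possible outcomes from six to three, so the probability of a six doubles from 1/6 to 1/3.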
The law of total probability
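As a sketch of the standard statement, assume the events \(B_1, B_2, \ldots, B_n\) are mutually exclusive and that exactly one of them must occur (the names \(B_i\) and \(n\) are labels assumed here). Then the probability of A is the sum of the probabilities of A occurring together with each \(B_i\):

\[P(A) = \sum_{i=1}^{n} P(A \cap B_i)\]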
The law of total probability and conditional probabilities
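Applying the multiplication rule to each term gives the form that is usually most useful in practice, \(P(A) = \sum_i P(A | B_i)P(B_i)\), where the \(B_i\) are mutually exclusive and exactly one of them must occur. The sketch below applies it to the driving example. The weather categories and every number in it are hypothetical, made up purely for illustration.

```python
from fractions import Fraction

# Hypothetical probabilities of each weather condition on a given trip.
# They are mutually exclusive and must sum to 1.
p_weather = {
    "snow": Fraction(1, 10),
    "rain": Fraction(3, 10),
    "clear": Fraction(6, 10),
}

# Hypothetical conditional probabilities: P(accident | weather).
p_accident_given = {
    "snow": Fraction(1, 20),
    "rain": Fraction(1, 50),
    "clear": Fraction(1, 100),
}

if sum(p_weather.values()) != 1:
    raise ValueError("weather probabilities must sum to 1")

# Law of total probability: weight each conditional probability
# by how likely its condition is, then add them up.
p_accident = sum(p_accident_given[w] * p_weather[w] for w in p_weather)
print(p_accident)  # 17/1000
```

Each term contributes \(P(accident | weather) \times P(weather)\): 1/200 for snow, 3/500 for rain, and 3/500 for clear weather.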
What use is probability theory?
I grew up hearing about the value of 'common sense', but probability theory often gives results that seem very counter-intuitive and 'common sense' can lead you wildly astray. A fun example is the Monty Hall problem, but there are lots of other examples in the real world where the probability of something happening is not what it appears to be at first - and they're not so fun. The counter-intuitive example you find most often on the internet is the probability that you have a disease given a positive test result; it's mostly not what you think.
Bayes' theorem takes us into the world of the counter-intuitive and I'll talk about Bayes in a future blog post.