Understanding Bayes' Theorem
Bayes' theorem solves a specific probability problem: given that something observable has occurred, what is the probability that an underlying cause or condition is true? This differs from asking the reverse question, which is often easier to measure directly.
The theorem formalizes prior probability (what you believe before new information) and likelihood (how probable the new information would be if your belief were true), then produces a posterior probability (your updated belief after considering the evidence).
Imagine a factory produces widgets on three machines. Machine A produces 50% of widgets and has a 2% defect rate. Machine B produces 30% and has a 3% defect rate. Machine C produces 20% and has a 5% defect rate. If you pull a defective widget from the bin, Bayes' theorem tells you how likely it is that each machine produced it, even though you cannot observe which machine it came from.
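The factory example can be worked through in a few lines. The sketch below is illustrative only: the variable names are made up for this example, and exact fractions are used so that rounding does not blur the comparison between machines.

```python
from fractions import Fraction

# Share of total output per machine (prior) and defect rate per machine
# (likelihood), taken from the factory example above.
share = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
defect_rate = {"A": Fraction(2, 100), "B": Fraction(3, 100), "C": Fraction(5, 100)}

# Total probability that a randomly chosen widget is defective.
p_defective = sum(share[m] * defect_rate[m] for m in share)

# Posterior probability that each machine produced a given defective widget.
posterior = {m: share[m] * defect_rate[m] / p_defective for m in share}

for machine, p in posterior.items():
    print(f"Machine {machine}: {p} (about {float(p):.1%})")
```

With these figures the overall defect rate is 2.9%, and machines A and C come out tied at 10/29 (about 34.5%) each, with machine B at 9/29 (about 31.0%). Machine C's higher defect rate is offset by its smaller share of production.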
The Bayes' Theorem Formula
The fundamental equation calculates the conditional probability of event A occurring, given that event B has been observed:
P(A|B) = [P(B|A) × P(A)] ÷ P(B)
- P(A|B): Posterior probability, the probability of A given that B is true.
- P(B|A): Likelihood, the probability of observing B if A were actually true.
- P(A): Prior probability, your initial belief about the probability of A before observing B.
- P(B): Evidence probability, the total probability of observing B across all possible scenarios.
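The formula maps directly onto a one-line function. This is a minimal sketch: the function name `bayes_posterior` and the input numbers are hypothetical, chosen only to show which quantity goes where.

```python
def bayes_posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Hypothetical inputs, for illustration only:
# P(A) = 0.3, P(B|A) = 0.8, P(B) = 0.5
result = bayes_posterior(prior=0.3, likelihood=0.8, evidence=0.5)
print(f"P(A|B) = {result:.2f}")
```

Here the posterior (0.48) is higher than the prior (0.3) because B is more probable when A is true (0.8) than it is overall (0.5).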
Multi-Hypothesis Extension for Testing
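When P(B) is not known directly, the denominator expands, by the law of total probability, to account for every pathway that could produce the observed evidence B. For a hypothesis A and its complement ¬A (with more than two mutually exclusive hypotheses, the sum gains one term per hypothesis):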
P(B) = P(A) × P(B|A) + P(¬A) × P(B|¬A)
This is invaluable in medical testing. Consider a disease affecting 1% of a population. A test correctly identifies 99% of infected patients but also incorrectly flags 5% of healthy people. If someone tests positive, the probability that they are infected is far lower than 99%: roughly 17% (0.0099 ÷ 0.0594). Because the disease is rare, false positives from the large healthy group outnumber true positives about five to one. This explains why confirmatory testing is crucial in medicine.
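The arithmetic behind the 17% figure can be reproduced as follows. The variable names are illustrative; the three input values are the ones given in the example above.

```python
prevalence = 0.01           # P(A): share of the population with the disease
sensitivity = 0.99          # P(B|A): infected patients who test positive
false_positive_rate = 0.05  # P(B|not A): healthy people who test positive

# Law of total probability: every way a positive result can arise.
p_positive = prevalence * sensitivity + (1 - prevalence) * false_positive_rate

# Bayes' theorem: probability of infection given a positive result.
p_infected_given_positive = prevalence * sensitivity / p_positive

print(f"P(positive) = {p_positive:.4f}")                          # 0.0594
print(f"P(infected | positive) = {p_infected_given_positive:.3f}")  # 0.167
```

Of the 5.94% of people who test positive, only 0.99 percentage points are truly infected, which is where the one-in-six posterior comes from.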
Deriving Bayes' Theorem from First Principles
The derivation begins with the definition of conditional probability: the probability of two events both occurring divided by the probability of the conditioning event.
Starting with P(A|B) = P(A ∩ B) ÷ P(B) and P(B|A) = P(A ∩ B) ÷ P(A), we recognise that the intersection probability P(A ∩ B) is the same in both equations.
Rearranging the second equation gives P(A ∩ B) = P(B|A) × P(A). Substituting this into the first equation yields Bayes' theorem: P(A|B) = [P(B|A) × P(A)] ÷ P(B). This derivation shows the theorem isn't an arbitrary rule but a logical consequence of how conditional probabilities relate to joint probabilities.
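The derivation can also be checked numerically. The sketch below uses a hypothetical joint distribution over two binary events (the four probabilities are invented for illustration) and confirms that the definition of conditional probability and Bayes' theorem give the same answer.

```python
# Hypothetical joint distribution over two binary events A and B.
# Keys are (a, b) outcomes; the four probabilities sum to 1.
joint = {
    (True, True): 0.2,
    (True, False): 0.1,
    (False, True): 0.2,
    (False, False): 0.5,
}

# Marginal probabilities, obtained by summing over the other event.
p_a = sum(p for (a, _), p in joint.items() if a)
p_b = sum(p for (_, b), p in joint.items() if b)
p_a_and_b = joint[(True, True)]

direct = p_a_and_b / p_b             # definition: P(A|B) = P(A and B) / P(B)
p_b_given_a = p_a_and_b / p_a        # definition: P(B|A) = P(A and B) / P(A)
via_bayes = p_b_given_a * p_a / p_b  # Bayes' theorem

print(f"direct: {direct:.4f}, via Bayes: {via_bayes:.4f}")
```

Both routes agree (up to floating-point rounding) because the joint probability P(A ∩ B) cancels out exactly as in the algebra.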
Common Pitfalls When Using Bayes' Theorem
Misapplying Bayes' theorem leads to flawed reasoning, especially in medical and legal contexts.
- Ignoring base rates — The prior probability P(A) often carries more weight than people intuitively expect. A rare disease remains rare even with a positive test. Always anchor to the baseline occurrence rate in your population before weighing evidence.
- Confusing the conditional directions — P(B|A) and P(A|B) are not interchangeable. The probability that a person with the disease tests positive differs from the probability they have the disease given a positive test. Swapping these is the 'prosecutor's fallacy' and has wrongly convicted innocent people.
- Using incomplete or biased evidence — The formula assumes P(B) is correctly estimated. If your data source is skewed—for instance, only testing symptomatic patients—the evidence probability shifts, invalidating downstream calculations. Ensure your data reflects the real-world context you're modelling.
- Forgetting that P(B) must be non-zero — Division by zero is undefined. If P(B) = 0, you cannot have observed B, so the question becomes meaningless. Always verify that your evidence has a non-zero probability before computing the posterior.
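Two of these pitfalls, ignoring base rates and dividing by a zero P(B), are easy to guard against in code. The following is a sketch under stated assumptions: the function name `safe_posterior` is hypothetical, it handles only a binary hypothesis, and the test characteristics (99% detection rate, 5% false positive rate) are borrowed from the medical example.

```python
def safe_posterior(prior: float, likelihood: float, likelihood_if_false: float) -> float:
    """P(A|B) for a binary hypothesis, with P(B) from the law of total probability."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must lie in [0, 1]")
    evidence = prior * likelihood + (1.0 - prior) * likelihood_if_false
    if evidence == 0.0:
        # B can never be observed under these inputs, so P(A|B) is undefined.
        raise ValueError("P(B) is zero: the posterior is undefined")
    return prior * likelihood / evidence


# Same test at different base rates: the posterior depends heavily on the prior.
for base_rate in (0.001, 0.01, 0.1, 0.5):
    result = safe_posterior(base_rate, 0.99, 0.05)
    print(f"base rate {base_rate:.1%} -> posterior {result:.1%}")
```

Running the loop shows the same test yielding very different posteriors as the base rate changes, which is the point of the first pitfall. Raising an error on zero evidence, rather than returning a placeholder value, is one possible design choice; it keeps an undefined result from propagating silently.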