Proof of Bayes' Theorem and Prior, Posterior Distributions
Theorem 1
Let $S$ be a sample space, $A$ an event with $P(A) > 0$, and $P$ a probability. If $\left\{ S_1, S_2, \cdots ,S_n \right\}$ is a partition of $S$, then the following holds. $$ P(S_k|A)=\frac { P(S_k)P(A|S_k) }{ \sum _{ k=1 }^{ n }{ P(S_k)P(A|S_k) } } $$
Definition
The term $P \left( S_{k} \right)$ appearing on the right-hand side of Bayes’ theorem is called the Prior Probability, and the left-hand side, $P \left( S_{k} | A \right)$, is called the Posterior Probability. The probability distributions formed by these probabilities over $k = 1, \cdots , n$ are called the Prior Distribution and the Posterior Distribution, respectively.
Explanation
Also called Bayes’ Rule, this theorem can be proven quite easily using only two laws, but its applications are extensive. The so-called Bayesian Paradigm built on it divides the field of statistics into two schools of thought, which shows how important the theorem is.
What we want to know is the left-hand side of the above equation. What we already know are the probability $P(S_k)$ of each cell $S_k$ of the partition of $S$, and the probability $P(A|S_k)$ of $A$ occurring given that cell. In short, we start with everything we know about $S_k$ and its impact on $A$. Bayes’ theorem reverses this, telling us how the occurrence of $A$ changes the probability of each cell. If this sounds complicated, it is enough to remember that the goal is to find the left-hand side.
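As a minimal sketch of this reversal, the following Python snippet turns a prior distribution and the conditional probabilities $P(A|S_k)$ into a posterior distribution. The function name `posterior` and the machine and defect-rate numbers are hypothetical, chosen only for illustration; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Return the list of posterior probabilities P(S_k | A).

    prior[k]      -- P(S_k), the prior probability of the k-th cell
    likelihood[k] -- P(A | S_k), the probability of A given that cell
    """
    # Numerator of Bayes' theorem for every cell: P(S_k) P(A | S_k)
    joint = [p * l for p, l in zip(prior, likelihood)]
    # Denominator: P(A), obtained from the law of total probability
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("P(A) = 0, so the posterior is undefined")
    return [j / evidence for j in joint]


# Hypothetical numbers: three machines produce 50%, 30%, and 20% of all
# parts, with defect rates of 1%, 2%, and 3%. A is "the part is defective".
prior = [Fraction(50, 100), Fraction(30, 100), Fraction(20, 100)]
likelihood = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]

post = posterior(prior, likelihood)
print(post)  # posterior probabilities 5/17, 6/17, 6/17
```

In this made-up example the first machine is the most likely source of a part before anything is observed, but once the part is known to be defective it becomes the least likely of the three.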
Proof
By the Law of Total Probability and the Multiplication Rule of Probability, we obtain the following equation. $$ \begin{align*} P(A)=&P(A\cap S_1)+P(A\cap S_2)+\cdots+P(A\cap S_n) \\ =&P(S_1)P(A|S_1)+P(S_2)P(A|S_2)+\cdots+P(S_n)P(A|S_n) \\ =& \sum _{ k=1 }^{ n }{ P(S_k)P(A|S_k) } \end{align*} $$ Assuming $P(A) > 0$, take the reciprocal of both sides and multiply by $P(A \cap S_k)$. Applying the multiplication rule $P(A \cap S_k) = P(S_k)P(A|S_k)$ to the left-hand side and the definition of conditional probability $P(S_k|A) = P(A \cap S_k) / P(A)$ to the right-hand side gives us $$ \begin{align*} & \frac { 1 }{ \sum _{ k=1 }^{ n }{ P(S_k)P(A|S_k) } }=\frac { 1 }{ P(A) } \\ \implies& \frac { P(A\cap S_k) }{ \sum _{ k=1 }^{ n }{ P(S_k)P(A|S_k) } }=\frac { P(A\cap S_k) }{ P(A) } \\ \implies& \frac { P(S_k)P(A|S_k) }{ \sum _{ k=1 }^{ n }{ P(S_k)P(A|S_k) } }=P(S_k|A) \end{align*} $$
■
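The two steps of the proof can also be checked by brute force on a small finite sample space. The sketch below is an illustration rather than part of the proof. It assumes two fair dice, a partition by the value of the first die, and the event "the sum is 8"; none of these come from the theorem itself, and the helper names `prob` and `cond` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space S: all ordered outcomes of two fair six-sided dice.
S = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event (a set of outcomes), all outcomes equally likely."""
    return Fraction(len(event), len(S))


def cond(a, b):
    """Conditional probability P(a | b) = P(a and b) / P(b)."""
    return prob(a & b) / prob(b)


# Partition S_1, ..., S_6: cell S_k holds the outcomes whose first die shows k.
cells = {k: {s for s in S if s[0] == k} for k in range(1, 7)}
# Event A: the two dice sum to 8.
A = {s for s in S if s[0] + s[1] == 8}

# Step 1 (law of total probability): P(A) equals the sum of P(S_j) P(A | S_j).
total = sum(prob(cells[j]) * cond(A, cells[j]) for j in cells)
assert total == prob(A)

# Step 2 (Bayes' theorem): the formula matches P(S_k | A) computed by counting.
checked = {}
for k in cells:
    bayes = prob(cells[k]) * cond(A, cells[k]) / total
    direct = cond(cells[k], A)
    assert bayes == direct
    checked[k] = bayes

print(checked)
```

Both assertions pass for every cell, including $S_1$, whose posterior probability is $0$ because a first die showing 1 cannot produce a sum of 8.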
Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p23.