
Bayesian Estimator 📂Mathematical Statistics


Definition 1 2

In Bayesian inference, let $\Theta$ be the random variable corresponding to the parameter $\theta$ to be estimated, and let $X$ be the random variable for the associated sample. Denote an estimator of $\theta$ by $\phi(X)$. The expected value of the loss function $\mathcal{L}$ is called the Bayes risk (function).

$$ R(\Theta, \phi(X)) = E_{\Theta, X} \left[ \mathcal{L}(\Theta, \phi(X)) \right] = \int \int \mathcal{L}(\theta, \phi(x)) p(\theta, x) \mathrm{d}\theta \mathrm{d}x \tag{1} $$

A value $\phi(x)$ that minimizes the Bayes risk is called a Bayes estimate, and the minimizer $\phi(X)$ is called the Bayes estimator of $\theta$.

$$ \phi(X) = \argmin\limits_{\phi^{\ast}} R(\Theta, \phi^{\ast}(X)) $$

Explanation

Writing the joint density as $p(\theta, x) = g(x) p(\theta | x) = h(\theta) p(x | \theta)$, by the definition of joint probability the integral in $(1)$ can be rewritten as follows.

$$ \int \left[ \int \mathcal{L}(\theta, \phi(x)) p(\theta | x) \mathrm{d}\theta \right] g(x) \mathrm{d}x = \int \left[ \int \mathcal{L}(\theta, \phi(x)) p(x | \theta) \mathrm{d}x \right] h(\theta) \mathrm{d}\theta \tag{2} $$

Here, the bracketed expression on the left-hand side is the expected loss with respect to the posterior distribution; it is called the posterior expected loss. $$ \begin{align*} \text{Posterior expected loss} &:= \int \mathcal{L}(\theta, \phi(x)) p(\theta | x) \mathrm{d}\theta \\ &\ = E_{\Theta} \left[ \mathcal{L}(\Theta, \phi(X)) | X = x \right] \end{align*} $$
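As a concrete numerical sketch (with hypothetical numbers: a Beta(2, 2) prior and $x = 7$ successes in $10$ trials, so the posterior is Beta(9, 5)), the posterior expected loss under squared-error loss can be approximated on a grid of $\theta$ values:

```python
import numpy as np

# Hypothetical setup: Beta(2, 2) prior on theta, x = 7 successes out of n = 10 trials,
# so the posterior p(theta | x) is Beta(9, 5). Approximate it on a grid.
theta = np.linspace(0.0005, 0.9995, 2000)
weights = theta**8 * (1 - theta)**4          # unnormalized Beta(9, 5) density
weights /= weights.sum()                     # discrete approximation of p(theta | x)

def posterior_expected_loss(c):
    """Approximate E[ L(Theta, c) | X = x ] for squared-error loss L(t, c) = (t - c)^2."""
    return float(np.sum(weights * (theta - c)**2))

# The loss is smallest near the posterior mean 9/14 ~ 0.643.
for c in (0.50, 9 / 14, 0.75):
    print(f"c = {c:.3f}: posterior expected loss = {posterior_expected_loss(c):.5f}")
```

Evaluating a few candidate estimates shows the loss bottoming out at the posterior mean, consistent with the property stated under Properties.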

On the right-hand side of $(2)$, the bracketed expression is the risk function for each fixed $\theta$, and the whole expression is its expectation over the prior, so it is called the expected risk.

From the viewpoint of minimizing the Bayes risk, consider the left-hand side of $(2)$. Since $g(x) > 0$ for every $x$, the left-hand side is minimized precisely when the bracketed term is minimized for each fixed $x$. In other words, minimizing the Bayes risk (the entire left-hand expression) is equivalent to minimizing the posterior expected loss (the bracketed term) for each $x$. Therefore it makes no practical difference whether a text defines the Bayes risk and the Bayes estimator by either of the expressions below — do not be confused if different texts use different definitions.

$$ \begin{align*} \text{Bayes risk} &:= E_{\Theta, X} \left[ \mathcal{L}(\Theta, \phi(X)) \right] \\[1em] \text{or} \quad \text{Bayes risk} &:= E_{\Theta} \left[ \mathcal{L}(\Theta, \phi(X)) \mid X = x\right] \end{align*} $$

$$ \begin{align*} \text{Bayes estimator} &:= \argmin\limits_{\phi} E_{\Theta, X} \left[ \mathcal{L}(\Theta, \phi(X)) \right] \\[1em] \text{or} \quad \text{Bayes estimator} &:= \argmin\limits_{\phi} E_{\Theta} \left[ \mathcal{L}(\Theta, \phi(X)) \mid X = x\right] \end{align*} $$
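This equivalence can be checked exhaustively on a tiny discrete model (hypothetical numbers): with $\theta, x \in \{0, 1\}$ and 0-1 loss, enumerate all four estimators $\phi: \{0,1\} \to \{0,1\}$ and compare the global Bayes-risk minimizer against the estimator built by minimizing the posterior expected loss separately for each $x$.

```python
import itertools

# Hypothetical joint pmf p(theta, x) on {0,1} x {0,1}, with 0-1 loss L(t, a) = (t != a).
p = {(0, 0): 0.35, (0, 1): 0.15, (1, 0): 0.10, (1, 1): 0.40}

def bayes_risk(phi):
    """Bayes risk E[L(Theta, phi(X))] for an estimator phi: {0,1} -> {0,1}."""
    return sum(prob * (theta != phi[x]) for (theta, x), prob in p.items())

# Global minimization: search all 4 estimators phi.
estimators = [dict(zip((0, 1), vals)) for vals in itertools.product((0, 1), repeat=2)]
global_best = min(estimators, key=bayes_risk)

# Pointwise minimization: for each x, pick the action minimizing posterior expected loss.
pointwise_best = {}
for x in (0, 1):
    marginal = p[(0, x)] + p[(1, x)]                           # g(x)
    post = {t: p[(t, x)] / marginal for t in (0, 1)}           # p(theta | x)
    pointwise_best[x] = min((0, 1), key=lambda a: sum(post[t] * (t != a) for t in (0, 1)))

print(global_best, pointwise_best)  # the two minimizers coincide
```

Both searches select the same estimator, illustrating that minimizing the posterior expected loss for each observed $x$ is the same as minimizing the Bayes risk over all of $\phi$ at once.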

Properties

  • The Bayes estimator under mean squared error is the posterior mean. $$ \begin{align*} E_{\Theta}[\Theta | X] &= \argmin_{\phi} \int (\theta - \phi(x))^{2} p(\theta | x) \mathrm{d}\theta \\ &= \argmin_{\phi} E_{\Theta} \left[(\Theta - \phi(X))^2 | X \right] \end{align*} $$
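A quick way to see this property (a standard decomposition, sketched here rather than a full proof): for fixed $x$, expand the posterior expected squared loss around the posterior mean. Writing $c = \phi(x)$,

$$ E_{\Theta}\left[ (\Theta - c)^{2} \mid X = x \right] = \operatorname{Var}(\Theta \mid X = x) + \left( E_{\Theta}[\Theta \mid X = x] - c \right)^{2}, $$

so the first term does not depend on $c$ and the second is minimized exactly at $c = E_{\Theta}[\Theta \mid X = x]$, i.e. at the posterior mean.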

  1. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p. 612. ↩︎

  2. https://en.wikipedia.org/wiki/Bayes_estimator ↩︎