
Monte Carlo Integration

Overview

Monte Carlo integration is a numerical approximation method used when computing the integral of a given function is difficult. Consider the following scenario: for an integrable function $f$ on the interval $[0, 1]$, we know the formula for $f(x)$, but computing its integral analytically is not straightforward. Nevertheless, we want to evaluate the integral $I[f]$ of $f$.

$$ \begin{equation} I[f] = \int_{[0,1]} f(x) dx \end{equation} $$

Definition

Monte Carlo integration is a method of estimating the integral of $f$ by drawing samples $\left\{ x_{i} \right\}$ from the uniform distribution on $[0, 1]$ as follows:

$$ I[f] \approx I_{n}[f] := \dfrac{1}{n}\sum_{i=1}^{n} f(x_{i}) $$
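As a quick illustration, the following Python sketch estimates $I[f]$ for the test function $f(x) = x^{2}$ (an illustrative choice, not from the text above; its exact integral over $[0,1]$ is $1/3$):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_integrate(f, n):
    """Monte Carlo estimate of the integral of f over [0, 1]."""
    x = rng.uniform(0.0, 1.0, size=n)  # x_i ~ U(0, 1)
    return f(x).mean()                 # I_n[f] = (1/n) * sum_i f(x_i)

print(mc_integrate(lambda x: x**2, 100_000))  # ≈ 1/3
```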

Difference from Riemann Sum

The idea of the Riemann sum is to divide the interval $[0,1]$ into $n$ equal parts, take the points $\left\{ x_{i} = \frac{i-1}{n} \right\}_{i=1}^{n}$, and sum the function values at these points weighted by the subinterval width $\frac{1}{n}$.

$$ \text{Riemann sum}[f] = \dfrac{1}{n}\sum_{i=1}^{n} f(x_{i}) $$

At first glance, Monte Carlo integration and the Riemann sum may look similar, but their meanings are completely different. In the Riemann sum, $\left\{ x_{i} \right\}$ are the points obtained by dividing the interval $[0, 1]$ into $n$ equal parts, whereas in Monte Carlo integration, $\left\{ x_{i} \right\}$ are $n$ samples drawn from a distribution on $[0, 1]$ (here, the uniform distribution). Thus the value obtained by the Riemann sum simply approximates the area under the graph of $f$, while the value obtained by Monte Carlo integration approximates the expectation of $f$.
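To make the contrast concrete, here is a small sketch (again with the illustrative choice $f(x) = x^{2}$) computing both approximations with the same $n$:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x**2  # illustrative integrand; exact integral over [0, 1] is 1/3
n = 1_000

# Riemann sum: deterministic points x_i = (i - 1)/n
x_riemann = np.arange(n) / n
print(f(x_riemann).mean())  # deterministic value, ≈ 1/3

# Monte Carlo: random points x_i ~ U(0, 1)
x_mc = rng.uniform(size=n)
print(f(x_mc).mean())       # random value whose expectation is 1/3
```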

Properties

The statistical meaning of equation $(1)$ is that '$I[f]$ equals the expectation of $f(X)$ when $X$ follows the uniform distribution':

$$ X \sim U(0,1) \implies I[f] = \int_{[0,1]} f(x) dx = E\left[ f(X) \right] $$
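For example, with $f(x) = x^{2}$ and $X \sim U(0,1)$:

$$ I[f] = \int_{0}^{1} x^{2} dx = \dfrac{1}{3}, \qquad E\left[ X^{2} \right] = \Var(X) + \left( E[X] \right)^{2} = \dfrac{1}{12} + \dfrac{1}{4} = \dfrac{1}{3} $$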

Expectation

Let $X_{1}, \dots, X_{n}$ be independent random variables following the uniform distribution $U(0,1)$. Then $I_{n}[f]$ is an unbiased estimator of $I[f]$.

$$ E\left[ I_{n}[f] \right] = I[f] $$

Proof

$$ \begin{align*} E\left[ I_{n}[f] \right] &= E\left[ \dfrac{1}{n} \sum\limits_{i=1}^{n} f(X_{i}) \right] \\ &= \dfrac{1}{n} \sum\limits_{i=1}^{n} E\left[ f(X_{i}) \right] \qquad \text{by linearity of $E$} \\ &= \dfrac{1}{n} \sum\limits_{i=1}^{n} I\left[ f \right] \\ &= I\left[ f \right] \end{align*} $$
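This unbiasedness can be checked numerically; the sketch below (with the illustrative $f$ used earlier) averages many independent realizations of $I_{n}[f]$:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x**2  # illustrative; I[f] = 1/3

# Average many independent realizations of I_n[f]:
# by unbiasedness, the result should be close to I[f].
estimates = [f(rng.uniform(size=100)).mean() for _ in range(10_000)]
print(np.mean(estimates))  # ≈ 0.3333
```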

Variance

If $\sigma^{2} := \Var \left[ f(X) \right] < \infty$, then the variance of $I_{n}[f]$ is

$$ \Var \left[ I_{n}[f] \right] = \dfrac{\sigma^{2}}{n} $$

In particular, the variance of the estimator vanishes as $n \to \infty$.

Proof

Properties of Variance

[a] $\Var (aX) = a^{2} \Var (X)$

[b] If $X, Y$ are independent, $\Var (X + Y) = \Var(X) + \Var(Y)$

Denote the variance of $f(X)$ by $\sigma^{2}$. Since the $X_{i}$ are independent, the properties of variance give:

$$ \begin{align*} \Var \left[ I_{n}[f] \right] &= \Var \left[ \dfrac{1}{n} \sum\limits_{i=1}^{n} f(X_{i}) \right] \\ &= \dfrac{1}{n^{2}} \Var \left[ \sum\limits_{i=1}^{n} f(X_{i}) \right] \\ &= \dfrac{1}{n^{2}} \sum\limits_{i=1}^{n} \Var \left[ f(X_{i}) \right] \\ &= \dfrac{1}{n^{2}} \sum\limits_{i=1}^{n} \sigma^{2} \\ &= \dfrac{\sigma^{2}}{n} \end{align*} $$
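The $\sigma^{2}/n$ rate can also be verified empirically. For the illustrative $f(x) = x^{2}$, $\sigma^{2} = E[X^{4}] - \left( E[X^{2}] \right)^{2} = 1/5 - 1/9 = 4/45$:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x**2
sigma2 = 1/5 - 1/9  # Var[f(X)] = E[X^4] - (E[X^2])^2 for X ~ U(0, 1)

for n in (10, 100, 1000):
    estimates = [f(rng.uniform(size=n)).mean() for _ in range(20_000)]
    # empirical variance of I_n[f] vs. the theoretical sigma^2 / n
    print(n, np.var(estimates), sigma2 / n)
```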

Generalization

Now consider a function $p$ with $p(x) \ge 0$ and $\int_{[0,1]} p(x) dx = 1$, that is, a probability density function on $[0, 1]$, and think about the integral $I[fp]$.

$$ I[fp] = \int_{[0, 1]}f(x)p(x) dx $$

This is the same as the expectation of $f(X)$ for a random variable $X$ with a probability density function $p$. To approximate this value, we can consider the following two methods:

  1. Draw samples $\left\{ x_{i} \right\}_{i=1}^{n}$ uniformly and approximate $I[fp]$ as follows:

$$ X_{i} \sim U(0,1) \qquad I[fp] \approx \dfrac{1}{n}\sum\limits_{i}f(x_{i})p(x_{i}) $$

  2. Draw samples $\left\{ x_{i} \right\}_{i=1}^{n}$ from $p(x)$ and approximate $I[fp]$ as follows:

$$ X_{i} \sim p(x) \qquad I[fp] = I_{p}[f] \approx \dfrac{1}{n}\sum\limits_{i}f(x_{i}) $$

In other words, 1. averages $f(x)p(x)$ over uniform samples, while 2. averages $f(x)$ over samples drawn from $p(x)$. Of the two, 1. is the one with the smaller variance whenever $p \le 1$ on $[0,1]$, as the following computation shows (a numerical sketch follows at the end of this section). To simplify notation, write $I := I[fp] = I_{p}[f]$.

  • For method 1.: $$ \begin{align*} \sigma_{1}^{2} = \Var [fp] &= E \left[ (fp - I)^{2} \right] \\ &= \int (fp - I)^{2} dx \\ &= \int (fp)^{2} dx - 2I\int fp dx + I^{2}\int dx\\ &= \int (fp)^{2} dx - 2I^{2} + I^{2}\\ &= \int (fp)^{2} dx - I^{2}\\ \end{align*} $$

  • For method 2.: $$ \begin{align*} \sigma_{2}^{2} = \Var [f] &= E_{p} \left[ (f - I)^{2} \right] \\ &= \int (f - I)^{2}p dx \\ &= \int f^{2}p dx - 2I\int fp dx + I^{2}\int pdx\\ &= \int f^{2}p dx - 2I^{2} + I^{2}\\ &= \int f^{2}p dx - I^{2}\\ \end{align*} $$

However, wherever $p(x) \le 1$, it follows that $f^{2}p \ge f^{2}p^{2} = (fp)^{2}$. Therefore, if $p \le 1$ holds on all of $[0, 1]$,

$$ \sigma_{1}^{2} \le \sigma_{2}^{2} $$
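The two estimators and the variance formulas above can be checked numerically. The sketch below uses the illustrative choices $f(x) = x$ and $p(x) = 2x$ on $[0,1]$ (the $\operatorname{Beta}(2,1)$ density, so $I = 2/3$, $\sigma_{1}^{2} = 4/5 - 4/9 = 16/45$, and $\sigma_{2}^{2} = 1/2 - 4/9 = 1/18$). Note that this $p$ exceeds $1$ on part of the interval, so the inequality above does not apply, and here method 2. is the one with the smaller variance:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x        # illustrative f
p = lambda x: 2 * x    # illustrative density p on [0, 1] (Beta(2, 1))
n, trials = 1_000, 5_000

est1, est2 = [], []
for _ in range(trials):
    x1 = rng.uniform(size=n)          # method 1: x_i ~ U(0, 1)
    est1.append((f(x1) * p(x1)).mean())
    x2 = rng.beta(2.0, 1.0, size=n)   # method 2: x_i ~ p(x)
    est2.append(f(x2).mean())

print(np.mean(est1), np.mean(est2))   # both ≈ I = 2/3
# n * Var[estimator] should recover sigma_1^2 and sigma_2^2:
print(n * np.var(est1), 16 / 45)      # ≈ 0.356
print(n * np.var(est2), 1 / 18)       # ≈ 0.056
```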