
Expectation of the Sum of Random Variables in the Form of Functions

Theorem 1

Let $X_{1} , \cdots , X_{n}$ be a random sample and let $g : \mathbb{R} \to \mathbb{R}$ be a function such that $E g \left( X_{1} \right)$ and $\text{Var} g \left( X_{1} \right)$ exist. Then the following hold:

  • [1] Mean: $$ E \left( \sum_{k = 1}^{n} g \left( X_{k} \right) \right) = n E g \left( X_{1} \right) $$
  • [2] Variance: $$ \text{Var} \left( \sum_{k = 1}^{n} g \left( X_{k} \right) \right) = n \text{Var} g \left( X_{1} \right) $$
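
The theorem lends itself to a quick Monte Carlo sanity check. The sketch below is one minimal way to do it in Python with NumPy, under illustrative assumptions not taken from the text: $X_{1} , \cdots , X_{n} \overset{\text{iid}}{\sim} \text{Exp}(1)$, $g(x) = x^{2}$, and $n = 10$. The simulated mean and variance of $\sum_{k} g \left( X_{k} \right)$ should land near $n E g \left( X_{1} \right)$ and $n \text{Var} g \left( X_{1} \right)$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10, 200_000            # sample size and number of Monte Carlo replications

def g(x):                        # illustrative choice of g
    return x ** 2

# each row is one random sample X_1, ..., X_n, iid Exp(1)
X = rng.exponential(scale=1.0, size=(reps, n))
S = g(X).sum(axis=1)             # sum_{k=1}^{n} g(X_k), one value per replication

# estimate the single-observation moments E g(X_1) and Var g(X_1) from one column
Eg, Vg = g(X[:, 0]).mean(), g(X[:, 0]).var()

print(S.mean(), n * Eg)          # [1]: both close to n E g(X_1)   (= 20 here)
print(S.var(),  n * Vg)          # [2]: both close to n Var g(X_1) (= 200 here)
```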

Explanation

The critical point to note in this theorem is that $\left\{ X_{k} \right\}_{k=1}^{n}$ is a random sample, in other words, iid. For example, if the variables were instead perfectly dependent, say $X_{i} = X_{j}$ for all $i \ne j$, and $g (x) = x$, then, as is well known from the properties of variance, $$ \text{Var} \left( \sum_{k=1}^{n} X_{k} \right) = \text{Var} \left( n X_{1} \right) = n^{2} \text{Var} X_{1} \ne n \text{Var} X_{1} $$ This shows that the independence condition is absolutely necessary to derive [2].
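
To see this failure numerically, the following sketch (same illustrative NumPy setup as above) copies a single draw $n$ times, so the sum is just $n X_{1}$, and the simulated variance comes out near $n^{2} \text{Var} X_{1}$ rather than $n \text{Var} X_{1}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 10, 200_000

# perfectly dependent case: X_1 = X_2 = ... = X_n, so the sum is just n * X_1
X1 = rng.exponential(scale=1.0, size=reps)
S_dep = n * X1

# Var X_1 = 1 for Exp(1), so this prints roughly n^2 = 100, not n = 10
print(S_dep.var())
```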

Proof

[1]

Since the expectation is linear and $X_{1} , \cdots , X_{n}$ follow the same distribution, the following holds: $$ \begin{align*} & E \left( \sum_{k = 1}^{n} g \left( X_{k} \right) \right) \\ =& \sum_{k=1}^{n} E g \left( X_{k} \right) & \because \text{linearity} \\ =& n E g \left( X_{1} \right) & \because \text{identically distributed} \end{align*} $$

[2]

Since $X_{1} , \cdots , X_{n}$ are independent, $\text{Cov} \left( g \left( X_{i} \right) , g \left( X_{j} \right) \right) = 0$ whenever $i \ne j$. Therefore, $$ \begin{align*} & \text{Var} \left( \sum_{k = 1}^{n} g \left( X_{k} \right) \right) \\ =& E \left[ \sum_{k=1}^{n} g \left( X_{k} \right) - E \sum_{k=1}^{n} g \left( X_{k} \right) \right]^{2} \\ =& E \left[ \sum_{k=1}^{n} \left[ g \left( X_{k} \right) - E g \left( X_{k} \right) \right] \right]^{2} \\ =& \sum_{k=1}^{n} E \left[ g \left( X_{k} \right) - E g \left( X_{k} \right) \right]^{2} + \sum_{i \ne j} E \left[ \left( g \left( X_{i} \right) - E g \left( X_{i} \right) \right) \left( g \left( X_{j} \right) - E g \left( X_{j} \right) \right) \right] \\ =& \sum_{k=1}^{n} \text{Var} g \left( X_{k} \right) + \sum_{i \ne j} \text{Cov} \left( g \left( X_{i} \right) , g \left( X_{j} \right) \right) \\ =& n \text{Var} g \left( X_{1} \right) + 0 \end{align*} $$ holds.
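
The only place independence enters the proof is in making the cross terms vanish. As a rough empirical illustration of that step (again with the hypothetical $\text{Exp}(1)$ sample and $g(x) = x^{2}$ used above), the sample covariance between $g \left( X_{i} \right)$ and $g \left( X_{j} \right)$ for $i \ne j$ sits near zero.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 10, 200_000

def g(x):
    return x ** 2

X = rng.exponential(scale=1.0, size=(reps, n))
G = g(X)

# sample covariance between g(X_i) and g(X_j) for one pair with i != j
cov_12 = np.cov(G[:, 0], G[:, 1])[0, 1]
print(cov_12)   # roughly 0, so the cross terms in the expansion drop out
```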


  1. Casella. (2001). Statistical Inference (2nd Edition): p213. ↩︎