Expectation of the Sum of Random Variables in the Form of Functions
Theorem 1
Given a random sample $X_{1} , \cdots , X_{n}$ and a function $g : \mathbb{R} \to \mathbb{R}$ such that $E g \left( X_{1} \right)$ and $\operatorname{Var} g \left( X_{1} \right)$ exist, the following hold:
- [1] Mean: $$ E \left( \sum_{k = 1}^{n} g \left( X_{k} \right) \right) = n E g \left( X_{1} \right) $$
- [2] Variance: $$ \operatorname{Var} \left( \sum_{k = 1}^{n} g \left( X_{k} \right) \right) = n \operatorname{Var} g \left( X_{1} \right) $$
Explanation
The critical point to note in this theorem is that $\left\{ X_{k} \right\}_{k=1}^{n}$ is a random sample, in other words, iid. For example, if the variables were identical rather than independent, so that $X_{i} = X_{j}$ whenever $i \ne j$, and $g (x) = x$, then as is well known from the properties of variance, $$ \operatorname{Var} \left( \sum_{k=1}^{n} X_{k} \right) = \operatorname{Var} \left( n X_{1} \right) = n^{2} \operatorname{Var} X_{1} $$ which is not $n \operatorname{Var} X_{1}$. This means that the independence condition is absolutely necessary to derive [2].
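As a quick numerical illustration of this point, here is a minimal sketch (assuming NumPy, with the arbitrary choices $n = 10$ and $X_{k} \sim \text{Exp}(1)$, so $\operatorname{Var} X_{1} = 1$) comparing the variance of the sum for an iid sample against the fully dependent case $X_{1} = X_{2} = \cdots = X_{n}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10, 100_000

# iid case: each row is a random sample X_1, ..., X_n ~ Exp(1), Var(X_1) = 1
iid = rng.exponential(scale=1.0, size=(reps, n))

# fully dependent case: X_1 = X_2 = ... = X_n within each row
dep = np.repeat(rng.exponential(scale=1.0, size=(reps, 1)), n, axis=1)

print(np.var(iid.sum(axis=1)))  # ≈ n * Var(X_1) = 10
print(np.var(dep.sum(axis=1)))  # ≈ n^2 * Var(X_1) = 100
```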
Proof
[1]
Since the expectation is linear and $X_{1} , \cdots , X_{n}$ follow the same distribution, the following holds: $$ \begin{align*} & E \left( \sum_{k = 1}^{n} g \left( X_{k} \right) \right) \\ =& \sum_{k=1}^{n} E g \left( X_{k} \right) & \because \text{linearity} \\ =& n E g \left( X_{1} \right) & \because \text{identically distributed} \end{align*} $$
■
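As a sanity check of [1], a minimal Monte Carlo sketch (assuming NumPy, with the arbitrary choices $X_{k} \sim \text{Exp}(1)$ and $g(x) = x^{2}$, so that $E g \left( X_{1} \right) = E X_{1}^{2} = 2$) might look like this:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 10, 200_000
g = np.square  # g(x) = x^2, so E g(X_1) = E X_1^2 = 2 for Exp(1)

# each row is one realization of the random sample X_1, ..., X_n
X = rng.exponential(scale=1.0, size=(reps, n))

print(g(X).sum(axis=1).mean())  # ≈ n * E g(X_1) = 20
```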
[2]
Since $X_{1} , \cdots , X_{n}$ are independent, $\operatorname{Cov} \left( g \left( X_{i} \right) , g \left( X_{j} \right) \right) = 0$ whenever $i \ne j$. Therefore, $$ \begin{align*} & \operatorname{Var} \left( \sum_{k = 1}^{n} g \left( X_{k} \right) \right) \\ =& E \left[ \sum_{k=1}^{n} g \left( X_{k} \right) - E \sum_{k=1}^{n} g \left( X_{k} \right) \right]^{2} \\ =& E \left[ \sum_{k=1}^{n} \left[ g \left( X_{k} \right) - E g \left( X_{k} \right) \right] \right]^{2} \\ =& \sum_{k=1}^{n} E \left[ g \left( X_{k} \right) - E g \left( X_{k} \right) \right]^{2} + \sum_{i \ne j} E \left[ \left( g \left( X_{i} \right) - E g \left( X_{i} \right) \right) \left( g \left( X_{j} \right) - E g \left( X_{j} \right) \right) \right] \\ =& \sum_{k=1}^{n} \operatorname{Var} g \left( X_{k} \right) + \sum_{i \ne j} \operatorname{Cov} \left( g \left( X_{i} \right) , g \left( X_{j} \right) \right) \\ =& n \operatorname{Var} g \left( X_{1} \right) + 0 \end{align*} $$ holds.
■
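Likewise, [2] can be checked numerically. The sketch below (same arbitrary assumptions: NumPy, $X_{k} \sim \text{Exp}(1)$, $g(x) = x^{2}$, so that $\operatorname{Var} g \left( X_{1} \right) = E X_{1}^{4} - \left( E X_{1}^{2} \right)^{2} = 24 - 4 = 20$) also estimates one of the cross covariances to confirm it is near zero:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 10, 500_000
g = np.square  # Var g(X_1) = E X_1^4 - (E X_1^2)^2 = 24 - 4 = 20 for Exp(1)

X = rng.exponential(scale=1.0, size=(reps, n))
gX = g(X)

print(np.var(gX.sum(axis=1)))            # ≈ n * Var g(X_1) = 200
print(np.cov(gX[:, 0], gX[:, 1])[0, 1])  # Cov(g(X_1), g(X_2)) ≈ 0
```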
Casella. (2001). Statistical Inference (2nd Edition): p213. ↩︎