A Complete Review of Expectation Formulas and Properties
Definitions
If the probability density function $f$ of the continuous random variable $X$ satisfies $\displaystyle \int_{-\infty}^{\infty} |x| f(x) dx \lt \infty$, the expectation $\mathbb{E}_{X}(X)$ of $X$ is defined as follows. $$ \mathbb{E}_{X}(X) := \int_{-\infty}^{\infty} x f(x) dx $$
If the probability mass function $p$ of the discrete random variable $X$ satisfies $\displaystyle \sum_{x} |x| p(x) \lt \infty$, the expectation $\mathbb{E}_{X}(X)$ of $X$ is defined as follows. $$ \mathbb{E}_{X}(X) := \sum_{x} x p(x) $$
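As a quick numerical sanity check of both definitions (the fair die and the $\text{Exp}(1)$ density are hypothetical examples, not part of the original text), the discrete sum can be evaluated exactly and the integral approximated by a midpoint Riemann sum:

```python
import math

# Discrete case: a fair six-sided die, p(x) = 1/6 for x in {1, ..., 6};
# the defining sum gives E[X] = 3.5.
die_expectation = sum(x * (1 / 6) for x in range(1, 7))

# Continuous case: Exp(1) density f(x) = e^{-x} on [0, inf), so E[X] = 1.
# A midpoint Riemann sum on [0, 50] stands in for the improper integral
# (the tail beyond 50 is negligible).
n, a, b = 200_000, 0.0, 50.0
h = (b - a) / n
midpoints = (a + (i + 0.5) * h for i in range(n))
exp_expectation = sum(x * math.exp(-x) * h for x in midpoints)
```

Note that the integrability condition $\int |x| f(x) dx \lt \infty$ is exactly what guarantees such approximations converge to a finite value.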
For two random variables $X$ and $Y$, the conditional expectation of $X$ given $Y = y$ is defined as follows, where $f(x|y)$ is the conditional density of $X$ given $Y = y$ (the continuous case is shown; in the discrete case the integral becomes a sum over the conditional mass function). $$ \mathbb{E}_{X|Y}(X|Y) := \int_{-\infty}^{\infty} x f(x|y) dx $$
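In the discrete analogue, the conditional density is replaced by $p(x|y) = p(x,y)/p_{Y}(y)$. A minimal sketch on a hypothetical joint pmf:

```python
# Hypothetical joint pmf p(x, y) on {0, 1} x {0, 1}.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def cond_expectation_x_given_y(y):
    # Marginal p_Y(y), then the conditional sum of x * p(x|y).
    p_y = sum(p for (x, yy), p in joint.items() if yy == y)
    return sum(x * p / p_y for (x, yy), p in joint.items() if yy == y)

e_x_given_y0 = cond_expectation_x_given_y(0)  # 0.3 / 0.4 = 0.75
e_x_given_y1 = cond_expectation_x_given_y(1)  # 0.4 / 0.6 = 2/3
```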
The expectation $\mathbb{E}_{\mathbf{X}}(\mathbf{X})$ of a random vector $\mathbf{X} = \begin{bmatrix} X_{1} & X_{2} & \cdots & X_{n} \end{bmatrix}^{\mathsf{T}}$ is defined as follows. $$ \mathbb{E}_{\mathbf{X}}(\mathbf{X}) := \begin{bmatrix} \mathbb{E}_{X_{1}}(X_{1}) \\ \mathbb{E}_{X_{2}}(X_{2}) \\ \vdots \\ \mathbb{E}_{X_{n}}(X_{n}) \end{bmatrix} $$
Explanation
$\mathbb{E}$ is taken from the initial letter of “expectation”; when there is no ambiguity, the subscript is omitted and the expectation is written simply as below. Both notations $\mathbb{E}$ and $E$ are in common use, and both kinds of brackets, $()$ and $[]$, appear frequently.
$$ \mathbb{E}(X), \quad \mathbb{E}[X], \quad E(X), \quad E[X] $$
Formulas and Properties
Linearity: For a random variable $X \in \mathbb{R}$ and constants $a, b \in \mathbb{R}$, $$ \mathbb{E}_{X}(aX + b) = a\mathbb{E}_{X}(X) + b \tag{1} $$ and for random variables $X, Y \in \mathbb{R}$, $$ \mathbb{E}_{X,Y}(X + Y) = \mathbb{E}_{X}(X) + \mathbb{E}_{Y}(Y) $$
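Both identities can be verified exactly on small pmfs (the distributions below are hypothetical; for the second identity, an independent joint is used, though linearity of the sum holds for any joint with these marginals):

```python
# Hypothetical marginal pmfs for X and Y.
px = {1: 0.2, 2: 0.5, 3: 0.3}
py = {0: 0.6, 4: 0.4}

def expect(pmf):
    return sum(x * p for x, p in pmf.items())

# (1): E[aX + b] versus a E[X] + b
a, b = 2.0, -1.0
lhs_affine = sum((a * x + b) * p for x, p in px.items())
rhs_affine = a * expect(px) + b

# E[X + Y] versus E[X] + E[Y], computed over the product joint pmf
lhs_sum = sum((x + y) * p * q for x, p in px.items() for y, q in py.items())
rhs_sum = expect(px) + expect(py)
```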
For a random variable $X$ and a function $g$, when $Y = g(X)$ holds, $$ \mathbb{E}_{Y}(Y) = \mathbb{E}_{X}[g(X)] $$
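This identity says $\mathbb{E}[g(X)]$ can be computed directly from the distribution of $X$, without first deriving the distribution of $Y$. A sketch comparing the two routes, with a hypothetical choice of $X$ (a fair die) and $g(x) = x \bmod 3$:

```python
# pmf of X: a fair six-sided die (hypothetical example).
px = {x: 1 / 6 for x in range(1, 7)}
g = lambda x: x % 3

# Route 1: build the pmf of Y = g(X) by pushing X's mass through g,
# then take E[Y] from Y's own pmf.
py = {}
for x, p in px.items():
    py[g(x)] = py.get(g(x), 0.0) + p
e_y = sum(y * p for y, p in py.items())

# Route 2: E[g(X)] computed directly against X's pmf.
e_gx = sum(g(x) * p for x, p in px.items())
```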
Linearity: For a random variable $X$, two functions $g_{1}$ and $g_{2}$, and constants $a, b \in \mathbb{R}$, $$ \mathbb{E}_{X}[a g_{1}(X) + b g_{2}(X)] = a \mathbb{E}_{X}[g_{1}(X)] + b \mathbb{E}_{X}[g_{2}(X)] \tag{2} $$ $(1)$ is a special case of $(2)$ where $g_{1}(X) = X, g_{2}(X) = 1$.
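An exact check of $(2)$ on a small pmf, with hypothetical choices $g_{1}(x) = x^{2}$ and $g_{2}(x) = \sin x$:

```python
import math

# Hypothetical pmf and constants.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
a, b = 3.0, -2.0

def e_g(g):
    # E[g(X)] computed against the pmf of X.
    return sum(g(x) * p for x, p in pmf.items())

lhs2 = e_g(lambda x: a * x**2 + b * math.sin(x))
rhs2 = a * e_g(lambda x: x**2) + b * e_g(lambda x: math.sin(x))
```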
Relation to variance: For the variance $\Var(X) = \mathbb{E}_{X}[(X-\mathbb{E}_{X}(X))^2]$, $$ \Var(X) = \mathbb{E}_{X}(X^{2}) - [\mathbb{E}_{X}(X)]^{2} $$
If the variance of the random variable $X$ exists, then the following holds. $$ \mathbb{E}_{X} (X^{2}) \ge [\mathbb{E}_{X} (X)]^{2} $$ This in turn means that the variance is nonnegative ($\Var(X) \ge 0$).
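Both the defining form and the shortcut form of the variance can be compared exactly on a small pmf (a hypothetical distribution):

```python
# Hypothetical pmf for X.
pmf = {-1: 0.3, 0: 0.4, 2: 0.3}

mean = sum(x * p for x, p in pmf.items())                    # E[X]
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())   # E[(X - E[X])^2]
var_formula = sum(x * x * p for x, p in pmf.items()) - mean ** 2  # E[X^2] - (E[X])^2
```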
Interpolation property (existence of lower moments): For a random variable $X$ and a natural number $m$, if $\mathbb{E}_{X}(X^{m})$ exists, then $\mathbb{E}_{X}(X^{k})$ exists for every natural number $k \le m$. $$ \exists \mathbb{E}_{X}(X^{m}) \implies \exists \mathbb{E}_{X}(X^{k}) \quad \forall k \le m $$
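Why this holds (a standard one-line argument, sketched here): for every real $x$ and $k \le m$, $|x|^{k} \le 1$ when $|x| \le 1$ and $|x|^{k} \le |x|^{m}$ when $|x| \gt 1$, so $|x|^{k} \le 1 + |x|^{m}$. Integrating this bound against the density gives $$ \mathbb{E}_{X}\left( |X|^{k} \right) = \int_{-\infty}^{\infty} |x|^{k} f(x) dx \le \int_{-\infty}^{\infty} \left( 1 + |x|^{m} \right) f(x) dx = 1 + \mathbb{E}_{X}\left( |X|^{m} \right) \lt \infty $$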
Jensen’s inequality: For random variable $X$ and a convex function $\phi$, the following holds. $$ \phi[\mathbb{E}_{X}(X)] \le \mathbb{E}_{X}[\phi(X)] $$
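A minimal numerical illustration of Jensen's inequality, with the convex function $\phi(x) = x^{2}$ and a hypothetical pmf:

```python
# Hypothetical pmf for X; phi(x) = x^2 is convex.
pmf = {-2: 0.25, 0: 0.5, 3: 0.25}
phi = lambda x: x * x

mean = sum(x * p for x, p in pmf.items())               # E[X]
phi_of_mean = phi(mean)                                 # phi(E[X])
mean_of_phi = sum(phi(x) * p for x, p in pmf.items())   # E[phi(X)]
```

With $\phi(x) = x^{2}$ the inequality specializes to $[\mathbb{E}(X)]^{2} \le \mathbb{E}(X^{2})$, i.e. the variance bound above.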
Law of total expectation: The following holds for the conditional expectation (also called the tower property). $$ \mathbb{E}_{X} \left[ \mathbb{E}_{Y} (Y | X) \right] = \mathbb{E}_{Y} (Y) $$
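An exact check on a small hypothetical joint pmf: compute $\mathbb{E}(Y|X=x)$ for each $x$, average it against the marginal of $X$, and compare with $\mathbb{E}(Y)$ taken directly:

```python
# Hypothetical joint pmf p(x, y).
joint = {(0, 1): 0.1, (0, 5): 0.3, (1, 1): 0.4, (1, 5): 0.2}

def p_x(x):
    return sum(p for (xx, _), p in joint.items() if xx == x)

def e_y_given_x(x):
    return sum(y * p for (xx, y), p in joint.items() if xx == x) / p_x(x)

outer = sum(e_y_given_x(x) * p_x(x) for x in {0, 1})  # E_X[E(Y|X)]
e_y = sum(y * p for (_, y), p in joint.items())       # E(Y) directly
```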
For two independent random variables $X$ and $Y$ and functions $f$ and $g$, the following holds. $$ \mathbb{E}_{X, Y} \left[ f(X) g(Y) \right] = \mathbb{E}_{X} \left[ f(X) \right] \mathbb{E}_{Y} \left[ g(Y) \right] $$
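Since independence means the joint pmf factors into the product of the marginals, the identity can be checked exactly; the marginals and the choices $f(x) = x^{2}$, $g(y) = e^{y}$ below are hypothetical:

```python
import math

# Hypothetical marginal pmfs; independence makes the joint their product.
px = {0: 0.5, 1: 0.3, 2: 0.2}
py = {-1: 0.4, 1: 0.6}
f = lambda x: x * x
g = lambda y: math.exp(y)

# E[f(X) g(Y)] over the product joint pmf.
lhs = sum(f(x) * g(y) * p * q for x, p in px.items() for y, q in py.items())
# E[f(X)] * E[g(Y)] from the marginals.
rhs = (sum(f(x) * p for x, p in px.items())
       * sum(g(y) * q for y, q in py.items()))
```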
