Transformation of Multivariate Random Variables
Formulas
Let the multivariate random variable $X = ( X_{1} , \cdots , X_{n} )$ have joint probability density function $f$, and consider a transformation $u_{1} , \cdots , u_{n}$, which might not be injective: $$ y_{1} = u_{1} (x_{1} , \cdots , x_{n}) \\ \vdots \\ y_{n} = u_{n} (x_{1} , \cdots , x_{n}) $$ The support $S_{X}$ of $X$ is then divided into $k$ partitions $A_{1} , \cdots , A_{i} , \cdots , A_{k}$, on each of which the transformation is injective, so that the following inverse transformations $w_{ji}$, $j = 1, \cdots, n$, $i = 1, \cdots, k$, can be defined: $$ x_{1} = w_{1i} ( y_{1} , \cdots , y_{n} ) \\ \vdots \\ x_{n} = w_{ni} ( y_{1} , \cdots , y_{n} ) $$ Under the transformation $$ Y_{1} = u_{1} (X_{1} , \cdots, X_{n}) \\ \vdots \\ Y_{n} = u_{n} (X_{1} , \cdots, X_{n}) $$ the joint probability density function $g$ of the transformed multivariate random variable $Y = ( Y_{1} , \cdots , Y_{n} )$ is as follows: $$ g(y_{1},\cdots,y_{n}) = \sum_{i=1}^{k} f \left[ w_{1i}(y_{1},\cdots , y_{n}) , \cdots , w_{ni}(y_{1},\cdots , y_{n}) \right] \left| J_{i} \right| $$
- $J_{i}$ is the Jacobian determinant of the $i$-th inverse transformation, $i = 1, \cdots , k$: $J_{i} := \det \begin{bmatrix} {{ \partial w_{1i} } \over { \partial y_{1} }} & \cdots & {{ \partial w_{1i} } \over { \partial y_{n} }} \\ \vdots & \ddots & \vdots \\ {{ \partial w_{ni} } \over { \partial y_{1} }} & \cdots & {{ \partial w_{ni} } \over { \partial y_{n} }} \end{bmatrix}$.
- Caution: for discrete random variables, no Jacobian needs to be calculated. Forgetting this is a basic yet surprisingly common mistake.
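For the injective case $k = 1$, the recipe is simply: invert the transformation, take the Jacobian determinant, and multiply. The following is a minimal sketch, assuming sympy is available (the choice of tool is mine, not part of the formula above), that checks the familiar polar-coordinate inverse $x_{1} = y_{1} \cos y_{2}$, $x_{2} = y_{1} \sin y_{2}$, whose Jacobian determinant should come out to $y_{1}$:

```python
import sympy as sp

y1, y2 = sp.symbols("y1 y2", positive=True)

# Inverse transformation w_1, w_2 written in terms of y1, y2
w1 = y1 * sp.cos(y2)  # x1 = w_1(y1, y2)
w2 = y1 * sp.sin(y2)  # x2 = w_2(y1, y2)

# Jacobian matrix of (w1, w2) with respect to (y1, y2), then its determinant
J = sp.Matrix([w1, w2]).jacobian([y1, y2])
print(sp.simplify(J.det()))  # should print: y1
```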
Examples
Transforming random variables not only looks difficult but also demands honest, tedious calculation. For transformations that are not injective, the Jacobian must be computed separately for each partition, and how challenging this is varies from problem to problem. To get a sense of the difficulty, consider the following example: $$ f(x_{1} , x_{2}) = \begin{cases} {{ 1 } \over { \pi }} &, 0 < x_{1}^{2} + x_{2}^{2} < 1 \\ 0 &, \text{otherwise} \end{cases} $$ A random variable with this probability density function picks points uniformly from inside the unit circle. Naturally, one would pass from Cartesian to polar coordinates, but to aid understanding, consider instead the artificial transformation $Y_{1} = X_{1}^{2} + X_{2}^{2}$, $Y_{2} = X_{1}^{2} / (X_{1}^{2} + X_{2}^{2})$. Because the transformation involves squares, it isn't injective, and the following four cases must be considered: $$ x_{1} = \sqrt{y_{1} y_{2}} \land x_{2} = \sqrt{y_{1} (1 - y_{2}) } \\ x_{1} = \sqrt{y_{1} y_{2}} \land x_{2} = - \sqrt{y_{1} (1 - y_{2}) } \\ x_{1} = -\sqrt{y_{1} y_{2}} \land x_{2} = \sqrt{y_{1} (1 - y_{2}) } \\ x_{1} = - \sqrt{y_{1} y_{2}} \land x_{2} = - \sqrt{y_{1} (1 - y_{2}) } $$ The Jacobian determinant for each case $i = 1, 2, 3, 4$ is calculated as follows: $$ J_{1} = \det \begin{bmatrix} {{ 1 } \over { 2 }} \sqrt{{ y_{2} } \over { y_{1} }} & {{ 1 } \over { 2 }} \sqrt{{ y_{1} } \over { y_{2} }} \\ {{ 1 } \over { 2 }} \sqrt{{ 1 - y_{2} } \over { y_{1} }} & - {{ 1 } \over { 2 }} \sqrt{{ y_{1} } \over { 1 - y_{2} }} \end{bmatrix} = - {{ 1 } \over { 4 \sqrt{y_{2} (1 - y_{2})} }} $$
$$ J_{2} = \det \begin{bmatrix} {{ 1 } \over { 2 }} \sqrt{{ y_{2} } \over { y_{1} }} & {{ 1 } \over { 2 }} \sqrt{{ y_{1} } \over { y_{2} }} \\ - {{ 1 } \over { 2 }} \sqrt{{ 1 - y_{2} } \over { y_{1} }} & {{ 1 } \over { 2 }} \sqrt{{ y_{1} } \over { 1 - y_{2} }} \end{bmatrix} = {{ 1 } \over { 4 \sqrt{y_{2} (1 - y_{2})} }} $$
$$ J_{3} = \det \begin{bmatrix} - {{ 1 } \over { 2 }} \sqrt{{ y_{2} } \over { y_{1} }} & - {{ 1 } \over { 2 }} \sqrt{{ y_{1} } \over { y_{2} }} \\ {{ 1 } \over { 2 }} \sqrt{{ 1 - y_{2} } \over { y_{1} }} & - {{ 1 } \over { 2 }} \sqrt{{ y_{1} } \over { 1 - y_{2} }} \end{bmatrix} = {{ 1 } \over { 4 \sqrt{y_{2} (1 - y_{2})} }} $$
$$ J_{4} = \det \begin{bmatrix} - {{ 1 } \over { 2 }} \sqrt{{ y_{2} } \over { y_{1} }} & - {{ 1 } \over { 2 }} \sqrt{{ y_{1} } \over { y_{2} }} \\ - {{ 1 } \over { 2 }} \sqrt{{ 1 - y_{2} } \over { y_{1} }} & {{ 1 } \over { 2 }} \sqrt{{ y_{1} } \over { 1 - y_{2} }} \end{bmatrix} = - {{ 1 } \over { 4 \sqrt{y_{2} (1 - y_{2})} }} $$ Since every case contributes $|J_{i}| = 1 / \left( 4 \sqrt{y_{2} (1 - y_{2})} \right)$, the new joint probability density function $g$ obtained through $y_{1} , y_{2}$ is, for $0 < y_{1} < 1$ and $0 < y_{2} < 1$, as follows: $$ g(y_{1} , y_{2}) = \sum_{i=1}^{4} {{ 1 } \over { \pi }} \left| \pm {{ 1 } \over { 4 \sqrt{y_{2} (1 - y_{2})} }} \right| = {{ 1 } \over { \pi \sqrt{y_{2} (1 - y_{2})} }} $$
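Sign errors are easy to make in calculations like these, so it can be worth letting a computer algebra system grind through all four branches. Here is a sketch, again assuming sympy, that computes each branch's Jacobian determinant directly from the inverse formulas above:

```python
import sympy as sp

y1, y2 = sp.symbols("y1 y2", positive=True)

for s1 in (1, -1):
    for s2 in (1, -1):
        # Inverse branch: x1 = s1*sqrt(y1*y2), x2 = s2*sqrt(y1*(1 - y2))
        w1 = s1 * sp.sqrt(y1 * y2)
        w2 = s2 * sp.sqrt(y1 * (1 - y2))
        J = sp.Matrix([w1, w2]).jacobian([y1, y2])
        # Each determinant should simplify to +-1/(4*sqrt(y2)*sqrt(1 - y2)),
        # matching the hand computation above
        print(s1, s2, sp.simplify(J.det()))
```

Summing the four terms $f \cdot |J_{i}| = (1/\pi) \cdot 1 / \left( 4 \sqrt{y_{2}(1-y_{2})} \right)$ reproduces $g$ above.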
If these calculations seem nauseating, that’s normal.
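One consolation is that a computer can at least confirm the answer numerically. Below is a quick Monte Carlo sanity check (numpy is my choice here, not something the example prescribes): sample points uniformly from the unit disk by rejection, transform them, and compare the empirical distribution of $Y_{2}$ against the bin probabilities implied by $g$. Integrating $g$ over $y_{1} \in (0, 1)$ shows the marginal of $Y_{2}$ has CDF $F(y_{2}) = (2/\pi) \arcsin \sqrt{y_{2}}$, i.e., the arcsine distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rejection sampling: keep uniform points from the square that land in the disk
pts = rng.uniform(-1.0, 1.0, size=(1_000_000, 2))
pts = pts[(pts ** 2).sum(axis=1) < 1.0]

x1, x2 = pts[:, 0], pts[:, 1]
y2 = x1 ** 2 / (x1 ** 2 + x2 ** 2)

# Empirical bin frequencies of Y2 vs. exact bin probabilities from the CDF
edges = np.linspace(0.0, 1.0, 11)
empirical = np.histogram(y2, bins=edges)[0] / len(y2)
theoretical = np.diff((2.0 / np.pi) * np.arcsin(np.sqrt(edges)))
print(np.c_[empirical, theoretical])  # the two columns should nearly match
```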