Probability Vector

Definition

A vector $\mathbf{p} = \begin{bmatrix}p_{1} & \cdots & p_{n} \end{bmatrix}^{\mathsf{T}}$ that satisfies the following conditions is called a probability vector.

$$ 0 \le p_{i} \le 1 \quad (1 \le i \le n)\quad \text{and} \quad \sum_{i=1}^{n} p_{i} = 1 $$
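The two conditions can be checked numerically. The following is a minimal sketch, not a canonical implementation: the function name `is_probability_vector` and the tolerance `tol` are assumptions introduced here to absorb floating-point rounding.

```python
import math

def is_probability_vector(p, tol=1e-12):
    """Return True if every entry lies in [0, 1] and the entries sum to 1.

    `tol` is an assumed tolerance for floating-point rounding.
    """
    entries_ok = all(-tol <= x <= 1 + tol for x in p)
    total_ok = math.isclose(math.fsum(p), 1.0, abs_tol=tol)
    return entries_ok and total_ok

print(is_probability_vector([0.2, 0.5, 0.3]))   # True
print(is_probability_vector([0.7, 0.7, -0.4]))  # False: one entry is negative
print(is_probability_vector([0.2, 0.2, 0.2]))   # False: entries sum to 0.6
```

The second example shows that both conditions are needed: its entries sum to 1, but it is still not a probability vector.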

Explanation

A probability vector collects the probabilities of each of $n$ states into a single vector. Conceptually, it is analogous to a probability mass function. If a discrete random variable $X$ taking values in $\left\{ 1, \dots, n \right\}$ has probability mass function $p_{X}$, then the probability vector is as follows.

$$ \mathbf{p} = \begin{bmatrix}p_{1} \\ \vdots \\ p_{n} \end{bmatrix} = \begin{bmatrix}p_{X}(1) \\ \vdots \\ p_{X}(n) \end{bmatrix} = \begin{bmatrix}p(1) \\ \vdots \\ p(n) \end{bmatrix} $$

If the probability that the $j$-th state changes to the $i$-th state is $q_{ij} = q(i | j)$, then the probability vector $\mathbf{q}^{(j)} = \begin{bmatrix}q_{1j} & \cdots & q_{nj} \end{bmatrix}^{\mathsf{T}}$ collects the probabilities that the $j$-th state changes to each state, including the probability $q_{jj}$ of remaining in the $j$-th state. The matrix whose columns are these vectors is the transition matrix.

$$ Q = \begin{bmatrix} \vert & \vert & & \vert \\ \mathbf{q}^{(1)} & \mathbf{q}^{(2)} & \cdots & \mathbf{q}^{(n)} \\ \vert & \vert & & \vert \end{bmatrix} = \begin{bmatrix} q_{11} & q_{12} & \cdots & q_{1n} \\ q_{21} & q_{22} & \cdots & q_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ q_{n1} & q_{n2} & \cdots & q_{nn} \end{bmatrix} $$
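As an illustration, the sketch below assembles $Q$ from its column vectors and checks that every column is a probability vector. The three-state numbers and the helper names `from_columns` and `is_column_stochastic` are hypothetical, chosen only for this example. Exact fractions are used so that the sums can be compared with 1 without rounding.

```python
from fractions import Fraction as F

# Hypothetical 3-state example: the j-th vector lists the probabilities
# q(i | j) of moving from state j to each state i.
q1 = [F(1, 2), F(1, 4), F(1, 4)]  # out of state 1
q2 = [F(0), F(1, 2), F(1, 2)]     # out of state 2
q3 = [F(1, 3), F(1, 3), F(1, 3)]  # out of state 3

def from_columns(cols):
    """Assemble a matrix (list of rows) whose j-th column is cols[j]."""
    return [list(row) for row in zip(*cols)]

def is_column_stochastic(Q):
    """True if every column has entries in [0, 1] that sum to exactly 1."""
    return all(
        all(0 <= x <= 1 for x in col) and sum(col) == 1
        for col in zip(*Q)
    )

Q = from_columns([q1, q2, q3])
print(Q[0])                     # first row: q_11, q_12, q_13
print(is_column_stochastic(Q))  # True
print(sum(Q[0]) == 1)           # False: rows need not sum to 1
```

In this convention only the columns are constrained. The first row here sums to $5/6$.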

If $\mathbf{p}$ is the probability vector for the current state, then $\mathbf{p}^{\prime} = Q \mathbf{p}$ is the probability vector for the next state.

$$ \mathbf{p}^{\prime} = Q \mathbf{p} = \begin{bmatrix} \sum\limits_{j} q_{1j} p_{j} \\[1em] \sum\limits_{j} q_{2j} p_{j} \\[1em] \vdots \\[1em] \sum\limits_{j} q_{nj} p_{j} \end{bmatrix} = \begin{bmatrix} \sum\limits_{j} q(1 | j) p(j) \\[1em] \sum\limits_{j} q(2 | j) p(j) \\[1em] \vdots \\[1em] \sum\limits_{j} q(n | j) p(j) \end{bmatrix} = \begin{bmatrix}p^{\prime}(1) \\[1em] p^{\prime}(2) \\[1em] \vdots \\[1em] p^{\prime}(n) \end{bmatrix} $$
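The update $\mathbf{p}^{\prime} = Q \mathbf{p}$ can be traced with a small numerical sketch. The matrix `Q`, the vector `p`, and the function name `step` are hypothetical values and names chosen for illustration. Exact fractions keep the arithmetic free of rounding.

```python
from fractions import Fraction as F

def step(Q, p):
    """Return p' = Q p, i.e. p'_i = sum_j q_ij * p_j."""
    return [sum(q_ij * p_j for q_ij, p_j in zip(row, p)) for row in Q]

# Hypothetical column-stochastic matrix: Q[i][j] holds q(i+1 | j+1).
Q = [
    [F(1, 2), F(0),    F(1, 3)],
    [F(1, 4), F(1, 2), F(1, 3)],
    [F(1, 4), F(1, 2), F(1, 3)],
]
p = [F(1, 2), F(1, 2), F(0)]  # probabilities for the current state

p_next = step(Q, p)
print(p_next)       # [Fraction(1, 4), Fraction(3, 8), Fraction(3, 8)]
print(sum(p_next))  # 1
```

The result is again a probability vector because each column of $Q$ sums to 1: $\sum_{i} p^{\prime}_{i} = \sum_{j} p_{j} \sum_{i} q_{ij} = \sum_{j} p_{j} = 1$.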

Notation

If we write the probability vector for the current state as a row vector $\pi$ and the transition matrix as $P = \begin{bmatrix} - \mathbf{p}^{(1)} - \\ \vdots \\ - \mathbf{p}^{(n)} - \end{bmatrix}$, whose rows $\mathbf{p}^{(i)}$ are probability vectors, the following notation is frequently used.

$$ \pi P = \pi^{\prime} $$

In this case, $P_{ij} = P(j | i)$: the element in the $i$-th row and the $j$-th column is the probability of changing from the $i$-th state to the $j$-th state. The reason for using row-vector notation is not entirely clear. The transition matrix is also commonly denoted by $T$, after the letter t in transition.
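The row and column conventions describe the same update. As a short check, assume that $P$ and $Q$ encode the same transition probabilities, so that $P_{ij} = P(j | i) = q(j | i) = q_{ji}$, that is, $P = Q^{\mathsf{T}}$, and write $\pi = \mathbf{p}^{\mathsf{T}}$ for the column probability vector $\mathbf{p}$. Then:

$$ \pi P = \mathbf{p}^{\mathsf{T}} Q^{\mathsf{T}} = \left( Q \mathbf{p} \right)^{\mathsf{T}} = \left( \mathbf{p}^{\prime} \right)^{\mathsf{T}} = \pi^{\prime} $$

Under this assumption, the rows of $P$ sum to 1 exactly where the columns of $Q$ do.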