Derivation of the Covariance Matrix of the Multinomial Distribution
If a random vector $\mathbf{X} := \left( X_{1} , \cdots , X_{k} \right)$ follows a multinomial distribution $M_{k} \left( n, \mathbf{p} \right)$ with $\mathbf{p} = \left( p_{1} , \cdots , p_{k} \right)$, then its covariance matrix is as follows.
$$
\operatorname{Cov} \left( \mathbf{X} \right) = n \begin{bmatrix}
p_{1} \left( 1 - p_{1} \right) & - p_{1} p_{2} & \cdots & - p_{1} p_{k}
\\ - p_{2} p_{1} & p_{2} \left( 1 - p_{2} \right) & \cdots & - p_{2} p_{k}
\\ \vdots & \vdots & \ddots & \vdots
\\ - p_{k} p_{1} & - p_{k} p_{2} & \cdots & p_{k} \left( 1 - p_{k} \right)
\end{bmatrix}
$$
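The matrix above can be written compactly as $n \left( \operatorname{diag} ( \mathbf{p} ) - \mathbf{p} \mathbf{p}^{T} \right)$. The following is a minimal sketch that builds it with NumPy; the function name `multinomial_cov` and the example values $n = 10$, $\mathbf{p} = (0.2, 0.3, 0.5)$ are illustrative assumptions, not part of the original statement.

```python
import numpy as np

def multinomial_cov(n, p):
    """Covariance matrix n * (diag(p) - p p^T) of a multinomial M_k(n, p).

    `multinomial_cov` is a hypothetical helper name used for illustration.
    """
    p = np.asarray(p, dtype=float)
    # Diagonal entries: n * p_i * (1 - p_i); off-diagonal entries: -n * p_i * p_j.
    return n * (np.diag(p) - np.outer(p, p))

# Example values chosen for illustration only.
n, p = 10, [0.2, 0.3, 0.5]
cov = multinomial_cov(n, p)
print(cov)

# Each row sums to zero because X_1 + ... + X_k = n is constant.
print(cov.sum(axis=1))
```

The zero row sums reflect the constraint that the components add up to $n$: the covariance of any $X_{i}$ with the constant total must vanish.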
Description
The components of a multinomial random vector are not merely dependent; they compete with one another, because the components must sum to $n$. An increase in one component has to be offset by a decrease in the others, so whenever $i \ne j$ (and $p_{i} , p_{j} > 0$), $X_{i}$ and $X_{j}$ are negatively correlated.
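To make the strength of this negative correlation concrete, the covariance $- n p_{i} p_{j}$ can be divided by the two standard deviations $\sqrt{n p_{i} \left( 1 - p_{i} \right)}$ and $\sqrt{n p_{j} \left( 1 - p_{j} \right)}$. This is a short worked step, assuming $0 < p_{i} , p_{j} < 1$ so that neither variance is zero:

$$
\operatorname{Corr} \left( X_{i} , X_{j} \right) = \frac{ - n p_{i} p_{j} }{ \sqrt{ n p_{i} \left( 1 - p_{i} \right) } \sqrt{ n p_{j} \left( 1 - p_{j} \right) } } = - \sqrt{ \frac{ p_{i} p_{j} }{ \left( 1 - p_{i} \right) \left( 1 - p_{j} \right) } }
$$

Note that $n$ cancels, so the correlation depends only on the probabilities. For $k = 2$, where $p_{2} = 1 - p_{1}$, it equals $-1$, as expected since $X_{2} = n - X_{1}$.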
Derivation
If $i = j$, then $\operatorname{Cov} \left( X_{i} , X_{i} \right) = \operatorname{Var} \left( X_{i} \right)$, and each component $X_{i}$ marginally follows a binomial distribution $\text{Bin} \left( n , p_{i} \right)$ (the components are not independent of one another). Thus, the $i$-th diagonal entry of the covariance matrix is $n p_{i} \left( 1 - p_{i} \right)$.
Properties of the multinomial distribution: For $i \ne j$, the sum $X_{i} + X_{j}$ follows a binomial distribution with $n$ trials and success probability $p_{i} + p_{j}$.

$$
X_{i} + X_{j} \sim \text{Bin} \left( n , p_{i} + p_{j} \right)
$$
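This property can be checked exactly for small cases by summing the multinomial probability mass function over all outcomes with a given value of $X_{i} + X_{j}$. The sketch below does this by brute-force enumeration using only the standard library; the helper names and the example values $n = 5$, $\mathbf{p} = (0.2, 0.3, 0.5)$ are assumptions made for illustration.

```python
from itertools import product
from math import comb, factorial, prod

def multinomial_pmf(x, n, p):
    """P(X = x) for a multinomial with n trials and cell probabilities p."""
    coef = factorial(n)
    for xi in x:
        coef //= factorial(xi)  # exact integer division at every step
    return coef * prod(pi ** xi for pi, xi in zip(p, x))

def sum_pmf(n, p, i, j):
    """Exact pmf of X_i + X_j, obtained by enumerating all outcomes."""
    pmf = [0.0] * (n + 1)
    for x in product(range(n + 1), repeat=len(p)):
        if sum(x) == n:  # keep only valid multinomial outcomes
            pmf[x[i] + x[j]] += multinomial_pmf(x, n, p)
    return pmf

def binomial_pmf(n, q):
    """pmf of Bin(n, q) on 0, 1, ..., n."""
    return [comb(n, s) * q**s * (1 - q) ** (n - s) for s in range(n + 1)]

# Example values chosen for illustration only.
n, p = 5, [0.2, 0.3, 0.5]
lumped = sum_pmf(n, p, 0, 1)
target = binomial_pmf(n, p[0] + p[1])

# Largest pointwise difference between the two pmfs (floating-point noise only).
max_gap = max(abs(a - b) for a, b in zip(lumped, target))
print(max_gap)
```

Enumeration grows quickly with $n$ and $k$, so this is only a check for small cases, not a practical way to compute the distribution.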
If $i \ne j$, expanding the variance of the sum $X_{i} + X_{j}$ and substituting the binomial variances from the property above yields the following.
\begin{align*}
&& \operatorname{Var} \left( X_{i} + X_{j} \right) =& \operatorname{Var} X_{i} + \operatorname{Var} X_{j} + 2 \operatorname{Cov} \left( X_{i} , X_{j} \right)
\\ \implies && n \left( p_{i} + p_{j} \right) \left( 1 - p_{i} - p_{j} \right) =& n p_{i} \left( 1 - p_{i} \right) + n p_{j} \left( 1 - p_{j} \right)+ 2 \operatorname{Cov} \left( X_{i} , X_{j} \right)
\\ \implies && n \left( p_{i} + p_{j} \right) \left( - p_{i} - p_{j} \right) =& n p_{i} \left( - p_{i} \right) + n p_{j} \left( - p_{j} \right)+ 2 \operatorname{Cov} \left( X_{i} , X_{j} \right)
\\ \implies && - 2 n p_{i} p_{j} =& 2 \operatorname{Cov} \left( X_{i} , X_{j} \right)
\\ \implies && \operatorname{Cov} \left( X_{i} , X_{j} \right) =& - n p_{i} p_{j}
\end{align*}

Here the third line follows by subtracting $n \left( p_{i} + p_{j} \right)$ from both sides, and the fourth by expanding $- n \left( p_{i} + p_{j} \right)^{2}$ and cancelling $- n p_{i}^{2} - n p_{j}^{2}$.
■
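As a sanity check on the result, the sample covariance of simulated multinomial draws can be compared with $n \left( \operatorname{diag} ( \mathbf{p} ) - \mathbf{p} \mathbf{p}^{T} \right)$. The following is a rough sketch using NumPy's random generator; the seed, the sample size, and the parameter values are arbitrary choices for illustration, and agreement is only approximate because of sampling error.

```python
import numpy as np

# Arbitrary illustrative parameters (not from the derivation itself).
rng = np.random.default_rng(0)
n = 20
p = np.array([0.1, 0.2, 0.3, 0.4])

# Each row is one draw of (X_1, ..., X_k) from M_k(n, p).
samples = rng.multinomial(n, p, size=200_000)

# Sample covariance; rowvar=False treats columns as the variables X_1..X_k.
empirical = np.cov(samples, rowvar=False)

# Closed form derived above: n p_i (1 - p_i) on the diagonal, -n p_i p_j elsewhere.
theoretical = n * (np.diag(p) - np.outer(p, p))

max_abs_error = np.abs(empirical - theoretical).max()
print(np.round(empirical, 3))
print(np.round(theoretical, 3))
print(max_abs_error)
```

With a sample this large the two matrices should agree to roughly two decimal places, and every off-diagonal entry of the empirical matrix should come out negative.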