Direct Sum of Matrices

Definition¹

The direct sum of two matrices $B \in M_{m\times n}$, $C \in M_{p\times q}$ is defined as the following $(m+p) \times (n+q)$ matrix $A$, and is denoted by $B \oplus C$.

$$ A = B \oplus C := \begin{bmatrix} b_{11} & \cdots & b_{1n} & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ b_{m1} & \cdots & b_{mn} & 0 & \cdots & 0 \\ 0 & \cdots & 0 & c_{11} & \cdots & c_{1q} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & c_{p1} & \cdots & c_{pq} \\ \end{bmatrix} $$

$$ A_{ij} := \begin{cases} [B]_{ij} & \text{for } 1\le i \le m,\ 1\le j \le n \\ [C]_{(i-m),(j-n)} & \text{for } m+1\le i \le p+m,\ n+1\le j \le q+n \\ 0 & \text{otherwise} \end{cases} $$

If expressed in block matrix form,

$$ A = \begin{bmatrix} B & O_{mq} \\ O_{pn} & C \end{bmatrix} $$

Here, $O_{mq}$ and $O_{pn}$ denote the $m \times q$ and $p \times n$ zero matrices, respectively.
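
In code, the definition amounts to filling the two diagonal blocks of a zero matrix. Below is a minimal sketch in Python with NumPy; the helper name `direct_sum` is our own choice, not a library function.

```python
import numpy as np

def direct_sum(B: np.ndarray, C: np.ndarray) -> np.ndarray:
    """Return the direct sum B ⊕ C as an (m+p) x (n+q) matrix."""
    m, n = B.shape
    p, q = C.shape
    A = np.zeros((m + p, n + q), dtype=np.result_type(B, C))
    A[:m, :n] = B  # upper-left block is B
    A[m:, n:] = C  # lower-right block is C; the O_{mq}, O_{pn} blocks stay zero
    return A
```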

Generalization

The direct sum of matrices $B_{1}, B_{2}, \dots, B_{k}$ is defined recursively as follows.

$$ B_{1} \oplus B_{2} \oplus \cdots \oplus B_{k} := (B_{1} \oplus B_{2} \oplus \cdots \oplus B_{k-1}) \oplus B_{k} $$

If $A = B_{1} \oplus B_{2} \oplus \cdots \oplus B_{k}$,

$$ A = \begin{bmatrix} B_{1} & O & \cdots & O \\ O & B_{2} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & B_{k} \\ \end{bmatrix} $$
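
Since the generalization is just a left fold of the pairwise operation, it can be sketched by reducing the `direct_sum` helper above over the matrices (the name `direct_sum_all` is likewise our own):

```python
from functools import reduce

def direct_sum_all(*matrices: np.ndarray) -> np.ndarray:
    """Direct sum of B_1, ..., B_k via (B_1 ⊕ ... ⊕ B_{k-1}) ⊕ B_k."""
    return reduce(direct_sum, matrices)
```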

Explanation

Simply put, the direct sum builds a block diagonal matrix out of the given matrices.

$$ B_{1} \oplus B_{2} \oplus \cdots \oplus B_{k} = \href{../2048}{\diag} \begin{bmatrix} B_{1} \\ B_{2} \\ \vdots \\ B_{k} \end{bmatrix} $$

For a concrete example, if $B_{1} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$, $B_{2} = \begin{bmatrix} 2 \end{bmatrix}$, and $B_{3} = \begin{bmatrix} 3 & 3 & 3 \\ 3 & 3 & 3 \\ 3 & 3 & 3 \end{bmatrix}$,

$$ B_{1} \oplus B_{2} \oplus B_{3} = \begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 3 & 3 & 3 \\ 0 & 0 & 0 & 0 & 3 & 3 & 3 \\ 0 & 0 & 0 & 0 & 3 & 3 & 3 \end{bmatrix} $$
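
For readers using SciPy, `scipy.linalg.block_diag` computes exactly this block diagonal construction, so the example above can be checked directly:

```python
import numpy as np
from scipy.linalg import block_diag

B1 = np.ones((2, 3))        # 2x3 matrix of ones
B2 = np.array([[2.0]])      # 1x1 matrix
B3 = 3.0 * np.ones((3, 3))  # 3x3 matrix of threes

A = block_diag(B1, B2, B3)  # B1 ⊕ B2 ⊕ B3
print(A.shape)              # (6, 7), i.e. (2+1+3) x (3+1+3)
```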

In many cases, one encounters the direct sum of subspaces before the direct sum of matrices, and the theorem below explains why this construction deserves the same name. Given a linear transformation $T : V \to V$ with $V = W_{1} \oplus \cdots \oplus W_{k}$ for $T$-invariant subspaces $W_{i}$, the matrix representation of $T$ is exactly the direct sum of the matrix representations of the restrictions $T|_{W_{i}}$, so there is every reason to call this operation a direct sum.

Theorem

Let $T : V \to V$ be a linear transformation on a finite-dimensional vector space $V$. Let $W_{1}, \dots, W_{k}$ be $T$-invariant subspaces, and suppose $V$ is the direct sum of the $W_{i}$.

$$ V = W_{1} \oplus \cdots \oplus W_{k} $$

Let $\beta_{i}$ be an ordered basis of $W_{i}$, and let $\beta = \beta_{1} \cup \cdots \cup \beta_{k}$ (then $\beta$ is an ordered basis for $V$). If $A = \begin{bmatrix} T \end{bmatrix}_{\beta}$ and $B_{i} = \begin{bmatrix} T|_{W_{i}}\end{bmatrix}_{\beta_{i}}$, then the following holds.

$$ A = B_{1} \oplus B_{2} \oplus \cdots \oplus B_{k} = \begin{bmatrix} B_{1} & O & \cdots & O \\ O & B_{2} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & B_{k} \\ \end{bmatrix} $$
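
As a numeric illustration of the theorem (the blocks and basis below are toy choices of ours, not from the text): build $[T]_{\beta}$ as a block diagonal matrix, pass to the standard basis by a change of coordinates, and check that changing back to $\beta$ recovers $B_{1} \oplus B_{2}$.

```python
import numpy as np

# Toy blocks: [T|_{W_1}]_{beta_1} and [T|_{W_2}]_{beta_2}
B1 = np.array([[2.0, 1.0],
               [0.0, 2.0]])
B2 = np.array([[5.0]])
A_beta = np.block([[B1, np.zeros((2, 1))],
                   [np.zeros((1, 2)), B2]])  # [T]_beta = B1 ⊕ B2

# Columns of P are the basis beta: the first two span W_1, the last spans W_2
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
A_std = P @ A_beta @ np.linalg.inv(P)  # [T] in the standard basis

# Changing coordinates back to beta reproduces the direct sum
assert np.allclose(np.linalg.inv(P) @ A_std @ P, A_beta)
```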

Proof

The proof is by mathematical induction.

  • It holds when $k=2$.

    Let $\mathbf{v} \in \beta_{1}$. Since $\beta$ is a basis for $V$, $T \mathbf{v} \in V$ can be expressed as a linear combination of $\beta$. But since $W_{1}$ is $T$-invariant, $T \mathbf{v} \in W_{1}$. Therefore, in the linear combination for $T \mathbf{v}$, the coefficients of the elements of $\beta_{2}$ are all $0$. This means that, with $n = \dim(W_{1})$, the components of the coordinate vector $\begin{bmatrix} T \mathbf{v} \end{bmatrix}_{\beta}$ are all $0$ from the $(n+1)$-th position onwards; this is the claim checked numerically in the sketch after the proof. Therefore, $$ \begin{bmatrix} T|_{W_{1}}\mathbf{v}\end{bmatrix}_{\beta_{1}} = \begin{bmatrix} b_{1} \\ \vdots \\ b_{n} \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} T \mathbf{v} \end{bmatrix}_{\beta} = \begin{bmatrix} b_{1} \\ \vdots \\ b_{n} \\ 0 \\ \vdots \\ 0 \end{bmatrix} $$ Similarly, if $\mathbf{v} \in \beta_{2}$ and $m = \dim(W_{2})$, then $T \mathbf{v} \in W_{2}$, and the coordinate vectors are as follows. $$ \begin{bmatrix} T|_{W_{2}}\mathbf{v}\end{bmatrix}_{\beta_{2}} = \begin{bmatrix} b_{n+1} \\ \vdots \\ b_{n+m} \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} T \mathbf{v} \end{bmatrix}_{\beta} = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b_{n+1} \\ \vdots \\ b_{n+m} \end{bmatrix} $$ Therefore, $$ \begin{bmatrix} T \end{bmatrix}_{\beta} = \begin{bmatrix} \begin{bmatrix} T|_{W_{1}}\end{bmatrix}_{\beta_{1}} & O \\ O & \begin{bmatrix} T|_{W_{2}}\end{bmatrix}_{\beta_{2}} \end{bmatrix} $$

  • If the claim holds for $k-1$, then it also holds for $k$.

    Let $W = W_{1} \oplus \cdots \oplus W_{k-1}$ and $\beta_{W} = \beta_{1} \cup \cdots \cup \beta_{k-1}$. Assuming the claim holds for $k-1$, $$ \begin{bmatrix} T|_{W} \end{bmatrix}_{\beta_{W}} = \begin{bmatrix} \begin{bmatrix} T|_{W_{1}}\end{bmatrix}_{\beta_{1}} & \cdots & O \\ \vdots & \ddots & \vdots \\ O &\cdots & \begin{bmatrix} T|_{W_{k-1}}\end{bmatrix}_{\beta_{k-1}} \end{bmatrix} $$ Since $V = W \oplus W_{k}$ and $\beta = \beta_{W} \cup \beta_{k}$, the case $k=2$ applies, and $$ \begin{bmatrix} T \end{bmatrix}_{\beta} = \begin{bmatrix} \begin{bmatrix} T|_{W}\end{bmatrix}_{\beta_{W}} & O \\ O & \begin{bmatrix} T|_{W_{k}}\end{bmatrix}_{\beta_{k}} \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} T|_{W_{1}}\end{bmatrix}_{\beta_{1}} & \cdots & O & O \\ \vdots & \ddots & \vdots & \vdots \\ O & \cdots & \begin{bmatrix} T|_{W_{k-1}}\end{bmatrix}_{\beta_{k-1}} & O \\ O & \cdots & O & \begin{bmatrix} T|_{W_{k}}\end{bmatrix}_{\beta_{k}} \\ \end{bmatrix} $$
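
The coordinate-vector claim in the base case can likewise be checked numerically. Reusing `A_std` and `P` from the sketch after the theorem, the coordinates of $T\mathbf{v}$ in $\beta$ vanish past position $n = \dim W_{1}$ for every $\mathbf{v} \in \beta_{1}$:

```python
n = 2                                    # n = dim(W_1)
P_inv = np.linalg.inv(P)
for v in P.T[:n]:                        # basis vectors of W_1 (columns of P)
    coords = P_inv @ (A_std @ v)         # the coordinate vector [Tv]_beta
    assert np.allclose(coords[n:], 0.0)  # entries past position n vanish
```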


  1. Stephen H. Friedberg, Linear Algebra (4th Edition, 2002), pp. 320–321 ↩︎