The Matrix Representation of a Linear Transformation
Definition
Let $V$ and $W$ be finite-dimensional vector spaces, let $\beta = \left\{ \mathbf{v}_{1}, \dots, \mathbf{v}_{n} \right\}$ and $\gamma = \left\{ \mathbf{w}_{1}, \dots, \mathbf{w}_{m} \right\}$ be ordered bases for $V$ and $W$, respectively, and let $T : V \to W$ be a linear transformation. Then, by the uniqueness of the basis representation, for each $j$ there exist unique scalars $a_{ij}$ satisfying the following.
$$
T(\mathbf{v}_{j}) = \sum_{i=1}^{m}a_{ij}\mathbf{w}_{i} = a_{1j}\mathbf{w}_{1} + \cdots + a_{mj}\mathbf{w}_{m} \quad \text{ for } 1 \le j \le n
$$
The $m \times n$ matrix $A$ defined by $A_{ij} = a_{ij}$ is called the matrix representation of $T$ relative to the ordered bases $\beta$ and $\gamma$, and is denoted by $[T]_{\gamma, \beta}$ or $[T]_{\beta}^{\gamma}$.
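As a small illustration of the definition (the spaces and bases here are chosen only for this example and do not appear elsewhere in the text), let $T : P_{2}(\mathbb{R}) \to P_{1}(\mathbb{R})$ be differentiation, $T(f) = f'$, with ordered bases $\beta = \left\{ 1, x, x^{2} \right\}$ and $\gamma = \left\{ 1, x \right\}$. Expanding the image of each vector of $\beta$ in terms of $\gamma$ gives the following.
$$
\begin{aligned}
T(1) &= 0 = 0 \cdot 1 + 0 \cdot x \\
T(x) &= 1 = 1 \cdot 1 + 0 \cdot x \\
T(x^{2}) &= 2x = 0 \cdot 1 + 2 \cdot x
\end{aligned}
$$
Reading the coefficients of each expansion as a column, the matrix representation would be the $2 \times 3$ matrix below.
$$
[T]_{\beta}^{\gamma} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
$$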
Explanation
Every linear transformation between finite-dimensional vector spaces can be represented by a matrix, and conversely, every matrix determines a linear transformation. Once ordered bases are fixed, a linear transformation and its matrix representation are therefore essentially the same object. This is one of the reasons why matrices are studied in linear algebra. By the definition, the matrix representation is found from the images of the basis vectors.
If $V=W$ and $\beta=\gamma$, it is simply denoted as follows.
$$
[T]_{\beta} = [T]_{\gamma, \beta}
$$
Properties
Let $V$ and $W$ be finite-dimensional vector spaces with ordered bases $\beta$ and $\gamma$, respectively, and let $T : V \to W$ be a linear transformation. Then the following holds for $T$ and its inverse transformation $T^{-1}$.
$T$ is invertible if and only if $[T]_{\beta}^{\gamma}$ is invertible. Furthermore, since $T^{-1}$ maps $W$ to $V$, its matrix representation satisfies $[T^{-1}]_{\gamma}^{\beta} = ([T]_{\beta}^{\gamma})^{-1}$.
Let $V, W, Z$ be finite-dimensional vector spaces with ordered bases $\alpha, \beta, \gamma$, respectively, and let $T : V \to W$ and $U : W \to Z$ be linear transformations. Then the following holds.
$$
[UT]_{\alpha}^{\gamma} = [U]_{\beta}^{\gamma}[T]_{\alpha}^{\beta}
$$
Finding the Matrix
Let $\beta$ be an ordered basis of $V$ and $\gamma$ an ordered basis of $W$. Denote the coordinate vector of $\mathbf{x} \in V$ by $[\mathbf{x}]_{\beta}$ and the coordinate vector of $T(\mathbf{x})\in W$ by $[T(\mathbf{x})]_{\gamma}$.
Then our goal is to find the $m \times n$ matrix $A$ that transforms the vector $[\mathbf{x}]_{\beta}$ into the vector $[T(\mathbf{x})]_{\gamma}$ by matrix multiplication. Once $A$ is found, the linear transformation $T$ can be carried out by matrix multiplication alone, without evaluating $T(\mathbf{x})$ directly from the given $T$.
\begin{equation}
A[\mathbf{x}]_{\beta} = [T(\mathbf{x})]_{\gamma}
\end{equation}
Write the two bases explicitly as $\beta = \left\{ \mathbf{v}_{1}, \dots, \mathbf{v}_{n} \right\}$ and $\gamma = \left\{ \mathbf{w}_{1}, \dots, \mathbf{w}_{m} \right\}$. Since $(1)$ must hold for each basis vector $\mathbf{v}_{j}$, we obtain the following.
\begin{equation}
A[\mathbf{v}_{1}]_{\beta} = [T(\mathbf{v}_{1})]_{\gamma},\quad A[\mathbf{v}_{2}]_{\beta} = [T(\mathbf{v}_{2})]_{\gamma},\quad \dots,\quad A[\mathbf{v}_{n}]_{\beta} = [T(\mathbf{v}_{n})]_{\gamma}
\end{equation}
Let the matrix $A$ be as follows.
$$
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
$$
Since $\mathbf{v}_{j} = 0\mathbf{v}_{1} + \cdots + 1\mathbf{v}_{j} + \cdots + 0\mathbf{v}_{n}$, the coordinate vectors $[\mathbf{v}_{j}]_{\beta}$ are the standard unit vectors.
$$
[\mathbf{v}_{1}]_{\beta} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad
[\mathbf{v}_{2}]_{\beta} = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \dots,\quad
[\mathbf{v}_{n}]_{\beta} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
$$
Therefore, multiplying $A$ by $[\mathbf{v}_{j}]_{\beta}$ picks out the $j$th column of $A$.
$$
\begin{align*}
A[\mathbf{v}_{1}]_{\beta} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} &= \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
\\[3em]
A[\mathbf{v}_{2}]_{\beta} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} &= \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
\\[1em]
&\vdots \\[1em]
A[\mathbf{v}_{n}]_{\beta} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} &= \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
\end{align*}
$$
Then, by $(2)$, we obtain the following.
$$
[T(\mathbf{v}_{1})]_{\gamma} = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix},\quad [T(\mathbf{v}_{2})]_{\gamma} = \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix},\quad \dots,\quad [T(\mathbf{v}_{n})]_{\gamma} = \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
$$
Therefore, the $j$th column of the matrix $A$ is $[T(\mathbf{v}_{j})]_{\gamma}$.
$$
A = \begin{bmatrix} [T(\mathbf{v}_{1})]_{\gamma} & [T(\mathbf{v}_{2})]_{\gamma} & \cdots & [T(\mathbf{v}_{n})]_{\gamma}\end{bmatrix}
$$
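The column-by-column description above can be sketched in code. The following is a minimal sketch rather than part of the original text: it assumes $V = \mathbb{R}^{2}$ and $W = \mathbb{R}^{3}$ with the standard bases, so that the coordinate vector $[T(\mathbf{v}_{j})]_{\gamma}$ is simply the tuple $T(\mathbf{v}_{j})$. The map `T` and the helper `matrix_of` are hypothetical names chosen for illustration.

```python
def T(v):
    """Example linear map R^2 -> R^3 (an assumption made for this sketch)."""
    x, y = v
    return (x + 3 * y, 0, 2 * x - 4 * y)


def matrix_of(transform, basis):
    """Return the matrix whose j-th column is transform(basis[j]).

    Assumes the codomain uses the standard basis, so the image tuple
    is already its own coordinate vector.
    """
    columns = [transform(v) for v in basis]
    rows = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[col[i] for col in columns] for i in range(rows)]


beta = [(1, 0), (0, 1)]   # standard ordered basis of R^2
A = matrix_of(T, beta)

for row in A:
    print(row)
```

Here the images are $T(1,0) = (1, 0, 2)$ and $T(0,1) = (3, 0, -4)$, and these appear as the first and second columns of `A`.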
Both sides of $(1)$ are linear in $\mathbf{x}$ and, with this choice of $A$, agree on every basis vector $\mathbf{v}_{j}$, so they agree for every $\mathbf{x} \in V$. Thus, the following equation holds.
$$
[T]_{\gamma, \beta} [\mathbf{x}]_{\beta} = [T(\mathbf{x})]_{\gamma} = [T]_{\beta}^{\gamma}[\mathbf{x}]_{\beta}
$$
Intuitively, the two adjacent occurrences of $\beta$ (the subscript of $[T]_{\beta}^{\gamma}$ and the subscript of $[\mathbf{x}]_{\beta}$) cancel, and $\mathbf{x}$ is substituted into $T$, leaving coordinates relative to $\gamma$.
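The identity $[T]_{\beta}^{\gamma}[\mathbf{x}]_{\beta} = [T(\mathbf{x})]_{\gamma}$ can be checked numerically on a small case. The sketch below is illustrative only and rests on several assumptions not found in the original text: $V = \mathbb{R}^{2}$ with a non-standard ordered basis `beta`, $W = \mathbb{R}^{3}$ with the standard basis as $\gamma$, and hypothetical helper names `T`, `matrix_of`, `mat_vec`, and `from_coords`.

```python
def T(v):
    """Example linear map R^2 -> R^3 (an assumption made for this sketch)."""
    x, y = v
    return (x + 3 * y, 0, 2 * x - 4 * y)


def matrix_of(transform, basis):
    """Matrix whose j-th column is transform(basis[j]) in standard coordinates."""
    columns = [transform(v) for v in basis]
    rows = len(columns[0])
    return [[col[i] for col in columns] for i in range(rows)]


def mat_vec(A, c):
    """Ordinary matrix-vector product A c."""
    return tuple(sum(a * x for a, x in zip(row, c)) for row in A)


def from_coords(basis, c):
    """Rebuild x = sum_j c[j] * basis[j] from its coordinate vector c."""
    dim = len(basis[0])
    return tuple(sum(cj * v[i] for cj, v in zip(c, basis)) for i in range(dim))


beta = [(1, 1), (1, -1)]        # a non-standard ordered basis of R^2
A = matrix_of(T, beta)          # [T] relative to beta and the standard basis

x_beta = (2, 3)                 # the coordinate vector [x]_beta
x = from_coords(beta, x_beta)   # x = 2*(1, 1) + 3*(1, -1)

lhs = mat_vec(A, x_beta)        # A [x]_beta
rhs = T(x)                      # [T(x)]_gamma, since gamma is the standard basis
print(lhs == rhs)
```

Note that `A` is multiplied by the coordinate vector `x_beta`, not by `x` itself; with a non-standard $\beta$ these differ, which is why the subscripts in the identity matter.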