Norm of Linear Transformations
Definition [1]
The norm of a linear transformation $T \in L(\mathbb{R}^{n}, \mathbb{R}^{m})$ is defined as follows.
$$ \begin{equation} \| T \| := \sup \limits_{| \mathbf{x} | = 1} | T ( \mathbf{x} ) | \end{equation} $$
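To make the supremum concrete, here is a minimal Python sketch that approximates $\| T \|$ for a map on $\mathbb{R}^{2}$ by evaluating $| T(\mathbf{x}) |$ on sampled unit vectors. The helper names (`apply`, `euclid`, `estimate_norm_2d`) and the example matrix are assumptions made for illustration, and sampling only approximates the supremum from below.

```python
import math

def apply(matrix, x):
    """Apply the linear map given by `matrix` (a list of rows) to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]

def euclid(v):
    """Euclidean length |v|."""
    return math.sqrt(sum(c * c for c in v))

def estimate_norm_2d(matrix, samples=10_000):
    """Approximate sup_{|x|=1} |T(x)| for T: R^2 -> R^m by sampling the unit circle."""
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        x = [math.cos(theta), math.sin(theta)]  # |x| = 1 by construction
        best = max(best, euclid(apply(matrix, x)))
    return best

# Hypothetical example: T(x1, x2) = (3*x1, x2); the largest stretch occurs at e_1.
T = [[3.0, 0.0], [0.0, 1.0]]
print(estimate_norm_2d(T))  # a value close to 3
```

The estimate can never exceed the true norm, because every sampled vector is one of the candidates in the supremum.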
Explanation
By theorem (a), the following inequality holds for every $\mathbf{x} \ne \mathbf{0}$, so $\| T \|$ bounds the factor by which $T$ can change the size of an element of $\mathbb{R}^{n}$ when mapping it into $\mathbb{R}^{m}$. In other words, however much the size changes, it changes by a factor of at most $\| T \|$.
$$ \dfrac{|T(\mathbf{x})|}{|\mathbf{x}|} \le \| T \| $$
It also follows from the definition that $\| T \|$ is the smallest $\lambda$ that satisfies the following.
$$ | T (\mathbf{x}) | \le \lambda | \mathbf{x} | , \quad \forall \mathbf{x} \in \mathbb{R}^{n} $$
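As a small worked instance of this characterization (the map below is chosen purely for illustration and does not appear in the reference), take $T \in L(\mathbb{R}^{2}, \mathbb{R}^{2})$ given by $T(x_{1}, x_{2}) = (3x_{1}, x_{2})$. Then the following holds for every $\mathbf{x} = (x_{1}, x_{2})$, so $\lambda = 3$ is admissible.
$$ | T(\mathbf{x}) |^{2} = 9x_{1}^{2} + x_{2}^{2} \le 9 \left( x_{1}^{2} + x_{2}^{2} \right) = 9 | \mathbf{x} |^{2} $$
On the other hand, $| T(\mathbf{e}_{1}) | = 3 = 3 | \mathbf{e}_{1} |$, so no $\lambda < 3$ is admissible, and therefore $\| T \| = 3$.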
It can be easily verified that $\| T \|$ satisfies the definition of a norm.
- $\| T \| \ge 0$
- $\| T \| = 0 \iff T = 0$
- $\| c T \| = | c | \| T \|$
- $\| T_{1} + T_{2} \| \le \| T_{1} \| + \| T_{2} \|$
Consequently, a distance on $L(\mathbb{R}^{n}, \mathbb{R}^{m})$ can be defined as follows, which makes $L(\mathbb{R}^{n}, \mathbb{R}^{m})$ a metric space.
$$ d(T_{1}, T_{2}) = \| T_{1} - T_{2} \|,\quad T_{1},T_{2} \in L(\mathbb{R}^{n}, \mathbb{R}^{m}) $$
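The distance above can be sketched numerically as follows. This is only an illustration under stated assumptions: maps on $\mathbb{R}^{2}$ are represented by lists of rows, the function names `op_norm_2d` and `distance` are hypothetical, and the norm is approximated by sampling the unit circle rather than computed exactly.

```python
import math

def op_norm_2d(matrix, samples=3600):
    """Approximate the norm of a map on R^2 by sampling unit vectors."""
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        c, s = math.cos(theta), math.sin(theta)
        image = [row[0] * c + row[1] * s for row in matrix]
        best = max(best, math.sqrt(sum(v * v for v in image)))
    return best

def distance(t1, t2):
    """d(T1, T2) = ||T1 - T2||, with the difference taken entrywise."""
    diff = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(t1, t2)]
    return op_norm_2d(diff)

# Hypothetical example maps: zero, identity, and scaling by 2.
zero = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
scaling = [[2.0, 0.0], [0.0, 2.0]]

print(distance(identity, scaling))   # ||I - 2I|| = ||-I||
print(distance(identity, identity))  # a map is at distance 0 from itself
```

With these three maps, the triangle inequality $d(0, 2I) \le d(0, I) + d(I, 2I)$ holds with equality up to rounding.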
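Definition $(1)$ and the inequality in theorem (a) characterize the same quantity: the supremum in $(1)$ satisfies the inequality of (a), and conversely, the smallest constant $K$ for which that inequality holds is equal to the supremum.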
$$ \| T \| := \sup \limits_{| \mathbf{x} | = 1} | T ( \mathbf{x} ) | \implies | T(\mathbf{x}) | \le \| T \| | \mathbf{x} |,\quad \forall \mathbf{x} \in \mathbb{R}^{n} $$
$$ \| T \| := \min \left\{ K : | T(\mathbf{x}) | \le K | \mathbf{x} |,\quad \forall \mathbf{x} \in \mathbb{R}^{n} \right\} \implies \| T \| = \sup \limits_{| \mathbf{x} | = 1} | T ( \mathbf{x} ) | $$
Theorem
(a) If $T \in L(\mathbb{R}^{n}, \mathbb{R}^{m})$, the following holds. $$ | T(\mathbf{x}) | \le \| T \| | \mathbf{x} |,\quad \forall \mathbf{x} \in \mathbb{R}^{n} $$
(b) If $T \in L(\mathbb{R}^{n}, \mathbb{R}^{m})$, then $\| T \| < \infty$ and $T$ is uniformly continuous.
(c) If $T_{1} \in L(\mathbb{R}^{n}, \mathbb{R}^{m})$ and $T_{2} \in L(\mathbb{R}^{m}, \mathbb{R}^{k})$, the following holds. $$ \|T_{2}\circ T_{1} \| \le \| T_{2} \| \| T_{1} \| $$
Proof
(a)
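If $\mathbf{x} = \mathbf{0}$, then $T(\mathbf{x}) = \mathbf{0}$ and both sides of the inequality are zero, so suppose $\mathbf{x} \ne \mathbf{0}$. Then, since $T$ is a linear transformation, the following holds.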
$$ \begin{align*} \dfrac{ | T(\mathbf{x}) |}{|\mathbf{x}|} &= \dfrac{1}{|\mathbf{x}|}|T(\mathbf{x})| \\ &= \left| \dfrac{1}{|\mathbf{x}|} T(\mathbf{x}) \right| \\ &= \left| T\left( \dfrac{\mathbf{x}}{|\mathbf{x}|} \right) \right| \end{align*} $$
Then, since $\left| \dfrac{\mathbf{x}}{|\mathbf{x}|} \right| = 1$, by the definition of $\| T \|$, the following holds.
$$ \begin{align*} && \dfrac{ | T(\mathbf{x}) |}{|\mathbf{x}|} &\le \| T \| \\ \implies && | T(\mathbf{x}) | &\le \| T \| |\mathbf{x}|,\quad \forall \mathbf{x} \in \mathbb{R}^{n} \end{align*} $$
■
(b)
Let $\left\{ \mathbf{e}_{1}, \dots, \mathbf{e}_{n} \right\}$ be the standard basis of $\mathbb{R}^{n}$. Then every $\mathbf{x} \in \mathbb{R}^{n}$ with $| \mathbf{x} | \le 1$ can be written as follows.
$$ \mathbf{x} = \sum _{i=1}^{n} c_{i}\mathbf{e}_{i} \quad \text{and} \quad |c_{i}| \le 1 $$
Then, since $T$ is a linear transformation, the following holds.
$$ | T (\mathbf{x}) | = \left| T \left( \sum _{i=1}^{n} c_{i} \mathbf{e}_{i} \right) \right| = \left| \sum _{i=1}^{n} c_{i} T \left(\mathbf{e}_{i} \right) \right| \le \sum _{i=1}^{n} | c_{i} | | T (\mathbf{e}_{i}) | \le \sum _{i=1}^{n} | T (\mathbf{e}_{i}) | $$
Since this bound holds in particular for every $\mathbf{x}$ with $| \mathbf{x} | = 1$, taking the supremum gives the following.
$$ \| T \| \le \sum _{i=1}^{n} | T (\mathbf{e}_{i}) | < \infty $$
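The bound just obtained can be checked numerically. The sketch below is an illustration only: the matrix `A` is an arbitrary example, the helper names are hypothetical, and the norm is approximated by sampling the unit circle, so the left-hand side is an estimate from below.

```python
import math

def length(v):
    """Euclidean length |v|."""
    return math.sqrt(sum(c * c for c in v))

def column(matrix, i):
    """T(e_i) is the i-th column of the matrix representing T."""
    return [row[i] for row in matrix]

def op_norm_2d(matrix, samples=3600):
    """Approximate sup_{|x|=1} |T(x)| for a map on R^2 by sampling unit vectors."""
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        c, s = math.cos(theta), math.sin(theta)
        best = max(best, length([row[0] * c + row[1] * s for row in matrix]))
    return best

# Arbitrary example map on R^2 (an assumption for illustration).
A = [[1.0, 2.0], [3.0, 4.0]]

bound = sum(length(column(A, i)) for i in range(2))  # |T(e_1)| + |T(e_2)|
estimate = op_norm_2d(A)
print(estimate, "<=", bound)
```

Here the bound is $\sqrt{10} + \sqrt{20}$, which is noticeably larger than the estimated norm: the inequality guarantees finiteness, not a sharp value.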
By (a) and the linearity of $T$, which gives $T(\mathbf{x}) - T(\mathbf{y}) = T(\mathbf{x} - \mathbf{y})$, the following holds for $\mathbf{x}, \mathbf{y} \in \mathbb{R}^{n}$.
$$ |T(\mathbf{x}) - T(\mathbf{y})| \le \| T \| | \mathbf{x} - \mathbf{y} | $$
Let $\varepsilon > 0$ be given. If $\| T \| = 0$, then $T = 0$ and there is nothing to prove, so suppose $\| T \| > 0$ and let $\delta = \dfrac{\varepsilon}{\| T \|}$. Then, since the following holds, $T$ is uniformly continuous.
$$ | \mathbf{x} - \mathbf{y} | < \delta \implies |T(\mathbf{x}) - T(\mathbf{y})| \le \| T \| | \mathbf{x} - \mathbf{y} | < \| T \| \dfrac{\varepsilon}{\| T \|} = \varepsilon $$
■
(c)
By (a), the following holds.
$$ | (T_{2}\circ T_{1}) (\mathbf{x}) | = | T_{2} (T_{1} (\mathbf{x})) | \le \| T_{2} \| |T_{1} (\mathbf{x}) | \le \| T_{2} \| \|T_{1}\| | \mathbf{x} | $$
Taking the supremum over all $\mathbf{x}$ with $| \mathbf{x} | = 1$, we obtain the following.
$$ \| T_{2} \circ T_{1} \| \le \| T_{2} \| \|T_{1}\| $$
■
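The inequality in (c) can be strict. As an illustration (these two maps are chosen for the example and are not taken from the reference), let $T_{1}, T_{2} \in L(\mathbb{R}^{2}, \mathbb{R}^{2})$ be the coordinate projections $T_{1}(x_{1}, x_{2}) = (x_{1}, 0)$ and $T_{2}(x_{1}, x_{2}) = (0, x_{2})$. Then the composition is the zero map.
$$ (T_{2} \circ T_{1})(x_{1}, x_{2}) = T_{2}(x_{1}, 0) = (0, 0) $$
Since $| T_{1}(\mathbf{x}) | \le | \mathbf{x} |$ with equality at $\mathbf{e}_{1}$, and $| T_{2}(\mathbf{x}) | \le | \mathbf{x} |$ with equality at $\mathbf{e}_{2}$, both norms are $1$, and the following holds.
$$ \| T_{2} \circ T_{1} \| = 0 < 1 = \| T_{2} \| \| T_{1} \| $$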
[1] Walter Rudin, Principles of Mathematical Analysis (3rd Edition, 1976), p. 208