Partial Derivatives: Derivatives of Multivariable Vector Functions
Buildup[^1]
Recall the definition of the derivative of a univariate function.
$$ \lim \limits_{h\to 0} \dfrac{f(x+h) - f(x)}{h} = f^{\prime}(x) $$
Writing the numerator on the left-hand side as a linear term in $h$ plus a remainder, we get the following.
$$ \begin{equation} f(x+h) - f(x) = a h + r(h) \label{1} \end{equation} $$
Let’s call $r(h)$ the remainder, satisfying the condition below.
$$ \lim \limits_{h \to 0} \dfrac{r(h)}{h}=0 $$
Then, dividing both sides of $\eqref{1}$ by $h$ and letting $h \to 0$, we get the following.
$$ \lim \limits_{h\to 0} \dfrac{f(x+h) - f(x)}{h} = \lim \limits_{h\to 0} \dfrac{ah+ r(h)}{h} = a + \lim \limits_{h\to 0} \dfrac{r(h)}{h} = a $$
Here, $a$ is the coefficient of the first-order term in $h$ in the approximation above. In this sense, $a$ is referred to as the derivative “coefficient” of $f$ at $x$. Rearranging the equation above slightly, we see that the derivative coefficient of $f$ at $x$ is precisely the number $a$ satisfying the following.
$$ \lim \limits_{h\to 0} \dfrac{f(x+h) - f(x) - ah}{h} = \lim \limits_{h\to 0} \dfrac{r(h)}{h} = 0 $$
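As a quick sanity check, take $f(x) = x^{2}$ for instance. Then
$$ f(x+h) - f(x) = (x+h)^{2} - x^{2} = 2x\,h + h^{2}, $$
so $a = 2x$ and $r(h) = h^{2}$, which indeed satisfies $\lim\limits_{h \to 0} r(h)/h = \lim\limits_{h \to 0} h = 0$. Hence $f^{\prime}(x) = 2x$, as expected.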
This forms the basis for defining the derivative of a multivariable vector function.
Definition
Let $E\subset \mathbb{R}^{n}$ be an open set and let $\mathbf{x}\in E$. A function $\mathbf{f} : E \to \mathbb{R}^{m}$ is said to be differentiable at $\mathbf{x}$ if there exists a linear transformation $A\in L(\mathbb{R}^{n}, \mathbb{R}^{m})$ satisfying the following for $\mathbf{h} \in \mathbb{R}^{n}$. Furthermore, $A$ is called the total derivative, or simply the derivative, of $\mathbf{f}$ at $\mathbf{x}$ and is denoted by $\mathbf{f}^{\prime}(\mathbf{x})$.
$$ \begin{equation} \lim \limits_{|\mathbf{h}| \to 0} \dfrac{| \mathbf{f} ( \mathbf{x} + \mathbf{h}) - \mathbf{f} (\mathbf{x}) - A( \mathbf{h} )|}{|\mathbf{h}|} = 0 \label{2} \end{equation} $$
If $\mathbf{f}$ is differentiable at all points in $E$, then $\mathbf{f}$ is said to be differentiable in $E$.
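For instance, consider $\mathbf{f} : \mathbb{R}^{2} \to \mathbb{R}^{2}$ defined by $\mathbf{f}(x, y) = (x^{2} + y,\ xy)$. At a point $\mathbf{x} = (x, y)$ with increment $\mathbf{h} = (h_{1}, h_{2})$,
$$ \mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x}) = \left( 2x h_{1} + h_{2} + h_{1}^{2},\ y h_{1} + x h_{2} + h_{1} h_{2} \right) $$
Taking $A(\mathbf{h}) = \left( 2x h_{1} + h_{2},\ y h_{1} + x h_{2} \right)$ leaves the remainder $\left( h_{1}^{2},\ h_{1} h_{2} \right)$, whose norm is at most $|h_{1}|\,|\mathbf{h}| \le |\mathbf{h}|^{2}$, so $\eqref{2}$ holds and $\mathbf{f}$ is differentiable at every point, with $\mathbf{f}^{\prime}(x, y) = A$.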
Explanation
The term “total” refers to the whole, in contrast to the “partial” of partial derivatives.
It is important to note that $\mathbf{f}^{\prime}(\mathbf{x})$ is not a function value but a linear transformation $\mathbf{f}^{\prime}(\mathbf{x}) : \mathbb{R}^{n} \to \mathbb{R}^{m}$. Therefore, $\mathbf{f}^{\prime}(\mathbf{x}) = A$ can be represented as an $m \times n$ matrix as follows.
$$ \mathbf{f}^{\prime}(\mathbf{x}) = A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} $$
The total derivative $\mathbf{f}^{\prime}$ of $\mathbf{f}$ can then be seen as a function that assigns a matrix $A$ to each $\mathbf{x} \in E \subset \mathbb{R}^{n}$. This matrix can be computed directly from the partial derivatives of $\mathbf{f}$ and is known as the Jacobian matrix.
$$ \mathbf{f}^{\prime}(\mathbf{x}) = \begin{bmatrix} (D_{1}f_{1}) (\mathbf{x}) & (D_{2}f_{1}) (\mathbf{x}) & \cdots & (D_{n}f_{1}) (\mathbf{x}) \\ (D_{1}f_{2}) (\mathbf{x}) & (D_{2}f_{2}) (\mathbf{x}) & \cdots & (D_{n}f_{2}) (\mathbf{x}) \\ \vdots & \vdots & \ddots & \vdots \\ (D_{1}f_{m}) (\mathbf{x}) & (D_{2}f_{m}) (\mathbf{x}) & \cdots & (D_{n}f_{m}) (\mathbf{x}) \end{bmatrix} $$
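As a rough numerical check, the sketch below compares the analytic Jacobian of the example map $\mathbf{f}(x, y) = (x^{2} + y,\ xy)$ used above against a forward-difference approximation; the helper names (`f`, `analytic_jacobian`, `numerical_jacobian`) are illustrative, not from any particular library.

```python
import numpy as np

def f(v):
    """Example map f(x, y) = (x^2 + y, x*y) from R^2 to R^2."""
    x, y = v
    return np.array([x**2 + y, x * y])

def analytic_jacobian(v):
    """Jacobian of f: the matrix of partial derivatives (D_j f_i)(x)."""
    x, y = v
    return np.array([[2 * x, 1.0],
                     [y,     x]])

def numerical_jacobian(func, v, eps=1e-6):
    """Forward-difference approximation of the Jacobian of func at v."""
    v = np.asarray(v, dtype=float)
    f0 = func(v)
    J = np.zeros((f0.size, v.size))
    for j in range(v.size):
        step = np.zeros_like(v)
        step[j] = eps
        J[:, j] = (func(v + step) - f0) / eps
    return J

x0 = np.array([1.0, 2.0])
print(analytic_jacobian(x0))       # [[2. 1.], [2. 1.]]
print(numerical_jacobian(f, x0))   # agrees up to O(eps)
```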
The total derivative is the natural generalization of differentiation to functions between finite-dimensional spaces; extending the domain and codomain of $\mathbf{f}$ to Banach spaces gives the Fréchet derivative. The properties that hold for univariate functions carry over naturally.
- Uniqueness
- Chain Rule
Theorem
Uniqueness
Let $E, \mathbf{x}, \mathbf{f}$ be as in the Definition. If $A_{1}$ and $A_{2}$ both satisfy $\eqref{2}$, then the two linear transformations are equal.
$$ A_{1} = A_{2} $$
Proof
Let $B = A_{1} - A_{2}$. Adding and subtracting $\mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x})$ and applying the triangle inequality gives the following.
$$ \begin{align*} | B( \mathbf{h} ) | &= \left| A_{1}(\mathbf{h}) - A_{2}(\mathbf{h}) \right| \\ &= | A_{1}(\mathbf{h}) - \mathbf{f} (\mathbf{x} + \mathbf{h}) + \mathbf{f} (\mathbf{x}) + \mathbf{f} (\mathbf{x} + \mathbf{h}) - \mathbf{f} (\mathbf{x}) - A_{2}(\mathbf{h}) | \\ &\le | \mathbf{f} (\mathbf{x} + \mathbf{h}) - \mathbf{f} (\mathbf{x}) - A_{1}(\mathbf{h}) | + | \mathbf{f} (\mathbf{x} + \mathbf{h}) - \mathbf{f} (\mathbf{x}) - A_{2}(\mathbf{h}) | \end{align*} $$
Now fix $\mathbf{h} \ne \mathbf{0}$ and replace $\mathbf{h}$ with $t\mathbf{h}$ above. Dividing by $| t\mathbf{h} |$ and letting $t \to 0$, the differentiability of $\mathbf{f}$ at $\mathbf{x}$ gives the following.
$$ \lim _{t \to 0} \dfrac{ | B( t\mathbf{h} ) |}{| t\mathbf{h} |} \le \lim _{t \to 0}\dfrac{ | \mathbf{f} (\mathbf{x} + t\mathbf{h}) - \mathbf{f} (\mathbf{x}) - A_{1}(t\mathbf{h}) |}{| t\mathbf{h} |} + \lim _{t \to 0}\dfrac{| \mathbf{f} (\mathbf{x} + t\mathbf{h}) - \mathbf{f} (\mathbf{x}) - A_{2}(t\mathbf{h}) |}{| t\mathbf{h} |}=0 $$
However, since $B$ is a linear transformation, the left-hand side is independent of $t$.
$$ \lim _{t \to 0} \dfrac{ | tB( \mathbf{h} ) |}{| t\mathbf{h} |} = \lim _{t \to 0} \dfrac{ | B( \mathbf{h} ) |}{| \mathbf{h} |} = \dfrac{ | B( \mathbf{h} ) |}{| \mathbf{h} |} \le 0 $$
Since $| B( \mathbf{h} ) | \ge 0$, this forces $B(\mathbf{h}) = \mathbf{0}$; because $\mathbf{h} \ne \mathbf{0}$ was arbitrary (and trivially $B(\mathbf{0}) = \mathbf{0}$), it must be that $B = 0$. Thus, we obtain the following.
$$ B=A_{1}-A_{2}=0 \implies A_{1} = A_{2} $$
■
Chain Rule
As in the Definition, let $E \subset \mathbb{R}^{n}$ be an open set and let $\mathbf{f} : E \to \mathbb{R}^{m}$ be differentiable at $\mathbf{x}_{0} \in E$. Let $\mathbf{g} : \mathbf{f}(E) \to \mathbb{R}^{k}$ be differentiable at $\mathbf{f}(\mathbf{x}_{0}) \in \mathbf{f}(E)$, and let $\mathbf{F} : E \to \mathbb{R}^{k}$ be the composition of $\mathbf{f}$ and $\mathbf{g}$.
$$ \mathbf{F} (\mathbf{x}) = \mathbf{g} \left( \mathbf{f}(\mathbf{x}) \right) $$
Then, $\mathbf{F}$ is differentiable at $\mathbf{x}_{0}$, and its total derivative is the following composition of the two linear transformations; as matrices, it is the product of the $k \times m$ Jacobian of $\mathbf{g}$ and the $m \times n$ Jacobian of $\mathbf{f}$.
$$ \mathbf{F}^{\prime} (\mathbf{x}_{0}) = \mathbf{g}^{\prime} \left( \mathbf{f}(\mathbf{x}_{0}) \right) \mathbf{f}^{\prime} (\mathbf{x}_{0}) $$
Proof
See the generalized proof for normed spaces.
■
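As a rough numerical illustration of the theorem, the sketch below checks that a finite-difference Jacobian of $\mathbf{F} = \mathbf{g} \circ \mathbf{f}$ agrees with the product $\mathbf{g}^{\prime}(\mathbf{f}(\mathbf{x}_{0}))\, \mathbf{f}^{\prime}(\mathbf{x}_{0})$ at a sample point; the maps `f`, `g` and the helper `jacobian` are made up for this example.

```python
import numpy as np

def jacobian(func, v, eps=1e-6):
    """Forward-difference Jacobian of func at v."""
    v = np.asarray(v, dtype=float)
    f0 = func(v)
    return np.column_stack([(func(v + eps * e) - f0) / eps
                            for e in np.eye(v.size)])

def f(v):                        # f : R^2 -> R^2
    x, y = v
    return np.array([x**2 + y, x * y])

def g(w):                        # g : R^2 -> R^3
    u, s = w
    return np.array([np.sin(u), u * s, np.exp(s)])

def F(v):                        # F = g o f : R^2 -> R^3
    return g(f(v))

x0 = np.array([1.0, 2.0])
lhs = jacobian(F, x0)                          # F'(x0)
rhs = jacobian(g, f(x0)) @ jacobian(f, x0)     # g'(f(x0)) f'(x0)
print(np.allclose(lhs, rhs, atol=1e-4))        # should print True
```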