Chain Rule for Fréchet Derivatives

Theorem

Let $(X, \left\| \cdot \right\|_{X}), (Y, \left\| \cdot \right\|_{Y}), (Z, \left\| \cdot \right\|_{Z})$ be Banach spaces, and let $\Omega \subset X$, $U \subset Y$ be open sets. Suppose functions $F : \Omega \to Y$ and $G : U \to Z$ are given with $F(\Omega) \subset U$. Assume that $F$ is differentiable at $x\in\Omega$ in the sense of Fréchet, and that $G$ is differentiable at $z=F(x)\in U$ in the same sense. Then $H:=G \circ F$ is also differentiable at $x\in \Omega$, and the following equation holds:

$$ DH(x) = DG(z)DF(x)=DG\big( F(x) \big)\cdot DF(x) $$
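As a quick finite-dimensional sanity check of this identity, the sketch below compares the product of two Jacobians with a finite-difference Jacobian of the composition. The maps `F` and `G`, the point `x0`, and the helpers `matmul` and `numerical_jacobian` are all illustrative choices made here, not part of the theorem; in $\mathbb{R}^{n}$ the Fréchet derivative of a smooth map is represented by its Jacobian matrix.

```python
import math

def F(x):
    # Illustrative smooth map F: R^2 -> R^2 (an assumption for this sketch).
    x1, x2 = x
    return [x1 * x2, math.sin(x1) + x2 ** 2]

def G(z):
    # Illustrative smooth map G: R^2 -> R^2 (an assumption for this sketch).
    z1, z2 = z
    return [math.exp(z1) + z2, z1 * z2]

def DF(x):
    # Jacobian matrix of F at x, computed by hand.
    x1, x2 = x
    return [[x2, x1], [math.cos(x1), 2 * x2]]

def DG(z):
    # Jacobian matrix of G at z, computed by hand.
    z1, z2 = z
    return [[math.exp(z1), 1.0], [z2, z1]]

def H(x):
    return G(F(x))

def matmul(A, B):
    # Plain matrix product of nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def numerical_jacobian(func, x, h=1e-6):
    # Central differences: column j approximates the partial derivative in x_j.
    rows = len(func(x))
    J = [[0.0] * len(x) for _ in range(rows)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        for i in range(rows):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

x0 = [0.7, -0.3]
chain = matmul(DG(F(x0)), DF(x0))      # DG(F(x)) DF(x)
approx = numerical_jacobian(H, x0)     # finite-difference estimate of DH(x)
max_gap = max(abs(chain[i][j] - approx[i][j])
              for i in range(2) for j in range(2))
print("largest entrywise gap:", max_gap)
```

The gap should be at the level of finite-difference error, which is consistent with the formula but of course not a proof of it.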

Explanation

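Naturally, the chain rule applies to Fréchet derivatives as well. Here $DG(z)DF(x)$ denotes the composition of the bounded linear operators $DF(x) : X \to Y$ and $DG(z) : Y \to Z$, which is itself a bounded linear operator from $X$ to $Z$.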
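
As an illustration in a genuinely infinite-dimensional setting (an example chosen here, not taken from the theorem), let $X=Y=C[0,1]$ be the real continuous functions with the supremum norm and let $Z=\mathbb{R}$. Take $F(u)=u^{2}$ (pointwise square) and $G(v)=\int_{0}^{1}v(t)dt$. Since $F(u+h)-F(u)-2uh=h^{2}$ and $\|h^{2}\|_{\infty}=\|h\|_{\infty}^{2}$, we get $DF(u)h=2uh$. Since $G$ is bounded and linear, $DG(v)k=G(k)$ for every $v$. For $H(u)=G\big(F(u)\big)=\int_{0}^{1}u(t)^{2}dt$, the chain rule then gives

$$ DH(u)h = DG\big( F(u) \big)DF(u)h = \int_{0}^{1} 2u(t)h(t)dt $$

This agrees with the direct computation $H(u+h)-H(u)=\int_{0}^{1}2u(t)h(t)dt+\int_{0}^{1}h(t)^{2}dt$, where the last term is bounded in absolute value by $\|h\|_{\infty}^{2}$.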

Proof

First, define the remainders $R$ and $R_{1}$ as follows.

$$ \begin{equation} R(x,y)=F(x+y)-F(x)-DF(x)y,\quad \forall y\in X,\ x+y\in \Omega \end{equation} $$

$$ \begin{equation} R_{1}(z,w)=G(z+w)-G(z)-DG(z)w,\quad \forall w\in Y,\ z+w\in U \end{equation} $$

Then, by assumption, since $F$ is differentiable at $x$ and $G$ is differentiable at $z$,

$$ \begin{equation} \lim \limits_{\|y\|_{X} \to 0} \frac{\| R(x,y)\|_{Y}}{\|y\|_{X}}=0= \lim \limits_{\|w\|_{Y} \to 0} \frac{\| R_{1}(z,w)\|_{Z}}{\|w\|_{Y}} \end{equation} $$
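
To make the limit in $(3)$ concrete, the following sketch evaluates the ratio $\|R(x,y)\|/\|y\|$ for shrinking increments $y$ along a fixed direction. The map `F`, the base point `x0`, and the direction are hypothetical choices for illustration, and the Euclidean norm stands in for $\|\cdot\|_{X}$ and $\|\cdot\|_{Y}$.

```python
import math

def F(x):
    # Illustrative map F: R^2 -> R^2 (an assumption, not from the text).
    x1, x2 = x
    return [x1 * x2, math.sin(x1) + x2 ** 2]

def DF_apply(x, y):
    # Apply the hand-computed Jacobian DF(x) to the increment y.
    x1, x2 = x
    y1, y2 = y
    return [x2 * y1 + x1 * y2, math.cos(x1) * y1 + 2 * x2 * y2]

def norm(v):
    # Euclidean norm.
    return math.sqrt(sum(c * c for c in v))

def remainder_ratio(x, y):
    # ||R(x, y)|| / ||y|| with R(x, y) = F(x + y) - F(x) - DF(x) y.
    shifted = F([xi + yi for xi, yi in zip(x, y)])
    base = F(x)
    linear = DF_apply(x, y)
    R = [s - b - l for s, b, l in zip(shifted, base, linear)]
    return norm(R) / norm(y)

x0 = [0.7, -0.3]
direction = [0.6, 0.8]                 # a unit vector
steps = [1e-1, 1e-2, 1e-3, 1e-4]       # values of ||y||
ratios = [remainder_ratio(x0, [t * d for d in direction]) for t in steps]
for t, r in zip(steps, ratios):
    print(f"||y|| = {t:.0e}   ||R|| / ||y|| = {r:.3e}")
```

For a twice differentiable map the ratio is expected to shrink roughly in proportion to $\|y\|$, which is what the printed values should suggest.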

Moreover, by $(1)$, for $y\in X$ with $x+y\in \Omega$,

$$ \begin{align*} H(x+y) =&\ G\big( F(x+y) \big) \\ =&\ G\big( F(x)+DF(x)y+R(x,y) \big) \end{align*} $$

Set $W^{\prime}=DF(x)y+R(x,y)$. Since $z=F(x)$ and $z+W^{\prime}=F(x+y)\in U$, by $(2)$,

$$ \begin{align*} H(x+y) =&\ G(z+W^{\prime}) \\ =&\ G(z)+DG(z)W^{\prime}+R_{1}(z,W^{\prime}) \\ =&\ G(z)+DG(z)\big( DF(x)y+ R(x,y) \big) + R_{1}\big(z, DF(x)y+R(x,y) \big) \\ =&\ H(x)+DG(z)DF(x)y+DG(z)R(x,y)+ R_{1}\big(z,DF(x)y+R(x,y) \big) \tag{4} \end{align*} $$

Denote the sum of the last two terms by $R_2(x,y)$, and define $f$ as follows.

$$ R_2(x,y)=DG(z)R(x,y)+R_{1}\big( z, DF(x)y +R(x,y) \big) \in Z $$

$$ f(w) = \begin{cases} \dfrac{ \| R_{1}(z,w) \|_{Z}}{\|w\|_{Y}} \quad & \forall w \in Y, z+w\in U, w \ne 0 \\ 0 & w=0 \end{cases} $$

We now verify that $\lim \limits_{\| y\|_{X} \to 0} \dfrac{\|R_2(x,y)\|_{Z}}{\|y \|_{X}}=0$ holds. By the triangle inequality for the norm, and since $\|Lv\|\le \|L\| \|v\|$ for a bounded linear operator $L$,

$$ \begin{align*} \frac{\| R_2(x,y) \|_{Z}}{\|y \|_{X}} \color{red}{\le}& \frac{\| DG(z)R(x,y) \|_{Z} }{\| y\|_{X}} +\frac{\|R_{1} \big( z, DF(x)y+R(x,y) \big)\|_{Z}}{\|y\|_{X}} \\[1em] \color{green}{\le}& \|DG(z)\| \frac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +\frac{\|R_{1} \big(z, DF(x)y+R(x,y) \big)\|_{Z}}{\|y\|_{X}} \end{align*} $$

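Also, by the definition of $f$ and the triangle inequality, whenever $DF(x)y+R(x,y)\ne 0$ (if it equals $0$, then $R_{1}(z,0)=0$ and the final bound holds trivially),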

$$ \begin{array}{ll} & \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +\dfrac{\|R_{1} \big(z, DF(x)y+R(x,y) \big)\|_{Z}}{\|y\|_{X}} \\[1.5em] =&\ \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +\dfrac{\|R_{1} \big(z, DF(x)y+R(x,y) \big)\|_{Z}}{\|DF(x)y +R(x,y)\|_{Y}}\dfrac{\|DF(x)y +R(x,y)\|_{Y}}{\|y\|_{X}} \\[1.5em] \color{magenta}{=}& \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +f\big( DF(x)y +R(x,y) \big)\dfrac{\|DF(x)y +R(x,y)\|_{Y}}{\|y\|_{X}} \\[1.5em] \color{red}{\le}& \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +f\big( DF(x)y +R(x,y) \big)\Bigg[\dfrac{\|DF(x)y\|_{Y}}{\|y\|_{X}} +\dfrac{\|R(x,y)\|_{Y}}{\|y\|_{X}} \Bigg] \\[1.5em] \color{green}{\le}& \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +f\big( DF(x)y +R(x,y) \big)\Bigg[\|DF(x)\|\dfrac{\|y\|_{X}}{\|y\|_{X}} +\dfrac{\|R(x,y)\|_{Y}}{\|y\|_{X}} \Bigg] \end{array} $$

First, since $\lim \limits_{\| y\|_{X} \to 0} \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}}=0$, the first term tends to $0$ as $\| y\|_{X} \to 0$. Next, since $DF(x)$ is a bounded linear operator, $DF(x)y \to 0$ as $\|y\|_{X} \to 0$, and $\|R(x,y)\|_{Y} \to 0$ as well by $(3)$, so $DF(x)y+R(x,y) \to 0$. By $(3)$ and the definition of $f$, it follows that $f\big( DF(x)y+R(x,y) \big) \to 0$. Finally, the last term in the bracket also converges to $0$ by the differentiability of $F$. Therefore,

$$ \lim \limits_{\| y\| \to 0} \frac{\| R_2(x,y) \|_{Z}}{\|y \|_{X}}\le \|DG(z) \| \cdot 0 + 0\cdot \Big[ \|DF(x)\| + 0 \Big] =0 $$

Applying this result to $(4)$,

$$ H(x+y)-H(x)-DG(z)DF(x)y=R_2(x,y) $$

$$ \implies \frac{\left\|H(x+y)-H(x)-DG(z)DF(x)y\right\|_{Z}}{\|y\|_{X}}=\frac{\left\| R_2(x,y)\right\|_{Z} }{\|y\|_{X}} $$

$$ \implies \lim \limits_{\|y\|_{X} \to 0}\frac{\left\|H(x+y)-H(x)-DG(z)DF(x)y\right\|_{Z}}{\|y\|_{X}}=\lim \limits_{\|y\|_{X} \to 0}\frac{\left\| R_2(x,y)\right\|_{Z} }{\|y\|_{X}}=0 $$
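
The decomposition used in the proof can also be checked numerically. The sketch below, under the assumption of two hypothetical smooth maps `F` and `G` on $\mathbb{R}^{2}$ with hand-computed Jacobians, builds $R_2(x,y)=DG(z)R(x,y)+R_{1}\big(z,DF(x)y+R(x,y)\big)$ from its two pieces, compares it with the remainder of $H$ computed directly from the candidate derivative $DG(z)DF(x)$, and tracks $\|R_2(x,y)\|/\|y\|$ as $y$ shrinks. All function names here are illustrative.

```python
import math

def F(x):
    # Illustrative map F: R^2 -> R^2 (an assumption for this sketch).
    return [x[0] * x[1], math.sin(x[0]) + x[1] ** 2]

def G(z):
    # Illustrative map G: R^2 -> R^2 (an assumption for this sketch).
    return [math.exp(z[0]) + z[1], z[0] * z[1]]

def DF(x):
    return [[x[1], x[0]], [math.cos(x[0]), 2 * x[1]]]

def DG(z):
    return [[math.exp(z[0]), 1.0], [z[1], z[0]]]

def apply(A, v):
    # Matrix-vector product.
    return [sum(a * c for a, c in zip(row, v)) for row in A]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def R(x, y):
    # Remainder of F at x, as in (1).
    return sub(sub(F(add(x, y)), F(x)), apply(DF(x), y))

def R1(z, w):
    # Remainder of G at z, as in (2).
    return sub(sub(G(add(z, w)), G(z)), apply(DG(z), w))

def R2_decomposed(x, y):
    # DG(z) R(x, y) + R1(z, DF(x) y + R(x, y)) with z = F(x).
    z, r = F(x), R(x, y)
    w = add(apply(DF(x), y), r)
    return add(apply(DG(z), r), R1(z, w))

def R2_direct(x, y):
    # H(x + y) - H(x) - DG(z) DF(x) y with H = G o F.
    z = F(x)
    linear = apply(DG(z), apply(DF(x), y))
    return sub(sub(G(F(add(x, y))), G(z)), linear)

x0 = [0.7, -0.3]
direction = [0.6, 0.8]                 # a unit vector
steps = [1e-1, 1e-2, 1e-3, 1e-4]       # values of ||y||
gaps, ratios = [], []
for t in steps:
    y = [t * d for d in direction]
    gaps.append(norm(sub(R2_decomposed(x0, y), R2_direct(x0, y))))
    ratios.append(norm(R2_direct(x0, y)) / norm(y))
    print(f"||y|| = {t:.0e}   gap = {gaps[-1]:.1e}   ||R2|| / ||y|| = {ratios[-1]:.3e}")
```

The two ways of computing $R_2$ should agree up to floating-point rounding, and the ratio should decrease with $\|y\|$, in line with the limit established above.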

Therefore, based on the definition of differentiability, $H$ is differentiable at $x\in \Omega$, and the derivative of $H$ is

$$ DH(x)=DG(z)DF(x) $$

■