Chain Rule for Fréchet Derivatives

Theorem

Let $(X, \left\| \cdot \right\|_{X}), (Y, \left\| \cdot \right\|_{Y}), (Z, \left\| \cdot \right\|_{Z})$ be Banach spaces, and let $\Omega \subset X$, $U \subset Y$ be open sets. Suppose functions $F : \Omega \to Y$ and $G : U \to Z$ are given with $F(\Omega) \subset U$. If $F$ is differentiable at $x\in\Omega$ in the sense of Fréchet and $G$ is differentiable at $z=F(x)\in U$, then $H:=G \circ F$ is also differentiable at $x\in \Omega$ and the following equation holds:

$$ DH(x) = DG(z)DF(x)=DG\big( F(x) \big)\cdot DF(x) $$

Explanation

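The chain rule carries over to Fréchet derivatives. Here the product $DG(z)DF(x)$ denotes the composition of the bounded linear operators $DF(x) : X \to Y$ and $DG(z) : Y \to Z$.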
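In finite dimensions the Fréchet derivative is the Jacobian matrix, so the theorem reduces to a matrix product. The following is a minimal numerical sketch of that special case, assuming the hypothetical maps $F:\mathbb{R}^2\to\mathbb{R}^3$ and $G:\mathbb{R}^3\to\mathbb{R}^2$ defined in the code (illustrative choices, not part of the theorem). It compares $DG(F(x))\,DF(x)$ with a central finite-difference Jacobian of $H=G\circ F$.

```python
import math

def F(x):
    """Hypothetical map F: R^2 -> R^3 (illustration only)."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x2 ** 2]

def DF(x):
    """Jacobian of F at x (3x2 matrix), derived by hand."""
    x1, x2 = x
    return [[x2, x1],
            [math.cos(x1), 0.0],
            [0.0, 2.0 * x2]]

def G(w):
    """Hypothetical map G: R^3 -> R^2 (illustration only)."""
    w1, w2, w3 = w
    return [w1 + w2 * w3, math.exp(w1) * w3]

def DG(w):
    """Jacobian of G at w (2x3 matrix), derived by hand."""
    w1, w2, w3 = w
    return [[1.0, w3, w2],
            [math.exp(w1) * w3, 0.0, math.exp(w1)]]

def H(x):
    """Composite map H = G o F."""
    return G(F(x))

def matmul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def numeric_jacobian(func, x, h=1e-6):
    """Central finite-difference approximation of the Jacobian of func at x."""
    fx = func(x)
    J = [[0.0] * len(x) for _ in fx]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        for i in range(len(fx)):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

x = [0.7, -1.3]
chain = matmul(DG(F(x)), DF(x))      # DG(F(x)) DF(x), the chain-rule formula
approx = numeric_jacobian(H, x)      # independent approximation of DH(x)
max_err = max(abs(chain[i][j] - approx[i][j])
              for i in range(2) for j in range(2))
print("largest entrywise difference:", max_err)
```

The printed difference should be at the level of finite-difference error. Agreement at one point is only a sanity check on the formula, not a proof.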

Proof

First, define the remainders $R$ and $R_{1}$ as follows.

$$ \begin{equation} R(x,y)=F(x+y)-F(x)-DF(x)y,\quad \forall y\in X,\ x+y\in \Omega \end{equation} $$

$$ \begin{equation} R_{1}(z,w)=G(z+w)-G(z)-DG(z)w,\quad \forall w\in Y,\ z+w\in U \end{equation} $$

Then, by assumption, since $F$ is differentiable at $x$ and $G$ is differentiable at $z$,

$$ \begin{equation} \lim \limits_{\|y\|_{X} \to 0} \frac{\| R(x,y)\|_{Y}}{\|y\|_{X}}=0= \lim \limits_{\|w\|_{Y} \to 0} \frac{\| R_{1}(z,w)\|_{Z}}{\|w\|_{Y}} \end{equation} $$
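To make the remainder condition concrete, here is a small numerical sketch, assuming a hypothetical map $F:\mathbb{R}^2\to\mathbb{R}^3$ with the Euclidean norm standing in for $\|\cdot\|_{X}$ and $\|\cdot\|_{Y}$ (neither choice comes from the theorem). It evaluates $\|R(x,y)\|_{Y}/\|y\|_{X}$ along a fixed direction for shrinking $y$.

```python
import math

def F(x):
    # Hypothetical map F: R^2 -> R^3, chosen only for illustration.
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x2 ** 2]

def DF_apply(x, y):
    # Action of the derivative DF(x) on y (hand-derived Jacobian times y).
    x1, x2 = x
    y1, y2 = y
    return [x2 * y1 + x1 * y2, math.cos(x1) * y1, 2.0 * x2 * y2]

def norm(v):
    # Euclidean norm, standing in for the Banach-space norms.
    return math.sqrt(sum(c * c for c in v))

def remainder_ratio(x, y):
    # ||R(x, y)|| / ||y|| with R(x, y) = F(x + y) - F(x) - DF(x) y.
    shifted = F([a + b for a, b in zip(x, y)])
    base = F(x)
    lin = DF_apply(x, y)
    R = [s - b - l for s, b, l in zip(shifted, base, lin)]
    return norm(R) / norm(y)

x = [0.7, -1.3]
direction = [0.6, 0.8]  # unit vector along which y shrinks
ratios = []
for k in range(1, 5):
    t = 10.0 ** (-k)
    y = [t * d for d in direction]
    ratios.append(remainder_ratio(x, y))
    print(t, ratios[-1])
```

For this smooth example the ratio should shrink roughly in proportion to $\|y\|_{X}$. The definition only requires that it tend to $0$, and it must do so for every way of letting $y \to 0$, not just along one direction.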

Moreover, by $(1)$, for every $y\in X$ with $x+y\in \Omega$,

$$ \begin{align*} H(x+y) =&\ G\big( F(x+y) \big) \\ =&\ G\big( F(x)+DF(x)y+R(x,y) \big) \end{align*} $$

Set $W^{\prime} := DF(x)y+R(x,y)$, so that $F(x+y)=z+W^{\prime}$ with $z=F(x)$; note that $z+W^{\prime}\in U$ because $F(\Omega) \subset U$. Since $DG(z)$ is linear, $(2)$ gives

$$ \begin{align*} H(x+y) =&\ G(z+W^{\prime}) \\ =&\ G(z)+DG(z)W^{\prime}+R_{1}(z,W^{\prime}) \\ =&\ G(z)+DG(z)\big( DF(x)y+ R(x,y) \big) + R_{1}\big(z, DF(x)y+R(x,y) \big) \\ =&\ H(x)+DG(z)DF(x)y+DG(z)R(x,y)+ R_{1}\big(z,DF(x)y+R(x,y) \big) \tag{4} \end{align*} $$

Denote the sum of the last two terms by $R_2(x,y)$, and define $f$ as follows.

$$ R_2(x,y)=DG(z)R(x,y)+R_{1}\big( z, DF(x)y +R(x,y) \big) \in Z $$

$$ f(w) = \begin{cases} \dfrac{ \| R_{1}(z,w) \|_{Z}}{\|w\|_{Y}} \quad & w \in Y,\ z+w\in U,\ w \ne 0 \\ 0 & w=0 \end{cases} $$

We now show that $\lim \limits_{\| y\|_{X} \to 0} \dfrac{\|R_2(x,y)\|_{Z}}{\|y \|_{X}}=0$ holds. By the triangle inequality for the norm (red) and the bound $\|L x\|\le \|L\| \|x\|$ for a bounded linear operator $L$ (green),

$$ \begin{align*} \frac{\| R_2(x,y) \|_{Z}}{\|y \|_{X}} \color{red}{\le}& \frac{\| DG(z)R(x,y) \|_{Z} }{\| y\|_{X}} +\frac{\|R_{1} \big( z, DF(x)y+R(x,y) \big)\|_{Z}}{\|y\|_{X}} \\[1em] \color{green}{\le}& \|DG(z)\| \frac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +\frac{\|R_{1} \big(z, DF(x)y+R(x,y) \big)\|_{Z}}{\|y\|_{X}} \end{align*} $$

Next, by the definition of $f$ (magenta), the triangle inequality (red), and the operator bound (green), for $DF(x)y+R(x,y) \ne 0$ we have the following. (If $DF(x)y+R(x,y)=0$, then $R_{1}(z,0)=0$ by $(2)$, so the second term vanishes and the final bound holds trivially.)

$$ \begin{array}{ll} & \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +\dfrac{\|R_{1} \big(z, DF(x)y+R(x,y) \big)\|_{Z}}{\|y\|_{X}} \\[1.5em] =&\ \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +\dfrac{\|R_{1} \big(z, DF(x)y+R(x,y) \big)\|_{Z}}{\|DF(x)y +R(x,y)\|_{Y}}\dfrac{\|DF(x)y +R(x,y)\|_{Y}}{\|y\|_{X}} \\[1.5em] \color{magenta}{=}& \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +f\big( DF(x)y +R(x,y) \big)\dfrac{\|DF(x)y +R(x,y)\|_{Y}}{\|y\|_{X}} \\[1.5em] \color{red}{\le}& \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +f\big( DF(x)y +R(x,y) \big)\Bigg[\dfrac{\|DF(x)y\|_{Y}}{\|y\|_{X}} +\dfrac{\|R(x,y)\|_{Y}}{\|y\|_{X}} \Bigg] \\[1.5em] \color{green}{\le}& \|DG(z)\| \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}} +f\big( DF(x)y +R(x,y) \big)\Bigg[\|DF(x)\|\dfrac{\|y\|_{X}}{\|y\|_{X}} +\dfrac{\|R(x,y)\|_{Y}}{\|y\|_{X}} \Bigg] \end{array} $$

First, since $\lim \limits_{\| y\|_{X} \to 0} \dfrac{\| R(x,y)\|_{Y}}{\| y\|_{X}}=0$, the first term tends to $0$ as $\| y\|_{X} \to 0$. By $(3)$ and the definition of $f$, $f(w) \to 0$ as $w \to 0$. Since $DF(x)$ is a bounded linear operator, $DF(x)y \to 0$ as $\|y\|_{X} \to 0$, and $R(x,y) \to 0$ as well by the differentiability of $F$, so $f\big( DF(x)y+R(x,y) \big) \to 0$. The last term in the brackets also converges to $0$ by the differentiability of $F$. Therefore,

$$ \lim \limits_{\| y\|_{X} \to 0} \frac{\| R_2(x,y) \|_{Z}}{\|y \|_{X}}\le \|DG(z) \| \cdot 0 + 0\cdot \Big[ \|DF(x)\| + 0 \Big] =0 $$

Applying this result to $(4)$,

$$ H(x+y)-H(x)-DG(z)DF(x)y=R_2(x,y) $$

$$ \implies \frac{\left\|H(x+y)-H(x)-DG(z)DF(x)y\right\|_{Z}}{\|y\|_{X}}=\frac{\left\| R_2(x,y)\right\|_{Z} }{\|y\|_{X}} $$

$$ \implies \lim \limits_{\|y\|_{X} \to 0}\frac{\left\|H(x+y)-H(x)-DG(z)DF(x)y\right\|_{Z}}{\|y\|_{X}}=\lim \limits_{\|y\|_{X} \to 0}\frac{\left\| R_2(x,y)\right\|_{Z} }{\|y\|_{X}}=0 $$

Therefore, based on the definition of differentiability, $H$ is differentiable at $x\in \Omega$, and the derivative of $H$ is

$$ DH(x)=DG(z)DF(x) $$
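The decomposition used in the proof can also be checked numerically. The sketch below assumes hypothetical maps $F:\mathbb{R}^2\to\mathbb{R}^3$ and $G:\mathbb{R}^3\to\mathbb{R}^2$ with Euclidean norms (illustrative choices only). It computes $R_2(x,y)$ both as $DG(z)R(x,y)+R_{1}\big(z,DF(x)y+R(x,y)\big)$ and directly as $H(x+y)-H(x)-DG(z)DF(x)y$, then tracks $\|R_2(x,y)\|_{Z}/\|y\|_{X}$ as $y$ shrinks.

```python
import math

def F(x):
    # Hypothetical map F: R^2 -> R^3 (illustration only).
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x2 ** 2]

def DF_apply(x, y):
    # DF(x) applied to y, using the hand-derived Jacobian of F.
    x1, x2 = x
    y1, y2 = y
    return [x2 * y1 + x1 * y2, math.cos(x1) * y1, 2.0 * x2 * y2]

def G(w):
    # Hypothetical map G: R^3 -> R^2 (illustration only).
    w1, w2, w3 = w
    return [w1 + w2 * w3, math.exp(w1) * w3]

def DG_apply(z, w):
    # DG(z) applied to w, using the hand-derived Jacobian of G.
    z1, z2, z3 = z
    w1, w2, w3 = w
    e = math.exp(z1)
    return [w1 + z3 * w2 + z2 * w3, e * z3 * w1 + e * w3]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def norm(v):
    # Euclidean norm, standing in for the Banach-space norms.
    return math.sqrt(sum(c * c for c in v))

def remainders(x, y):
    z = F(x)
    lin_F = DF_apply(x, y)
    R = sub(sub(F(add(x, y)), z), lin_F)               # R(x, y) from (1)
    W = add(lin_F, R)                                  # W' = DF(x)y + R(x, y)
    R1 = sub(sub(G(add(z, W)), G(z)), DG_apply(z, W))  # R_1(z, W') from (2)
    R2_split = add(DG_apply(z, R), R1)                 # form used in the proof
    R2_direct = sub(sub(G(F(add(x, y))), G(z)), DG_apply(z, lin_F))
    return R2_split, R2_direct

x = [0.7, -1.3]
direction = [0.6, 0.8]  # unit vector along which y shrinks
gaps, ratios = [], []
for k in range(1, 5):
    t = 10.0 ** (-k)
    y = [t * d for d in direction]
    split, direct = remainders(x, y)
    gaps.append(norm(sub(split, direct)))   # the two forms of R_2 should agree
    ratios.append(norm(direct) / norm(y))   # ||R_2(x, y)|| / ||y||
    print(t, gaps[-1], ratios[-1])
```

The two expressions for $R_2$ should coincide up to floating-point rounding, and the ratio should decrease toward $0$, which is what the estimate in the proof establishes in general.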