The Chain Rule of Differentiation in Analysis

Theorem [1]

Let $f : [a,b] \to \mathbb{R}$ be a continuous function that is differentiable at $x \in [a,b]$, and let $g : f([a,b]) \to \mathbb{R}$ be differentiable at $f(x) \in f([a,b])$. Define $h : [a,b] \to \mathbb{R}$ as follows.

$$h(t) = g\left( f(t) \right) \quad (a \le t \le b)$$

Then $h$ is differentiable at $x$, and its derivative there is as follows.

$$h^{\prime}(x) = g^{\prime}(f(x)) f^{\prime}(x)$$

Using the composition symbol, this can be written as:

$$( g \circ f)^{\prime}(x) = g^{\prime}(f(x)) f^{\prime}(x)$$
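As a quick sanity check, the formula can be compared with a finite-difference estimate. The sketch below is only an illustration under assumed choices: the functions $f(t)=t^2$ and $g(y)=\sin y$, the point $x=0.7$, and the helper name `central_difference` are not from the source.

```python
import math

def f(t):
    # Inner function (illustrative choice): f(t) = t^2
    return t * t

def g(y):
    # Outer function (illustrative choice): g(y) = sin(y)
    return math.sin(y)

def h(t):
    # Composite function h(t) = g(f(t))
    return g(f(t))

def central_difference(func, x, step=1e-6):
    # Symmetric difference quotient approximating func'(x)
    return (func(x + step) - func(x - step)) / (2 * step)

x = 0.7  # assumed evaluation point
# Chain rule: h'(x) = g'(f(x)) * f'(x) = cos(x^2) * 2x
chain_rule_value = math.cos(f(x)) * 2 * x
numeric_value = central_difference(h, x)

print(chain_rule_value, numeric_value)
```

The two printed values should agree to several decimal places; the small gap that remains comes from the finite step size, not from the chain rule.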

Explanation

This result is commonly referred to as the chain rule.

Here, $f^{\prime}(x)$ is also called the inner derivative. If we write $y=f(x)$ and $z=g(y)$, the rule can be expressed in Leibniz’s notation as follows.

$$\frac{dz}{dx}=\frac{dz}{dy}\frac{dy}{dx}$$

Leibniz’s notation is convenient because the right side of the above equation looks as though $dy$ cancels to give the left side. $\dfrac{dy}{dx}$ is not the fraction “$dy$ over $dx$” but the derivative of $y$ with respect to $x$, yet treating it like a fraction gives the correct result here.
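As a small worked example, take the illustrative choice $y = 2x+1$ and $z = y^{3}$ (these functions are assumptions made here, not part of the source). Then $\dfrac{dz}{dy} = 3y^{2}$ and $\dfrac{dy}{dx} = 2$, so the Leibniz form gives:

$$\frac{dz}{dx}=\frac{dz}{dy}\frac{dy}{dx} = 3y^{2}\cdot 2 = 6(2x+1)^{2}$$

This agrees with differentiating $z = (2x+1)^{3}$ directly.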

Proof

First, let’s define the function $G$ as follows.

$$G(f(t)) := \begin{cases} \dfrac{g(f(x))-g(f(t))}{f(x)-f(t)} - g^{\prime}(f(x)) & f(t) \ne f(x) \\ 0 & f(t)=f(x) \end{cases}, \quad (t\in[a,b])$$

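Since $g$ is differentiable at $f(x)$, the following holds.

$$\lim \limits_{ f(s) \to f(x) } G(f(s)) = 0 = G(f(x))$$

This is the condition for continuity, so $G$ is continuous at $f(x)$. Because $f$ is continuous, $f(t) \to f(x)$ as $t \to x$, and therefore $\lim \limits_{t \to x} G(f(t)) = 0$. Furthermore, the following holds for every $t \in [a,b]$.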

$$h(x)-h(t) = g(f(x))-g(f(t)) = \Big( f(x)-f(t) \Big) \Big( g^{\prime}(f(x))+G(f(t)) \Big)$$
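This identity can be checked case by case from the definition of $G$. When $f(t) \ne f(x)$, the term $g^{\prime}(f(x))$ cancels:

$$\Big( f(x)-f(t) \Big) \Big( g^{\prime}(f(x))+G(f(t)) \Big) = \Big( f(x)-f(t) \Big) \frac{g(f(x))-g(f(t))}{f(x)-f(t)} = g(f(x))-g(f(t))$$

When $f(t) = f(x)$, both sides are $0$, because $g(f(x)) = g(f(t))$ and the factor $f(x)-f(t)$ vanishes. This second case is the reason $G$ is defined separately there: it avoids dividing by zero.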

Then, by the properties of limits, the equation below holds.

$$\begin{align*}
h^{\prime}(x) =&\ \lim \limits_{t \to x} \frac{ h(x)-h(t)}{x-t} \\
=&\ \lim \limits_{t \to x} \frac{ \Big( f(x)-f(t) \Big) \Big( g^{\prime}(f(x))+G(f(t)) \Big)}{x-t} \\
=&\ \lim \limits_{t \to x} \left[ g^{\prime}(f(x))\frac{ f(x)-f(t) }{x-t}+G(f(t))\frac{f(x)-f(t) }{x-t} \right] \\
=&\ \lim \limits_{t \to x} \left[ g^{\prime}(f(x))\frac{ f(x)-f(t) }{x-t}\right]+\lim \limits_{t \to x} \left[G(f(t))\frac{ f(x)-f(t) }{x-t} \right] \\
=&\ \lim \limits_{t \to x} g^{\prime}(f(x))\lim \limits_{t \to x}\frac{ f(x)-f(t) }{x-t}+\lim \limits_{t \to x}G(f(t))\lim \limits_{t \to x}\frac{ f(x)-f(t) }{x-t} \\
=&\ g^{\prime}(f(x))f^{\prime}(x)+0\cdot f^{\prime}(x) \\
=&\ g^{\prime}(f(x))f^{\prime}(x)
\end{align*}$$
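The key step is that $G(f(t)) \to 0$ as $t \to x$, so the second term drops out. The following sketch illustrates this numerically. It is not part of the proof, and it assumes the example functions $f(t)=t^2$ and $g(y)=\sin y$, the point $x = 0.7$, and a hypothetical signature `G(t, x)` that evaluates the error term at $f(t)$.

```python
import math

def f(t):
    # Inner function (illustrative choice): f(t) = t^2
    return t * t

def g(y):
    # Outer function (illustrative choice): g(y) = sin(y)
    return math.sin(y)

def g_prime(y):
    # Derivative of the illustrative outer function
    return math.cos(y)

def G(t, x):
    # Error term from the proof, evaluated at f(t);
    # defined as 0 when f(t) == f(x) to avoid dividing by zero
    if f(t) == f(x):
        return 0.0
    return (g(f(x)) - g(f(t))) / (f(x) - f(t)) - g_prime(f(x))

x = 0.7  # assumed evaluation point
target = g_prime(f(x)) * 2 * x  # g'(f(x)) * f'(x)

for k in range(1, 6):
    t = x + 10.0 ** (-k)
    quotient = (g(f(x)) - g(f(t))) / (x - t)  # (h(x) - h(t)) / (x - t)
    # Both G(f(t)) and the gap to the chain-rule value shrink as t -> x
    print(k, G(t, x), quotient - target)
```

Both printed columns should shrink roughly in proportion to $|t - x|$, which matches the role of $G$ as a vanishing error term.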


[1] Walter Rudin, Principles of Mathematical Analysis (3rd Edition, 1976), p. 105