The Chain Rule of Differentiation in Analysis

Theorem [1]

Let $f : [a,b] \to \mathbb{R}$ be continuous on $[a,b]$ and differentiable at a point $x\in [a,b]$, and let $g : f([a,b])\to \mathbb{R}$ be differentiable at $f (x)\in f([a,b])$. Define $h : [a,b] \to \mathbb{R}$ as follows.

$$ h(t)=g\left( f(t) \right)\quad (a\le t \le b) $$

Then $h$ is differentiable at $x$, and its derivative there is as follows.

$$ h^{\prime}(x)=g^{\prime}(f(x))f^{\prime}(x) $$

Using the composition symbol, this can be written as follows.

$$ ( g \circ f)^{\prime}(x)=g^{\prime}(f(x))f^{\prime}(x) $$
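The statement can be checked numerically. The following is a minimal sketch, not part of the original text: the choices $f(t)=t^{2}+1$, $g(y)=\sin y$, and $x=0.5$ are assumptions made only for illustration. It compares the difference quotient of $h$ with the value $g^{\prime}(f(x))f^{\prime}(x)$ predicted by the theorem.

```python
import math


def f(t):
    """Inner function (illustrative choice, not from the source)."""
    return t * t + 1.0


def f_prime(t):
    """Derivative of the inner function."""
    return 2.0 * t


def g(y):
    """Outer function (illustrative choice, not from the source)."""
    return math.sin(y)


def g_prime(y):
    """Derivative of the outer function."""
    return math.cos(y)


def h(t):
    """Composite function h(t) = g(f(t))."""
    return g(f(t))


def chain_rule_derivative(x):
    """Value predicted by the theorem: g'(f(x)) * f'(x)."""
    return g_prime(f(x)) * f_prime(x)


def difference_quotient(x, t):
    """Quotient (h(x) - h(t)) / (x - t), whose limit as t -> x defines h'(x)."""
    return (h(x) - h(t)) / (x - t)


x = 0.5
predicted = chain_rule_derivative(x)
for step in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = difference_quotient(x, x + step)
    print(f"t - x = {step:.0e}: quotient = {approx:.6f}, "
          f"error = {abs(approx - predicted):.2e}")
```

With these choices the error should shrink roughly in proportion to $t-x$, which is consistent with the quotient converging to $g^{\prime}(f(x))f^{\prime}(x)$.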

Explanation

This result is commonly referred to as the chain rule.

Here, $f^{\prime}(x)$ is also called the inner derivative. Writing $y=f(x)$ and $z=g(y)$, the rule takes the following form in Leibniz’s notation.

$$ \frac{dz}{dx}=\frac{dz}{dy}\frac{dy}{dx} $$

Leibniz’s notation is convenient because the right side of the above equation looks as though $dy$ cancels, leaving the left side. Strictly speaking, $\dfrac{dy}{dx}$ is not the fraction “$dy$ over $dx$” but the derivative of $y$ with respect to $x$, yet treating it like a fraction gives the correct result here.
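As a short worked illustration (the functions are chosen here only as an example and do not come from the source), take $y=f(x)=x^{2}+1$ and $z=g(y)=y^{3}$. Then the rule gives the following.

$$ \frac{dz}{dx}=\frac{dz}{dy}\frac{dy}{dx}=3y^{2}\cdot 2x=6x\left( x^{2}+1 \right)^{2} $$

Expanding first and differentiating term by term gives the same result.

$$ \frac{d}{dx}\left( x^{6}+3x^{4}+3x^{2}+1 \right)=6x^{5}+12x^{3}+6x=6x\left( x^{2}+1 \right)^{2} $$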

Proof

First, let’s define the function $G$ as follows.

$$ G(f(t)) :=\begin{cases} \frac{g(f(x))-g(f(t))}{f(x)-f(t)} -g^{\prime}(f(x)) & f(t) \ne f(x) \\ 0 & f(t)=f(x)\end{cases},\quad (t\in[a,b]) $$

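Since $g$ is differentiable at $f(x)$, the difference quotient in the first case converges to $g^{\prime}(f(x))$ as $f(t) \to f(x)$, so the following holds.

$$ \lim \limits_{ f(t) \to f(x) } G(f(t))=0=G(f(x)) $$

This is exactly the condition for continuity at $f(x)$, so $G$ is continuous at $f(x)$. Since $f$ is continuous at $x$, it follows that $G(f(t)) \to 0$ as $t \to x$. Furthermore, the following holds for all $t\in[a,b]$, including those with $f(t)=f(x)$, where both sides are $0$.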

$$ h(x)-h(t) = g(f(x))-g(f(t))=\Big( f(x)-f(t) \Big) \Big( g^{\prime}(f(x))+G(f(t)) \Big) $$

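Then, since $G(f(t)) \to 0$ and the difference quotient of $f$ converges to $f^{\prime}(x)$ as $t \to x$, the limit laws for sums and products give the following.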

$$ \begin{align*} h^{\prime}(x) =&\ \lim \limits_{t \to x} \frac{ h(x)-h(t)}{x-t} \\ =&\ \lim \limits_{t \to x} \frac{ \Big( f(x)-f(t) \Big) \Big( g^{\prime}(f(x))+G(f(t)) \Big)}{x-t} \\ =&\ \lim \limits_{t \to x} \left[ g^{\prime}(f(x))\frac{ f(x)-f(t) }{x-t}+G(f(t))\frac{f(x)-f(t) }{x-t} \right] \\ =&\ \lim \limits_{t \to x} \left[ g^{\prime}(f(x))\frac{ f(x)-f(t) }{x-t}\right]+\lim \limits_{t \to x} \left[G(f(t))\frac{ f(x)-f(t) }{x-t} \right] \\ =&\ \lim \limits_{t \to x} g^{\prime}(f(x))\lim \limits_{t \to x}\frac{ f(x)-f(t) }{x-t}+\lim \limits_{t \to x}G(f(t))\lim \limits_{t \to x}\frac{ f(x)-f(t) }{x-t} \\ =&\ g^{\prime}(f(x))f^{\prime}(x)+0\cdot f^{\prime}(x) \\ =&\ g^{\prime}(f(x))f^{\prime}(x) \end{align*} $$
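The role of $G$ in the proof can also be observed numerically. The sketch below is an illustration under assumed choices that are not from the source ($f(t)=t^{2}+1$, $g(y)=\sin y$, $x=0.5$). It evaluates $G$ as a function of $y=f(t)$, checks the factorization of $h(x)-h(t)$ used in the proof, and prints $G(f(t))$ for $t$ approaching $x$.

```python
import math


def f(t):
    """Inner function (illustrative choice, not from the source)."""
    return t * t + 1.0


def g(y):
    """Outer function (illustrative choice, not from the source)."""
    return math.sin(y)


def g_prime(y):
    """Derivative of the outer function."""
    return math.cos(y)


def G(y, y0):
    """Error term from the proof, as a function of y = f(t) with y0 = f(x).

    It is the difference quotient of g between y0 and y minus g'(y0),
    and is defined to be 0 when y equals y0.
    """
    if y == y0:
        return 0.0
    return (g(y0) - g(y)) / (y0 - y) - g_prime(y0)


x = 0.5
y0 = f(x)
for step in (1e-1, 1e-2, 1e-3, 1e-4):
    t = x + step
    lhs = g(f(x)) - g(f(t))                             # h(x) - h(t)
    rhs = (f(x) - f(t)) * (g_prime(y0) + G(f(t), y0))   # factorized form
    print(f"t - x = {step:.0e}: G(f(t)) = {G(f(t), y0):+.3e}, "
          f"|lhs - rhs| = {abs(lhs - rhs):.1e}")
```

In this example $G(f(t))$ shrinks toward $0$ as $t \to x$, which is why the second term in the limit computation vanishes.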


  1. Walter Rudin, Principles of Mathematical Analysis (3rd Edition, 1976), p. 105