Definition of Directional Derivative

Buildup

Suppose a multivariable function $f : \mathbb{R}^{n} \to \mathbb{R}$ is given. To differentiate $f$, unlike in the single-variable case, one must specify the direction in which the rate of change is measured. A familiar example is the partial derivative, which considers the rate of change with respect to only one variable. For instance, the partial derivative $\dfrac{\partial f}{\partial y}$ of $f = f(x,y,z)$ with respect to the variable $y$ measures the change in the value of $f$ only in the direction $(0,1,0)$.

The directional derivative generalizes this idea: it measures the rate of change in an arbitrary direction, not only along one of the coordinate axes.

Definition [1]

Let a multivariable function $f : \mathbb{R}^{n} \to \mathbb{R}$ and a unit vector $\mathbf{u} \in \mathbb{R}^{n}$ be given. If the following limit exists, it is called the directional derivative of $f$ at $\mathbf{x}$ in the direction $\mathbf{u}$ and is denoted by $\nabla_{\mathbf{u}}f(\mathbf{x})$.

$$ \nabla _{\mathbf{u}} f(\mathbf{x}) := \lim \limits _{t \to 0} \dfrac{f (\mathbf{x} + t \mathbf{u}) - f(\mathbf{x})}{t} $$
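The limit can be approximated numerically by evaluating the difference quotient at a small $t$. The following is a minimal sketch and not part of the original text: the helper name `directional_derivative`, the sample function `f`, the point, the direction, and the step size `t` are all illustrative assumptions.

```python
import math

def directional_derivative(f, x, u, t=1e-6):
    """Approximate the directional derivative of f at x along u.

    Evaluates the difference quotient from the definition at a small,
    fixed t (an assumption; the true value is the limit as t -> 0).
    """
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]  # normalize so that u is a unit vector
    shifted = [xi + t * ui for xi, ui in zip(x, u)]
    return (f(shifted) - f(x)) / t

# Hypothetical example: f(x, y) = x^2 * y at (1, 2) along (3, 4).
def f(p):
    x, y = p
    return x * x * y

approx = directional_derivative(f, [1.0, 2.0], [3.0, 4.0])
print(approx)  # close to 3.2, up to discretization error
```

The direction is normalized inside the helper because the definition assumes a unit vector; passing an unnormalized $\mathbf{u}$ to the raw quotient would scale the result by the length of $\mathbf{u}$.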

Explanation

Partial Differentiation

$$ \dfrac{\partial f}{\partial x_{i}}(\mathbf{x}) = \lim \limits _{t \to 0} \dfrac{f (\mathbf{x} + t \mathbf{e}_{i}) - f(\mathbf{x})}{t} $$

The definition of the directional derivative differs from that of the partial derivative only in that $\mathbf{e}_{i}$, the unit vector along the $i$-th coordinate axis, is replaced by an arbitrary direction $\mathbf{u}$. In other words, the partial derivative is the special case $\mathbf{u} = \mathbf{e}_{i}$ of the directional derivative.

The following notations are used.

$$ \nabla _{\mathbf{u}} f(\mathbf{x}) = f_{\mathbf{u}}^{\prime}(\mathbf{x}) = D _{\mathbf{u}} f(\mathbf{x}) = \partial_{\mathbf{u}}f(\mathbf{x}) = \dfrac{\partial f}{\partial \mathbf{u}}(\mathbf{x}) $$

Fix a unit vector $\mathbf{u}$. Then each function $f$ determines $\nabla _{\mathbf{u}}f$, so the vector $\mathbf{u}$ itself can be regarded as an operator acting on functions. For this reason, notations such as $\mathbf{u}f$ or $\mathbf{u}[f]$ are also used. In differential geometry in particular, tangent vectors are treated as operators, following the idea that “tangent vector = differentiation”. Refer to See Also for more information.

The theorem below shows that the directional derivative can be expressed in terms of partial derivatives.

Furthermore, it can be shown that the directional derivative is greatest when $\mathbf{u}$ points in the same direction as the gradient $\nabla f$ (assuming $\nabla f \ne \mathbf{0}$), so the direction of $\nabla f$ is the direction in which $f$ increases fastest. One way to read the notation, then, is that the gradient carries no subscript on $\nabla$ because it corresponds to the directional derivative in that direction of greatest rate of change.
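The claim about the direction of greatest increase can be checked numerically in two dimensions by scanning unit vectors $\mathbf{u} = (\cos\theta, \sin\theta)$ and recording where the difference quotient is largest. This is a rough sketch under assumptions that are not in the original: the sample function, the point, the step `t`, and the number of sampled angles are arbitrary illustrative choices.

```python
import math

def f(x, y):
    # Hypothetical sample function; its gradient is (2xy, x^2).
    return x * x * y

def quotient(x, y, theta, t=1e-6):
    # Difference quotient along the unit vector (cos(theta), sin(theta)).
    ux, uy = math.cos(theta), math.sin(theta)
    return (f(x + t * ux, y + t * uy) - f(x, y)) / t

x0, y0 = 1.0, 2.0
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_theta = max(angles, key=lambda th: quotient(x0, y0, th))

# Angle of the hand-computed gradient (4, 1) at (1, 2), for comparison.
grad_theta = math.atan2(1.0, 4.0)
print(best_theta, grad_theta)
```

The two printed angles should agree to within the angular grid spacing, which is consistent with the gradient direction being the direction of steepest increase for this sample function.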

Theorem

If $f$ is differentiable at $\mathbf{x}$, the following equation holds between the directional derivative $\nabla _{\mathbf{u}} f$ of $f$ and its gradient $\nabla f$.

$$ \nabla _{\mathbf{u}} f = \nabla f \cdot \mathbf{u} = \dfrac{\partial f}{\partial x_{1}} u_{1} + \dfrac{\partial f}{\partial x_{2}} u_{2} + \dots + \dfrac{\partial f}{\partial x_{n}} u_{n} $$
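As a worked illustration of this formula, take $f(x,y,z) = xy + z^{2}$ at the point $\mathbf{x} = (1,2,3)$ with the unit vector $\mathbf{u} = \left( \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{2}{3} \right)$. The function, point, and direction are chosen here only as an example and do not come from the original text.

```latex
\begin{align*}
\nabla f(x, y, z) &= (y,\; x,\; 2z), \qquad \nabla f(1, 2, 3) = (2,\; 1,\; 6) \\
\nabla_{\mathbf{u}} f(1, 2, 3) &= \nabla f(1, 2, 3) \cdot \mathbf{u}
  = 2 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{2}{3} + 6 \cdot \tfrac{2}{3}
  = \tfrac{16}{3}
\end{align*}
```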

Proof

Let $g (t) = f (\mathbf{x} + t \mathbf{u})$. Since the derivative of a scalar-valued function of several variables is its gradient, differentiating $g$ by the chain rule gives the following.

$$ g^{\prime} (t) = f ^{\prime} (\mathbf{x} + t \mathbf{u}) \cdot \mathbf{u} = \nabla f (\mathbf{x} + t \mathbf{u}) \cdot \mathbf{u} $$

Setting $t = 0$, we obtain the following.

$$ g^{\prime} (0) = \nabla f (\mathbf{x}) \cdot \mathbf{u} $$

Furthermore, by the definition of the directional derivative, the following holds.

$$ \nabla _{\mathbf{u}} f = \lim \limits _{t \to 0} \dfrac{f (\mathbf{x} + t \mathbf{u}) - f(\mathbf{x})}{t} = \lim \limits _{t \to 0} \dfrac{ g(t) - g(0)}{t} = g^{\prime}(0) = \nabla f (\mathbf{x}) \cdot \mathbf{u} $$
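The identity at the end of the proof can be sanity-checked by comparing the difference quotient of $g$ with $\nabla f(\mathbf{x}) \cdot \mathbf{u}$ for shrinking $t$. The sketch below is only an illustration under assumptions not found in the original: the sample function `f`, its hand-computed gradient `grad_f`, the point `x0`, and the direction `u` are hypothetical choices.

```python
def f(p):
    # Hypothetical sample function f(x, y, z) = x*y + z^2.
    x, y, z = p
    return x * y + z * z

def grad_f(p):
    # Gradient of the sample function, computed by hand: (y, x, 2z).
    x, y, z = p
    return [y, x, 2.0 * z]

x0 = [1.0, 2.0, 3.0]
u = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]  # unit vector: 1/9 + 4/9 + 4/9 = 1

def g(t):
    # Restriction of f to the line through x0 in the direction u.
    return f([xi + t * ui for xi, ui in zip(x0, u)])

# Right-hand side of the theorem: gradient dotted with u.
exact = sum(gi * ui for gi, ui in zip(grad_f(x0), u))

errors = []
for t in (1e-1, 1e-2, 1e-3, 1e-4):
    quotient = (g(t) - g(0.0)) / t  # difference quotient of g at 0
    errors.append(abs(quotient - exact))
    print(t, quotient, errors[-1])
```

For this sample function the gap between the quotient and the dot product shrinks roughly in proportion to $t$, which matches the quotient converging to $g^{\prime}(0)$.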

See Also


1. Walter Rudin, Principles of Mathematical Analysis (3rd Edition, 1976), pp. 216-218