Gradient of Scalar Function in Cartesian Coordinate System

Definition

For a scalar function $f=f(x,y,z)$, the following vector function is defined as the gradient of $f$ and denoted by $\nabla f$:

$$ \nabla f := \frac{ \partial f}{ \partial x }\hat{\mathbf{x}}+\frac{ \partial f}{ \partial y}\hat{\mathbf{y}}+\frac{ \partial f}{ \partial z}\hat{\mathbf{z}} = \left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z} \right) $$
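As a quick numerical illustration of the definition, the sketch below approximates the three partial derivatives with central differences. The name `numerical_gradient`, the step size `h`, and the sample function `f` are assumptions chosen for this example and are not part of the definition itself.

```python
def numerical_gradient(f, x, y, z, h=1e-5):
    """Approximate (df/dx, df/dy, df/dz) at (x, y, z) by central differences."""
    dfdx = (f(x + h, y, z) - f(x - h, y, z)) / (2 * h)
    dfdy = (f(x, y + h, z) - f(x, y - h, z)) / (2 * h)
    dfdz = (f(x, y, z + h) - f(x, y, z - h)) / (2 * h)
    return (dfdx, dfdy, dfdz)


# Hypothetical sample function: f(x, y, z) = x^2 * y + z
def f(x, y, z):
    return x**2 * y + z


# Analytically, grad f = (2xy, x^2, 1), which equals (12, 4, 1) at (2, 3, 1).
approx = numerical_gradient(f, 2.0, 3.0, 1.0)
print(approx)  # close to (12, 4, 1), up to floating-point error
```

The result is a vector with one component per coordinate, which is exactly what the definition prescribes.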

Explanation

The term ‘gradient’ has also been translated as ‘slope’ or ‘incline’. These are older translations and are not commonly used nowadays; ‘slope’ in particular comes from a Sino-Korean word with essentially the same meaning. Because the gradient is actually a vector, ‘slope’ does not fully capture its meaning. Here at Sashimi Sushi, we use the term ‘gradient’ consistently.

Geometrically, $\nabla f$ gives the direction in which $f$ changes most rapidly. In other words, at the point $(x,y,z)$, the direction in which $f$ increases fastest is that of the vector $\left( \dfrac{\partial f(x,y,z)}{\partial x}, \dfrac{\partial f(x,y,z)}{\partial y}, \dfrac{\partial f(x,y,z)}{\partial z} \right)$. This simply extends the concept of the differential coefficient to multiple dimensions: in one dimension, the differential coefficient is positive where $f$ is increasing and negative where $f$ is decreasing.
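The claim about the direction of fastest increase can be checked numerically. The sketch below assumes the standard fact that the rate of change of $f$ along a unit vector $\mathbf{u}$ is $\nabla f \cdot \mathbf{u}$, samples many unit vectors on a coarse grid, and compares the best one with the normalized gradient. The sample function, the grid resolution, and all names are assumptions made for illustration.

```python
import math


def grad_f(x, y, z):
    # Analytic gradient of the hypothetical sample function f = x^2 * y + z
    return (2 * x * y, x**2, 1.0)


def directional_derivative(g, u):
    # Rate of change of f along the unit vector u: (grad f) . u
    return sum(gi * ui for gi, ui in zip(g, u))


def unit_directions(n_theta=60, n_phi=120):
    # Sample unit vectors on the sphere using spherical angles
    for i in range(n_theta + 1):
        theta = math.pi * i / n_theta
        for j in range(n_phi):
            phi = 2 * math.pi * j / n_phi
            yield (math.sin(theta) * math.cos(phi),
                   math.sin(theta) * math.sin(phi),
                   math.cos(theta))


g = grad_f(2.0, 3.0, 1.0)
norm = math.sqrt(sum(gi**2 for gi in g))
g_hat = tuple(gi / norm for gi in g)  # unit vector along the gradient

# Pick the sampled direction with the largest rate of increase
best = max(unit_directions(), key=lambda u: directional_derivative(g, u))
best_rate = directional_derivative(g, best)

print(best, g_hat)      # the best sampled direction is close to g_hat
print(best_rate, norm)  # the best rate is close to |grad f|
```

Because the directions are sampled on a finite grid, the best direction only approximates the gradient direction; refining the grid brings the two closer.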

Note also that the definition denotes $\left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z} \right)$ by $\nabla f$. Although $\nabla$ is called the del operator, treating it as an object with a meaning of its own can lead to misunderstandings, such as misreading $\nabla \cdot \mathbf{F}$ or $\nabla \times \mathbf{F}$ as a literal dot product or cross product. $\nabla$ is therefore best understood merely as a convenient notation: it is better to think of the gradient, divergence, and curl collectively as the del operators, or even to regard the del operator as synonymous with the gradient. More details follow below.

Points of Attention

$\nabla f$ is not the product of $\nabla$ and $f$

An important part of understanding the gradient is recognizing that $\nabla f$ is not the product of the vector $\nabla = \left( \frac{\partial }{\partial x}, \frac{\partial }{\partial y}, \frac{\partial }{\partial z} \right)$ and the scalar $f$. Interpreting it this way is intuitive and appealing, but the logic actually runs the other way: $\nabla$ is written like the vector $\left( \frac{\partial }{\partial x}, \frac{\partial }{\partial y}, \frac{\partial }{\partial z} \right)$ precisely so that $\nabla f$ looks like the product of a vector and a scalar. If $\nabla f$ really were the product of the vector $\nabla$ and the scalar $f$, then, since multiplication of a vector by a scalar is commutative, the following strange equation would hold:

$$ \nabla f = \left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z} \right) \overset{?}{=} \left( f\dfrac{\partial }{\partial x}, f\dfrac{\partial }{\partial y}, f\dfrac{\partial }{\partial z} \right) = f\nabla $$

This odd result arises because $\nabla$ is not actually a vector, and $\nabla f$ is not a product of a vector and a scalar. $\nabla$ is an operator that maps the scalar function $f(x,y,z)$ to the vector function $\left( \frac{\partial f(x,y,z)}{\partial x}, \frac{\partial f(x,y,z)}{\partial y}, \frac{\partial f(x,y,z)}{\partial z} \right)$. Let’s define a function $\operatorname{grad}$ that takes $f$ as its argument:

$$ \begin{equation} \operatorname{grad} (f) = \left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z} \right), \quad f=f(x,y,z) \end{equation} $$

In this definition, nothing needs to be said about the product of a vector and a scalar. $\operatorname{grad}$ is just a function (operator) that, given the argument $f$, follows the rule in $(1)$ to determine its value. However, writing $\operatorname{grad} = \nabla$, so that the value $\operatorname{grad} (f)$ is denoted by $\nabla f$, gives a convenient and intuitive notation, and it then becomes helpful to describe $\nabla$ as the vector $\nabla = \left( \frac{\partial }{\partial x}, \frac{\partial }{\partial y}, \frac{\partial }{\partial z} \right)$.

Leibniz’s notation for differentiation is not an exact account of the underlying principle but is used for convenience and ease of understanding. In the same way, $\nabla f$ is written as if it were a product of a vector and a scalar for computational convenience, though that is not its true nature.

What about $f\nabla$?

Following the explanation above, since $\nabla$ is a function, $\nabla f = \nabla(f)$ is the value obtained when $f$ is substituted into the function $\nabla$. On the other hand, $f\nabla$ is itself a function: when another function $g$ is substituted into it, it yields the following value:

$$ (f\nabla) (g) = f\left( \dfrac{\partial g}{\partial x}, \dfrac{\partial g}{\partial y}, \dfrac{\partial g}{\partial z} \right) = \left( f\dfrac{\partial g}{\partial x}, f\dfrac{\partial g}{\partial y}, f\dfrac{\partial g}{\partial z} \right) $$

Of course, the value $f\nabla g$ can be read either as the result of substituting $g$ into $f\nabla$ or as the product of the scalar function $f$ and the vector function $\nabla g$.
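One way to make the operator viewpoint concrete is to model $\nabla$ and $f\nabla$ as higher-order functions, that is, functions that take a function and return a function. This is only a sketch under assumptions: the names `grad` and `f_del`, the central-difference approximation, and the sample functions are all hypothetical choices for illustration.

```python
def grad(f, h=1e-5):
    """Return the vector function grad(f): (x, y, z) -> (df/dx, df/dy, df/dz)."""
    def grad_f(x, y, z):
        # Central-difference approximation of each partial derivative
        return (
            (f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h),
        )
    return grad_f


def f_del(f):
    """Return the operator (f del), which maps g to the vector function f * grad(g)."""
    def operator(g):
        grad_g = grad(g)
        def result(x, y, z):
            return tuple(f(x, y, z) * c for c in grad_g(x, y, z))
        return result
    return operator


# Hypothetical sample functions
def f(x, y, z):
    return x + y + z

def g(x, y, z):
    return x * y * z


point = (1.0, 2.0, 3.0)
print(grad(f)(*point))      # del f is a value: a vector function, evaluated here
print(f_del(f)(g)(*point))  # f del is an operator: it still needs an argument g
```

In this model `grad(f)` is already a vector function, whereas `f_del(f)` is still waiting for an argument, which mirrors the distinction between $\nabla f$ and $f\nabla$.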

Derivation

1-Dimension

1.png

Look at the picture above. The differential coefficient of $f_{1}$ at the point $x=2$ is $4$. This value tells us not only how steeply $f_{1}$ is inclined at $x=2$ but also, through the $+$ sign in front of the $4$, that the graph of $f_{1}$ increases in the direction of increasing $x$. The differential coefficient $4$ should therefore be understood not merely as a scalar but as the 1-dimensional vector $4\hat{\mathbf{x}}$.

2.png

Similarly, the differential coefficient of $f_{2}$ at $x=2$ is $-3$. This means that the steepness is $3$ and that the graph of $f_{2}$ decreases as $x$ increases. In other words, if we read the sign as a direction, the differential coefficient points toward where the graph of the function increases. Put differently, following the direction indicated by the differential coefficient leads uphill, toward a peak of the graph.

Before extending to 3 dimensions, recall that the differential coefficient of $y$ with respect to $x$, $\dfrac{ d y}{ d x}=a$, can be manipulated as if it were a fraction. Although this is not a mathematically rigorous way to handle differentiation, it helps in understanding the geometric meaning. Leibniz thought of $dy$ and $dx$ as very small changes in $y$ and $x$, respectively, and called the ratio between these changes the differential coefficient.

$$ dy=adx $$

As an aside, this helps explain why $a$ is called a differential ‘coefficient’.
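To see the relation $dy = a\,dx$ at work, the sketch below compares the actual change in $y$ with the linear estimate $a\,dx$ for a few small steps. The function $y = x^{2}$, the point $x_{0} = 2$, and the names used are assumptions chosen for illustration; they are not taken from the figures.

```python
def y(x):
    # Hypothetical example function; dy/dx = 2x
    return x**2


x0 = 2.0
a = 2 * x0  # differential coefficient of y at x0, here 4


def linear_error(dx):
    """Difference between the actual change in y and the estimate a * dx."""
    actual = y(x0 + dx) - y(x0)
    estimate = a * dx
    return actual - estimate


for dx in (0.1, 0.01, 0.001):
    print(dx, linear_error(dx))  # the error shrinks like dx**2
```

For this particular function the error is exactly $dx^{2}$ up to rounding, so it becomes negligible compared with $a\,dx$ as $dx$ shrinks.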

3D

Now suppose we are given a scalar function of three variables $f=f(x,y,z)$ and a position vector $\mathbf{r}=x\hat{\mathbf{x}}+y\hat{\mathbf{y}}+z\hat{\mathbf{z}}$. The change in $f$ is given by the total differential:

$$ \begin{equation} df=\frac{ \partial f}{ \partial x }dx + \frac{ \partial f}{ \partial y}dy+\frac{ \partial f}{ \partial z}dz \end{equation} $$

The change in $\mathbf{r}$ is as follows:

$$ d\mathbf{r}=dx\hat{\mathbf{x}}+dy\hat{\mathbf{y}}+dz\hat{\mathbf{z}} $$

Now, as in the 1D case, let’s find the quantity that represents the ratio between $df$ and $d\mathbf{r}$. Since $df$ is a scalar and $d\mathbf{r}$ is a vector, that quantity must be a vector, and $df$ can be thought of as its dot product with $d\mathbf{r}$. Let’s denote it by $\mathbf{a}=a_{1}\hat{\mathbf{x}}+a_{2}\hat{\mathbf{y}}+a_{3}\hat{\mathbf{z}}$ and write:

$$ \begin{align*} df=\mathbf{a}\cdot d\mathbf{r}&=(a_{1}\hat{\mathbf{x}}+a_{2}\hat{\mathbf{y}}+a_{3}\hat{\mathbf{z}})\cdot(dx\hat{\mathbf{x}}+dy\hat{\mathbf{y}}+dz\hat{\mathbf{z}}) \\ &= a_{1}dx+a_{2}dy+a_{3}dz \end{align*} $$

Comparing this with $(2)$ yields the following result:

$$ \mathbf{a}=\frac{ \partial f}{ \partial x}\hat{\mathbf{x}}+\frac{ \partial f}{ \partial y}\hat{\mathbf{y}}+\frac{ \partial f}{ \partial z}\hat{\mathbf{z}} $$

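From now on, we denote this vector $\mathbf{a}$ by $\nabla f$ and call it the gradient of $f$. Since $df = \nabla f \cdot d\mathbf{r} = |\nabla f| |d\mathbf{r}| \cos\theta$, the change $df$ for a step of fixed length is largest when $d\mathbf{r}$ is parallel to $\nabla f$. The direction of the gradient is therefore the direction in which $f$ increases most rapidly, and its magnitude is the rate of increase in that direction.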
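The relation $df = \nabla f \cdot d\mathbf{r}$ obtained in this derivation can be checked numerically for a small displacement. The sketch below is only an illustration: the sample function, its hand-computed gradient, the point, and the displacement are all assumptions.

```python
def f(x, y, z):
    # Hypothetical sample function
    return x**2 * y + z


def grad_f(x, y, z):
    # Gradient of the sample function, computed by hand: (2xy, x^2, 1)
    return (2 * x * y, x**2, 1.0)


def compare(r, dr):
    """Return (actual change in f, linear estimate grad f . dr) for a step dr from r."""
    moved = tuple(ri + di for ri, di in zip(r, dr))
    df_actual = f(*moved) - f(*r)
    df_linear = sum(gi * di for gi, di in zip(grad_f(*r), dr))
    return df_actual, df_linear


r = (2.0, 3.0, 1.0)
dr = (1e-3, -2e-3, 5e-4)
df_actual, df_linear = compare(r, dr)
print(df_actual, df_linear)  # the two agree up to terms of second order in dr
```

The small remaining difference comes from the second-order terms that the total differential neglects.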

  • Linearity:

    $$ \nabla (f + g) = \nabla f + \nabla g $$

  • Product Rule:

    $$ \nabla{(fg)}=f\nabla{g}+g\nabla{f} $$

    $$ \nabla(\mathbf{A} \cdot \mathbf{B}) = \mathbf{A} \times (\nabla \times \mathbf{B}) + \mathbf{B} \times (\nabla \times \mathbf{A})+(\mathbf{A} \cdot \nabla)\mathbf{B}+(\mathbf{B} \cdot \nabla) \mathbf{A} $$

  • Second Derivative:

    $$ \nabla \cdot (\nabla T) = \dfrac{\partial^{2} T}{\partial x^{2}} + \dfrac{\partial ^{2} T} {\partial y^{2}} + \dfrac{\partial ^{2} T}{\partial z^{2}} $$

    $$ \nabla \times (\nabla T)= \mathbf{0} $$

    $$ \nabla (\nabla \cdot \mathbf{A} ) $$

  • Fundamental Theorem of Gradient

    $$ T(b)-T(a) = \int _{a}^{b} (\nabla T) \cdot d\mathbf{l} $$

  • Integration Formulas

    $$ \int_{\mathcal{V}} (\nabla T) d \tau = \oint_{\mathcal{S}} T d \mathbf{a} $$

    $$ \int_{\mathcal{V}} \left[ T \nabla^{2} U + (\nabla T) \cdot (\nabla U) \right] d \tau = \oint_{\mathcal{S}} (T \nabla U) \cdot d \mathbf{a} $$

    $$ \int_{\mathcal{V}} \left( T \nabla^{2} U - U \nabla^{2} T \right) d \tau = \oint_{\mathcal{S}} \left( T \nabla U - U \nabla T \right) \cdot d \mathbf{a} $$

    $$ \int_{\mathcal{S}} \nabla T \times d \mathbf{a} = - \oint_{\mathcal{P}} T d \mathbf{l} $$

  • Integration by Parts

    $$ \int_{\mathcal{V}}\mathbf{A} \cdot (\nabla f)d\tau = \oint_{\mathcal{S}}f\mathbf{A} \cdot d \mathbf{a}-\int_{\mathcal{V}}f(\nabla \cdot \mathbf{A})d\tau $$

    $$ \int_{\mathcal{S}} f \left( \nabla \times \mathbf{A} \right) \cdot d \mathbf{a} = \int_{\mathcal{S}} \left[ \mathbf{A} \times \left( \nabla f \right) \right] \cdot d\mathbf{a} + \oint_{\mathcal{P}} f\mathbf{A} \cdot d\mathbf{l} $$
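Two of the formulas listed above, the product rule $\nabla(fg) = f\nabla g + g\nabla f$ and the fundamental theorem of gradient, are easy to spot-check numerically. The sketch below is only an illustration under assumptions: the sample functions, the central-difference helper `grad`, the straight-line path, and the midpoint rule are all choices made for this example.

```python
def grad(f, h=1e-5):
    """Return a function approximating grad f by central differences."""
    def grad_f(x, y, z):
        return (
            (f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h),
        )
    return grad_f


# Hypothetical sample scalar functions
def f(x, y, z):
    return x * y

def g(x, y, z):
    return y + z**2

def T(x, y, z):
    return x * y + z**2


def product_rule_sides(p):
    """Return (grad(fg), f grad g + g grad f) evaluated at the point p."""
    fg = lambda x, y, z: f(x, y, z) * g(x, y, z)
    lhs = grad(fg)(*p)
    rhs = tuple(f(*p) * dg + g(*p) * df
                for dg, df in zip(grad(g)(*p), grad(f)(*p)))
    return lhs, rhs


def line_integral_of_gradient(T, a, b, n=1000):
    """Midpoint-rule estimate of the integral of (grad T) . dl on the segment a -> b."""
    grad_T = grad(T)
    dl = tuple((bi - ai) / n for ai, bi in zip(a, b))
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        p = tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))
        total += sum(gi * di for gi, di in zip(grad_T(*p), dl))
    return total


p = (1.0, 2.0, 3.0)
lhs, rhs = product_rule_sides(p)
print(lhs, rhs)  # both sides of the product rule agree

a, b = (0.0, 0.0, 0.0), (1.0, 2.0, 3.0)
print(line_integral_of_gradient(T, a, b), T(*b) - T(*a))  # both close to 11
```

The line integral is taken along a straight segment only for simplicity; by the fundamental theorem, any path from $a$ to $b$ would give the same value.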

See Also