Gradient of a Scalar Function in the Cartesian Coordinate System
Definition
For a scalar function $f=f(x,y,z)$, the following vector function is defined as the gradient of $f$, denoted by $\nabla f$:
$$ \nabla f := \frac{ \partial f}{ \partial x }\hat{\mathbf{x}}+\frac{ \partial f}{ \partial y}\hat{\mathbf{y}}+\frac{ \partial f}{ \partial z}\hat{\mathbf{z}} = \left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z} \right) $$
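As a minimal sketch of the definition, the partial derivatives can be approximated numerically with central differences. The helper name `numerical_gradient`, the step size `h`, and the sample function $f = x^{2}y + z$ are illustrative assumptions, not part of the original text.

```python
def numerical_gradient(f, point, h=1e-5):
    """Approximate (df/dx, df/dy, df/dz) at `point` with central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h   # step forward along the i-th coordinate only
        backward[i] -= h  # step backward along the i-th coordinate only
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return tuple(grad)


def f(x, y, z):
    # Hypothetical example function: f = x^2 * y + z
    return x**2 * y + z


# The analytic gradient is (2xy, x^2, 1), which is (4, 1, 1) at the point (1, 2, 3).
gradient_at_point = numerical_gradient(f, (1.0, 2.0, 3.0))
print(gradient_at_point)
```

The printed components should agree with $(4, 1, 1)$ up to floating-point error.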
Explanation
The word gradient is also rendered as ‘slope’ or ‘incline’. The terms ‘slope’ and ‘incline’ are older translations of gradient and are not commonly used nowadays. Also, ‘slope’ is a Sino-Korean word for gradient, so the two are essentially the same. Because the gradient is a vector, the term ‘slope’ does not fully capture its meaning. Here at Sashimi Sushi, we use the term ‘gradient’ consistently.
Geometrically, $\nabla f$ points in the direction in which $f$ increases most rapidly. In other words, the direction in which the rate of increase of $f$ is highest at the point $(x,y,z)$ is the direction of the vector $\left( \dfrac{\partial f(x,y,z)}{\partial x}, \dfrac{\partial f(x,y,z)}{\partial y}, \dfrac{\partial f(x,y,z)}{\partial z} \right)$. This is just an extension of the concept of the differential coefficient (the derivative at a point) to multiple dimensions. In one dimension, if $f$ is increasing, the differential coefficient is positive; if $f$ is decreasing, the coefficient is negative.
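The steepest-ascent claim can be checked numerically. The sketch below, under assumed choices (the sample function $f = xy + z^{2}$, the point $(1,2,3)$, and 1000 random unit directions), compares the rate of change of $f$ along the gradient direction with the rate along other directions.

```python
import math
import random


def f(x, y, z):
    # Hypothetical example function: f = x*y + z^2
    return x * y + z**2


def directional_rate(f, point, direction, h=1e-5):
    """Central-difference rate of change of f at `point` along a unit `direction`."""
    forward = [p + h * d for p, d in zip(point, direction)]
    backward = [p - h * d for p, d in zip(point, direction)]
    return (f(*forward) - f(*backward)) / (2 * h)


point = (1.0, 2.0, 3.0)
grad = (2.0, 1.0, 6.0)  # analytic gradient (y, x, 2z) evaluated at `point`
grad_norm = math.sqrt(sum(g * g for g in grad))
grad_direction = [g / grad_norm for g in grad]

rate_along_gradient = directional_rate(f, point, grad_direction)

# Sample random unit directions and record the rate of change along each one.
random.seed(0)
sampled_rates = []
for _ in range(1000):
    v = [random.gauss(0, 1) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v))
    unit = [c / norm for c in v]
    sampled_rates.append(directional_rate(f, point, unit))

print(rate_along_gradient, grad_norm, max(sampled_rates))
```

No sampled direction should give a larger rate than the gradient direction, and the rate along the gradient should match $|\nabla f|$.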
Meanwhile, it’s important to note that $\left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z} \right)$ is denoted as $\nabla f$ in the definition. While $\nabla$ is called the del operator, thinking of it as having its own meaning can lead to misunderstandings, such as misinterpreting $\nabla \cdot \mathbf{F}$ or $\nabla \times \mathbf{F}$ as dot products or cross products. Thus, $\nabla$ should be understood merely as a convenient notation, and it’s better to think of the gradient, divergence, and curl collectively as del operators, or even to consider the del operator as equivalent to the gradient. More details will follow below.
Points of Attention
$\nabla f$ is not the product of $\nabla$ and $f$
An important aspect of understanding the gradient is recognizing that $\nabla f$ is not the product of the vector $\nabla = \left( \frac{\partial }{\partial x}, \frac{\partial }{\partial y}, \frac{\partial }{\partial z} \right)$ and the scalar $f$. It may seem intuitive and appealing to interpret it this way, but the logic runs the other way: $\nabla$ is written as the vector-like symbol $\left( \frac{\partial }{\partial x}, \frac{\partial }{\partial y}, \frac{\partial }{\partial z} \right)$ precisely so that $\nabla f$ looks like a product of a vector and a scalar. If $\nabla f$ were really the product of the vector $\nabla$ and the scalar $f$, then, since the product of a vector and a scalar is commutative, the following strange equation would hold:
$$ \nabla f = \left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z} \right) \overset{?}{=} \left( f\dfrac{\partial }{\partial x}, f\dfrac{\partial }{\partial y}, f\dfrac{\partial }{\partial z} \right) = f\nabla $$
This odd result arises because $\nabla$ is not actually a vector, and $\nabla f$ is not a product of a vector and a scalar. $\nabla$ is an operator that maps the scalar function $f(x,y,z)$ to the vector function $\left( \frac{\partial f(x,y,z)}{\partial x}, \frac{\partial f(x,y,z)}{\partial y}, \frac{\partial f(x,y,z)}{\partial z} \right)$. Let’s define a function $\operatorname{grad}$ that takes $f$ as its argument:
$$ \begin{equation} \operatorname{grad} (f) = \left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z} \right), \quad f=f(x,y,z) \end{equation} $$
In this definition, there is no need to explain anything about the product of a vector and a scalar. $\operatorname{grad}$ is just a function (operator) that, given the argument $f$, returns the function value determined by the rule in $(1)$. Writing $\operatorname{grad}$ as $\nabla$ simply gives a convenient and intuitive notation for that function value, and for this purpose it is helpful to present $\nabla$ as if it were the vector $\nabla = \left( \frac{\partial }{\partial x}, \frac{\partial }{\partial y}, \frac{\partial }{\partial z} \right)$.
Similarly to how Leibniz’s notation for differentiation isn’t an exact explanation of the underlying principle but is used for convenience and ease of understanding, $\nabla f$ also appears as a product of a vector and a scalar for computational convenience, though that’s not its true nature.
What about $f\nabla$?
Following the explanation above, since $\nabla$ is a function, $\nabla f = \nabla(f)$ is the function value obtained by applying $\nabla$ to the argument $f$. On the other hand, $f \nabla$ is itself a function (operator): given another function $g$ as its argument, it returns the following function value:
$$ (f\nabla) (g) = f\left( \dfrac{\partial g}{\partial x}, \dfrac{\partial g}{\partial y}, \dfrac{\partial g}{\partial z} \right) = \left( f\dfrac{\partial g}{\partial x}, f\dfrac{\partial g}{\partial y}, f\dfrac{\partial g}{\partial z} \right) $$
Of course, when looking at the function value $f \nabla g$, it can be seen as substituting $g$ into $f \nabla$, or as the product of the scalar function $f$ and the vector function $\nabla g$.
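The operator viewpoint can be sketched with higher-order functions. Here `grad` plays the role of $\nabla$ (a function that takes $f$ and returns a vector function), and `f_grad` plays the role of $f\nabla$ (a function that still waits for its argument $g$). Both names, the finite-difference approximation, and the sample functions are assumptions made for illustration.

```python
def grad(f, h=1e-5):
    """The operator 'nabla': takes a scalar function f, returns its gradient function."""
    def grad_f(x, y, z):
        return (
            (f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h),
        )
    return grad_f


def f_grad(f):
    """The operator 'f nabla': takes g and returns the vector function f * grad(g)."""
    def operator(g):
        grad_g = grad(g)

        def result(x, y, z):
            return tuple(f(x, y, z) * component for component in grad_g(x, y, z))
        return result
    return operator


def f(x, y, z):
    return x + y        # hypothetical example: f = x + y


def g(x, y, z):
    return x * y * z    # hypothetical example: g = xyz


p = (1.0, 2.0, 3.0)
nabla_f = grad(f)(*p)          # a vector: approximately (1, 1, 0)
f_nabla_g = f_grad(f)(g)(*p)   # f * grad(g) = 3 * (6, 3, 2) = (18, 9, 6)
print(nabla_f, f_nabla_g)
```

Note that `grad(f)` is already a vector function, while `f_grad(f)` is an operator that produces a vector function only after `g` is supplied, which is why the two cannot be interchanged.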
Derivation
1D
Look at the above picture. The differential coefficient of $f_{1}$ at point $x=2$ is $4$. This value not only tells how much the function $f_{1}$ is inclined at $x=2$, but also indicates that the graph of $f_{1}$ increases in the direction where $x$ increases, as suggested by the $+$ sign in front of $4$. Therefore, the differential coefficient $4$ should be understood not merely as a scalar, but as a 1-dimensional vector $4\hat{\mathbf{x}}$.
Similarly, the differential coefficient of $f_{2}$ at $x=2$ is $-3$. This means that the steepness is $3$ and that the graph of $f_{2}$ decreases as $x$ increases. In other words, if we read the sign as a direction, the differential coefficient points in the direction in which the function increases: $+\hat{\mathbf{x}}$ for $f_{1}$ and $-\hat{\mathbf{x}}$ for $f_{2}$. Put differently, moving in the direction indicated by the differential coefficient moves uphill on the graph.
Before extending to 3 dimensions, recall that the differential coefficient of $y$ at $x$, $\dfrac{ d y}{ d x}=a$, can be written as if it were a fraction. Although this is not a mathematically rigorous way to handle differentiation, it helps understand the geometric meaning and has its advantages. Leibniz thought of $dy$ and $dx$ as very small changes in $y$ and $x$, respectively, and called the ratio between these changes the differential coefficient.
$$ dy=adx $$
As an aside, this helps understand why $a$ is called a differential ‘coefficient’.
3D
Now, suppose a scalar function of three variables $f=f(x,y,z)$ and a position vector $\mathbf{r}=x\hat{\mathbf{x}}+y\hat{\mathbf{y}}+z\hat{\mathbf{z}}$ are given. The change in $f$ is expressed by the total differential:
$$ \begin{equation} df=\frac{ \partial f}{ \partial x }dx + \frac{ \partial f}{ \partial y}dy+\frac{ \partial f}{ \partial z}dz \end{equation} $$
The change in $\mathbf{r}$ is as follows:
$$ d\mathbf{r}=dx\hat{\mathbf{x}}+dy\hat{\mathbf{y}}+dz\hat{\mathbf{z}} $$
Now, as in the 1D case, let’s find something that represents the ratio between $df$ and $d\mathbf{r}$. Since $df$ is a scalar and $d\mathbf{r}$ is a vector, that ‘something’ must be a vector, and $df$ can be imagined as the dot product of that ‘something’ with $d\mathbf{r}$. Therefore, let’s denote that ‘something’ as $\mathbf{a}=a_{1}\hat{\mathbf{x}}+a_{2}\hat{\mathbf{y}}+a_{3}\hat{\mathbf{z}}$ and express it as follows:
$$ \begin{align*} df=\mathbf{a}\cdot d\mathbf{r}&=(a_{1}\hat{\mathbf{x}}+a_{2}\hat{\mathbf{y}}+a_{3}\hat{\mathbf{z}})\cdot(dx\hat{\mathbf{x}}+dy\hat{\mathbf{y}}+dz\hat{\mathbf{z}}) \\ &= a_{1}dx+a_{2}dy+a_{3}dz \end{align*} $$
Comparing this with $(2)$ yields the following result:
$$ \mathbf{a}=\frac{ \partial f}{ \partial x}\hat{\mathbf{x}}+\frac{ \partial f}{ \partial y}\hat{\mathbf{y}}+\frac{ \partial f}{ \partial z}\hat{\mathbf{z}} $$
From now on, let’s denote this vector $\mathbf{a}$ as $\nabla f$ and call it the gradient of $f$. The gradient points in the direction in which $f$ increases most rapidly, and its magnitude is the rate of increase of $f$ in that direction.
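The relation $df = \nabla f \cdot d\mathbf{r}$ can be checked for a small but finite displacement. In this sketch the function $f = y\sin x + z^{2}$, the point, and the displacement are all assumed for illustration; the two numbers agree only to first order in the displacement.

```python
import math


def f(x, y, z):
    # Hypothetical example function: f = y*sin(x) + z^2
    return math.sin(x) * y + z**2


def grad_f(x, y, z):
    # Analytic gradient of the example function above
    return (math.cos(x) * y, math.sin(x), 2 * z)


r = (0.5, 1.5, -1.0)      # position vector
dr = (1e-4, -2e-4, 3e-4)  # small displacement

moved = tuple(ri + di for ri, di in zip(r, dr))
actual_change = f(*moved) - f(*r)

# First-order prediction: df = grad(f) . dr
predicted_change = sum(g * d for g, d in zip(grad_f(*r), dr))

print(actual_change, predicted_change)
```

The difference between the two values is second order in the size of the displacement, so it shrinks much faster than the change itself as the displacement gets smaller.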
Related Formulas
Linearity:
$$ \nabla (f + g) = \nabla f + \nabla g $$
$$ \nabla{(fg)}=f\nabla{g}+g\nabla{f} $$ $$ \nabla(\mathbf{A} \cdot \mathbf{B}) = \mathbf{A} \times (\nabla \times \mathbf{B}) + \mathbf{B} \times (\nabla \times \mathbf{A})+(\mathbf{A} \cdot \nabla)\mathbf{B}+(\mathbf{B} \cdot \nabla) \mathbf{A} $$
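The product rule $\nabla(fg) = f\nabla g + g\nabla f$ can be spot-checked numerically at a single point. The helper `numerical_gradient` and the sample functions $f = x^{2} + y$ and $g = yz$ are illustrative assumptions.

```python
def numerical_gradient(func, point, h=1e-5):
    """Approximate the gradient of func at `point` with central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return tuple(grad)


def f(x, y, z):
    return x**2 + y   # hypothetical example function


def g(x, y, z):
    return y * z      # hypothetical example function


def fg(x, y, z):
    return f(x, y, z) * g(x, y, z)


p = (1.0, 2.0, 3.0)

# Left-hand side: gradient of the product
lhs = numerical_gradient(fg, p)

# Right-hand side: f * grad(g) + g * grad(f)
grad_f = numerical_gradient(f, p)
grad_g = numerical_gradient(g, p)
rhs = tuple(f(*p) * dg + g(*p) * df for df, dg in zip(grad_f, grad_g))

print(lhs, rhs)  # both should be close to (12, 15, 6)
```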
$$ \nabla \cdot (\nabla T) = \dfrac{\partial^{2} T}{\partial x^{2}} + \dfrac{\partial ^{2} T} {\partial y^{2}} + \dfrac{\partial ^{2} T}{\partial z^{2}} $$ $$ \nabla \times (\nabla T)= \mathbf{0} $$ $$\nabla (\nabla \cdot \mathbf{A} ) $$
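As a worked example of the first two second-derivative formulas, take the assumed function $T = x^{2}y + z^{3}$ (chosen only for illustration). Its gradient is

$$ \nabla T = \left( 2xy, x^{2}, 3z^{2} \right) $$

Taking the divergence and the curl of this vector function gives

$$ \begin{align*} \nabla \cdot (\nabla T) &= \dfrac{\partial (2xy)}{\partial x} + \dfrac{\partial (x^{2})}{\partial y} + \dfrac{\partial (3z^{2})}{\partial z} = 2y + 6z \\ \nabla \times (\nabla T) &= \left( \dfrac{\partial (3z^{2})}{\partial y} - \dfrac{\partial (x^{2})}{\partial z}, \dfrac{\partial (2xy)}{\partial z} - \dfrac{\partial (3z^{2})}{\partial x}, \dfrac{\partial (x^{2})}{\partial x} - \dfrac{\partial (2xy)}{\partial y} \right) = (0, 0, 2x - 2x) = \mathbf{0} \end{align*} $$

The first result matches $\dfrac{\partial^{2} T}{\partial x^{2}} + \dfrac{\partial^{2} T}{\partial y^{2}} + \dfrac{\partial^{2} T}{\partial z^{2}} = 2y + 0 + 6z$ computed directly from $T$.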
Fundamental Theorem of Gradient
$$ T(b)-T(a) = \int _{a}^{b} (\nabla T) \cdot d\mathbf{l} $$
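The fundamental theorem can be illustrated by integrating $\nabla T$ numerically along two different paths between the same endpoints. The function $T = xy^{2}$, the endpoints $a=(0,0,0)$ and $b=(2,1,0)$, the two paths, and the midpoint-rule helper `line_integral` are all assumptions made for this sketch.

```python
def T(x, y, z):
    return x * y**2   # hypothetical example function


def grad_T(x, y, z):
    return (y**2, 2 * x * y, 0.0)   # analytic gradient of T


def line_integral(grad, path, steps=1000):
    """Midpoint-rule approximation of the integral of grad . dl along path(t), t in [0, 1]."""
    total = 0.0
    for i in range(steps):
        t0 = i / steps
        t1 = (i + 1) / steps
        start, end = path(t0), path(t1)
        g = grad(*path((t0 + t1) / 2))   # gradient at the segment midpoint
        total += sum(gc * (e - s) for gc, s, e in zip(g, start, end))
    return total


def straight(t):
    return (2 * t, t, 0.0)      # straight line from a to b


def curved(t):
    return (2 * t, t**2, 0.0)   # parabolic path from a to b


a = (0.0, 0.0, 0.0)
b = (2.0, 1.0, 0.0)

difference = T(*b) - T(*a)
integral_straight = line_integral(grad_T, straight)
integral_curved = line_integral(grad_T, curved)

print(difference, integral_straight, integral_curved)  # all close to 2
```

Both integrals should agree with $T(b) - T(a) = 2$ up to discretization error, which reflects the fact that the line integral of a gradient depends only on the endpoints.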
$$ \int_{\mathcal{V}} (\nabla T) d \tau = \oint_{\mathcal{S}} T d \mathbf{a} $$ $$ \int_{\mathcal{V}} \left[ T \nabla^{2} U + (\nabla T) \cdot (\nabla U) \right] d \tau = \oint_{\mathcal{S}} (T \nabla U) \cdot d \mathbf{a} $$ $$ \int_{\mathcal{V}} \left( T \nabla^{2} U - U \nabla^{2} T \right) d \tau = \oint_{\mathcal{S}} \left( T \nabla U - U \nabla T \right) \cdot d \mathbf{a} $$ $$ \int_{\mathcal{S}} \nabla T \times d \mathbf{a} = - \oint_{\mathcal{P}} T d \mathbf{l} $$
$$ \int_{\mathcal{V}}\mathbf{A} \cdot (\nabla f)d\tau = \oint_{\mathcal{S}}f\mathbf{A} \cdot d \mathbf{a}-\int_{\mathcal{V}}f(\nabla \cdot \mathbf{A})d\tau $$ $$ \int_{\mathcal{S}} f \left( \nabla \times \mathbf{A} \right) \cdot d \mathbf{a} = \int_{\mathcal{S}} \left[ \mathbf{A} \times \left( \nabla f \right) \right] \cdot d\mathbf{a} + \oint_{\mathcal{P}} f\mathbf{A} \cdot d\mathbf{l} $$