

Why Is the Notation for Partial Derivatives Different?

Question

In partial differentiation, unlike ordinary differentiation, expressions like $\displaystyle {{ \partial f } \over { \partial t }}$ are used instead of $\displaystyle {{ d f } \over { d t }}$. The symbol $\partial$ is read as “round dee” or “partial,” and historically it originated as a “curly dee,” a cursive form of $d$¹. In LaTeX it is written \partial, and in Korea some people even shorten the reading to just “round,” finding “round dee” too long.

Why is $d$ written as $\partial$?

The problem is that it is hard to see why a different symbol is needed for partial differentiation, which is, after all, just differentiation with respect to one of several variables. At the undergraduate level this question inevitably comes up the moment partial differentiation is introduced, and the usual answers are “that’s for mathematicians to worry about” if you are not in a math department, or “just accept it as a notational difference” if you are. Neither answer is exactly wrong: whether one writes $d$ or $\partial$, the meaning of the equation itself does not change. For example, in the context of the heat equation, $$ {{ \partial u } \over { \partial t }} = {{ \partial^{2} u } \over { \partial x^{2} }} $$ one might replace $\partial$ with the ordinary differential symbol $d$ and write $$ {{ d u } \over { d t }} = {{ d^{2} u } \over { d x^{2} }} $$ and then ask whether the two equations are the same. Confusingly, the answer is ‘yes, they are,’ which leads many students to conclude that there is no difference between $d$ and $\partial$, or to simply accept the definition and move on.

Answer

Newton and Leibniz

Before diving into partial differentiation, let us start with an interesting aside about the two fathers of differentiation, Newton and Leibniz. Today both are credited with independently inventing the concept and the notation of differentiation. For a function $y = f(x)$ and its derivative, Newton wrote $$ y ' = f ' (x) $$ while Leibniz wrote $$ {{ dy } \over { dx }} = {{ d f(x) } \over { dx }} $$ The notations differ, despite describing the same differentiation, because the two men arrived at calculus through different thought processes and perspectives. It is fortunate that two people invented differentiation independently at the same time; a third would not have hurt either. Newton, as the master of [classical mechanics](../../categories/Classical Mechanics), constantly spoke of ‘differentiating position once to get velocity, and twice to get acceleration,’ so in his notation expressions like $$ \begin{align*} v =& x ' \\ a =& v ' = x '' \end{align*} $$ are very neat and efficient. Leibniz, on the other hand, took a more logical, geometric approach: since the slope of a line is the ratio of the changes in the horizontal and vertical directions, the slope of the tangent to a curve can be approached naturally by taking very small increments, $$ {{ \Delta y } \over { \Delta x }} \approx {{ d y } \over { d x }} $$ Interestingly, even for ordinary differentiation the field allows this kind of divergence in notation, and Newton’s and Leibniz’s notations coexist to this day.

In differential geometry, differentiation with respect to $s$ and with respect to $t$ are written $$ {{ df } \over { ds }} = f^{\prime} \quad \text{and} \quad {{ df } \over { dt }} = \dot{f} $$ The dot $\dot{}$ and the prime $'$ both denote differentiation, but in the context of differential geometry they can be told apart: typically $s$ is the arc-length parameter of a unit-speed curve, while $t$ is a general parameter, related to it by the arc-length reparameterization $s = s(t)$.

This notation didn’t arise because the concept of differentiation was transformed. In differential geometry, differentiation is often performed with both $s$ and $t$, but Newton’s notation doesn’t allow distinguishing what is being differentiated, and Leibniz’s notation makes the expression too complicated, leading to the creation of an additional notation to take advantage of both.

What’s fascinating is that, despite $s$ and $t$ being just variables used as parameters, in the context of ordinary differential equations involving time, the derivative with respect to $t$ began to be written not as $v '$ but as $\dot{v}$, borrowing the first letter. As a result, in almost all systems describing changes over time, dynamics prefer to use expressions like $$ \dot{v} = f(v) $$ instead of $v '$. The point is that the concern to clearly and neatly represent ‘what is being differentiated’ can naturally arise even outside the context of partial differentiation.

Implications of Multivariable Functions

In the previous section we saw that $f '$ and $\dot{f}$ can be distinguished by their form alone, and that in dynamical systems in particular, even when time $t$ does not appear in the expression, differentiation with respect to time can be implied by universal convention and context. Let us push a little further on the implicit information an expression can carry.

Returning to partial differentiation: the reason $d$ and $\partial$ do not seem different is that there is no difference in the partial derivatives they produce. For instance, if the derivative of $f$ with respect to $t$ is $g$, then $g$ can be written $$ g = {{ d f } \over { d t }} = {{ \partial f } \over { \partial t }} $$ with either $d$ or $\partial$, and it hardly matters: whichever notation is used, we differentiate with respect to $t$, and the ‘result’ $g$ is the same. In fact, the implicit information carried by $\partial$ is not about $g$ but about $f$. Suppose we are told that differentiating some function $H$ with respect to $x$ gives $h$, and consider the following two expressions:

Without $\partial$: $\displaystyle h = {{ d H } \over { d x }} \implies$ apparently $H$ differentiates to $h$, and that is all.
With $\partial$: $\displaystyle h = {{ \partial H } \over { \partial x }} \implies$ why only with respect to $x$? There must be some other variable $y$, so it should be that $H = H (x , y)$.

In other words, the notation $\partial$ itself implies that the given function is a multivariable function. One’s first serious encounter with partial differentiation is usually through partial differential equations, and in an equation like $$ {{ \partial u } \over { \partial t }} = {{ \partial^{2} u } \over { \partial x^{2} }} $$ we are interested neither in the first-order partial derivative $u_{t}$ of $u$ with respect to $t$ nor in the second-order partial derivative $u_{xx}$ of $u$ with respect to $x$ by themselves, but in which function $u = u (t,x)$ of $t$ and $x$ makes the two sides equal. From this perspective, using $\partial$ in partial differential equations is justified and natural.
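
As a quick sanity check of what such an equation asks for, here is a minimal sympy sketch; the function $u(t,x) = e^{-t} \sin x$ is a hypothetical solution chosen purely for illustration:

```python
import sympy as sp

t, x = sp.symbols('t x')
u = sp.exp(-t) * sp.sin(x)  # a hypothetical solution u(t, x), chosen for illustration

u_t = sp.diff(u, t)      # first-order partial derivative with respect to t
u_xx = sp.diff(u, x, 2)  # second-order partial derivative with respect to x

# The PDE asks: for which u(t, x) are these equal? This u passes the check.
print(sp.simplify(u_t - u_xx) == 0)  # True
```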

On the other hand, once such a convention is widely accepted, the meaning of $d$ itself changes as well. If a function is not multivariable there is no point in differentiating it with $\partial$, so a derivative written with $d$ implies that the function is not multivariable. For instance, if we fix the location of a bivariate function $u = u (t,x)$ at a point and set $u = u \left( t , x_{0} \right)$, then the expression $$ \left. {{ \partial u } \over { \partial t }} \right|_{x = x_{0} } = {{ d u } \over { d t }} = \dot{u} $$ makes excellent use of the implicit information carried by $\partial$ and $d$. This goes beyond a mere difference in appearance and influences how we handle equations, leading to ideas such as transforming a partial differential equation into a relatively simple ordinary differential equation in order to solve it.
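
A minimal sympy sketch of this idea, reusing the hypothetical $u(t,x) = e^{-t} \sin x$ and an arbitrary fixed point $x_{0}$: once the location is frozen, the partial derivative in $t$ coincides with the ordinary derivative of the resulting one-variable function.

```python
import sympy as sp

t, x, x0 = sp.symbols('t x x_0')
u = sp.exp(-t) * sp.sin(x)  # hypothetical bivariate function u = u(t, x)

# Take the partial derivative in t, then freeze the location x = x0 ...
partial_then_fix = sp.diff(u, t).subs(x, x0)

# ... or freeze x = x0 first, so u depends on t alone, then take d/dt.
fix_then_ordinary = sp.diff(u.subs(x, x0), t)

print(sp.simplify(partial_then_fix - fix_then_ordinary) == 0)  # True
```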

✅ To Avoid Confusion in Total Differentiation

$$ df = \frac{ \partial f}{ \partial x_{1} }dx_{1} + \frac{ \partial f}{ \partial x_{2} }dx_{2} + \cdots + \frac{ \partial f}{ \partial x_{n} }dx_{n} $$ For a multivariable function $f : \mathbb{R}^{n} \to \mathbb{R}$, the total differential used in fields like mathematical physics is usually written as above. To make it more concrete, take $n = 3$ and write the variables as $x,y,z$, assumed independent: $$ df = {{ \partial f } \over { \partial x }} dx + {{ \partial f } \over { \partial y }} dy + {{ \partial f } \over { \partial z }} dz $$ At first glance the mixture of $d$ and $\partial$ looks complicated, but by applying Leibniz’s legacy of ‘dividing both sides by $dt$ or $dx$’ we see $$ \begin{align*} df =& {{ \partial f } \over { \partial x }} dx + {{ \partial f } \over { \partial y }} dy + {{ \partial f } \over { \partial z }} dz \\ {{ d f } \over { d t }} =& {{ \partial f } \over { \partial x }} {{ d x } \over { d t }} + {{ \partial f } \over { \partial y }} {{ d y } \over { d t }} + {{ \partial f } \over { \partial z }} {{ d z } \over { d t }} \\ {{ d f } \over { d x }} =& {{ \partial f } \over { \partial x }} {{ d x } \over { d x }} + {{ \partial f } \over { \partial y }} {{ d y } \over { d x }} + {{ \partial f } \over { \partial z }} {{ d z } \over { d x }} = {{ \partial f } \over { \partial x }} \end{align*} $$ that it cleanly expresses both differentiating $f$ with respect to $t$ and partially differentiating it with respect to $x$ (in the last line $dy/dx = dz/dx = 0$ precisely because $x,y,z$ are independent). This shows how useful the total differential is for manipulating equations. Now suppose we eliminate $\partial$ altogether and unify everything under $d$: $$ df = {{ d f } \over { d x }} dx + {{ d f } \over { d y }} dy + {{ d f } \over { d z }} dz $$ Leibniz’s differential notation is wonderfully intuitive when numerators and denominators are handled like fractions, but if you are reading this, you should know better than to treat $dx$, $dy$, or $dz$ so carelessly. Even so, your instincts will scream to simplify: $$ \begin{align*} df =& {{ d f } \over { dx }} dx + {{ d f } \over { dy }} dy + {{ d f } \over { dz }} dz \\ \overset{?}{=} & {{ d f } \over { \cancel{dx} }} \cancel{dx} + {{ d f } \over { \cancel{dy} }} \cancel{dy} + {{ d f } \over { \cancel{dz} }} \cancel{dz} \\ =& df + df + df \\ \overset{???}{=}& 3 df \end{align*} $$ This disaster is a circular argument that overlooks the conditions under which $d$ equals $\partial$. The step that casually ‘eliminates $\partial$ and unifies with $d$’ is far too bold, because the very idea of replacing $\partial$ with $d$ came from $$ df = {{ \partial f } \over { \partial x }} dx + {{ \partial f } \over { \partial y }} dy + {{ \partial f } \over { \partial z }} dz \implies {{ d f } \over { d x }} = {{ \partial f } \over { \partial x }} \implies d \equiv \partial $$ under the assumption that $x,y,z$ are independent. Tampering with $df = {{ \partial f } \over { \partial x }} dx + {{ \partial f } \over { \partial y }} dy + {{ \partial f } \over { \partial z }} dz$, the very basis for $d \equiv \partial$, naturally invites some kind of error. For $d$ and $\partial$ to coincide, either the variables of the multivariable function must be independent, as assumed in the example, or some special condition or remarkable theorem must genuinely equate $d$ and $\partial$.
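
The legitimate version of ‘dividing both sides by $dt$’ is just the multivariate chain rule, which sympy applies automatically once the dependencies are declared; the function $f$ below is a hypothetical example, assuming $x, y, z$ each depend on $t$:

```python
import sympy as sp

t = sp.symbols('t')
x, y, z = sp.Function('x')(t), sp.Function('y')(t), sp.Function('z')(t)

f = x**2 + y*z  # hypothetical f, with x, y, z declared as functions of t

# df/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) + (∂f/∂z)(dz/dt), via the chain rule
print(sp.diff(f, t))
# 2*x(t)*Derivative(x(t), t) + y(t)*Derivative(z(t), t) + z(t)*Derivative(y(t), t)
```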

From the considerations so far, we can summarize: the reason partial differentiation uses $\partial$ instead of $d$ is that they really are different. Every example we saw in which $d$ and $\partial$ coincided implicitly assumed certain conditions. Under such good assumptions $\partial$ may essentially be the same as $d$, but that does not mean we are entitled to rewrite $\partial$ as $d$.

❌ Treating Variables Other Than the Differentiated One as Constants?

To put it bluntly, this explanation is wrong.

More precisely, the causal relationship in this explanation is reversed. For example, if $f(t,x) = t^{2} + x^{2}$, it is not that $$ {{ \partial f } \over { \partial t }} = 2t + 0 = 2t = {{ d f } \over { d t }} $$ holds because $\partial$ formally treats every variable other than $t$ as a constant; rather, as we saw in the previous section, it is the assumption $\displaystyle {{ dx } \over { dt }} = 0$ that makes $$ \begin{align*} & df = {{ \partial f } \over { \partial t }} dt + {{ \partial f } \over { \partial x }} dx \\ \implies & {{ d f } \over { d t }} = {{ \partial f } \over { \partial t }} {{ dt } \over { dt }} + {{ \partial f } \over { \partial x }} {{ dx } \over { dt }} \\ \implies & {{ d f } \over { d t }} = {{ \partial f } \over { \partial t }} \cdot 1 + {{ \partial f } \over { \partial x }} \cdot 0 \\ \implies & {{ d f } \over { d t }} = {{ \partial f } \over { \partial t }} \end{align*} $$ hold. Partial differentiation $\partial$ did not itself produce the result $\displaystyle {{ dx } \over { dt }} = 0$; the cause $\displaystyle {{ dx } \over { dt }} = 0$ led to the result $\partial \equiv d$. The explanation ‘partial differentiation treats all variables other than the one being differentiated as constants’ therefore gives the misleading impression that $\partial$ is somehow a more powerful operator than ordinary differentiation $d$. Moreover, if $x$ really were a constant, it should disappear after differentiation with respect to $t$; but as $f(t,x) = t^{2} + x^{2} + 2tx$ readily shows, $\dfrac{\partial f}{\partial t} = 2t + 2x$ remains a bivariate function of $(t,x)$.
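
A two-line sympy check of that last claim: differentiating $f(t,x) = t^{2} + x^{2} + 2tx$ partially with respect to $t$ leaves an expression that still contains $x$.

```python
import sympy as sp

t, x = sp.symbols('t x')
f = t**2 + x**2 + 2*t*x

df_dt = sp.diff(f, t)            # partial derivative with respect to t
print(df_dt)                     # 2*t + 2*x -- x has not disappeared
print(x in df_dt.free_symbols)   # True: the result is still bivariate
```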

This fallacy persists because it is so plausible. In practice, when a relationship between the variables such as $x = x(t)$ is assumed, there is no need to invoke ‘partial differentiation with respect to $t$’ in the first place; by the chain rule, $$ \begin{align*} {{ d f } \over { d t }} =& {{ d } \over { d t }} \left( t^{2} + x^{2} \right) \\ =& 2t + {{ d x^{2} } \over { d x }} {{ dx } \over { dt }} \\ =& 2t + 2x \dot{x} \end{align*} $$ the computation unfolds without ambiguity from the start, as the sketch below also shows. In this example, at least, $f = f(t,x)$ is either essentially the same as a univariate function $f = f(t)$ or needlessly complicated, and so textbooks end up including only the cases that are clean, independent, and yet still multivariable. People study with clean examples, time passes, familiarity with partial differentiation grows, the wrong intuition settles in, and everyone around them does the same. But wrong is wrong: merely changing the notation of differentiation cannot arbitrarily alter the dependencies of the given function.
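
The displayed computation can be reproduced in the same sympy setting as above by declaring the dependency $x = x(t)$ up front; once the dependency is explicit, no notion of ‘partial’ differentiation is needed.

```python
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')(t)  # declare the dependency x = x(t) explicitly

f = t**2 + x**2
print(sp.diff(f, t))  # 2*t + 2*x(t)*Derivative(x(t), t), i.e. 2t + 2x*xdot
```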