
Automatic Differentiation and Dual Numbers



Dual numbers are numbers that can be expressed in the following form for two real numbers $a, b \in \mathbb{R}$.

$$ a + b\epsilon, \quad (\epsilon^{2} = 0,\ \epsilon \neq 0) $$

The arithmetic of dual numbers (their addition and multiplication) is useful for implementing the forward mode of automatic differentiation.


In automatic differentiation, and in forward mode in particular, the derivative is computed alongside the function value. For example, to differentiate $y(x) = \ln (x^{2} + \sin x)$, we can proceed step by step as in the following table, writing $\dot{w} = \dfrac{dw}{dx}$ for the derivative of each intermediate quantity $w$.

$$ \begin{array}{|l|l|} \hline \textbf{Forward calculations} & \textbf{Derivatives} \\ \hline w_{1} = x & \dot{w}_{1} = 1 \\[0.5em] w_{2} = w_{1}^{2} & \dot{w}_{2} = 2w_{1} = 2x \\[0.5em] w_{3} = \sin w_{1} & \dot{w}_{3} = \cos w_{1} = \cos x \\[0.5em] w_{4} = w_{2} + w_{3} & \dot{w}_{4} = \dot{w}_{2} + \dot{w}_{3} = 2x + \cos x \\[0.5em] w_{5} = \ln (w_{4}) & \dot{w}_{5} = \dfrac{\dot{w}_{4}}{w_{4}} = \dfrac{2x + \cos x}{x^{2} + \sin x} \\[1em] \hline \end{array} $$
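The table can be transcribed almost line by line. The following is a minimal Python sketch added for illustration, not code from the original text; the function name `forward_trace` is hypothetical, and each statement mirrors one row of the table, carrying the value and its derivative side by side.

```python
import math

def forward_trace(x):
    """Evaluate y = ln(x^2 + sin x) and dy/dx together, one table row per line."""
    w1, dw1 = x, 1.0                              # w1 = x
    w2, dw2 = w1 ** 2, 2 * w1 * dw1               # w2 = w1^2
    w3, dw3 = math.sin(w1), math.cos(w1) * dw1    # w3 = sin(w1)
    w4, dw4 = w2 + w3, dw2 + dw3                  # w4 = w2 + w3
    w5, dw5 = math.log(w4), dw4 / w4              # w5 = ln(w4), needs w4 > 0
    return w5, dw5

y, dy = forward_trace(1.0)
print(y, dy)
```

The sketch assumes an input where $x^{2} + \sin x > 0$, since the logarithm is otherwise undefined.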

Dual-number arithmetic makes it possible to compute the function value and the derivative together in a natural way. Let us write the dual number $a + b\epsilon$ as an ordered pair $(a, b)$.

Addition of Dual Numbers

$$ (a, b) + (c, d) = (a + c, b + d) $$

Multiplication of Dual Numbers

$$ (a, b)(c, d) = (ac, ad + bc) $$

Differentiable Functions Defined on Dual Numbers

For a differentiable function $f : \mathbb{R} \to \mathbb{R}$, $$ f(a + b\epsilon) := f(a) + f^{\prime}(a)b\epsilon = \big( f(a), b f^{\prime}(a) \big) $$
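One way to motivate this definition is a formal Taylor expansion. This assumes that $f$ can be expanded in a Taylor series around $a$, which is a stronger assumption than differentiability and is introduced here only for illustration. Substituting $a + b\epsilon$ and using $\epsilon^{2} = 0$, every term of order two or higher vanishes:

$$ f(a + b\epsilon) = f(a) + f^{\prime}(a) b\epsilon + \dfrac{f^{\prime\prime}(a)}{2!} b^{2}\epsilon^{2} + \cdots = f(a) + f^{\prime}(a) b\epsilon $$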

Composition of Functions Defined on Dual Numbers

For $f, g : \mathbb{R} \to \mathbb{R}$, $$ (f \circ g)(a + b\epsilon) := f(g(a)) + f^{\prime}(g(a))g^{\prime}(a)b\epsilon = \big( f(g(a)), bf^{\prime}(g(a))g^{\prime}(a) \big) $$
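The four rules above can be sketched as a small Python class. This is an illustrative sketch under stated assumptions, not a reference implementation: the names `Dual`, `val`, `der`, and `lift` are hypothetical, and only addition, multiplication, and two elementary functions are covered.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    val: float  # real part a
    der: float  # epsilon coefficient b

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # Expand (a + b eps)(c + d eps); the eps^2 term vanishes.
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

def lift(f, df):
    """Extend a real function f with derivative df: f(a + b eps) = (f(a), b f'(a))."""
    return lambda z: Dual(f(z.val), z.der * df(z.val))

sin = lift(math.sin, math.cos)
log = lift(math.log, lambda a: 1.0 / a)

# Composition needs no separate rule: applying lifted functions in sequence
# multiplies the derivative factors, which is exactly the chain rule.
z = log(sin(Dual(1.0, 1.0)))
print(z)
```

Here `z.val` holds $\ln(\sin 1)$ and `z.der` holds $\cos 1 / \sin 1$, the derivative of $\ln(\sin x)$ at $x = 1$.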


Represent the variable of differentiation $x$ as the dual number $(x, 1)$ and a constant $\alpha$ as $(\alpha, 0)$. With this convention, dual-number addition yields the function value in the first component and the derivative in the second. For example, consider the function (constant addition) $x \mapsto x + \alpha$. Its value at $x = x_{0}$ is $x_{0} + \alpha$, and its derivative is $\left. \dfrac{d(x + \alpha)}{dx}\right|_{x = x_{0}} = 1$. Expressed in dual numbers, this reads as follows:

$$ (x, 1) + (\alpha, 0) = (x + \alpha, 1) $$

The first component is the function value $x + \alpha$, and the second component is the derivative $1$. Of course, this also holds for $x + x$.

$$ \dfrac{d(x+x)}{dx} = 2, \qquad (x, 1) + (x, 1) = (2x, 2) $$

Now consider the function $x \mapsto \alpha x$ (multiplication by a constant). At $x = x_{0}$, the function value is $\alpha x_{0}$, and the derivative is $\left. \dfrac{d(\alpha x)}{dx} \right|_{x = x_{0}} = \alpha$. In dual numbers, this is expressed as follows:

$$ (x, 1)(\alpha, 0) = (\alpha x, x\cdot0 + 1\cdot\alpha) = (\alpha x, \alpha) $$

Again, the first component is the function value, and the second component is the derivative. This also holds for squaring, $x \mapsto x^{2}$.

$$ (x, 1)(x, 1) = (x^{2}, 2x) $$
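The seeding convention, $(x, 1)$ for the variable and $(\alpha, 0)$ for a constant, can be checked numerically. The sketch below is an illustration that uses plain tuples; the helper names `add`, `mul`, `variable`, and `constant` are hypothetical.

```python
def add(p, q):
    # (a, b) + (c, d) = (a + c, b + d)
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    # First component: product of values; second: product rule.
    return (p[0] * q[0], p[0] * q[1] + p[1] * q[0])

def variable(x):
    return (x, 1.0)       # dx/dx = 1

def constant(alpha):
    return (alpha, 0.0)   # d(alpha)/dx = 0

x0, alpha = 3.0, 5.0
print(add(variable(x0), constant(alpha)))  # x + alpha  -> (8.0, 1.0)
print(add(variable(x0), variable(x0)))     # x + x      -> (6.0, 2.0)
print(mul(variable(x0), constant(alpha)))  # alpha * x  -> (15.0, 5.0)
print(mul(variable(x0), variable(x0)))     # x^2        -> (9.0, 6.0)
```

In each pair, the second entry is the derivative of the corresponding function evaluated at $x_{0} = 3$.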

The same holds when $(x, 1)$ is substituted into a differentiable function $f$ or a composite function $f \circ g$: by the definitions above, the second component is again the derivative.

$$ f(x, 1) = \big( f(x), f^{\prime}(x) \big), \qquad (f \circ g)(x, 1) = \big( f(g(x)), f^{\prime}(g(x))g^{\prime}(x) \big) $$

Now, let’s revisit the example $y(x) = \ln (x^{2} + \sin x)$ mentioned earlier. If we substitute the dual number $(x, 1)$ instead of the real number $x$, the calculation is as follows:

$$ \begin{align*} (x, 1)^{2} &= (x^{2}, 2x) \\ \sin(x, 1) &= (\sin x, \cos x) \\ (x, 1)^{2} + \sin(x, 1) &= (x^{2} + \sin x, 2x + \cos x) \\ \ln( (x, 1)^{2} + \sin(x, 1) ) &= \ln(x^{2} + \sin x, 2x + \cos x) \\ &= \Big( \ln(x^{2} + \sin x), (2x + \cos x) \dfrac{1}{x^{2} + \sin x} \Big) \\ &= \Big( \ln(x^{2} + \sin x), \dfrac{2x + \cos x}{x^{2} + \sin x} \Big) \\ \end{align*} $$

Differentiating $y$ directly gives the following, which matches the second component of the dual number.

$$ \dfrac{dy}{dx} = \dfrac{d}{dx} \ln (x^{2} + \sin x) = \dfrac{2x + \cos x}{x^{2} + \sin x} $$
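The whole example can be reproduced numerically. The following Python sketch is an illustration under stated assumptions rather than a definitive implementation: the class `Dual` and the helpers `dsin` and `dlog` are hypothetical names, and the evaluation point $x_{0} = 1$ is chosen arbitrarily so that $x^{2} + \sin x > 0$.

```python
import math

class Dual:
    """Pair (val, der) standing for val + der * epsilon."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

def dsin(z):
    # sin(a + b eps) = (sin a, b cos a)
    return Dual(math.sin(z.val), z.der * math.cos(z.val))

def dlog(z):
    # ln(a + b eps) = (ln a, b / a), valid for a > 0
    return Dual(math.log(z.val), z.der / z.val)

def y(x):
    # Same expression as in the text, evaluated on dual numbers.
    return dlog(x * x + dsin(x))

x0 = 1.0
out = y(Dual(x0, 1.0))  # seed the variable with derivative 1
exact = (2 * x0 + math.cos(x0)) / (x0 ** 2 + math.sin(x0))
print(out.val, out.der, exact)
```

The second and third printed numbers should agree up to floating-point rounding, since `out.der` is computed by the dual-number rules and `exact` by the closed-form derivative.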

Refer to the guide below for implementing automatic differentiation in Julia.


