Taylor's Theorem for Multivariable Functions
Theorem¹
Let $f : \mathbb{R}^{n} \to \mathbb{R}$ be a $C^{k}$ function with $k \ge 2$, and let $\mathbf{a} = (a_{1}, \dots, a_{n}) \in \mathbb{R}^{n}$. Then there exist $C^{k-2}$ functions $h_{ij}$ $(1 \le i, j \le n)$ that satisfy the following.
$$ f(\mathbf{x}) = f(\mathbf{a}) + \sum_{i} (x_{i} - a_{i})\dfrac{\partial f}{\partial x_{i}}(\mathbf{a}) + \sum_{i,j}h_{ij}(\mathbf{x})(x_{i} - a_{i}) (x_{j} - a_{j}) $$
Description
This theorem generalizes Taylor's theorem to functions of several variables.
Second-Order
$$ \begin{align*} f(\mathbf{x}) &= f(\mathbf{a}) + \sum\limits_{i=1}^{n} (x_{i} - a_{i}) \dfrac{\partial f}{\partial x_{i}}(\mathbf{a}) + \dfrac{1}{2!}\sum\limits_{i,j=1}^{n} (x_{i} - a_{i})(x_{j} - a_{j}) \dfrac{\partial^{2} f}{\partial x_{i} \partial x_{j}}(\mathbf{a}) + \text{Remainder} \\ &= f(\mathbf{a}) + (\mathbf{x} - \mathbf{a})^{T} \nabla f (\mathbf{a}) + \dfrac{1}{2!}(\mathbf{x} - \mathbf{a})^{T} (H(\mathbf{a})) (\mathbf{x} - \mathbf{a}) + \text{Remainder} \end{align*} $$
Here, $\nabla f$ is the gradient of $f$, and $H$ is the Hessian of $f$.
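The second-order expansion can be checked numerically. The following is a minimal sketch, assuming the test function $f(x, y) = e^{x}\sin y$ and the base point $\mathbf{a} = (0.3, 0.7)$, both chosen only for illustration; the helper names `grad_f`, `hess_f`, and `taylor2` are hypothetical and the derivatives are coded by hand.

```python
import math

def f(x, y):
    # Hypothetical test function, chosen only for illustration.
    return math.exp(x) * math.sin(y)

def grad_f(x, y):
    # Gradient of f, computed by hand.
    e = math.exp(x)
    return [e * math.sin(y), e * math.cos(y)]

def hess_f(x, y):
    # Hessian of f, computed by hand.
    e = math.exp(x)
    return [[e * math.sin(y), e * math.cos(y)],
            [e * math.cos(y), -e * math.sin(y)]]

def taylor2(a, x):
    # f(a) + (x - a)^T grad f(a) + (1/2) (x - a)^T H(a) (x - a)
    p = [x[0] - a[0], x[1] - a[1]]
    g = grad_f(*a)
    H = hess_f(*a)
    linear = sum(p[i] * g[i] for i in range(2))
    quad = 0.5 * sum(p[i] * H[i][j] * p[j]
                     for i in range(2) for j in range(2))
    return f(*a) + linear + quad

a = (0.3, 0.7)  # assumed base point
errors = []
for step in (0.1, 0.05, 0.025):
    x = (a[0] + step, a[1] - step)
    errors.append(abs(f(*x) - taylor2(a, x)))

# Ratios of successive errors as the step is halved.
ratios = [errors[k] / errors[k + 1] for k in range(2)]
print(errors, ratios)
```

Under these assumptions the error should shrink by roughly a factor of eight each time the step is halved, which is what a third-order remainder predicts.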
The remainder can also be written in the following forms, which are often useful.
$$ f(\mathbf{x} + \mathbf{p}) = f(\mathbf{x}) + \mathbf{p}^{T}\nabla f(\mathbf{x} + t \mathbf{p}) \quad \text{for some } t \in (0,1) $$
$$ f(\mathbf{x} + \mathbf{p}) = f(\mathbf{x}) + \mathbf{p}^{T}\nabla f(\mathbf{x}) + \dfrac{1}{2!}\mathbf{p}^{T} H(\mathbf{x} + t \mathbf{p}) \mathbf{p} \quad \text{for some } t \in (0,1) $$
$$ f(\mathbf{x} + \mathbf{p}) = f(\mathbf{x}) + \int_{0}^{1}\mathbf{p}^{T}\nabla f (\mathbf{x} + t\mathbf{p})dt $$
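The integral form lends itself to a direct numerical check. The sketch below assumes a hypothetical test function $f(x, y) = x^{2}y + \sin y$, an arbitrary point $\mathbf{x}$ and displacement $\mathbf{p}$, and a hand-written composite Simpson rule named `simpson`; none of these come from the theorem itself.

```python
import math

def f(v):
    # Hypothetical test function, chosen only for illustration.
    x, y = v
    return x * x * y + math.sin(y)

def grad_f(v):
    # Gradient of f, computed by hand.
    x, y = v
    return (2 * x * y, x * x + math.cos(y))

def simpson(func, lo, hi, n=200):
    # Composite Simpson rule; n must be even.
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

x = (1.0, 2.0)    # assumed base point
p = (0.5, -0.3)   # assumed displacement

def integrand(t):
    # p^T grad f(x + t p)
    point = (x[0] + t * p[0], x[1] + t * p[1])
    g = grad_f(point)
    return p[0] * g[0] + p[1] * g[1]

lhs = f((x[0] + p[0], x[1] + p[1])) - f(x)
rhs = simpson(integrand, 0.0, 1.0)
print(lhs, rhs)
```

The two printed values should agree up to quadrature error, since the integrand is the derivative of $t \mapsto f(\mathbf{x} + t\mathbf{p})$.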
Proof
$$ \begin{align*} f(\mathbf{x}) - f(\mathbf{a}) =&\ \int_{0}^{1} \dfrac{d}{dt} \left[ f(t(\mathbf{x} - \mathbf{a}) + \mathbf{a}) \right]dt \\ =&\ \int_{0}^{1} \left( \sum_{i} \dfrac{\partial f}{\partial x_{i}}\left( t(\mathbf{x} - \mathbf{a}) + \mathbf{a} \right)(x_{i}-a_{i}) \right) dt & \text{by } \href{https://freshrimpsushi.github.io/posts/3134}{\text{chain rule}} \\ =&\ \sum_{i}(x_{i} - a_{i}) \int_{0}^{1} \left( \dfrac{\partial f}{\partial x_{i}}\left( t(\mathbf{x} - \mathbf{a}) + \mathbf{a} \right) \right) dt \end{align*} $$
Denote the integral by $g_{i}(\mathbf{x})$, that is, $g_{i}(\mathbf{x}) = \displaystyle \int_{0}^{1} \left( \dfrac{\partial f}{\partial x_{i}}\left( t(\mathbf{x} - \mathbf{a}) + \mathbf{a} \right) \right) dt$. Then
$$ \begin{equation} f(\mathbf{x}) - f(\mathbf{a}) = \sum_{i}(x_{i} - a_{i}) \int_{0}^{1} \left( \dfrac{\partial f}{\partial x_{i}}\left( t(\mathbf{x} - \mathbf{a}) + \mathbf{a} \right) \right) dt = \sum_{i} g_{i}(\mathbf{x}) (x_{i} - a_{i}) \end{equation} $$
The value of $g_{i}(\mathbf{a})$ is as follows.
$$ g_{i}(\mathbf{a}) = \int_{0}^{1} \dfrac{\partial f}{\partial x_{i}} \left(t(\mathbf{a} - \mathbf{a}) + \mathbf{a} \right) dt = \int_{0}^{1} \dfrac{\partial f}{\partial x_{i}}\left( \mathbf{a} \right) dt = \dfrac{\partial f}{\partial x_{i}}\left( \mathbf{a} \right) $$
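Each $g_{i}$ is $C^{k-1}$, so applying the argument that led to $(1)$ to $g_{i}$ in place of $f$ gives the following equation, where $h_{ij}(\mathbf{x}) = \displaystyle \int_{0}^{1} \dfrac{\partial g_{i}}{\partial x_{j}}\left( t(\mathbf{x} - \mathbf{a}) + \mathbf{a} \right) dt$ is $C^{k-2}$.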
$$ g_{i}(\mathbf{x}) - g_{i}(\mathbf{a}) = \sum_{j} h_{ij}(\mathbf{x}) (x_{j}-a_{j}) $$
Putting these together, we obtain the following.
$$ \begin{align*} f(\mathbf{x}) =&\ f(\mathbf{a}) + \sum_{i}g_{i}(\mathbf{x})(x_{i}-a_{i}) \\ =&\ f(\mathbf{a}) + \sum_{i}\left( g_{i}(\mathbf{a}) + \sum_{j} h_{ij}(\mathbf{x}) (x_{j}-a_{j}) \right)(x_{i}-a_{i}) \\ =&\ f(\mathbf{a}) + \sum_{i} g_{i}(\mathbf{a})(x_{i}-a_{i}) + \sum_{i,j} h_{ij}(\mathbf{x})(x_{i}-a_{i})(x_{j}-a_{j}) \\ =&\ f(\mathbf{a}) + \sum_{i} \dfrac{\partial f}{\partial x_{i}}\left( \mathbf{a} \right)(x_{i}-a_{i}) + \sum_{i,j} h_{ij}(\mathbf{x})(x_{i}-a_{i})(x_{j}-a_{j}) \end{align*} $$
■
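The construction in the proof can be illustrated numerically. Differentiating $g_{i}$ under the integral sign and repeating the argument gives $h_{ij}(\mathbf{x}) = \int_{0}^{1}\int_{0}^{1} t \, \dfrac{\partial^{2} f}{\partial x_{i} \partial x_{j}}\left( \mathbf{a} + st(\mathbf{x} - \mathbf{a}) \right) dt \, ds$, which in particular yields $h_{ij}(\mathbf{a}) = \dfrac{1}{2}\dfrac{\partial^{2} f}{\partial x_{i} \partial x_{j}}(\mathbf{a})$. The sketch below evaluates this double integral with a nested Simpson rule, assuming the hypothetical test function $f(x, y) = e^{x}\sin y$ and arbitrary points $\mathbf{a}$ and $\mathbf{x}$; the helper names are illustrative only.

```python
import math

def f(v):
    # Hypothetical test function, chosen only for illustration.
    return math.exp(v[0]) * math.sin(v[1])

def grad_f(v):
    # Gradient of f, computed by hand.
    e = math.exp(v[0])
    return (e * math.sin(v[1]), e * math.cos(v[1]))

def d2f(i, j, v):
    # Second partial derivatives of f, computed by hand.
    e = math.exp(v[0])
    table = [[e * math.sin(v[1]), e * math.cos(v[1])],
             [e * math.cos(v[1]), -e * math.sin(v[1])]]
    return table[i][j]

def simpson(func, n=40):
    # Composite Simpson rule on [0, 1]; n must be even.
    h = 1.0 / n
    total = func(0.0) + func(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(k * h)
    return total * h / 3

def h_ij(i, j, a, x):
    # h_ij(x) = int_0^1 int_0^1 t * d2f_ij(a + s t (x - a)) dt ds
    def inner(s):
        def integrand(t):
            point = tuple(a[m] + s * t * (x[m] - a[m]) for m in range(2))
            return t * d2f(i, j, point)
        return simpson(integrand)
    return simpson(inner)

a = (0.3, 0.7)  # assumed base point
x = (0.9, 0.2)  # assumed evaluation point
p = (x[0] - a[0], x[1] - a[1])
g = grad_f(a)

linear = sum(p[i] * g[i] for i in range(2))
quadratic = sum(h_ij(i, j, a, x) * p[i] * p[j]
                for i in range(2) for j in range(2))

# Difference between f(x) and the right-hand side of the theorem.
residual = abs(f(x) - (f(a) + linear + quadratic))
# Difference between h_01(a) and half the mixed second derivative at a.
half_hessian_gap = abs(h_ij(0, 1, a, a) - 0.5 * d2f(0, 1, a))
print(residual, half_hessian_gap)
```

Both printed quantities should be close to zero, limited only by quadrature and rounding error. The second one shows how the $\dfrac{1}{2!}$ Hessian term of the second-order expansion arises from $h_{ij}$ evaluated at $\mathbf{a}$.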
See Also
Richard S. Millman and George D. Parker, Elements of Differential Geometry (1977), pp. 213-214