
Polynomial Interpolation

Definition 1

For data $(x_{0}, y_{0}) , \cdots , (x_{n} , y_{n})$ with distinct $x_{0} , \cdots , x_{n}$, a polynomial $p$ satisfying $p (x_{i} ) = y_{i}$ for all $i$ and $\deg p \le n$ is called a polynomial interpolation.

Theorem

Existence and uniqueness

  • [1]: For the given data, such a $p$ exists and is unique.

Lagrange formula

  • [2]: $$p_{n} (x) = \sum_{i=0}^{n} y_{i} l_{i} (x)$$ where $l_{i} (x) = \prod_{j \ne i} {{ x - x_{j} } \over { x_{i} - x_{j} }}$ is the Lagrange basis polynomial.

Newton divided-differences formula

  • [3]: $$p_{n} (x) = f(x_{0}) + \sum_{i=1}^{n} f [ x_{0} , \cdots , x_{i} ] \prod_{j=0}^{i-1} (x - x_{j} )$$

Error analysis

  • [4]: For an $(n+1)$-times differentiable $f : \mathbb{R} \to \mathbb{R}$ and its polynomial interpolant $p_{n}$, for each $t \in \mathbb{R}$ there exists some $\xi \in \mathscr{H} \left\{ x_{0} , \cdots , x_{n} , t \right\}$ satisfying $$\displaystyle f(t) - p_{n} (t) = {{ (t - x_{0}) \cdots (t - x_{n}) } \over { (n+1)! }} f^{(n+1)} ( \xi )$$

  • $\mathscr{H} \left\{ a,b,c, \cdots \right\}$ denotes the smallest interval containing $a,b,c, \cdots$.

Remarks

The term “polynomial interpolation” corresponds in Korean to the simplified term “다항식 보간법”.

Condition $\deg p \le n$

(Figure: three data points lying on a straight line)

The condition $\deg p \le n$ reflects the fact that a $p$ with $\deg p = n$ exactly is not guaranteed to exist. As a simple example, when $n=2$ and the three points lie on a straight line as in the figure, no genuine quadratic $p_{2} (x) = a_{2} x^{2} + a_{1} x + a_{0}$ passes through the $(n+1)=3$ points, although the lower-degree polynomial $p_{1} (x) = a_{1} x + a_{0}$ does. Put differently, this is the case where one finds $p_{2} (x) = a_{2} x^{2} + a_{1} x + a_{0}$, but it degenerates with $a_{2} = 0$.
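A quick numerical check of this degeneracy (a minimal sketch; any three collinear points will do):

xs = [0.0, 1.0, 2.0]
ys = 2 .* xs .+ 1                      # three points on the line y = 2x + 1
V  = [x^(i-1) for x in xs, i in 1:3]   # Vandermonde matrix, as in the proof of [1]
V \ ys                                 # ≈ [1.0, 2.0, 0.0]: the quadratic coefficient vanishes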

Lagrange formula and Newton divided-differences formula are equivalent

Although formulas [2] and [3] look different, they define the same polynomial by the uniqueness in [1]. The two formulas differ only in how they represent $p_{n}$; in practice, the Newton divided-differences form has the advantage that it is cheap to update when new data are added, as the sketch below illustrates.
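The following is a minimal sketch checking this numerically; the helper names (`lagrange_eval`, `newton_coeffs`, `newton_eval`) are chosen here for illustration and are not from any library.

# Lagrange form: p_n(t) = Σ y_i l_i(t), evaluated directly
function lagrange_eval(xs, ys, t)
    n = length(xs)
    return sum(ys[i] * prod((t - xs[j]) / (xs[i] - xs[j]) for j in 1:n if j != i) for i in 1:n)
end

# Divided differences f[x_0], f[x_0,x_1], ..., f[x_0,...,x_n], built in place
function newton_coeffs(xs, ys)
    c = float.(copy(ys))
    for k in 2:length(xs), i in length(xs):-1:k
        c[i] = (c[i] - c[i-1]) / (xs[i] - xs[i-k+1])
    end
    return c
end

# Newton form, evaluated by Horner-like nesting
function newton_eval(xs, c, t)
    acc = c[end]
    for i in length(c)-1:-1:1
        acc = acc * (t - xs[i]) + c[i]
    end
    return acc
end

xs, ys = [0.0, 1.0, 2.0, 4.0], [1.0, 3.0, 2.0, 5.0]
c = newton_coeffs(xs, ys)
lagrange_eval(xs, ys, 1.7) ≈ newton_eval(xs, c, 1.7)   # true: the same polynomial
c[1:3] ≈ newton_coeffs(xs[1:3], ys[1:3])               # true: old coefficients survive a new node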

Error relative to the true function

Theorem [4] shows how far the interpolant $p_{n}$ of a function $f$ can deviate from $f$. Since $(n+1)!$ grows very rapidly, the accuracy of $p_{n}$ typically improves as more data are provided. Note, however, that this formula by itself says nothing about convergence. For example, $t$ may lie very far from $\mathscr{H} \left\{ x_{0} , \cdots , x_{n} \right\}$, so that the factor $(t - x_{0}) \cdots (t - x_{n})$ is huge. Also, if $f$ behaves badly, with derivatives that grow under repeated differentiation even faster than $(n+1)!$, the behavior can become pathological.
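For instance, interpolating $f = \sin$ (so that $|f^{(n+1)}| \le 1$) makes the bound easy to evaluate. A minimal sketch, reusing the Vandermonde approach from the Code section below:

xs = collect(range(0, 3, length = 5))      # n = 4, so (n+1)! = 120
V  = [x^(i-1) for x in xs, i in 1:5]       # Vandermonde matrix
a  = V \ sin.(xs)                          # interpolant coefficients
p(t) = sum(a[i] * t^(i-1) for i in 1:5)
t = 1.3
bound = abs(prod(t .- xs)) / factorial(5)  # |(t-x_0)⋯(t-x_n)| / (n+1)!, since |sin⁽⁵⁾| = |cos| ≤ 1
abs(sin(t) - p(t)) ≤ bound                 # true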

Proof

[1]

Strategy: represent the $(n+1)$ linear equations as a matrix and deduce existence and uniqueness via invertibility.


It suffices to show that the coefficients $a_{0} , a_{1} , \cdots , a_{n}$ of $p_{n}$ satisfying $y_{i} = p_{n} (x_{i}) = a_{0} + a_{1} x_{i} + \cdots + a_{n} x_{i}^{n}$ for all $i = 0, \cdots , n$ are unique. Define $$ \mathbf{y} := \begin{bmatrix} y_{0} \\ y_{1} \\ \vdots \\ y_{n} \end{bmatrix}, \qquad X := \begin{bmatrix} 1 & x_{0} & \cdots & x_{0}^{n} \\ 1 & x_{1} & \cdots & x_{1}^{n} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n} & \cdots & x_{n}^{n} \end{bmatrix} , \qquad \mathbf{a} := \begin{bmatrix} a_{0} \\ a_{1} \\ \vdots \\ a_{n} \end{bmatrix} $$ Then the linear system reads in matrix form $$ \begin{bmatrix} y_{0} \\ y_{1} \\ \vdots \\ y_{n} \end{bmatrix} = \begin{bmatrix} 1 & x_{0} & \cdots & x_{0}^{n} \\ 1 & x_{1} & \cdots & x_{1}^{n} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n} & \cdots & x_{n}^{n} \end{bmatrix} \begin{bmatrix} a_{0} \\ a_{1} \\ \vdots \\ a_{n} \end{bmatrix} $$ and the problem reduces to finding a solution $\mathbf{a}$ of $\mathbf{y} = X \mathbf{a}$.

Determinant of the Vandermonde matrix: the determinant of $X$ is $\displaystyle \det X = \prod_{0 \le i < j \le n } (x_{j} - x_{i})$

By assumption the $x_{0} , \cdots , x_{n}$ are distinct, so $\det X \ne 0$ and therefore $X^{-1}$ exists. Hence $\mathbf{a} = X^{-1} \mathbf{y}$ also exists. Since the inverse of a nonsingular matrix is unique, $\mathbf{a}$ is unique.
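A quick numerical sanity check of this determinant identity, with arbitrary distinct nodes:

using LinearAlgebra

xs = [0.5, 1.0, 2.0, 3.5]                        # distinct nodes
V  = [x^(i-1) for x in xs, i in 1:length(xs)]    # Vandermonde matrix
det(V) ≈ prod(xs[j] - xs[i] for i in 1:4, j in 1:4 if i < j)   # true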

[2]

This follows from the Kronecker delta property of the Lagrange basis: $l_{i} (x_{j}) = \delta_{ij}$, so $p_{n} (x_{j}) = \sum_{i=0}^{n} y_{i} l_{i} (x_{j}) = y_{j}$ at every node.
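A quick numerical check of this property: evaluating the basis at the nodes gives the identity matrix (a minimal sketch).

xs = [0.0, 1.0, 2.0]
l(i, x) = prod((x - xs[j]) / (xs[i] - xs[j]) for j in 1:3 if j != i)
[l(i, xs[j]) for i in 1:3, j in 1:3]   # ≈ the 3×3 identity matrix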

[3]

This follows by computing directly with divided differences.

[4]

Strategy: introduce auxiliary (dummy) functions and use their differentiability to avoid direct computation. The setup is intricate, so it is actually easier to understand it from the end backwards.


Claim: For $E (x) := f(x) - p_{n} (x)$ the following holds. $$ E(t) = {{ (t - x_{0}) \cdots (t - x_{n}) } \over { (n+1)! }} f^{(n+1)} ( \xi ) $$


Part 1.

If $t$ coincides with one of the node points $x_{0} , \cdots , x_{n}$ the claim is trivial, so assume $t$ differs from all of them. Define the auxiliary functions $$ \begin{align*} \Psi (x) :=& (x - x_{0} ) (x - x_{1}) \cdots (x - x_{n}) \\ G (x) :=& E(x) - {{ \Psi (x) } \over { \Psi (t) }} E(t) \end{align*} $$ Since $f$, $p_{n}$, and $\Psi$ are $(n+1)$-times differentiable, $G$ is also $(n+1)$-times differentiable.


Part 2.

Substituting $x = x_{i}$ into $G$ yields $E(x_{i}) = f(x_{i}) - p_{n} (x_{i}) = 0$ and $\Psi (x_{i} ) = 0$, hence $$ G(x_{i}) = E(x_{i} ) - {{ \Psi (x_{i}) } \over { \Psi (t) }} E(t) = 0 $$ Substituting $x = t$ into $G$ yields $\displaystyle {{ \Psi (t) } \over { \Psi (t) }} = 1$, hence $$ G(t) = E(t) - E(t) =0 $$ Therefore $G$ has $(n+2)$ distinct zeros $x_{0} , \cdots , x_{n} , t$.


Part 3.

For convenience denote $x_{n+1} :=t$.

Rolle’s theorem: If a function $f(x)$ is continuous on $[a,b]$ and differentiable on $(a,b)$ and $f(a)=f(b)$, then there exists at least one $c$ in $(a,b)$ satisfying $f ' (c)=0$.

For every $i=0, \cdots , n$ we have $$ G(x_{i}) = 0 = G(x_{i+1}) $$ so by Rolle’s theorem there exists $x_{i}' \in \mathscr{H} \left\{ x_{i} , x_{i+1} \right\}$ satisfying $G' ( x_{i}') = 0$. Likewise, for every $i=0, \cdots , (n-1)$ $$ G' (x_{i}') = 0 = G' (x_{i+1}') $$ so by Rolle’s theorem there exists $x_{i}'' \in \mathscr{H} \left\{ x_{i}' , x_{i+1}' \right\}$ satisfying $G''( x_{i}'') = 0$. Repeating this application of Rolle’s theorem $(n+1)$ times inductively guarantees the existence of $\xi \in \mathscr{H} \left\{ x_{0} , \cdots , x_{n+1} \right\}$ satisfying $$ G^{(n+1)} ( \xi ) = 0 $$

On the other hand, since $p_{n}$ is a polynomial of degree at most $n$, $$ E^{(n+1)} (x) = f^{(n+1)} ( x) $$ and the leading term of $\Psi$ is $x^{n+1}$, hence $$ \Psi^{(n+1)} (x) = (n+1)! $$ Differentiating both sides of $\displaystyle G (x) = E(x) - {{ \Psi (x) } \over { \Psi (t) }} E(t)$ with respect to $x$ $(n+1)$ times yields $$ G^{(n+1)} (x) = f^{(n+1)} ( x) - {{ (n+1)! } \over { \Psi (t) }} E(t) $$ Substituting $x = \xi$ and using $G^{(n+1)} ( \xi ) = 0$ from Part 3 gives $$ 0 = f^{(n+1)} ( \xi ) - {{ (n+1)! } \over { \Psi (t) }} E(t) $$ Since $E (t) = f(t) - p_{n} (t)$, solving for $E(t)$ yields $$ f(t) - \sum_{j=0}^{n} f( x_{j} ) l_{j} (t) = {{ (t - x_{0}) \cdots (t - x_{n}) } \over { (n+1)! }} f^{(n+1)} ( \xi ) $$

Code

(Figure: five random data points with the interpolating polynomial drawn through them)

The following implements the polynomial interpolant $p : \mathbb{R} \to \mathbb{R}$ in Julia, in the same manner as derived algebraically in the post.

using Plots, Polynomials

# Solve the Vandermonde system y = Xa from the proof of [1]
function interpolate(xs, ys)
    V = [x^(i-1) for x in xs, i in 1:length(xs)]  # Vandermonde matrix
    coeffs = V \ ys                               # coefficient vector a
    return Polynomial(coeffs)
end

foo = sort(rand(5))    # five random (hence distinct) nodes in [0, 1]
bar = randn(5)         # arbitrary data values
p = interpolate(foo, bar)
scatter(foo, bar, legend = :none)
plot!(0:0.01:1, p.(0:0.01:1))    # the interpolant passes through every point
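For reference, Polynomials.jl also provides `fit`, which, called without a degree argument, interpolates exactly; the raw Vandermonde solve above becomes ill-conditioned as the number of nodes grows, so a library routine is usually preferable for larger problems. A minimal check:

q = fit(foo, bar)   # default degree is length(foo) - 1, i.e. exact interpolation
p(0.5) ≈ q(0.5)     # true up to rounding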

  1. Atkinson. (1989). An Introduction to Numerical Analysis (2nd Edition): p131. ↩︎