The Abstract Interpolation Problem

2020-10-21 4 min read Mathematics

Let $\mathcal{F}$ be a vector space of functions from $\mathbb{R}$ to $\mathbb{R}$ (e.g. smooth functions, polynomials, …) and let $\ell_0, \ell_1, \ldots, \ell_n$ be functionals defined on this space $\mathcal{F}$ that map to $\mathbb{R}$. We assume further that $\mathcal{F_{n}}\subseteq\mathcal{F}$ is an $n+1$ dimensional subspace of $\mathcal{F}$ with a basis $\set{\varphi_i}{i=0}^n$. That implies $\mathcal{F{n}}$ can be written in the following way.

$$\begin{equation} \mathcal{F_n} := \left\lbrace \varphi\in \mathcal{F} \ \Biggl\vert\Biggr.\ \varphi = \sum_{j=0}^n \alpha_j \varphi_j{,}\quad \alpha_j \in \mathbb{R}\ \forall j \right\rbrace \end{equation}$$

Nearly all interpolation and approximation problems can be reduced to the following setting: we are given the quantities $\ell_i\left(f\right)$ for an unknown function $f\in\mathcal{F}$ and seek a $\varphi\in\mathcal{F}_n$ such that $\ell_i\left(\varphi\right) = \ell_i\left(f\right)$ holds for every $i=0, \ldots, n$.

You may think of $f$ as a physical process or a huge data set. The functionals $\ell_i$ can then be interpreted as measurements on your process (e.g. temperature, speed, …) or sample extractions from your data set. In any case, $\ell_i\left(f\right)$ is data that we are given.

Since $\varphi=\sum_j\alpha_j\varphi_j$ the interpolation requirement can be rewritten as follows.

$$\begin{equation} \ell_i\left(\varphi\right) = \ell_i\left( \sum_j\alpha_j\varphi_j\right) \overset{!}{=} \ell_i\left(f\right)\quad \forall i \end{equation}$$

The previous problem is sometimes referred to as the nonlinear interpolation and without further assumptions on $\mathcal{F}$ or the $\ell_i$ it will be difficult to solve. A common restriction is to assume that all the $\ell_i$ are linear functionals. This leads us then to the following formulation.

$$\begin{equation} \sum_{j=0}^{n}\alpha_j\ell_i\left(\varphi_j\right) = \ell_i\left(f\right)\quad \forall i = 0, \ldots, n \end{equation}$$

Let us remark that the $\ell_i\left(\varphi_j\right)$ are known for all $i$ and $j$ and that the $\ell_i\left(f\right)$ are given and known as well. In view of our examples above, the $\ell_i\left(\varphi_j\right)$ can be understood as reference measurements or reference samples.

The interpolation problem has now been reduced to a linear system of $n+1$ equations with $n+1$ unknowns $\alpha_j$. It can also be written in matrix vector form as follows.

$$\begin{equation} \begin{pmatrix} \ell_0\left(\varphi_0\right) & \ell_0\left(\varphi_1\right) & \ldots & \ell_0\left(\varphi_n\right) \\ \ell_1\left(\varphi_0\right) & \ell_1\left(\varphi_1\right) & \ldots & \ell_1\left(\varphi_n\right) \\ \vdots & \vdots & \ddots & \vdots \\ \ell_n\left(\varphi_0\right) & \ell_n\left(\varphi_1\right) & \ldots & \ell_n\left(\varphi_n\right) \\ \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix} = \begin{pmatrix} \ell_0\left(f\right) \\ \ell_1\left(f\right) \\ \vdots \\ \ell_n\left(f\right) \end{pmatrix} \end{equation}$$

Thus, the interpolation problem with linear functionals is solvable, whenever the linear system above has a solution. If the system does not have a solution, we may still consider solving it in a least squares sense. In the latter case we would have an approximation task.

Let us shortly remind the difference between an interpolation and approximation problem. In an interpolation problem we can reach the given data, e.g. we have $\ell_i\left(\sum_j\alpha_j\varphi_j\right) = \ell_i\left(f\right)$ for all $i$. In an approximation task we won’t necessarily have equality but minimize the distance between all the $\ell_i\left( \sum_j\alpha_j\varphi_j\right)$ and $\ell_i\left(f\right)$.

It goes without saying, that a good choice for the $\ell_i$ and $\varphi_j$ has a tremendous impact on the existence and properties of the solutions. Ideally, we want to chose the $\ell_i$ such that the system matrix is either diagonal or at least sparse and banded. The $\varphi_j$ should reflect the underlying process/data as good as possible. It doesn’t make sense to use smooth functions if your underlying data has known discontinuities!

Example

We assume that $\ell_i(f):=f(x_i)$ for some known and given numbers $x_i$. This means that $\ell_i$ is the point evaluation functional at $x_i$. Note that evaluating functions at a fixed point is a linear operation. For well-posedness reasons we assume that all the $x_i$ are pairwise different. Also, for $\mathcal{F}$ we select the space of all polynomials and $\mathcal{F_n}$ shall be the space of all polynomials up to degree $n$. In addition, we set $\varphi_j(x) := x^j$ (i.e. our basis functions are the monomials). The abstract interpolation problem can then be formulated as the following linear system.

$$\begin{equation} \begin{pmatrix} 1 & x_0 & \ldots & x_0^n \\ 1 & x_1 & \ldots & x_1^n \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & \ldots & x_n^n \\ \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix} = \begin{pmatrix} f(x_0) \\ f(x_1) \\ \vdots \\ f(x_n) \end{pmatrix} \end{equation}$$

The linear system above does polynomial interpolation and the system matrix is the famous Vandermonde matrix. The Vandermonde matrix is invertible when all $x_i$ are pairwise different (hence our assumption above!).

Small remark: if you use not just the function evaluation but also the evaluation of their derivatives you obtain Hermite interpolation.

Final remarks

If the presentation in this article reminded you of neural networks and deep learning models, there’s a reason for this. The underlying idea is the same. Neural networks are approximation problems with fancy functionals $\ell_i$ and spaces $\mathcal{F}$.