
albatross (version 0.3-8)

whittaker2: Implementation notes for Whittaker smoothing and interpolation of surfaces

Description

Smooth albatross:::.Rdcite('Eilers2003') or estimate the baseline albatross:::.Rdcite('Eilers2004-PTW') of a surface measured on an arbitrary grid by minimising a sum of penalties. Combined difference orders and two different methods of preventing negative values in the output are supported.

This is not a public interface. Subject to change without further notice. Please do not call from outside albatross.

Usage

whittaker2(x, y, z, lambda, d, p, logscale, nonneg)
diffmat(x, y, d)
vandermonde(x0)

Value

whittaker2

A matrix of the same shape as z, containing the smoothed values, the interpolated surface, or the estimated baseline.

diffmat

A difference matrix \(\mathbf D\) that, when multiplied by \(\mathrm{vec}(\mathbf{Z})\), returns a vector of estimated derivatives of \(\mathbf Z\) of the given order with respect to \(x\) and \(y\) at every available point (see below).

vandermonde

A vector \(\mathbf c\) of length \(n =\) length(x0) such that \( \mathbf{c}^\top f(\mathbf{x_0}) \approx f^{(n-1)}(x) \). Here \(x\) is either the middle point of x0 (odd \(n\)) or the average of the two middle points (even \(n\)).
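
For instance, on a uniform three-point grid the documented contract implies the classic central second-difference stencil. A quick check (keeping in mind that this is an internal API, subject to change):

  x0 <- c(0, 1, 2)
  cf <- albatross:::vandermonde(x0) # expect c(1, -2, 1): f'' at x = 1
  f <- function(x) x^2              # f'' is exactly 2 everywhere
  sum(cf * f(x0))                   # evaluates to 2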

Arguments

x

Grid values along the rows of z.

y

Grid values along the columns of z.

z

Matrix containing the surface values to smooth or NAs to interpolate.

lambda

A vector of smoothness penalties, one for every difference order. Must be of the same length as d.

d

For whittaker2: a vector of difference orders corresponding to the elements of lambda.

For diffmat: the difference order, an integer scalar.

p

If not missing, use the asymmetric penalty method albatross:::.Rdcite('Eilers2004-PTW') to estimate the baseline, weighting the residuals by \(p\) where \(\hat{z} < z\) and by \(1 - p\) otherwise. Typical values are around \(10^{-3}\).

logscale

If not NA, rescale z between logscale and \(1\) and take its logarithm before smoothing, then undo the transformation afterwards.

This transformation prevents the resulting values from getting lower than \( \mathrm{min}(z) - (\mathrm{max}(z) - \mathrm{min}(z)) \frac{\mathtt{logscale}}{1 - \mathtt{logscale}} \), which is approximately \( -\mathtt{logscale} \cdot \mathrm{max}(z) \) if logscale and \(\mathrm{min}(z)\) are both close to \(0\).

A typical value would be \(10^{-4}\). Disabled by default because it may distort the shape of the surface in the process. (A sketch of the transformation is given after this list.)

nonneg

If not \(0\), add a penalty of \( \mathtt{nonneg} \cdot \sum_i \mathbf{1}_{\hat{z}_i < 0} \, \hat{z}_i^2 \) for the negative values in the resulting surface, pulling them towards \(0\), and repeat the process until no new penalties have to be introduced.

x0

A vector specifying the grid where the function to be differentiated is measured. Must be sorted.
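
The logscale transformation mentioned above could be sketched as follows (illustrative helpers to.log and from.log, not the package's internal code; s stands for the logscale argument):

  # map z into [s, 1] and take the logarithm; r holds the original range
  to.log <- function(z, s, r = range(z, na.rm = TRUE))
    log(s + (1 - s) * (z - r[1]) / (r[2] - r[1]))
  # undo the transformation; exp() >= 0 yields the lower bound cited above
  from.log <- function(lz, s, r)
    r[1] + (exp(lz) - s) * (r[2] - r[1]) / (1 - s)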

Details

Finite difference approximation

How does one differentiate a function tabulated on a fixed, potentially non-uniform grid before even knowing its values? Use its Taylor series.

The first derivative is special because it's possible to use central differences and get a second-order accurate result, even on a non-uniform grid, by carefully choosing the points where the derivative is calculated. Let \(x + \frac{h}{2}\) and \(x - \frac{h}{2}\) be a pair of adjacent points from the grid. Here's the Taylor series expansion for \(f\) around \(x\), with the Lagrange form of the remainder:

$$ f \left(x + \frac{h}{2}\right) = f(x) + \frac{h}{2} f'(x) + \frac{h^2}{8} f''(x) + \frac{h^3}{48} f'''(\zeta) $$

$$ f \left(x - \frac{h}{2}\right) = f(x) - \frac{h}{2} f'(x) + \frac{h^2}{8} f''(x) - \frac{h^3}{48} f'''(\eta) $$

$$ f \left(x + \frac{h}{2}\right) - f \left(x - \frac{h}{2}\right) = h f'(x) + \frac{h^3}{48} \left(f'''(\zeta) + f'''(\eta)\right) $$

$$ f'(x) = \frac{ f\left(x + \frac{h}{2}\right) - f\left(x - \frac{h}{2}\right) }{h} - \frac{h^2}{48} \left(f'''(\zeta) + f'''(\eta)\right) $$

$$ |\delta f'(x)| \le \max_{ \xi \in [x - \frac{h}{2}, x + \frac{h}{2}] } \frac{h^2}{24} f'''(\xi) $$
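
A quick numeric illustration of this bound in plain R (not package code): for \(f = \sin\), the midpoint difference error shrinks as \(h^2\).

  f <- function(x) sin(x)  # f''' = -cos, so |f'''| <= 1
  h <- c(1e-1, 1e-2, 1e-3)
  err <- abs((f(1 + h/2) - f(1 - h/2)) / h - cos(1))
  err / h^2                # roughly constant, close to cos(1)/24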

Suppose the three grid points \(\xi_1 = x_1 - \frac{h_1}{2}\), \( \xi_2 = x_1 + \frac{h_1}{2} = x_2 - \frac{h_2}{2} \), \(\xi_3 = x_2 + \frac{h_2}{2}\) are adjacent on the grid, and we know the \(f'\) estimates at \(x_1\) and \(x_2\).

On the one hand, Taylor series expansion for \(f'(x)\) around \(\frac{x_1 + x_2}{2}\) gives:

$$ f'(x_2) = f'\left(\frac{x_1 + x_2}{2}\right) + f''\left(\frac{x_1 + x_2}{2}\right)\frac{x_2 - x_1}{2} + f'''\left(\frac{x_1 + x_2}{2}\right)\frac{(x_2 - x_1)^2}{8} + f''''(\zeta)\frac{(x_2 - x_1)^3}{48} $$

$$ f'(x_1) = f'\left(\frac{x_1 + x_2}{2}\right) - f''\left(\frac{x_1 + x_2}{2}\right)\frac{x_2 - x_1}{2} + f'''\left(\frac{x_1 + x_2}{2}\right)\frac{(x_2 - x_1)^2}{8} - f''''(\eta)\frac{(x_2 - x_1)^3}{48} $$

$$ f''\left(\frac{x_1 + x_2}{2}\right) = \frac{f'(x_2) - f'(x_1)}{x_2 - x_1} - \frac{(x_2 - x_1)^2}{48}(f''''(\zeta) + f''''(\eta)) $$

$$ |\delta f''(x)| \le \max_{ \xi \in [x_1, x_2] } \frac{(x_2 - x_1)^2}{24} f''''(\xi) $$

On the other hand, if we substitute the estimates of \(f'\) obtained above, we get:

$$ f''\left(\frac{x_1 + x_2}{2}\right) = \frac{ h_1 f(\xi_3) - h_1 f(\xi_2) - h_2 f(\xi_2) + h_2 f(\xi_1) }{h_1 h_2 (x_2 - x_1)} - \frac{ h_2^2 f'''(\zeta_2) - h_1^2 f'''(\zeta_1) }{24(x_2 - x_1)} - \frac{(x_2 - x_1)^2}{24} f''''(\eta) $$

The extra \(f'''\) term does not cancel on a non-uniform grid, which is why we can't just keep applying central differences and expect \((n+1)\)-th order accurate results.

What are the general methods of finding the coefficients for the \(\mathbf D\) matrix? Start with a system of Taylor series expansions for every grid point:

$$ f(x_i) = \sum_{k=0}^{n-1} f^{(k)}(x) \frac{(x_i - x)^k}{k!} + f^{(n)}(\xi) \frac{(x_i - x)^{n}}{n!} \; \forall i = 1 \dots n $$

We can solve this system for the coefficients \(c_i\) that give the desired \(l\)-th derivative estimate with the highest accuracy \(p\) possible albatross:::.Rdcite('LeVeque2007-ch1'):

$$ \sum_{i = 1}^n c_i f(x_i) = f^{(l)}(x) + o(h^p) $$

Substituting the approximations for \(f(x_i)\) into the equation, we get the following condition for the multiplier in front of each \(f^{(k)}(x)\):

$$ \frac{1}{k!} \sum_{i = 1}^n c_i (x_i - x)^k = \mathbf{1}_{k = l} \; \forall k = 0 \dots n-1 $$

In the matrix form, this becomes a Vandermonde system:

$$ V_{k,i} = \frac{(x_i - x)^k}{k!} $$

$$ b_k = \mathbf{1}_{k = l} $$

$$ \mathbf c = \mathbf{V}^{-1} \mathbf b $$
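
Transcribed directly into R (a sketch with a hypothetical helper name fdcoef, not the package's internal structure), the system reads:

  # stencil coefficients for the l-th derivative at x from grid points x0
  fdcoef <- function(l, x0, x = median(x0)) {
    n <- length(x0)
    V <- outer(0:(n - 1), x0 - x, function(k, d) d^k / factorial(k))
    b <- as.numeric(0:(n - 1) == l)
    solve(V, b)
  }
  fdcoef(2, c(0, 1, 2))  # the familiar second-difference stencil: 1 -2 1

(For sorted x0, median() picks exactly the middle point for odd \(n\) and the average of the two middle points for even \(n\).)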

Unfortunately, this system becomes ill-conditioned for “large” numbers of points. (Experiment shows noticeably growing \(c_i\) even for third derivative from \(10\) points and no solution for \(32\) points on a uniform grid.) Fornberg albatross:::.Rdcite('Fornberg1988') suggests a more numerically stable procedure, but it still breaks down for \(1000\) points.
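
The growth is easy to observe with the fdcoef() sketch above:

  # largest coefficient magnitude for the third derivative, uniform grid
  sapply(c(4, 10), function(n) max(abs(fdcoef(3, seq_len(n)))))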

It is important to note that the performance of the method depends on the matrix \(\mathbf D\) being sparse. While the methods described above could give more accurate results, they do so at the cost of assigning nonzero weights to many points, and the weights grow as the number of points increases. Since difference orders above \(3\) are used very rarely, and in the interest of simplicity and performance, we minimise the number of coefficients and their magnitudes by solving the Vandermonde system for the minimally accurate derivative estimates, taking exactly \(k + 1\) points for the \(k\)-th derivative.

What is the error of such an estimate? Substituting the Lagrange form of the remainder into \( \mathbf{c}^\top f(\mathbf x) \), we get:

$$ \sum_{i = 1}^n c_i f(x_i) = f^{(n-1)}(x) + \sum_{i = 1}^n c_i f^{(n)}(\xi_i) \frac{(x_i - x)^n}{n!}, \; \xi_i \in [ x_i, x ] $$

Our choice of \(x\) (the middle point for odd \(n\), the average of the two middle points for even \(n\)) lets us shave off one term from the sum above for odd \(n\) and get second-order accurate results for \(n = 2\); other than that, the method is only first-order accurate.
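
The first-order behaviour can be confirmed numerically with the fdcoef() sketch above: on a non-uniform grid, halving the spacing roughly halves the error of the minimal three-point second-derivative estimate.

  err <- sapply(c(0.1, 0.05, 0.025), function(h) {
    x0 <- 1 + c(-h, 0, 2 * h)                  # deliberately non-uniform
    abs(sum(fdcoef(2, x0) * sin(x0)) + sin(1)) # true f''(1) = -sin(1)
  })
  err[-3] / err[-1]                            # ratios close to 2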

Whittaker smoothing

Whittaker smoothing works by minimising a sum of penalties albatross:::.Rdcite('Eilers2003'). Interpolation can be achieved by setting weights \(\mathbf w\) to \(0\) for the missing points.

$$ \min_{\mathbf{\hat z}} \: (\mathbf{\hat z} - \mathbf{z})^\top \mathrm{diag}(\mathbf w) (\mathbf{\hat z} - \mathbf{z}) + \lambda | \mathbf D \mathbf{\hat z} |^2 $$

Differentiating with respect to \(\mathbf{\hat z}\) and equating the derivatives to \(0\), we get the normal equation:

$$ ( \mathrm{diag}(\mathbf w) + \lambda \mathbf{D}^\top \mathbf{D} ) \mathbf{\hat z} = \mathrm{diag}(\mathbf w) \mathbf z $$

The equation is then solved using solve.
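
For illustration, a minimal one-dimensional version of the same normal equation might look like this (a sketch with a hypothetical name whit1 on a uniform grid, not the package's implementation):

  whit1 <- function(z, lambda, w = as.numeric(!is.na(z))) {
    n <- length(z)
    D <- diff(diag(n), differences = 2) # uniform-grid differences, d = 2
    z[is.na(z)] <- 0                    # masked by the zero weights anyway
    solve(diag(w) + lambda * crossprod(D), w * z)
  }
  z <- sin(seq(0, 2 * pi, length.out = 50)) + rnorm(50, sd = 0.1)
  z[20:25] <- NA         # a gap to interpolate
  zhat <- whit1(z, 10)   # smoothed curve with the gap filled in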

Given a one-dimensional penalty matrix \(\mathbf{D}_d\) of order \(d\) obtained by solving a Vandermonde system for every successive group of \(d+1\) points, we can adapt it for every applicable group of points from a fluorescence excitation-emission matrix unfolded into a vector \(\mathbf z = \mathrm{vec}\, \mathbf F\) by taking Kronecker products with unit matrices: $$ \mathbf D = \begin{pmatrix} \mathbf I \otimes \mathbf{D}_\mathrm{em} \\ \mathbf{D}_\mathrm{ex} \otimes \mathbf I \end{pmatrix} $$
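
In R, this construction maps directly onto kronecker() (an illustrative sketch using plain first-order differences on a small uniform grid; the package derives its difference matrices from the Vandermonde coefficients instead):

  n.em <- 5; n.ex <- 4
  D1 <- function(n) diff(diag(n))     # first-order differences
  D <- rbind(
    kronecker(diag(n.ex), D1(n.em)),  # along emission (rows of F)
    kronecker(D1(n.ex), diag(n.em))   # along excitation (columns of F)
  )
  dim(D)  # 31 penalty rows for the 20 elements of vec(F)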

Penalties of different orders are concatenated by rows in a similar manner (which is equivalent to adding the corresponding \(\mathbf{D}^\top\mathbf D\) matrices together in the normal equation).

It has been shown in albatross:::.Rdcite('Eilers2004-density') that a combination of first- and second-order penalties (\( 2 \lambda \mathbf{D}_1 + \lambda^2 \mathbf{D}_2 \)) results in a non-negative impulse response, but the resulting peak shape may be suboptimal.
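
This is easy to check numerically in one dimension (a sketch; the combined penalty enters the normal equation as \( 2 \lambda \mathbf{D}_1^\top \mathbf{D}_1 + \lambda^2 \mathbf{D}_2^\top \mathbf{D}_2 \)):

  n <- 101; l <- 10
  D1 <- diff(diag(n)); D2 <- diff(diag(n), differences = 2)
  A <- diag(n) + 2 * l * crossprod(D1) + l^2 * crossprod(D2)
  imp <- solve(A, as.numeric(seq_len(n) == 51)) # smooth a unit impulse
  min(imp) >= 0                                 # expect TRUE: no negative lobes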

References

albatross:::.Rdbibliography()

See Also

feemscatter

Examples

  data(feems)
  # cut out the scattering signal, leaving NAs to be interpolated
  z <- feemscatter(feems$a, rep(25, 4), 'omit')
  # smooth with combined first- and second-order penalties
  str(albatross:::whittaker2(
    attr(z, 'emission'), attr(z, 'excitation'), z,
    c(1, 1e-3), 1:2, logscale = NA, nonneg = 1
  ))
