Smooth [1] or estimate the baseline [2] of a surface measured on an arbitrary grid by minimising a sum of penalties. Combinations of multiple difference orders and two different methods of preventing negative values in the output are supported.
This is not a public interface. Subject to change without further notice. Please do not call from outside albatross.
whittaker2(x, y, z, lambda, d, p, logscale, nonneg)
diffmat(x, y, d)
vandermonde(x0)
whittaker2: A matrix of the same shape as z, with the values smoothed, interpolated, or baseline estimated.

diffmat: A difference matrix \(\mathbf D\) that, when multiplied by \(\mathrm{vec}(\mathbf{Z})\), returns a vector of estimated derivatives of \(\mathbf Z\) of the given order by \(x\) and \(y\) at every available point (see below).

vandermonde: A vector \(\mathbf c\) of length \(n = \) length(x0) such that \( \mathbf{c}^\top f(\mathbf{x_0}) \approx f^{(n-1)}(x) \), where \(x\) is the middle point of x0 for odd \(n\) or the average of the two middle points for even \(n\).
x: Grid values along the rows of z.

y: Grid values along the columns of z.

z: Matrix containing the surface values to smooth, with NAs standing for points to interpolate.

lambda: A vector of smoothness penalties, one for every difference order. Must be of the same length as d.

d: For whittaker2, a vector of difference orders corresponding to the elements of lambda; for diffmat, a single integer difference order.

p: If not missing, use the asymmetric penalty method [2] to estimate the baseline: the differences \(\hat z - z\) are penalised with weight \(p\) where \(\hat z < z\) and \(1 - p\) otherwise. Typically, values around \(10^{-3}\) are used. A sketch of the resulting reweighting loop follows this list.

logscale: If not NA, rescale z between logscale and \(1\) and take its logarithm before smoothing, then undo the transformation afterwards. This is supposed to prevent negative numbers from appearing in the interpolated surface, but may distort its shape in the process.

nonneg: If not \(0\), for every resulting negative value in the interpolated surface, add a penalty of nonneg pulling it towards \(0\), then repeat the process until no new penalties have to be introduced.

x0: A vector specifying the grid where the function to be differentiated is measured. Must be sorted.
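To illustrate the role of p, here is a minimal one-dimensional sketch of that reweighting loop around the penalised least-squares fit described in the Details below; the function name asls_baseline and the parameter defaults are made up, the grid is assumed uniform, and whittaker2 itself applies the same idea to the unfolded surface:

library(Matrix)

# A sketch of baseline estimation by asymmetric penalties: points
# above the current estimate get the small weight p, pulling the
# smooth curve towards the lower envelope of the data.
asls_baseline <- function(z, lambda = 1e4, p = 1e-3, d = 2, maxit = 20) {
  n <- length(z)
  D <- Matrix(diff(diag(n), differences = d), sparse = TRUE)
  P <- lambda * crossprod(D)
  w <- rep(1, n)
  for (i in seq_len(maxit)) {
    zhat <- as.numeric(solve(Diagonal(x = w) + P, w * z))
    w.new <- ifelse(zhat < z, p, 1 - p) # weight p where zhat < z
    if (identical(w.new, w)) break      # weights stopped changing
    w <- w.new
  }
  zhat
}

# two peaks on a decaying baseline
x <- seq(0, 10, length.out = 500)
baseline <- asls_baseline(exp(-x / 5) + dnorm(x, 3, 0.1) + dnorm(x, 7, 0.2))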
Whittaker smoothing works by minimising a sum of penalties [1]. Taking weights into account, we get:
$$ \min_{\mathbf{\hat z}} \: (\mathbf{\hat z} - \mathbf{z})^\top \mathrm{diag}(\mathbf w) (\mathbf{\hat z} - \mathbf{z}) + \lambda | \mathbf D \mathbf{\hat z} |^2 $$
By taking the derivatives with respect to \(\mathbf{\hat z}\) and equating them to \(0\), we get the normal equation:
$$ ( \mathrm{diag}(\mathbf w) + \lambda \mathbf{D}^\top \mathbf{D} ) \mathbf{\hat z} = \mathrm{diag}(\mathbf w) \mathbf z $$
The equation is then solved using solve.
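For illustration, here is a minimal one-dimensional version of this solve; the name whittaker1 is made up, the grid is assumed uniform, and zero weights are used to make the smoother interpolate missing points:

library(Matrix)

# Sparse d-th order difference matrix, then the normal equation.
whittaker1 <- function(z, lambda, d = 2) {
  n <- length(z)
  w <- as.numeric(!is.na(z)) # zero weight at missing points
  z[is.na(z)] <- 0           # any finite value works where w == 0
  D <- Matrix(diff(diag(n), differences = d), sparse = TRUE)
  as.numeric(solve(Diagonal(x = w) + lambda * crossprod(D), w * z))
}

z <- sin(seq(0, 2 * pi, length.out = 100))
z[30:40] <- NA # a gap to interpolate
zhat <- whittaker1(z, lambda = 1)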
The equations above deal with vectors, not matrices. We can't just smooth the matrix row by row (or column by column): that wouldn't be any better than feemscatter(method = 'pchip'). One way to generalise the approach to matrices is presented in [3], but it only works for uniform grids. Instead, we unfold the z matrix into a vector, then strategically position the coefficients in the difference matrix so that the resulting vector contains both row-wise and column-wise differences.
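For example, assuming uniform grids and letting base R produce the difference coefficients, the same matrix could be assembled from Kronecker products (the real code positions coefficients from diffmat instead, which handles nonuniform grids, and keeps everything sparse):

n <- 5; m <- 4; d <- 2    # dim(z) == c(n, m)
Dx <- diff(diag(n), differences = d)
Dy <- diff(diag(m), differences = d)
D <- rbind(
  kronecker(diag(m), Dx), # row-wise differences of vec(Z)
  kronecker(Dy, diag(n))  # column-wise differences of vec(Z)
)

# verify against differencing the matrix directly
Z <- matrix(rnorm(n * m), n, m)
stopifnot(all.equal(
  as.vector(D %*% as.vector(Z)),
  c(apply(Z, 2, diff, differences = d),
    t(apply(Z, 1, diff, differences = d)))
))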
How can we differentiate a function tabulated on a fixed, potentially nonuniform grid before we even know its values? By using its Taylor series.

The first derivative is special because it's possible to use central differences and get a second-order accurate result, even on a nonuniform grid, by carefully choosing the points where the derivative is estimated. Let \(x + \frac{h}{2}\) and \(x - \frac{h}{2}\) be a pair of adjacent points from the grid. Here are the Taylor series expansions of \(f\) around \(x\), with the Lagrange form of the remainder:
$$ f \left(x + \frac{h}{2}\right) = f(x) + \frac{h}{2} f'(x) + \frac{h^2}{8} f''(x) + \frac{h^3}{48} f'''(\zeta) $$
$$ f \left(x - \frac{h}{2}\right) = f(x) - \frac{h}{2} f'(x) + \frac{h^2}{8} f''(x) - \frac{h^3}{48} f'''(\eta) $$
$$ f \left(x + \frac{h}{2}\right) - f \left(x - \frac{h}{2}\right) = h f'(x) + \frac{h^3}{48} \left(f'''(\zeta) + f'''(\eta)\right) $$
$$ f'(x) = \frac{ f\left(x + \frac{h}{2}\right) - f\left(x - \frac{h}{2}\right) }{h} - \frac{h^2}{48} \left(f'''(\zeta) + f'''(\eta)\right) $$
$$ |\delta f'(x)| \le \max_{ \xi \in [x - \frac{h}{2}, x + \frac{h}{2}] } \frac{h^2}{24} \left| f'''(\xi) \right| $$
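A quick numerical check of this bound on a deliberately nonuniform grid (the test function is an arbitrary choice):

x <- sort(runif(100, 0, 2 * pi))
fp <- diff(sin(x)) / diff(x)       # (f(x + h/2) - f(x - h/2)) / h
mid <- (x[-1] + x[-length(x)]) / 2 # the points where the estimate holds
max(abs(fp - cos(mid)))            # within max(h^2/24 |f'''|) of zero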
Suppose the three grid points \(\xi_1 = x_1 - \frac{h_1}{2}\), \( \xi_2 = x_1 + \frac{h_1}{2} = x_2 - \frac{h_2}{2} \), \(\xi_3 = x_2 + \frac{h_2}{2}\) are adjacent on the grid, and we know the \(f'\) estimates at \(x_1\) and \(x_2\).

On the one hand, the Taylor series expansion of \(f'\) around \(\frac{x_1 + x_2}{2}\) gives:
$$ f'(x_2) = f'\left(\frac{x_1 + x_2}{2}\right) + f''\left(\frac{x_1 + x_2}{2}\right)\frac{x_2 - x_1}{2} + f'''\left(\frac{x_1 + x_2}{2}\right)\frac{(x_2 - x_1)^2}{8} + f''''(\zeta)\frac{(x_2 - x_1)^3}{48} $$ $$ f'(x_1) = f'\left(\frac{x_1 + x_2}{2}\right) - f''\left(\frac{x_1 + x_2}{2}\right)\frac{x_2 - x_1}{2} + f'''\left(\frac{x_1 + x_2}{2}\right)\frac{(x_2 - x_1)^2}{8} - f''''(\eta)\frac{(x_2 - x_1)^3}{48} $$ $$ f''\left(\frac{x_1 + x_2}{2}\right) = \frac{f'(x_2) - f'(x_1)}{x_2 - x_1} - \frac{(x_2 - x_1)^2}{48}(f''''(\zeta) + f''''(\eta)) $$ $$ |\delta f''(x)| \le \max_{ \xi \in [x_1, x_2] } \frac{(x_2 - x_1)^2}{24} f''''(\xi) $$
On the other hand, if we substitute the estimates of \(f'(x)\) from above, we get:
$$ f''\left(\frac{x_1 + x_2}{2}\right) = \frac{ h_1 f(\xi_3) - h_1 f(\xi_2) - h_2 f(\xi_2) + h_2 f(\xi_1) }{h_1 h_2 (x_2 - x_1)} - \frac{ h_2^2 f'''(\zeta_2) - h_1^2 f'''(\zeta_1) }{24(x_2 - x_1)} - \frac{(x_2 - x_1)^2}{24} f''''(\eta) $$
Unless \(h_1 = h_2\), the leftover errors of the first derivative estimates do not cancel, and the result is only first-order accurate. This is why we can't just keep taking central differences and expect \((n+1)\)-th order accurate results.
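A numerical check of the three-point formula, with the grid and the function chosen arbitrarily:

xi <- c(0, 0.3, 1) # second derivative of exp() from three uneven points
h1 <- xi[2] - xi[1]; h2 <- xi[3] - xi[2]
x1 <- xi[1] + h1 / 2; x2 <- xi[2] + h2 / 2
est <- (h1 * exp(xi[3]) - (h1 + h2) * exp(xi[2]) + h2 * exp(xi[1])) /
  (h1 * h2 * (x2 - x1))
est - exp((x1 + x2) / 2) # noticeable first-order error since h1 != h2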
What are the general methods of finding the coefficients for the \(\mathbf D\) matrix? Start with a system of Taylor series expansions for every grid point:
$$ f(x_i) = \sum_{k=0}^{n-1} f^{(k)}(x) \frac{(x_i - x)^k}{k!} + f^{(n)}(\xi) \frac{(x_i - x)^{n}}{n!} \; \forall i = 1 \dots n $$
We can solve this system for coefficients \(c_i\) giving the desired \(l\)-th derivative estimate with highest accuracy \(p\) possible [4]:
$$ \sum_{i = 1}^n c_i f(x_i) = f^{(l)}(x) + o(h^p) $$
Substituting the approximations for \(f(x_i)\) into the equation, we get the following condition for the multiplier in front of each \(f^{(k)}(x)\):
$$ \frac{1}{k!} \sum_{i = 1}^n c_i (x_i - x)^k = \mathbf{1}_{k = l} \; \forall k = 0 \dots n-1 $$
In the matrix form, this becomes a Vandermonde system:
$$ V_{k,i} = \frac{(x_i - x)^k}{k!} $$ $$ b_k = \mathbf{1}_{k = l} $$ $$ \mathbf c = \mathbf{V}^{-1} \mathbf b $$
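A sketch of this solution follows; the helper fdcoef is made up for illustration, while the package's vandermonde() fixes \(l = n - 1\) and the \(x\) described in the Value section:

# Weights for estimating the l-th derivative at x from the values of a
# function on the grid x0: solve the Vandermonde system V c = b.
fdcoef <- function(x0, l, x) {
  n <- length(x0)
  V <- outer(0:(n - 1), x0 - x, function(k, dx) dx^k / factorial(k))
  solve(V, as.numeric(0:(n - 1) == l))
}

# third derivative of sin() from four unevenly spaced points
x0 <- c(0.1, 0.4, 0.5, 0.9)
cc <- fdcoef(x0, 3, mean(x0[2:3]))
sum(cc * sin(x0)) # compare with -cos(0.45), approximately -0.9004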
Unfortunately, this system becomes ill-conditioned for “large” numbers of points. (Experiments show noticeably growing \(c_i\) even for the third derivative from \(10\) points, and no solution at all for \(32\) points on a uniform grid.) Fornberg [5] suggests a more numerically stable procedure, but even it breaks down for \(1000\) points.
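The growth is easy to reproduce with the fdcoef sketch above:

# largest coefficient for the third derivative on a uniform grid
sapply(c(4, 6, 8, 10), function(n)
  max(abs(fdcoef(seq(0, 1, length.out = n), 3, 0.5))))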
It is important to note that the performance of the method depends on the matrix \(\mathbf D\) being sparse. While the methods described above could give more accurate results, they do so at the cost of assigning nonzero weights to many points, and the weights grow as the number of points increases. Therefore, since difference orders above \(3\) are used very rarely, and in the interest of simplicity and performance, we minimise the number of coefficients and their magnitudes by solving the Vandermonde system for the minimally accurate derivative estimates, taking exactly \(k + 1\) points for the \(k\)-th derivative.
What is the error of such an estimate? Substituting the Lagrange form of the remainder into \( \mathbf{c}^\top f(\mathbf x) \), we get:
$$ \sum_{i = 1}^n c_i f(x_i) = f^{(n-1)}(x) + \sum_{i = 1}^n c_i f^{(n)}(\xi_i) \frac{(x_i - x)^n}{n!}, \; \xi_i \in [ x_i, x ] $$
Our choice of \(x\) (the middle point for odd \(n\), the average of the two middle points for even \(n\)) lets us shave off one term from the sum above for odd \(n\) and get second-order accurate results for \(n = 2\), but otherwise the method is first-order accurate.
[1] Eilers, P. H. C. (2003). A Perfect Smoother. Analytical Chemistry, 75(14), 3631-3636. doi: 10.1021/ac034173t

[2] Eilers, P. H. C. (2004). Parametric Time Warping. Analytical Chemistry, 76(2), 404-411. doi: 10.1021/ac034800e

[3] Garcia, D. (2010). Robust smoothing of gridded data in one and higher dimensions with missing values. Computational Statistics & Data Analysis, 54(4), 1167-1178. doi: 10.1016/j.csda.2009.09.020

[4] LeVeque, R. J. (2007). Finite Difference Approximations. In: Finite Difference Methods for Ordinary and Partial Differential Equations, chapter 1, pp. 3-11. Society for Industrial and Applied Mathematics (SIAM). https://faculty.washington.edu/rjl/fdmbook/

[5] Fornberg, B. (1988). Generation of finite difference formulas on arbitrarily spaced grids. Mathematics of Computation, 51(184), 699-706. doi: 10.1090/S0025-5718-1988-0935077-0
feemscatter
library(albatross)
data(feems)
# cut the scattering signal out of a sample EEM, leaving NAs behind
z <- feemscatter(feems$a, rep(25, 4), 'omit')
str(albatross:::whittaker2(
attr(z, 'emission'), attr(z, 'excitation'), z,
c(1, 1e-3), 1:2, logscale = NA, nonneg = 1
))