feemscatter: Handle scattering signal in FEEMs

Description

Remove or interpolate scattering signal in individual FEEM objects, FEEM cube objects, or lists of them.

Usage

feemscatter(x, ...)
  # S3 method for list
feemscatter(x, ..., cl, progress = TRUE)
  # S3 method for feemcube
feemscatter(x, ..., cl, progress = TRUE)
  # S3 method for feem
feemscatter(
    x, widths, method = c("omit", "pchip", "loess", "kriging", "whittaker"),
    add.zeroes = 30, Raman.shift = 3400, ...
  )

Value

An object of the same kind (FEEM object / FEEM cube / list of them) with scattering signal handled as requested.

Arguments

x

An individual FEEM object, FEEM cube object, or a list of them, to handle the scattering signal in.

widths

A numeric vector or a list containing the half-widths of the scattering bands, in nm. Rayleigh scattering is followed by Raman scattering, followed by second diffraction order for Rayleigh and Raman, and so on. (Typically, there's no need for anything higher than third order, and even that is rare.) For example:

Rayleigh scattering
Raman scattering
Rayleigh scattering, $2\lambda$
Raman scattering, $2\lambda$
...

For higher diffraction orders, the peak widths are proportionally scaled, making it possible to provide the same number for all kinds of scattering visible in the EEM. Set a width to $0$ if you don't want to handle this particular kind of scattering signal.

It's possible to specify the bands asymmetrically. If the area to be corrected should range from x nm to the left of the scattering peak to y nm to the right of it, pass a list instead of a vector, and put a two-element vector c(x, y) for the appropriate kind of scattering. For example, passing widths = list(c(30, 20), 20) means “handle $-30$ nm to the left and $+20$ nm to the right of Rayleigh peak and $\pm 20$ nm around Raman peak”.

To sum up, given two half-widths $W_1$ and $W_2$, the test for being inside a $k$th diffraction order of a scattering band is as follows:

$$ -W_1 < \frac{\lambda_\mathrm{center}}{k} - \lambda_\mathrm{em} < +W_2 $$ if (!dir.exists('man/figures')) dir.create('man/figures'); { pdf('man/figures/scatter-widths.pdf', 6, 3.2, pointsize = 10) dev.control(displaylist = 'enable') em.range <- c(300, 700) ex.range <- c(210, 550) widths <- list( list(c(50, 20), 20), list(c(20, 20), 20), list(c(10, 10), 0) ) Raman.shift <- 3400 # 1. The rectangle itself image( matrix(0, 1, 1), col = NA, xlim = em.range, ylim = ex.range, xlab = quote(lambda[em] * ', nm'), ylab = quote(lambda[ex] * ', nm'), main = 'widths = list(c(50, 20), 20, 20, 20, 10), Raman.shift = 3400' ) # 2. The scattering areas and their centers sc <- c(min(em.range, ex.range), max(em.range, ex.range)) Map(function(p, k) { polygon( c(sc * k - p[[1]][[1]] * k, rev(sc * k) + p[[1]][[2]] * k), c(sc, rev(sc)), border = NA, col = '#D0D0D0' ) if (p[[2]] > 0) { wl.ex <- seq(sc[1], sc[2], 1) wl.em <- 1/(1/wl.ex - Raman.shift/1e7) polygon( c(wl.em * k - p[[2]] * k, rev(wl.em * k) + p[[2]] * k), c(wl.ex, rev(wl.ex)), border = NA, col = '#D0D0D0' ) lines(wl.em * k, wl.ex, lty = 3) } lines(sc * k, sc, lty = 3) }, widths, seq_along(widths)) # arrow length dp <- 15 # Raman shift arrow p <- 400 arrows( p, p, x1 = 1/(1/p - Raman.shift/1e7), len = .1 ) text( 1/(1/p - Raman.shift/1e7) + dp / 2, p, bquote( (lambda[ex]^-1 - lambda[em]^-1) == .(Raman.shift) * ' cm'^-1 ), adj = c(0, .6) ) # scatter width arrows p <- 530 segments(p, p, x1 = p - widths[[1]][[1]][1]) arrows( c(p - widths[[1]][[1]][1] - dp, p + dp), p, x1 = c(p - widths[[1]][[1]][1], p), len = .1 ) text( p - widths[[1]][[1]][1] - dp, p, paste('widths[[1]][1] = ', widths[[1]][[1]][1], 'nm'), pos = 2 ) p <- 470 segments(p, p, x1 = p + widths[[1]][[1]][2]) arrows( c(p + widths[[1]][[1]][2] + dp, p - dp), p, x1 = c(p + widths[[1]][[1]][2], p), len = .1 ) text( p - dp, p, paste('widths[[1]][2] =', widths[[1]][[1]][2], 'nm'), pos = 2 ) p <- 500 pR <- 1/(1/p - Raman.shift/1e7) segments(pR, p, x1 = pR + widths[[1]][[2]]) arrows( c(pR - dp, pR + widths[[1]][[2]] + dp), p, x1 = c(pR, pR + widths[[1]][[2]]), len = .1 ) text( pR + widths[[1]][[2]], p, paste('widths[[2]] =', widths[[2]][[1]][1], 'nm'), pos = 1 ) p <- 310 segments(2 * p, p, x1 = 2 * (p - widths[[2]][[1]][1])) arrows( c(2 * (p - widths[[2]][[1]][1]) - dp, 2 * p + dp), p, x1 = c(2 * (p - widths[[2]][[1]][1]), 2 * p), len = .1 ) text( 2 * (p - widths[[2]][[1]][1]) - dp, p, bquote('widths[[3]]' / 2 == .(widths[[2]][[1]][1]) * ' nm'), pos = 2 ) p <- 230 segments(3 * p, p, x1 = 3 * (p - widths[[2]][[1]][1])) arrows( c(3 * (p - widths[[3]][[1]][1]) - dp, 3 * p + dp), p, x1 = c(3 * (p - widths[[3]][[1]][1]), 3 * p), len = .1 ) text( 3 * (p - widths[[2]][[1]][1]), p, bquote('widths[[5]]' / 3 == .(widths[[3]][[1]][1]) * ' nm'), pos = 2 ) dev.print( svg, 'man/figures/scatter-widths.svg', width = 6, height = 3.2, pointsize = 10 ) dev.off() }; ' '

method

A string choosing how to handle the scattering signal:

omit: Replace it with NA.

pchip

Interpolate it line-by-line using piecewise cubic Hermitean polynomials (pchip). Pass a by argument to choose the direction of interpolation; see Details.

loess

Interpolate it by fitting a locally weighted polynomial surface (loess). In this case the remaining parameters are passed verbatim to loess, which may be used to set parameters such as span.

kriging

Interpolate it by means of ordinary or simple Kriging, as implemented in pracma function kriging. Pass a type argument to choose between the two methods. This method is not recommended due to its high CPU time and memory demands: it has to invert a dense $O(N^2)$ matrix (which easily reaches multiple gigabytes for some EEMs), then take its products with vectors $O(N)$ times, with $N =$ length(x).

whittaker

Interpolate it by minimising a weighted sum of squared residuals (for known part of the spectrum) and roughness penalty (squared central difference approximations for derivatives by $\lambda_\mathrm{em}$ and $\lambda_\mathrm{ex}$). See Details for more information and parameters.

add.zeroes

Set intensities at $ \lambda_\mathrm{em} < \lambda_\mathrm{ex} - \mathtt{add.zeroes}\:\mathrm{nm} $ to $0$ unless they have been measured. Set to NA to disable this behaviour.

Raman.shift

Raman shift of the scattering signal of water, $\textrm{cm}^{-1}$.

...

Passed verbatim from feemscatter generics to feemscatter.feem.

If “pchip” method is selected, the by parameter chooses between interpolating by row, by column, or averaging both, see Details.

If “loess” method is selected, remaining options are passed to loess (the span parameter is of particular interest there).

If “kriging” method is selected, remaining options are passed to kriging.

If “whittaker” method is selected, available parameters include d, lambda, nonneg and logscale, see Details.

If not missing, a parallel cluster object to run the scattering correction code on or NULL for the default cluster object registered via setDefaultCluster.

progress

Set to FALSE to disable the progress bar.

Details

The “pchip” method works by default as described in [1]: each emission spectrum at different excitation wavelengths is considered one by one. Zeroes are inserted in the corners of the spectrum if they are undefined (NA) to prevent extrapolation from blowing up, then the margins are interpolated using the corner points, then the rest of the spectrum is interpolated line by line. Since pchip requires at least 3 points to interpolate, the function falls back to linear interpolation if it has only two defined points to work with. The by argument controls whether the function proceeds by rows of the matrix (“emission”, default), by columns of the matrix (“excitation”), or does both (“both”) and averages the results to make the resulting artefacts less severe [2, see the staRdom package itself].

The “loess” method feeds the whole FEEM except the area to be interpolated to loess, then asks it to predict the remaining part of the spectrum. Any negative values predicted by loess are replaced by $0$.

The “kriging” method [3] is much more computationally expensive than the previous two, but, on some spectra, provides best results, not affected by artefacts resulting from line-by-line one-dimensional interpolation (pchip) or varying degrees of smoothness in different areas of the spectrum (loess). Any negative values returned by kriging are replaced by $0$.

The “whittaker” method [4] works by unfolding each FEEM into a vector $\mathbf z$ and looking for a vector $\hat{\mathbf z}$ that is close enough to $\mathbf z$, but also smooth, by minimising a sum of penalties:

$$ (\mathbf z - \hat{\mathbf z})^\top \mathrm{diag}(\mathbf w) (\mathbf z - \hat{\mathbf z}) + \sum_k \lambda_k |\mathbf{D}_{n_k} \hat{\mathbf z}|^2 $$

The weights $\mathbf w$ are set to $0$ for missing (NA) points and for those to be interpolated and to $1$ otherwise. The matrix $\mathbf D_n$ is constructed in such a way that multiplying it by $\hat{\mathbf z}$ results in a vector of $n$-th order derivative estimates in both directions and in all applicable points of $\hat{\mathbf z}$ as a matrix. The wavelength grid is correctly taken into account by solving a Vandermonde system for every $n+1$ consecutive points.

The parameters d and lambda should be numeric vectors of the same length, choosing the difference orders ($n_k$) and their weights ($\lambda_k$). It has been shown in [5] that a combination of first- and second-order penalty ($ 2 \lambda \mathbf{D}_1 + \lambda^2 \mathbf{D}_2 $) results in non-negative impulse response, but the resulting peak shape may be sub-optimal. Instead, the default penalty is $ 10^{-2} \mathbf{D}_1 + 10 \mathbf{D}_2 $, and resulting negative values are pulled to $0$ with weight nonneg (default $1$, same as fidelity weight) by adding a penalty of $ \mathtt{nonneg} \cdot \sum_i \mathbf{1}_{\hat{z}_i < 0} \, \hat{z}_i^2 $ and retrying until no new penalty weights are added. Set nonneg to $0$ to disable this behaviour.

It is also possible to deal with resulting negative values by scaling and shifting the signal between logscale (typically $10^{-4}$) and $1$, interpolating the logarithm of the signal, then undoing the transformation. This prevents the resulting values from getting lower than $ \mathrm{min}(x) - (\mathrm{max}(x) - \mathrm{min}(x)) \frac{\mathtt{logscale}}{1 - \mathtt{logscale}} $, which is approximately $ -\mathtt{logscale} \cdot \mathrm{max}(x) $ if logscale and $\mathrm{min}(x)$ are both close to $0$. By default logscale is NA, disabling this behaviour, since it may negatively affect the shape of interpolated signal.

References

tools::toRd(bibentry('Article', author = c( person('Morteza', 'Bahram'), person('Rasmus', 'Bro'), person('Colin', 'Stedmon'), person('Abbas', 'Afkhami') ), title = paste( 'Handling of Rayleigh and Raman scatter for PARAFAC modeling of', 'fluorescence data using interpolation' ), journal = 'Journal of Chemometrics', volume = 20, number = '3-4', pages = '99-105', year = 2006, doi = '10.1002/cem.978', ))
tools::toRd(bibentry('Article', author = c( person('Matthias', 'Pucher'), person('Urban', paste0(rawToChar(as.raw(0x5c)), 'enc{Wünsch}{Wuensch}')), person('Gabriele', 'Weigelhofer'), person('Kathleen', 'Murphy'), person('Thomas', 'Hein'), person('Daniel', 'Graeber') ), title = paste( 'staRdom: Versatile Software for Analyzing Spectroscopic Data', 'of Dissolved Organic Matter in R' ), journal = 'Water', volume = 11, number = 11, year = 2019, pages = 2366, doi = '10.3390/w11112366' ))
tools::toRd(bibentry('InBook', author = c( person(c('William', 'H.'), 'Press'), person(c('Saul', 'A.'), 'Teukolsky'), person(c('William', 'T.'), 'Vetterling'), person(c('Brian', 'P.'), 'Flannery') ), booktitle = 'Numerical recipes: The Art of Scientific Computing (3rd Ed.)', publisher = 'Cambridge University Press, New York', chapter = '3.7.4', title = 'Interpolation by Kriging', pages = '144-147', year = 2007 ))
tools::toRd(bibentry('Article', author = person(c('Paul', 'H.', 'C.'), 'Eilers'), title = 'A Perfect Smoother', journal = 'Analytical Chemistry', volume = 75, number = 14, pages = '3631-3636', year = 2003, doi = '10.1021/ac034173t' ))
tools::toRd(bibentry('Article', author = c( person(c('Paul', 'H.', 'C.'), 'Eilers'), person(c('Jelle', 'J.'), 'Goeman') ), title = 'Enhancing scatterplots with smoothed densities', journal = 'Bioinformatics', volume = 20, number = 5, pages = '623-628', year = 2004, doi = '10.1093/bioinformatics/btg454' ))

Examples

Run this code

  data(feems)
  plot(x <- feemscatter(
    feems[[1]], widths = c(25, 25, 20, 20),
    method = 'whittaker', Raman.shift = 3500
  ))

Run the code above in your browser using DataLab