photobiology (version 0.11.2)

despike: Remove spikes from spectrum

Description

Function that returns an R object in which observations corresponding to spikes are replaced by values computed from neighboring pixels. Spikes are values in spectra that are unusually high compared to their neighbors. They are usually individual values or very short runs of similar "unusual" values. Spikes caused by cosmic radiation are a frequent problem in Raman spectra. Another source of spikes is "hot pixels" in CCD and diode array detectors.

Usage

despike(x, z.threshold, max.spike.width, window.width, method, na.rm, ...)

# S3 method for default
despike(x, z.threshold = NA, max.spike.width = NA, window.width = NA, method = "run.mean", na.rm = FALSE, ...)

# S3 method for numeric
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ...)

# S3 method for data.frame
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ..., y.var.name = NULL, var.name = y.var.name)

# S3 method for generic_spct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, y.var.name = NULL, var.name = y.var.name, ...)

# S3 method for source_spct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, unit.out = getOption("photobiology.radiation.unit", default = "energy"), ...)

# S3 method for response_spct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, unit.out = getOption("photobiology.radiation.unit", default = "energy"), ...)

# S3 method for filter_spct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, filter.qty = getOption("photobiology.filter.qty", default = "transmittance"), ...)

# S3 method for reflector_spct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ...)

# S3 method for solute_spct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ...)

# S3 method for cps_spct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ...)

# S3 method for raw_spct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ...)

# S3 method for generic_mspct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ..., y.var.name = NULL, var.name = y.var.name, .parallel = FALSE, .paropts = NULL)

# S3 method for source_mspct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, unit.out = getOption("photobiology.radiation.unit", default = "energy"), ..., .parallel = FALSE, .paropts = NULL)

# S3 method for response_mspct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, unit.out = getOption("photobiology.radiation.unit", default = "energy"), ..., .parallel = FALSE, .paropts = NULL)

# S3 method for filter_mspct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, filter.qty = getOption("photobiology.filter.qty", default = "transmittance"), ..., .parallel = FALSE, .paropts = NULL)

# S3 method for reflector_mspct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ..., .parallel = FALSE, .paropts = NULL)

# S3 method for solute_mspct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ..., .parallel = FALSE, .paropts = NULL)

# S3 method for cps_mspct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ..., .parallel = FALSE, .paropts = NULL)

# S3 method for raw_mspct
despike(x, z.threshold = 9, max.spike.width = 8, window.width = 11, method = "run.mean", na.rm = FALSE, ..., .parallel = FALSE, .paropts = NULL)

Value

A copy of the object passed as argument to x with values detected as spikes replaced by a local average of adjacent neighbors outside the spike.

Arguments

x

an R object

z.threshold

numeric. Modified Z values larger than z.threshold are considered to correspond to spikes.

max.spike.width

integer. Runs of high Z values wider than this are not detected as spikes.

window.width

integer. The full width of the window over which the running mean used as replacement is computed.

method

character. The name of the method: "run.mean" is the running mean as described in Whitaker and Hayes (2018); "adj.mean" is the mean of adjacent neighbors (isolated bad pixels only).

na.rm

logical. Whether NA values should be treated as spikes and replaced.

...

Arguments passed by name to find_spikes().

var.name, y.var.name

character. Names of the columns in which to look for spikes to remove.

unit.out

character. One of "energy" or "photon".

filter.qty

character. One of "transmittance" or "absorbance".

.parallel

logical. If TRUE, apply the function in parallel, using the parallel backend provided by foreach.

.paropts

a list of additional options passed into the foreach function when parallel computation is enabled. This is important if (for example) your code relies on external data or packages: use the .export and .packages arguments to supply them so that all cluster nodes have the correct environment set up for computing.

Methods (by class)

  • despike(default): Default method, which always returns NA.

  • despike(numeric): Method for numeric vectors.

  • despike(data.frame): Method for "data.frame" objects.

  • despike(generic_spct): Method for "generic_spct" objects.

  • despike(source_spct): Method for "source_spct" objects.

  • despike(response_spct): Method for "response_spct" objects.

  • despike(filter_spct): Method for "filter_spct" objects.

  • despike(reflector_spct): Method for "reflector_spct" objects.

  • despike(solute_spct): Method for "solute_spct" objects.

  • despike(cps_spct): Method for "cps_spct" objects.

  • despike(raw_spct): Method for "raw_spct" objects.

  • despike(generic_mspct): Method for "generic_mspct" objects.

  • despike(source_mspct): Method for "source_mspct" objects.

  • despike(response_mspct): Method for "response_mspct" objects.

  • despike(filter_mspct): Method for "filter_mspct" objects.

  • despike(reflector_mspct): Method for "reflector_mspct" objects.

  • despike(solute_mspct): Method for "solute_mspct" objects.

  • despike(cps_mspct): Method for "cps_mspct" objects.

  • despike(raw_mspct): Method for "raw_mspct" objects.

Details

Spikes are detected based on a modified Z score calculated from the differenced spectrum. The Z threshold used should be adjusted to the characteristics of the input and the desired sensitivity. The lower the threshold, the more sensitive the test becomes, which in most cases results in more spikes being detected. A modified version of the algorithm is used if a value different from NULL is passed as argument to max.spike.width. In such a case, an additional step filters out broader spikes (or falsely detected steep slopes) from the returned values.
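The detection step described above can be sketched in base R. This is a minimal illustration of a modified Z score computed on first differences, in the spirit of Whitaker and Hayes (2018); the helper name modified_z() is hypothetical and the code is not the package's find_spikes() implementation, which may differ in details such as scaling, NA handling and the width filter.

```r
# Hypothetical helper, NOT part of photobiology: modified Z score of the
# first differences of a numeric vector.
modified_z <- function(x) {
  d <- diff(x)                        # differenced spectrum
  med <- median(d)
  mad.raw <- median(abs(d - med))     # unscaled median absolute deviation
  0.6745 * (d - med) / mad.raw
}

# Smooth synthetic signal with one artificial single-pixel spike at index 25.
y <- sin(seq(0, pi, length.out = 50))
y[25] <- y[25] + 5

z <- modified_z(y)
spike.diffs <- which(abs(z) > 9)      # differences exceeding the threshold
spike.diffs
```

Note that the indices refer to the differenced series: a single spiked pixel produces two large differences, one on its rising edge and one on its falling edge.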

Simple interpolation replaces the values of isolated bad pixels by the mean of their two closest neighbors. The running mean approach allows the replacement of short runs of bad pixels by the running mean of neighboring pixels within a window of user-specified width. The first approach works well for spectra from array spectrometers, where it corrects for hot and dead pixels in the instrument. The second approach is most suitable for Raman spectra, in which spikes triggered by radiation are wider than a single pixel but usually not more than five pixels wide.
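The two replacement strategies can be sketched as follows. Both helpers are hypothetical illustrations written in base R under simplifying assumptions (bad pixels are given as a logical vector, and for the adjacent-mean case they are isolated and not at either end); they are not the code of replace_bad_pixs().

```r
# Hypothetical helper: replace isolated bad pixels by the mean of their
# two immediate neighbours. Assumes interior, isolated bad pixels.
replace_adj_mean <- function(x, bad) {
  out <- x
  for (i in which(bad)) {
    out[i] <- mean(x[c(i - 1, i + 1)])
  }
  out
}

# Hypothetical helper: replace each bad pixel by the mean of the good
# pixels inside a window of full width window.width centred on it.
replace_run_mean <- function(x, bad, window.width = 11) {
  half <- window.width %/% 2
  out <- x
  for (i in which(bad)) {
    idx <- max(1, i - half):min(length(x), i + half)
    idx <- idx[!bad[idx]]             # keep only pixels not flagged as bad
    out[i] <- mean(x[idx])            # NaN if the window holds no good pixel
  }
  out
}

x1 <- c(2, 4, 50, 8, 10)              # one isolated hot pixel
fixed1 <- replace_adj_mean(x1, bad = x1 > 20)

x2 <- c(1, 1, 1, 1, 9, 9, 1, 1, 1, 1) # a two-pixel-wide spike
fixed2 <- replace_run_mean(x2, bad = x2 > 5, window.width = 5)
```

In this sketch the window must be wider than the spike, otherwise no good pixels remain inside it to average.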

When the argument passed to x contains multiple spectra, the spikes are searched for and replaced in each spectrum independently of other spectra.
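As a rough illustration of what independent processing means, the sketch below scores each member of a plain list of numeric vectors on its own, so the median and spread used for the Z score of one spectrum are unaffected by spikes in another. The helper flag_spikes() and the list spectra are assumptions made for this example and do not reproduce the package's collection methods.

```r
# Hypothetical helper: indices of first differences whose modified Z score
# exceeds the threshold, computed from one spectrum only.
flag_spikes <- function(x, z.threshold = 9) {
  d <- diff(x)
  med <- median(d)
  z <- 0.6745 * (d - med) / median(abs(d - med))
  which(abs(z) > z.threshold)
}

base <- sin(seq(0, pi, length.out = 50))
spectra <- list(clean  = base,
                spiked = replace(base, 25, base[25] + 5))

# Each spectrum is searched separately.
flagged <- lapply(spectra, flag_spikes)
flagged
```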

See Also

See the documentation for find_spikes and replace_bad_pixs for details of the algorithm and implementation.

Examples


library(photobiology)

# rows 120 to 125 of the raw spectrum, before despiking
white_led.raw_spct[120:125, ]

# find and replace spike at 245.93 nm
despike(white_led.raw_spct,
        z.threshold = 10,
        window.width = 25)[120:125, ]
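A further example, using the numeric method on a synthetic vector, may help when experimenting with the arguments. It assumes that the photobiology package is installed; the vector y and its artificial spike are made up for illustration, and the argument values shown are simply the documented defaults.

```r
library(photobiology)  # assumed to be installed

# Synthetic, noise-free "spectrum" with one artificial single-pixel spike.
y <- sin(seq(0, pi, length.out = 60))
y[30] <- y[30] + 5

y.clean <- despike(y,
                   z.threshold = 9,
                   max.spike.width = 8,
                   window.width = 11,
                   method = "run.mean")

# compare the neighbourhood of the spike before and after despiking
cbind(raw = y, despiked = y.clean)[27:33, ]
```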
