MSclassifR (version 0.5.0)

PeakDetection: Detection of peaks in MassSpectrum objects

Description

Detects peaks in a list of MALDIquant MassSpectrum objects, with an optional preliminary averaging step and an optional discrete-bin alignment of the detected peaks. Per-spectrum peak detection can be parallelized on Unix-alike systems (Linux/macOS) for very large inputs; on Windows a serial/vectorized path is used. The result is a list of MALDIquant MassPeaks objects ready for downstream matrix building (e.g., with MALDIquant::intensityMatrix or build_X_from_peaks_fast).

Usage

PeakDetection(
  x,
  averageMassSpec = TRUE,
  labels = NULL,
  averageMassSpectraMethod = "median",
  SNRdetection = 3,
  binPeaks = TRUE,
  PeakDetectionMethod = "MAD",
  halfWindowSizeDetection = 11,
  AlignMethod = "strict",
  Tolerance = 0.002,
  n_workers = NULL,
  verbose = TRUE,
  min_parallel_n = 2000L,
  chunk_size = 1000L,
  ...
)

Value

A list of MALDIquant::MassPeaks objects (one per input or averaged spectrum). If binPeaks = TRUE, all MassPeaks are aligned to the same discrete m/z bins (shared centers), facilitating fast matrix construction.

Arguments

x

List of MALDIquant::MassSpectrum objects (one per sample). These are typically obtained after preprocessing (baseline, smoothing, normalization).

averageMassSpec

Logical; if TRUE, average spectra using MALDIquant::averageMassSpectra before peak detection. If labels is provided and its length equals length(x), spectra are averaged groupwise (one averaged spectrum per label); otherwise all spectra are averaged together. Default TRUE.

labels

Optional factor/character vector for groupwise averaging (same semantics as MALDIquant::averageMassSpectra). Ignored if averageMassSpec = FALSE.

averageMassSpectraMethod

Character, "median" (default) or "mean". Passed to MALDIquant::averageMassSpectra when averageMassSpec = TRUE.

SNRdetection

Numeric; signal-to-noise ratio threshold for peak detection (MALDIquant::detectPeaks argument SNR). Default 3.

binPeaks

Logical; if TRUE, align detected peaks into discrete bins using MALDIquant::binPeaks (method/tolerance set by AlignMethod/Tolerance). Default TRUE.

PeakDetectionMethod

Character; MALDIquant::detectPeaks method, e.g., "MAD" (default) or "SuperSmoother".

halfWindowSizeDetection

Integer; half window size for local maxima (MALDIquant::detectPeaks argument halfWindowSize). Default 11.

AlignMethod

Character; MALDIquant::binPeaks method, "strict" (default) or "relaxed".

Tolerance

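Numeric; MALDIquant::binPeaks tolerance. In MALDIquant this is the maximal relative deviation of a peak position (m/z) for peaks to be treated as identical, not an absolute m/z distance; multiply a ppm value by 1e-6 (e.g., 0.002 corresponds to 2000 ppm). Default 0.002.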

n_workers

Integer or NULL; requested number of parallel workers for the Unix mclapply path. On Windows, this is ignored (serial path). The effective number is sanitized by an internal helper (.safe_n_workers) to avoid oversubscription and R CMD check issues. Default NULL (auto).

verbose

Logical; if TRUE, print progress messages. Default TRUE.

min_parallel_n

Integer; minimum number of spectra at which the function will attempt Unix parallelization via mclapply. Defaults to 2000. Increase to be more conservative, decrease to parallelize more aggressively. Set to Inf to effectively disable Unix parallelization regardless of n_workers.

chunk_size

Integer; number of spectra per chunk/task submitted to mclapply on Unix. Larger chunks reduce scheduling overhead but use more memory per task. Default 1000.

...

Reserved for future extensions or pass-through to MALDIquant/MALDIrppa.
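
To illustrate the Tolerance argument described above, here is a minimal sketch that calls MALDIquant::binPeaks directly, on the assumption that PeakDetection forwards AlignMethod and Tolerance to it unchanged. The peak positions and the ppm values are made up for the illustration. Two peak lists whose positions differ by about 250 ppm from their common mean are merged with a loose tolerance and left apart with a tight one.

```r
if (requireNamespace("MALDIquant", quietly = TRUE)) {
  # Two hypothetical peak lists with slightly shifted positions
  p1 <- MALDIquant::createMassPeaks(mass = c(1000.0, 2000.0), intensity = c(1, 1))
  p2 <- MALDIquant::createMassPeaks(mass = c(1000.5, 2001.0), intensity = c(1, 1))

  # Tolerance is a relative deviation: convert from ppm by multiplying by 1e-6
  tol_loose <- 2000 * 1e-6  # 2000 ppm, the PeakDetection default (0.002)
  tol_tight <- 100 * 1e-6   # 100 ppm

  loose <- MALDIquant::binPeaks(list(p1, p2), method = "strict", tolerance = tol_loose)
  tight <- MALDIquant::binPeaks(list(p1, p2), method = "strict", tolerance = tol_tight)

  # With the loose tolerance both lists share the same bin centers;
  # with the tight tolerance the original positions are kept apart
  MALDIquant::mass(loose[[1]])
  MALDIquant::mass(loose[[2]])
  MALDIquant::mass(tight[[1]])
  MALDIquant::mass(tight[[2]])
}
```

Because the deviation is relative, the same Tolerance allows a wider absolute window at high m/z than at low m/z.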

Details

  • Averaging: if averageMassSpec = TRUE and labels is provided with length(labels) == length(x), MALDIquant::averageMassSpectra performs a groupwise averaging by labels. Otherwise, all spectra are averaged. If averageMassSpec = FALSE, the input list is used as-is.

  • Peak detection: peak finding uses MALDIquant::detectPeaks with the given SNR/method/half-window. This is applied per spectrum (serial or parallel).

  • Discrete-bin alignment: when binPeaks = TRUE, MALDIquant::binPeaks aligns detected peaks to a shared discrete grid (method AlignMethod, tolerance Tolerance), enabling consistent feature columns across spectra.

  • Parallelization: on Windows, a single serial/vectorized call to MALDIquant::detectPeaks is used (fast enough for small/medium inputs). On Unix-alike systems, when length(x) >= min_parallel_n and the effective number of workers (n_workers after sanitization by .safe_n_workers) is greater than 1, the list is split into chunks of size chunk_size and processed with parallel::mclapply.

  • Meta-data: if labels is provided, label information is appended best-effort to MassPeaks metaData (file/fullName), preserving existing fields where possible.
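
The steps above can be sketched with plain MALDIquant and parallel calls. This is an illustration of the described workflow under stated assumptions, not the internal code of PeakDetection: the toy spectra, the chunk size of 2, the fixed choice of 2 cores, and all variable names are hypothetical, and the metadata handling is omitted.

```r
if (requireNamespace("MALDIquant", quietly = TRUE)) {
  set.seed(42)
  mass <- seq(900, 2100, by = 1)
  make_spectrum <- function(shift) {
    # Two Gaussian peaks plus non-negative noise (toy data)
    inten <- dnorm(mass, 1000 + shift, 2) * 50 +
             dnorm(mass, 1500 - shift, 2) * 80 +
             abs(rnorm(length(mass), 0, 0.2))
    MALDIquant::createMassSpectrum(mass = mass, intensity = inten)
  }
  spectra <- lapply(c(0, 0, 1, 1, 2, 2), make_spectrum)
  labels <- factor(c("A", "A", "B", "B", "C", "C"))

  # 1. Groupwise averaging: one averaged spectrum per label
  averaged <- MALDIquant::averageMassSpectra(spectra, labels = labels,
                                             method = "median")

  # 2. Peak detection in chunks; mclapply forks only on Unix-alikes,
  #    so fall back to a single core elsewhere (assumed values)
  chunk_size <- 2L
  n_cores <- if (.Platform$OS.type == "unix") 2L else 1L
  chunks <- split(averaged, ceiling(seq_along(averaged) / chunk_size))
  detected <- parallel::mclapply(chunks, function(chunk) {
    MALDIquant::detectPeaks(chunk, method = "MAD",
                            halfWindowSize = 11, SNR = 3)
  }, mc.cores = n_cores)
  peaks <- do.call(c, unname(detected))  # flatten chunks to one list

  # 3. Discrete-bin alignment onto shared m/z centers
  peaks <- MALDIquant::binPeaks(peaks, method = "strict", tolerance = 0.002)

  # Rows = averaged spectra, columns = aligned m/z bins
  X <- MALDIquant::intensityMatrix(peaks)
  dim(X)
}
```

For an input this small the chunking only adds overhead; it is shown to make the splitting and recombination steps concrete.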

See Also

MALDIquant::averageMassSpectra, MALDIquant::detectPeaks, MALDIquant::binPeaks; MALDIquant::intensityMatrix; build_X_from_peaks_fast for a fast matrix builder from MassPeaks.

Examples

if (requireNamespace("MALDIquant", quietly = TRUE)) {
  # Two toy spectra with peaks near 1000, 1500, 2000 Da
  mass <- seq(900, 2100, by = 1)
  make_spectrum <- function(shift) {
    inten <- dnorm(mass, 1000 + shift, 2) * 50 +
             dnorm(mass, 1500 - shift, 2) * 80 +
             dnorm(mass, 2000 + shift, 2) * 40 +
             rnorm(length(mass), 0, 0.2)
    MALDIquant::createMassSpectrum(mass = mass, intensity = inten)
  }
  spectra <- list(make_spectrum(0.3), make_spectrum(-0.3))

  # Detect peaks without averaging; align in strict bins
  peaks <- PeakDetection(
    x = spectra,
    averageMassSpec = FALSE,
    SNRdetection = 3,
    PeakDetectionMethod = "MAD",
    binPeaks = TRUE,
    AlignMethod = "strict",
    Tolerance = 0.002,  # relative m/z deviation, as in MALDIquant::binPeaks
    verbose = TRUE
  )

  # Build an intensity matrix (rows = spectra, cols = aligned m/z bins)
  X <- MALDIquant::intensityMatrix(peaks)
  dim(X)
}
