PepSAVIms (version 0.9.1)

filterMS: Filter compounds from mass spectrometry data

Description

Filters mass spectrometry data using a set of criteria, described in Details. Returns an object of classes msDat and filterMS.

Usage

filterMS(msObj, region, border = "all", bord_ratio = 0.05, min_inten = 1000, max_chg = 7L)

Arguments

msObj
An object of class msDat. Note that this includes objects created by the functions binMS and msDat.
region
A vector of mode either character or numeric. If numeric, the entries should provide the indices of the region of interest in the mass spectrometry data supplied as the argument for msObj. If character, the entries should uniquely specify the region of interest through partial string matching (see criteria 1 and 4).
border
Either a character string "all", or a character string "none", or a length-1 or length-2 numeric value specifying the number of fractions to either side of the region of interest to comprise the bordering region. If a single numeric value, then this is the number of fractions to each side of the region of interest; if it is two values, then the first value is the number of fractions to the left, and the second value is the number of fractions to the right. If there are not enough fractions in either direction to completely span the number of specified fractions, then all of the available fractions to the side in question are considered to be part of the bordering region (see criterion 2).
bord_ratio
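A single nonnegative numeric value giving the upper bound on the ratio of a compound's intensity in the bordering region to its maximum intensity. A value of 0 will not admit any compounds, while a value greater than 1 will admit all compounds (see criterion 2).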
min_inten
A single numeric value giving the intensity that at least one fraction in the region of interest must exceed. A value less than the minimum mass spectrometry value in the data will admit all compounds (see criterion 4).
max_chg
A single numeric value specifying the maximum charge which a compound may exhibit (see criterion 5).
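
The way border selects fractions around the region of interest can be illustrated with a small sketch. The helper border_idx below is hypothetical and is not part of PepSAVIms; it assumes the region of interest is a contiguous run of column indices and simply mirrors the description of the border argument given above, including truncation when too few fractions are available on one side.

```r
# Hypothetical helper (not a PepSAVIms function): map `border` to the
# column indices of the bordering region, following the argument text.
border_idx <- function(n_frac, region_idx, border = "all") {
  lo <- min(region_idx)
  hi <- max(region_idx)
  if (identical(border, "none")) {
    return(integer(0))            # no bordering region at all
  }
  if (identical(border, "all")) {
    width <- c(n_frac, n_frac)    # wide enough to reach both ends
  } else {
    width <- rep_len(border, 2L)  # a single value applies to both sides
  }
  # Fractions to the left of the region, at most width[1] of them
  left <- seq_len(lo - 1L)
  left <- left[left >= lo - width[1]]
  # Fractions to the right of the region, at most width[2] of them
  right <- seq_len(n_frac)
  right <- right[right > hi & right <= hi + width[2]]
  c(left, right)
}

border_idx(10, 4:6, 2)        # 2 3 7 8
border_idx(10, 4:6, c(5, 1))  # 1 2 3 7 (left side truncated at fraction 1)
border_idx(10, 4:6, "all")    # 1 2 3 7 8 9 10
border_idx(10, 4:6, "none")   # integer(0)
```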

Value

Returns an object of class filterMS which inherits from msDat. This object is a list with elements described below. The class is equipped with print, summary, and extractMS functions.

Details

Attempts to filter out candidate compounds via subject-matter knowledge, with the goal of removing spurious noise from downstream models. The criteria for the downstream inclusion of a candidate compound are listed below.

  1. The m/z intensity maximum must fall inside the range of the bioactivity region of interest.
  2. The ratio of the m/z intensity of a species in the areas bordering the region of interest to the species' maximum intensity must be less than bord_ratio. When there is no bordering area, all observations are taken to satisfy this criterion.
  3. For each species, the fraction immediately to the right of its maximum-intensity fraction must have non-zero abundance. In the case of ties for the maximum, it is the fraction immediately to the right of the rightmost maximum fraction which cannot have zero abundance. When the fraction with maximum intensity is the rightmost fraction in the data, the observation is taken to satisfy this criterion.
  4. At least 1 fraction in the region of interest must have intensity greater than min_inten.
  5. Compound charge state must be less than or equal to max_chg.
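
The criteria above can be sketched in base R on a toy intensity matrix. This is an illustration under stated assumptions, not the package's implementation: the data are made up, the region and border are given directly as column indices, ties for the maximum are resolved to the rightmost fraction for criterion 1 as well as criterion 3, and the bordering intensity in criterion 2 is assumed to be the largest intensity found in the bordering fractions.

```r
# Toy intensities (made up): rows are compounds, columns are fractions
inten <- rbind(
  a = c(0,      0, 5000, 2000, 0, 0),  # satisfies all five criteria
  b = c(4000, 500, 3000, 1000, 0, 0),  # fails 1 (and hence 2): max outside region
  c = c(0,    900, 5000, 2000, 0, 0),  # fails 2: border ratio 900 / 5000 = 0.18
  d = c(0,      0, 5000,    0, 0, 0),  # fails 3: zero right of the maximum
  e = c(0,      0,  800,  400, 0, 0),  # fails 4: never exceeds min_inten
  f = c(0,      0, 5000, 2000, 0, 0)   # fails 5: charge above max_chg
)
charge <- c(2, 2, 2, 2, 2, 9)

region <- 3:4             # columns forming the region of interest
border <- c(1:2, 5:6)     # all remaining columns, as with border = "all"
bord_ratio <- 0.05
min_inten <- 1000
max_chg <- 7

n_frac  <- ncol(inten)
row_max <- apply(inten, 1, max)
# Rightmost column attaining each row's maximum (assumed tie rule)
last_max <- apply(inten, 1, function(x) max(which(x == max(x))))

# 1. Maximum falls inside the region of interest
crit1 <- last_max %in% region
# 2. Border intensity relative to the maximum is below bord_ratio;
#    with no bordering fractions every compound satisfies this
crit2 <- if (length(border) == 0) {
  rep(TRUE, nrow(inten))
} else {
  apply(inten[, border, drop = FALSE], 1, max) / row_max < bord_ratio
}
# 3. Fraction to the right of the maximum is non-zero, or the maximum
#    is already in the last fraction
right_nbr <- pmin(last_max + 1L, n_frac)
crit3 <- last_max == n_frac |
  inten[cbind(seq_len(nrow(inten)), right_nbr)] > 0
# 4. Some fraction in the region exceeds min_inten
crit4 <- apply(inten[, region, drop = FALSE], 1, max) > min_inten
# 5. Charge state does not exceed max_chg
crit5 <- charge <= max_chg

keep <- crit1 & crit2 & crit3 & crit4 & crit5
rownames(inten)[keep]  # "a"
```

Because criterion 2 uses a strict inequality, a bord_ratio of 0 rejects every compound in this sketch, which agrees with the description of the bord_ratio argument.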

Examples

    # Load mass spectrometry data
    data(mass_spec)
    
    # Convert mass_spec from a data.frame to an msDat object
    ms <- msDat(mass_spec = mass_spec,
                mtoz = "m/z",
                charge = "Charge",
                ms_inten = c(paste0("_", 11:43), "_47"))
    
    # Filter out potential candidate compounds
    filter_out <- filterMS(msObj = ms,
                           region = paste0("VO_", 17:25),
                           border = "all",
                           bord_ratio = 0.01,
                           min_inten = 1000,
                           max_chg = 7)
    
    # print, summary function
    filter_out
    summary(filter_out)
    
    # Extract filtered mass spectrometry data as a matrix or msDat object
    filter_matr <- extractMS(msObj = filter_out, type = "matrix")
    filter_msDat <- extractMS(msObj = filter_out, type = "msDat")
    
    
