Learn R Programming

MSnID (version 1.6.0)

optimize_filter: Filter criteria optimization to maximize the number of identifications given the FDR upper threshold

Description

Adjusts parameters in the "MSnIDFilter" instance to achieve the most number of spectra, peptides or proteins/accessions within pre-set FDR upper limit.

Usage

optimize_filter(filterObj, msnidObj, fdr.max, method, level, n.iter, mc.cores=NULL)

Arguments

filterObj
An instance of class "MSnIDFilter".
msnidObj
An instance of class "MSnID".
fdr.max
Upper limit on acceptable false discovery rate (FDR).
method
Optimization method. Possible values are "Grid" or same values as for the method argument of the optim function. Tested and recommended arguments (besides "Grid") of method are "Nelder-Mead" or "SANN".
level
Determines at what level to perform optimization. Possible values are "PSM", "Peptides" or "Accession".
n.iter
For method "Grid" is approxomate number of evaluation point. For "Nelder-Mean" and "SANN" methods see optim.
mc.cores
Controls the number of enabled cores for parallel processing. Make sense only for "Grid" optimizaton. Default is getOption("mc.cores", 2L). See mclapply for details.

Value

Returns an instance of "MSnIDFilter" that is maximized to provide the most number of identifications while maintaining a pre-set confidence (FDR).

Details

The "Grid" method is brute-force optimization through evaluation of approximately n.iter combinations of the parameters set in the "MSnIDFilter" object. The enumeration of the parameter combinations proceeds as follows. The n.iter number is getting split given the dimensionality of the problem (that is the number of filter parameters in the "MSnIDFilter" object. For each parameter the evaluation points are equally spaced according to quantiles of the parameter distribution. This way we enumerate the grid that has more evaluation points in relatively more dense areas.

Note, optimization is computationally expensive. Thus, the optimize_filter call is memoised using facilities from the R.cache package. Once the same call of optime_filter function issued second time the results will be retrieved from cache.

See Also

MSnID evaluate_filter apply_filter

Examples

Run this code
data(c_elegans)

# explicitely adding parameters that will be used for data filtering
msnidObj$msmsScore <- -log10(msnidObj$`MS-GF:SpecEValue`)
msnidObj$absParentMassErrorPPM <- abs(mass_measurement_error(msnidObj))

# Setting up filter object
filtObj <- MSnIDFilter(msnidObj)
filtObj$absParentMassErrorPPM <- list(comparison="<", threshold=10.0)
filtObj$msmsScore <- list(comparison=">", threshold=10.0)

system.time({
    filtObj.grid <- optimize_filter(filtObj, msnidObj, fdr.max=0.01, 
                                    method="Grid", level="peptide", n.iter=50)})
show(filtObj.grid)

# Fine tuning. Nelder-Mead optimization.
system.time({
    filtObj.nm <- optimize_filter(filtObj.grid, msnidObj, fdr.max=0.01, 
                                    method="Nelder-Mead", level="peptide", 
                                    n.iter=50)})
show(filtObj.nm)

# Fine tuning. Simulated Annealing optimization.
system.time({
    filtObj.sann <- optimize_filter(filtObj.grid, msnidObj, fdr.max=0.01, 
                                    method="SANN", level="peptide", n.iter=50)})
show(filtObj.sann)

Run the code above in your browser using DataLab