Remove all molecular formulas that were detected in one or more blank analyses
(identified via blank_file_ids). Matching is always on mf. If a
retention-time column is present (or provided using ret_time_col), removal
is restricted to the corresponding LC segment.
remove_blanks(
  mfd,
  blank_file_ids = NULL,
  blank_prevalence = 0.5,
  ret_time_col = NULL,
  verbose = FALSE,
  ...
)
data.table; subset of the original molecular formula table (mfd)
with blank formulas removed (globally or LC-segment-wise).
mfd: data.table with molecular formula data as derived from
ume::assign_formulas. Column names of elements/isotopes must match names in
the isotope column of ume::masses; values are integers representing
counts per formula.
blank_file_ids: Integer vector of file_id values that represent blank analyses.
blank_prevalence: Numeric between 0 and 1. Threshold for blank filtering:
the proportion of blanks in which a molecular formula must occur before it is
excluded from the sample data. For example, blank_prevalence = 0
removes any formula detected in at least one blank, while blank_prevalence = 0.5
(the default shown in the usage) removes formulas detected in 50% or more of
the blanks.
ret_time_col: Character scalar. Name of the retention-time column that
contains the start of the retention-time segment corresponding to the
mass spectrum.
If NULL (default), the function auto-detects the first column in
c("ret_time_min","retention_time","rt","RT") that exists in mfd.
If none is found, blanks are removed ignoring retention time.
verbose: Logical; if TRUE, show progress messages.
...: Additional arguments passed to methods.
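The blank_prevalence threshold described above can be illustrated with a small base R sketch. This is not the ume implementation: flag_blank_formulas() is a hypothetical helper, the toy table is invented for illustration, and the "at least the threshold, and present in at least one blank" rule is an assumption derived from the argument description.

```r
# Hypothetical helper (not part of ume): return the formulas whose prevalence
# across the blank analyses reaches the threshold.
flag_blank_formulas <- function(mfd, blank_file_ids, blank_prevalence) {
  # One row per (blank file, formula) so repeated detections count once.
  in_blank <- mfd$file_id %in% blank_file_ids
  blank_rows <- unique(mfd[in_blank, c("file_id", "mf")])

  # Fraction of blanks in which each formula occurs.
  counts <- table(blank_rows$mf)
  prevalence <- as.vector(counts) / length(blank_file_ids)
  names(prevalence) <- names(counts)

  # Assumption: only formulas seen in at least one blank can be flagged,
  # so a threshold of 0 flags every formula found in any blank.
  names(prevalence)[prevalence >= blank_prevalence]
}

# Toy data: file_id 1 and 2 are blanks, file_id 3 is a sample.
mfd <- data.frame(
  file_id = c(1, 1, 2, 3, 3, 3),
  mf      = c("C6H12O6", "C2H4O2", "C6H12O6", "C6H12O6", "C2H4O2", "C10H16O5"),
  stringsAsFactors = FALSE
)

flagged_any <- flag_blank_formulas(mfd, c(1, 2), blank_prevalence = 0)
flagged_all <- flag_blank_formulas(mfd, c(1, 2), blank_prevalence = 1)

# Removing flagged formulas returns a subset; mfd itself is unchanged.
cleaned <- mfd[!(mfd$mf %in% flagged_all), ]
```

In this toy table, C6H12O6 occurs in both blanks and C2H4O2 in only one, so a threshold of 1 flags only the former while a threshold of 0 flags both.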
The argument LCMS is deprecated and no longer used. Retention-time-aware
removal is now enabled automatically when a retention-time column is present
or explicitly provided via ret_time_col.
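The column lookup described for ret_time_col could work roughly as follows. This is a sketch under stated assumptions: detect_ret_time_col() is a hypothetical name, and only the candidate column names and their order are taken from the argument description.

```r
# Hypothetical helper (not part of ume): decide which column, if any,
# holds the retention time.
detect_ret_time_col <- function(mfd, ret_time_col = NULL) {
  # An explicitly named column wins, but it has to exist in the table.
  if (!is.null(ret_time_col)) {
    stopifnot(ret_time_col %in% names(mfd))
    return(ret_time_col)
  }
  # Otherwise take the first candidate that exists, in the documented order.
  candidates <- c("ret_time_min", "retention_time", "rt", "RT")
  found <- candidates[candidates %in% names(mfd)]
  # NULL means: no retention-time column, so removal would ignore retention time.
  if (length(found) > 0) found[1] else NULL
}

with_rt    <- data.frame(mf = "C6H12O6", rt = 1.5, ret_time_min = 2.0)
without_rt <- data.frame(mf = "C6H12O6", file_id = 1)

col_found <- detect_ret_time_col(with_rt)
col_none  <- detect_ret_time_col(without_rt)
```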
Boris P. Koch
Requires a unique integer file_id per analysis in mfd.
Minimal required columns in mfd: mf, file_id.
Optional column: a retention-time column (e.g. "ret_time_min").
If a retention-time column is used, formulas present in blanks are only
removed for rows whose mf and retention time match those found in the blanks.
The input mfd is not modified by reference; a subset is returned.
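The segment-wise behaviour noted above can be sketched in base R by matching on the pair (mf, retention time) instead of on mf alone. The toy data and the key construction are assumptions for illustration, and the prevalence threshold is left out for brevity (a single blank is used).

```r
# Toy data: file_id 1 is the blank, file_id 2 is a sample measured in two
# LC segments starting at 5 and 10 minutes.
mfd <- data.frame(
  file_id      = c(1, 1, 2, 2, 2),
  mf           = c("C6H12O6", "C2H4O2", "C6H12O6", "C6H12O6", "C2H4O2"),
  ret_time_min = c(5, 10, 5, 10, 5),
  stringsAsFactors = FALSE
)
blank_file_ids <- 1

# Build a combined key so a formula is only matched within its own segment.
key        <- paste(mfd$mf, mfd$ret_time_min, sep = "@")
blank_keys <- unique(key[mfd$file_id %in% blank_file_ids])

# Drop every row whose (mf, segment) pair was seen in a blank.
# A new data frame is returned; mfd is left as it was.
cleaned <- mfd[!(key %in% blank_keys), ]
```

Here C6H12O6 is removed only from the 5-minute segment, because that is the only segment where the blank contains it; the same formula at 10 minutes is kept.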
Other Formula subsetting:
filter_int(),
filter_mass_accuracy(),
filter_mf_data(),
subset_known_mf(),
ume_assign_formulas(),
ume_filter_formulas()
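# Presence/absence removal, no retention time:
# (remove_blank_list does not appear in the usage shown above;
# blank_file_ids is the documented argument for identifying blanks.)
remove_blanks(mfd = mf_data_demo,
              remove_blank_list = "Blank",
              verbose = TRUE)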