readyomics (version 0.1.1)

process_ms: Process MS-like omics data

Description

This function performs common preprocessing steps for mass spectrometry (MS)-like omics datasets, including QC sample removal, zero-to-NA conversion, feature prevalence filtering, transformation, and feature-wise value imputation.

Usage

process_ms(
  X,
  remove_ids = NULL,
  min_prev = 0.8,
  rename_feat = TRUE,
  transform = c("none", "log", "sqrt"),
  log_base_num = 10,
  impute = c("none", "min_val", "QRILC"),
  min_val_factor = 1,
  platform = c("ms", "nmr"),
  seed = NULL,
  verbose = TRUE,
  ...
)

Value

A list:

X_names

Mapping between original and new feature names.

X_processed

Processed numeric matrix.

Arguments

X

A numeric data frame or matrix (samples in rows, features in columns).

remove_ids

A regex or character vector to filter out rows in X (e.g. QCs). Set to NULL to skip.

min_prev

Numeric between 0 and 1. Minimum proportion of non-missing values a feature must have to be kept. Zeros are converted to NA before prevalence is computed.

rename_feat

Logical. If TRUE, features will be renamed as "feat_n" and original labels stored.

transform

One of "none", "log", or "sqrt".

log_base_num

Numeric logarithm base. Required if transform = "log".

impute

One of "none", "min_val", or "QRILC". Note: imputeLCMD::impute.QRILC() requires log-transformed data, so a log transform is forced internally whenever impute = "QRILC", regardless of the transform setting.

min_val_factor

Numeric >= 1. Scaling factor for min value imputation.

platform

Whether the data were generated by mass spectrometry ("ms") or nuclear magnetic resonance spectroscopy ("nmr"). The latter allows negative values in the matrix.

seed

Optional integer. If provided, sets the random seed for reproducible imputeLCMD::impute.QRILC() permutation results.

verbose

Logical. Show messages about the processing steps.

...

Extra arguments passed to imputeLCMD::impute.QRILC().
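
The first steps described by the arguments above can be approximated in base R. This is a minimal sketch, not the readyomics implementation: it assumes that remove_ids is matched against row names, that the prevalence comparison against min_prev is inclusive, and it uses a small hand-made matrix. Imputation is left out.

```r
# Illustrative base-R approximation; NOT the readyomics source code.
# matrix() fills column by column, so each line below is one feature.
X <- matrix(c(1, 2, 3, 4, 5, 6, 9,    # feat1: no zeros
              0, 2, 3, 4, 5, 6, 9,    # feat2: one zero
              0, 0, 0, 4, 5, 6, 9),   # feat3: three zeros
            nrow = 7, ncol = 3,
            dimnames = list(c(paste0("sample", 1:6), "QC1"),
                            paste0("feat", 1:3)))

# remove_ids: drop rows matching a regex (assumed to apply to row names).
X_s <- X[!grepl("^QC", rownames(X)), , drop = FALSE]

# Zeros are treated as missing values.
X_s[X_s == 0] <- NA

# min_prev: keep features whose proportion of non-missing values
# reaches the threshold (inclusive comparison is an assumption).
prev <- colMeans(!is.na(X_s))
X_f <- X_s[, prev >= 0.8, drop = FALSE]

# transform = "log" with log_base_num = 10; NA values stay NA.
X_log <- log(X_f, base = 10)
```

In this toy matrix feat3 has only half of its values present after zero-to-NA conversion, so it is dropped at the 0.8 threshold, while feat1 and feat2 are kept.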

References

Lazar, C., Gatto, L., Ferro, M., Bruley, C., & Burger, T. (2016). Accounting for the multiple natures of missing values in label-free quantitative proteomics data sets to compare imputation strategies. Journal of Proteome Research, 15(4), 1116–1125. doi:10.1021/acs.jproteome.5b00981

Wei, R., Wang, J., Su, M., Jia, E., Chen, S., Chen, T., & Ni, Y. (2018). Missing value imputation approach for mass spectrometry-based metabolomics data. Scientific Reports, 8, 663. doi:10.1038/s41598-017-19120-0

See Also

imputeLCMD::impute.QRILC() for imputing missing values.

Examples

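# Simulated intensity matrix: 20 samples (rows) by 4 features (columns).
# Values are drawn at random, so results differ between runs.
X <- matrix(sample(0:10, size = 80, replace = TRUE),
            nrow = 20, ncol = 4,
            dimnames = list(paste0("sample", 1:20),
                            paste0("feat", 1:4)))

# Default settings: no transformation and no imputation.
result <- process_ms(X, verbose = FALSE) # Generates NA warning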
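
A further example with non-default settings may help show how the arguments combine. It is a sketch that relies only on the signature in the Usage section. The input matrix is made up for illustration, and the call is guarded so the code is skipped when readyomics is not installed. The exact structure of X_names is not documented here, so it is only inspected with str().

```r
# Hypothetical input: 20 samples by 4 features with three zeros,
# so every feature stays well above the default min_prev of 0.8.
X <- matrix(seq_len(80), nrow = 20, ncol = 4,
            dimnames = list(paste0("sample", 1:20),
                            paste0("feat", 1:4)))
X[c(1, 25, 50)] <- 0

if (requireNamespace("readyomics", quietly = TRUE)) {
  result <- readyomics::process_ms(
    X,
    min_prev = 0.8,
    transform = "log",
    log_base_num = 2,
    impute = "min_val",
    verbose = FALSE
  )

  # The returned list has the two elements described under Value.
  str(result$X_names)
  dim(result$X_processed)
}
```

With impute = "QRILC", a seed can be supplied through the seed argument to make the imputed values reproducible.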
