metamorphr (version 0.2.0)

filter_cv: Filter Features based on their coefficient of variation

Description

Filters features based on their coefficient of variation (CV). The CV is defined as \(CV = \frac{s_i}{\overline{x_i}}\), where \(s_i\) is the standard deviation and \(\overline{x_i}\) the mean of feature \(i\) across the reference samples.
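
The definition above can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the intensity values are made up, and the comparison `cv <= max_cv` is an assumption about how "maximum allowed CV" is applied.

```r
# Hypothetical intensities of one feature in three QC samples (made-up values)
qc_intensities <- c(980, 1010, 1050)

# CV = standard deviation divided by mean
cv <- stats::sd(qc_intensities) / mean(qc_intensities)

# Assumption: a feature is kept if its CV does not exceed max_cv
max_cv <- 0.2
keep_feature <- cv <= max_cv
```

With these values the CV is roughly 0.035, so the feature would pass a threshold of 0.2.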

Usage

filter_cv(
  data,
  reference_samples,
  max_cv = 0.2,
  ref_as_group = FALSE,
  group_column = NULL,
  na_as_zero = TRUE
)

Value

A filtered tibble.

Arguments

data

A tidy tibble created by read_featuretable.

reference_samples

The names of the samples or group(s) used to calculate the CV of each feature. Usually quality control (QC) samples.

max_cv

The maximum allowed CV. 0.2 is a reasonable start.

ref_as_group

A logical indicating whether reference_samples contains the names of samples (FALSE) or of group(s) (TRUE).

group_column

Only relevant if ref_as_group = TRUE. Which column should be used for grouping reference and non-reference samples? Usually group_column = Group. Uses args_data_masking.

na_as_zero

Should NA be replaced with 0 prior to calculation? Under the hood, filter_cv calculates the CV as stats::sd(..., na.rm = TRUE) / mean(..., na.rm = TRUE). If na_as_zero = FALSE and, for a specific feature, 2 of the 3 samples used to calculate the CV are NA, only one value remains; the standard deviation of a single value is NA, so the CV for that feature is NA as well. This might lead to problems, so na_as_zero = TRUE is the safer pick. Regardless of this setting, zeros are replaced with NA after the calculation.
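
The difference between the two settings can be reproduced with base R alone. The following is a sketch with a made-up feature vector; it mirrors the sd/mean expression quoted above but is not the package's actual implementation.

```r
# Hypothetical feature measured in 3 reference samples, 2 of them missing
x <- c(100, NA, NA)

# Like na_as_zero = FALSE: only one value remains after dropping NA,
# and the standard deviation of a single value is NA
cv_na <- stats::sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

# Like na_as_zero = TRUE: NA is replaced with 0 before the calculation
x_zero <- ifelse(is.na(x), 0, x)
cv_zero <- stats::sd(x_zero, na.rm = TRUE) / mean(x_zero, na.rm = TRUE)
```

Here cv_na is NA, whereas cv_zero is about 1.73, which is well above a max_cv of 0.2.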

References

Coefficient of Variation on Wikipedia

Examples

# Example 1: Define reference samples by sample names
toy_metaboscape %>%
  filter_cv(max_cv = 0.2, reference_samples = c("QC1", "QC2", "QC3"))

# Example 2: Define reference samples by group name
toy_metaboscape %>%
  join_metadata(toy_metaboscape_metadata) %>%
  filter_cv(max_cv = 0.2, reference_samples = "QC", ref_as_group = TRUE, group_column = Group)
