NOISeq (version 2.16.0)

FilterLowCounts: Methods to filter out low count features

Description

Function to filter out low-count features using one of three methods.

Usage

filtered.data(dataset, factor, norm = TRUE, depth = NULL, method = 1, cv.cutoff = 100, cpm = 1, p.adj = "fdr")

Arguments

dataset
Matrix or data.frame containing the expression values for each sample (columns) and feature (rows).
factor
Vector or factor indicating which condition each sample (column) in dataset belongs to.
norm
Logical value indicating whether the data are already normalized (TRUE) or not (FALSE).
depth
Sequencing depth of samples (column totals before normalizing the data). Depth only needs to be provided when method = 3 and norm = TRUE.
method
Filtering method; must be one of 1, 2 or 3.
Method 1 (CPM) removes features whose average expression per condition is below the cpm value and whose coefficient of variation per condition is above cv.cutoff (in percentage), in all the conditions.
Method 2 (Wilcoxon) performs a Wilcoxon test per condition and feature, with the null hypothesis that the median expression is 0 and the alternative that the median is greater than 0. Features with a p-value greater than 0.05 in all the conditions are removed.
Method 3 (Proportion test) performs a proportion test on the counts per condition and feature (or on pseudo-counts if the data were normalized), with the null hypothesis that the feature's relative expression (count proportion) equals cpm/10^6 and the alternative that it is higher than cpm/10^6. Features with a p-value greater than 0.05 in all the conditions are removed.
cv.cutoff
Cutoff for the coefficient of variation per condition to be used in method 1 (in percentage).
cpm
Cutoff for the counts per million value to be used in methods 1 and 3.
p.adj
Method for the multiple testing correction. The same methods as in the p.adjust function in stats package can be chosen: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
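
The Method 1 criterion described under the method argument can be illustrated in base R. This is only a sketch of a literal reading of that description, not NOISeq's actual implementation. The toy matrix, the library sizes in lib_size, the names cpm_cut and cv_cut, and the choice to treat an all-zero condition as maximally variable are all assumptions made for illustration.

```r
# Hypothetical toy counts: 4 features x 4 samples, two conditions
counts <- matrix(c( 0,  1,  0,  2,
                   50, 60, 55, 65,
                    0,  0, 40, 45,
                    1,  0,  0,  1),
                 ncol = 4, byrow = TRUE,
                 dimnames = list(paste0("f", 1:4), paste0("s", 1:4)))
cond <- c("cond1", "cond1", "cond2", "cond2")

# Assumed sequencing depths, as if these rows were a slice of a larger library
lib_size <- rep(2e6, 4)
cpm_cut <- 1     # plays the role of the cpm argument
cv_cut <- 100    # plays the role of the cv.cutoff argument (in percentage)

# Counts per million: divide each column by its library size
cpm_mat <- sweep(counts, 2, lib_size, "/") * 1e6

# For each condition, flag features with low mean CPM and high CV
low_in_cond <- sapply(unique(cond), function(g) {
  x <- cpm_mat[, cond == g, drop = FALSE]
  m <- rowMeans(x)
  s <- apply(x, 1, sd)
  # Assumption: a condition with mean 0 counts as exceeding any CV cutoff
  cv <- ifelse(m > 0, 100 * s / m, Inf)
  m < cpm_cut & cv > cv_cut
})

# A feature is dropped only if it is flagged in every condition
remove <- rowSums(low_in_cond) == ncol(low_in_cond)
kept <- rownames(counts)[!remove]
kept
```

Under these assumptions, a feature that is expressed in only one condition (f3 here) survives, because it fails the low-count test in that condition.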

Examples

library(NOISeq)

## Simulate some count data (500 features x 4 samples)
set.seed(123)  # arbitrary seed, only to make the simulation reproducible
datasim = matrix(sample(0:100, 2000, replace = TRUE), ncol = 4)

## Two conditions with two samples each
myfactor = c("cond1", "cond1", "cond2", "cond2")

## Filtering low counts (method 1)
myfilt1 = filtered.data(datasim, factor = myfactor, norm = FALSE, depth = NULL, method = 1, cv.cutoff = 100, cpm = 1)

## Filtering low counts (method 2)
myfilt2 = filtered.data(datasim, factor = myfactor, norm = FALSE, method = 2)

## Filtering low counts (method 3)
myfilt3 = filtered.data(datasim, factor = myfactor, norm = FALSE, method = 3, cpm = 1)
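
To make the idea behind method 3 more concrete, the following base R sketch tests whether each feature's count proportion exceeds cpm/10^6 within a single condition and then corrects the p-values. It uses prop.test and p.adjust from the stats package as stand-ins; whether NOISeq pools the samples of a condition this way, and whether the adjusted p-values are the ones compared against 0.05, are assumptions. The toy counts and depths are hypothetical.

```r
# Hypothetical toy data: 3 features x 2 samples from one condition
counts <- matrix(c(30, 35,
                    4,  5,
                    0,  1),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("high", "borderline", "low"),
                                 c("s1", "s2")))

# Assumed sequencing depths (column totals of the full raw data, not of this toy slice)
depth <- c(5e6, 5e6)
cpm_cut <- 1   # plays the role of the cpm argument

# One-sided proportion test per feature, pooling the samples of the condition:
# H0: proportion = cpm/10^6, H1: proportion > cpm/10^6
pvals <- apply(counts, 1, function(x) {
  prop.test(sum(x), sum(depth), p = cpm_cut / 1e6,
            alternative = "greater")$p.value
})

# Multiple testing correction, as selected through p.adj ("fdr" is an alias of "BH")
padj <- p.adjust(pvals, method = "fdr")

# Features not significant at 0.05 would be removal candidates for this condition
flagged <- names(padj)[padj > 0.05]
flagged
```

With real data, depth would typically be obtained as colSums() of the raw count matrix before normalization, which is the quantity the depth argument expects when method = 3 and norm = TRUE.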