wrMisc (version 2.0.0)

presenceFilt: Filter Lines Of Matrix For Max Number Of NAs

Description

This function produces a logical matrix to be used as a filter for lines of 'dat', based on sufficient presence of non-NA values (i.e. it limits the number of NAs per line). It filters abundance/expression data for a minimum number and/or ratio of non-NA values in at least 1 of multiple groups. This type of procedure is common in proteomics and transcriptomics, where an NA is often associated with quantitation below the detection limit.

Usage

presenceFilt(
  dat,
  grp,
  useComparison = NULL,
  maxGrpMiss = 1,
  ratMaxNA = 0.8,
  minVal = NULL,
  sep = NULL,
  silent = FALSE,
  debug = FALSE,
  callFrom = NULL
)

Value

This function returns a logical matrix (with a separate column for each pairwise combination of 'grp' levels) indicating whether each line of 'dat' is acceptable based on its NAs (and on values below minVal).
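To make "each pairwise combination of 'grp' levels" concrete, the following base-R sketch enumerates the pairs for three hypothetical groups. It does not call presenceFilt; the '-' separator is taken from the useComparison description, and the actual column names and their order in the returned matrix may differ.

```r
# Illustration only (base R, not wrMisc code): enumerate pairwise group combinations
grpLevels <- c("B", "C", "D")                       # hypothetical group names
pairNames <- combn(grpLevels, 2, paste, collapse = "-")
pairNames                                           # one filter column would be expected per pair
```

With three groups this gives three pairs, so the logical matrix would be expected to have three columns.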

Arguments

dat

matrix or data.frame (abundance or expression-values which may contain some NAs).

grp

factor with at least 2 levels describing which column of 'dat' belongs to which group (levels 1 & 2 will be used)

useComparison

(character or matrix) optional argument to specify which pairwise comparisons should be performed; the default useComparison=NULL will run all pairwise comparisons. May be a character vector combining two group names (from argument grp) separated by a '-' (e.g. 'A-B'), or a matrix whose rownames designate the elements to be compared pairwise. Note: the names of the groups may not contain any '-', to avoid confusing them with the pairwise separator!

maxGrpMiss

(numeric) at least 1 group must have no more than this number of NAs (otherwise the line is marked as bad)

ratMaxNA

(numeric) at least 1 group must have a ratio of NA values below this limit

minVal

(numeric, default NULL) any value below this limit will be treated like NA

sep

(character) separator to be used when combining names of groups; relevant when useComparison is not given and all pairwise comparisons are performed

silent

(logical) suppress messages

debug

(logical) additional messages for debugging

callFrom

(character) allow easier tracking of messages produced
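The thresholds maxGrpMiss and ratMaxNA described above can be pictured with a small base-R sketch. This is not the wrMisc implementation: it assumes a group qualifies when both limits are met, and it returns a single logical vector across all groups rather than one column per pairwise comparison. The toy data are hypothetical.

```r
# Illustration only (base R); assumption: a group must satisfy both limits to qualify
dat <- matrix(c( 1, NA,  3,  4,
                NA, NA,  7,  8,
                NA, NA, NA, 12), nrow = 3, byrow = TRUE,
              dimnames = list(paste0("row", 1:3), c("A1", "A2", "B1", "B2")))
grp <- factor(c("A", "A", "B", "B"))
maxGrpMiss <- 0      # max number of NAs tolerated in the best group
ratMaxNA <- 0.8      # NA ratio must stay below this value

# Number of NAs per row within each group (rows x groups)
naCount <- sapply(levels(grp), function(g) rowSums(is.na(dat[, grp == g, drop = FALSE])))
# Fraction of NAs per row within each group
naRatio <- sweep(naCount, 2, as.vector(table(grp)[levels(grp)]), "/")

# Keep a row if at least one group meets both limits
keep <- rowSums(naCount <= maxGrpMiss & naRatio < ratMaxNA) > 0
keep
```

Here row1 and row2 are kept because group B is complete in both, while row3 is rejected because every group contains at least one NA.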

See Also

presenceGrpFilt; there are also other packages on CRAN and Bioconductor dedicated to filtering

Examples

library(wrMisc)

# 10 x 15 matrix of constant values; columns named by group (D, C, B) and replicate
mat <- matrix(rep(8,150), ncol=15, dimnames=list(NULL,
  paste0(rep(LETTERS[4:2],each=6),1:6)[c(1:5,7:16)]))
# introduce NAs in various patterns
mat[lower.tri(mat)] <- NA
mat[,15] <- NA
mat[c(2:3,9),14:15] <- NA
mat[c(1,10),13:15] <- NA
mat
# groups taken from the first letter of the column names
presenceFilt(mat, substr(colnames(mat),1,1))
# custom 2 groups: group 1 = D1 to C4, group 2 = C5 to B4
presenceFilt(mat, rep(1:2,c(9,6)))

# one more example
dat1 <- matrix(1:56, ncol=7)
dat1[c(2:6,10,12,18,19,20,22,23,26:28,30,31,34,38,39,50,54)] <- NA
grp3 <- letters[c(3,3,2,2,1,1,1)]
colnames(dat1) <- correctToUnique(grp3, sep="")
dat1
## At least one group without any NAs
presenceFilt(dat1, grp3, maxGrpMiss=0)
presenceFilt(dat1, grp=gl(2,4)[-1], maxGrpMiss=1, ratMaxNA=0.1)
presenceFilt(dat1, grp=gl(2,4)[-1], maxGrpMiss=2, ratMaxNA=0.5)