summarizeForROC: Summarize statistical test result for plotting ROC-curves

Description

This function takes statistical testing results (obtained using testRobustToNAimputation or moderTest2grp, based on limma) and calculates specificity and sensitivity values for plotting ROC-curves along a panel of thresholds. Based on the annotation (from test$annot), using the user-defined column for species (argument 'annotCol') and the species labels (argument 'spec'), the counts of TP (true positives), FP (false positives), FN (false negatives) and TN (true negatives) are determined. In addition, an optional plot may be produced.

Usage

summarizeForROC(
  test,
  useComp = 1,
  tyThr = "BH",
  thr = NULL,
  columnTest = NULL,
  FCthrs = NULL,
  spec = c("H", "E", "S"),
  annotCol = "Species",
  filterMat = "filter",
  batchMode = FALSE,
  tit = NULL,
  color = 1,
  plotROC = TRUE,
  pch = 1,
  bg = NULL,
  overlPlot = FALSE,
  silent = FALSE,
  debug = FALSE,
  callFrom = NULL
)

Value

This function returns a numeric matrix containing the columns 'alph', 'spec', 'sens', 'prec', 'accur', 'FD' plus two columns with the absolute numbers of lines (genes/proteins) passing the current threshold level alpha (one for the 1st species, one for all other species).

Arguments

test: (list or class MArrayLM, S3-object from limma) results from testing (eg testRobustToNAimputation or test2grp)
useComp: (character or integer) in case of multiple comparisons (ie multiple columns in 'test$tyThr'): which pairwise comparison to use
tyThr: (character,length=1) type of statistical test-result to be used for sensitivity and specificity calculations (eg 'BH','lfdr' or 'p.value'), must be list-element of 'test'
thr: (numeric) threshold for the statistical test result (FDR/p-value); if NULL, a panel of 108 p-value threshold levels will be used for calculating specificity and sensitivity
columnTest: deprecated, please use 'useComp' instead
FCthrs: (numeric) Fold-Change threshold (displayed as line), given as Fold-Change and NOT as log2(FC), default at 1.5, set to NA for omitting
spec: (character) labels for those species which should be matched to column annotCol of test$annot and used for sensitivity and specificity calculations. Important: the 1st entry designates the species considered constant (ie matrix), subsequent labels designate spike-ins (expected to vary)
annotCol: (character, length=1) column name of test$annot to use to separate species
filterMat: (character) name (or index) of element of test containing matrix or vector of logical filtering results
batchMode: (logical) if batchMode=TRUE the function will return an empty matrix if no proteins qualify for computing ROC (eg all spike-proteins not passing filters), and plotROC will be set to FALSE
tit: (character) optional custom title in graph
color: (character or integer) color in graph
plotROC: (logical) toggle plot on or off
pch: (integer) type of symbol to be used (see par)
bg: (character) background in plot (see par)
overlPlot: (logical) overlay to existing plot if TRUE
silent: (logical) suppress messages
debug: (logical) additional messages for debugging
callFrom: (character) allows easier tracking of messages produced

Details

Determining TP and FP counts requires 'ground truth' experiments, where it is known in advance which proteins are expected to change abundance between two groups of samples. Typically this is done by mixing proteins of different species origin: the first species given in argument 'spec' designates the species to be considered constant (its proteins are expected to be negatives, ie TN, in statistical tests). Then, one or multiple additional spike-in species can be defined. As the spike-in concentration should have been altered between the different groups of samples, these proteins are expected as TP.

The main aim of this function is to provide specificity and sensitivity values, plus counts of TP (true positives), FP (false positives), FN (false negatives) and TN (true negatives), along various thresholds (specified in column 'alph') for statistical tests performed prior to calling this function.
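The counting described above can be illustrated with a minimal base-R sketch. It is not the internal code of summarizeForROC: the toy data, the helper name rocCounts and the choice of '<=' for comparing against the threshold are assumptions made for illustration only.

```r
set.seed(2019)
# Toy data (hypothetical): 100 constant proteins of species "H", 50 spike-in proteins of species "E"
species <- c(rep("H", 100), rep("E", 50))
# Constant proteins get uniform adjusted p-values, spike-ins get small ones
adjP <- c(runif(100), runif(50, 0, 0.05))
isSpike <- species != "H"            # the 1st species is considered constant

# Hypothetical helper: counts and derived rates at one threshold
rocCounts <- function(alpha, p, spike) {
  called <- p <= alpha               # lines passing the threshold (assumed '<=')
  TP <- sum(called & spike)
  FP <- sum(called & !spike)
  FN <- sum(!called & spike)
  TN <- sum(!called & !spike)
  c(alph = alpha, TP = TP, FP = FP, FN = FN, TN = TN,
    sens = TP / (TP + FN), spec = TN / (TN + FP))
}

thr <- c(0.001, 0.01, 0.05, 0.1, 0.5, 1)
rocTab <- t(sapply(thr, rocCounts, p = adjP, spike = isSpike))
rocTab
```

Each row corresponds to one threshold level; sensitivity can only grow and specificity can only shrink as the threshold is relaxed.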

Note that the choice of species-annotation plays a crucial role in how the counting results are obtained. In case of multiple spike-in species the user should check whether they are all expected to change abundance at the same ratio. If not, it is advised to run this function multiple times separately, each time only with the subset of those species expected to change at the same ratio.
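A small base-R sketch may help to see why spike-in species with different ratios are better evaluated separately. All data and the helper sensSpecAt are hypothetical; the sketch only mimics the idea of restricting the analysis to the species listed in 'spec', with the 1st entry taken as constant, and does not call summarizeForROC.

```r
set.seed(2019)
# Hypothetical toy data: "H" constant, "E" strongly changed, "S" weakly changed
species <- rep(c("H", "E", "S"), times = c(100, 40, 40))
adjP <- c(runif(100), runif(40, 0, 0.02), runif(40, 0, 0.5))

# Hypothetical helper: sensitivity and specificity at one alpha for a subset of species
sensSpecAt <- function(useSpec, alpha = 0.05) {
  keep <- species %in% useSpec              # ignore lines of other species
  spike <- species[keep] != useSpec[1]      # 1st entry = constant species
  called <- adjP[keep] <= alpha
  c(sens = sum(called & spike) / sum(spike),
    spec = sum(!called & !spike) / sum(!spike))
}

sensTab <- rbind(H_vs_E  = sensSpecAt(c("H", "E")),
                 H_vs_S  = sensSpecAt(c("H", "S")),
                 H_vs_ES = sensSpecAt(c("H", "E", "S")))
sensTab
```

In this toy setting the pooled run (H_vs_ES) reports a sensitivity lying between the two separate runs, which hides the fact that one spike-in species is detected much better than the other.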

The dot on the plotted curve shows the results at the level of the single threshold alpha=0.05. For plotting multiple ROC curves as overlay and additional graphical parameters/options you may use plotROC.
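For readers who want to see what such a curve looks like without running the package, here is a minimal base-graphics sketch on toy data. It is only an illustration under assumed inputs (the vectors adjP and spike are invented) and does not reproduce the exact layout produced by summarizeForROC or plotROC.

```r
set.seed(2019)
# Hypothetical toy input: adjusted p-values and a flag marking spike-in proteins
adjP <- c(runif(100), runif(50, 0, 0.05))
spike <- rep(c(FALSE, TRUE), times = c(100, 50))

alph <- seq(0, 1, by = 0.01)                   # panel of threshold levels
sens <- sapply(alph, function(a) sum(adjP <= a & spike) / sum(spike))
spec <- sapply(alph, function(a) sum(adjP > a & !spike) / sum(!spike))

# ROC curve: sensitivity against 1 - specificity
plot(1 - spec, sens, type = "l", xlim = c(0, 1), ylim = c(0, 1),
  xlab = "1 - specificity", ylab = "sensitivity", main = "Toy ROC curve")
abline(0, 1, lty = 2, col = "grey")            # expectation for random calls
at05 <- which.min(abs(alph - 0.05))            # threshold level closest to alpha = 0.05
points(1 - spec[at05], sens[at05], pch = 16)   # dot marking alpha = 0.05
```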

See also ROC on Wikipedia for explanations of TP, FP, FN and TN as well as examples. Note that numerous other packages also provide support for building and plotting ROC-curves, eg rocPkgShort, ROCR, pROC or ROCit

Examples

library(wrProteo)
set.seed(2019)
## toy test-result: 35 lines of species "b", then 150 lines drawn at random from "a", "b" and "c"
test1 <- list(annot=cbind(Species=c(rep("b",35), letters[sample.int(n=3, size=150, replace=TRUE)])),
  BH=matrix(c(runif(35,0,0.01), runif(150)), ncol=1))
tail(roc1 <- summarizeForROC(test1, spec=c("a","b","c"), annotCol="Species"))