This function takes statistical testing results (obtained using testRobustToNAimputation or moderTest2grp, based on limma) and calculates specificity and sensitivity values for plotting ROC-curves along a panel of thresholds.
Based on the annotation (from test$annot), using the user-defined column for species (argument 'annotCol') and the species labels (argument 'spec'), the counts of TP (true positives), FP (false positives), FN (false negatives) and TN (true negatives) are determined.
In addition, an optional plot may be produced.
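The counting step described above can be illustrated in base R for one single threshold. This is only a minimal sketch and not the package's implementation: the toy data and the names `species`, `adjP`, `alpha`, `expected`, `called` and `tab` are assumptions made for this illustration, and whether the comparison with the threshold is strict or not is also an assumption.

```r
# Toy data (hypothetical): one species label and one adjusted p-value per protein
set.seed(2019)
species <- c(rep("b", 35), letters[sample.int(n=3, size=150, replace=TRUE)])
adjP <- c(runif(35, 0, 0.01), runif(150))

spec <- c("a", "b", "c")    # 1st entry = constant species, others = spike-ins
alpha <- 0.05               # single threshold, chosen for this sketch

# Expected class from annotation, observed class from the test result
expected <- factor(ifelse(species == spec[1], "constant", "spike-in"),
  levels=c("constant", "spike-in"))
called <- factor(ifelse(adjP <= alpha, "signif", "notSignif"),
  levels=c("notSignif", "signif"))

# Cross-tabulation: rows = expectation, columns = test outcome
tab <- table(expected, called)
tab
```

In this table the cell spike-in/signif corresponds to TP, constant/signif to FP, spike-in/notSignif to FN and constant/notSignif to TN.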
summarizeForROC(
test,
useComp = 1,
tyThr = "BH",
thr = NULL,
columnTest = NULL,
FCthrs = NULL,
spec = c("H", "E", "S"),
annotCol = "Species",
filterMat = "filter",
batchMode = FALSE,
tit = NULL,
color = 1,
plotROC = TRUE,
pch = 1,
bg = NULL,
overlPlot = FALSE,
silent = FALSE,
debug = FALSE,
callFrom = NULL
)
This function returns a numeric matrix containing the columns 'alph', 'spec', 'sens', 'prec', 'accur', 'FD' plus two columns with absolute numbers of lines (genes/proteins) passing the current threshold level alpha (one column for the 1st species, one for all other species).
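For orientation, the textbook definitions behind the columns 'spec', 'sens', 'prec' and 'accur' can be written as a small base-R helper. The function name `rocMetrics` and the example counts are hypothetical, and the package may compute or round these values differently; the column 'FD' is left out here because its exact definition is not stated in this page.

```r
# Hypothetical helper: standard metrics from confusion-matrix counts
rocMetrics <- function(TP, FP, FN, TN) {
  c(spec = TN / (TN + FP),                      # specificity
    sens = TP / (TP + FN),                      # sensitivity (recall)
    prec = TP / (TP + FP),                      # precision
    accur = (TP + TN) / (TP + FP + FN + TN))    # accuracy
}

# Made-up counts, only for illustration
m <- rocMetrics(TP=40, FP=5, FN=10, TN=45)
m
```

With these made-up counts the specificity is 45/50, the sensitivity 40/50, the precision 40/45 and the accuracy 85/100.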
test
(list or class MArrayLM, S3-object from limma) from testing (eg testRobustToNAimputation or test2grp)
useComp
(character or integer) in case of multiple comparisons (ie multiple columns in 'test$tyThr'): which pairwise comparison to use
tyThr
(character, length=1) type of statistical test-result to be used for sensitivity and specificity calculations (eg 'BH', 'lfdr' or 'p.value'), must be a list-element of 'test'
thr
(numeric) stat test (FDR/p-value) threshold; if NULL, a panel of 108 p-value threshold-levels will be used for calculating specificity and sensitivity
columnTest
deprecated, please use 'useComp' instead
FCthrs
(numeric) Fold-Change threshold (displayed as line), given as fold-change and NOT as log2(FC); default at 1.5, set to NA for omitting
spec
(character) labels for those species which should be matched to column annotCol ('spec') of test$annot and used for sensitivity and specificity calculations. Important: the 1st entry is for the species designated as constant (ie matrix) and subsequent labels are for spike-ins (expected variable)
annotCol
(character, length=1) column name of test$annot to use to separate species
filterMat
(character) name (or index) of element of test containing matrix or vector of logical filtering results
batchMode
(logical) if batchMode=TRUE the function will return an empty matrix if no proteins qualify for computing ROC (eg all spike-proteins not passing filters), and plotROC will be set to FALSE
tit
(character) optional custom title in graph
color
(character or integer) color in graph
plotROC
(logical) toggle plot on or off
pch
(integer) type of symbol to be used (see par)
bg
(character) background in plot (see par)
overlPlot
(logical) overlay to existing plot if TRUE
silent
(logical) suppress messages
debug
(logical) additional messages for debugging
callFrom
(character) allows easier tracking of messages produced
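Since the fold-change threshold is given on linear scale and not as log2(FC), the relation between the two scales may be worth a short base-R illustration. The vector `log2Ratio` and the symmetric treatment of up- and down-regulation are assumptions of this sketch; how the function itself applies or displays the threshold is not described here.

```r
FCthrs <- 1.5                  # linear fold-change, as expected by the argument
log2(FCthrs)                   # same threshold on log2 scale, roughly 0.585

# Hypothetical log2 ratios, converted to linear fold-change (either direction)
log2Ratio <- c(-1.2, -0.3, 0.1, 0.7)
linFC <- 2^abs(log2Ratio)
passes <- linFC >= FCthrs      # which values reach the threshold
passes
```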
Determining TP and FP counts requires 'ground truth' experiments, where it is known in advance which proteins are expected to change abundance between two groups of samples. Typically this is done by mixing proteins of different species origin: the first species noted by argument 'spec' designates the species to be considered constant (expected as TN in statistical tests). Then, one or multiple additional spike-in species can be defined. As the spike-in concentration should have been altered between different groups of samples, they are expected as TP.
The main aim of this function consists in providing specificity and sensitivity values, plus counts of TP (true positives), FP (false positives), FN (false negatives) and TN (true negatives), along various thresholds (specified in column 'alph') for statistical tests performed prior to calling this function.
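The idea of following the counts along a panel of thresholds can be sketched in base R as shown here. This is not the package's code: the toy data, the short threshold panel `alph` (much shorter than the panel used by the function) and all helper names are assumptions made for this illustration.

```r
# Toy data (hypothetical): species label and adjusted p-value per protein
set.seed(2019)
species <- c(rep("b", 35), letters[sample.int(n=3, size=150, replace=TRUE)])
adjP <- c(runif(35, 0, 0.01), runif(150))
isSpike <- species %in% c("b", "c")          # "a" is taken as constant species

alph <- c(0.001, 0.01, 0.05, 0.1, 0.5, 1)    # hypothetical panel of thresholds

# One row per threshold: counts plus sensitivity and specificity
roc <- t(sapply(alph, function(a) {
  sig <- adjP <= a
  TP <- sum(isSpike & sig);  FP <- sum(!isSpike & sig)
  FN <- sum(isSpike & !sig); TN <- sum(!isSpike & !sig)
  c(alph=a, TP=TP, FP=FP, FN=FN, TN=TN, sens=TP/(TP+FN), spec=TN/(TN+FP))
}))
roc
```

Plotting 1 - spec against sens from such a matrix gives the points of a ROC curve; with a very permissive threshold all lines pass, so sensitivity reaches 1 while specificity drops to 0.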
Note that the choice of species-annotation plays a crucial role in how the counting results are obtained. In case of multiple spike-in species the user should pay attention to whether they are all expected to change abundance at the same ratio. If not, it is advised to run this function multiple times separately, each time only with the subset of those species expected to change at the same ratio.
The dot on the plotted curve shows the results at the level of the single threshold alpha=0.05.
For plotting multiple ROC curves as overlay and additional graphical parameters/options you may use plotROC.
See also ROC on Wikipedia for explanations of TP, FP, FN and TN as well as examples. Note that numerous other packages also provide support for building and plotting ROC-curves: eg rocPkgShort, ROCR, pROC or ROCit.
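library(wrProteo)
set.seed(2019)
# toy test-result: 35 lines of species "b" with low BH values, then 150 lines randomly assigned to "a", "b" or "c"
test1 <- list(annot=cbind(Species=c(rep("b",35), letters[sample.int(n=3, size=150, replace=TRUE)])),
  BH=matrix(c(runif(35,0,0.01), runif(150)), ncol=1))
tail(roc1 <- summarizeForROC(test1, spec=c("a","b","c"), annotCol="Species"))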