Plot hit enrichment curves based on scores from multiple algorithms. Actual activities are required. Additionally plot the ideal hit enrichment curve that would result under perfect scoring, and the hit enrichment curve that would result under random scoring. Optionally, simultaneous confidence bands may also be requested.
HitEnrich(
S.df,
labels = NULL,
y,
x.max = NULL,
log = TRUE,
title = "",
conf = FALSE,
conf.level = 0.95,
method = "sup-t",
plus = TRUE,
band.frac = NULL
)

S.df: Data frame where variables are numeric scores from different algorithms. Rows represent unique compounds.
labels: Character vector of labels for the different algorithms in
S.df. If missing, variable names in S.df will be used.
y: Numeric vector of activity values. Activity values must be either 0
(inactive/undesirable) or 1 (active/desirable); no other values are
accepted. Compounds are assumed to be in the same order as in S.df.
x.max: Integer, the maximum number of tests allowed on the x axis.
log: Logical. TRUE plots the x axis on a log scale.
title: Character string, the plot title.
conf: Logical. TRUE plots (simultaneous) confidence bands for
all hit enrichment curves.
conf.level: Numeric, the confidence coefficient for the bands.
method: Character string indicating the method used to obtain confidence bands.
The default is "sup-t"; the other options (not recommended) are
"theta-proj" and "bonf".
plus: Logical. TRUE uses the plus-adjusted version of method.
band.frac: Numeric vector of fractions tested to be used in obtaining
confidence bands. The vector should be no longer than y and should have
at least 20 entries. Entries should be in (0,1]. It is recommended that
entries be consistent with between 1 and x.max tests.

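
To make the roles of S.df and y concrete, here is a minimal base R sketch of how observed, ideal, and random hit enrichment curves could be computed by hand. It is not the package's implementation: the toy data, the column names algA and algB, and the helper hit_curve are hypothetical, and it assumes the curve reports the fraction of actives recovered among the top-ranked compounds, with no tied scores.

```r
# Hypothetical toy data: 10 compounds scored by two made-up algorithms.
S.df <- data.frame(
  algA = c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05),
  algB = c(0.2, 0.9, 0.1, 0.8, 0.3, 0.7, 0.4, 0.6, 0.5, 0.05)
)
y <- c(1, 1, 0, 1, 0, 0, 0, 0, 0, 0)  # 1 = active, 0 = inactive

# Input checks mirroring the argument descriptions: y is 0/1 only and
# is in the same order (and of the same length) as the rows of S.df.
stopifnot(all(y %in% c(0, 1)), length(y) == nrow(S.df))

# Hypothetical helper (not part of the package): fraction of all actives
# found after testing the top-k compounds, ranked by descending score.
hit_curve <- function(score, y) {
  cumsum(y[order(score, decreasing = TRUE)]) / sum(y)
}

n <- length(y)
frac.tested <- seq_len(n) / n                   # observable fractions tested
observed <- sapply(S.df, hit_curve, y = y)      # one column per algorithm
ideal    <- pmin(seq_len(n), sum(y)) / sum(y)   # perfect scoring: actives first
random   <- frac.tested                         # expected recall under random order
```

Each column of observed can then be compared with ideal and random at the same values of frac.tested; a better algorithm sits closer to the ideal curve.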
By default, x.max is length(y), so that hit enrichment
curves are obtained for all observable fractions, i.e., fractions of
(1:length(y))/length(y). By default, confidence bands are evaluated
based on a smaller grid of 40 fractions. This smaller grid is evenly spaced
on either the original grid of (1:length(y))/length(y), or the log
scale of the original grid.
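
The two kinds of reduced grid described above can be illustrated with a short base R sketch. This is only an assumption about how such a grid might be built, not the package's internal code; n, idx.lin, idx.log, and the resulting vectors are hypothetical names.

```r
n <- 200                      # hypothetical number of compounds, i.e. length(y)
full.grid <- seq_len(n) / n   # all observable fractions tested

# Up to 40 indices evenly spaced on the original scale.
idx.lin <- unique(round(seq(1, n, length.out = 40)))

# Up to 40 indices evenly spaced on the log scale. Rounding to whole
# compounds can create duplicates among the smallest indices, so the
# result may have fewer than 40 distinct entries.
idx.log <- unique(round(exp(seq(log(1), log(n), length.out = 40))))

band.frac.lin <- full.grid[idx.lin]
band.frac.log <- full.grid[idx.log]
```

Either vector satisfies the constraints stated for band.frac (entries in (0,1], no longer than y, at least 20 entries for this value of n), so a grid built this way could be supplied in place of the default.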