HitEnrichDiff: Plot differences between hit enrichment curves

Description

Plot differences between hit enrichment curves based on scores from multiple algorithms. Actual activities are required. Additionally plot simultaneous confidence bands for these differences. Plots may be used to determine if one algorithm is "better" than another algorithm.
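
As a rough illustration of the quantity being plotted, the sketch below computes two hit enrichment curves in base R and takes their difference. It assumes that a hit enrichment curve is the fraction of all actives recovered as a function of the fraction of compounds tested, and that higher scores are tested first. The helper hit_enrich and the simulated data are hypothetical and are not part of the package; confidence bands are not computed here.

```r
# Hypothetical helper (not part of the package): fraction of all actives
# recovered among the top-k scored compounds, for k = 1, ..., n.
# Assumes higher scores are tested first; ties keep their original order.
hit_enrich <- function(score, y) {
  ord <- order(score, decreasing = TRUE)  # rank compounds by score
  cumsum(y[ord]) / sum(y)                 # fraction of actives found so far
}

set.seed(1)
n <- 200
y <- rbinom(n, size = 1, prob = 0.1)      # simulated 0/1 activities
score_a <- y + rnorm(n)                   # scores related to activity
score_b <- rnorm(n)                       # scores unrelated to activity

frac_tested <- (1:n) / n                  # x axis: fraction of compounds tested
curve_diff <- hit_enrich(score_a, y) - hit_enrich(score_b, y)

# Both curves reach 1 once every compound has been tested, so the
# difference at the final fraction is 0.
```

Positive values of curve_diff indicate fractions at which the first set of scores recovers more actives than the second.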

Usage

HitEnrichDiff(
  S.df,
  labels = NULL,
  y,
  x.max = NULL,
  log = TRUE,
  title = "",
  conf.level = 0.95,
  method = "sup-t",
  plus = TRUE,
  band.frac = NULL,
  yrange = NULL
)
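
A minimal sketch of a call on simulated data follows. The data, the column names algo1 and algo2, and the labels are made up for illustration. The call is guarded so that the code only invokes HitEnrichDiff if the function is already available in the session; loading the package that provides it is left to the reader.

```r
set.seed(42)
n <- 500
y <- rbinom(n, size = 1, prob = 0.1)   # simulated 0/1 activities

# Simulated scores from two hypothetical algorithms, one column each
S.df <- data.frame(
  algo1 = y + rnorm(n),                # scores related to activity
  algo2 = rnorm(n)                     # scores unrelated to activity
)

# Only call the function if it is available in the current session
if (exists("HitEnrichDiff", mode = "function")) {
  HitEnrichDiff(
    S.df,
    labels = c("Algorithm 1", "Algorithm 2"),
    y = y,
    title = "Simulated example"
  )
}
```

All other arguments are left at the defaults shown in the usage above.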

Arguments

S.df

Data frame where variables are numeric scores from at least 2 different algorithms. Rows represent unique compounds.

labels

Character vector of labels for the different algorithms in S.df. If NULL (the default), variable names in S.df will be used.

y

Numeric vector of activity values. Activity values must be either 0 (inactive/undesirable) or 1 (active/desirable); no other values are accepted. Compounds are assumed to be in the same order as in S.df.

x.max

Integer, the maximum number of tests allowed on the x axis.

log

Logical. TRUE plots the x axis on a log scale.

title

Character string, the title of the plot.

conf.level

Numeric, the confidence coefficient for the simultaneous confidence bands.

method

Character string indicating the method used to obtain confidence bands. The default is "sup-t"; other options (not recommended) are "theta-proj" and "bonf".

plus

Logical. TRUE uses the plus-adjusted version of method.

band.frac

Numeric vector of fractions tested, used in obtaining confidence bands. The vector should be no longer than y and should have at least 20 entries. Entries should be in (0,1]. It is recommended that entries correspond to between 1 and x.max tests.

yrange

Numeric vector of length 2. The desired range for the y axis.
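
The constraints stated for S.df, y and band.frac can be checked before calling the function. The sketch below is one way to do so in base R. The helper check_inputs is hypothetical and is not part of the package, and it treats the "should" recommendations for band.frac as hard requirements, which is stricter than the documentation demands.

```r
# Hypothetical helper (not part of the package): check inputs against
# the constraints described in the argument list above.
check_inputs <- function(S.df, y, band.frac = NULL) {
  stopifnot(
    is.data.frame(S.df),
    ncol(S.df) >= 2,                            # at least 2 algorithms
    all(vapply(S.df, is.numeric, logical(1))),  # scores must be numeric
    is.numeric(y),
    length(y) == nrow(S.df),                    # one activity per compound
    all(y %in% c(0, 1))                         # activities must be 0 or 1
  )
  if (!is.null(band.frac)) {
    stopifnot(
      is.numeric(band.frac),
      length(band.frac) >= 20,                  # at least 20 entries
      length(band.frac) <= length(y),           # no longer than y
      all(band.frac > 0 & band.frac <= 1)       # entries in (0, 1]
    )
  }
  invisible(TRUE)
}

set.seed(7)
y <- rbinom(100, size = 1, prob = 0.2)          # simulated activities
S.df <- data.frame(algo1 = rnorm(100), algo2 = rnorm(100))

# 25 fractions corresponding to between 1 and 50 tests out of 100 compounds
band.frac <- unique(round(seq(1, 50, length.out = 25))) / length(y)
ok <- check_inputs(S.df, y, band.frac)
```

Building band.frac from whole numbers of tests divided by length(y) keeps every entry on the grid of observable fractions.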

Details

By default, x.max is length(y), so that hit enrichment curves are obtained for all observable fractions, i.e., fractions of (1:length(y))/length(y). By default, confidence bands are evaluated based on a smaller grid of 40 fractions. This smaller grid is evenly spaced on either the original grid of (1:length(y))/length(y), or the log scale of the original grid.
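
The two grids described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's actual implementation: the helper band_grid, its argument m, and the choice to round to whole numbers of tests are all hypothetical.

```r
n <- 1000                    # stands in for length(y)
full_grid <- (1:n) / n       # all observable fractions

# Hypothetical helper (not part of the package): choose about m test
# counts between 1 and x.max, evenly spaced on either the original or
# the log scale, and convert them to fractions tested.
band_grid <- function(n, x.max = n, m = 40, log = TRUE) {
  k <- if (log) {
    exp(seq(log(1), log(x.max), length.out = m))  # even spacing on log scale
  } else {
    seq(1, x.max, length.out = m)                 # even spacing on original scale
  }
  unique(round(k)) / n       # snap to whole tests and drop duplicates
}

grid_lin <- band_grid(n, log = FALSE)
grid_log <- band_grid(n, log = TRUE)
```

In this sketch, rounding on the log scale collapses several small test counts into the same value, so grid_log can have fewer than 40 entries; the package may handle that case differently.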