rise.screen: Perform the screening stage of RISE: Two-Stage Rank-Based Identification of High-Dimensional Surrogate Markers

Description

A set of high-dimensional surrogate candidates is screened one by one to identify strong candidates. Strength of surrogacy is assessed through a rank-based measure of the similarity between the treatment effect on a candidate surrogate and the treatment effect on the primary response. P-values from the hypothesis tests on this measure are corrected for the large number of tests performed.
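To make the measure concrete, here is a minimal base-R sketch for a single candidate. It is not the package's implementation. It assumes the treatment effect is summarised on the probability scale as P(treated > untreated) + 0.5 P(tie), and that delta is the difference between this quantity for the primary response and for the candidate. The helper name `prob.scale.effect` and the simulated data are hypothetical.

```r
# Hypothetical helper: estimate P(x1 > x0) + 0.5 * P(x1 == x0) over all
# treated/untreated pairs (the Mann-Whitney U statistic divided by n1 * n0).
prob.scale.effect <- function(x1, x0) {
  mean(outer(x1, x0, ">") + 0.5 * outer(x1, x0, "=="))
}

set.seed(1)
yone  <- rnorm(30, mean = 1)          # simulated primary response, treated
yzero <- rnorm(30, mean = 0)          # simulated primary response, untreated
sone  <- yone  + rnorm(30, sd = 0.5)  # simulated candidate that tracks the response
szero <- yzero + rnorm(30, sd = 0.5)

u.y   <- prob.scale.effect(yone, yzero)  # effect on the primary response
u.s   <- prob.scale.effect(sone, szero)  # effect on the candidate surrogate
delta <- u.y - u.s  # assumed definition: values near 0 suggest similar effects
```

Because only the ordering of observations enters the calculation, the measure is unchanged by monotone transformations of the response or the candidate.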

Usage

rise.screen(
  yone,
  yzero,
  sone,
  szero,
  alpha = 0.05,
  power.want.s = NULL,
  epsilon = NULL,
  u.y.hyp = NULL,
  p.correction = "BH",
  n.cores = 1,
  alternative = "less",
  paired = FALSE,
  return.all.screen = TRUE,
  return.all.weights = FALSE,
  weight.mode = "inverse.delta",
  normalise.weights = TRUE
)

Value

A list with elements:

screening.metrics: dataframe of screening results (for each candidate marker: number of observations n, u.y, u.s, delta, CI, sd, epsilon, p-values).
significant.markers: character vector of markers with p_adjusted < alpha.
screening.weights: dataframe giving marker names and their weights, computed according to weight.mode (by default, the inverse absolute value of the associated deltas).
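As an illustration of how a vector like significant.markers can arise from adjusted p-values, the sketch below applies base R's p.adjust() to made-up p-values and keeps the markers below alpha. The marker names and p-values are hypothetical, and this is not a reproduction of the package's internals.

```r
# Made-up raw p-values for four hypothetical markers
p.raw <- c(marker1 = 0.01, marker2 = 0.02, marker3 = 0.03, marker4 = 0.50)
alpha <- 0.05

# Benjamini-Hochberg adjustment with stats::p.adjust(); names are preserved
p.adjusted <- p.adjust(p.raw, method = "BH")

# Keep the markers whose adjusted p-value falls below alpha
significant.markers <- names(p.adjusted)[p.adjusted < alpha]
```

Any other method accepted by p.adjust() (for example "bonferroni" or "holm") can be substituted through the method argument.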

Arguments

yone: numeric vector of primary response values in the treated group.
yzero: numeric vector of primary response values in the untreated group.
sone: matrix or dataframe of surrogate candidates in the treated group with dimension n1 x p where n1 is the number of treated samples and p the number of candidates. Sample ordering must match exactly yone.
szero: matrix or dataframe of surrogate candidates in the untreated group with dimension n0 x p where n0 is the number of untreated samples and p the number of candidates. Sample ordering must match exactly yzero.
alpha: significance level for determining surrogate candidates. Default is 0.05.
power.want.s: numeric in (0,1) - power desired for a test of treatment effect based on the surrogate candidate. Either this or the epsilon argument must be specified.
epsilon: numeric in (0,1) - non-inferiority margin for determining surrogate validity. Either this or the power.want.s argument must be specified.
u.y.hyp: hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations.
p.correction: character. Method for p-value adjustment (see p.adjust() function). Defaults to the Benjamini-Hochberg method ("BH").
n.cores: numeric giving the number of cores to commit to parallel computation, which is performed through the pbmcapply package to reduce run time. Defaults to 1.
alternative: character giving the alternative hypothesis type. One of c("less", "two.sided"), where "less" corresponds to a non-inferiority test and "two.sided" corresponds to a two one-sided tests procedure. Default is "less".
paired: logical flag indicating whether the data are independent or paired. If FALSE (default), samples are assumed independent. If TRUE, samples are assumed to come from a paired design. The pairs are specified by matching the rows of yone and sone to the rows of yzero and szero.
return.all.screen: logical flag. If TRUE (default), a dataframe will be returned giving the screening results for all candidates. Otherwise, only the significant candidates will be returned.
return.all.weights: logical flag. If FALSE (default), a dataframe will be returned giving weights for significant markers screened. If TRUE, weights for all markers will be returned. Note that, if normalised weights are required, these will only be returned for significant markers, and raw weights will be returned in a second column.
weight.mode: character giving the type of weighting to return. One of c("inverse.delta", "diff.epsilon", "none"). The default is "inverse.delta", which means the weights are the inverse of the absolute values of delta. If delta is exactly 0, the inverse is undefined and the weight defaults to the inverse of the next closest absolute delta value. If delta is very close to 0, these weights can be unstable and extreme. The "diff.epsilon" option mitigates this by calculating weights as the proportion of the interval between 0 and epsilon cut by the absolute value of delta, so that delta = 0 receives a weight of 1 and delta = epsilon a weight of 0. When "none", the weights are set to 1 for every marker.
normalise.weights: logical flag. If TRUE (default), the weights are normalised by the sum of all the weights such that the maximum weight is 1, which can help with interpretability.
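
The three weight.mode options can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the deltas and epsilon are made up, normalisation is left out, and the "diff.epsilon" formula 1 - |delta| / epsilon is one reading of the description above.

```r
delta   <- c(m1 = 0.00, m2 = 0.01, m3 = -0.10, m4 = 0.20)  # made-up deltas
epsilon <- 0.25                                            # made-up margin

# "inverse.delta": 1 / |delta|; an exact zero borrows the smallest
# non-zero |delta| so that the inverse is defined
abs.delta <- abs(delta)
abs.delta[abs.delta == 0] <- min(abs.delta[abs.delta > 0])
w.inverse <- 1 / abs.delta

# "diff.epsilon" (assumed formula): 1 at delta = 0, decreasing linearly
# to 0 at |delta| = epsilon
w.diff <- 1 - abs(delta) / epsilon

# "none": every marker receives weight 1
w.none <- rep(1, length(delta))
```

With these inputs the inverse weights range from 5 to 100, while the "diff.epsilon" weights stay between 0.2 and 1, which shows why the latter is less sensitive to deltas near 0.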

Author

Arthur Hughes

Examples

# Load high-dimensional example data
data("example.data.highdim")
yone <- example.data.highdim$y1
yzero <- example.data.highdim$y0
sone <- example.data.highdim$s1
szero <- example.data.highdim$s0
# \donttest{
rise.screen.result <- rise.screen(yone, yzero, sone, szero, power.want.s = 0.8)
# }
