wrapper: Wrapper functions for some previous methods

Description

These functions provide an uniform interface to three existing methods: SVA, RUV, LEAPP The wrapper functions transform the data into desired forms and call the corresponding functions in the package sva, ruv, leapp

Usage

sva.wrapper(
  formula,
  X.data = NULL,
  Y,
  r,
  sva.method = c("irw", "two-step"),
  B = 5
)
ruv.wrapper(
  formula,
  X.data = NULL,
  Y,
  r,
  nc,
  lambda = 1,
  ruv.method = c("RUV2", "RUV4", "RUVinv")
)
leapp.wrapper(
  formula,
  X.data = NULL,
  Y,
  r,
  search.tuning = F,
  ipod.method = c("hard", "soft")
)

Arguments

formula

a formula indicating the known covariates including both primary variables and nuisance variables, which are seperated by |. The variables before | are primary variables and the variables after | are nuisance variables. It's OK if there is no nuisance variables, then | is not needed and formula becomes a typical formula with all the covariates considered primary. When there is confusion about where the intercept should be put, cate will include it in X.nuis.

X.data

the data frame used for formula

outcome, n*p matrix

number of latent factors, can be estimated using the function est.confounder.num

sva.method

parameter for sva. whether to use an iterative reweighted algorithm (irw) or a two-step algorithm (two-step).

parameter for sva. the number of iterations of the irwsva algorithm

parameter for ruv functions: position of the negative controls

lambda

parameter for RUVinv

ruv.method

either using RUV2, RUV4 or RUVinv functions

search.tuning

logical parameter for leapp, whether using BIC to search for tuning parameter of IPOD.

ipod.method

parameter for leapp. "hard": hard thresholding in the IPOD algorithm; "soft": soft thresholding in the IPOD algorithm

Value

All functions return beta.p.value which are the p-values after adjustment. For the other returned objects, refer to cate for their meaning.

Details

The beta.p.values returned is a length p vector, each for the overall effects of all the primary variables.

Only 1 variable of interest is allowed for leapp.wrapper. The method can be slow.

Examples

Run this code

# NOT RUN {
## this is the simulation example in Wang et al. (2015).
n <- 100
p <- 1000
r <- 2
set.seed(1)
data <- gen.sim.data(n = n, p = p, r = r,
                     alpha = rep(1 / sqrt(r), r),
                     beta.strength = 3 * sqrt(1 + 1) / sqrt(n),
                     Gamma.strength = c(seq(3, 1, length = r)) * sqrt(p),
                     Sigma = 1 / rgamma(p, 3, rate = 2),
                     beta.nonzero.frac = 0.05)
X.data <- data.frame(X1 = data$X1)
sva.results <- sva.wrapper(~ X1 | 1, X.data, data$Y,
                           r = r, sva.method = "irw")
ruv.results <- ruv.wrapper(~ X1 | 1, X.data, data$Y, r = r,
                           nc = sample(data$beta.zero.pos, 30), ruv.method = "RUV4")
leapp.results <- leapp.wrapper(~ X1 | 1, X.data, data$Y, r = r)
cate.results <- cate(~ X1 | 1, X.data, data$Y, r = r)

## p-values after adjustment
par(mfrow = c(2, 2))
hist(sva.results$beta.p.value)
hist(ruv.results$beta.p.value)
hist(leapp.results$beta.p.value)
hist(cate.results$beta.p.value)

## type I error
mean(sva.results$beta.p.value[data$beta.zero.pos] < 0.05)

## power
mean(sva.results$beta.p.value[data$beta.nonzero.pos] < 0.05)

## false discovery proportion for sva
discoveries.sva <- which(p.adjust(sva.results$beta.p.value, "BH") < 0.2)
fdp.sva <- length(setdiff(discoveries.sva, data$beta.nonzero.pos)) / max(length(discoveries.sva), 1)
fdp.sva

# }

Run the code above in your browser using DataLab