These functions provide an uniform interface to three existing methods: SVA, RUV, LEAPP The wrapper functions transform the data into desired forms and call the corresponding functions in the package sva, ruv, leapp
sva.wrapper(
formula,
X.data = NULL,
Y,
r,
sva.method = c("irw", "two-step"),
B = 5
)ruv.wrapper(
formula,
X.data = NULL,
Y,
r,
nc,
lambda = 1,
ruv.method = c("RUV2", "RUV4", "RUVinv")
)
leapp.wrapper(
formula,
X.data = NULL,
Y,
r,
search.tuning = F,
ipod.method = c("hard", "soft")
)
a formula indicating the known covariates including both primary variables and nuisance variables, which are seperated by |
. The variables before |
are primary variables and the variables after |
are nuisance variables. It's OK if there is no nuisance variables, then |
is not needed and formula
becomes a typical formula with all the covariates considered primary. When there is confusion about where the intercept should be put, cate
will include it in X.nuis.
the data frame used for formula
outcome, n*p matrix
number of latent factors, can be estimated using the function est.confounder.num
parameter for sva
.
whether to use an iterative reweighted algorithm (irw) or a two-step algorithm (two-step).
parameter for sva
. the number of iterations of the irwsva algorithm
parameter for ruv functions: position of the negative controls
parameter for RUVinv
logical parameter for leapp
, whether using BIC to search for tuning parameter of IPOD.
parameter for leapp
. "hard": hard thresholding in the IPOD algorithm;
"soft": soft thresholding in the IPOD algorithm
All functions return beta.p.value
which are the p-values after adjustment.
For the other returned objects, refer to cate for their meaning.
The beta.p.values
returned is a length p
vector, each for the overall effects of all the primary variables.
Only 1 variable of interest is allowed for leapp.wrapper
. The method can be slow.
# NOT RUN {
## this is the simulation example in Wang et al. (2015).
n <- 100
p <- 1000
r <- 2
set.seed(1)
data <- gen.sim.data(n = n, p = p, r = r,
alpha = rep(1 / sqrt(r), r),
beta.strength = 3 * sqrt(1 + 1) / sqrt(n),
Gamma.strength = c(seq(3, 1, length = r)) * sqrt(p),
Sigma = 1 / rgamma(p, 3, rate = 2),
beta.nonzero.frac = 0.05)
X.data <- data.frame(X1 = data$X1)
sva.results <- sva.wrapper(~ X1 | 1, X.data, data$Y,
r = r, sva.method = "irw")
ruv.results <- ruv.wrapper(~ X1 | 1, X.data, data$Y, r = r,
nc = sample(data$beta.zero.pos, 30), ruv.method = "RUV4")
leapp.results <- leapp.wrapper(~ X1 | 1, X.data, data$Y, r = r)
cate.results <- cate(~ X1 | 1, X.data, data$Y, r = r)
## p-values after adjustment
par(mfrow = c(2, 2))
hist(sva.results$beta.p.value)
hist(ruv.results$beta.p.value)
hist(leapp.results$beta.p.value)
hist(cate.results$beta.p.value)
## type I error
mean(sva.results$beta.p.value[data$beta.zero.pos] < 0.05)
## power
mean(sva.results$beta.p.value[data$beta.nonzero.pos] < 0.05)
## false discovery proportion for sva
discoveries.sva <- which(p.adjust(sva.results$beta.p.value, "BH") < 0.2)
fdp.sva <- length(setdiff(discoveries.sva, data$beta.nonzero.pos)) / max(length(discoveries.sva), 1)
fdp.sva
# }
Run the code above in your browser using DataLab