powerBuyseTest: Performing simulation studies with BuyseTest

Description

Performs a simulation studies for several sample sizes. Returns estimates, their standard deviation, the average estimated standard error, and the rejection rate.

Usage

powerBuyseTest(
  sim,
  sample.size,
  sample.sizeC = NULL,
  sample.sizeT = NULL,
  n.rep,
  null = c(netBenefit = 0),
  cpus = 1,
  seed = NULL,
  conf.level = NULL,
  power = NULL,
  max.sample.size = 5000,
  alternative = NULL,
  order.Hprojection = NULL,
  transformation = NULL,
  trace = 1,
  ...
)

Arguments

sim: [function] take two arguments: the sample size in the control group (n.C) and the sample size in the treatment group (n.C) and generate datasets. The datasets must be data.table objects.
sample.size: [integer vector, >0] the various sample sizes at which the simulation should be perform. Disregarded if any of the arguments sample.sizeC or sample.sizeT are specified.
sample.sizeC: [integer vector, >0] the various sample sizes in the control group.
sample.sizeT: [integer vector, >0] the various sample sizes in the treatment group.
n.rep: [integer, >0] the number of simulations.
null: [numeric vector] For each statistic of interest, the null hypothesis to be tested. The vector should be named with the names of the statistics.
cpus: [integer, >0] the number of CPU to use. Default value is 1.
seed: [integer, >0] the seed to consider for the simulation study.
conf.level: [numeric, 0-1] type 1 error level. Default value read from BuyseTest.options().
power: [numeric, 0-1] type 2 error level used to determine the sample size. Only relevant when sample.size, sample.sizeC, and sample.sizeT are not given. See details.
max.sample.size: [interger, 0-1] sample size used to approximate the sample size achieving the requested type 1 and type 2 error (see details). Can have length 2 to indicate the sample in each group when the groups have unequal sample size.
alternative: [character] the type of alternative hypothesis: "two.sided", "greater", or "less". Default value read from BuyseTest.options().
order.Hprojection: [integer 1,2] the order of the H-project to be used to compute the variance of the net benefit/win ratio. Default value read from BuyseTest.options().
transformation: [logical] should the CI be computed on the logit scale / log scale for the net benefit / win ratio and backtransformed. Otherwise they are computed without any transformation. Default value read from BuyseTest.options().
trace: [integer] should the execution of the function be traced?
...: other arguments (e.g. scoring.rule, method.inference) to be passed to initializeArgs.

Author

Brice Ozenne

Details

Sample size calculation: to approximate the sample size achieving the requested type 1 ($\alpha$) and type 2 error ($\beta$), GPC are applied on a large sample (as defined by the argument max.sample.size): $N^*=m^*+n^*$ where $m^*$ is the sample size in the control group and $n^*$ is the sample size in the active group. Then the effect ($\delta$) and the asymptotic variance of the estimator ($\sigma^2$) are estimated. The total sample size is then deduced as (two-sided case): $$\hat{N} = \hat{\sigma}^2\frac{(u_{1-\alpha/2}+u_{1-\beta})^2}{\hat{\delta}^2}$$ from which the group specific sample sizes are deduced: $\hat{m}=\hat{N}\frac{m^*}{N^*}$ and $\hat{n}=\hat{N}\frac{n^*}{N^*}$. Here $u_x$ denotes the x-quantile of the normal distribution. Since this is an approximation, a simulation study is performed with this sample size to provide the estimated power. It may not be exactly the requested power but should provide a reasonnable guess which can be refined with further simulation studies. Note that the maximumal sample size should be very high for the estimation to be accurate (ideally 10000 or more).

Examples

Run this code

library(data.table)

#### Using simBuyseTest ####
## only point estimate
if (FALSE) {
powerBuyseTest(sim = simBuyseTest, sample.size = c(10, 25, 50, 75, 100), 
               formula = treatment ~ bin(toxicity), seed = 10, n.rep = 1000,
               method.inference = "none", trace = 2, keep.pairScore = FALSE)
}

## point estimate with rejection rate
# \dontshow{
powerBuyseTest(sim = simBuyseTest, sample.size = c(10, 50, 100), 
               formula = treatment ~ bin(toxicity), seed = 10, n.rep = 10,
               method.inference = "u-statistic", trace = 4)
# }
if (FALSE) {
powerBuyseTest(sim = simBuyseTest, sample.size = c(10, 50, 100), 
               formula = treatment ~ bin(toxicity), seed = 10, n.rep = 1000,
               method.inference = "u-statistic", trace = 4)
}

#### Using user defined simulation function ####
## power calculation for Wilcoxon test
simFCT <- function(n.C, n.T){
    out <- rbind(cbind(Y=stats::rt(n.C, df = 5), group=0),
                 cbind(Y=stats::rt(n.T, df = 5), group=1) + 1)
    return(data.table::as.data.table(out))
}
simFCT2 <- function(n.C, n.T){
    out <- rbind(cbind(Y=stats::rt(n.C, df = 5), group=0),
                 cbind(Y=stats::rt(n.T, df = 5), group=1) + 0.25)
    return(data.table::as.data.table(out))
}

# \dontshow{
powerW <- powerBuyseTest(sim = simFCT, sample.size = c(5, 10,20,30,50,100),
                         n.rep = 10, formula = group ~ cont(Y))
summary(powerW)
# }
if (FALSE) {
powerW <- powerBuyseTest(sim = simFCT, sample.size = c(5,10,20,30,50,100),
                         n.rep = 1000, formula = group ~ cont(Y), cpus = "all")
summary(powerW)
} 

## sample size needed to reach (approximately) a power
## based on summary statistics obtained on a large sample 
if (FALSE) {
sampleW <- powerBuyseTest(sim = simFCT, power = 0.8, max.sample.size = 10000,
                         n.rep = 1000, formula = group ~ cont(Y), cpus = "all")
summary(sampleW) ## not very accurate but gives an order of magnitude

sampleW2 <- powerBuyseTest(sim = simFCT2,
                           power = 0.8, max.sample.size = 10000,
                           n.rep = 1000, formula = group ~ cont(Y), cpus = "all")
summary(sampleW2) ## more accurate with larger samples
}

Run the code above in your browser using DataLab