Sample size calculation for Cox proportional hazards regression with two covariates for epidemiological studies. The covariate of interest should be a binary variable. The other covariate can be either binary or non-binary. The formula takes into account competing risks and the correlation between the two covariates.
ssizeEpi(X1,
         X2,
         failureFlag,
         power,
         theta,
         alpha = 0.05)
the total number of subjects required.
the proportion of subjects for which \(X_1\) takes the value one.
square of the correlation between \(X_1\) and \(X_2\).
proportion of subjects who died of the disease of interest.
numeric. An nPilot by 1 vector, where nPilot is the number of subjects in the pilot data set. This vector records the values of the covariate of interest for the nPilot subjects in the pilot study. X1 should be binary and take only two possible values: zero and one.
numeric. An nPilot by 1 vector, where nPilot is the number of subjects in the pilot study. This vector records the values of the second covariate for the nPilot subjects in the pilot study. X2 can be binary or non-binary.
numeric. An nPilot by 1 vector of indicators showing whether a subject failed (failureFlag=1) or is alive (failureFlag=0).
numeric. postulated power.
numeric. postulated hazard ratio.
numeric. type I error rate.
This is an implementation of the sample size formula derived by Latouche et al. (2004) for the following Cox proportional hazards regression in epidemiological studies: $$h(t|x_1, x_2)=h_0(t)\exp(\beta_1 x_1+\beta_2 x_2),$$ where \(X_1\) is the covariate of interest. The covariate \(X_1\) has to be a binary variable taking two possible values: zero and one, while the covariate \(X_2\) can be binary or continuous.
Suppose we want to check whether the hazard for \(X_1=1\) is equal to the hazard for \(X_1=0\). Equivalently, we test the null hypothesis that the hazard ratio of \(X_1=1\) to \(X_1=0\) is equal to \(1\) against the alternative that it is equal to \(\exp(\beta_1)=\theta\). Given the type I error rate \(\alpha\) for a two-sided test, the total number of subjects required to achieve a power of \(1-\beta\) is $$n=\frac{\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2}{ [\log(\theta)]^2 p (1-p) \psi (1-\rho^2)},$$ where \(z_{a}\) is the \(100 a\)-th percentile of the standard normal distribution, \(\psi\) is the proportion of subjects who died of the disease of interest, and $$\rho=\mathrm{corr}(X_1, X_2)=(p_1-p_0)\times\sqrt{\frac{q(1-q)}{p(1-p)}},$$ with \(p=Pr(X_1=1)\), \(q=Pr(X_2=1)\), \(p_0=Pr(X_1=1|X_2=0)\), and \(p_1=Pr(X_1=1 | X_2=1)\).
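The closed form for \(\rho\) above is written in terms of \(q=Pr(X_2=1)\), so it applies when \(X_2\) is binary. As a small illustrative check, not taken from the package, the following base R sketch compares that expression with the ordinary Pearson correlation on a made-up pair of binary vectors. The data and variable names are assumptions chosen for illustration.

```r
# Made-up binary data: the first 50 subjects have X2 = 1, the last 50 have X2 = 0.
X2 <- rep(c(1, 0), times = c(50, 50))
# Among X2 = 1: 30 subjects with X1 = 1; among X2 = 0: 10 subjects with X1 = 1.
X1 <- rep(c(1, 0, 1, 0), times = c(30, 20, 10, 40))

p  <- mean(X1)            # estimate of Pr(X1 = 1)
q  <- mean(X2)            # estimate of Pr(X2 = 1)
p1 <- mean(X1[X2 == 1])   # estimate of Pr(X1 = 1 | X2 = 1)
p0 <- mean(X1[X2 == 0])   # estimate of Pr(X1 = 1 | X2 = 0)

# Closed-form correlation for two binary covariates
rho_formula <- (p1 - p0) * sqrt(q * (1 - q) / (p * (1 - p)))

# Pearson correlation computed directly from the data
rho_pearson <- cor(X1, X2)

all.equal(rho_formula, rho_pearson)
```

When \(X_2\) is non-binary the closed form does not apply, and a sample correlation such as `cor(X1, X2)` is the natural quantity to square.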
\(p\), \(\rho^2\), and \(\psi\) will be estimated from a pilot study.
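To make the calculation concrete, here is a minimal base R sketch that plugs pilot-data estimates into the formula above. The helper name `ssize_sketch` is hypothetical and is not a function of the package. The plug-in estimates used here (sample proportions and the squared sample correlation) are an assumption, and the package may estimate or round these quantities differently.

```r
# Hypothetical helper, NOT part of the package: evaluates the sample size
# formula with simple plug-in estimates from pilot data (an assumption).
ssize_sketch <- function(X1, X2, failureFlag, power, theta, alpha = 0.05) {
  p    <- mean(X1 == 1)           # proportion of subjects with X1 = 1
  rho2 <- cor(X1, X2)^2           # squared sample correlation of X1 and X2
  psi  <- mean(failureFlag == 1)  # proportion of subjects who failed

  numerator   <- (qnorm(1 - alpha / 2) + qnorm(power))^2
  denominator <- log(theta)^2 * p * (1 - p) * psi * (1 - rho2)

  list(n = numerator / denominator, p = p, rho2 = rho2, psi = psi)
}

# Made-up pilot data: X1 and X2 are constructed to be uncorrelated.
X1 <- rep(c(1, 0), times = c(40, 60))
X2 <- rep(c(1, 0, 1, 0), times = c(20, 20, 30, 30))
failureFlag <- rep(c(1, 0), times = 50)  # alternating 1, 0, ...: half failed

res <- ssize_sketch(X1, X2, failureFlag, power = 0.80, theta = 2)
ceiling(res$n)  # round up to a whole number of subjects
```

Because \(1-\rho^2\) appears in the denominator, a stronger correlation between the two covariates increases the required number of subjects; likewise a smaller \(\psi\) increases it.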
Schoenfeld DA. (1983). Sample-size formula for the proportional-hazards regression model. Biometrics. 39:499-503.
Latouche A., Porcher R. and Chevret S. (2004). Sample size formula for proportional hazards modelling of competing risks. Statistics in Medicine. 23:3263-3274.
ssizeEpi.default
# generate a toy pilot data set
X1 <- c(rep(1, 39), rep(0, 61))
set.seed(123456)
X2 <- sample(c(0, 1), 100, replace = TRUE)
failureFlag <- sample(c(0, 1), 100, prob = c(0.5, 0.5), replace = TRUE)
ssizeEpi(X1 = X1,
         X2 = X2,
         failureFlag = failureFlag,
         power = 0.80,
         theta = 2,
         alpha = 0.05)