Generates data with n observations of a primary outcome Y, a
secondary outcome K, an exposure X, and a measured confounder L
and an unmeasured confounder U. The primary outcome is either a
quantitative, normally distributed variable (setting = "GLM") or a
censored time-to-event outcome under an accelerated failure time
(AFT) model (setting = "AFT").
Under the AFT setting, the observed time-to-event variable T=exp(Y)
and the censoring indicator C are also computed. X
is generated as a genetic exposure variable in the form of a single
nucleotide variant (SNV) in 0-1-2 additive coding with minor allele
frequency maf. X can be generated independently of U
(X_orth_U = TRUE) or dependent on U
(X_orth_U = FALSE). For more details regarding the underlying
model, see the vignette.
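The 0-1-2 additive coding described above can be illustrated in base R. This is only a minimal sketch, not the package's implementation: it assumes Hardy-Weinberg equilibrium (two independent allele draws per individual), ignores L and U (whose default effects are 0), and assumes simple additive linear effects with standard normal noise. The vignette remains the authoritative description of the model.

```r
# Minimal sketch (assumptions: Hardy-Weinberg equilibrium, additive linear
# effects, standard normal noise; L and U are left out for brevity).
set.seed(1)

n   <- 1000   # sample size
maf <- 0.2    # minor allele frequency
aXK <- 0.2    # effect of X on K (default in the usage)
aXY <- 0.1    # effect of X on Y (default in the usage)
aKY <- 0.3    # effect of K on Y (default in the usage)

# SNV in 0-1-2 additive coding: number of minor alleles out of two draws
X <- rbinom(n, size = 2, prob = maf)

# Secondary and primary outcome under the assumed linear structure
K <- aXK * X + rnorm(n)
Y <- aXY * X + aKY * K + rnorm(n)

dat <- data.frame(Y = Y, K = K, X = X)

# The empirical allele frequency should be close to maf
maf_hat <- mean(X) / 2
```

Comparing `maf_hat` with `maf` is a quick sanity check that the genotype coding behaves as intended.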
generate_data(setting = "GLM", n = 1000, maf = 0.2, cens = 0.3,
  a = NULL, b = NULL, aXK = 0.2, aXY = 0.1, aXL = 0, aKY = 0.3,
  aLK = 0, aLY = 0, aUY = 0, aUL = 0, mu_X = NULL, sd_X = NULL,
  X_orth_U = TRUE, mu_U = 0, sd_U = 1, mu_K = 0, sd_K = 1, mu_L = 0,
  sd_L = 1, mu_Y = 0, sd_Y = 1)
setting: String with value "GLM" or "AFT" indicating
whether the primary outcome is generated as a
normally-distributed quantitative outcome ("GLM") or
censored time-to-event outcome ("AFT").
n: Numeric. Sample size.
maf: Numeric. Minor allele frequency of the genetic exposure variable.
cens: Numeric. Desired proportion of censored individuals (for example,
0.3 for 30%); must be specified under the AFT setting. Note that the
actual censoring rate is determined by the parameters a and b. cens
serves mainly as a check that a and b yield the desired censoring
rate (otherwise, a warning is issued).
a: Numeric. Used together with b to generate the desired censoring rate under the AFT setting. Has to be specified under the AFT setting.
b: Numeric. Used together with a to generate the desired censoring rate under the AFT setting. Has to be specified under the AFT setting.
aXK: Numeric. Size of the effect of X on K.
aXY: Numeric. Size of the effect of X on Y.
aXL: Numeric. Size of the effect of X on L.
aKY: Numeric. Size of the effect of K on Y.
aLK: Numeric. Size of the effect of L on K.
aLY: Numeric. Size of the effect of L on Y.
aUY: Numeric. Size of the effect of U on Y.
aUL: Numeric. Size of the effect of U on L.
mu_X: Numeric. Expected value of X.
sd_X: Numeric. Standard deviation of X.
X_orth_U: Logical. Indicator whether X should be generated
independently of U (X_orth_U = TRUE)
or dependent on U (X_orth_U = FALSE).
mu_U: Numeric. Expected value of U.
sd_U: Numeric. Standard deviation of U.
mu_K: Numeric. Expected value of K.
sd_K: Numeric. Standard deviation of K.
mu_L: Numeric. Expected value of L.
sd_L: Numeric. Standard deviation of L.
mu_Y: Numeric. Expected value of Y.
sd_Y: Numeric. Standard deviation of Y.
A data frame containing n observations of the variables Y,
K, X, L, and U. Under the AFT setting,
T=exp(Y) and the censoring indicator C (0 = censored,
1 = uncensored) are also included.
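To make the relationship between T, Y, C, and the censoring rate concrete, the following base R sketch builds an observed time and censoring indicator and then compares the realized censoring proportion with a target value. It is a hedged illustration only: drawing censoring times from a uniform distribution on [a, b] is an assumption made here, the tolerance `tol` and all intermediate variable names are hypothetical, and the package's actual mechanism may differ.

```r
# Minimal sketch; the uniform censoring mechanism on [a, b] is an ASSUMPTION,
# not a documented detail of generate_data().
set.seed(2)

n    <- 1000
a    <- 0.2    # value taken from the Examples section
b    <- 4.75   # value taken from the Examples section
cens <- 0.3    # desired proportion of censored individuals
tol  <- 0.05   # hypothetical tolerance for the check below

log_time   <- rnorm(n)                    # stand-in for the latent log event time
event_time <- exp(log_time)               # latent (uncensored) event time
cens_time  <- runif(n, min = a, max = b)  # assumed censoring times

T_obs <- pmin(event_time, cens_time)            # observed time
C     <- as.integer(event_time <= cens_time)    # 1 = uncensored, 0 = censored
Y     <- log(T_obs)                             # so that T_obs = exp(Y)

# Realized censoring proportion, compared against the desired one
cens_obs <- mean(C == 0)
if (abs(cens_obs - cens) > tol) {
  warning("Realized censoring rate differs from the desired rate.")
}
```

With a data set returned under the AFT setting, the same check reduces to computing the proportion of rows with C equal to 0.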
# NOT RUN {
# Generate data under the GLM setting with default values
dat_GLM <- generate_data()
head(dat_GLM)
# Generate data under the AFT setting with default values
dat_AFT <- generate_data(setting = "AFT", a = 0.2, b = 4.75)
head(dat_AFT)
# }