sim.fong: Data Simulation as in Fong and Gilbert (2014)

Description

Simulate data as in Fong and Gilbert (2014).

Usage

sim.fong (n, family=c("PH","PO","P2"), beta, 
    random.censoring=c("0%","20%","60%"), prevalence=0.1, non.adherence.ratio=0,
    design=c("FULL","CC"), auxiliary=c("weak","good","excellent","none"), 
    seed=NULL, var.S=1, var.W=1)

Arguments

n

integer. Sample size

family

string. Link function of the semiparametric transformation model: one of "PH", "PO" or "P2"

beta

numerical vector. Coefficients of the linear model

random.censoring

string. Amount of random censoring applied in addition to administrative censoring: one of "0%", "20%" or "60%"

prevalence

numerical. Proportion of cases among z==0 when there is no random censoring and non-adherence ratio is 0

design

string. Full cohort or case-cohort (finite population sampling)

auxiliary

string. Choice of baseline auxiliary covariate w: one of "weak", "good", "excellent" or "none"

seed

integer. Random generator seed

var.S

numeric. Variance of the phase II covariate s

var.W

numeric. Variance of the baseline covariate w

non.adherence.ratio

numeric. Ratio of non-adherent subjects

Value

If design is FULL, returns a data frame of:

failure time

censoring time

smaller of ft and C

event indicator

baseline covariate z

phase II covariate s

If design is CC, returns a data frame of:

failure time

censoring time

smaller of ft and C

event indicator

baseline covariate z

phase II covariate s

baseline auxiliary covariate w

Details

The number of rows in the returned data frame is the size of the full cohort, whichever design is chosen. Non-adherence is simulated as a Bernoulli variable governed by non.adherence.ratio. Prevalence is used to compute the baseline hazard function, based on some empirical evidence.
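
The two mechanisms described above can be illustrated with a small base R sketch. This is not the code used inside sim.fong. It assumes that non-adherence is one Bernoulli draw per subject, and that the case-cohort sample consists of all cases plus a simple random sample of controls. The 5:1 control-to-case ratio, the event probability and the column name non.adherent are hypothetical choices made only for this illustration.

```r
# Illustrative sketch only; NOT the implementation inside sim.fong.
set.seed(1)
n <- 1000                     # full cohort size
non.adherence.ratio <- 0.15   # assumed probability that a subject is non-adherent

# One Bernoulli(non.adherence.ratio) draw per subject
non.adherent <- rbinom(n, size = 1, prob = non.adherence.ratio)

# Hypothetical event indicator and phase II covariate
d <- rbinom(n, size = 1, prob = 0.1)
s <- rnorm(n, mean = 0, sd = 1)

# Assumed case-cohort style phase II sample: all cases plus a fixed-size
# simple random sample of controls (finite population sampling)
controls <- which(d == 0)
n.controls <- min(length(controls), 5 * sum(d))   # hypothetical 5:1 ratio
sampled <- c(which(d == 1), sample(controls, n.controls))

# s is kept only for sampled subjects; the data frame still has n rows
s[-sampled] <- NA
dat <- data.frame(d = d, s = s, non.adherent = non.adherent)

nrow(dat)                                  # size of the full cohort
sum(dat$d == 1 & !is.na(dat$s))            # cases with s measured
sum(dat$d == 0 & !is.na(dat$s)) /
    sum(dat$d == 1 & !is.na(dat$s))        # controls per case with s measured
```

The last two expressions mirror the summaries computed in the Examples section, where s is checked with is.na to identify subjects in the phase II sample.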

Examples

# NOT RUN {
# Case-cohort design with non.adherence.ratio set to 0
dat = sim.fong(n=10000, family="PH", beta=c(log(.5), log(.7), log(1.2)), design="CC", 
    auxiliary="weak", seed=1, prevalence=0.1, non.adherence.ratio=0, random.censoring="0")
# proportion of cases among subjects with z==0
mean(dat$d[dat$z==0])

# Same setting with non.adherence.ratio set to 0.15
dat = sim.fong(n=10000, family="PH", beta=c(log(.5), log(.7), log(1.2)), design="CC", 
    auxiliary="weak", seed=1, prevalence=0.1, non.adherence.ratio=0.15, random.censoring="0")
# number of cases with s measured
sum(dat$d & !is.na(dat$s))
# number of controls per case among subjects with s measured
sum(!dat$d & !is.na(dat$s)) / sum(dat$d & !is.na(dat$s))

# Same setting with random censoring added
dat = sim.fong(n=10000, family="PH", beta=c(log(.5), log(.7), log(1.2)), design="CC", 
    auxiliary="weak", seed=1, prevalence=0.1, non.adherence.ratio=0.15, random.censoring="20")
sum(dat$d & !is.na(dat$s))
sum(!dat$d & !is.na(dat$s)) / sum(dat$d & !is.na(dat$s))

# }