qgcomp.boot: estimating the parameters of a marginal structural model (MSM) based on g-computation with quantized exposures

Description

This function yields population average effect estimates for both continuous and binary outcomes

Usage

qgcomp.boot(
  f,
  data,
  expnms = NULL,
  q = 4,
  breaks = NULL,
  id = NULL,
  alpha = 0.05,
  B = 200,
  rr = TRUE,
  degree = 1,
  seed = NULL,
  bayes = FALSE,
  parallel = FALSE,
  ...
)

Arguments

R style formula

data

data frame

expnms

character vector of exposures of interest

NULL or number of quantiles used to create quantile indicator variables representing the exposure variables. If NULL, then gcomp proceeds with un-transformed version of exposures in the input datasets (useful if data are already transformed, or for performing standard g-computation)

breaks

(optional) NULL, or a list of (equal length) numeric vectors that characterize the minimum value of each category for which to break up the variables named in expnms. This is an alternative to using 'q' to define cutpoints.

(optional) NULL, or variable name indexing individual units of observation (only needed if analyzing data with multiple observations per id/cluster)

alpha

alpha level for confidence limit calculation

integer: number of bootstrap iterations (this should typically be >=200, though it is set lower in examples to improve run-time).

logical: if using binary outcome and rr=TRUE, qgcomp.boot will estimate risk ratio rather than odds ratio

degree

polynomial basis function for marginal model (e.g. degree = 2 allows that the relationship between the whole exposure mixture and the outcome is quadratic.

seed

integer or NULL: random number seed for replicable bootstrap results

bayes

use underlying Bayesian model (`arm` package defaults). Results in penalized parameter estimation that can help with very highly correlated exposures. Note: this does not lead to fully Bayesian inference in general, so results should be interpereted as frequentist.

parallel

use (safe) parallel processing from the future and future.apply packages

...

arguments to glm (e.g. family)

Value

a qgcompfit object, which contains information about the effect measure of interest (psi) and associated variance (var.psi), as well as information on the model fit (fit) and information on the marginal structural model (msmfit) used to estimate the final effect estimates.

Details

Estimates correspond to the average expected change in the (log) outcome per quantile increase in the joint exposure to all exposures in `expnms'. Test statistics and confidence intervals are based on a non-parametric bootstrap, using the standard deviation of the bootstrap estimates to estimate the standard error. The bootstrap standard error is then used to estimate Wald-type confidence intervals. Note that no bootstrapping is done on estimated quantiles of exposure, so these are treated as fixed quantities

Examples

Run this code

# NOT RUN {
set.seed(30)
# continuous outcome
dat <- data.frame(y=rnorm(100), x1=runif(100), x2=runif(100), z=runif(100))
# Conditional linear slope
qgcomp.noboot(y ~ z + x1 + x2, expnms = c('x1', 'x2'), data=dat, q=4, family=gaussian())
# Marginal linear slope (population average slope, for a purely linear, 
#  additive model this will equal the conditional)
 
# }
# NOT RUN {
qgcomp.boot(f=y ~ z + x1 + x2, expnms = c('x1', 'x2'), data=dat, q=4, 
  family=gaussian(), B=200) # B should be at least 200 in actual examples
  
# Population average mixture slope which accounts for non-linearity and interactions
qgcomp.boot(y ~ z + x1 + x2 + I(x1^2) + I(x2*x1), family="gaussian", 
 expnms = c('x1', 'x2'), data=dat, q=4, B=200)
 
# binary outcome
dat <- data.frame(y=rbinom(50,1,0.5), x1=runif(50), x2=runif(50), z=runif(50))

# Conditional mixture OR
qgcomp.noboot(y ~ z + x1 + x2, family="binomial", expnms = c('x1', 'x2'), 
  data=dat, q=2)
  
#Marginal mixture OR (population average OR - in general, this will not equal the 
# conditional mixture OR due to non-collapsibility of the OR)
qgcomp.boot(y ~ z + x1 + x2, family="binomial", expnms = c('x1', 'x2'), 
  data=dat, q=2, B=3)
  
# Population average mixture RR
qgcomp.boot(y ~ z + x1 + x2, family="binomial", expnms = c('x1', 'x2'), 
  data=dat, q=2, rr=TRUE, B=3)
  
# Population average mixture RR, indicator variable representation of x2
# note that I(x==...) operates on the quantile-based category of x,
# rather than the raw value
res = qgcomp.boot(y ~ z + x1 + I(x2==1) + I(x2==2) + I(x2==3), 
  family="binomial", expnms = c('x1', 'x2'), data=dat, q=4, rr=TRUE, B=200)
res$fit  
plot(res)

# now add in a non-linear MSM
res2 = qgcomp.boot(y ~ z + x1 + I(x2==1) + I(x2==2) + I(x2==3), 
  family="binomial", expnms = c('x1', 'x2'), data=dat, q=4, rr=TRUE, B=200, 
  degree=2)
res2$fit  
res2$msmfit  
plot(res2)
# Log risk ratio per one IQR change in all exposures (not on quantile basis)
dat$x1iqr <- dat$x1/with(dat, diff(quantile(x1, c(.25, .75))))
dat$x2iqr <- dat$x2/with(dat, diff(quantile(x2, c(.25, .75))))
# note that I(x>...) now operates on the untransformed value of x,
# rather than the quantized value
res2 = qgcomp.boot(y ~ z + x1iqr + I(x2iqr>0.1) + I(x2>0.4) + I(x2>0.9), 
  family="binomial", expnms = c('x1iqr', 'x2iqr'), data=dat, q=NULL, rr=TRUE, B=200, 
  degree=2)
res2
# using parallel processing

qgcomp.boot(y ~ z + x1iqr + I(x2iqr>0.1) + I(x2>0.4) + I(x2>0.9), 
  family="binomial", expnms = c('x1iqr', 'x2iqr'), data=dat, q=NULL, rr=TRUE, B=200, 
  degree=2, parallel=TRUE)
# }