twoStageTMLEmsm: twoStageTMLEmsm

Description

Inverse probability of censoring weighted TMLE for evaluating MSM parameters when the full set of covariates is available on only a subset of observations, as in a 2-stage design.
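The weighting idea behind this estimator can be illustrated with a minimal base R sketch. It is not the package's implementation: the simulated data, the logistic sampling model, and the names pi.hat and ipcw are assumptions made for illustration only. Each unit selected into stage 2 is weighted by the inverse of its estimated selection probability, so that the subset again represents the full sample.

```r
# Illustrative sketch only (base R); not the internals of twoStageTMLEmsm.
set.seed(1)
n  <- 5000
W1 <- rnorm(n)                      # stage-1 covariate, measured on everyone
W3 <- W1 + rnorm(n)                 # stage-2 covariate, correlated with W1

# Selection into stage 2 depends on W1, so the subset is not representative
Delta.W <- rbinom(n, 1, plogis(-0.5 + W1))

# Estimate the sampling probabilities with a logistic regression
pi.fit <- glm(Delta.W ~ W1, family = binomial)
pi.hat <- fitted(pi.fit)

# Inverse probability weights: zero for units not selected into stage 2
ipcw <- Delta.W / pi.hat

# In real data W3 is unknown when Delta.W == 0; those rows get weight 0
naive.mean <- mean(W3[Delta.W == 1])
ipcw.mean  <- sum(ipcw * W3) / sum(ipcw)
full.mean  <- mean(W3)              # available here only because data are simulated

round(c(naive = naive.mean, ipcw = ipcw.mean, full = full.mean), 3)
```

In this simulation the weighted mean should fall much closer to the full-sample mean than the unweighted subset mean does, which is the bias that the weights are meant to correct.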

Usage

twoStageTMLEmsm(
  Y,
  A,
  W,
  V,
  Delta.W,
  W.stage2,
  Delta = rep(1, length(Y)),
  pi = NULL,
  piform = NULL,
  pi.SL.library = c("SL.glm", "SL.gam", "SL.glmnet", "tmle.SL.dbarts.k.5"),
  V.pi = 10,
  pi.discreteSL = TRUE,
  condSetNames = c("A", "V", "W", "Y"),
  id = NULL,
  Q.family = "gaussian",
  augmentW = TRUE,
  augW.SL.library = c("SL.glm", "SL.glmnet", "tmle.SL.dbarts2"),
  rareOutcome = FALSE,
  verbose = FALSE,
  ...
)

Value

tmle: Treatment effect estimates and summary information from the call to the tmleMSM function
twoStage: IPCW weight estimation summary: pi are the probabilities, coef are the SL weights or the coefficients from a glm fit, type is the estimation procedure, and discreteSL is a flag indicating whether discrete super learning was used
augW: Matrix of predicted outcomes based on stage 1 covariates only

Arguments

Y: outcome of interest (missingness allowed)
A: binary treatment indicator
W: matrix or data.frame of covariates measured on entire population
V: vector, matrix, or data.frame of covariates used to define MSM strata
Delta.W: indicator of inclusion in subset with additional information
W.stage2: matrix or data.frame of covariates measured in subset population
Delta: binary indicator that outcome Y is observed
pi: optional vector of sampling probabilities
piform: parametric regression formula for estimating pi (see Details)
pi.SL.library: super learner library for estimating pi (see Details)
V.pi: optional number of cross-validation folds for super learning (ignored when piform or pi is provided)
pi.discreteSL: set to TRUE to use discrete super learning, FALSE to use ensemble super learning (ignored when piform or pi is provided)
condSetNames: variables to include as predictors of missingness in W.stage2: any combination of Y, A, and either W (for all covariates in W) or individual covariate names in W
id: optional indicator of independent units of observation
Q.family: outcome regression family, "gaussian" or "binomial"
augmentW: set to TRUE to augment W with predicted outcome values when A = 0 and A = 1
augW.SL.library: super learner library for preliminary outcome regression model (ignored when augmentW is FALSE)
rareOutcome: when TRUE sets V.Q = 20, Q.discreteSL = TRUE, Q.SL.library includes glm, glmnet, bart
verbose: when TRUE prints informative messages
...: other arguments passed to the tmleMSM function
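As a rough illustration of what the augmentW argument describes, the sketch below builds two columns of predicted outcomes, one with A set to 0 and one with A set to 1, from stage-1 covariates only. It uses a plain linear model in place of the super learner library, and the column names Q0W and Q1W are hypothetical; the package may construct and name these columns differently.

```r
# Illustrative sketch only (base R); lm() stands in for augW.SL.library.
set.seed(2)
n  <- 500
W1 <- rnorm(n)
W2 <- rnorm(n)
A  <- rbinom(n, 1, plogis(0.3 * W1))
Y  <- 1 + A + W1 + W2 + rnorm(n)

# Preliminary outcome regression on stage-1 data only
stage1 <- data.frame(Y = Y, A = A, W1 = W1, W2 = W2)
fit    <- lm(Y ~ A + W1 + W2, data = stage1)

# Predicted outcomes for every unit under A = 0 and under A = 1
Q0W <- predict(fit, newdata = transform(stage1, A = 0))
Q1W <- predict(fit, newdata = transform(stage1, A = 1))

# Hypothetical column names; these predictions are appended to W
augW  <- cbind(Q0W = Q0W, Q1W = Q1W)
W.aug <- cbind(W2 = W2, augW)
head(W.aug)
```

Because the working model here has no treatment-covariate interaction, the two columns differ by a constant equal to the fitted coefficient on A; a more flexible learner would generally produce columns that carry more information.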

Details

When using piform to specify a parametric model for pi that conditions on the outcome, use Delta.W as the dependent variable and Y.orig, instead of Y, on the right-hand side of the formula. When writing a user-defined SL wrapper for inclusion in pi.SL.library, use Y on the left-hand side of the formula. If specific covariate names are used on the right-hand side, use Y.orig to condition on the outcome.
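A formula that follows the piform naming convention above might look like the one in this sketch. The data frame dat and the direct glm fit are assumptions used to show how such a formula reads; they are not a description of how the function evaluates piform internally.

```r
# Illustrative sketch only (base R): a piform-style formula that conditions
# on the outcome via Y.orig, with Delta.W as the dependent variable.
set.seed(3)
n       <- 1000
W1      <- rnorm(n)
Y.orig  <- 10 + W1 + rnorm(n)
Delta.W <- rbinom(n, 1, plogis(-5 + 0.5 * Y.orig))

piform <- "Delta.W ~ Y.orig + W1"

# Hypothetical stand-in for the data the formula is evaluated against
dat    <- data.frame(Delta.W = Delta.W, Y.orig = Y.orig, W1 = W1)
pi.fit <- glm(as.formula(piform), family = binomial, data = dat)
pi.hat <- predict(pi.fit, type = "response")
summary(pi.hat)
```

A vector such as pi.hat plays the same role as the pi argument, which accepts sampling probabilities that have already been estimated.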

Examples

n <- 1000
set.seed(10)
W1 <- rnorm(n)
W2 <- rnorm(n)
W3 <- rnorm(n)
A <- rbinom(n, 1, plogis(-1 + .2*W1 + .3*W2 + .1*W3))
Y <- 10 + A + W1 + W2 + A*W1 + W3 + rnorm(n)
Y.bin <- rbinom(n, 1, plogis(-4.6 - 1.8* A + W1 + W2 -.3 *A*W1 + W3))
# Set 400 obs with data on W3, more likely if W1 > 1
n.sample <- 400
p.sample <- 0.5 + .2*(W1 > 1)
rows.sample <- sample(1:n, size = n.sample, prob = p.sample)
Delta.W <- rep(0,n)
Delta.W[rows.sample] <- 1
W3.stage2 <- cbind(W3 = W3[Delta.W==1])

# 1. specify parametric models, misspecified outcome model (not recommended)
result1.MSM <- twoStageTMLEmsm(Y = Y, A = A, V = cbind(W1), W = cbind(W2),
  Delta.W = Delta.W, W.stage2 = W3.stage2, augmentW = FALSE,
  piform = "Delta.W~ I(W1 > 0)", MSM = "A*W1", augW.SL.library = "SL.glm",
  Qform = "Y~A+W1", gform = "A~W1 + W2 +W3", hAVform = "A~1", verbose = TRUE)
summary(result1.MSM)

# 2. Call again, passing in previously estimated observation weights, 
# note that specifying a correct model for Q improves efficiency
result2.MSM <- twoStageTMLEmsm(Y = Y, A = A, V = cbind(W1), W = cbind(W2),
  Delta.W = Delta.W, W.stage2 = W3.stage2, augmentW = FALSE,
  pi = result1.MSM$twoStage$pi, MSM = "A*W1",
  Qform = "Y~ A + W1 + W2 + A*W1 + W3", gform = "A~W1 + W2 +W3", hAVform = "A~1")
cbind(SE.Qmis = result1.MSM$tmle$se, SE.Qcor = result2.MSM$tmle$se)

# \donttest{
# Binary outcome, augmentW, rareOutcome
result3.MSM <- twoStageTMLEmsm(Y = Y.bin, A = A, V = cbind(W1), W = cbind(W2),
  Delta.W = Delta.W, W.stage2 = W3.stage2, augmentW = TRUE,
  piform = "Delta.W~ I(W1 > 0)", MSM = "A*W1", gform = "A~W1 + W2 +W3",
  Q.family = "binomial", rareOutcome = TRUE)
# }