BuildRule: Build a Treatment Rule

Description

Perform principled development of a treatment rule (using the IPW approach to account for potential confounding) on a development dataset (i.e. training set) that is independent of datasets used for model selection (i.e. validation set) and rule evaluation (i.e. test set).

Usage

BuildRule(
  development.data,
  study.design,
  prediction.approach,
  name.outcome,
  type.outcome,
  name.treatment,
  names.influencing.treatment = NULL,
  names.influencing.rule,
  desirable.outcome,
  rule.method = NULL,
  propensity.method,
  additional.weights = rep(1, nrow(development.data)),
  truncate.propensity.score = TRUE,
  truncate.propensity.score.threshold = 0.05,
  type.observation.weights = NULL,
  propensity.k.cv.folds = 10,
  rule.k.cv.folds = 10,
  lambda.choice = c("min", "1se"),
  OWL.lambda.seq = NULL,
  OWL.kernel = "linear",
  OWL.kparam.seq = NULL,
  OWL.cvFolds = 10,
  OWL.verbose = TRUE,
  OWL.framework.shift.by.min = TRUE,
  direct.interactions.center.continuous.Y = TRUE,
  direct.interactions.exclude.A.from.penalty = TRUE
)

Arguments

development.data

A data frame representing the *development* dataset (i.e. training set) used for building a treatment rule.

study.design

Either `observational', `RCT', or `naive'. For the observational design, the function uses inverse-probability-of-treatment observation weights (IPW) based on estimated propensity scores with predictors names.influencing.treatment; for the RCT design, the function uses IPW based on propensity scores equal to the observed sample proportions; for the naive design, all observation weights will be uniformly equal to 1.
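The three designs differ only in how the observation weights are formed. The following is a minimal base-R sketch of that idea on simulated data; the variable names (`A`, `L`) and the helper `ipw()` are hypothetical, and this is one plausible reading of the description above rather than the function's internal code.

```r
# Hypothetical toy data: A is a 0/1 treatment indicator, L a single confounder.
set.seed(1)
n <- 200
L <- rnorm(n)
A <- rbinom(n, size = 1, prob = plogis(0.5 * L))

# Inverse-probability-of-treatment weight: 1 / P(receiving the treatment actually received).
ipw <- function(A, ps) ifelse(A == 1, 1 / ps, 1 / (1 - ps))

# 'observational': propensity score estimated from L (here by logistic regression).
ps.obs <- fitted(glm(A ~ L, family = binomial))
w.observational <- ipw(A, ps.obs)

# 'RCT': propensity score set to the observed proportion treated.
ps.rct <- rep(mean(A), n)
w.rct <- ipw(A, ps.rct)

# 'naive': every observation gets weight 1.
w.naive <- rep(1, n)
```

Under the RCT weights in this sketch, the weights within each treatment arm sum to the total sample size, which is the usual sanity check for IPW.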

prediction.approach

One of `split.regression', `direct.interactions', `OWL', or `OWL.framework'.

name.outcome

A character indicating the name of the outcome variable in development.data.

type.outcome

Either `binary' or `continuous', the form of name.outcome.

name.treatment

A character indicating the name of the treatment variable in development.data.

names.influencing.treatment

A character vector (or single element) indicating the names of the variables in development.data that are expected to influence treatment assignment in the current dataset. Required for study.design=`observational'.

names.influencing.rule

A character vector (or single element) indicating the names of the variables in development.data that may influence response to treatment and are expected to be observed in future clinical settings.

desirable.outcome

A logical equal to TRUE if higher values of the outcome are considered desirable (e.g. for a binary outcome, a 1 is more desirable than a 0). The OWL.framework and OWL prediction approaches require a desirable outcome.

rule.method

One of `glm.regression', `lasso', or `ridge'. For type.outcome=`binary', `glm.regression' leads to logistic regression; for type.outcome=`continuous', `glm.regression' specifies linear regression. This is the underlying regression model used to develop the treatment rule.

propensity.method

One of `logistic.regression', `lasso', or `ridge'. This is the underlying regression model used to estimate propensity scores for study.design=`observational'.

additional.weights

A numeric vector of observation weights that will be multiplied by IPW weights in the rule development stage, with length equal to the number of rows in development.data. This can be used, for example, to account for a non-representative sampling design or to apply an IPW adjustment for missingness. The default is a vector of 1s.

truncate.propensity.score

A logical variable dictating whether estimated propensity scores that lie within truncate.propensity.score.threshold of 0 or 1 should be truncated so that they are exactly truncate.propensity.score.threshold away from 0 or 1. Default is TRUE.

truncate.propensity.score.threshold

A numeric value between 0 and 0.25 giving the threshold used when truncate.propensity.score=TRUE. Default is 0.05.
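The truncation described for truncate.propensity.score amounts to clamping each estimated score to the interval [threshold, 1 - threshold]. A minimal sketch, assuming a hypothetical helper named `truncate.ps()` (not part of the package):

```r
# Clamp propensity scores so none is closer than `threshold` to 0 or 1.
truncate.ps <- function(ps, threshold = 0.05) {
  pmin(pmax(ps, threshold), 1 - threshold)
}

# Scores of 0.01 and 0.97 are moved to 0.05 and 0.95; 0.30 is left alone.
truncate.ps(c(0.01, 0.30, 0.97))
```

Truncation bounds the resulting inverse-probability weights; with a threshold of 0.05, no weight of the form 1/ps can exceed 20.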

type.observation.weights

Default is NULL, but other choices are `IPW.L', `IPW.L.and.X', and `IPW.ratio', where L indicates names.influencing.treatment and X indicates names.influencing.rule. The default behavior is to use the `IPW.ratio' observation weights (propensity score based on X divided by propensity score based on L and X) for prediction.approach=`split.regression' and to use `IPW.L' observation weights (inverse of propensity score based on L) for the `direct.interactions', `OWL', and `OWL.framework' prediction approaches.
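The three weight types can be sketched in base R on simulated data as follows. The data, the helper `p.received()`, and the use of logistic regression are all assumptions for illustration; in particular, treating `IPW.L.and.X' as the inverse of the propensity score based on L and X, and using the probability of the treatment actually received for untreated individuals, are readings of the description above rather than confirmed package behavior.

```r
# Hypothetical toy data: L influences treatment, X may influence the rule.
set.seed(2)
n <- 300
L <- rnorm(n)
X <- rnorm(n)
A <- rbinom(n, size = 1, prob = plogis(0.8 * L + 0.4 * X))
dat <- data.frame(A = A, L = L, X = X)

# Estimated probability of the treatment each individual actually received.
p.received <- function(fit, A) {
  ps <- fitted(fit)
  ifelse(A == 1, ps, 1 - ps)
}

fit.L  <- glm(A ~ L,     family = binomial, data = dat)
fit.X  <- glm(A ~ X,     family = binomial, data = dat)
fit.LX <- glm(A ~ L + X, family = binomial, data = dat)

w.IPW.L       <- 1 / p.received(fit.L, A)                         # inverse of score based on L
w.IPW.L.and.X <- 1 / p.received(fit.LX, A)                        # inverse of score based on L and X
w.IPW.ratio   <- p.received(fit.X, A) / p.received(fit.LX, A)     # score on X over score on L and X
```

The ratio weights are typically less variable than the plain inverse weights because the numerator partly offsets small denominators.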

propensity.k.cv.folds

An integer specifying how many folds to use for K-fold cross-validation that chooses the tuning parameters when propensity.method is `lasso' or `ridge'. Default is 10.

rule.k.cv.folds

An integer specifying how many folds to use for K-fold cross-validation that chooses the tuning parameter when rule.method is `lasso' or `ridge'. Default is 10.

lambda.choice

Either `min' or `1se', corresponding to the s argument in predict.cv.glmnet() from the glmnet package. Only used when propensity.method or rule.method is `lasso' or `ridge'. Default is `min'.

OWL.lambda.seq

Used when prediction.approach=`OWL', a numeric vector that corresponds to the lambdas argument in the owl() function from the DynTxRegime package. Defaults to 2^seq(-5, 5, 1).

OWL.kernel

Used when prediction.approach=`OWL', a character equal to either `linear' or `radial'. Corresponds to the kernel argument in the owl() function from the DynTxRegime package. Default is `linear'.

OWL.kparam.seq

Used when prediction.approach=`OWL' and OWL.kernel=`radial'. Corresponds to the kparam argument in the owl() function from the DynTxRegime package. Defaults to 2^seq(-10, 10, 1).

OWL.cvFolds

Used when prediction.approach=`OWL', an integer corresponding to the cvFolds argument in the owl() function from the DynTxRegime package. Defaults to 10.

OWL.verbose

Used when prediction.approach=`OWL', a logical corresponding to the verbose argument in the owl() function from the DynTxRegime package. Defaults to TRUE.

OWL.framework.shift.by.min

Logical, set to TRUE by default in recognition of our empirical observation that, with a continuous outcome, the OWL framework performed far better in simulation studies when the outcome was shifted to have a minimum of just above 0.

direct.interactions.center.continuous.Y

Logical, set to TRUE by default in recognition of our empirical observation that, with a continuous outcome, direct-interactions performed far better in simulation studies when the outcome was mean-centered.
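The two outcome transformations mentioned for OWL.framework.shift.by.min and direct.interactions.center.continuous.Y can be illustrated in a few lines. This is a sketch only: the offset `eps` is a hypothetical choice of what "just above 0" means, and the package may use a different constant.

```r
# Hypothetical continuous outcome.
y <- c(-3.2, 0.5, 1.7, 4.0)

# Shift so the minimum sits just above 0 (eps is an assumed small offset).
eps <- 0.001
y.shifted <- y - min(y) + eps

# Mean-center so the outcome averages to 0.
y.centered <- y - mean(y)
```

Neither transformation changes the ordering of outcomes or the differences between them.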

direct.interactions.exclude.A.from.penalty

Logical, set to TRUE by default in recognition of our empirical observation that, with a continuous outcome and lasso/ridge specified as the rule.method, direct-interactions performed far better in simulation studies when the coefficient corresponding to the treatment variable was excluded from the penalty function.
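In glmnet, a coefficient can be left unpenalized by giving it a penalty factor of 0 through the penalty.factor argument. The sketch below shows that mechanism on a hypothetical design matrix with a treatment column named `A`; how BuildRule() actually constructs its design matrix is not described here, so the layout is an assumption. The model fit is skipped if glmnet is not installed.

```r
# Hypothetical design matrix: treatment indicator A plus three covariates.
set.seed(4)
n <- 100
A <- rbinom(n, size = 1, prob = 0.5)
X <- matrix(rnorm(n * 3), nrow = n, ncol = 3,
            dimnames = list(NULL, paste0("x", 1:3)))
design <- cbind(A = A, X)

# Penalty factor 0 leaves the coefficient of A unpenalized; 1 is the usual penalty.
pf <- ifelse(colnames(design) == "A", 0, 1)

if (requireNamespace("glmnet", quietly = TRUE)) {
  y <- rnorm(n)
  # alpha = 1 gives the lasso; alpha = 0 would give ridge.
  fit <- glmnet::cv.glmnet(design, y, alpha = 1, penalty.factor = pf)
}
```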

Value

A list with some combination of the following components (depending on the specified prediction.approach):

type.outcome: The type.outcome specified above (used by other functions that are based on BuildRule())
prediction.approach: The prediction.approach specified above (used by other functions that are based on BuildRule())
rule.method: The rule.method specified above (used by other functions that are based on BuildRule())
lambda.choice: The lambda.choice specified above (used by other functions that are based on BuildRule())
propensity.score.object: A list containing the relevant regression object from propensity score estimation. The list has two elements for type.observation.weights=`IPW.ratio' (the default for prediction.approach=`split.regression'), has one element for type.observation.weights=`IPW.L' (the default for `OWL', `OWL.framework' and `direct.interactions'), has one element when type.observation.weights=`IPW.L.and.X', and is simply equal to NA if study.design=`RCT' (in which case the propensity score is just the sample proportion receiving treatment, so no regression is fit).
owl.object: For prediction.approach=`OWL' only, the object returned by the owl() function in the DynTxRegime package.
observation.weights: The observation weights used for estimating the treatment rule
rule.object: For prediction.approach=`OWL.framework' or prediction.approach=`direct.interactions', the regression object returned from treatment rule estimation (to which the coef() function could be applied, for example)
rule.object.control: For prediction.approach=`split.regression', the regression object returned from treatment rule estimation (to which the coef() function could be applied, for example) that estimates the outcome variable for individuals who do not receive treatment.
rule.object.treatment: For prediction.approach=`split.regression', the regression object returned from treatment rule estimation (to which the coef() function could be applied, for example) that estimates the outcome variable for individuals who do receive treatment.
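To illustrate how a pair of objects like rule.object.control and rule.object.treatment can define a treatment rule, here is a minimal split-regression sketch in base R. It uses simulated data, hypothetical variable names, and unweighted glm() fits for brevity (the observation weights are omitted), so it shows the general technique rather than the package's own prediction code.

```r
# Hypothetical data: binary outcome y (1 is desirable), treatment A, covariate age.
set.seed(5)
n <- 400
age <- rnorm(n)
A <- rbinom(n, size = 1, prob = 0.5)
# In this simulation, treatment helps when age > 0 and harms when age < 0.
y <- rbinom(n, size = 1, prob = plogis(-0.2 + A * 1.5 * age))
dat <- data.frame(y = y, A = A, age = age)

# Fit one outcome model per treatment arm.
fit.control   <- glm(y ~ age, family = binomial, data = dat[dat$A == 0, ])
fit.treatment <- glm(y ~ age, family = binomial, data = dat[dat$A == 1, ])

# Predict each new patient's outcome under both options.
new.patients <- data.frame(age = c(-2, 2))
p0 <- predict(fit.control,   newdata = new.patients, type = "response")
p1 <- predict(fit.treatment, newdata = new.patients, type = "response")

# Recommend treatment when the predicted outcome is better under treatment.
recommend.treatment <- as.integer(p1 > p0)
```

With desirable.outcome=FALSE the comparison would flip, recommending treatment when the predicted outcome is lower under treatment.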

References

Yingqi Zhao, Donglin Zeng, A. John Rush, Michael R. Kosorok (2012) Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association, 107:499: 1106--1118.
Shuai Chen, Lu Tian, Tianxi Cai, Menggang Yu (2017) A general statistical framework for subgroup identification and comparative treatment scoring. Biometrics, 73:4: 1199--1209.
Lu Tian, Ash A. Alizadeh, Andrew J. Gentles, Robert Tibshirani (2014) A simple method for estimating interactions between a treatment and a large number of covariates. Journal of the American Statistical Association, 109:508: 1517--1532.
Jeremy Roth and Noah Simon (2019). Using propensity scores to develop and evaluate treatment rules with observational data (Manuscript in progress)
Jeremy Roth and Noah Simon (2019). Elucidating outcome-weighted learning and its comparison to split-regression: direct vs. indirect methods in practice. (Manuscript in progress)

Examples


# NOT RUN {
set.seed(123)
# Split the example data into three partitions
example.split <- SplitData(data=obsStudyGeneExpressions,
                           n.sets=3, split.proportions=c(0.5, 0.25, 0.25))
# Keep only the development partition for building the rule
development.data <- example.split[example.split$partition == "development",]
# Build a split-regression rule, using logistic regression for the propensity score
one.rule <- BuildRule(development.data=development.data,
                      study.design="observational",
                      prediction.approach="split.regression",
                      name.outcome="no_relapse",
                      type.outcome="binary",
                      desirable.outcome=TRUE,
                      name.treatment="intervention",
                      names.influencing.treatment=c("prognosis", "clinic", "age"),
                      names.influencing.rule=c("age", paste0("gene_", 1:10)),
                      propensity.method="logistic.regression",
                      rule.method="glm.regression")
# Inspect the coefficients of the control-arm and treatment-arm regressions
coef(one.rule$rule.object.control)
coef(one.rule$rule.object.treatment)
# }
