PAL: Penalized A-learning for optimal dynamic treatment regime

Description

Selects the important variables involved in the optimal treatment regime using the penalized A-learning estimating equation. This function can be applied to single-stage studies and to two-stage studies where treatments are sequentially assigned at two different time points.

Usage

PAL(formula, data, subset, na.action, IC = c("BIC", "CIC", "VIC"), 
    lambda.list = exp(seq(-3.5, 2, 0.1)), refit = TRUE, control = PAL.control(...), 
	model = TRUE, y = TRUE, a1 = TRUE, x1 = TRUE, a2 = TRUE, x2 = TRUE, ...)
	
PAL.fit(y, x1, x2 = NULL, a1, a2 = NULL, IC = c("BIC", "CIC", "VIC"), 
    lambda.list = exp(seq(-3.5, 2, 0.1)), refit = TRUE, 
    control = PAL.control())

Arguments

formula

A symbolic description of the model to be fitted, of the form y ~ x1 | a1 or y ~ x1 | a1 | x2 | a2. See 'Details'.

data

An optional list or environment containing variables in formula.

subset, na.action

Arguments controlling formula processing via model.frame.

IC

Information criterion used in determining the regularization parameter. See 'Details'.

lambda.list

A list of regularization parameter values. Default is exp(seq(-3.5, 2, 0.1)).

refit

After variable selection, should the coefficients be refitted using the A-learning estimating equation? Default is TRUE.

control

A list of control arguments specified via PAL.control.

model

A logical value indicating whether the model frame should be included as a component of the return value.

y, a1, x1, a2, x2

For PAL: logical values indicating whether the response, the first and second treatments, and the baseline and intermediate covariates should be included as components of the return value.

For PAL.fit: y is the response vector (the larger the better), a1 and a2 are the first and second treatments patients receive, x1 and x2 are the design matrices consisting of patients' baseline covariates and intermediate covariates.

…

Arguments passed to PAL.control.
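As a small illustration of the lambda.list argument described above, the following base R sketch builds the default grid and inspects it. The name coarse_grid is hypothetical and only shows how a sparser grid of the same form could be constructed; whether a coarser grid is adequate for a given data set is not something this sketch can tell you.

```r
# Default grid of candidate regularization parameters, as given in Usage.
lambda_grid <- exp(seq(-3.5, 2, 0.1))

n_lambda <- length(lambda_grid)   # number of candidate values
lambda_range <- range(lambda_grid) # smallest and largest candidate

# Hypothetical coarser grid on the same log scale (step 0.5 instead of 0.1).
coarse_grid <- exp(seq(-3.5, 2, 0.5))

print(n_lambda)
print(lambda_range)
print(length(coarse_grid))
```

The grid is equally spaced on the log scale, so candidates are denser near the small end of the range.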

Value

beta2.est

Estimated coefficients in the second decision rule.

beta1.est

Estimated coefficients in the first decision rule.

pi2.est

Estimated propensity score at the second stage.

pi1.est

Estimated propensity score at the first stage.

h2.est

Estimated baseline function at the second stage.

h1.est

Estimated baseline function at the first stage.

alpha2.est

Regression coefficients in the estimated propensity score at the second stage.

alpha1.est

Regression coefficients in the estimated propensity score at the first stage.

theta2.est

Regression coefficients in the estimated baseline function at the second stage.

theta1.est

Regression coefficients in the estimated baseline function at the first stage.

model

The full model frame (if model = TRUE).

y

Response vector (if y = TRUE).

x1

Baseline covariates (if x1 = TRUE).

a1

A vector of first treatment (if a1 = TRUE).

x2

Intermediate covariates (if x2 = TRUE).

a2

A vector of second treatment (if a2 = TRUE).

Details

Penalized A-learning is developed to select the important variables involved in the optimal individualized treatment regime. An individualized treatment regime is a function that maps patients' covariates to the space of available treatment options. The method can be applied to both single-stage and two-stage studies.

PAL applies the Dantzig selector to the A-learning estimating equation for variable selection. The regularization parameter in the Dantzig selector is chosen according to an information criterion. Specifically, we provide a Bayesian information criterion (BIC), a concordance information criterion (CIC) and a value information criterion (VIC).

To illustrate these information criteria, consider a single-stage study. Assume the data are summarized as $(Y_i, A_i, X_i), i=1,...,n$, where $Y_i$ is the response of the $i$-th patient, $A_i$ denotes the treatment that patient receives and $X_i$ is the corresponding $p$-dimensional vector of baseline covariates. Let $\hat{\pi}_i$ and $\hat{h}_i$ denote the estimated propensity score and baseline mean of the $i$-th patient.

For any linear treatment regime $I(x^T \beta>-c)$, BIC is defined as $$BIC=-n\log\left( \sum_{i=1}^n (A_i-\hat{\pi}_i)^2 (Y_i-\hat{h}_i-A_i c-A_i X_i^T \beta)^2 \right)-\|\beta\|_0 \kappa_B,$$ where $\kappa_B=\{\log (n)+\log (p+1) \}/\texttt{kappa}$ and kappa is the model complexity penalty used in the function PAL.control.

VIC is defined as $$VIC=\sum_{i=1}^n \left[ \left(\frac{A_i d_i}{\hat{\pi}_i}+\frac{(1-A_i) (1-d_i)}{1-\hat{\pi}_i} \right)\{Y_i-\hat{h}_i-A_i (X_i^T \beta+c)\}+ \{\hat{h}_i+\max(X_i^T \beta+c,0)\} \right]-\|\beta\|_0 \kappa_V,$$ where $d_i=I(X_i^T \beta>-c)$ and $\kappa_V=n^{1/3} \log^{2/3} (p) \log (\log (n))/\texttt{kappa}$.

CIC is defined as $$CIC=\sum_{i\neq j} \frac{1}{n} \left( \frac{(A_i-\hat{\pi}_i) \{Y_i-\hat{h}_i\} A_j}{\hat{\pi}_i (1-\hat{\pi}_i) \hat{\pi}_j}- \frac{(A_j-\hat{\pi}_j) \{Y_j-\hat{h}_j\} A_i}{\hat{\pi}_j (1-\hat{\pi}_j) \hat{\pi}_i} \right) I(X_i^T \beta> X_j^T \beta) -\|\beta\|_0 \kappa_C,$$ where $\kappa_C=\log (p) \log_{10}(n) \log(\log_{10}(n))/\texttt{kappa}$.
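To make the penalty terms and the BIC formula concrete, here is a minimal base R sketch that evaluates them on simulated single-stage data. It is not the package's internal code: the helper names ic_penalties and bic_value are hypothetical, kappa = 1 is a placeholder rather than the PAL.control default, and the propensity score and baseline mean are plugged in as known constants instead of being estimated.

```r
# Penalty constants kappa_B, kappa_V, kappa_C from the formulas above.
# kappa = 1 is a placeholder value, not necessarily the package default.
ic_penalties <- function(n, p, kappa = 1) {
  c(kappa_B = (log(n) + log(p + 1)) / kappa,
    kappa_V = n^(1/3) * log(p)^(2/3) * log(log(n)) / kappa,
    kappa_C = log(p) * log10(n) * log(log10(n)) / kappa)
}

# BIC for a candidate linear regime with coefficients beta and intercept c0.
bic_value <- function(y, a, x, beta, c0, pi_hat, h_hat, kappa = 1) {
  n <- length(y)
  p <- ncol(x)
  resid <- y - h_hat - a * c0 - a * drop(x %*% beta)
  wrss <- sum((a - pi_hat)^2 * resid^2)          # weighted residual sum of squares
  -n * log(wrss) - sum(beta != 0) * ic_penalties(n, p, kappa)[["kappa_B"]]
}

set.seed(1)
n <- 100
p <- 5
x <- matrix(rnorm(n * p), n, p)
a <- rbinom(n, 1, 0.5)
y <- 1 + a * (x[, 1] + x[, 2]) + 0.5 * rnorm(n)

pi_hat <- rep(0.5, n)  # assumption: known randomization probability
h_hat <- rep(1, n)     # assumption: known baseline mean

# Compare the data-generating coefficients with the empty model.
bic_true <- bic_value(y, a, x, c(1, 1, 0, 0, 0), 0, pi_hat, h_hat)
bic_empty <- bic_value(y, a, x, rep(0, p), 0, pi_hat, h_hat)
print(c(true = bic_true, empty = bic_empty))
```

Under this sign convention a larger BIC is better, so the candidate that matches the data-generating contrast should score higher than the empty model despite its two-variable penalty.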

Under certain conditions, it can be shown that CIC and VIC are consistent as long as either the estimated propensity score or the estimated baseline function is consistent.
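The VIC defined in this section combines an inverse-probability-weighted residual term with a model-based term, and the following base R sketch evaluates it for a candidate regime. The function name vic_value is hypothetical, kappa = 1 is a placeholder, the bracketed summand is taken to lie entirely inside the sum over patients, and the nuisance quantities are supplied as known constants; it is an illustration of the formula rather than the package's implementation.

```r
# VIC for a candidate linear regime with coefficients beta and intercept c0.
vic_value <- function(y, a, x, beta, c0, pi_hat, h_hat, kappa = 1) {
  n <- length(y)
  p <- ncol(x)
  contrast <- drop(x %*% beta) + c0       # X_i^T beta + c
  d <- as.numeric(contrast > 0)           # recommended treatment d_i
  # Weight is nonzero only when the observed treatment agrees with d_i.
  ipw <- a * d / pi_hat + (1 - a) * (1 - d) / (1 - pi_hat)
  value_sum <- sum(ipw * (y - h_hat - a * contrast) + h_hat + pmax(contrast, 0))
  kappa_V <- n^(1/3) * log(p)^(2/3) * log(log(n)) / kappa
  value_sum - sum(beta != 0) * kappa_V
}

set.seed(2)
n <- 500
p <- 5
x <- matrix(rnorm(n * p), n, p)
a <- rbinom(n, 1, 0.5)
y <- 1 + a * (x[, 1] + x[, 2]) + 0.5 * rnorm(n)

pi_hat <- rep(0.5, n)  # assumption: known randomization probability
h_hat <- rep(1, n)     # assumption: known baseline mean

vic_true <- vic_value(y, a, x, c(1, 1, 0, 0, 0), 0, pi_hat, h_hat)
vic_empty <- vic_value(y, a, x, rep(0, p), 0, pi_hat, h_hat)
print(c(true = vic_true, empty = vic_empty))
```

With an all-zero beta and c0 = 0 the sketch recommends no treatment for anyone, so the comparison shows how much estimated value the two-variable regime adds after paying its penalty.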

For a single-stage study, the formula should be specified as y ~ x1 | a1, where y is the response vector (y should be specified in such a way that a larger value of y indicates a better clinical outcome), x1 is the design matrix of patients' baseline covariates and a1 is the vector of treatments that patients receive.

For a two-stage study, the formula should be specified as y ~ x1 | a1 | x2 | a2, where y is the response vector, a1 and a2 are the vectors of patients' first and second treatments, and x1 and x2 are the design matrices consisting of patients' baseline covariates and intermediate covariates.

PAL standardizes the covariates and includes an intercept in the estimated individualized treatment regime by default. For a single-stage study, the estimated treatment regime is given by $I(\texttt{x1}^T \texttt{beta1.est}>0)$. For a two-stage study, the estimated regime is given by $\texttt{a1}=I(\texttt{x1}^T \texttt{beta1.est}>0)$ and $\texttt{a2}=I(\texttt{x}^T \texttt{beta2.est}>0)$, where x=c(x1, a1, x2).
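The two-stage decision rules above can be sketched in base R as follows. The helper linear_rule and the coefficient vectors beta1 and beta2 are hypothetical values chosen for illustration, not output from PAL. The sketch also assumes that the intercept is the first coefficient and that the covariates are already on the scale the coefficients refer to; check the length and ordering of beta1.est and beta2.est in an actual fit before reusing this.

```r
# Linear decision rule I(design^T beta > 0) with an intercept column prepended.
# Assumption: beta[1] is the intercept, followed by one entry per design column.
linear_rule <- function(design, beta) {
  as.integer(drop(cbind(1, design) %*% beta) > 0)
}

# Three patients, two baseline covariates.
x1 <- matrix(c( 1.0, -1.0,
                0.5,  0.2,
               -2.0,  1.0), nrow = 3, byrow = TRUE)

# Hypothetical first-stage coefficients: intercept, then the two covariates.
beta1 <- c(0, 1, 1)
a1 <- linear_rule(x1, beta1)

# Intermediate covariate observed after the first treatment.
x2 <- c(0.3, -0.4, 1.5)

# Hypothetical second-stage coefficients for x = c(x1, a1, x2), intercept first.
beta2 <- c(-0.5, 0, 0, 1, 1)
a2 <- linear_rule(cbind(x1, a1, x2), beta2)

print(a1)
print(a2)
```

The second-stage design stacks the baseline covariates, the first treatment and the intermediate covariates in that order, matching x=c(x1, a1, x2).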

References

Shi, C. and Fan, A. and Song, R. and Lu, W. (2018) High-Dimensional A-Learning for Optimal Dynamic Treatment Regimes. Annals of Statistics, 46: 925-957.

Shi, C. and Song, R. and Lu, W. (2018) Concordance and Value Information Criteria for Optimal Treatment Decision. Under review.

Examples


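# NOT RUN {
## single-stage study
set.seed(12345)
n <- 200                                   # number of patients
p <- 1000                                  # number of baseline covariates
X <- matrix(rnorm(n*p), nrow=n, ncol=p)    # baseline covariates
A <- rbinom(n, 1, 0.5)                     # randomized binary treatment
CX <- (X[,1] + X[,2])                      # treatment contrast: only X1 and X2 matter
h <- 1 + X[,1] * X[,3]                     # baseline function
Y <- h + A*CX + 0.5*rnorm(n)               # response (larger is better)
result <- PAL(Y~X|A)

## two-stage study
set.seed(12345*2)
n <- 200
p <- 1000
X1 <- matrix(rnorm(n*p), nrow=n, ncol=p)   # baseline covariates
A1 <- rbinom(n, 1, 0.5)                    # first-stage treatment
X2 <- X1[,1] + A1 + 0.5*rnorm(n)           # intermediate covariate
A2 <- rbinom(n, 1, 0.5)                    # second-stage treatment
Y <- A2*(A1 + X2) + A1*X1[,1] + 0.5*rnorm(n)
result <- PAL(Y~X1|A1|X2|A2)
# }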
# NOT RUN {
## same simulations at a smaller scale (n = 50, p = 20)
## single-stage study
set.seed(12345)
n <- 50
p <- 20
X <- matrix(rnorm(n*p), nrow=n, ncol=p)
A <- rbinom(n, 1, 0.5)
CX <- (X[,1] + X[,2])
h <- 1 + X[,1] * X[,3]
Y <- h + A*CX + 0.5*rnorm(n)
result <- PAL(Y~X|A)

## two-stage study
set.seed(12345*2)
n <- 50
p <- 20
X1 <- matrix(rnorm(n*p), nrow=n, ncol=p)
A1 <- rbinom(n, 1, 0.5)
X2 <- X1[,1] + A1 + 0.5*rnorm(n)
A2 <- rbinom(n, 1, 0.5)
Y <- A2*(A1 + X2) + A1*X1[,1] + 0.5*rnorm(n)
result <- PAL(Y~X1|A1|X2|A2)
# }