Penalized A-learning (PAL) is developed to select important variables involved in the optimal
individualized treatment regime. An individualized treatment regime is a function that maps patients'
covariates to the space of available treatment options. The method can be applied to both single-stage
and two-stage studies.
PAL applies the Dantzig selector to the A-learning estimating equation for variable selection. The
regularization parameter in the Dantzig selector is chosen according to an information criterion.
Specifically, we provide a Bayesian information criterion (BIC), a concordance information criterion
(CIC) and a value information criterion (VIC). For illustration of these information criteria, consider
a single-stage study. Assume the data are summarized as \((Y_i, A_i, X_i), i=1,...,n\) where \(Y_i\)
is the response of the \(i\)-th patient, \(A_i\) denotes the treatment that the patient receives and
\(X_i\) denotes the corresponding baseline covariates. Let \(\hat{\pi}_i\) and \(\hat{h}_i\) denote the
estimated propensity score and baseline mean of the \(i\)-th patient. For any linear treatment regime
\(I(x^T \beta>-c)\), BIC is defined as
$$BIC=-n\log\left( \sum_{i=1}^n (A_i-\hat{\pi}_i)^2 (Y_i-\hat{h}_i-A_i c-A_i X_i^T \beta)^2 \right)-\|\beta\|_0 \kappa_B,$$
where \(\kappa_B=\{\log (n)+\log (p+1) \}/\code{kappa}\) and kappa
is the model complexity penalty used in the function PAL.control.
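The BIC formula above can be evaluated directly for any candidate \((\beta, c)\). The following is a minimal base-R sketch, not the package's internal code: the helper name pal_bic and its arguments are hypothetical, and the propensity score and baseline mean are assumed to have been estimated beforehand.

```r
# Hypothetical helper (not part of the package): evaluates the BIC formula
# for a linear regime with coefficient vector beta and offset c0.
pal_bic <- function(y, a, x, pi_hat, h_hat, beta, c0, kappa = 1) {
  n <- length(y)
  p <- ncol(x)
  # Residual of the A-learning model: Y - h - A * (c + X^T beta)
  resid <- y - h_hat - a * (c0 + drop(x %*% beta))
  loss <- sum((a - pi_hat)^2 * resid^2)
  kappa_B <- (log(n) + log(p + 1)) / kappa
  # Larger is better: -n * log(loss) minus the penalty on the support size
  -n * log(loss) - sum(beta != 0) * kappa_B
}

# Simulated illustration (all values below are made up for the example).
set.seed(1)
n <- 200
p <- 5
x <- matrix(rnorm(n * p), n, p)
a <- rbinom(n, 1, 0.5)
y <- 1 + x[, 3] + a * (x[, 1] + x[, 2]) + 0.5 * rnorm(n)
pi_hat <- rep(0.5, n)   # assumed known randomization probability
h_hat <- 1 + x[, 3]     # assumed correctly specified baseline mean

bic_true <- pal_bic(y, a, x, pi_hat, h_hat, beta = c(1, 1, 0, 0, 0), c0 = 0)
bic_null <- pal_bic(y, a, x, pi_hat, h_hat, beta = rep(0, p), c0 = 0)
```

In this simulated setting the regime built from the true contrast should score higher than the empty model, since the criterion is maximized.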
VIC is defined as
$$VIC=\sum_{i=1}^n \left[ \left(\frac{A_i d_i}{\hat{\pi}_i}+\frac{(1-A_i) (1-d_i)}{1-\hat{\pi}_i} \right)\{Y_i-\hat{h}_i-A_i (X_i^T \beta+c)\}+
\{\hat{h}_i+\max(X_i^T \beta+c,0)\} \right]-\|\beta\|_0 \kappa_V,$$
where \(d_i=I(X_i^T \beta>-c)\) and \(\kappa_V=n^{1/3} \log^{2/3} (p) \log (\log (n))/\code{kappa}\).
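As a rough illustration, the VIC formula can be computed as follows in base R. This is a sketch under the assumption that the sum runs over both bracketed terms; the helper name pal_vic and its arguments are hypothetical and do not come from the package.

```r
# Hypothetical helper (not part of the package): evaluates the VIC formula
# for a linear regime with coefficient vector beta and offset c0.
pal_vic <- function(y, a, x, pi_hat, h_hat, beta, c0, kappa = 1) {
  n <- length(y)
  p <- ncol(x)
  contrast <- c0 + drop(x %*% beta)   # X_i^T beta + c
  d <- as.numeric(contrast > 0)       # d_i = I(X_i^T beta > -c)
  # Inverse-probability weight for patients whose treatment agrees with d_i
  w <- a * d / pi_hat + (1 - a) * (1 - d) / (1 - pi_hat)
  value <- sum(w * (y - h_hat - a * contrast) + h_hat + pmax(contrast, 0))
  kappa_V <- n^(1 / 3) * log(p)^(2 / 3) * log(log(n)) / kappa
  value - sum(beta != 0) * kappa_V
}

# Small made-up example: three patients, one covariate.
vic_example <- pal_vic(
  y = c(1, 2, 3), a = c(1, 0, 1), x = matrix(c(1, -1, 2), 3, 1),
  pi_hat = rep(0.5, 3), h_hat = rep(0, 3), beta = 1, c0 = 0
)
```

Note that \(\kappa_V\) involves \(\log(\log(n))\), so the penalty is only meaningful for sample sizes well above \(e\).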
CIC is defined as
$$CIC=\sum_{i\neq j} \frac{1}{n} \left( \frac{(A_i-\hat{\pi}_i) \{Y_i-\hat{h}_i\} A_j}{\hat{\pi}_i (1-\hat{\pi}_i) \hat{\pi}_j}-
\frac{(A_j-\hat{\pi}_j) \{Y_j-\hat{h}_j\} A_i}{\hat{\pi}_j (1-\hat{\pi}_j) \hat{\pi}_i} \right) I(X_i^T \beta> X_j^T \beta)
-\|\beta\|_0 \kappa_C,$$
where \(\kappa_C=\log (p) \log_{10}(n) \log(\log_{10}(n))/\code{kappa}\).
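The pairwise sum in CIC can be vectorized with outer products. The sketch below is one possible base-R evaluation, with the hypothetical helper name pal_cic; it builds full \(n \times n\) matrices, so it is intended only for moderate sample sizes.

```r
# Hypothetical helper (not part of the package): evaluates the CIC formula
# for a linear regime with coefficient vector beta.
pal_cic <- function(y, a, x, pi_hat, h_hat, beta, kappa = 1) {
  n <- length(y)
  p <- ncol(x)
  score <- drop(x %*% beta)
  # u_i = (A_i - pi_i)(Y_i - h_i) / {pi_i (1 - pi_i)},  v_i = A_i / pi_i
  u <- (a - pi_hat) * (y - h_hat) / (pi_hat * (1 - pi_hat))
  v <- a / pi_hat
  pair <- outer(u, v) - outer(v, u)   # entry (i, j): u_i v_j - u_j v_i
  ahead <- outer(score, score, ">")   # entry (i, j): I(X_i^T beta > X_j^T beta)
  kappa_C <- log(p) * log10(n) * log(log10(n)) / kappa
  # The diagonal of `ahead` is FALSE, so terms with i = j drop out
  sum(pair * ahead) / n - sum(beta != 0) * kappa_C
}

# Small made-up example: two patients, one covariate.
cic_example <- pal_cic(
  y = c(2, 0), a = c(1, 1), x = matrix(c(1, 0), 2, 1),
  pi_hat = rep(0.5, 2), h_hat = rep(0, 2), beta = 1
)
```

Because CIC depends on \(\beta\) only through the ranking of \(X_i^T \beta\), the offset \(c\) does not appear among the arguments.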
Under certain conditions, it can be shown that CIC and VIC are consistent as long as either the estimated
propensity score or the estimated baseline mean is consistent.
For a single-stage study, the formula should be specified as y ~ x1 | a1, where y is the response vector (y
should be specified in such a way that a larger value of y indicates a better clinical outcome), x1 is
the patients' baseline covariates and a1 is the treatment that each patient receives.
For a two-stage study, the formula should be specified as y ~ x1 | a1 | x2 | a2, where y is the response
vector, a1 and a2 are the vectors of patients' first and second treatments, and x1 and x2 are the design
matrices consisting of patients' baseline covariates and intermediate covariates, respectively.
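The two formula layouts can be illustrated with simulated data. In the sketch below all data-generating choices are made up, and the commented-out call shows only an assumed usage pattern in which the formula is passed as the first argument of PAL.

```r
# Simulated two-stage data; every numeric choice here is illustrative.
set.seed(2024)
n <- 100
x1 <- matrix(rnorm(n * 3), n, 3)   # baseline covariates (design matrix)
a1 <- rbinom(n, 1, 0.5)            # first-stage treatment (0/1)
x2 <- matrix(rnorm(n * 2), n, 2)   # intermediate covariates (design matrix)
a2 <- rbinom(n, 1, 0.5)            # second-stage treatment (0/1)
# Larger y is coded as the better clinical outcome
y <- x1[, 1] + a1 * x1[, 2] + a2 * x2[, 1] + rnorm(n)

single_stage <- y ~ x1 | a1
two_stage <- y ~ x1 | a1 | x2 | a2

# Assumed call pattern (not run here):
# fit <- PAL(y ~ x1 | a1 | x2 | a2)
```

The parts separated by | appear in chronological order: baseline covariates, first treatment, intermediate covariates, second treatment.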
PAL standardizes the covariates and includes an intercept in the estimated individualized treatment
regime by default. For a single-stage study, the estimated treatment regime is given by \(I(\code{x1}^T \code{beta1.est}>0)\).
For a two-stage study, the estimated regime is given by \(\code{a1}=I(\code{x1}^T \code{beta1.est}>0)\) and \(\code{a2}=I(\code{x}^T \code{beta2.est}>0)\),
where x=c(x1, a1, x2).
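Applying the estimated rules to covariate values might look like the sketch below. The helper apply_rule and the coefficient values are hypothetical. The sketch assumes that the first element of each coefficient vector is the intercept and that the coefficients are on the same scale as the supplied covariates; neither assumption is confirmed here. It also plugs the recommended first-stage treatment into the second-stage covariate vector, which is an illustrative choice.

```r
# Hypothetical helper: apply a linear rule I(x^T beta > 0) to each row of x.
# Assumes beta[1] is the intercept, so a column of ones is prepended to x.
apply_rule <- function(x, beta) {
  x <- cbind(1, as.matrix(x))
  as.numeric(drop(x %*% beta) > 0)
}

# Made-up covariates for three patients.
x1 <- matrix(c( 1.0, -0.5,
               -2.0,  0.3,
                0.4,  1.2), ncol = 2, byrow = TRUE)
x2 <- matrix(c(0.2, 1.0, -0.7), ncol = 1)

# Made-up coefficients standing in for beta1.est and beta2.est.
beta1 <- c(0, 1, 0)              # intercept, then the two columns of x1
beta2 <- c(-0.5, 0, 0, 1, 1)     # intercept, x1 (2 columns), a1, x2

a1 <- apply_rule(x1, beta1)                   # first-stage recommendation
a2 <- apply_rule(cbind(x1, a1, x2), beta2)    # uses x = c(x1, a1, x2)
```

The second-stage rule depends on the first-stage treatment through the a1 column, which is why the two stages are evaluated in sequence.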