sim_pte: Simulations for Personalized Treatment Effects

Description

Numerical simulation for treatment effect heterogeneity estimation, as described in Tian et al. (2012).

Usage

sim_pte(n = 1000, p = 20, rho = 0, sigma = sqrt(2), beta.den = 4)

Arguments

n

number of observations.

p

number of predictors.

rho

covariance between predictors.

sigma

multiplier of the error term ($\sigma_{0}$ in Details).

beta.den

size of main effects relative to interaction effects. See details.
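As a rough illustration of what beta.den controls, the snippet below evaluates the main-effect coefficient formula given under Details for p = 10. The helper name beta_for is hypothetical and is not part of the package. Because the interaction coefficients are fixed at plus or minus 1/2, a larger beta.den shrinks the main effects relative to the interactions.

```r
# Illustrative only: main-effect coefficients from the Details formula.
j <- 1:10

# Hypothetical helper (not in the uplift package): beta_j for a given beta.den
beta_for <- function(beta.den) (-1)^(j + 1) * (j >= 3 & j <= 10) / beta.den

beta_for(4)  # non-zero entries (j = 3..10) alternate between +0.25 and -0.25
beta_for(1)  # non-zero entries (j = 3..10) alternate between +1 and -1
```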

Value

A data frame containing the response variable ($Y$), the treatment assignment (treat = 1 for treatment, treat = -1 for control), the predictor variables ($X$), and the "true" treatment effect score (ts).

Details

sim_pte simulates data according to the following specification:

$$Y = I\left(\sum_{j=1}^p \beta_{j}X_{j} + \sum_{j=1}^p \gamma_{j}X_{j}T + \sigma_{0}\epsilon > 0\right),$$

where $\gamma=(1/2,-1/2,1/2,-1/2,0,\ldots,0)$, $\beta_{j}=(-1)^{j+1}I(3 \leq j \leq 10) / \texttt{beta.den}$, $(X_{1}, \ldots, X_{p})$ follows a mean-zero multivariate normal distribution with a compound symmetric variance-covariance matrix, $(1-\rho)\mathbf{I}_{p} +\rho \mathbf{1}\mathbf{1}^{\top}$ (with $\mathbf{1}$ a column vector of $p$ ones), $T \in \{-1,1\}$ is the treatment indicator, and $\epsilon$ is $N(0,1)$.
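The specification above can be sketched in base R as follows. This is a hypothetical re-implementation for illustration, not the package source: the function name sim_pte_sketch, the equal-probability treatment assignment, and the Cholesky-based draw of the predictors are all assumptions.

```r
# Hypothetical sketch of the data-generating process; not the uplift source code.
sim_pte_sketch <- function(n = 1000, p = 20, rho = 0, sigma = sqrt(2), beta.den = 4) {
  stopifnot(p >= 4)  # gamma has four non-zero leading entries

  # Compound symmetric covariance: 1 on the diagonal, rho elsewhere
  Sigma <- (1 - rho) * diag(p) + rho * matrix(1, p, p)

  # Draw X ~ N(0, Sigma): chol() returns R with t(R) %*% R == Sigma,
  # so rows of Z %*% R have covariance Sigma (requires Sigma positive definite)
  X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)
  colnames(X) <- paste0("X", seq_len(p))

  j <- seq_len(p)
  beta  <- (-1)^(j + 1) * (j >= 3 & j <= 10) / beta.den
  gamma <- c(0.5, -0.5, 0.5, -0.5, rep(0, p - 4))

  # Assumption: treatment assigned at random with probability 1/2
  treat <- sample(c(-1, 1), n, replace = TRUE)

  # Binary response: indicator that the latent linear index is positive
  latent <- drop(X %*% beta) + drop(X %*% gamma) * treat + sigma * rnorm(n)
  data.frame(y = as.integer(latent > 0), treat = treat, X)
}

set.seed(1)
d <- sim_pte_sketch(n = 200, p = 10)
```

Here d has one row per observation, with columns y, treat, and X1 through X10.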

In this case, the "true" treatment effect score, $\mathrm{Prob}(Y=1|X,T=1) - \mathrm{Prob}(Y=1|X,T=-1)$, is given by

$$\Phi\left(\frac{\sum_{j=1}^p (\beta_{j} + \gamma_{j})X_{j}}{\sigma_{0}}\right) - \Phi\left(\frac{\sum_{j=1}^p (\beta_{j} - \gamma_{j})X_{j}}{\sigma_{0}}\right).$$
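The score follows because $\epsilon$ is standard normal, so for a fixed treatment value $t$ the probability that the latent index is positive is $\Phi((\sum_{j}\beta_{j}X_{j} + t\sum_{j}\gamma_{j}X_{j})/\sigma_{0})$. A minimal sketch of that calculation is shown below. The helper name true_score is hypothetical, and the covariate vector is made up purely for illustration.

```r
# Hypothetical helper, not part of the uplift package.
# X: numeric matrix (rows = observations); beta, gamma: coefficient vectors.
true_score <- function(X, beta, gamma, sigma) {
  main  <- drop(X %*% beta)   # sum_j beta_j  * X_j
  inter <- drop(X %*% gamma)  # sum_j gamma_j * X_j
  pnorm((main + inter) / sigma) - pnorm((main - inter) / sigma)
}

p <- 10
j <- seq_len(p)
beta  <- (-1)^(j + 1) * (j >= 3 & j <= 10) / 4    # beta.den = 4
gamma <- c(0.5, -0.5, 0.5, -0.5, rep(0, p - 4))

# Illustrative covariate vector: X1 = 1, all other predictors 0
x <- matrix(c(1, rep(0, p - 1)), nrow = 1)
score <- true_score(x, beta, gamma, sigma = sqrt(2))
```

For this vector only the interaction term is active, so the score reduces to $2\Phi(0.5/\sqrt{2}) - 1$. A vector of all zeros gives a score of 0.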

References

Tian, L., Alizadeh, A., Gentles, A., and Tibshirani, R. (2012). A simple method for detecting interactions between a treatment and a large number of covariates. Submitted Dec 2012. arXiv:1212.2995 [stat.ME].

Guelman, L., Guillen, M., and Perez-Marin, A.M. (2013). Optimal personalized treatment rules for marketing interventions: A review of methods, a new proposal, and an insurance case study. Submitted.

Examples

library(uplift)
### Simulate training data

set.seed(12345)
dd <- sim_pte(n = 1000, p = 10, rho = 0, sigma =  sqrt(2), beta.den = 4)
dd$treat <- ifelse(dd$treat == 1, 1, 0) # required coding for upliftRF

### Fit model

form <- as.formula(paste('y ~', 'trt(treat) +', paste('X', 1:10, sep = '', collapse = "+"))) 

fit1 <- upliftRF(formula = form,
                 data = dd, 
                 ntree = 100, 
                 split_method = "Int",
                 interaction.depth = 3,
                 minsplit = 100, 
                 minbucket_ct0 = 50, 
                 minbucket_ct1 = 50,
                 verbose = TRUE)
summary(fit1)
