AIM (version 1.01)

lm.main: Main effect linear adaptive index model

Description

Estimate an adaptive index model for continuous outcomes in the context of linear regression. The resulting index characterizes the main covariate effect on the continuous response.

Usage

lm.main(x, y, nsteps=8, backfit=F, mincut=.1, maxnumcut=1, dirp=0)

Arguments

x
n by p matrix. The covariate matrix
y
n vector. The continuous response variable
nsteps
the maximum number of binary rules to be included in the index
backfit
T/F. Whether the existing split points are adjusted after including a new binary rule
mincut
The minimum cutting proportion for the binary rule at either end. It typically is between 0 and 0.2.
maxnumcut
The maximum number of binary splits per predictor
dirp
p vector. The given direction of the binary split for each of the p predictors. 0 represents "no pre-given direction"; 1 represents "(x>cut)"; -1 represents "(x<cut)".
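
To illustrate how mincut constrains the binary rules, the following is a minimal base-R sketch, not code from the AIM package. The helper candidate.cuts and its quantile grid are hypothetical; the package may search cut points differently. The sketch only shows that every candidate cut leaves at least a proportion mincut of the observations on each side.

```r
## Hypothetical helper (not part of AIM): candidate cut points for one
## predictor, taken as quantiles between mincut and 1 - mincut.
## Assumption: the 0.05 grid spacing is arbitrary and for illustration only.
candidate.cuts <- function(xj, mincut = 0.1) {
  probs <- seq(mincut, 1 - mincut, by = 0.05)
  unique(quantile(xj, probs = probs, names = FALSE))
}

set.seed(1)
xj <- rnorm(200)
cuts <- candidate.cuts(xj, mincut = 0.1)

## proportion of observations satisfying the rule (x > cut) for each candidate
prop.above <- sapply(cuts, function(cc) mean(xj > cc))
range(prop.above)
```

With mincut=0.1, no candidate rule selects fewer than roughly 10 percent or more than roughly 90 percent of the observations.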

Value

lm.main returns maxsc, the score test statistic achieved by the fitted model, and res, a list with components
jmaa
number of predictors
cutp
split points for the binary rules
maxdir
direction of split: 1 represents "(x>cut)" and -1 represents "(x<cut)"
maxsc
observed score test statistics for the main effect
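
As a rough illustration of how these components could define an index, the base-R sketch below counts how many binary rules each observation satisfies. It is not the package's index.prediction function. The helper index.sketch is hypothetical, and it assumes that jmaa holds the column indices of the selected predictors and that cutp and maxdir are aligned with it element by element.

```r
## Hypothetical helper (not the package's index.prediction): evaluate an
## index as the number of binary rules each row of x satisfies.
## Assumption: jmaa holds column indices; cutp and maxdir align with jmaa.
index.sketch <- function(x, jmaa, cutp, maxdir) {
  idx <- numeric(nrow(x))
  for (k in seq_along(jmaa)) {
    xk <- x[, jmaa[k]]
    ## direction 1 means (x > cut); direction -1 means (x < cut)
    rule <- if (maxdir[k] == 1) xk > cutp[k] else xk < cutp[k]
    idx <- idx + rule
  }
  idx
}

set.seed(1)
x <- matrix(rnorm(50 * 5), 50, 5)

## made-up rules for illustration: (x1 < 0.2) + (x5 > 0.2)
z.sketch <- index.sketch(x, jmaa = c(1, 5), cutp = c(0.2, 0.2), maxdir = c(-1, 1))
all(z.sketch == (x[, 1] < 0.2) + (x[, 5] > 0.2))
```

For fitted models, use index.prediction as in the Examples section rather than this sketch.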

Details

lm.main sequentially estimates a sequence of adaptive index models with up to nsteps terms for continuous outcomes. The appropriate number of binary rules can be selected via K-fold cross-validation (cv.lm.main).
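
The sequential idea can be sketched in base R as a greedy search that adds one binary rule at a time. This is not the AIM implementation: the helper add.rule is hypothetical, the correlation between y and the index is used as a simple stand-in for the score test statistic, the cut-point grid is arbitrary, and backfitting is omitted. Each predictor is used at most once, mirroring maxnumcut=1.

```r
## Hypothetical sketch (not the AIM implementation): one greedy step that adds
## the binary rule giving the largest cor(y, index), a stand-in for the score test.
add.rule <- function(x, y, index, used = integer(0), mincut = 0.1) {
  best <- list(stat = -Inf)
  for (j in setdiff(seq_len(ncol(x)), used)) {
    ## assumption: candidate cuts are deciles between mincut and 1 - mincut
    cuts <- quantile(x[, j], probs = seq(mincut, 1 - mincut, by = 0.1), names = FALSE)
    for (cc in cuts) {
      for (d in c(1, -1)) {
        rule <- if (d == 1) x[, j] > cc else x[, j] < cc
        stat <- cor(y, index + rule)
        if (stat > best$stat) {
          best <- list(j = j, cut = cc, dir = d, stat = stat, index = index + rule)
        }
      }
    }
  }
  best
}

set.seed(1)
n <- 500
p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- (x[, 1] < 0.2) + (x[, 5] > 0.2) + rnorm(n)

## build a two-rule index, one rule per step
index <- numeric(n)
picked <- integer(0)
for (step in 1:2) {
  fit <- add.rule(x, y, index, used = picked)
  index <- fit$index
  picked <- c(picked, fit$j)
}
picked
```

In this simulated setting the response depends only on predictors 1 and 5, so picked shows which columns the greedy search selected in its first two steps.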

References

Lu Tian and Robert Tibshirani (2010) "Adaptive index models for marker-based risk stratification", Tech Report, available at http://www-stat.stanford.edu/~tibs/AIM.

Examples

## generate data
set.seed(1)

n=500
p=20
x=matrix(rnorm(n*p), n, p)
z=(x[,1]<0.2)+(x[,5]>0.2)
beta=1
y=beta*z+rnorm(n)


## fit the main effects linear AIM
a=lm.main(x, y, nsteps=10)
 
## examine the model sequence 
print(a)


## compute the index based on the 2nd model of the sequence using data x 
z.prd=index.prediction(a$res[[2]],x)

## compute the index based on the 2nd model of the sequence using new data xx, and compare the result with the true index
nn=10
xx=matrix(rnorm(nn*p), nn, p)
zz=(xx[,1]<0.2)+(xx[,5]>0.2)
zz.prd=index.prediction(a$res[[2]],xx) 
cbind(zz, zz.prd)
