AIM (version 1.01)

logistic.main: Main effect logistic adaptive index model

Description

Estimate an adaptive index model for binary outcomes in the context of logistic regression. The resulting index characterizes the main covariate effect on the response probability.

Usage

logistic.main(x, y, nsteps=8, mincut=.1, backfit=F, maxnumcut=1, dirp=0, weight=1)

Arguments

x
n by p matrix. The covariate matrix
y
n 0/1 vector. The binary response variable
nsteps
the maximum number of binary rules to be included in the index
backfit
T/F. Whether the existing split points are adjusted after including a new binary rule
mincut
The minimum cutting proportion for the binary rule at either end. It is typically between 0 and 0.2.
maxnumcut
The maximum number of binary splits per predictor
dirp
p vector. The given direction of the binary split for each of the p predictors. 0 represents "no pre-given direction"; 1 represents "(x>cut)"; -1 represents "(x<cut)".
weight
a positive number. The weight given to responses. "weight=0" means that all observations are equally weighted.
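
To illustrate how mincut and dirp constrain a single binary rule, here is a minimal sketch in base R. It is not the package's internal code: the helper names candidate_cuts and binary_rule are hypothetical, and expressing the mincut restriction through empirical quantiles is an assumption made for illustration.

```r
# Hypothetical helper: candidate cut points for one predictor, restricted so
# that roughly a fraction `mincut` of the observations lies beyond each end.
# Assumption: the restriction is expressed through empirical quantiles.
candidate_cuts <- function(xj, mincut = 0.1) {
  lo <- quantile(xj, mincut, names = FALSE)
  hi <- quantile(xj, 1 - mincut, names = FALSE)
  sort(unique(xj[xj >= lo & xj <= hi]))
}

# Hypothetical helper: evaluate one binary rule as a 0/1 vector.
# direction = 1 gives (x > cut); direction = -1 gives (x < cut).
binary_rule <- function(xj, cut, direction) {
  if (direction == 1) as.numeric(xj > cut) else as.numeric(xj < cut)
}

set.seed(1)
xj <- rnorm(200)
cuts <- candidate_cuts(xj, mincut = 0.1)

range(cuts)                          # cuts stay away from the extremes of xj
table(binary_rule(xj, cuts[1], 1))   # rule (x > cut) at the smallest allowed cut
```

With mincut=0.1 the smallest admissible cut leaves about 10 percent of the observations below it, so no rule isolates only a handful of points.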

Value

logistic.main returns maxsc, which is the score test statistic achieved in the fitted model, and res, which is a list with components
jmaa
number of predictors
cutp
split points for the binary rules
maxdir
direction of split: 1 represents "(x>cut)" and -1 represents "(x<cut)"
maxsc
observed score test statistic for the main effect

Details

logistic.main sequentially estimates a sequence of adaptive index models with up to "nsteps" terms for binary outcomes. The appropriate number of binary rules can be selected via K-fold cross-validation (cv.logistic.main).
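
As a rough illustration of what an index with a fixed set of binary rules looks like, the sketch below computes the index as the number of rules each observation satisfies and then regresses the response on it with base R's glm. The helper compute_index and its arguments vars, cuts and dirs are hypothetical names chosen for this example; they are not components returned by logistic.main, and the sketch does not reproduce the package's search over split points.

```r
# Hypothetical helper: index = number of binary rules satisfied by each row.
# vars, cuts and dirs are illustrative names, not objects from the package.
compute_index <- function(x, vars, cuts, dirs) {
  rules <- sapply(seq_along(vars), function(k) {
    if (dirs[k] == 1) x[, vars[k]] > cuts[k] else x[, vars[k]] < cuts[k]
  })
  rowSums(rules)
}

set.seed(1)
n <- 200
p <- 10
x <- matrix(rnorm(n * p), n, p)

# Two rules matching the simulated example on this page:
# (x1 < 0.2) and (x5 > 0.2)
idx <- compute_index(x, vars = c(1, 5), cuts = c(0.2, 0.2), dirs = c(-1, 1))

# Simulate a binary response whose log-odds equal the index
y <- rbinom(n, 1, 1 / (1 + exp(-idx)))

# Logistic regression of the response on the index
fit <- glm(y ~ idx, family = binomial)
coef(fit)
```

Each additional rule can raise the index by at most one, which is why the number of rules (bounded by nsteps) acts as the model-size parameter to be chosen by cross-validation.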

References

Lu Tian and Robert Tibshirani (2010) "Adaptive index models for marker-based risk stratification", Tech Report, available at http://www-stat.stanford.edu/~tibs/AIM.

Examples

## generate data
set.seed(1)
n=200
p=10

x=matrix(rnorm(n*p), n, p)
z=(x[,1]<0.2)+(x[,5]>0.2)
beta=1
prb=1/(1+exp(-beta*z))
y=rbinom(n,1,prb)


## fit logistic main effects AIM
a=logistic.main(x, y, nsteps=10)
 
## examine the model sequence 
print(a)


## compute the index based on the 2nd model of the sequence using data x 
z.prd=index.prediction(a$res[[2]],x)

## compute the index based on the 2nd model of the sequence using new data xx, and compare the result with the true index
nn=10
xx=matrix(rnorm(nn*p), nn, p)
zz=(xx[,1]<0.2)+(xx[,5]>0.2)
zz.prd=index.prediction(a$res[[2]],xx) 
cbind(zz, zz.prd)
