logistic.main: Main effect logistic adaptive index model

Description

Estimate an adaptive index model for binary outcomes in the context of logistic regression. The resulting index characterizes the main covariate effect on the response probability.

Usage

logistic.main(x, y, nsteps=8, mincut=.1, backfit=F, maxnumcut=1, dirp=0, weight=1)

Arguments

x

n by p matrix. The covariate matrix

y

n 0/1 vector. The binary response variable

nsteps

the maximum number of binary rules to be included in the index

backfit

T/F. Whether the existing split points are adjusted after including a new binary rule

mincut

The minimum cutting proportion for the binary rule at either end. It typically is between 0 and 0.2.

maxnumcut

The maximum number of binary splits per predictor

dirp

p vector. The given direction of the binary split for each of the p predictors. 0 represents "no pre-given direction"; 1 represents "(x>cut)"; -1 represents "(x<cut)".

weight

a positive number. The weight given to responses. "weight=0" means that all observations are equally weighted.
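
To make the role of mincut concrete, the base R sketch below lists the split points of a single predictor that leave at least a proportion mincut of the observations on each side. The helper candidate_cuts is hypothetical and is not part of the package; the search performed inside logistic.main may differ in detail.

```r
## Hypothetical helper for illustration only (not part of the package).
## Keeps the observed values of one predictor that leave at least a
## proportion `mincut` of the observations on either side of the cut.
candidate_cuts <- function(xj, mincut = 0.1) {
  xs <- sort(unique(xj))
  ## proportion of observations at or below each candidate value
  prop_below <- sapply(xs, function(cut) mean(xj <= cut))
  xs[prop_below >= mincut & prop_below <= 1 - mincut]
}

set.seed(1)
xj <- rnorm(200)
cuts <- candidate_cuts(xj, mincut = 0.1)

## proportion of observations at or below each retained cut
props <- sapply(cuts, function(cut) mean(xj <= cut))
range(props)
```

A larger mincut narrows this range and rules out splits that isolate only a handful of observations at either end.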

Value

jmaa: number of predictors
cutp: split points for the binary rules
maxdir: direction of split: 1 represents "(x>cut)" and -1 represents "(x<cut)"
maxsc: observed score test statistics for the main effect
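
As a rough illustration of how these components define an index, the sketch below sums the binary rules for each row of a covariate matrix. It assumes that jmaa holds the column numbers of the selected predictors, and compute_index is a hypothetical helper written for this example; in practice index.prediction should be used.

```r
## Hypothetical helper for illustration only; prefer index.prediction().
## Assumes `jmaa` holds column numbers, `cutp` the split points and
## `maxdir` the directions (1 for x > cut, -1 for x < cut).
compute_index <- function(x, jmaa, cutp, maxdir) {
  rules <- sapply(seq_along(jmaa), function(k) {
    if (maxdir[k] == 1) x[, jmaa[k]] > cutp[k] else x[, jmaa[k]] < cutp[k]
  })
  ## one column per rule; the index is the number of rules satisfied
  rowSums(matrix(rules, nrow = nrow(x)))
}

## three observations, two predictors
x.small <- rbind(c(-1.0, 1),
                 c( 0.5, 1),
                 c(-1.0, 0))

## rules: (x1 < 0.2) and (x2 > 0.2)
idx <- compute_index(x.small, jmaa = c(1, 2), cutp = c(0.2, 0.2), maxdir = c(-1, 1))
idx
```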

Details

logistic.main sequentially estimates a sequence of adaptive index models with up to "nsteps" terms for binary outcomes. The appropriate number of binary rules can be selected via K-fold cross-validation (cv.logistic.main).
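
For intuition about the score test statistic reported for each model in the sequence, the sketch below computes a standard score statistic for the slope of a logistic regression of y on a candidate index z, evaluated at a slope of zero. This is a textbook form and an assumption about the general approach; the statistic computed inside logistic.main (for example with weights) may differ. The function score_stat is hypothetical.

```r
## Hypothetical helper for illustration only (not part of the package).
## Standardized score statistic for the slope in logit P(y = 1) = a + b * z,
## evaluated under the null hypothesis b = 0.
score_stat <- function(z, y) {
  p0 <- mean(y)                                 # null fit: intercept only
  u  <- sum((z - mean(z)) * (y - p0))           # score for the slope at b = 0
  v  <- p0 * (1 - p0) * sum((z - mean(z))^2)    # null variance of the score
  u / sqrt(v)
}

set.seed(1)
n <- 200
x.sim <- matrix(rnorm(n * 10), n, 10)
z.sim <- (x.sim[, 1] < 0.2) + (x.sim[, 5] > 0.2)
y.sim <- rbinom(n, 1, 1 / (1 + exp(-z.sim)))

## statistic for the two-rule index and for a single rule
score_stat(z.sim, y.sim)
score_stat(as.numeric(x.sim[, 1] < 0.2), y.sim)
```

A larger absolute value indicates a stronger association between the candidate index and the response, which is the sense in which the statistic can be used to compare candidate rules.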

References

Lu Tian and Robert Tibshirani (2010) "Adaptive index models for marker-based risk stratification", Tech Report, available at http://www-stat.stanford.edu/~tibs/AIM.

Examples


## load the AIM package, which provides logistic.main and index.prediction
library(AIM)

## generate data
set.seed(1)
n=200
p=10

x=matrix(rnorm(n*p), n, p)
z=(x[,1]<0.2)+(x[,5]>0.2)
beta=1
prb=1/(1+exp(-beta*z))
y=rbinom(n,1,prb)


## fit logistic main effects AIM
a=logistic.main(x, y, nsteps=10)
 
## examine the model sequence 
print(a)


## compute the index based on the 2nd model of the sequence using data x 
z.prd=index.prediction(a$res[[2]],x)

## compute the index based on the 2nd model of the sequence using new data xx, and compare the result with the true index
nn=10
xx=matrix(rnorm(nn*p), nn, p)
zz=(xx[,1]<0.2)+(xx[,5]>0.2)
zz.prd=index.prediction(a$res[[2]],xx) 
cbind(zz, zz.prd)
