tune.ahazpen: Choice of penalty parameter in ahazpen

Description

Tunes the penalty parameter of the penalized semiparametric additive hazards model, either by cross-validation or by non-stochastic procedures akin to BIC for likelihood-based models.

Usage

tune.ahazpen(surv, X, weights, standardize=TRUE, penalty=lasso.control(),
             tune=cv.control(), dfmax=nvars, lambda, ...)

Arguments

surv

Response in the form of a survival object, as returned by the function Surv() in the package survival. Right-censored data and counting process format (left truncation) are supported. Tied survival times are not supported.

X

Design matrix. Missing values are not supported.

weights

Optional vector of observation weights. Default is 1 for each observation.

standardize

Logical flag for variable standardization, prior to model fitting. Parameter estimates are always returned on the original scale. Default is standardize=TRUE.

penalty

A description of the penalty function to be used for model fitting. This can be a character string naming a penalty function (currently "lasso" or stepwise SCAD, "sscad") or it can be a call to the penalty function. Default is penalty=lasso.control(). See ahazpen.pen.control for the available penalty functions and advanced options; see also the examples.

dfmax

Upper limit on the number of covariates included in the model. Default is nvars=nobs-1. Unless a complete regularization path is needed, it is highly recommended to start with a relatively small value of dfmax to reduce computation time and memory usage.

lambda

An optional user-supplied sequence of penalty parameters. Typical usage is to have the program compute its own lambda sequence based on nlambda and lambda.min.

tune

A description of the tuning method to be used. This can be a character string naming a tuning control function (currently "cv" or "bic") or a call to the tuning control function. Default is 5-fold cross-validation, tune=cv.control(). See ahaz.tune.control for more options; see also the examples.

...

Additional arguments to be passed to ahazpen, see ahazpen for options.

Value

An object with S3 class "tune.ahazpen".

call

The call that produced this object.

lambda

The actual sequence of lambda values used.

tunem

The tuning score for each value of lambda (mean cross-validated error, if tune=cv.control()).

tunesd

Estimate of the cross-validated standard error, if tune=cv.control().

tunelo

Lower curve = tunem-tunesd, if tune=cv.control().

tuneup

Upper curve = tunem+tunesd, if tune=cv.control().

lambda.min

Value of lambda for which tunem is minimized.

df

Number of non-zero coefficients at each value of lambda.

tune

The tuning method used, an object of S3 class "ahaz.tune.control".

penalty

The penalty used, an object of S3 class "ahazpen.pen.control".

foldsused

Folds actually used, if tune=cv.control().
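
The relationship between the cross-validation components listed above can be sketched in base R. This is only an illustration with made-up numbers, not the package's internal code: the matrix scores is hypothetical, and computing tunesd as the standard error of the fold mean is an assumption.

```r
# Hypothetical per-fold tuning scores: rows = folds, columns = lambda values.
lambda <- c(0.50, 0.25, 0.10, 0.05)
scores <- rbind(
  c(1.00, 0.80, 0.70, 0.90),
  c(1.10, 0.70, 0.60, 0.80),
  c(0.90, 0.90, 0.80, 1.00),
  c(1.00, 0.80, 0.70, 0.90),
  c(1.00, 0.80, 0.70, 0.90)
)

# Mean score for each lambda (the role played by tunem).
tunem <- colMeans(scores)

# Standard error across folds (ASSUMED definition of tunesd).
tunesd <- apply(scores, 2, sd) / sqrt(nrow(scores))

# Lower and upper curves around the mean score.
tunelo <- tunem - tunesd
tuneup <- tunem + tunesd

# The lambda value at which the mean score is smallest.
lambda.min <- lambda[which.min(tunem)]
```

With these made-up scores the third column has the smallest mean, so lambda.min is the third entry of lambda.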

Details

The function first performs a penalized fit, using the penalty supplied in penalty, to obtain a sequence of penalty parameters. It then selects the optimal penalty parameter from this sequence according to the tuning control function given in tune; see ahaz.tune.control.
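
The two steps described above can be illustrated with a toy stand-in in base R. This sketch deliberately swaps the penalized additive hazards loss for ridge-penalized least squares, uses a hand-picked lambda grid rather than one derived from an initial fit, and introduces a hypothetical helper ridge(); it is meant only to show how a penalty parameter is picked from a sequence by cross-validation, not how tune.ahazpen is implemented.

```r
# Toy stand-in: ridge-penalized least squares replaces the penalized
# additive hazards loss. Only the selection step is illustrated.
set.seed(1)
n <- 60
p <- 5
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% c(2, -1, 0, 0, 0)) + rnorm(n)

# Step 1: a decreasing sequence of candidate penalty parameters
# (hypothetical grid, chosen by hand for this sketch).
lambda <- 10^seq(2, -2, length.out = 20)

# Hypothetical helper: closed-form ridge estimate for one penalty value.
ridge <- function(X, y, lam) {
  solve(crossprod(X) + lam * diag(ncol(X)), crossprod(X, y))
}

# Step 2: 5-fold cross-validation of the prediction error for each lambda.
nfolds <- 5
foldid <- sample(rep(seq_len(nfolds), length.out = n))
scores <- sapply(lambda, function(lam) {
  sapply(seq_len(nfolds), function(k) {
    hold <- foldid == k
    beta <- ridge(X[!hold, , drop = FALSE], y[!hold], lam)
    mean((y[hold] - X[hold, , drop = FALSE] %*% beta)^2)
  })
})

# Rows of scores are folds, columns are lambda values.
tunem <- colMeans(scores)
lambda.min <- lambda[which.min(tunem)]
```

A non-stochastic criterion would replace Step 2 with a single score per lambda computed on the full data, leaving the final selection line unchanged.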

References

Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.

Examples


# NOT RUN {
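# Load the package (this also makes Surv() and survfit() from survival available)
library(survival)
library(ahaz)
data(sorlie)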

# Break ties
set.seed(10101)
time <- sorlie$time+runif(nrow(sorlie))*1e-2

# Survival data + covariates
surv <- Surv(time,sorlie$status)
X <- as.matrix(sorlie[,3:ncol(sorlie)])

# Training/test data
set.seed(20202)
train <- sample(1:nrow(sorlie),76)
test <- setdiff(1:nrow(sorlie),train)

# Run cross validation on training data
set.seed(10101)
cv.las <- tune.ahazpen(surv[train,], X[train,],dfmax=30)
plot(cv.las)

# Check fit on the test data
testrisk <- predict(cv.las,X[test,],type="lp")
plot(survfit(surv[test,]~I(testrisk<median(testrisk))),main="Low versus high risk")

# Advanced example, cross-validation of one-step SCAD
# with initial solution derived from univariate models.
# Since init.sol is specified as a function, it is
# automatically cross-validated as well
scadfun<-function(surv,X,weights){coef(ahaz(surv,X,univariate=TRUE))}
set.seed(10101)
cv.ssc<-tune.ahazpen(surv[train,],X[train,],
                     penalty=sscad.control(init.sol=scadfun),
                     tune=cv.control(rep=5),dfmax=30)
# Check fit on test data
testrisk <- predict(cv.ssc,X[test,],type="lp")
plot(survfit(surv[test,]~I(testrisk<median(testrisk))),main="Low versus high risk")
# }
