ahaz.tune.control: Tuning controls for regularization

Description

Define the type of tuning method used for regularization. Currently only used by tune.ahazpen.

Usage

# Cross-validation
cv.control(nfolds=5, reps=1, foldid=NULL, trace=FALSE)
# BIC-inspired
bic.control(factor = function(nobs){log(nobs)})

Arguments

nfolds

Number of folds for cross-validation. Default is nfolds=5. Each fold must have size > 1, i.e. nfolds must be less than half the sample size.

reps

Number of repetitions of cross-validation with nfolds folds. Default is reps=1. A value of reps larger than 1 can be useful to reduce the variance of the cross-validation scores.

foldid

An optional vector of values between 1 and nfolds identifying the fold to which each observation belongs. Supersedes nfolds and reps if supplied.

trace

Print progress of cross-validation. Default is trace=FALSE.

factor

A function of the number of observations that defines how strongly the degrees of freedom penalize the score in a BIC-type criterion; see Details.
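As an illustration of the arguments above, the following base R sketch builds a balanced foldid vector and two candidate factor functions. The sample size n, the seed, and the names bic.like and aic.like are assumptions made for this example; they are not part of the package.

```r
# Hypothetical sample size and number of folds (illustrative values only)
n <- 23
nfolds <- 5

set.seed(1)
# Assign each observation to one of nfolds folds of near-equal size,
# then shuffle so that fold membership is random
foldid <- sample(rep(seq_len(nfolds), length.out = n))
table(foldid)

# Every fold must contain more than one observation
stopifnot(all(table(foldid) > 1))

# A factor function maps the number of observations to a penalty weight
bic.like <- function(nobs) log(nobs)  # same form as the documented default
aic.like <- function(nobs) 2          # hypothetical, lighter alternative
```

Objects like these would then be supplied as cv.control(foldid = foldid) or bic.control(factor = aic.like).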

Value

An object with S3 class "ahaz.tune.control".

type

Type of tuning method.

factor

Function specified by factor, if applicable.

getfolds

A function specifying how folds are calculated, if applicable.

rep

How many repetitions of cross-validation, if applicable.

trace

Whether to print progress, if applicable.

Details

For examples of usage, see tune.ahazpen.

The regression coefficients of the semiparametric additive hazards model are estimated by solving a linear system of estimating equations of the form $D\beta=d$ with respect to $\beta$. The natural loss function for such a linear system is of the least-squares type $$L(\beta)=\beta' D \beta -2d'\beta.$$ This loss function is used for cross-validation as described by Martinussen & Scheike (2008).
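The link between the estimating equations and the loss can be checked numerically. The base R sketch below uses a made-up symmetric positive definite D and vector d (both purely illustrative, as are the names loss and beta.hat) and verifies that the solution of $D\beta=d$ makes the gradient $2D\beta-2d$ of $L$ vanish.

```r
# Hypothetical 2 x 2 example: D symmetric positive definite, d arbitrary
D <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)
d <- c(1, 0.3)

# Least-squares-type loss L(beta) = beta' D beta - 2 d' beta
loss <- function(beta, D, d) {
  drop(crossprod(beta, D %*% beta) - 2 * crossprod(d, beta))
}

# Solution of the estimating equations D beta = d
beta.hat <- solve(D, d)

# The gradient 2 D beta - 2 d is zero at beta.hat, so beta.hat minimizes L
grad <- 2 * (D %*% beta.hat - d)

loss(beta.hat, D, d)
loss(beta.hat + c(0.1, -0.1), D, d)  # a perturbed beta gives a larger loss
```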

Penalty parameter selection via a BIC-inspired approach was described by Gorst-Rasmussen & Scheike (2011). With $df$ denoting the degrees of freedom and $n$ the number of observations, we consider a BIC-inspired criterion of the form $$BIC = \kappa L(\beta) + df\cdot factor(n)$$ where $\kappa$ is a scaling constant included to remove dependency on the time scale and to better mimic the behavior of a 'real' (likelihood) BIC. The default factor=function(nobs){log(nobs)} has desirable theoretical properties but may be conservative in practice.
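To make the role of factor concrete, here is a minimal base R sketch of the criterion. The loss values, the value of $\kappa$, the use of the number of nonzero coefficients as $df$, and the helper name bic.score are all assumptions chosen for illustration; the package computes these quantities internally.

```r
# BIC-type score: kappa * L(beta) + df * factor(n)
bic.score <- function(loss, df, nobs, kappa = 1,
                      factor = function(nobs) log(nobs)) {
  kappa * loss + df * factor(nobs)
}

# Hypothetical path of fits: the loss decreases as the model grows
loss.path <- c(-0.50, -4.00, -6.00, -6.50, -6.60)
df.path   <- 0:4   # assumed here: df = number of nonzero coefficients
nobs      <- 100

# Default log(n) factor versus a lighter, constant factor
scores.log <- bic.score(loss.path, df.path, nobs, kappa = 2)
scores.two <- bic.score(loss.path, df.path, nobs, kappa = 2,
                        factor = function(nobs) 2)

df.path[which.min(scores.log)]  # model size chosen with log(n)
df.path[which.min(scores.two)]  # model size chosen with the constant factor
```

In this toy path the log(n) factor selects a smaller model than the constant factor, which is the sense in which the default may be conservative.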

References

Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.