BSGW (version 0.9.2)

bsgw.crossval: Convenience functions for cross-validation-based selection of shrinkage parameter in the bsgw model.

Description

bsgw.crossval calculates the cross-validation-based, out-of-sample log-likelihood of a bsgw model for a data set, given the supplied folds. bsgw.crossval.wrapper applies bsgw.crossval to a set of combinations of the shrinkage parameters (lambda, lambdas) and returns the resulting vector of log-likelihood values together with the combination of shrinkage parameters that attains the maximum log-likelihood. bsgw.generate.folds generates random partitions, while bsgw.generate.folds.eventbalanced generates random partitions with events evenly distributed across partitions. The latter is useful for cross-validation of small data sets with low event rates, since it prevents events from accumulating in one or two partitions while other partitions have none at all.
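The idea behind event-balanced partitioning can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name balanced_folds_sketch and the simulated status vector are hypothetical, and the approach shown (assigning fold labels separately within events and non-events) is only one plausible way to spread events evenly.

```r
# Hypothetical helper, NOT the BSGW implementation: assign fold labels
# separately within events and non-events so that each fold receives a
# near-equal share of the events.
balanced_folds_sketch <- function(status, nfold = 5) {
  folds <- integer(length(status))
  for (s in unique(status)) {
    idx <- which(status == s)
    # Cycle labels 1..nfold over the stratum, then shuffle them
    labels <- rep_len(seq_len(nfold), length(idx))
    folds[idx] <- labels[sample.int(length(idx))]
  }
  folds
}

set.seed(1)
status <- rbinom(100, size = 1, prob = 0.15)  # simulated low event rate
folds <- balanced_folds_sketch(status, nfold = 5)

# Number of events that landed in each fold
events_per_fold <- tapply(status, folds, sum)
print(events_per_fold)
```

With this construction the event counts of any two folds differ by at most one, which is the property that matters when events are scarce.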

Usage

bsgw.generate.folds(ntot, nfold=5)
bsgw.generate.folds.eventbalanced(formula, data, nfold=5)
bsgw.crossval(data, folds, all=FALSE, print.level=1
  , control=bsgw.control(), ncores=1, ...)
bsgw.crossval.wrapper(data, folds, all=FALSE, print.level=1
  , control=bsgw.control(), ncores=1
  , lambda.vec=exp(seq(from=log(0.01), to=log(100), length.out = 10)), lambdas.vec=NULL
  , lambda2=if (is.null(lambdas.vec)) cbind(lambda=lambda.vec, lambdas=lambda.vec)
      else as.matrix(expand.grid(lambda=lambda.vec, lambdas=lambdas.vec))
  , plot=TRUE, ...)

Arguments

ntot
Number of observations to create partitions for. It should typically be set to nrow(data).
nfold
Number of folds or partitions to generate.
formula
Survival formula, used to extract the binary status field from the data. Right-hand side of the formula is ignored, so a formula of the form Surv(time,status)~1 is sufficient.
data
Data frame used in model training and prediction.
folds
An integer vector of length nrow(data), defining fold/partition membership of each observation. For example, in 5-fold cross-validation for a data set of 200 observations, folds must be a 200-long vector with elements from the set {1,2,3,4,5}. Convenience functions bsgw.generate.folds and bsgw.generate.folds.eventbalanced can be used to generate the folds vector for a given survival data frame.
all
If TRUE, the estimation objects from each cross-validation task are collected and returned for diagnostic purposes.
print.level
Verbosity of progress report.
control
List of control parameters, usually the output of bsgw.control.
ncores
Number of cores for parallel execution of cross-validation code.
lambda.vec
Vector of shrinkage parameters to be tested for scale-parameter coefficients.
lambdas.vec
Vector of shrinkage parameters to be tested for shape-parameter coefficients.
lambda2
A two-column matrix or data frame (columns lambda and lambdas) that enumerates all combinations of lambda and lambdas to be tested. By default, it is constructed by forming all combinations of lambda.vec and lambdas.vec. If lambdas.vec=NULL, only combinations in which the two parameters are equal are tried.
plot
If TRUE, and if the lambda and lambdas entries in lambda2 are identical, a plot of loglike as a function of either vector is produced.
...
Other arguments to be passed to bsgw.
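To make the default of lambda2 concrete, the two branches of the expression shown under Usage can be evaluated on their own in base R. The lambdas.vec values below are illustrative assumptions, not package defaults.

```r
# Default lambda.vec from the Usage section: 10 values, log-spaced
# between 0.01 and 100
lambda.vec <- exp(seq(from = log(0.01), to = log(100), length.out = 10))

# Branch taken when lambdas.vec is NULL: only equal (lambda, lambdas) pairs
grid_equal <- cbind(lambda = lambda.vec, lambdas = lambda.vec)

# Branch taken when lambdas.vec is supplied: every combination of the two
lambdas.vec <- c(0.1, 1, 10)  # illustrative values only
grid_full <- as.matrix(expand.grid(lambda = lambda.vec, lambdas = lambdas.vec))

print(dim(grid_equal))  # one row per element of lambda.vec
print(dim(grid_full))   # length(lambda.vec) * length(lambdas.vec) rows
```

The full grid grows multiplicatively, so each extra value in lambdas.vec adds length(lambda.vec) more cross-validation runs.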

Value

Functions bsgw.generate.folds and bsgw.generate.folds.eventbalanced produce integer vectors of length ntot and nrow(data), respectively. The output of these functions can be passed directly to bsgw.crossval or bsgw.crossval.wrapper. Function bsgw.crossval returns the log-likelihood of data under the assumed bsgw model, calculated using a cross-validation scheme with the supplied folds argument. If all=TRUE, the estimation objects for each of the nfold estimation jobs are returned as the "estobjs" attribute of the returned value. Function bsgw.crossval.wrapper returns a list with elements lambda and lambdas, the optimal shrinkage parameters for scale and shape coefficients, respectively. Additionally, the following attributes are attached:

Examples

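library("BSGW")
library("survival")
data(ovarian)
# Event-balanced folds; only the status field of the formula is used
folds <- bsgw.generate.folds.eventbalanced(Surv(futime, fustat) ~ 1, ovarian, 5)
# Cross-validated, out-of-sample log-likelihood for one model specification
cv <- bsgw.crossval(ovarian, folds, formula=Surv(futime, fustat) ~ ecog.ps + rx
  , control=bsgw.control(iter=50, nskip=10), print.level = 3)
# Search over three shrinkage values between 0.1 and 1 (lambda equal to lambdas)
cv2 <- bsgw.crossval.wrapper(ovarian, folds, formula=Surv(futime, fustat) ~ ecog.ps + rx
  , control=bsgw.control(iter=50, nskip=10)
  , print.level=3, lambda.vec=exp(seq(from=log(0.1), to=log(1), length.out = 3)))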
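
The quantity that bsgw.crossval reports, an out-of-sample log-likelihood accumulated over folds, can be sketched without the package. The example below deliberately swaps in a much simpler stand-in model (an exponential survival model with right censoring, fitted by maximum likelihood) in place of bsgw. The function cv_loglik_sketch and the simulated data are hypothetical and only illustrate the fold-wise train/evaluate loop.

```r
# Stand-in model, NOT bsgw: exponential survival with right censoring.
# Its maximum-likelihood rate is (number of events) / (total follow-up time).
cv_loglik_sketch <- function(obs_time, status, folds) {
  total <- 0
  for (k in sort(unique(folds))) {
    train <- folds != k
    test <- folds == k
    # Fit on all folds except k
    rate <- sum(status[train]) / sum(obs_time[train])
    # Censored exponential log-likelihood evaluated on the held-out fold k
    total <- total + sum(status[test] * log(rate) - rate * obs_time[test])
  }
  total
}

set.seed(2)
n <- 200
event_time <- rexp(n, rate = 0.1)    # simulated event times
censor_time <- rexp(n, rate = 0.05)  # simulated censoring times
obs_time <- pmin(event_time, censor_time)
status <- as.integer(event_time <= censor_time)

# Random 5-fold membership vector of length n
folds <- sample(rep_len(1:5, n))

cv_ll <- cv_loglik_sketch(obs_time, status, folds)
print(cv_ll)
```

A wrapper in the spirit of bsgw.crossval.wrapper would repeat this calculation for each candidate shrinkage setting and keep the one with the largest value.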
