Usage
cv.CoxBoost(time, status, x, subset=1:length(time), maxstepno=100, K=10,
            type=c("verweij","naive"), parallel=FALSE, upload.x=TRUE,
            multicore=FALSE, folds=NULL, trace=FALSE, ...)
Arguments
time
vector of length n specifying the observed times.
status
censoring indicator, i.e., vector of length n with entries 0 for censored observations and 1 for uncensored observations. If this vector contains elements not equal to 0 or 1, these are taken to indicate events from a competing risk and a model for the subdistribution hazard with respect to event 1 is fitted (see e.g. Fine and Gray, 1999).
x
n * p matrix of covariates.
subset
a vector specifying a subset of observations to be used in the fitting process.
maxstepno
maximum number of boosting steps to evaluate, i.e., the returned "optimal" number of boosting steps will be in the range [0,maxstepno].
K
number of folds to be used for cross-validation. If K is larger than or equal to the number of non-zero elements in status, leave-one-out cross-validation is performed.
type
way of calculating the partial likelihood contribution of the observations in the hold-out folds: "verweij" uses the more appropriate method described in Verweij and van Houwelingen (1996); "naive" uses the approach where the observations that are not in the hold-out folds are ignored (often found in other R packages).
parallel
logical value indicating whether computations in the cross-validation folds should be performed in parallel on a compute cluster, using package snowfall. The initialization function of this package, sfInit, should be called before calling cv.CoxBoost.
multicore
indicates whether computations in the cross-validation folds should be performed in parallel, using package parallel. If TRUE, package parallel is employed using the default number of cores. A value larger than 1 is taken to be the number of cores that should be employed.
upload.x
logical value indicating whether x should/has to be uploaded to the compute cluster for parallel computation. Uploading this only once (using sfExport(x) from library snowfall) can save much time for large data sets.
folds
if not NULL, this has to be a list of length K, each element being a vector of indices of fold elements. Useful for employing the same folds for repeated runs.
trace
logical value indicating whether progress in estimation should be indicated by printing the number of the cross-validation fold and the index of the covariate updated.
...
miscellaneous parameters for the calls to CoxBoost.
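The arguments above can be combined as in the following minimal sketch. The simulated data, the variable names (obs.time, fold.id, cv.res) and the values chosen for maxstepno, K and penalty are illustrative assumptions, not recommendations. penalty is assumed to be passed on to CoxBoost through the ... argument, and optimal.step is assumed to be a component of the returned list, as described in the package documentation. The folds are built by hand as a list of length K so that the same partition can be reused in repeated runs.

```r
# Simulated survival data (illustrative only).
set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)                  # n * p covariate matrix
beta <- c(rep(1, 3), rep(0, p - 3))              # three informative covariates
event.time <- rexp(n, rate = exp(drop(x %*% beta)))
cens.time <- rexp(n, rate = 0.5)
status <- as.numeric(event.time <= cens.time)    # 1 = event, 0 = censored
obs.time <- pmin(event.time, cens.time)

# Hand-made folds: a list of length K, each element a vector of
# observation indices, so the same partition can be reused later.
K <- 5
fold.id <- sample(rep(seq_len(K), length.out = n))
folds <- unname(split(seq_len(n), fold.id))

# The cross-validation call only runs if the CoxBoost package is installed.
if (requireNamespace("CoxBoost", quietly = TRUE)) {
  cv.res <- CoxBoost::cv.CoxBoost(time = obs.time, status = status, x = x,
                                  maxstepno = 50, K = K, type = "verweij",
                                  folds = folds,
                                  penalty = 100)  # assumed to reach CoxBoost via ...
  print(cv.res$optimal.step)                      # value in the range [0, maxstepno]
}
```

Passing the same folds list to a second call, for example with type="naive", would make the two cross-validation runs directly comparable because they use an identical partition.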