Performs support vector analysis for data sets with survival outcomes. Three approaches are available in the package. The regression approach takes censoring into account when formulating the inequality constraints of the support vector problem. In the ranking approach, the inequality constraints set the objective to maximize the concordance index for comparable pairs of observations. The hybrid approach combines the regression and ranking constraints in the same model.
survivalsvm(
  formula = NULL,
  data = NULL,
  subset = NULL,
  type = "regression",
  diff.meth = NULL,
  gamma.mu = NULL,
  opt.meth = "quadprog",
  kernel = "lin_kernel",
  kernel.pars = NULL,
  time.variable.name = NULL,
  status.variable.name = NULL,
  sgf.sv = 5,
  sigf = 7,
  maxiter = 20,
  margin = 0.05,
  bound = 10,
  eig.tol = 1e-06,
  conv.tol = 1e-07,
  posd.tol = 1e-08
)
survivalsvm
Object of class survivalsvm, with elements:
call | command calling this program
typeofsurvivalsvm | type of survival support vector machines approach
model.fit | the fitted survival model
var.names | names of variables used
formula
[formula(1)] Object of class formula. See formula for more details.
data
[data.frame(1)] Object of class data.frame containing the data points that will be used to fit the model.
subset
[vector(1)] An index vector specifying the cases to be used in the training sample.
type
[character(1)] String indicating which type of survival support vector model is desired. This must be one of the following strings: 'regression', 'vanbelle1', 'vanbelle2' or 'hybrid'.
diff.meth
[character(1)] String indicating which of 'makediff1', 'makediff2' or 'makediff3' is used in case of 'vanbelle1', 'vanbelle2' and 'hybrid'.
gamma.mu
[numeric(1)|vector(1)] Regularization parameters. A vector with two parameters is required for the hybrid approach. A single value is required for regression, vanbelle1 or vanbelle2.
opt.meth
[character(1)] Program used to solve the quadratic optimization problem. Either "quadprog" or "ipop".
kernel
[Kernel(1)] Kernel used to fit the model: linear kernel ('lin_kernel'), additive kernel ('add_kernel'), radial basis kernel ('rbf_kernel') or polynomial kernel ('poly_kernel').
kernel.pars
[vector(1)] Parameters of the kernel, when required.
time.variable.name
[character(1)] Name of the survival time variable in data, when data is given as an argument.
status.variable.name
[character(1)] Name of the status variable in data.
sgf.sv
[numeric(1)] Number of decimal digits in the solution of the quadratic optimization problem.
sigf
[numeric(1)] Used by ipop. See ipop for details.
maxiter
[integer(1)] Used by ipop. See ipop for details.
margin
[numeric(1)] Used by ipop. See ipop for details.
bound
[numeric(1)] Used by ipop. See ipop for details.
eig.tol
[numeric(1)] Used by nearPD for adjusting positive definiteness. See nearPD for details.
conv.tol
[numeric(1)] Used by nearPD for adjusting positive definiteness. See nearPD for details.
posd.tol
[numeric(1)] Used by nearPD for adjusting positive definiteness. See nearPD for details.
Cesaire J. K. Fouodo
The following denotations are used for the models implemented:
'regression' refers to the regression approach, named SVCR in Van Belle et al. (2011b),
'vanbelle1' refers to the first version of survival support vector machines based on ranking constraints, named RANKSVMC by Van Belle et al. (2011b),
'vanbelle2' refers to the second version of survival support vector machines based on ranking constraints, presented as model1 by Van Belle et al. (2011b), and
'hybrid' combines the regression and ranking constraints in the same model. The hybrid model is labeled model2 by Van Belle et al. (2011b).
The argument 'type' of the function survivalsvm sets the type of model to be fitted.
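As a minimal sketch of selecting a model through type, the call below fits the hybrid model. It assumes the survival and survivalsvm packages are installed and uses the veteran data set from the examples. The value 0.1 for the regularization parameters is an arbitrary illustration, not a recommended setting.

```r
library(survival)     # provides Surv() and the veteran data set
library(survivalsvm)

# The hybrid model combines regression and ranking constraints, so it
# takes two regularization parameters in gamma.mu and also needs a
# diff.meth for the differences between comparable pairs.
fit.hybrid <- survivalsvm(Surv(time, status) ~ ., data = veteran,
                          type = "hybrid", diff.meth = "makediff3",
                          gamma.mu = c(0.1, 0.1))
print(fit.hybrid)
```

Switching to 'regression', 'vanbelle1' or 'vanbelle2' only requires changing type and passing a single value in gamma.mu.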
For the models vanbelle1, vanbelle2 and hybrid, differences between comparable pairs of observations are required. Each observation is compared with its nearest neighbor according to the survival time, and three comparison approaches, makediff1, makediff2 and makediff3, are offered to compute the differences between comparable neighbors.
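The idea of comparing each observation with a neighbor in survival time can be sketched in base R. This is an illustration only and is not the package's makediff1, makediff2 or makediff3 code: it assumes one particular rule, pairing each observation with the closest earlier observation whose event was observed, and the toy vectors time and status are made up for the example.

```r
# Toy data (hypothetical): status 1 = event observed, 0 = censored
time   <- c(5, 8, 12, 15, 20)
status <- c(1, 0, 1, 1, 0)

# Sort by survival time so that "earlier" means "smaller index"
ord    <- order(time)
time   <- time[ord]
status <- status[ord]

# For each observation i, take the nearest earlier observation j with
# an observed event; a pair is only comparable if the earlier time is
# an event, so censored predecessors are skipped.
pairs <- do.call(rbind, lapply(seq_along(time)[-1], function(i) {
  earlier <- which(status[seq_len(i - 1)] == 1)
  if (length(earlier) == 0) return(NULL)   # no comparable neighbor
  j <- max(earlier)
  data.frame(i = i, j = j, diff = time[i] - time[j])
}))
pairs
```

In the ranking constraints, differences of this kind are what the fitted model is asked to reproduce in the right order.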
The current version of survivalsvm uses the solvers ipop and quadprog to solve the dual optimization problems derived from the support vector formulations of the models presented above. Note that quadprog requires the kernel matrix to be symmetric and positive definite. When these conditions are not met, the kernel matrix needs to be slightly perturbed to obtain the nearest positive definite kernel matrix. The alternative to quadprog is ipop, which can also handle a non-negative definite kernel matrix, although more time may be required to solve the dual quadratic optimization problem. The argument opt.meth is used to select the solver.
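The perturbation step can be illustrated in isolation. The sketch below assumes that nearPD refers to nearPD from the Matrix package, whose eig.tol, conv.tol and posd.tol arguments match the names in the usage above. The matrix K is a made-up example, not a kernel matrix computed by survivalsvm.

```r
library(Matrix)  # provides nearPD()

# A symmetric matrix that is not positive definite (hypothetical example)
K <- matrix(c(1.0, 0.9, 0.2,
              0.9, 1.0, 0.9,
              0.2, 0.9, 1.0), nrow = 3, byrow = TRUE)
min(eigen(K, symmetric = TRUE)$values)     # smallest eigenvalue is negative

# Nearest positive definite matrix, with the same tolerance values that
# appear as defaults in the survivalsvm() usage
K.pd <- as.matrix(nearPD(K, eig.tol = 1e-06, conv.tol = 1e-07,
                         posd.tol = 1e-08)$mat)
min(eigen(K.pd, symmetric = TRUE)$values)  # no longer negative, up to numerical tolerance
max(abs(K.pd - K))                         # size of the perturbation
```

A matrix adjusted in this way satisfies the requirement that quadprog places on the kernel matrix, at the cost of a small change to its entries.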
The survivalsvm command can be called with a formula, in which the survival time and the status are grouped into a two-column matrix using the command Surv from the package survival. An alternative is to pass the data frame of training data points using the argument data, and to give the name of the survival time variable and the name of the status variable, as illustrated in the third example below.
Van Belle, V., Pelckmans, K., Van Huffel, S. and Suykens, J. A. K. (2011a). Improved performance on high-dimensional survival data by application of Survival-SVM. Bioinformatics (Oxford, England) 27, 87-94.
Van Belle, V., Pelckmans, K., Van Huffel, S. and Suykens, J. A. K. (2011b). Support vector methods for survival analysis: a comparison between ranking and regression approaches. Artificial Intelligence in Medicine 53, 107-118.
predict.survivalsvm
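library(survival)     # provides Surv() and the veteran data set
library(survivalsvm)

# Regression approach with the default solver and kernel
survivalsvm(Surv(time, status) ~ ., veteran, gamma.mu = 0.1)

# Regression approach with the ipop solver and an additive kernel
survsvm.reg <- survivalsvm(formula = Surv(diagtime, status) ~ ., data = veteran,
                           type = "regression", gamma.mu = 0.1,
                           opt.meth = "ipop", kernel = "add_kernel")

# Ranking approach, naming the time and status variables instead of using a formula
survsvm.vb2 <- survivalsvm(data = veteran, time.variable.name = "diagtime",
                           status.variable.name = "status",
                           type = "vanbelle2", gamma.mu = 0.1,
                           opt.meth = "quadprog", diff.meth = "makediff3",
                           kernel = "lin_kernel",
                           sgf.sv = 5, sigf = 7, maxiter = 20,
                           margin = 0.05, bound = 10)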