This function fits a modification of MI-SVM to ordinal outcome data based on the research method proposed by Kent and Yu.
# S3 method for default
omisvm(
x,
y,
bags,
cost = 1,
h = 1,
s = Inf,
method = c("qp-heuristic"),
weights = TRUE,
control = list(kernel = "linear", sigma = if (is.vector(x)) 1 else 1/ncol(x),
max_step = 500, type = "C-classification", scale = TRUE, verbose = FALSE,
time_limit = 60),
...
)
# S3 method for formula
omisvm(formula, data, ...)
# S3 method for mi_df
omisvm(x, ...)
An object of class omisvm. The object contains at least the
following components:
*_fit: A fit object depending on the method parameter. If method = 'qp-heuristic' this will be gurobi_fit from a model optimization.
call_type: A character indicating which method omisvm() was called
with.
features: The names of features used in training.
levels: The levels of y that are recorded for future prediction.
cost: The cost parameter from function inputs.
weights: The calculated weights on the cost parameter.
repr_inst: The instances from positive bags that are selected to be
most representative of the positive instances.
n_step: If method == 'qp-heuristic', the total steps used in the
heuristic algorithm.
x_scale: If scale = TRUE, the scaling parameters for new predictions.
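The role of x_scale can be illustrated with base R alone. The sketch below is not taken from the package: it assumes the stored scaling parameters amount to a per-column center and scale, and the list fields `center` and `scale` are hypothetical names chosen for the illustration.

```r
# Training features: two columns on very different scales.
x_train <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2)

# scale() centers and rescales each column and records what it used.
x_scaled <- scale(x_train)

# Hypothetical container mimicking what a fitted object might keep.
x_scale <- list(
  center = attr(x_scaled, "scaled:center"),
  scale  = attr(x_scaled, "scaled:scale")
)

# New data must be transformed with the training parameters,
# not with its own mean and standard deviation.
x_new <- matrix(c(2.5, 25), ncol = 2)
x_new_scaled <- scale(x_new, center = x_scale$center, scale = x_scale$scale)
```

Here x_new sits exactly at the training column means, so both rescaled values are zero. Reusing the training parameters is what keeps new observations on the same footing as the data the model was fit to.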
x: A data.frame, matrix, or similar object of covariates, where each
row represents an instance. If a mi_df object is passed, y and bags are
automatically extracted, and all other columns will be used as predictors.
y: A numeric, character, or factor vector of bag labels for each
instance. Must satisfy length(y) == nrow(x). It is suggested that one of the
levels is 1, '1', or TRUE, which becomes the positive class; otherwise, a
positive class is chosen and a message will be supplied.
bags: A vector specifying which instance belongs to each bag. Can be a string, numeric, or factor.
cost: The cost parameter in SVM. If method = 'heuristic', this will
be fed to kernlab::ksvm(); otherwise it is used similarly in internal
functions.
h: A scalar that controls the trade-off between maximizing the margin and minimizing distance between hyperplanes.
s: An integer for how many replication points to add to the dataset. If
k represents the number of labels in y, s must satisfy 1 <= s <= k-1. The
default, Inf, uses the maximum number of replication points, k-1.
method: The algorithm to use in fitting (default 'qp-heuristic', per the
usage above). When method = 'heuristic', an algorithm similar to Andrews et
al. (2003) is employed. When method = 'mip', the novel MIP method will be
used. When method = 'qp-heuristic', the heuristic algorithm is computed
using the dual SVM. See details.
weights: A named vector, or TRUE, to control the weight of the cost
parameter for each possible y value. Weights multiply against the cost
vector. If TRUE, weights are calculated based on inverse counts of
instances with a given label, where only one positive instance is counted
per bag. Otherwise, names must match the levels of y.
control: A list of additional parameters passed to the method that control computation, with the following components:
kernel: Either a character that describes the kernel ('linear' or
'radial') or a kernel matrix at the instance level.
sigma: Argument needed for the radial basis kernel.
nystrom_args: A list of parameters to pass to kfm_nystrom(). This is
used when method = 'mip' and kernel = 'radial' to generate a Nystrom
approximation of the kernel features.
max_step: Argument used when method = 'heuristic'. Maximum steps of
iteration for the heuristic algorithm.
type: Argument used when method = 'heuristic'. The type argument is
passed to e1071::svm().
scale: Argument used for all methods. A logical for whether to rescale
the input before fitting.
verbose: Argument used when method = 'mip'. Whether to message output
to the console.
time_limit: Argument used when method = 'mip'. FALSE, or a time
limit (in seconds) passed to gurobi() parameters. If FALSE, no time
limit is given.
start: Argument used when method = 'mip'. If TRUE, the MIP program
will be warm-started with the solution from method = 'qp-heuristic' to
potentially improve speed.
...: Arguments passed to or from other methods.
formula: A formula with specification mi(y, bags) ~ x, which uses the
mi function to create the bag-instance structure. This argument is an
alternative to the x, y, bags arguments, but requires the data
argument. See examples.
data: If formula is provided, a data.frame or similar from which
formula elements will be extracted.
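Two of the defaults described above follow simple rules that can be written out in base R. The sketch below only restates those rules: s = Inf resolves to k-1, and sigma defaults to 1 for a vector and 1/ncol(x) otherwise. The helper names `resolve_s()` and `default_sigma()` are hypothetical and are not functions from the package.

```r
# Hypothetical helper: turn the user-supplied `s` into the number of
# replication points, where k is the number of distinct labels in y.
resolve_s <- function(s, y) {
  k <- length(unique(y))
  max_s <- k - 1
  if (is.infinite(s) && s > 0) {
    return(max_s)                 # Inf means "use the maximum, k - 1"
  }
  stopifnot(s >= 1, s <= max_s)   # documented constraint 1 <= s <= k - 1
  as.integer(s)
}

# Hypothetical helper: the default sigma expression from the usage section.
default_sigma <- function(x) {
  if (is.vector(x)) 1 else 1 / ncol(x)
}

y <- c(1, 2, 3, 4, 4, 2)              # four ordinal levels, so k = 4
x <- matrix(seq_len(30), ncol = 5)    # five features

s_default <- resolve_s(Inf, y)        # k - 1 = 3
s_custom  <- resolve_s(2, y)          # 2 lies within [1, k - 1]
sigma_mat <- default_sigma(x)         # 1 / 5
sigma_vec <- default_sigma(c(1, 2))   # plain vector, so 1
```

Passing an s outside the range 1 to k-1 stops with an error in this sketch; how the package itself reports an invalid value is not specified here.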
omisvm(default): Method for data.frame-like objects
omisvm(formula): Method for passing formula
omisvm(mi_df): Method for mi_df objects, automatically handling bag
names, labels, and all covariates.
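As a rough illustration of the formula method, the sketch below builds a mi(y, bags) ~ x formula programmatically and passes it with the data argument. It is untested against the package: it assumes the package is mildsvm, that gurobi is installed, and that columns 3 to 7 of ordmvnorm are the features (as in the example for the default method). The helper `mi_formula()` is a hypothetical name introduced for this illustration.

```r
# Hypothetical helper: assemble a formula of the form
# mi(<label>, <bag>) ~ f1 + f2 + ... from column names.
mi_formula <- function(label, bag, features) {
  rhs <- paste(features, collapse = " + ")
  as.formula(paste0("mi(", label, ", ", bag, ") ~ ", rhs))
}

# Fit only when both packages are available (assumption: the package
# providing omisvm() and mi() is mildsvm, with gurobi as the solver).
if (requireNamespace("mildsvm", quietly = TRUE) &&
    requireNamespace("gurobi", quietly = TRUE)) {
  library(mildsvm)
  data("ordmvnorm")

  # Same feature columns as the default-method example.
  feature_names <- colnames(ordmvnorm)[3:7]
  f <- mi_formula("bag_label", "bag_name", feature_names)

  mdl2 <- omisvm(f, data = ordmvnorm, weights = NULL)
}
```

Building the formula from column names avoids hard-coding feature names that may differ between data sets. The formula can equally be typed out by hand.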
Sean Kent
Currently, the only method available is a heuristic algorithm in linear SVM space. Additional methods should be available shortly.
predict.omisvm() for prediction on new data.
if (require(gurobi)) {
data("ordmvnorm")
x <- ordmvnorm[, 3:7]
y <- ordmvnorm$bag_label
bags <- ordmvnorm$bag_name
mdl1 <- omisvm(x, y, bags, weights = NULL)
predict(mdl1, x, new_bags = bags)
}