Usage
messinaSurv(x, y, obj_min, obj_func, min_group_frac = 0.1, f_train = 0.8,
n_boot = 50, seed = NULL, parallel = NULL, silent = FALSE)
Arguments
x
feature expression values, either supplied as an
ExpressionSet, or as an object that can be converted to a
matrix by as.matrix. In the latter case, features should
be in rows and samples in columns, with feature names
taken from the rows of the object.
y
a Surv object containing survival times and
censoring status for each sample in x.
obj_min
the minimum acceptable value of the
objective metric. The metric used is specified by the
parameter obj_func.
obj_func
the metric function that measures the
difference in survival between patients with feature
values above, and below, the threshold. Valid values are
"tau", "reltau", or "coxcoef"; see details for more
information.
min_group_frac
the size of the smallest sample
group that is allowed to be generated by thresholding, as
a fraction of the total sample. The default value of 0.1
means that no thresholds will be selected that result in
a sample split yielding a group smaller than 10% of the samples. A modest value of this parameter increases
the stability of the "reltau" and "coxcoef" objectives,
which tend to become unstable as the number of samples in
a group becomes very low; see details.
f_train
the fraction of samples to be used in the
training splits of the bootstrap rounds.
n_boot
the number of bootstrap rounds to use.
seed
an optional random seed for the analysis. If
NULL, the R PRNG is used as-is.
parallel
should calculations be parallelized using
the doMC framework? If NULL, parallel mode is used if
the doMC library is loaded, and more than one core has
been registered with registerDoMC(). Note that no
progress bar is displayed in parallel mode.
silent
be completely silent (except for error and
warning messages)?
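As a rough illustration of how the arguments above fit together, the following sketch builds a small simulated data set in the layout described for x (features in rows, samples in columns) and passes it to messinaSurv. The data, the dimensions, and the choice of obj_min = 0.6 with obj_func = "tau" are assumptions made for the example only, not recommended settings. The call is guarded so that it only runs if the messina and survival packages are installed.

```r
# Simulated data only; all values here are illustrative assumptions.
set.seed(1)
n_samples  <- 40
n_features <- 5

# Features in rows, samples in columns; feature names come from the row names.
x <- matrix(rnorm(n_features * n_samples), nrow = n_features,
            dimnames = list(paste0("feature", seq_len(n_features)),
                            paste0("sample", seq_len(n_samples))))

time  <- rexp(n_samples, rate = 0.1)   # simulated follow-up times
event <- rbinom(n_samples, 1, 0.7)     # 1 = event observed, 0 = censored

if (requireNamespace("messina", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  # One survival time and censoring status per sample (column) of x.
  y <- survival::Surv(time, event)

  # obj_min = 0.6 is an arbitrary illustrative threshold for the "tau" metric.
  fit <- messina::messinaSurv(x, y, obj_min = 0.6, obj_func = "tau",
                              min_group_frac = 0.1, f_train = 0.8,
                              n_boot = 50, seed = 1, parallel = FALSE,
                              silent = TRUE)
}
```

Because the simulated features are pure noise, the fit is not expected to find any useful thresholds; the sketch is meant to show only the shape of the inputs and the call.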
Minimum group fraction
The parameter min_group_frac limits the size of the
smallest subgroups that messinaSurv can select. As the
groups become smaller, the "reltau" and "coxcoef"
objective functions become unstable, and can generate
spurious results. In the diagnostics produced by the
messina plot functions, these appear as very high
objective values at very low and very high threshold values.
To control these results, set min_group_frac to a high
enough value that the objective functions reliably fit.
Generally, max(0.1, 10/N), where N is the total number of
patients, is sufficient. Keep in mind that setting this
parameter too high will limit messinaSurv's ability to
identify small subsets of patients with dramatically
different survival from the rest: the smallest subset
that will be reliably identified is min_group_frac of
patients.
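The rule of thumb max(0.1, 10/N) can be worked through for a few cohort sizes. The helper below is hypothetical (it is not part of messina) and simply evaluates that expression so the result can be passed as min_group_frac.

```r
# Hypothetical helper, not part of messina: evaluates max(0.1, 10/N).
suggest_min_group_frac <- function(n_patients) {
  stopifnot(is.numeric(n_patients), length(n_patients) == 1, n_patients > 0)
  max(0.1, 10 / n_patients)
}

suggest_min_group_frac(50)    # 10/50 = 0.2 exceeds 0.1, so the result is 0.2
suggest_min_group_frac(200)   # 10/200 = 0.05 is below 0.1, so the result is 0.1
```

For cohorts of fewer than 100 patients the 10/N term dominates, so the smallest permitted group holds at least about 10 patients; for larger cohorts the fraction stays at 0.1.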