- xdata
data matrix with observations as rows and variables as columns.
For multi-block stability selection, the variables in xdata must be
ordered by group.
- pk
optional vector encoding the grouping structure. Only used for
multi-block stability selection where pk indicates the number of
variables in each group. If pk=NULL, single-block stability
selection is performed.
- Lambda
matrix of parameters controlling the level of sparsity in the
underlying feature selection algorithm specified in implementation.
If Lambda=NULL and implementation=PenalisedGraphical,
LambdaGridGraphical is used to define a relevant grid.
Lambda can be provided as a vector or a matrix with
length(pk) columns.
- lambda_other_blocks
optional vector of parameters controlling the
level of sparsity in neighbour blocks for the multi-block procedure. To
jointly use a specific set of parameters for each block,
lambda_other_blocks must be set to NULL (not recommended).
Only used for multi-block stability selection, i.e. if length(pk)>1.
- pi_list
vector of thresholds in selection proportions. If
n_cat=NULL or n_cat=2, these values must be >0 and
<1. If n_cat=3, these values must be >0.5 and
<1.
- K
number of resampling iterations.
- tau
subsample size, expressed as a proportion of the observations. Only
used if resampling="subsampling" and cpss=FALSE.
- seed
value of the seed to initialise the random number generator and
ensure reproducibility of the results (see set.seed).
- n_cat
computation options for the stability score. Default is
NULL to use the score based on a z test. Other possible values are 2
or 3 to use the score based on the negative log-likelihood.
- implementation
function to use for graphical modelling. If
implementation=PenalisedGraphical, the algorithm implemented in
glassoFast is used for regularised estimation of
a conditional independence graph. Alternatively, a user-defined function
can be provided.
- start
character string indicating if the algorithm should be
initialised at the estimated (inverse) covariance with previous penalty
parameters (start="warm") or not (start="cold"). Using
start="warm" can speed up the computations, but could lead to
convergence issues (in particular with small Lambda_cardinal). Only
used for implementation=PenalisedGraphical (see argument
"start" in glassoFast).
- scale
logical indicating if the correlation (scale=TRUE) or
covariance (scale=FALSE) matrix should be used as input of
glassoFast if
implementation=PenalisedGraphical. Otherwise, this argument must be
used in the function provided in implementation.
- resampling
resampling approach. Possible values are:
"subsampling" for sampling without replacement of a proportion
tau of the observations, or "bootstrap" for sampling with
replacement generating a resampled dataset with as many observations as in
the full sample. Alternatively, this argument can be a function to use for
resampling. This function must use arguments named data and
tau and return the IDs of observations to be included in the
resampled dataset.
- cpss
logical indicating if complementary pair stability selection
should be done. For this, the algorithm is applied on two non-overlapping
subsets of half of the observations. A feature is considered as selected if
it is selected for both subsamples. With this method, the data is split
K/2 times (K models are fitted). Only used if
PFER_method="MB".
- PFER_method
method used to compute the upper bound of the expected
number of False Positives (or Per Family Error Rate, PFER). If
PFER_method="MB", the method proposed by Meinshausen and Bühlmann
(2010) is used. If PFER_method="SS", the method proposed by Shah and
Samworth (2013) under the assumption of unimodality is used.
- PFER_thr
threshold in PFER for constrained calibration by error
control. If PFER_thr=Inf and FDP_thr=Inf, unconstrained
calibration is used (the default).
- FDP_thr
threshold in the expected proportion of falsely selected
features (or False Discovery Proportion) for constrained calibration by
error control. If PFER_thr=Inf and FDP_thr=Inf, unconstrained
calibration is used (the default).
- Lambda_cardinal
number of values in the grid of parameters controlling
the level of sparsity in the underlying algorithm. Only used if
Lambda=NULL.
- lambda_max
optional maximum value for the grid in penalty parameters.
If lambda_max=NULL, the maximum value is set to the maximum
covariance in absolute value. Only used if
implementation=PenalisedGraphical and Lambda=NULL.
- lambda_path_factor
multiplicative factor used to define the minimum
value in the grid.
- max_density
threshold on the density. The grid is defined such that
the density of the estimated graph does not exceed max_density.
- optimisation
character string indicating the type of optimisation
method. With optimisation="grid_search" (the default), all values in
Lambda are visited. Alternatively, optimisation algorithms
implemented in nloptr can be used with
optimisation="nloptr". By default, we use
"algorithm"="NLOPT_GN_DIRECT_L", "xtol_abs"=0.1,
"ftol_abs"=0.1 and "maxeval"=Lambda_cardinal. These values
can be changed by providing the argument opts (see
nloptr). For stability selection using penalised
regression, optimisation="grid_search" may be faster as it allows
for warm start.
- n_cores
number of cores to use for parallel computing (see argument
workers in multisession). Using
n_cores>1 is only supported with optimisation="grid_search".
- output_data
logical indicating if the input dataset xdata should be
included in the output.
- verbose
logical indicating if a loading bar and messages should be
printed.
- beep
sound indicating the end of the run. Possible values are:
NULL (no sound) or an integer between 1 and 11 (see argument
sound in beep).
- ...
additional parameters passed to the functions provided in
implementation or resampling.
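
The user-defined option described for the resampling argument can be sketched as below. This is only a minimal illustration, assuming that data receives the matrix of observations (rows are observations) and that tau is a proportion between 0 and 1. The function name SubsampleRows is hypothetical and is not part of the package.

```r
# Hypothetical user-defined resampling function (the name is illustrative).
# As required for the resampling argument, it uses arguments named
# `data` and `tau` and returns the IDs of the observations to keep.
SubsampleRows <- function(data, tau) {
  n <- nrow(data)
  # Sample a proportion tau of the observations without replacement
  sort(sample(n, size = floor(tau * n), replace = FALSE))
}

# Simulated data: 100 observations (rows) and 5 variables (columns)
set.seed(1)
xdata <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)

# IDs of the observations included in one resampled dataset
ids <- SubsampleRows(data = xdata, tau = 0.5)
length(ids) # 50
```

A function of this form would be supplied as resampling=SubsampleRows. Any extra arguments it needs could be passed through the ... argument.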