- xdata
data matrix with observations as rows and variables as columns.
For multi-block stability selection, the columns of xdata must be
ordered by group.
- pk
optional vector encoding the grouping structure. Only used for
multi-block stability selection, where pk indicates the number of
variables in each group. If pk=NULL, single-block stability
selection is performed.
- Lambda
matrix of parameters controlling the level of sparsity in the
underlying feature selection algorithm specified in implementation.
If Lambda=NULL and implementation=PenalisedGraphical,
LambdaGridGraphical is used to define a relevant grid.
Lambda can be provided as a vector or a matrix with length(pk) columns.
- lambda_other_blocks
optional vector of parameters controlling the
level of sparsity in neighbour blocks for the multi-block procedure. To
jointly use a specific set of parameters for each block,
lambda_other_blocks must be set to NULL (not recommended).
Only used for multi-block stability selection, i.e. if length(pk)>1.
- pi_list
vector of thresholds in selection proportions. If
n_cat=NULL or n_cat=2, these values must be >0 and <1.
If n_cat=3, these values must be >0.5 and <1.
- K
number of resampling iterations.
- tau
subsample size. Only used if resampling="subsampling" and cpss=FALSE.
- seed
value of the seed to initialise the random number generator and
ensure reproducibility of the results (see set.seed).
- n_cat
computation options for the stability score. The default, NULL,
uses the score based on a z test. Other possible values are 2
or 3, which use the score based on the negative log-likelihood.
- implementation
function to use for graphical modelling. If
implementation=PenalisedGraphical, the algorithm implemented in
glassoFast is used for regularised estimation of
a conditional independence graph. Alternatively, a user-defined function
can be provided.
- start
character string indicating if the algorithm should be
initialised at the estimated (inverse) covariance with previous penalty
parameters (start="warm") or not (start="cold"). Using
start="warm" can speed up the computations, but could lead to
convergence issues (in particular with small Lambda_cardinal). Only
used for implementation=PenalisedGraphical (see argument
"start" in glassoFast).
- scale
logical indicating if the correlation (scale=TRUE) or
covariance (scale=FALSE) matrix should be used as input of
glassoFast if implementation=PenalisedGraphical. Otherwise, this
argument must be used in the function provided in implementation.
- resampling
resampling approach. Possible values are:
"subsampling" for sampling without replacement of a proportion
tau of the observations, or "bootstrap" for sampling with
replacement generating a resampled dataset with as many observations as in
the full sample. Alternatively, this argument can be a function to use for
resampling. This function must use arguments named data and
tau and return the IDs of observations to be included in the
resampled dataset.
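A user-defined resampling function could look like the sketch below. It assumes that "IDs of observations" means row indices of the data matrix; the function name ResampleRows and the simulated data are hypothetical and only serve as illustration.

```r
# Hypothetical custom resampling function.
# It takes arguments named data and tau, as required, and returns the
# row indices (assumed to be the observation IDs) of the resampled dataset.
ResampleRows <- function(data, tau) {
  n <- nrow(data)
  size <- max(1, floor(tau * n)) # number of observations to keep
  sort(sample.int(n, size = size, replace = FALSE))
}

# Simulated data for illustration only
set.seed(1)
xdata <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)

# Keep half of the observations, sampled without replacement
ids <- ResampleRows(data = xdata, tau = 0.5)
length(ids)
```

Such a function would then be supplied through the resampling argument, with any extra arguments it needs passed via the dots.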
- cpss
logical indicating if complementary pair stability selection
should be done. For this, the algorithm is applied to two non-overlapping
subsets of half of the observations. A feature is considered as selected if
it is selected for both subsamples. With this method, the data is split
K/2 times (K models are fitted). Only used if PFER_method="MB".
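The complementary pair scheme can be sketched in base R as follows. This is an illustration of the idea described above, not the package's internal code; the helper names ComplementaryPair and SelectedInBoth are hypothetical.

```r
# Hypothetical helper: split n observations into two disjoint halves.
# With odd n, one observation is left out of both halves.
ComplementaryPair <- function(n) {
  half <- floor(n / 2)
  shuffled <- sample.int(n)
  list(
    first = sort(shuffled[seq_len(half)]),
    second = sort(shuffled[half + seq_len(half)])
  )
}

# A feature counts as selected for the pair only if both halves select it.
SelectedInBoth <- function(selected_first, selected_second) {
  selected_first & selected_second
}

set.seed(1)
pair <- ComplementaryPair(n = 101)

# With K = 100, the data is split K / 2 = 50 times and two models are
# fitted per split, giving K fitted models in total.
K <- 100
n_splits <- K / 2
n_models <- 2 * n_splits
```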
- PFER_method
method used to compute the upper bound of the expected
number of False Positives (or Per Family Error Rate, PFER). If
PFER_method="MB", the method proposed by Meinshausen and Bühlmann
(2010) is used. If PFER_method="SS", the method proposed by Shah and
Samworth (2013) under the assumption of unimodality is used.
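For reference, the Meinshausen and Bühlmann (2010) bound is PFER <= q^2 / ((2 * pi - 1) * N), where q is the average number of selected features, pi is the threshold in selection proportion (above 0.5) and N is the total number of candidate features. The sketch below evaluates it in base R; the function name pfer_mb_bound is hypothetical, and the package's own computation may handle edge cases differently.

```r
# Hypothetical helper evaluating the MB upper bound on the PFER:
#   PFER <= q^2 / ((2 * pi_thr - 1) * N)
# q: average number of selected features across resampling iterations
# pi_thr: threshold in selection proportion, must be in (0.5, 1]
# N: total number of candidate features; for a graph with p nodes,
#    the candidates are the p * (p - 1) / 2 possible edges
pfer_mb_bound <- function(q, pi_thr, N) {
  stopifnot(pi_thr > 0.5, pi_thr <= 1, N > 0)
  q^2 / ((2 * pi_thr - 1) * N)
}

# Example: 10 edges selected on average among 100 candidates
bound <- pfer_mb_bound(q = 10, pi_thr = 0.9, N = 100)
bound
```

The bound tightens as pi_thr increases and loosens as q grows, which is why a PFER threshold restricts the admissible combinations of penalty and selection threshold.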
- PFER_thr
threshold in PFER for constrained calibration by error
control. If PFER_thr=Inf and FDP_thr=Inf, unconstrained
calibration is used (the default).
- FDP_thr
threshold in the expected proportion of falsely selected
features (or False Discovery Proportion) for constrained calibration by
error control. If PFER_thr=Inf and FDP_thr=Inf, unconstrained
calibration is used (the default).
- Lambda_cardinal
number of values in the grid of parameters controlling
the level of sparsity in the underlying algorithm. Only used if
Lambda=NULL.
- lambda_max
optional maximum value for the grid in penalty parameters.
If lambda_max=NULL, the maximum value is set to the maximum
covariance in absolute value. Only used if
implementation=PenalisedGraphical and Lambda=NULL.
- lambda_path_factor
multiplicative factor used to define the minimum
value in the grid.
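The way lambda_max, lambda_path_factor and Lambda_cardinal could combine into a grid is sketched below. Several points are assumptions rather than documented behaviour: the maximum is taken over off-diagonal entries, the grid is log-spaced, and the density constraint is ignored. The function name MakeLambdaGrid and the values used are purely illustrative.

```r
# Hypothetical helper building a decreasing grid of penalty parameters.
# Assumptions (not taken from the package): off-diagonal maximum,
# log-spaced values, no constraint on the graph density.
MakeLambdaGrid <- function(xdata, Lambda_cardinal, lambda_path_factor,
                           scale = TRUE) {
  # Correlation (scale = TRUE) or covariance (scale = FALSE) matrix
  S <- if (scale) stats::cor(xdata) else stats::cov(xdata)
  # Largest absolute off-diagonal entry defines the top of the grid
  lambda_max <- max(abs(S[upper.tri(S)]))
  # The multiplicative factor defines the bottom of the grid
  lambda_min <- lambda_max * lambda_path_factor
  exp(seq(log(lambda_max), log(lambda_min), length.out = Lambda_cardinal))
}

# Simulated data for illustration only
set.seed(1)
xdata <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
grid <- MakeLambdaGrid(xdata, Lambda_cardinal = 10, lambda_path_factor = 0.01)
```

In this sketch the smallest value is lambda_path_factor times the largest, so a smaller factor extends the grid towards denser graphs.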
- max_density
threshold on the density. The grid is defined such that
the density of the estimated graph does not exceed max_density.
- optimisation
character string indicating the type of optimisation
method. With optimisation="grid_search" (the default), all values in
Lambda are visited. Alternatively, optimisation algorithms
implemented in nloptr can be used with optimisation="nloptr".
By default, we use "algorithm"="NLOPT_GN_DIRECT_L", "xtol_abs"=0.1,
"ftol_abs"=0.1 and "maxeval"=Lambda_cardinal. These values
can be changed by providing the argument opts (see nloptr).
For stability selection using penalised regression,
optimisation="grid_search" may be faster as it allows for warm start.
- n_cores
number of cores to use for parallel computing (see argument
workers in multisession). Using n_cores>1 is only supported with
optimisation="grid_search".
- output_data
logical indicating if the input datasets xdata and
ydata should be included in the output.
- verbose
logical indicating if a loading bar and messages should be
printed.
- beep
sound indicating the end of the run. Possible values are:
NULL (no sound) or an integer between 1 and 11 (see argument
sound in beep).
- ...
additional parameters passed to the functions provided in
implementation or resampling.