Computes the stability score from the selection proportions of models fitted with a given sparsity-controlling parameter, for different thresholds in selection proportions. The score measures how unlikely it is that the selection procedure is uniform (i.e. uninformative) for a given combination of parameters.
StabilityScore(
selprop,
pi_list = seq(0.6, 0.9, by = 0.01),
K,
n_cat = 3,
group = NULL
)
Value: A vector of stability scores obtained with the different thresholds in selection proportions.

Arguments:

selprop: array of selection proportions.

pi_list: vector of thresholds in selection proportions. If n_cat=NULL or n_cat=2, these values must be >0 and <1. If n_cat=3, these values must be >0.5 and <1.

K: number of resampling iterations.

n_cat: computation options for the stability score. Default is NULL to use the score based on a z test. Other possible values are 2 or 3 to use the score based on the negative log-likelihood.

group: vector encoding the grouping structure among predictors. This argument indicates the number of variables in each group and only needs to be provided for group (but not sparse group) penalisation.
The stability score is derived from the likelihood under the assumption of uniform (uninformative) selection.
We classify the features into three categories: the stably selected ones (that have selection proportions \(\ge \pi\)), the stably excluded ones (selection proportion \(\le 1-\pi\)), and the unstable ones (selection proportions between \(1-\pi\) and \(\pi\)).
Under the hypothesis of equiprobability of selection (instability), the likelihood of observing stably selected, stably excluded and unstable features can be expressed as:
\(L_{\lambda, \pi} = \prod_{j=1}^N [ ( 1 - F( K \pi - 1 ) )^{1_{H_{\lambda} (j) \ge K \pi}} \times ( F( K \pi - 1 ) - F( K ( 1 - \pi ) ) )^{1_{ (1-\pi) K < H_{\lambda} (j) < K \pi }} \times F( K ( 1 - \pi ) )^{1_{ H_{\lambda} (j) \le K (1-\pi) }} ]\)
where \(H_{\lambda} (j)\) is the selection count of feature \(j\) and \(F(x)\) is the cumulative probability function of the binomial distribution with parameters \(K\) and the average proportion of selected features over resampling iterations.
The stability score is computed as the minus log-transformed likelihood under the assumption of equiprobability of selection:
\(S_{\lambda, \pi} = - \log(L_{\lambda, \pi})\)
The stability score increases with stability.
Alternatively, the stability score can be computed by considering only two sets of features: stably selected (selection proportions \(\ge \pi\)) or not (selection proportions \(< \pi\)). This can be done using n_cat=2.
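As an illustrative sketch only (not the sharp package implementation), the three-category score can be computed directly from the likelihood formula above; the function name ManualStabilityScore and its arguments are hypothetical:

```r
# Illustrative sketch of the three-category score (n_cat = 3), following
# the likelihood formula above. Not the sharp package implementation.
ManualStabilityScore <- function(selprop, pi, K) {
  q <- mean(selprop) # average proportion of selected features over iterations
  counts <- round(selprop * K) # selection counts H_lambda(j)
  # F(x): binomial CDF with parameters K and q
  F_hi <- pbinom(K * pi - 1, size = K, prob = q)
  F_lo <- pbinom(K * (1 - pi), size = K, prob = q)
  # Log-likelihood under uniform (uninformative) selection:
  # stably selected, unstable and stably excluded features contribute
  # log(1 - F_hi), log(F_hi - F_lo) and log(F_lo), respectively
  loglik <- sum(
    ifelse(counts >= K * pi, log(1 - F_hi),
      ifelse(counts <= K * (1 - pi), log(F_lo),
        log(F_hi - F_lo)
      )
    )
  )
  -loglik # stability score S = -log(L)
}

set.seed(1)
selprop <- round(runif(n = 20), digits = 2)
ManualStabilityScore(selprop, pi = 0.7, K = 100)
```

Since the log-likelihood is non-positive, the score is non-negative and increases as the observed selection proportions become less compatible with uniform selection.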
Other stability metric functions: ConsensusScore(), FDP(), PFER(), StabilityMetrics()
# Simulating set of selection proportions
set.seed(1)
selprop <- round(runif(n = 20), digits = 2)
# Computing stability scores for different thresholds
score <- StabilityScore(selprop, pi_list = c(0.6, 0.7, 0.8), K = 100)