coglasso network
select_coglasso() selects the best combination of hyperparameters given to
coglasso() according to the selected model selection method. The three
available options that can be set for the argument method are "xstars",
"xestars" and "ebic".
select_coglasso(
  coglasso_obj,
  method = "xestars",
  stars_thresh = 0.1,
  stars_subsample_ratio = NULL,
  rep_num = 20,
  max_iter = 10,
  old_sampling = FALSE,
  ebic_gamma = 0.5,
  verbose = TRUE
)
select_coglasso() returns an object of S3 class select_coglasso
containing the results of the
selection procedure, built upon an object of S3 class coglasso. Some
output elements depend on the chosen model selection method.
These elements are returned by all methods:
... are the same elements returned by coglasso().
sel_index_c, sel_index_lw and sel_index_lb are the indexes of the
final selected parameters \(c\), \(\lambda_w\) and \(\lambda_b\)
leading to the most stable sparse network.
sel_c, sel_lambda_w and sel_lambda_b are the final selected
parameters \(c\), \(\lambda_w\) and \(\lambda_b\) leading to the most
stable sparse network.
sel_adj is the adjacency matrix of the final selected network.
sel_density is the density of the final selected network.
sel_icov is the inverse covariance matrix of the final selected network.
sel_cov is optional, returned only when coglasso() was called with
cov_output = TRUE. It is the covariance matrix associated with the final
selected network.
call is the matched call.
method is the chosen model selection method.
These are the additional elements returned when choosing "xestars" or "xstars":
merge is the "merged" adjacency matrix, the average of all the adjacency
matrices estimated across all the different subsamples for the selected
combination of \(\lambda_w\), \(\lambda_b\), and \(c\) values in the
last path explored before convergence. Each entry is a measure of how
recurrent the corresponding edge is across the subsamples.
variability_lw, variability_lb and variability_c are numeric vectors
of as many items as the number of \(\lambda_w\), \(\lambda_b\), and
\(c\) values explored. Each item is the variability of the network
estimated for the corresponding hyperparameter value, keeping the other two
hyperparameters fixed to their selected value.
sel_variability is the variability of the final selected network.
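The relationship between merge and a variability score can be illustrated with a small base R sketch. The averaging step follows the description of merge above. The variability formula is an assumption taken from the original StARS definition (edge instability 2θ(1 − θ), averaged over all possible edges); the scaling used inside the package may differ. The adjacency matrices are simulated and purely hypothetical.

```r
set.seed(1)
p <- 5         # number of variables (hypothetical)
rep_num <- 20  # number of subsamples

# Hypothetical symmetric 0/1 adjacency matrices, one per subsample
adj_list <- lapply(seq_len(rep_num), function(i) {
  a <- matrix(0, p, p)
  a[upper.tri(a)] <- rbinom(p * (p - 1) / 2, size = 1, prob = 0.3)
  a + t(a)
})

# "Merged" matrix: entry-wise average, i.e. how often each edge recurs
merged <- Reduce(`+`, adj_list) / rep_num

# Assumed StARS-style variability: mean of 2 * theta * (1 - theta)
# over the p * (p - 1) / 2 possible edges
theta <- merged[upper.tri(merged)]
variability <- sum(2 * theta * (1 - theta)) / choose(p, 2)
```

Under this definition an edge that appears in every subsample, or in none, contributes zero, so the score is low for networks that are reproduced consistently across subsamples.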
These are the additional elements returned when choosing "ebic":
ebic_scores is a numerical vector containing the eBIC scores for all the
hyperparameter combinations.
coglasso_obj: The object of S3 class coglasso returned by coglasso().
method: The model selection method to select the best combination of hyperparameters. The available options are "xstars", "xestars" and "ebic". Defaults to "xestars".
stars_thresh: The threshold set for the variability of the explored networks at each iteration of the algorithm. The \(\lambda_w\) or the \(\lambda_b\) associated with the most stable network before the threshold is exceeded is selected. Defaults to 0.1.
stars_subsample_ratio: The proportion of samples in the multi-omics data set to be randomly subsampled to estimate the variability of the network under the given hyperparameter setting. Defaults to 80% when the number of samples \(n\) is smaller than 144, otherwise it defaults to \(\frac{10}{n}\sqrt{n}\).
rep_num: The number of subsamples of the multi-omics data set used to estimate the variability of the network under the given hyperparameter setting. Defaults to 20.
max_iter: The greatest number of times the algorithm is allowed to choose a new best \(\lambda_w\). Defaults to 10.
old_sampling: If set to TRUE, perform the same subsampling xstars() would.
Makes a difference with bigger data sets, where computing
a correlation matrix could take significantly longer. Defaults to FALSE.
ebic_gamma: The \(\gamma\) tuning parameter for eBIC selection, to be set between 0 and 1. When set to 0, the criterion reduces to the standard BIC. Defaults to 0.5.
verbose: Print information regarding the progress of the selection procedure on the console. Defaults to TRUE.
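The default rule documented for stars_subsample_ratio can be written out as a short base R sketch. The helper name default_subsample_ratio() is hypothetical and is not part of the package; it only restates the rule as documented here (80% below 144 samples, \(\frac{10}{n}\sqrt{n}\) otherwise).

```r
# Hypothetical helper restating the documented default;
# not the package's internal code.
default_subsample_ratio <- function(n) {
  stopifnot(is.numeric(n), length(n) == 1, n > 0)
  if (n < 144) 0.8 else 10 * sqrt(n) / n
}

default_subsample_ratio(100)  # 0.8
default_subsample_ratio(400)  # 10 * 20 / 400 = 0.5
```

Since \(\frac{10}{n}\sqrt{n} = 10/\sqrt{n}\), the proportion shrinks as the data set grows, which keeps each subsample at roughly \(10\sqrt{n}\) samples.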
select_coglasso() provides three model selection strategies:
"xstars" uses eXtended StARS (XStARS) selecting the most stable, yet sparse
network. Stability is computed upon network estimation from multiple subsamples of the
multi-omics data set, allowing repetition. Subsamples are collected a
fixed number of times (rep_num), each with a fixed proportion of the total
number of samples (stars_subsample_ratio). See xstars() for more
information on the methodology.
"xestars" uses eXtended Efficient StARS (XEStARS), a significantly
faster version of XStARS. It could produce marginally different results
to "xstars" due to a different sampling strategy. See xestars() for
more information on the methodology.
"ebic" uses the extended Bayesian Information
Criterion (eBIC), selecting the network that minimizes it. ebic_gamma sets the
weight given to the extended component, reducing the criterion to
the standard BIC if set to 0.
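To make the role of the \(\gamma\) parameter concrete, the following base R sketch computes an eBIC score for a single estimated network. It assumes the usual eBIC form for Gaussian graphical models, \(-2\ell + |E|\log n + 4\gamma|E|\log p\); whether the package computes its scores in exactly this way is an assumption. The function name ebic_ggm() and the toy inputs are hypothetical.

```r
# Hypothetical helper: eBIC of one Gaussian graphical model.
# S: sample covariance, Theta: estimated inverse covariance,
# n: number of samples, gamma: weight of the extended term.
ebic_ggm <- function(S, Theta, n, gamma = 0.5) {
  p <- ncol(S)
  n_edges <- sum(Theta[upper.tri(Theta)] != 0)
  log_det <- as.numeric(determinant(Theta, logarithm = TRUE)$modulus)
  loglik <- (n / 2) * (log_det - sum(diag(S %*% Theta)))
  -2 * loglik + n_edges * log(n) + 4 * gamma * n_edges * log(p)
}

# Toy example: three variables, one estimated edge
S <- diag(3)
Theta <- diag(3)
Theta[1, 2] <- Theta[2, 1] <- 0.2

bic_score  <- ebic_ggm(S, Theta, n = 50, gamma = 0)    # standard BIC
ebic_score <- ebic_ggm(S, Theta, n = 50, gamma = 0.5)  # extra penalty per edge
```

With gamma = 0 the last term vanishes and the score is the standard BIC; larger values penalize each additional edge more heavily, favoring sparser networks.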
cg <- coglasso(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
               nlambda_b = 3, nc = 3, verbose = FALSE)
# Using eXtended Efficient StARS, takes less than five seconds
sel_cg_xestars <- select_coglasso(cg, method = "xestars", verbose = FALSE)
# Using eXtended StARS, takes around a minute
sel_cg_xstars <- select_coglasso(cg, method = "xstars", verbose = FALSE)
# Using eBIC
sel_cg_ebic <- select_coglasso(cg, method = "ebic", verbose = FALSE)