huge.select: Model selection for high-dimensional undirected graph estimation

Description

Implements regularization parameter selection for high-dimensional undirected graph estimation. The available approaches are the Stability Approach to Regularization Selection (StARS) and a variant of the extended BIC.

Usage

huge.select(est, criterion = NULL, EBIC.gamma = 0.5, stars.thresh = 0.1, 
sample.ratio = NULL, rep.num = 20, verbose = TRUE)

Arguments

est

An object with S3 class "huge" (output from huge)

criterion

Model selection criterion. If criterion = "EBIC", the extended BIC is applied (only applicable when est$approx = FALSE). If criterion = "stars", StARS is applied. The default value is "EBIC" when est$approx = FALSE and "stars" when est$approx = TRUE.

EBIC.gamma

The tuning parameter for the extended BIC criterion. The default value is 0.5. Only applicable when est$approx = FALSE and criterion = "EBIC".

stars.thresh

The variability threshold in StARS selection. The default value is 0.1. An alternative value is 0.05. Only applicable when criterion = "stars".

sample.ratio

The subsampling ratio. The default value is 10*sqrt(n)/n when n > 144 and 0.8 when n <= 144, where n is the sample size. Only applicable when criterion = "stars".
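
The default rule above can be written out directly. This is a minimal sketch in base R; default_sample_ratio is a hypothetical helper name used for illustration only and is not a function exported by the package.

```r
# Hypothetical helper mirroring the documented default for sample.ratio.
# n is the sample size (number of rows in the data matrix).
default_sample_ratio <- function(n) {
  if (n > 144) 10 * sqrt(n) / n else 0.8
}

default_sample_ratio(100)  # small sample: fixed ratio of 0.8
default_sample_ratio(400)  # large sample: 10 * 20 / 400 = 0.5
```

Since 10*sqrt(n)/n equals 10/sqrt(n), the subsample size under this rule grows like 10*sqrt(n) rather than linearly in n.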

rep.num

The number of subsamples drawn for StARS selection. The default value is 20. Only applicable when criterion = "stars".

verbose

If verbose = FALSE, tracing information printing is disabled. The default value is TRUE.

Value

An object with S3 class "select" is returned:

refit

The optimal graph selected from the solution path.

merge

The estimate obtained by merging the estimates from different subsamples. Only applicable when criterion = "stars".

EBIC.scores

Extended BIC scores for regularization parameter selection. Only applicable when criterion = "EBIC".

opt.index

The index of the selected regularization parameter.

opt.lambda

The selected regularization/thresholding parameter.

opt.sparsity

The sparsity level of "refit".

graph

Returns "subgraph" when k < d and "fullgraph" when k == d.

The returned object also contains everything else included in the input est.

Details

StARS is a natural way to select the optimal regularization parameter for high-dimensional undirected graphical models. It also provides an additional estimated graph by merging the corresponding subsampled graphs using frequency counts. Because the subsampling procedure in StARS can be computationally expensive, we also provide another method: an extended BIC score based on the pseudo-likelihood. Its theoretical properties, however, have not yet been established.
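
The subsampling idea behind StARS can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's implementation: the graph estimator is plain correlation thresholding (chosen only for brevity), the data are simulated, and the variable names simply echo the arguments documented above.

```r
# Sketch of the StARS idea. The estimator below (thresholding absolute
# correlations) is an assumption for illustration, not what huge() does.
set.seed(1)
n <- 200; d <- 10
x <- matrix(rnorm(n * d), n, d)
x[, 2] <- x[, 1] + 0.3 * rnorm(n)      # plant one strong edge (1, 2)

lambdas <- seq(0.9, 0.1, by = -0.1)    # decreasing: sparse to dense graphs
rep.num <- 20
b <- floor(10 * sqrt(n))               # subsample size, since n > 144
upper <- upper.tri(diag(d))            # each undirected edge counted once

# Selection frequency of every edge, for every lambda, across subsamples.
freq <- sapply(lambdas, function(lam) {
  hits <- replicate(rep.num, {
    idx <- sample(n, b)
    abs(cor(x[idx, ]))[upper] > lam
  })
  rowMeans(hits)
})

# Instability per lambda: mean of 2 * theta * (1 - theta) over edges,
# made monotone along the path with a running maximum.
instab <- cummax(colMeans(2 * freq * (1 - freq)))

# Pick the densest graph whose instability stays within the threshold.
stars.thresh <- 0.1
opt.index <- max(which(instab <= stars.thresh))
opt.lambda <- lambdas[opt.index]
```

In this sketch, freq[, opt.index] plays the role of the merged estimate: each entry is the fraction of subsamples in which the corresponding edge was selected.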

References

Tuo Zhao and Han Liu. HUGE: A Package for High-dimensional Undirected Graph Estimation. Technical Report, Carnegie Mellon University, 2010.

Han Liu, Kathryn Roeder and Larry Wasserman. Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models. Advances in Neural Information Processing Systems, 2010.

Jiahua Chen and Zehua Chen. Extended Bayesian information criterion for model selection with large model space. Biometrika, 95, 759-771.

Examples

#generate data
L = huge.generator(graph="hub")

#subset indices
ind.group = c(1:30)

#estimate subgraph solution path using Meinshausen & Buhlmann graph estimation
out.huge = huge(L, ind.group = ind.group)

#model selection using extended BIC scores
out.select = huge.select(out.huge)
summary(out.select)
plot(out.select)

#model selection using stars
out.select = huge.select(out.huge, criterion = "stars", rep.num = 5)
summary(out.select)
plot(out.select)

#estimate subgraph solution path using GECA
out.approx = huge(L, ind.group = ind.group, approx = TRUE)

#model selection using stars
out.select = huge.select(out.approx, rep.num = 10)
summary(out.select)
plot(out.select)
