huge (version 1.0.1)

huge.MBGEL: Meinshausen & Buhlmann Graph Estimation via Lasso

Description

Implements Meinshausen & Buhlmann Graph Estimation via Lasso (MBGEL). It estimates the neighborhood of each variable by fitting a collection of Lasso regression problems.

Usage

huge.MBGEL(x, lambda = NULL, nlambda = NULL, lambda.min.ratio = NULL, scr = NULL, 
scr.num = NULL, idx.mat = NULL, sym = "or", verbose = TRUE)

Arguments

x
The n by d data matrix representing n observations in d dimensions
idx.mat
A scr.num by d matrix. Each column contains the indices of the preselected neighborhood. Typical usage is to leave the input idx.mat = NULL and have the program compute its own idx.mat matrix based on
lambda
A sequence of decreasing positive numbers to control regularization. Typical usage is to leave the input lambda = NULL and have the program compute its own lambda sequence based on nlambda and lambda.min.ratio.
nlambda
The number of regularization parameters. The default value is 10.
lambda.min.ratio
The smallest value for lambda, as a fraction of the upper bound (MAX) of the regularization parameter which makes all estimates equal to 0. The program can automatically generate lambda as a sequence of
scr
If scr = TRUE, the Graph Sure Screening (GSS) is applied to preselect the neighborhood for MBGEL. The default value is TRUE for n < d and FALSE for n >= d.
scr.num
The neighborhood size after the GSS (the number of remaining neighbors per node). ONLY applicable when scr = TRUE. The default value is n-1. An alternative value is n/log(n).
sym
Symmetrize the output graphs. If sym = "and", the edge between node i and node j is selected ONLY when both node i and node j are selected as neighbors for each other. If sym = "or"
verbose
If verbose = FALSE, tracing information printing is disabled. The default value is TRUE.
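
The effect of sym can be illustrated with a small base R sketch. It is not the package's code: symmetrize and the toy matrix nb are hypothetical names introduced here, and the "or" branch assumes the usual Meinshausen & Buhlmann convention that an edge is kept when either node selects the other.

```r
# Hypothetical helper for illustration; not part of the huge package.
# nb[i, j] is TRUE when node j was selected as a neighbor of node i.
symmetrize <- function(nb, sym = c("or", "and")) {
  sym <- match.arg(sym)
  # "and": both directions must agree; "or" (assumed): one direction suffices
  adj <- if (sym == "and") nb & t(nb) else nb | t(nb)
  diag(adj) <- FALSE  # no self-loops
  adj
}

# Toy 3-node example: node 1 selects node 2, but node 2 does not select node 1.
nb <- matrix(FALSE, 3, 3)
nb[1, 2] <- TRUE
nb[2, 3] <- TRUE
nb[3, 2] <- TRUE

adj_and <- symmetrize(nb, "and")  # keeps only the mutual edge 2-3
adj_or  <- symmetrize(nb, "or")   # keeps edges 1-2 and 2-3
```

The "and" rule can only remove edges relative to the "or" rule, so it yields sparser graphs at the same lambda.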

Value

An object with S3 class "MBGEL" is returned:

  • path: A list of k by k adjacency matrices (in sparse matrix representation) of estimated graphs as the solution path corresponding to lambda.
  • lambda: The sequence of regularization parameters used in the graph estimation.
  • rss: A k by nlambda matrix. Each row corresponds to a variable in ind.group and contains all RSS's (Residual Sum of Squares) along the lasso solution path.
  • df: A k by nlambda matrix. Each row corresponds to a variable in ind.group and contains the number of nonzero coefficients along the lasso solution path.
  • sparsity: The sparsity levels of the graph path.

Details

MBGEL reduces precision matrix estimation to fitting a collection of Lasso regression problems, using each variable in turn as the response and the others as predictors. Unlike the Graphical Lasso (GLASSO), it does not produce a numerical estimate of the precision matrix, ONLY the underlying graph structure. It is computationally easier and can be more flexible in high-dimensional settings. It can be further accelerated by the Graph Sure Screening (GSS), which preselects the neighborhood of each node in ultrahigh-dimensional settings. By reducing the dimensionality from ultra-high to a medium level (usually below the sample size), the GSS can greatly reduce the computational burden and often achieves estimation as good as or better than that obtained without the GSS. The implementation is based on C.
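
The per-node regressions described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's C implementation: soft, lasso_cd, screen_neighbors and mb_neighbors are hypothetical helpers, the lasso is solved by plain cyclic coordinate descent, and the screening step ranks candidates by absolute marginal correlation in the spirit of sure independence screening (the exact GSS criterion is not stated here).

```r
# Soft-thresholding operator used by the lasso coordinate update.
soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Lasso by cyclic coordinate descent for
#   (1 / (2n)) * ||y - X b||^2 + lambda * ||b||_1,
# assuming every column of X satisfies mean(X[, j]^2) == 1.
lasso_cd <- function(X, y, lambda, iters = 200) {
  n <- nrow(X)
  b <- numeric(ncol(X))
  r <- y                                  # current residual y - X %*% b
  for (it in seq_len(iters)) {
    for (j in seq_len(ncol(X))) {
      zj <- sum(X[, j] * r) / n + b[j]    # fit to the partial residual
      bj <- soft(zj, lambda)
      r <- r - X[, j] * (bj - b[j])       # keep the residual in sync
      b[j] <- bj
    }
  }
  b
}

# Assumed screening rule: keep the scr_num most correlated other variables.
# Returns a scr_num by d index matrix, one column per node.
screen_neighbors <- function(x, scr_num) {
  C <- abs(cor(x))
  diag(C) <- -Inf                         # a node is never its own neighbor
  top <- apply(C, 2, function(col) order(col, decreasing = TRUE)[seq_len(scr_num)])
  matrix(top, nrow = scr_num)
}

# Neighborhood selection: regress each variable on its candidate neighbors.
mb_neighbors <- function(x, lambda, idx_mat = NULL) {
  d <- ncol(x)
  X <- sweep(x, 2, colMeans(x))                  # center
  X <- sweep(X, 2, sqrt(colMeans(X^2)), "/")     # scale to mean square 1
  nb <- matrix(FALSE, d, d)
  for (i in seq_len(d)) {
    cand <- if (is.null(idx_mat)) setdiff(seq_len(d), i) else idx_mat[, i]
    b <- lasso_cd(X[, cand, drop = FALSE], X[, i], lambda)
    nb[i, cand] <- b != 0                        # nonzero coefficient = neighbor
  }
  nb
}

# Toy data: x2 depends on x1, x3 is independent noise.
set.seed(1)
n <- 200
x1 <- rnorm(n); x2 <- x1 + 0.5 * rnorm(n); x3 <- rnorm(n)
x <- cbind(x1, x2, x3)

idx <- screen_neighbors(x, scr_num = 1)          # one candidate per node
nb  <- mb_neighbors(x, lambda = 0.3, idx_mat = idx)
nb_empty <- mb_neighbors(x, lambda = 1)          # large lambda: no neighbors
```

In this standardized parametrization, any lambda at or above the largest absolute sample correlation between two variables sets every coefficient to zero, which corresponds to the upper bound that lambda.min.ratio is measured against. The result nb is a directed selection matrix; it still has to be symmetrized according to sym to give an undirected graph.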

References

1. Tuo Zhao and Han Liu. HUGE: A Package for High-dimensional Undirected Graph Estimation. Technical Report, Carnegie Mellon University, 2010.
2. Jianqing Fan and Jinchi Lv. Sure independence screening for ultra-high dimensional feature space (with discussion). Journal of the Royal Statistical Society B, 2008.
3. Jerome Friedman, Trevor Hastie and Rob Tibshirani. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 2008.
4. Nicolai Meinshausen and Peter Buhlmann. High-dimensional Graphs and Variable Selection with the Lasso. The Annals of Statistics, 2006.

See Also

huge and huge-package

Examples

#generate data
L = huge.generator(n = 100, d = 200, graph = "hub")

#graph path estimation with the GSS
out = huge.MBGEL(L$data)
summary(out)
plot(out)

#graph path estimation with specified lambda.min.ratio and nlambda
out = huge.MBGEL(L$data, nlambda = 8, lambda.min.ratio = 0.05)
summary(out)
plot(out)

#graph path estimation without the GSS
sub.path = huge.MBGEL(L$data, scr = FALSE)
summary(sub.path)
plot(sub.path)