
huge (version 0.9.1)

huge.subgraph: Subgraph estimation using Meinshausen & Buhlmann Graph Estimation via Lasso

Description

Implements Meinshausen & Buhlmann Graph Estimation via Lasso (GEL). It estimates the neighborhood of each variable by fitting a collection of Lasso regression problems.

Usage

huge.subgraph(x, ind.group = NULL, ind.mat = NULL, alpha = 1, lambda = NULL, 
nlambda = 10, lambda.min.ratio = 0.1, sym = "or", verbose = TRUE)

Arguments

x
The n by d data matrix representing n observations in d dimensions
ind.group
A k-dimensional vector indexing a subset of all d variables. ONLY applicable when estimating a subgraph of the whole graph. The default value is c(1:d).
ind.mat
A scr.num by k matrix. Each column corresponds to a variable in ind.group and contains the indices of the preselected neighborhood.
alpha
The tuning parameter for the elastic-net regression. The default value is 1 (lasso). When some dense pattern exists in the graph or some variables are highly correlated, the elastic net is encouraged for its grouping effect.
lambda
A sequence of decreasing positive numbers to control regularization. Typical usage is to leave the input lambda = NULL and have the program compute its own lambda sequence based on nlambda and lambda.min.ratio.
nlambda
The number of regularization parameters. The default value is 10.
lambda.min.ratio
The smallest value for lambda, as a fraction of the upper bound (MAX) of the regularization parameter which makes all estimates equal to 0. The program can automatically generate lambda as a sequence of
sym
Symmetrize the output graphs. If sym = "and", the edge between node i and node j is selected ONLY when both node i and node j are selected as neighbors for each other. If sym = "or"
verbose
If verbose = FALSE, tracing information printing is disabled. The default value is TRUE.
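The way lambda, nlambda and lambda.min.ratio interact can be sketched in base R. This is only an illustration, not the package's internal code: make_lambda_seq is a hypothetical helper, the log-scale spacing is assumed by analogy with the exp(seq(log(...), log(...))) construction in the Examples section, and the upper bound MAX is assumed to be the largest absolute off-diagonal sample correlation.

```r
# Hypothetical helper, not part of the huge package.
make_lambda_seq <- function(x, nlambda = 10, lambda.min.ratio = 0.1) {
  # Assumption: use the largest absolute off-diagonal sample correlation
  # as the upper bound (MAX) of the regularization parameter.
  S <- cor(x)
  diag(S) <- 0
  lambda.max <- max(abs(S))
  lambda.min <- lambda.min.ratio * lambda.max
  # Decreasing sequence, evenly spaced on the log scale (assumed spacing).
  exp(seq(log(lambda.max), log(lambda.min), length.out = nlambda))
}

set.seed(1)
x <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
lambda <- make_lambda_seq(x, nlambda = 10, lambda.min.ratio = 0.1)
print(lambda)
```

The resulting vector has nlambda entries, starts at the assumed upper bound, and ends at lambda.min.ratio times that bound, so it could be passed as the lambda argument in place of the automatic sequence.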

Value

An object with S3 class "subgraph" is returned:
  • path: A list of k by k adjacency matrices (in sparse matrix representation) of estimated graphs as the solution path corresponding to lambda.
  • lambda: The sequence of regularization parameters used in the graph estimation.
  • rss: A k by nlambda matrix. Each row corresponds to a variable in ind.group and contains the RSS (Residual Sum of Squares) values along the lasso solution path.
  • df: A k by nlambda matrix. Each row corresponds to a variable in ind.group and contains the number of nonzero coefficients along the lasso solution path.
  • sparsity: The sparsity levels of the solution path.

Details

GEL reduces precision matrix estimation to fitting a collection of Lasso regression problems, using each variable in turn as the response and the others as predictors. Unlike Graphical Lasso (GLASSO), it does not estimate the entries of the precision matrix, ONLY the underlying graph structure. It is computationally easier and can be more flexible in high-dimensional settings. In some situations, such as gene regulatory network analysis, we are only interested in the structure of the graph or of a subgraph of the whole graph, and GEL is more scalable than other existing algorithms. The implementation is based on the well-known, computationally efficient package "glmnet".
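The neighborhood-selection idea described above can be sketched in base R for a single value of lambda. This is a minimal illustration under stated assumptions, not the package's implementation (which relies on glmnet): soft, lasso_cd and neighborhood_select are hypothetical names, the lasso is solved with a plain cyclic coordinate descent, and the "or" rule is taken to mean that an edge is kept when either node selects the other, as in Meinshausen and Buhlmann's paper.

```r
# Soft-thresholding operator used by coordinate descent.
soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Hypothetical helper: lasso by cyclic coordinate descent for the objective
# (1 / (2n)) * ||y - X b||^2 + lambda * ||b||_1.
lasso_cd <- function(X, y, lambda, n.iter = 100) {
  n <- nrow(X)
  b <- numeric(ncol(X))
  v <- colMeans(X^2)        # per-column mean square
  r <- y                    # residual for b = 0
  for (it in seq_len(n.iter)) {
    for (j in seq_along(b)) {
      zj <- sum(X[, j] * r) / n + v[j] * b[j]
      bj <- soft(zj, lambda) / v[j]
      r <- r - X[, j] * (bj - b[j])   # keep the residual in sync
      b[j] <- bj
    }
  }
  b
}

# Hypothetical helper: regress each variable on all the others and record
# which coefficients are nonzero, then symmetrize with the "and"/"or" rule.
neighborhood_select <- function(x, lambda, sym = "or") {
  X <- scale(x)
  d <- ncol(X)
  A <- matrix(FALSE, d, d)
  for (j in seq_len(d)) {
    b <- lasso_cd(X[, -j, drop = FALSE], X[, j], lambda)
    A[j, -j] <- b != 0
  }
  if (sym == "and") A & t(A) else A | t(A)
}

# Simulated chain x1 - x2 - x3 plus an independent variable x4.
set.seed(1)
n <- 2000
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
x3 <- 0.8 * x2 + rnorm(n, sd = 0.6)
x4 <- rnorm(n)
x <- cbind(x1, x2, x3, x4)

adj.or  <- neighborhood_select(x, lambda = 0.2, sym = "or")
adj.and <- neighborhood_select(x, lambda = 0.2, sym = "and")
print(adj.or)
```

Only the support of each regression is kept, which is why this approach recovers the graph structure without producing numerical estimates of the precision matrix. The "and" graph is always a subgraph of the "or" graph.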

References

1. Tuo Zhao and Han Liu. HUGE: A Package for High-dimensional Undirected Graph Estimation. Technical Report, Carnegie Mellon University, 2010.
2. Jerome Friedman, Trevor Hastie and Rob Tibshirani. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 2008.
3. Nicolai Meinshausen and Peter Buhlmann. High-dimensional Graphs and Variable Selection with the Lasso. The Annals of Statistics, 2006.

See Also

huge and huge-package

Examples

library(huge)

#generate data
L = huge.generator(graph = "hub")
ind.group = c(1:30)
ind.mat = huge.scr(L$data, ind.group)$ind.mat
lambda = exp(seq(log(0.8),log(0.1),length=12))

#subgraph solution path estimation with the preselected neighborhood and specified lambda sequence
sub.path = huge.subgraph(L$data, ind.group = ind.group, ind.mat = ind.mat, lambda = lambda)
summary(sub.path)
plot(sub.path)

#subgraph solution path estimation with specified lambda.min.ratio and nlambda
sub.path = huge.subgraph(L$data, ind.group = ind.group, ind.mat = ind.mat, 
nlambda = 8, lambda.min.ratio = 0.01)
summary(sub.path)
plot(sub.path)

#graph solution path estimation using elastic net
sub.path = huge.subgraph(L$data, ind.group = ind.group, alpha = 0.7)
summary(sub.path)
plot(sub.path)
