huge.npn: Nonparanormal (npn) transformation

Description

Implements Gaussianization to help relax the assumption of normality.

Usage

huge.npn(x, npn.func = "shrinkage", npn.thresh = NULL, verbose = TRUE)

Arguments

x

The n by d data matrix representing n observations in d dimensions.

npn.func

The transformation function used in the npn transformation. If npn.func = "truncation", the truncated ECDF is applied. If npn.func = "shrinkage", the shrunken ECDF is applied. If npn.func = "skeptic", the nonparanormal skeptic is applied. The default is "shrinkage".

npn.thresh

The truncation threshold used in the nonparanormal transformation. It is only applicable when npn.func = "truncation". The default value is 1/(4*(n^0.25)*sqrt(pi*log(n))).

verbose

If verbose = FALSE, tracing information printing is disabled. The default value is TRUE.
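To give a feel for the default npn.thresh stated above, the following base-R sketch evaluates the documented formula for a few sample sizes. The helper name default_npn_thresh is hypothetical and not part of the huge package; only the formula itself comes from this page.

```r
# Hypothetical helper (not part of huge): evaluates the documented default
# truncation threshold 1/(4*(n^0.25)*sqrt(pi*log(n))) for sample size n.
default_npn_thresh <- function(n) {
  1 / (4 * (n^0.25) * sqrt(pi * log(n)))
}

# Illustrative sample sizes, chosen arbitrarily
n <- c(50, 200, 1000)
thresh <- default_npn_thresh(n)

# The threshold shrinks as n grows, so less of each tail is truncated
print(data.frame(n = n, thresh = thresh))
```

Because the threshold decreases with n, the truncated ECDF is clipped less aggressively for larger samples.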

Value

data

A d by d nonparanormal correlation matrix if npn.func = "skeptic"; otherwise, an n by d data matrix representing n observations in d transformed dimensions.

Details

The nonparanormal extends Gaussian graphical models to semiparametric Gaussian copula models. Motivated by sparse additive models, the nonparanormal method estimates the Gaussian copula by marginally transforming the variables using smooth functions. Computationally, the estimation of a nonparanormal transformation is very efficient and requires only one pass over the data matrix.
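The marginal transformations described above can be sketched in base R. This is an illustration under stated assumptions, not the package source: the function names are hypothetical, the shrunken ECDF is assumed to be rank/(n + 1), the truncated ECDF is assumed to be rank/n clipped to [thresh, 1 - thresh], and the skeptic is taken as 2*sin(pi/6 * rho) with rho the Spearman correlation, as in Liu et al. (2012). huge.npn may apply additional rescaling that is not reproduced here.

```r
# Sketch only: base-R approximations of the three transformations.
# All function names below are hypothetical and not part of huge.

npn_shrinkage_sketch <- function(x) {
  n <- nrow(x)
  # Dividing ranks by (n + 1) keeps the ECDF strictly inside (0, 1)
  qnorm(apply(x, 2, rank) / (n + 1))
}

npn_truncation_sketch <- function(x, thresh = NULL) {
  n <- nrow(x)
  # Documented default threshold when none is supplied
  if (is.null(thresh)) thresh <- 1 / (4 * (n^0.25) * sqrt(pi * log(n)))
  u <- apply(x, 2, rank) / n
  # Clip the ECDF to [thresh, 1 - thresh] before mapping to normal quantiles
  qnorm(pmin(pmax(u, thresh), 1 - thresh))
}

npn_skeptic_sketch <- function(x) {
  # Rank-based correlation estimate; returns a d by d matrix
  2 * sin(pi / 6 * cor(x, method = "spearman"))
}

# Toy data: Gaussian columns pushed through a monotone distortion
set.seed(1)
z <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)
x <- z^5

x_shr <- npn_shrinkage_sketch(x)   # n by d
x_tru <- npn_truncation_sketch(x)  # n by d
s_skp <- npn_skeptic_sketch(x)     # d by d

print(dim(x_shr))
print(dim(s_skp))
```

Each transformation uses only the ranks within a column, so a monotone distortion of the marginals such as z^5 leaves the result unchanged. This is the sense in which the approach relaxes the normality assumption.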

References

1. T. Zhao and H. Liu. The huge Package for High-dimensional Undirected Graph Estimation in R. Journal of Machine Learning Research, 2012.
2. H. Liu, F. Han, M. Yuan, J. Lafferty and L. Wasserman. High Dimensional Semiparametric Gaussian Copula Graphical Models. Annals of Statistics, 2012.
3. D. Witten and J. Friedman. New insights and faster computations for the graphical lasso. Journal of Computational and Graphical Statistics, to appear, 2011.
4. H. Liu, K. Roeder and L. Wasserman. Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models. Advances in Neural Information Processing Systems, 2010.
5. R. Foygel and M. Drton. Extended Bayesian information criteria for Gaussian graphical models. Advances in Neural Information Processing Systems, 2010.
6. H. Liu, J. Lafferty and L. Wasserman. The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs. Journal of Machine Learning Research, 2009.
7. J. Fan and J. Lv. Sure independence screening for ultra-high dimensional feature space (with discussion). Journal of Royal Statistical Society B, 2008.
8. O. Banerjee, L. E. Ghaoui and A. d'Aspremont. Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data. Journal of Machine Learning Research, 2008.
9. J. Friedman, T. Hastie and R. Tibshirani. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 2008.
10. J. Friedman, T. Hastie and R. Tibshirani. Sparse inverse covariance estimation with the lasso. Biostatistics, 2007.
11. N. Meinshausen and P. Buhlmann. High-dimensional Graphs and Variable Selection with the Lasso. The Annals of Statistics, 2006.

Examples


# NOT RUN {
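library(huge)

# generate nonparanormal data: sample Gaussian data from a cluster graph,
# then raise it to the 5th power so the marginals are no longer Gaussian
L = huge.generator(graph = "cluster", g = 5)
L$data = L$data^5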

# transform the data using the shrunken ECDF
Q = huge.npn(L$data)

# transform the non-Gaussian data using the truncated ECDF
Q = huge.npn(L$data, npn.func = "truncation")

# estimate the correlation matrix of the non-Gaussian data using the nonparanormal skeptic
Q = huge.npn(L$data, npn.func = "skeptic")

# }
