
huge (version 0.8.1)

huge.npn: NonparaNormal (NPN) transformation

Description

Implements the NonparaNormal transformation (Gaussianization) to relax the assumption of normality.

Usage

huge.npn(x, npn.func = "shrinkage", npn.thresh, verbose = TRUE)

Arguments

x
The n by d data matrix representing n observations in d dimensions.
npn.func
The transformation function used in the NPN transformation. If npn.func = "truncation", the truncated ECDF is applied. If npn.func = "shrinkage", the shrunken ECDF is applied. The default is "shrinkage".
npn.thresh
The truncation threshold used in NPN transformation, ONLY applicable when npn.func = "truncation". The default value is 1/(4*(n^0.25)* sqrt(pi*log(n))).
verbose
If verbose = FALSE, tracing information printing is disabled. The default value is TRUE.
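
As a rough illustration of the default value given for npn.thresh, the base R sketch below evaluates that formula for a few sample sizes and shows how such a threshold could clamp ECDF values before they are mapped through qnorm(). The helper name default_npn_thresh is hypothetical and not part of huge, and the clamping step is an assumption about how the truncation is applied, not the package's own code.

```r
# Hypothetical helper (not part of the huge package): evaluates the
# documented default truncation threshold for a sample of size n.
default_npn_thresh <- function(n) {
  1 / (4 * (n^0.25) * sqrt(pi * log(n)))
}

# The threshold shrinks as the sample size grows.
thresholds <- sapply(c(50, 200, 1000), default_npn_thresh)
print(thresholds)

# Assumed truncation step: ECDF values are clamped to [thresh, 1 - thresh]
# so that qnorm() never receives 0 or 1.
n <- 200
thresh <- default_npn_thresh(n)
u <- (1:n) / n                                # ECDF values at the order statistics
u_trunc <- pmin(pmax(u, thresh), 1 - thresh)  # clamp both tails
range(qnorm(u_trunc))                         # finite at both ends
```

Without the clamp, the largest ECDF value equals 1 and qnorm(1) is infinite, which is why the truncated estimator needs a threshold at all.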

Value

An object with S3 class "npn" is returned:

  • data: The n by d data matrix representing n observations in d transformed dimensions
  • ntdata: The original data matrix before the NPN transformation
  • npn.func: The transformation function used in the NPN transformation

Details

The NPN transformation is a useful tool for relaxing the normality assumption. Computationally, fitting a high-dimensional NPN model is no more difficult than estimating a multivariate Gaussian, and the transformed data can also be passed to other existing packages such as glasso. The transformed data are already standardized to have sample mean zero and unit variance.
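
To make the standardization claim concrete, here is a minimal base R sketch of a shrunken-ECDF transformation. It assumes the shrunken ECDF is the column-wise rank divided by n + 1, followed by qnorm() and a rescaling to unit variance. The function name npn_shrinkage_sketch is hypothetical, and this is an illustration of the idea rather than the implementation used by huge.npn.

```r
# Hypothetical helper, not part of the huge package.
npn_shrinkage_sketch <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Assumed shrunken ECDF: ranks divided by n + 1 stay strictly inside (0, 1)
  u <- apply(x, 2, rank) / (n + 1)
  # Map the ECDF values to Gaussian quantiles
  z <- qnorm(u)
  # Rescale each column to unit sample standard deviation
  sweep(z, 2, apply(z, 2, sd), "/")
}

set.seed(1)
x <- matrix(rexp(200 * 3), nrow = 200, ncol = 3)  # skewed, non-Gaussian data
z <- npn_shrinkage_sketch(x)

colMeans(z)       # approximately zero in every column
apply(z, 2, sd)   # one in every column
```

Because the transformation depends only on ranks, any strictly increasing distortion of a column (such as the exponential draw above) leaves the output unchanged.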

References

Tuo Zhao and Han Liu. HUGE: A Package for High-dimensional Undirected Graph Estimation. Technical Report, Carnegie Mellon University, 2010.
Han Liu, John Lafferty and Larry Wasserman. The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs. Journal of Machine Learning Research (JMLR), Vol. 10, pp. 2295-2328, 2009.

See Also

huge and huge-package.

Examples

# load the package
library(huge)

# generate data
L = huge.generator(graph = "cluster", g = 5)

# transform the non-Gaussian data using the shrunken ECDF
Q = huge.npn(L$data^5)
summary(Q)
plot(Q)

# transform the non-Gaussian data using the truncated ECDF
Q = huge.npn(5^(L$data), npn.func = "truncation")
plot(Q)
