pcorSimulator: Partial Correlation Matrix simulator

Description

pcorSimulator creates a block diagonal, positive definite precision matrix with one of three possible graph structures: hubs-based, power-law or random. It then generates samples from a multivariate normal distribution whose covariance matrix is the inverse of that precision matrix.
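The sampling step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the 3 x 3 precision matrix `omega` and the helper name `rmvnormFromPrecision` are hypothetical, and the draw uses a Cholesky factor of the covariance matrix.

```r
# Hypothetical helper (not part of the package): draw nobs samples from
# N(mu, solve(omega)) using a Cholesky factor of the covariance matrix.
rmvnormFromPrecision <- function(nobs, omega, mu = rep(0, ncol(omega))) {
  sigma <- solve(omega)                 # covariance = inverse of the precision matrix
  R <- chol(sigma)                      # upper triangular, t(R) %*% R equals sigma
  z <- matrix(rnorm(nobs * ncol(omega)), nrow = nobs)
  sweep(z %*% R, 2, mu, "+")            # one observation per row
}

# Small illustrative precision matrix (symmetric, positive definite).
omega <- matrix(c( 2.0, -0.8,  0.0,
                  -0.8,  2.0, -0.8,
                   0.0, -0.8,  2.0), nrow = 3, byrow = TRUE)

set.seed(1)
y <- rmvnormFromPrecision(nobs = 200, omega = omega)
dim(y)
```

With enough observations, `solve(cov(y))` should be close to `omega`, which is a quick sanity check on any simulated data set.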

Usage

pcorSimulator(nobs, nclusters, nnodesxcluster, pattern = "powerLaw", 
              low.strength = 0.5, sup.strength = 0.9, nhubs = 5, 
              degree.hubs = 20, nOtherEdges = 30, alpha = 2.3, plus = 0, 
              prob = 0.05, perturb.clust = 0, mu = 0,
              probSign = 0.5, seed = sample(10000, nclusters))

Arguments

nobs

number of observations.

nclusters

number of clusters or blocks of variables.

nnodesxcluster

number of nodes/variables per cluster.

pattern

graph structure pattern: a name that uniquely identifies one of "hubs", "powerLaw" or "random".

low.strength

minimum magnitude for nonzero partial correlation elements before regularization.

sup.strength

maximum magnitude for nonzero partial correlation elements before regularization.

nhubs

number of hubs per cluster (if pattern = "hubs").

degree.hubs

degree of hubs (if pattern = "hubs").

nOtherEdges

number of edges for non-hub nodes (if pattern = "hubs").

alpha

positive coefficient for the Riemann zeta function in power-law distributions.

plus

power-law distribution added complexity (zero by default).

prob

probability of edge presence for random networks (if pattern = "random").

perturb.clust

proportion of the total number of edges that connect two different clusters.

mu

vector of expected values used to generate the data (zero by default).

probSign

probability of positive sign for non-zero partial correlation coefficients. Thus, negative signs are obtained with probability 1-probSign.

seed

vector with seeds for each cluster.

Value

An object of class pcorSim containing the following components:

generated data set.

hubs

hub nodes position.

edgesInGraph

edges given by the non-zero elements in the precision matrix.

omega

precision matrix used to generate the data.

covMat

covariance matrix used to generate the data.

path

adjacency matrix corresponding to the non-zero structure of omega.
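The relationship between a precision matrix, its partial correlations and its adjacency matrix can be sketched in base R. The matrix `omega` below is a hypothetical toy example rather than output of pcorSimulator, and setting the diagonal of the adjacency matrix to zero is an assumption that may differ from the returned `path` component.

```r
# Toy precision matrix (hypothetical, for illustration only).
omega <- matrix(c( 2.0, -0.8,  0.0,
                  -0.8,  2.0, -0.8,
                   0.0, -0.8,  2.0), nrow = 3, byrow = TRUE)

# Partial correlations: rho_ij = -omega_ij / sqrt(omega_ii * omega_jj).
pcor <- -cov2cor(omega)
diag(pcor) <- 1

# Adjacency matrix from the non-zero structure of omega
# (diagonal set to zero here by assumption).
path <- (omega != 0) * 1
diag(path) <- 0

# Edge list: one row per edge, taken from the upper triangle.
edges <- which(path == 1 & upper.tri(path), arr.ind = TRUE)
edges
```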

Details

Hubs-based networks are graphs in which only a few nodes have a much higher degree (or connectivity) than the rest.

Power-law networks assume that $p_k$, the fraction of nodes in the network that have degree $k$, follows a power-law distribution $$ p_k = \frac{k^{-\alpha}}{\zeta(\alpha)}, $$ for $k \geq 1$ and a constant $\alpha>0$, where the normalizing function $\zeta(\alpha)$ is the Riemann zeta function.

Finally, random networks are also defined by the distribution of the proportion $p_k$. In this case, $p_k$ follows a binomial distribution $$ p_k = {p\choose k} \theta^k (1-\theta)^{p-k}, $$ where the parameter $\theta$ determines the proportion of edges (or sparsity) in the graph.
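The two degree distributions above can be evaluated numerically in base R. This is only a sketch: base R has no Riemann zeta function, so the power-law probabilities are normalized over a finite range of degrees 1 to `kmax` (an assumption made here for illustration), and `p` is taken to be the number of nodes in the binomial formula. The variable names `kmax`, `pPower` and `pBinom` are hypothetical.

```r
# Power-law degree distribution, truncated at kmax (assumption: a finite
# sum over 1..kmax replaces the Riemann zeta normalizing constant).
alpha <- 2.3
kmax <- 50
k <- 1:kmax
pPower <- k^(-alpha) / sum(k^(-alpha))

# Binomial degree distribution for a random network, with p taken as the
# number of nodes and theta as the edge probability (assumed values).
p <- 50
theta <- 0.05
pBinom <- dbinom(0:p, size = p, prob = theta)

# Both vectors are probability distributions over node degrees.
sum(pPower)
sum(pBinom)

# Compare tail weight: probability of a degree of at least 10.
sum(pPower[k >= 10])
sum(pBinom[(0:p) >= 10])
```

The power-law distribution puts far more mass on large degrees than the binomial one, which is why power-law graphs contain a few highly connected nodes.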

The regularization is given by $\Omega^{(1)} = \Omega^{(0)} + \delta I$, with $\delta$ such that the condition number of $\Omega^{(1)}$ is less than the number of nodes.
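One way to choose such a $\delta$ can be sketched in base R. For a symmetric matrix the condition number of $\Omega^{(0)} + \delta I$ is $(\lambda_{\max} + \delta)/(\lambda_{\min} + \delta)$, so any $\delta > (\lambda_{\max} - p\,\lambda_{\min})/(p - 1)$ keeps it below the number of nodes $p$. The closed form, the 1% safety margin and the toy matrix `omega0` are assumptions for illustration; the package may determine $\delta$ differently.

```r
# Toy initial matrix (hypothetical): symmetric but not positive definite.
omega0 <- matrix(c(1.0,  0.9,  0.0, 0.0,
                   0.9,  1.0, -0.9, 0.0,
                   0.0, -0.9,  1.0, 0.9,
                   0.0,  0.0,  0.9, 1.0), nrow = 4, byrow = TRUE)
p <- ncol(omega0)

ev <- eigen(omega0, symmetric = TRUE, only.values = TRUE)$values
lmax <- max(ev)
lmin <- min(ev)

# Smallest shift for which (lmax + delta) / (lmin + delta) equals p,
# inflated by 1% so that the inequality is strict.
bound <- (lmax - p * lmin) / (p - 1)
delta <- max(0, bound * 1.01)

omega1 <- omega0 + delta * diag(p)
ev1 <- eigen(omega1, symmetric = TRUE, only.values = TRUE)$values
condNumber <- max(ev1) / min(ev1)
condNumber < p
```

The same shift also makes the matrix positive definite, since the smallest eigenvalue of `omega1` is strictly positive whenever the bound is met.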

References

Cai, T., W. Liu, and X. Luo (2011). A Constrained L1 Minimization Approach to Sparse Precision Matrix Estimation. Journal of the American Statistical Association 106(494), 594-607.

Newman, M. (2003). The structure and function of complex networks. SIAM Review 45, 167-256.

Caballe, A., N. Bochkina, and C. Mayer (2016). Selection of the Regularization Parameter in Graphical Models using Network Characteristics. eprint arXiv:1509.05326, 1-25.

Examples

# Examples of using the pcorSimulator function

# Three clusters of different sizes with a power-law structure
EX1 <- pcorSimulator(nobs = 50, nclusters = 3, nnodesxcluster = c(100, 30, 50),
                     pattern = "powerLaw", plus = 0)
print(EX1)

# Two clusters with added power-law complexity (plus = 1)
EX2 <- pcorSimulator(nobs = 25, nclusters = 2, nnodesxcluster = c(60, 40),
                     pattern = "powerLaw", plus = 1)
print(EX2)
