pcorSimulator creates a block-diagonal, positive definite precision matrix with one of three possible
graph structures: hubs-based, power-law or random. It then generates samples from a multivariate normal
distribution whose covariance matrix is the inverse of that precision matrix.
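The sampling step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the 3 x 3 matrix `omega` is a made-up example, and the Cholesky-based sampler is one standard way to draw from a multivariate normal distribution.

```r
set.seed(1)

# Hypothetical precision matrix (assumption for illustration only).
# It is symmetric and strictly diagonally dominant, hence positive definite.
omega <- matrix(c( 2.0, -0.8, 0.0,
                  -0.8,  2.0, 0.6,
                   0.0,  0.6, 2.0), nrow = 3, byrow = TRUE)

# The covariance matrix is the inverse of the precision matrix.
sigma <- solve(omega)

# Draw n observations from N(mu, sigma) using the Cholesky factor of sigma.
n  <- 200
mu <- rep(0, 3)
R  <- chol(sigma)                      # upper triangular, t(R) %*% R equals sigma
z  <- matrix(rnorm(n * 3), nrow = n)   # independent standard normal draws
y  <- sweep(z %*% R, 2, mu, "+")       # each row of y is one observation

# Partial correlations implied by omega: the negated, rescaled off-diagonal entries.
pcor <- -cov2cor(omega)
diag(pcor) <- 1
```

A zero entry in `omega` (here between variables 1 and 3) corresponds to a zero partial correlation, that is, a missing edge in the graph.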
pcorSimulator(nobs, nclusters, nnodesxcluster, pattern = "powerLaw",
low.strength = 0.5, sup.strength = 0.9, nhubs = 5,
degree.hubs = 20, nOtherEdges = 30, alpha = 2.3, plus = 0,
prob = 0.05, perturb.clust = 0, mu = 0,
probSign = 0.5, seed = sample(10000, nclusters))
nobs: number of observations.
nclusters: number of clusters or blocks of variables.
nnodesxcluster: number of nodes/variables per cluster.
pattern: graph structure pattern: a name that uniquely identifies one of "hubs", "powerLaw" or "random".
low.strength: minimum magnitude for nonzero partial correlation elements before regularization.
sup.strength: maximum magnitude for nonzero partial correlation elements before regularization.
nhubs: number of hubs per cluster (if pattern = "hubs").
degree.hubs: degree of hubs (if pattern = "hubs").
nOtherEdges: number of edges for non-hub nodes (if pattern = "hubs").
alpha: positive coefficient for the Riemann zeta function in power-law distributions.
plus: power-law distribution added complexity (zero by default).
prob: probability of edge presence for random networks (if pattern = "random").
perturb.clust: proportion of the total number of edges that connect two different clusters.
mu: vector of expected values used to generate the data (zero by default).
probSign: probability of a positive sign for non-zero partial correlation coefficients. Negative signs
are therefore obtained with probability 1-probSign.
seed: vector with one seed for each cluster.
An object of class pcorSim containing the following components:
generated data set.
hub nodes position.
edges given by the non-zero elements in the precision matrix.
precision matrix used to generate the data.
covariance matrix used to generate the data.
adjacency matrix corresponding to the non-zero structure of omega.
Hubs-based networks are graphs in which only a few nodes have a much higher degree (or connectivity) than the rest.
Power-law networks assume that \(p_k\), the fraction of nodes in the network that have degree \(k\), follows a power-law distribution $$ p_k = \frac{k^{-\alpha}}{\varsigma(\alpha)}, $$ for \(k \geq 1\) and a constant \(\alpha>0\), where the normalizing function \(\varsigma(\alpha)\) is the Riemann zeta function.
Finally, random networks are also defined through the distribution of \(p_k\). In this case, \(p_k\) follows a binomial distribution $$ p_k = {p\choose k} \theta^k (1-\theta)^{p-k}, $$ where \(p\) is the number of nodes and the parameter \(\theta\) determines the proportion of edges (or sparsity) in the graph.
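The two degree distributions above can be evaluated numerically in base R. This is a minimal sketch under stated assumptions: base R has no zeta function, so the normalizing constant is approximated by a truncated sum, and the values of `p` and `theta` are arbitrary illustrative choices.

```r
# Power-law degree distribution: p_k = k^(-alpha) / zeta(alpha), for k >= 1.
alpha <- 2.3                                   # default value of the alpha argument
zeta_approx <- sum((1:1e6)^(-alpha))           # truncated series (approximation)
k_pl <- 1:1000
p_k_powerlaw <- k_pl^(-alpha) / zeta_approx

# Binomial degree distribution for random networks:
# p_k = choose(p, k) * theta^k * (1 - theta)^(p - k), for k = 0, ..., p.
p     <- 50                                    # illustrative number of nodes
theta <- 0.05                                  # illustrative edge proportion
k_bin <- 0:p
p_k_binomial <- dbinom(k_bin, size = p, prob = theta)

# Expected degree under the binomial model equals p * theta.
mean_degree <- sum(k_bin * p_k_binomial)
```

Under the power-law model most of the mass sits at degree 1 while a long tail allows a few highly connected nodes; under the binomial model degrees concentrate around `p * theta`.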
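The regularization is given by \(\Omega^{(1)} = \Omega^{(0)} + \delta I\), where \(\Omega^{(0)}\) is the precision matrix before regularization, \(I\) is the identity matrix, and \(\delta\) is chosen such that the condition number of \(\Omega^{(1)}\) is less than the number of nodes.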
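One way to find a suitable \(\delta\) is sketched below. How the package actually selects \(\delta\) is not stated here, so this is an assumption: for a symmetric matrix with extreme eigenvalues \(\lambda_{\max}\) and \(\lambda_{\min}\), the condition \((\lambda_{\max}+\delta)/(\lambda_{\min}+\delta) < p\) holds whenever \(\delta > (\lambda_{\max} - p\,\lambda_{\min})/(p-1)\). The helper name `regularize_precision` and the example matrix are hypothetical.

```r
# Hypothetical helper: shift the diagonal just enough that the condition
# number of the result is below the number of nodes p.
regularize_precision <- function(omega0, eps = 1e-6) {
  p    <- nrow(omega0)
  ev   <- eigen(omega0, symmetric = TRUE, only.values = TRUE)$values
  lmax <- max(ev)
  lmin <- min(ev)
  # Smallest shift at which (lmax + delta) / (lmin + delta) equals p.
  delta_min <- (lmax - p * lmin) / (p - 1)
  # No shift is needed if the bound is already satisfied; otherwise go
  # slightly past the threshold so the inequality is strict.
  delta <- if (delta_min < 0) 0 else delta_min + eps
  list(omega1 = omega0 + delta * diag(p), delta = delta)
}

# Illustrative 4-node chain with strong off-diagonal entries; it is
# indefinite before regularization.
omega0 <- diag(4)
omega0[cbind(1:3, 2:4)] <- 0.9
omega0[cbind(2:4, 1:3)] <- 0.9

res <- regularize_precision(omega0)
ev1 <- eigen(res$omega1, symmetric = TRUE, only.values = TRUE)$values
cond_number <- max(ev1) / min(ev1)
```

Because the shift moves every eigenvalue by the same amount, the resulting matrix is positive definite as well as well-conditioned, and the zero pattern off the diagonal is unchanged.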
Cai, T., W. Liu, and X. Luo (2011). A Constrained L1 Minimization Approach to Sparse Precision Matrix Estimation. Journal of the American Statistical Association 106(494), 594-607.
Newman, M. (2003). The structure and function of complex networks. SIAM Review 45, 167-256.
Caballe, A., N. Bochkina, and C. Mayer (2016). Selection of the Regularization Parameter in Graphical Models using Network Characteristics. eprint arXiv:1509.05326, 1-25.
plot.pcorSim for graphical representation of the generated partial correlation matrix.
pcorSimulatorJoint for joint partial correlation matrix generation.
# NOT RUN {
# Power-law structure with three clusters of 100, 30 and 50 nodes
EX1 <- pcorSimulator(nobs = 50, nclusters = 3, nnodesxcluster = c(100, 30, 50),
                     pattern = "powerLaw", plus = 0)
print(EX1)

# Power-law structure with added complexity (plus = 1) and two clusters
EX2 <- pcorSimulator(nobs = 25, nclusters = 2, nnodesxcluster = c(60, 40),
                     pattern = "powerLaw", plus = 1)
print(EX2)
# }