pcorSimulator
Creates a block-diagonal, positive definite precision matrix with one of three possible
graph structures: hub-based, power-law or random. It then generates samples from a multivariate normal
distribution whose covariance matrix is the inverse of that precision matrix.
pcorSimulator(nobs, nclusters, nnodesxcluster, pattern = "powerLaw",
low.strength = 0.5, sup.strength = 0.9, nhubs = 5,
degree.hubs = 20, nOtherEdges = 30, alpha = 2.3, plus = 0,
prob = 0.05, perturb.clust = 0, mu = 0,
probSign = 0.5, seed = sample(10000, nclusters))
nobs: number of observations.
nclusters: number of clusters or blocks of variables.
nnodesxcluster: number of nodes/variables per cluster.
pattern: graph structure pattern: a name that uniquely identifies "hubs", "powerLaw" or "random".
low.strength: minimum magnitude for nonzero partial correlation elements before regularization.
sup.strength: maximum magnitude for nonzero partial correlation elements before regularization.
nhubs: number of hubs per cluster (if pattern = "hubs").
degree.hubs: degree of hubs (if pattern = "hubs").
nOtherEdges: number of edges for non-hub nodes (if pattern = "hubs").
alpha: positive coefficient for the Riemann zeta function in power-law distributions.
plus: added complexity for the power-law distribution (zero by default).
prob: probability of edge presence for random networks (if pattern = "random").
perturb.clust: proportion of the total number of edges that connect two different clusters.
mu: vector of expected values used to generate the data (zero by default).
probSign: probability of a positive sign for non-zero partial correlation coefficients. Negative signs are therefore obtained with probability 1-probSign.
seed: vector with seeds for each cluster.
An object of class pcorSim containing the following components:
generated data set.
hub nodes position.
edges given by the non-zero elements in the precision matrix.
precision matrix used to generate the data.
covariance matrix used to generate the data.
adjacency matrix corresponding to the non-zero structure of omega.
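How the precision matrix, covariance matrix, adjacency matrix and generated data relate to each other can be sketched in base R. This is an illustration only, not the package's code: the 3 x 3 matrix `omega` and the names `sigma`, `adj`, `pcor` and `y` are hypothetical, and the sampling step uses a Cholesky factor, which may differ from what pcorSimulator does internally.

```r
# Hypothetical 3 x 3 precision matrix (tridiagonal, positive definite)
omega <- matrix(c( 2.0, -0.8,  0.0,
                  -0.8,  2.0,  0.6,
                   0.0,  0.6,  2.0), nrow = 3, byrow = TRUE)

# The covariance matrix is the inverse of the precision matrix
sigma <- solve(omega)

# Adjacency matrix: non-zero off-diagonal entries of omega
adj <- (omega != 0) * 1
diag(adj) <- 0

# Partial correlations: rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)
pcor <- -cov2cor(omega)
diag(pcor) <- 1

# Draw nobs samples from N(mu, sigma) using the Cholesky factor of sigma
set.seed(1)
nobs <- 50
mu <- rep(0, 3)
z <- matrix(rnorm(nobs * 3), nrow = nobs)
y <- sweep(z %*% chol(sigma), 2, mu, "+")
```

A zero entry in `omega` (here between variables 1 and 3) gives a zero partial correlation and no edge in `adj`, even though the corresponding entry of `sigma` is generally non-zero.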
Hub-based networks are graphs in which only a few nodes have a much higher degree (or connectivity) than the rest.
Power-law networks assume that \(p_k\), the fraction of nodes in the network with degree \(k\), follows a power-law distribution $$ p_k = \frac{k^{-\alpha}}{\varsigma(\alpha)}, $$ for \(k \geq 1\), a constant \(\alpha>0\), and the normalizing function \(\varsigma(\alpha)\), which is the Riemann zeta function.
Random networks are also defined through the distribution of the proportion \(p_k\). In this case, \(p_k\) follows a binomial distribution $$ p_k = {p\choose k} \theta^k (1-\theta)^{p-k}, $$ where the parameter \(\theta\) determines the proportion of edges (or sparsity) in the graph.
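The two degree distributions can be evaluated numerically with a short base R sketch. The variable names are hypothetical, and because base R has no zeta function the normalizing constant is approximated by a truncated sum, which is an assumption of this example and not necessarily how the package computes it.

```r
# Power-law degree distribution: p_k = k^(-alpha) / zeta(alpha).
# zeta(alpha) is approximated by a truncated sum (assumption of this sketch).
alpha <- 2.3
k <- 1:20
zeta_approx <- sum((1:1e6)^(-alpha))
p_powerlaw <- k^(-alpha) / zeta_approx

# Binomial degree distribution: p_k = choose(p, k) * theta^k * (1 - theta)^(p - k),
# evaluated for k = 0, ..., p with illustrative values of p and theta
p <- 20
theta <- 0.05
p_random <- dbinom(0:p, size = p, prob = theta)
```

Comparing `p_powerlaw` with `p_random` shows the heavier tail of the power-law case: high degrees keep a non-negligible probability, whereas the binomial probabilities decay much faster away from the mean degree.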
The regularization is given by \(\Omega^{(1)} = \Omega^{(0)} + \delta I\), with \(\delta\) such that the condition number of \(\Omega^{(1)}\) is less than the number of nodes.
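One way to choose \(\delta\) follows from the eigenvalues: adding \(\delta I\) shifts every eigenvalue by \(\delta\), so the condition number becomes \((\lambda_{max}+\delta)/(\lambda_{min}+\delta)\), and requiring it to be below the number of nodes \(p\) gives \(\delta > (\lambda_{max} - p\lambda_{min})/(p-1)\). The helper below is a hypothetical base R sketch of that idea (the function name, the `margin` argument and the example matrix are assumptions); the package may select \(\delta\) differently.

```r
# Hypothetical helper: shift the diagonal of a symmetric matrix so that its
# condition number falls below the number of nodes p (requires p > 1).
regularize_precision <- function(omega0, margin = 1e-6) {
  p <- nrow(omega0)
  lambda <- eigen(omega0, symmetric = TRUE, only.values = TRUE)$values
  lmax <- max(lambda)
  lmin <- min(lambda)
  # (lmax + delta) / (lmin + delta) < p  <=>  delta > (lmax - p * lmin) / (p - 1)
  delta <- max(0, (lmax - p * lmin) / (p - 1) + margin)
  list(omega1 = omega0 + delta * diag(p), delta = delta)
}

# Example: a symmetric matrix that is not positive definite before the shift
omega0 <- matrix(c( 1.0, 0.9, -0.9,
                    0.9, 1.0,  0.9,
                   -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)
reg <- regularize_precision(omega0)
kappa(reg$omega1, exact = TRUE)  # 2-norm condition number, below 3 here
```

The shift also makes the matrix positive definite, since the smallest eigenvalue after the shift is strictly positive whenever the bound on the condition number holds.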
Cai, T., W. Liu, and X. Luo (2011). A Constrained L1 Minimization Approach to Sparse Precision Matrix Estimation. Journal of the American Statistical Association 106(494), 594-607.
Newman, M. (2003). The structure and function of complex networks. SIAM Review 45, 167-256.
Caballe, A., N. Bochkina, and C. Mayer (2016). Selection of the Regularization Parameter in Graphical Models using Network Characteristics. eprint arXiv:1509.05326, 1-25.
plot.pcorSim for graphical representation of the generated partial correlation matrix.
pcorSimulatorJoint for joint partial correlation matrix generation.
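# NOT RUN {
# Load the package that provides pcorSimulator
library(ldstatsHD)

# Three clusters of different sizes with a power-law structure
EX1 <- pcorSimulator(nobs = 50, nclusters = 3, nnodesxcluster = c(100, 30, 50),
                     pattern = "powerLaw", plus = 0)
print(EX1)

# Two clusters with added power-law complexity (plus = 1)
EX2 <- pcorSimulator(nobs = 25, nclusters = 2, nnodesxcluster = c(60, 40),
                     pattern = "powerLaw", plus = 1)
print(EX2)
# }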