Simulate data from a p-dimensional (zero-mean) Gaussian graphical model (GGM) with a specified (or random) topology and return the sample covariance matrix or matrices. Can also return the original simulated data or the underlying precision matrix.
createS(n, p,
topology = "identity", # See details for other choices
dataset = FALSE, precision = FALSE,
nonzero = 0.25, m = 1L, banded.n = 2L,
invwishart = FALSE, nu = p + 1, Plist)
n: A numeric vector giving the number of samples. If the length is larger than 1, the covariance matrices are returned as a list.
p: A numeric of length 1 giving the dimension of the samples/covariance.
topology: A character giving the topology to use for the simulations. See the details.
dataset: A logical value specifying whether the simulated data itself (TRUE) or the sample covariance (FALSE) should be returned.
precision: A logical value. If TRUE, the constructed precision matrix is returned.
nonzero: A numeric of length 1 giving the value of the nonzero entries used in some topologies.
m: An integer giving the number of blocks (i.e. conditionally independent components) to create. If m is greater than 1, then the given topology is used on m blocks of approximately equal size.
banded.n: An integer of length 1 giving the number of bands. Only used if topology is one of "banded", "small-world", or "Watts-Strogatz".
invwishart: A logical. If TRUE, the constructed precision matrix is used as the scale matrix of an inverse Wishart distribution and class covariance matrices are drawn from this distribution.
nu: A numeric greater than p + 1 giving the degrees of freedom in the inverse Wishart distribution. A large nu implies high class homogeneity. A small nu near p + 1 implies high class heterogeneity.
Plist: An optional list of numeric matrices giving the precision matrices to simulate from. Useful when random matrices have already been generated by setting precision = TRUE.
The returned type depends on n and dataset.
The function generally returns a list of numeric matrices with the same length as n.
If dataset is TRUE, the simulated datasets of size n[i] by p are given in the i-th entry of the output.
If dataset is FALSE, the p by p sample covariances of the datasets are given.
When n has length 1, the list structure is dropped and the matrix is returned.
The data is simulated from a zero-mean p-dimensional multivariate Gaussian distribution with a precision matrix determined by the argument topology, which defines the GGM.
If precision is TRUE, the population precision matrix is returned.
This is useful for inspecting the precision matrices that would actually be used.
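The simulation step described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `simulate_ggm` is a hypothetical helper, and the zero-mean estimate `crossprod(X) / n` is an assumption about how a sample covariance might be formed.

```r
# Hypothetical helper: draw n observations from N(0, solve(P)).
simulate_ggm <- function(n, P) {
  Sigma <- solve(P)                      # covariance is the inverse precision
  R <- chol(Sigma)                       # upper triangular, t(R) %*% R == Sigma
  Z <- matrix(rnorm(n * ncol(P)), nrow = n)
  Z %*% R                                # each row is one N(0, Sigma) draw
}

set.seed(1)
P <- diag(3)
P[cbind(1:2, 2:3)] <- P[cbind(2:3, 1:2)] <- 0.25   # small chain-like precision
X <- simulate_ggm(5000, P)

# Assumed estimator: zero-mean covariance, dividing by n.
S <- crossprod(X) / nrow(X)
round(solve(S), 2)                       # should be close to P for large n
```

Inverting the sample covariance recovers the precision pattern only approximately, which is why the examples for this function use large n before calling `solve()`.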
The available values of topology
are described below.
Unless otherwise stated the diagonal entries are always one.
If m
is 2 or greater, block diagonal precision matrices are constructed and used.
"identity"
: uses the identity matrix (diag(p)
) as precision matrix.
Corresponds to no conditional dependencies.
"star"
: simulate from a star topology. Within each block the first
node is selected as the "hub". The off-diagonal entries
"clique"
: simulate from clique topology where each block is a complete
graph with off-diagonal elements equal to nonzero
.
"complete"
: alias for (and identical to) "clique"
.
"chain"
: simulate from a chain topology where the precision matrix
is a tridiagonal matrix with off-diagonal elements (in each block) given
by argument nonzero
.
"banded"
: precision elements (i,j)
are given by
banded.n
and zero otherwise.
"scale-free"
: The non-zero pattern of each block is generated by a
Barabási-Albert random graph. Non-zero off-diagonal values are given by nonzero
.
Gives a very "hubby" network.
"Barabassi"
: alias for "scale-free"
.
"small-world"
: The non-zero pattern of each block is generated by a
1-dimensional Watts-Strogatz random graph with banded.n
starting neighbors. Non-zero off-diagonal values are given by nonzero
. Gives a very "bandy" network.
"Watts-Strogatz"
: alias for "small-world".
"random-graph"
: The non-zero pattern of each block is generated by an
Erdős-Rényi random graph where each edge is present with probability nonzero
.
"Erdos-Renyi"
: alias for "random-graph".
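Three of the deterministic patterns in the list above can be sketched in base R. The helpers `chain_precision`, `clique_precision`, and `block_precision` are hypothetical names for illustration only. How the package splits p into m blocks "of approximately equal size" is not stated here, so the contiguous split below is an assumption.

```r
# Tridiagonal precision: ones on the diagonal, `nonzero` on the first
# sub- and superdiagonal.
chain_precision <- function(p, nonzero = 0.25) {
  P <- diag(p)
  if (p > 1) {
    idx <- cbind(seq_len(p - 1), seq_len(p - 1) + 1)
    P[idx] <- nonzero                        # superdiagonal
    P[idx[, 2:1, drop = FALSE]] <- nonzero   # subdiagonal
  }
  P
}

# Complete graph: every off-diagonal entry equals `nonzero`.
clique_precision <- function(p, nonzero = 0.25) {
  P <- matrix(nonzero, p, p)
  diag(P) <- 1
  P
}

# Assumption: blocks are contiguous index groups of near-equal size.
block_precision <- function(p, m, block_fun, ...) {
  groups <- split(seq_len(p), sort(rep(seq_len(m), length.out = p)))
  P <- matrix(0, p, p)
  for (g in groups) P[g, g] <- block_fun(length(g), ...)
  P
}

chain_precision(4)
block_precision(6, m = 2, clique_precision, nonzero = 0.25)
```

Entries between different blocks are zero, which corresponds to the blocks being conditionally independent components.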
When n
has length greater than 1, the datasets are generated i.i.d. given the topology and number of blocks.
Arguments invwishart
and nu
allow for introducing class homogeneity.
Large values of nu
imply high class homogeneity.
nu
must be greater than p + 1
.
More precisely, if invwishart == TRUE
then the constructed precision matrix is used as the scale parameter in an inverse Wishart distribution with nu
degrees of freedom.
Each class covariance is drawn independently from this inverse Wishart distribution.
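The effect of nu on class homogeneity can be illustrated with a base R sketch built on `stats::rWishart`. This is not the package's code: `rinvwishart_sketch` and `class_gap` are hypothetical helpers, the identity scale matrix is chosen only for simplicity, and the rescaling by `nu - p - 1` (so that every draw has the same mean) is an assumption made to keep the two settings comparable.

```r
# If W ~ Wishart(nu, solve(Psi)) then solve(W) ~ inverse Wishart(nu, Psi).
# The mean, Psi / (nu - p - 1), exists only when nu > p + 1.
rinvwishart_sketch <- function(nu, Psi) {
  W <- rWishart(1, df = nu, Sigma = solve(Psi))[, , 1]
  solve(W)
}

set.seed(42)
p <- 5
Psi <- diag(p)   # illustrative scale matrix

# Frobenius distance between two independently drawn class covariances,
# each rescaled so that its expectation is Psi.
class_gap <- function(nu) {
  A <- (nu - p - 1) * rinvwishart_sketch(nu, Psi)
  B <- (nu - p - 1) * rinvwishart_sketch(nu, Psi)
  norm(A - B, type = "F")
}

class_gap(nu = p + 2)   # small nu: class covariances differ substantially
class_gap(nu = 1e6)     # large nu: class covariances are nearly identical
```

The gap shrinks as nu grows because the inverse Wishart draws concentrate around their mean, which matches the statement that large nu implies high class homogeneity.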
# NOT RUN {
## Generate some simple sample covariance matrices
createS(n = 10, p = 3)
createS(n = c(3, 4, 5), p = 3)
createS(n = c(32, 55), p = 7)
## Generate some datasets and not sample covariance matrices
createS(c(3, 4), p = 6, dataset = TRUE)
## Generate sample covariance matrices from other topologies:
A <- createS(2000, p = 4, topology = "star")
round(solve(A), 3)
B <- createS(2000, p = 4, topology = "banded", banded.n = 2)
round(solve(B), 3)
C <- createS(2000, p = 4, topology = "clique") # The complete graph (as m = 1)
round(solve(C), 3)
D <- createS(2000, p = 4, topology = "chain")
round(solve(D), 3)
## Generate sample covariance matrices from block topologies:
C3 <- createS(2000, p = 10, topology = "clique", m = 3)
round(solve(C3), 1)
C5 <- createS(2000, p = 10, topology = "clique", m = 5)
round(solve(C5), 1)
## Can also return the precision matrix to see what happens
## m = 2 blocks, each "banded" with 4 off-diagonal bands
round(createS(1, 12, "banded", m = 2, banded.n = 4, precision = TRUE), 2)
## Simulation using graph-games
round(createS(1, 10, "small-world", precision = TRUE), 2)
round(createS(1, 5, "scale-free", precision = TRUE), 2)
round(createS(1, 5, "random-graph", precision = TRUE), 2)
## Simulation using inverse Wishart distributed class covariance
## Low class homogeneity
createS(n = c(10,10), p = 5, "banded", invwishart = TRUE, nu = 10)
## Extremely high class homogeneity
createS(n = c(10,10), p = 5, "banded", invwishart = TRUE, nu = 1e10)
# The precision argument can again be used to see the actual realised class
# precision matrices used when invwishart = TRUE.
# The Plist argument is used to reuse old precision matrices or
# user-generated ones
P <- createS(n = 1, p = 5, "banded", precision = TRUE)
lapply(createS(n = c(1e5, 1e5), p = 5, Plist = list(P, P+1)), solve)
# }