piv_sim: Generate Data from a Gaussian Nested Mixture

Description

Simulates $N$ observations from a nested Gaussian mixture model with $k$ pre-specified components and uniform group probabilities $1/k$, where each group is itself a mixture of two subgroups.

Usage

piv_sim(
  N,
  k,
  Mu,
  stdev,
  Sigma.p1 = diag(2),
  Sigma.p2 = 100 * diag(2),
  W = c(0.5, 0.5)
)

Value

y: The $N$ simulated observations.
true.group: A vector of integers from $1:k$ indicating the values of the latent variables $Z_i$.
subgroups: A $k \times N$ matrix where each row contains the subgroup indices for the observations in the corresponding group.

Arguments

N: The desired sample size.
k: The desired number of mixture components.
Mu: The input mean vector of length $k$ for univariate Gaussian mixtures; the input $k \times D$ matrix with the means' coordinates for multivariate Gaussian mixtures.
stdev: For univariate mixtures, the $k \times 2$ matrix of input standard deviations, where the first column contains the parameters for subgroup 1, and the second column contains the parameters for subgroup 2.
Sigma.p1: The $D \times D$ covariance matrix for the first subgroup. For multivariate mixtures only.
Sigma.p2: The $D \times D$ covariance matrix for the second subgroup. For multivariate mixtures only.
W: The vector for the mixture weights of the two subgroups.

Details

The function simulates values from a double (nested) univariate Gaussian mixture:

$$ (Y_i|Z_i=j) \sim \sum_{s=1}^{2} p_{js}\, \mathcal{N}(\mu_{j}, \sigma^{2}_{js}), $$

or from a multivariate nested Gaussian mixture:

$$ (Y_i|Z_i=j) \sim \sum_{s=1}^{2} p_{js}\, \mathcal{N}_{D}(\bm{\mu}_{j}, \Sigma_{s}), $$

where $\sigma^{2}_{js}$ is the variance for group $j$ and subgroup $s$ (the argument stdev specifies the $k \times 2$ matrix of standard deviations for univariate mixtures); $\Sigma_s$ is the covariance matrix for subgroup $s$, $s=1,2$, with the two matrices specified by Sigma.p1 and Sigma.p2 respectively; $\mu_j$ and $\bm{\mu}_j$, $j=1,\ldots,k$, are the group means, given by the input vector or matrix Mu respectively; and W is a vector of length 2 containing the subgroup weights.
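The univariate model above can be read as a two-stage sampling scheme: draw a group, draw a subgroup, then draw from the matching Gaussian. The sketch below uses base R only. It is not the package's implementation: `sim_nested_univ` is a hypothetical helper, and it assumes that the weights $p_{js}$ equal `W[s]` for every group $j$, which is what a length-2 W suggests.

```r
# Hypothetical helper for illustration only; NOT the package's own code.
# Assumption: the subgroup weights p_js are the same for every group j (W[s]).
sim_nested_univ <- function(N, k, Mu, stdev, W = c(0.5, 0.5)) {
  stopifnot(length(Mu) == k, all(dim(stdev) == c(k, 2)), length(W) == 2)
  # Latent group labels Z_i, uniform over 1, ..., k
  z <- sample.int(k, N, replace = TRUE)
  # Subgroup labels s in {1, 2}, drawn with probabilities W
  s <- sample.int(2, N, replace = TRUE, prob = W)
  # Y_i | Z_i = j, subgroup s ~ N(mu_j, sigma_js^2)
  y <- rnorm(N, mean = Mu[z], sd = stdev[cbind(z, s)])
  list(y = y, true.group = z, subgroup = s)
}

set.seed(1)
out <- sim_nested_univ(N = 500, k = 3, Mu = c(-10, 0, 10),
                       stdev = cbind(rep(1, 3), rep(5, 3)),
                       W = c(0.2, 0.8))
table(out$true.group)  # group sizes, roughly N / k each
```

Indexing `stdev` with the two-column matrix `cbind(z, s)` picks the standard deviation for each observation's (group, subgroup) pair in one vectorised step.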

Examples

# Bivariate mixture simulation with three components

N  <- 2000
k  <- 3
D <- 2
M1 <- c(-45,8)
M2 <- c(45,.1)
M3 <- c(100,8)
Mu <- rbind(M1,M2,M3)
Sigma.p1 <- diag(D)
Sigma.p2 <- 20*diag(D)
W   <- c(0.2,0.8)
sim <- piv_sim(N = N, k = k, Mu = Mu, Sigma.p1 = Sigma.p1,
               Sigma.p2 = Sigma.p2, W = W)
graphics::plot(sim$y, xlab="y[,1]", ylab="y[,2]")
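
For the univariate case, Mu is a vector of length $k$ and stdev is a $k \times 2$ matrix, as described under Arguments. The following is a minimal sketch with illustrative parameter values. It assumes the function is provided by the pivmet package, and calls it only if that package is installed.

```r
# Univariate mixture simulation with three components (illustrative values)
N     <- 500
k     <- 3
Mu    <- c(-10, 0, 10)                 # one mean per group
stdev <- cbind(rep(1, k), rep(5, k))   # k x 2: column s holds subgroup s
W     <- c(0.2, 0.8)                   # subgroup weights

# Assumption: piv_sim is exported by the pivmet package
if (requireNamespace("pivmet", quietly = TRUE)) {
  sim_uni <- pivmet::piv_sim(N = N, k = k, Mu = Mu, stdev = stdev, W = W)
  graphics::hist(sim_uni$y, breaks = 40, main = "", xlab = "y")
}
```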
