clusteval (version 0.1)

sim_normal: Generates random variates from multivariate normal populations with intraclass covariance matrices.

Description

We generate $n_m$ observations from the $m$th of $M$ multivariate normal distributions $(m = 1, \ldots, M)$. The population means all lie at the same Euclidean distance from the origin, and that distance is scaled by $\Delta \ge 0$.

Usage

sim_normal(n = rep(25, 5), p = 50, rho = rep(0.9, 5),
    delta = 0, sigma2 = 1, seed = NULL)

Arguments

n: vector of length $M$ giving the sample size $n_m$ for each population
p: dimension $p$ of the multivariate normal populations; must be divisible by $M$
rho: vector of length $M$ giving the intraclass constant $\rho_m$ for each population
delta: the scaling constant $\Delta \ge 0$ applied to each population mean
sigma2: the variance coefficient $\sigma^2$ of each intraclass covariance matrix
seed: seed for random number generation (optional)

Value

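A named list containing:

x: a matrix whose rows are the generated observations and whose columns are the $p$ features
y: a vector giving the population from which each row of x was generated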

Details

Let $\Pi_m$ denote the $m$th population with a $p$-dimensional multivariate normal distribution, $N_p(\mu_m, \Sigma_m)$, with mean vector $\mu_m$ and covariance matrix $\Sigma_m$. Also, let $e_m$ be the $m$th standard basis vector (i.e., the $m$th element is 1 and the remaining elements are 0). Then, we define $$\mu_m = \Delta \sum_{j=1}^{p/M} e_{(p/M)(m-1) + j}.$$ Note that $p$ must be divisible by $M$. With the defaults ($p = 50$, $M = 5$), the first 10 dimensions of $\mu_1$ are set to $\Delta$ with all remaining dimensions set to 0, the second 10 dimensions of $\mu_2$ are set to $\Delta$ with all remaining dimensions set to 0, and so on.
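
The block structure of the mean vectors can be sketched in base R as follows. The helper name block_means is hypothetical and is not part of clusteval; it only illustrates the formula for $\mu_m$ given above.

```r
# Hypothetical helper (not part of clusteval): returns an M x p matrix whose
# m-th row is mu_m = delta * sum_{j = 1}^{p/M} e_{(p/M)(m - 1) + j}.
block_means <- function(M, p, delta) {
  stopifnot(p %% M == 0)            # p must be divisible by M
  block <- p / M                    # number of nonzero coordinates per mean
  mu <- matrix(0, nrow = M, ncol = p)
  for (m in seq_len(M)) {
    idx <- (block * (m - 1) + 1):(block * m)
    mu[m, idx] <- delta
  }
  mu
}

mu <- block_means(M = 5, p = 50, delta = 2)
which(mu[2, ] != 0)     # coordinates 11 through 20 are nonzero for mu_2
sqrt(rowSums(mu^2))     # each mean lies at distance delta * sqrt(p / M) from the origin
```

Because every mean has exactly $p/M$ entries equal to $\Delta$, all $M$ means are equidistant from the origin, which is the property stated in the Description.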

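Also, we consider intraclass covariance (correlation) matrices such that $\Sigma_m = \sigma^2 \{(1 - \rho_m) I_p + \rho_m J_p\}$, where $-(p-1)^{-1} < \rho_m < 1$, $I_p$ is the $p \times p$ identity matrix, and $J_p$ denotes the $p \times p$ matrix of ones. Each population therefore has variance $\sigma^2$ in every dimension and covariance $\sigma^2 \rho_m$ between every pair of dimensions.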
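
A minimal base R sketch of an intraclass covariance matrix, assuming the standard form $\sigma^2 \{(1 - \rho) I_p + \rho J_p\}$. The function name intraclass_sketch is hypothetical and is not taken from the clusteval source.

```r
# Hypothetical helper (not the clusteval implementation): builds
# sigma2 * ((1 - rho) * I_p + rho * J_p).
intraclass_sketch <- function(p, rho, sigma2 = 1) {
  # The bounds on rho keep the matrix positive definite.
  stopifnot(rho > -1 / (p - 1), rho < 1)
  sigma2 * ((1 - rho) * diag(p) + rho * matrix(1, nrow = p, ncol = p))
}

Sigma <- intraclass_sketch(p = 4, rho = 0.9)
Sigma                                        # sigma2 on the diagonal, sigma2 * rho elsewhere
ev <- eigen(Sigma, symmetric = TRUE)$values
ev                                           # all positive when rho is within the bounds
```

The eigenvalues of this form are $\sigma^2 \{1 + (p-1)\rho\}$ (once) and $\sigma^2 (1 - \rho)$ (with multiplicity $p - 1$), which is where the constraint $-(p-1)^{-1} < \rho < 1$ comes from.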

By default, we let $M = 5$, $\Delta = 0$, and $\sigma^2 = 1$. Furthermore, we generate 25 observations from each population by default.
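
To illustrate how these defaults fit together, here is a rough base R sketch of the sampling scheme. It is not the clusteval implementation: the name sim_sketch, the use of a Cholesky factor, and the choice to return y as a factor are all assumptions made for illustration, and the covariance is assumed to have the standard intraclass form $\sigma^2 \{(1 - \rho_m) I_p + \rho_m J_p\}$.

```r
# Hypothetical sketch (not sim_normal itself): draws n[m] rows from
# N_p(mu_m, Sigma_m) for each population m and stacks them.
sim_sketch <- function(n = rep(25, 5), p = 50, rho = rep(0.9, 5),
                       delta = 0, sigma2 = 1) {
  M <- length(n)
  stopifnot(length(rho) == M, p %% M == 0)
  block <- p / M
  x <- lapply(seq_len(M), function(m) {
    # Mean vector: delta in the m-th block of p / M coordinates, 0 elsewhere.
    mu <- numeric(p)
    mu[(block * (m - 1) + 1):(block * m)] <- delta
    # Intraclass covariance matrix for population m.
    Sigma <- sigma2 * ((1 - rho[m]) * diag(p) + rho[m] * matrix(1, p, p))
    # If Z has iid N(0, 1) entries and R = chol(Sigma), then rows of Z %*% R
    # have covariance t(R) %*% R = Sigma.
    z <- matrix(rnorm(n[m] * p), nrow = n[m], ncol = p)
    sweep(z %*% chol(Sigma), 2, mu, "+")
  })
  list(x = do.call(rbind, x),
       y = factor(rep(seq_len(M), times = n)))
}

set.seed(42)
sim <- sim_sketch()
dim(sim$x)      # 125 rows (5 populations x 25 observations), 50 columns
table(sim$y)    # 25 observations per population
```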

For $\Delta = 0$ and $\rho_m = \rho$, $m = 1, \ldots, M$, all $M$ populations have mean 0 and the same covariance matrix, so they are identically distributed.

Examples

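library(clusteval)

# Five populations with sample sizes 10, 20, 30, 40, 50 and the default p = 50
data_generated <- sim_normal(n = 10 * seq_len(5), seed = 42)
dim(data_generated$x)
table(data_generated$y)

# Lower-dimensional, weakly correlated populations with nonzero means
data_generated2 <- sim_normal(p = 10, delta = 2, rho = rep(0.1, 5))
table(data_generated2$y)

# Per-population sample means, collected into one row per population
sample_means <- with(data_generated2,
                     tapply(seq_along(y), y, function(i) {
                            colMeans(x[i, , drop = FALSE])
                     }))
(sample_means <- do.call(rbind, sample_means))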
