
sparsediscrim (version 0.3.0)

generate_blockdiag: Generates data from K multivariate normal data populations, where each population (class) has a covariance matrix consisting of block-diagonal autocorrelation matrices.

Description

This function generates K multivariate normal data sets, where each class is generated with a constant mean vector and a covariance matrix consisting of block-diagonal autocorrelation matrices. The data are returned as a single matrix x along with a vector of class labels y that indicates class membership.

Usage

generate_blockdiag(n, mu, num_blocks, block_size, rho, sigma2 = rep(1, K))

Arguments

n

vector of the sample sizes of each class. The length of n determines the number of classes K.

mu

matrix containing the mean vectors for each class. Expected to have p rows and K columns.

num_blocks

the number of block matrices. See details.

block_size

the dimensions of the square block matrix. See details.

rho

vector of the values of the autocorrelation parameter for each class covariance matrix. Its length must equal the length of n (i.e., K).

sigma2

vector of the variance coefficients for each class covariance matrix. Its length must equal the length of n (i.e., K). Defaults to 1 for every class.

Value

named list with elements:

  • x: matrix of observations with sum(n) rows (one row per observation across all K classes) and p columns

  • y: vector of class labels that indicates class membership for each observation (row) in x.

Details

For simplicity, we assume that a class mean vector is constant for each feature. That is, we assume that the mean vector of the \(k\)th class is \(c_k * j_p\), where \(j_p\) is a \(p \times 1\) vector of ones and \(c_k\) is a real scalar.
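As a minimal sketch of this assumption, the mu argument can be built by repeating each scalar \(c_k\) down a column of length p. The values of c_k, block_size, and num_blocks below are illustrative choices, not package defaults.

```r
# Illustrative values only; p follows the rule p = block_size * num_blocks.
block_size <- 3
num_blocks <- 3
p <- block_size * num_blocks

# One scalar c_k per class (here K = 3).
c_k <- c(1, 2, 3)

# Column k of mu is c_k * j_p, so mu has p rows and K columns.
mu <- matrix(rep(c_k, each = p), nrow = p, ncol = length(c_k))

# The same matrix written as the outer product of j_p with the scalars.
mu_check <- outer(rep(1, p), c_k)

dim(mu)
```

The resulting matrix has the shape the mu argument expects: p rows and K columns.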

The \(k\)th class covariance matrix is defined as $$\Sigma_k = \Sigma^{(\rho)} \oplus \Sigma^{(-\rho)} \oplus \ldots \oplus \Sigma^{(\rho)},$$ where \(\oplus\) denotes the direct sum and the \((i,j)\)th entry of \(\Sigma^{(\rho)}\) is $$\Sigma_{ij}^{(\rho)} = \rho^{|i - j|}.$$

The matrix \(\Sigma^{(\rho)}\) is referred to as a block. Its dimensions are given by the block_size argument, and the number of blocks is given by the num_blocks argument.

Each matrix \(\Sigma_k\) is generated by the cov_block_autocorrelation() function.
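The block structure can be sketched in base R as follows. This is not the source of cov_block_autocorrelation(); the helpers autocorr_block() and block_diag_sketch() are hypothetical names introduced for illustration. The sketch assumes that the sign of \(\rho\) alternates from block to block, as the direct-sum formula suggests, and it leaves out any scaling by sigma2.

```r
# Hypothetical helper: one autocorrelation block with entries rho^|i - j|.
autocorr_block <- function(block_size, rho) {
  idx <- seq_len(block_size)
  rho^abs(outer(idx, idx, "-"))
}

# Hypothetical helper: direct sum of num_blocks blocks.
# Assumption: odd-numbered blocks use rho and even-numbered blocks use -rho.
block_diag_sketch <- function(num_blocks, block_size, rho) {
  p <- num_blocks * block_size
  Sigma <- matrix(0, nrow = p, ncol = p)
  for (b in seq_len(num_blocks)) {
    rows <- ((b - 1) * block_size + 1):(b * block_size)
    sign_b <- if (b %% 2 == 1) 1 else -1
    Sigma[rows, rows] <- autocorr_block(block_size, sign_b * rho)
  }
  Sigma
}

Sigma <- block_diag_sketch(num_blocks = 3, block_size = 3, rho = 0.5)
dim(Sigma)
```

Entries outside the diagonal blocks are zero, so features in different blocks are uncorrelated under this construction.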

The number of classes K is determined with lazy evaluation as the length of n.

The number of features p is computed as block_size * num_blocks.
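To show how the dimensions fit together, here is a rough base-R sketch of the overall procedure: build \(\Sigma_k\) for each class, draw n[k] multivariate normal rows through a Cholesky factor, and stack the results. It is a sketch under stated assumptions, not the package's implementation: the helper ar_block() is hypothetical, the alternating sign of \(\rho\) is read off the formula above, sigma2 is ignored, and returning y as a factor is an assumption.

```r
set.seed(42)

n <- c(5, 7)                      # class sample sizes, so K = 2
num_blocks <- 2
block_size <- 3
p <- block_size * num_blocks      # number of features
rho <- c(0.3, 0.8)                # one autocorrelation parameter per class
mu <- matrix(rep(c(0, 2), each = p), nrow = p)  # p x K constant means

# Hypothetical helper: block with entries r^|i - j|.
ar_block <- function(size, r) {
  idx <- seq_len(size)
  r^abs(outer(idx, idx, "-"))
}

x_list <- lapply(seq_along(n), function(k) {
  # Assemble the block-diagonal covariance matrix for class k.
  Sigma_k <- matrix(0, nrow = p, ncol = p)
  for (b in seq_len(num_blocks)) {
    rows <- ((b - 1) * block_size + 1):(b * block_size)
    sign_b <- if (b %% 2 == 1) 1 else -1   # assumed alternating sign
    Sigma_k[rows, rows] <- ar_block(block_size, sign_b * rho[k])
  }
  # Standard normal draws times the upper Cholesky factor have covariance Sigma_k.
  z <- matrix(rnorm(n[k] * p), nrow = n[k])
  sweep(z %*% chol(Sigma_k), 2, mu[, k], "+")
})

out <- list(
  x = do.call(rbind, x_list),
  y = factor(rep(seq_along(n), times = n))
)
dim(out$x)
table(out$y)
```

The stacked matrix has sum(n) rows and p columns, and the label vector has one entry per row.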

Examples

# NOT RUN {
# Generates data from K = 3 classes.
means <- matrix(rep(1:3, each = 9), ncol = 3)
data <- generate_blockdiag(n = c(15, 15, 15), block_size = 3, num_blocks = 3,
                           rho = seq(.1, .9, length.out = 3), mu = means)
data$x
data$y

# Generates data from K = 4 classes with unequal sample sizes.
# sigma2 is not supplied here, so it keeps its default of rep(1, K).
means <- matrix(rep(1:4, each = 9), ncol = 4)
data <- generate_blockdiag(n = c(15, 15, 15, 20), block_size = 3, num_blocks = 3,
                           rho = seq(.1, .9, length.out = 4), mu = means)
data$x
data$y
# }
