VICatMix (version 1.0)

runVICatMixVarSelAvg: runVICatMixVarSelAvg

Description

An extension of `runVICatMixVarSel` to incorporate model averaging/summarisation over multiple initialisations.

Usage

runVICatMixVarSelAvg(
  data,
  K,
  alpha,
  a = 2,
  maxiter = 2000,
  tol = 5e-08,
  outcome = NA,
  inits = 25,
  loss = "VoIcomp",
  var_threshold = 0.95,
  parallel = FALSE,
  cores = getOption("mc.cores", 2L),
  verbose = FALSE
)

Value

A list with the following components: (maxNCat refers to the maximum number of categories for any covariate in the data)

labels_avg

A numeric N-vector listing the cluster assignments for the observations in the averaged model.

varsel_avg

A numeric P-vector with a variable selection indicator for the covariates in the averaged model.

init_results

A list where each entry is the cluster assignments for one of the initialisations included in the model averaging.

init_varsel_results

A list where each entry is the expected value for the variable selection parameters ('c') for one of the initialisations included in the model averaging.
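
The label vectors in `init_results` are what the model averaging summarises, and the `inits` and `loss` arguments both refer to a co-clustering matrix built from them. The base-R sketch below shows one common way to form such a matrix: entry [i, j] is the proportion of initialisations in which observations i and j share a cluster. The `init_results` list here is a small hypothetical stand-in, and `coclustering_matrix` is an illustrative helper, not a function exported by VICatMix; the package's internal computation may differ.

```r
# Hypothetical stand-in for result$init_results:
# three initialisations, each labelling N = 5 observations.
init_results <- list(
  c(1, 1, 2, 2, 3),
  c(2, 2, 1, 1, 1),
  c(1, 1, 1, 2, 2)
)

# Illustrative helper (not part of VICatMix): entry [i, j] is the proportion
# of initialisations in which observations i and j have the same label.
coclustering_matrix <- function(label_list) {
  n <- length(label_list[[1]])
  total <- matrix(0, n, n)
  for (labels in label_list) {
    # outer(..., "==") gives an N x N logical matrix of shared memberships
    total <- total + outer(labels, labels, "==")
  }
  total / length(label_list)
}

cc <- coclustering_matrix(init_results)
print(round(cc, 2))
```

Because only shared membership is counted, the matrix is unaffected by label switching between initialisations, which is why it is a convenient input for a loss-based summary clustering.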

Arguments

data

A data frame or data matrix with N rows of observations, and P columns of covariates.

K

Maximum number of clusters desired. Must be an integer greater than 1.

alpha

The Dirichlet prior parameter. Must be > 0; a value < 1 is recommended.

a

Hyperparameter for variable selection hyperprior. Default is 2.

maxiter

The maximum number of iterations for the algorithm. Default is 2000.

tol

A convergence parameter (tolerance). Default is 5e-08.

outcome

Optional outcome variable. Default is NA; having an outcome triggers semi-supervised profile regression.

inits

The number of initialisations included in the co-clustering matrix. Default is 25.

loss

The loss function to be used with the co-clustering matrix. Options are "VoIavg", "VoIcomp" and "medv". Default is "VoIcomp".

var_threshold

Threshold on the selection proportion used to determine which variables are selected under the averaged model. Must satisfy 0 < var_threshold <= 1. Default is 0.95.

parallel

Logical value indicating whether to run initialisations in parallel. Default is FALSE.

cores

Number of cores to use for parallelisation when parallel = TRUE. Default is getOption("mc.cores", 2L). If a parallel backend has already been registered, the package uses it automatically.

verbose

Logical value. Default is FALSE. Set to TRUE to output ELBO values for each iteration.
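
To make the role of `var_threshold` concrete, the base-R sketch below thresholds a selection proportion computed across initialisations. It rests on two assumptions that the documentation does not confirm: that a covariate counts as selected within one initialisation when its expected 'c' exceeds 0.5, and that the comparison with the threshold is inclusive (>=). The `init_varsel_results` list is a hypothetical stand-in for the returned component of the same name.

```r
# Hypothetical stand-in for result$init_varsel_results:
# expected values of 'c' for P = 4 covariates in each of 4 initialisations.
init_varsel_results <- list(
  c(0.99, 0.02, 0.97, 0.60),
  c(0.98, 0.01, 0.95, 0.40),
  c(1.00, 0.03, 0.99, 0.70),
  c(0.97, 0.02, 0.30, 0.55)
)

# Assumption: within one initialisation, a covariate is "selected"
# when its expected 'c' exceeds 0.5. Result is a P x inits logical matrix.
selected <- sapply(init_varsel_results, function(c_hat) c_hat > 0.5)

# Proportion of initialisations in which each covariate is selected.
selection_prop <- rowMeans(selected)

# Assumption: inclusive comparison against the threshold (default 0.95).
var_threshold <- 0.95
varsel <- as.numeric(selection_prop >= var_threshold)

print(selection_prop)
print(varsel)
```

Under these assumptions, only covariates selected in nearly every initialisation survive the default threshold; lowering `var_threshold` admits covariates that are selected less consistently.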

See Also

runVICatMixVarSel

Examples

# example code

set.seed(12)
generatedData <- generateSampleDataBin(500, 4, c(0.1, 0.2, 0.3, 0.4), 40, 10)
result <- runVICatMixVarSelAvg(generatedData$data, 10, 0.01, inits = 10)

print(result$labels_avg)
print(result$varsel_avg)


