An extension of `runVICatMixVarSel` to incorporate model averaging/summarisation over multiple initialisations.
runVICatMixVarSelAvg(
data,
K,
alpha,
a = 2,
maxiter = 2000,
tol = 5e-08,
outcome = NA,
inits = 25,
loss = "VoIcomp",
var_threshold = 0.95,
parallel = FALSE,
cores = getOption("mc.cores", 2L),
verbose = FALSE
)
Returns a list with the following components (maxNCat refers to the maximum number of categories for any covariate in the data):
labels_avg: A numeric N-vector listing the cluster assignments for the observations in the averaged model.
varsel_avg: A numeric P-vector with a variable selection indicator for the covariates in the averaged model.
A list where each entry is the cluster assignments for one of the initialisations included in the model averaging.
A list where each entry is the expected value for the variable selection parameters ('c') for one of the initialisations included in the model averaging.
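To make the averaging step more concrete, the following base R sketch shows one way per-initialisation outputs like those described above could be combined into a co-clustering matrix and a thresholded variable selection indicator. It is an illustration under stated assumptions, not the package's implementation: the objects `init_labels` and `init_c` are made-up stand-ins, the 0.5 cut-off for counting a covariate as selected within a run is an assumption, and so is the use of `>=` when comparing against `var_threshold`. The final step of deriving averaged labels from the co-clustering matrix via the chosen loss function is not shown.

```r
# Hypothetical per-initialisation outputs: 3 runs, 5 observations, 4 covariates.
init_labels <- list(
  c(1, 1, 2, 2, 3),
  c(2, 2, 1, 1, 1),
  c(1, 1, 2, 2, 2)
)
init_c <- list(
  c(0.99, 0.97, 0.10, 0.60),
  c(0.98, 0.40, 0.05, 0.70),
  c(1.00, 0.96, 0.20, 0.80)
)

# Co-clustering matrix: entry (i, j) is the proportion of runs in which
# observations i and j share a cluster. Only co-membership matters, so the
# label values themselves do not need to match across runs.
same_cluster <- lapply(init_labels, function(z) outer(z, z, "==") * 1)
coclust <- Reduce(`+`, same_cluster) / length(init_labels)

# Selection proportion per covariate. Assumed rule: a covariate counts as
# selected in a run when its expected 'c' exceeds 0.5.
sel_prop <- rowMeans(sapply(init_c, function(cv) cv > 0.5))

# Assumed rule: keep covariates whose selection proportion reaches the threshold.
var_threshold <- 0.95
varsel <- as.integer(sel_prop >= var_threshold)

print(round(coclust, 2))
print(varsel)
```

In this toy input, covariate 2 is selected in only two of three runs, so it falls below the 0.95 threshold and is dropped from the averaged indicator.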
data: A data frame or data matrix with N rows of observations and P columns of covariates.
K: Maximum number of clusters desired. Must be an integer greater than 1.
alpha: The Dirichlet prior parameter. Must be > 0; a value < 1 is recommended.
a: Hyperparameter for the variable selection hyperprior. Default is 2.
maxiter: The maximum number of iterations for the algorithm. Default is 2000.
tol: A convergence parameter. Default is 5e-08.
outcome: Optional outcome variable. Default is NA; supplying an outcome triggers semi-supervised profile regression.
inits: The number of initialisations included in the co-clustering matrix. Default is 25.
loss: The loss function to be used with the co-clustering matrix. Options are "VoIavg", "VoIcomp" and "medv". Default is "VoIcomp".
var_threshold: Threshold on the selection proportion used to determine which variables are selected under the averaged model. Must satisfy 0 < var_threshold <= 1. Default is 0.95.
parallel: Logical value indicating whether to run initialisations in parallel. Default is FALSE.
cores: Number of cores to use for parallelisation when parallel = TRUE. Default is getOption("mc.cores", 2L). The package automatically uses the user's parallel backend if one has already been registered.
verbose: Logical value. Set to TRUE to output ELBO values for each iteration. Default is FALSE.
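The default for `cores` shown in the usage is resolved through base R's `getOption()`, which returns the fallback value only when the option has not been set. A short sketch of that lookup, independent of the package itself:

```r
# Clear the option, keeping the previous setting so it can be restored.
old <- options(mc.cores = NULL)

# With the option unset, the fallback value 2L is returned.
cores_unset <- getOption("mc.cores", 2L)

# Once the option is set, it takes precedence over the fallback.
options(mc.cores = 4L)
cores_set <- getOption("mc.cores", 2L)

# Restore whatever was configured before this sketch ran.
options(old)

print(cores_unset)
print(cores_set)
```

Setting `options(mc.cores = ...)` once per session is therefore an alternative to passing `cores` on every call.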
See also: `runVICatMixVarSel`
# example code
set.seed(12)
generatedData <- generateSampleDataBin(500, 4, c(0.1, 0.2, 0.3, 0.4), 40, 10)
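# At most 10 clusters, Dirichlet prior parameter 0.01, averaged over 10 initialisations
result <- runVICatMixVarSelAvg(generatedData$data, K = 10, alpha = 0.01, inits = 10)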
print(result$labels_avg)
print(result$varsel_avg)
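Printing the full vectors can be hard to read for larger data sets. As a rough sketch, the two averaged components could be summarised with base R as below. The `result` object here is a made-up stand-in so that the snippet runs on its own, and it assumes that a value of 1 in `varsel_avg` marks a selected covariate.

```r
# Stand-in for the list returned by runVICatMixVarSelAvg (values are made up).
result <- list(
  labels_avg = c(1, 1, 2, 3, 2, 1, 3, 3),
  varsel_avg = c(1, 1, 0, 1, 0)
)

# Number of observations assigned to each cluster in the averaged model.
cluster_sizes <- table(result$labels_avg)

# Number of clusters actually used, which may be smaller than K.
n_clusters <- length(unique(result$labels_avg))

# Column indices of covariates flagged as selected (assumes 1 = selected).
selected <- which(result$varsel_avg == 1)

print(cluster_sizes)
print(n_clusters)
print(selected)
```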