ClusterR (version 1.0.1)

Optimal_Clusters_GMM: Optimal number of clusters for Gaussian mixture models

Description

Optimal number of clusters for Gaussian mixture models

Usage

Optimal_Clusters_GMM(data, max_clusters, criterion = "AIC", dist_mode = "eucl_dist", seed_mode = "random_subset", km_iter = 10, em_iter = 5, verbose = FALSE, var_floor = 1e-10, plot_data = TRUE, seed = 1)

Arguments

data
matrix or data frame
max_clusters
the maximum number of clusters
criterion
one of 'AIC' or 'BIC'
dist_mode
the distance used during the seeding of initial means and k-means clustering. One of: eucl_dist, maha_dist.
seed_mode
how the initial means are seeded prior to running the k-means and/or EM algorithms. One of: static_subset, random_subset, static_spread, random_spread.
km_iter
the number of iterations of the k-means algorithm
em_iter
the number of iterations of the EM algorithm
verbose
either TRUE or FALSE; enable or disable printing of progress during the k-means and EM algorithms
var_floor
the variance floor (smallest allowed value) for the diagonal covariances
plot_data
either TRUE or FALSE indicating whether the results of the function should be plotted
seed
integer seed value for the random number generator (RNG)

Value

A vector with the AIC or BIC value (depending on criterion) for each candidate number of clusters. If an error occurs, the function returns the error message and the possible causes.

Details

AIC : the Akaike information criterion

BIC : the Bayesian information criterion
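
For reference, the textbook definitions of the two criteria are given below. This is a sketch of the standard formulas, not a statement of how ClusterR counts parameters internally: the symbols k (number of free model parameters), n (number of observations) and \hat{L} (maximized likelihood of the fitted mixture) are notation introduced here, not taken from the package documentation.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

For both criteria, lower values indicate a better trade-off between goodness of fit and model complexity. BIC penalizes additional parameters more heavily than AIC whenever \ln n > 2, that is, for n of 8 or more.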

Examples

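library(ClusterR)

data(dietary_survey_IBS)

# drop the last column (the class label) before clustering
dat = dietary_survey_IBS[, -ncol(dietary_survey_IBS)]

# center and scale the predictors
dat = center_scale(dat)

# AIC for each candidate number of clusters, without plotting
opt_gmm = Optimal_Clusters_GMM(dat, 10, criterion = "AIC", plot_data = FALSE)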
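
One possible way to use the returned vector is to pick the number of clusters with the lowest criterion value. The sketch below assumes that the i-th element of the result corresponds to a model with i clusters; that mapping is an assumption, not something stated in the documentation above, and the name best_k is introduced here for illustration only.

```r
library(ClusterR)

data(dietary_survey_IBS)

# drop the last column and standardize the remaining columns
dat = dietary_survey_IBS[, -ncol(dietary_survey_IBS)]
dat = center_scale(dat)

# BIC instead of AIC; plotting disabled
opt_gmm = Optimal_Clusters_GMM(dat, 10, criterion = "BIC", plot_data = FALSE)

# ASSUMPTION: element i of the result belongs to the model with i clusters
best_k = which.min(as.numeric(opt_gmm))

print(best_k)
```

The minimum is only a starting point. When the curve is nearly flat, an elbow in the plotted values (plot_data = TRUE) may be a more robust guide than the exact minimum.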