MiniBatchKmeans(data, clusters, batch_size = 10, num_init = 1, max_iters = 100, init_fraction = 1, initializer = "optimal_init", early_stop_iter = 10, verbose = FALSE, CENTROIDS = NULL, tol = 1e-04, tol_optimal_init = 0.5, seed = 1)
---------------initializers----------------------
optimal_init : this initializer adds rows of the data incrementally, while checking that they do not already exist in the centroid-matrix
quantile_init : initialization of centroids by using the cumulative distance between observations and by removing potential duplicates
kmeans++ : kmeans++ initialization. References : http://theory.stanford.edu/~sergei/papers/kMeansPP-soda.pdf and http://stackoverflow.com/questions/5466323/how-exactly-does-k-means-work
random : random selection of data rows as initial centroids
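To make the kmeans++ idea concrete, here is a minimal base-R sketch of the seeding step: the first centroid is a random row, and each further centroid is a row sampled with probability proportional to its squared distance from the nearest centroid chosen so far. The helper name `kmeanspp_seed` is hypothetical and is not part of ClusterR; the package's own implementation may differ in its details.

```r
# Hypothetical helper (not a ClusterR function): kmeans++ seeding in base R.
kmeanspp_seed <- function(x, k) {
  x <- as.matrix(x)
  n <- nrow(x)
  stopifnot(k >= 1, k <= n)
  idx <- integer(k)
  # First centroid: one row chosen uniformly at random.
  idx[1] <- sample.int(n, 1)
  # Squared distance from every row to its nearest chosen centroid so far.
  d2 <- rowSums(sweep(x, 2, x[idx[1], ])^2)
  for (j in seq_len(k - 1)) {
    # Next centroid: sampled with probability proportional to d2.
    # Rows already chosen have d2 == 0, so they cannot be picked again.
    nxt <- sample.int(n, 1, prob = d2)
    idx[j + 1] <- nxt
    d2 <- pmin(d2, rowSums(sweep(x, 2, x[nxt, ])^2))
  }
  x[idx, , drop = FALSE]
}

# Toy data (made up for illustration): two well-separated groups.
set.seed(1)
x <- rbind(matrix(rnorm(40), ncol = 2), matrix(rnorm(40, mean = 5), ncol = 2))
init <- kmeanspp_seed(x, k = 2)
```

The sketch assumes the data contain at least k distinct rows; otherwise all remaining sampling weights become zero and `sample.int` stops with an error. The usage line above lists a `CENTROIDS` argument, which presumably accepts a matrix of starting centroids such as `init`, though that is an assumption not confirmed here.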
data(dietary_survey_IBS)
dat = dietary_survey_IBS[, -ncol(dietary_survey_IBS)]
dat = center_scale(dat)
MbatchKm = MiniBatchKmeans(dat, clusters = 2, batch_size = 20, num_init = 5, early_stop_iter = 10)
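As a possible follow-up to the example, the fitted centroids can be used to assign each observation to a cluster. The sketch below assumes that the returned list exposes the centroid matrix as `MbatchKm$centroids` and that ClusterR's `predict_MBatchKMeans(data, CENTROIDS)` is available in the installed version; check the package help pages if either differs.

```r
library(ClusterR)

data(dietary_survey_IBS)
dat <- center_scale(dietary_survey_IBS[, -ncol(dietary_survey_IBS)])

MbatchKm <- MiniBatchKmeans(dat, clusters = 2, batch_size = 20,
                            num_init = 5, early_stop_iter = 10)

# Assign every observation to its nearest fitted centroid
# (assumes the fit stores the centroids in the `centroids` element).
pr <- predict_MBatchKMeans(dat, MbatchKm$centroids)

# Cluster sizes.
table(pr)
```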