h2o.kmeans: KMeans Model in H2O

Description

Performs k-means clustering on an H2O dataset.

Usage

h2o.kmeans(training_frame, x, k, model_id, max_iterations = 1000,
  standardize = TRUE, init = c("Furthest", "Random", "PlusPlus"), seed,
  nfolds = 0, fold_column = NULL, fold_assignment = c("AUTO", "Random",
  "Modulo"), keep_cross_validation_predictions = FALSE)

Arguments

training_frame

An H2OFrame object containing the variables in the model.

x

(Optional) A vector containing the data columns on which k-means operates.

k

The number of clusters. Must be between 1 and 1e7 inclusive. k may be omitted if the user specifies the initial centers in the init parameter. If k is supplied in that case, it should equal the number of user-specified centers.

model_id

(Optional) The unique id assigned to the resulting model. If none is given, an id will automatically be generated.

max_iterations

The maximum number of iterations allowed. Must be between 0

standardize

Logical, indicates whether the data should be standardized before running k-means.
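
As an illustration of why standardization matters for a distance-based method, here is a minimal base-R sketch. It assumes that standardizing means centering each column to mean 0 and scaling it to unit standard deviation; the toy data frame and its column names are hypothetical, and this is not H2O's internal implementation.

```r
# Hypothetical toy data: two columns on very different scales.
toy <- data.frame(age = c(50, 60, 70, 80),
                  vol = c(0.1, 0.2, 0.3, 0.4))

# Center each column to mean 0 and scale it to standard deviation 1.
toy_std <- scale(toy)

# Euclidean distance between rows 1 and 2, before and after scaling.
# On the raw data the distance is dominated by the `age` column.
raw_dist <- dist(toy)[1]
std_dist <- dist(toy_std)[1]

# Sanity checks on the standardized columns.
col_means <- colMeans(toy_std)
col_sds   <- apply(toy_std, 2, sd)
```

After scaling, both columns contribute equally to the distance between the two rows, which is the usual reason to leave standardize = TRUE when columns are measured in different units.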

init

A character string that selects how the initial set of k cluster centers is chosen. Possible values are "Random" for random initialization, "PlusPlus" for k-means++ initialization, or "Furthest" for initialization at the furthest point from each successive center.
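
The "Furthest" option can be pictured as a farthest-first traversal. The base-R sketch below shows one reading of that description: start from a random row, then repeatedly add the row that is furthest from all centers chosen so far. The helper furthest_init and the toy points are hypothetical and are not part of the h2o package; H2O's actual implementation may differ.

```r
# Farthest-first traversal: a sketch of one reading of "Furthest" init.
# `furthest_init` is a hypothetical helper, not an h2o function.
furthest_init <- function(points, k, seed = 1) {
  points <- as.matrix(points)
  set.seed(seed)
  chosen <- sample(nrow(points), 1)   # first center: a random row
  # Squared distance from every row to its nearest chosen center.
  min_d2 <- rowSums(sweep(points, 2, points[chosen, ])^2)
  while (length(chosen) < k) {
    nxt <- which.max(min_d2)          # row furthest from all chosen centers
    chosen <- c(chosen, nxt)
    d2 <- rowSums(sweep(points, 2, points[nxt, ])^2)
    min_d2 <- pmin(min_d2, d2)
  }
  points[chosen, , drop = FALSE]
}

# Hypothetical points forming three well-separated groups.
pts <- rbind(c(0, 0), c(0.1, 0), c(10, 0), c(10, 0.1), c(0, 10))
centers <- furthest_init(pts, k = 3)
```

On well-separated data such as this, the traversal picks one starting center from each group regardless of which row is drawn first.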

seed

(Optional) Random seed used to initialize the cluster centroids.

nfolds

(Optional) Number of folds for cross-validation. If nfolds >= 2, then validation must remain empty.

fold_column

(Optional) Column containing the cross-validation fold index assigned to each observation.

fold_assignment

Cross-validation fold assignment scheme, used if fold_column is not specified. Must be "AUTO", "Random" or "Modulo".
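
To make the difference between the schemes concrete, the base-R sketch below assigns 10 rows to 3 folds in two ways. It assumes that "Modulo" assigns each row by its zero-based index modulo nfolds and that "Random" draws a fold uniformly at random for each row; both are assumptions for illustration and the code does not call H2O.

```r
nfolds <- 3
n_rows <- 10

# "Modulo"-style assignment (assumed): zero-based row index modulo nfolds.
modulo_folds <- (seq_len(n_rows) - 1) %% nfolds

# "Random"-style assignment (assumed): each row draws a fold at random.
set.seed(42)
random_folds <- sample(0:(nfolds - 1), n_rows, replace = TRUE)

# Fold sizes under the modulo scheme differ by at most one row.
modulo_sizes <- table(modulo_folds)
```

The modulo scheme is deterministic and gives evenly sized folds, while the random scheme depends on the seed and can produce folds of uneven size.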

keep_cross_validation_predictions

Logical, indicates whether to keep the predictions of the cross-validation models.

Value

Returns an object of class H2OClusteringModel.

Examples

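library(h2o)
# Start (or connect to) a local H2O cluster
h2o.init()
# Locate the prostate dataset bundled with the h2o package
prosPath <- system.file("extdata", "prostate.csv", package = "h2o")
# Upload the CSV to the cluster as an H2OFrame
prostate.hex <- h2o.uploadFile(path = prosPath)
# Cluster the rows into k = 10 groups using four columns
h2o.kmeans(training_frame = prostate.hex, k = 10, x = c("AGE", "RACE", "VOL", "GLEASON"))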
