Spark ML - Clustering Evaluator
Evaluator for clustering results. The metric computes the Silhouette measure using the squared Euclidean distance. The Silhouette measures how consistent the clusters are. It ranges from -1 to 1, where a value close to 1 means that the points in a cluster are close to the other points in the same cluster and far from the points of the other clusters.
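To make the definition concrete, here is a minimal base R sketch of the mean Silhouette score under squared Euclidean distance. It is not Spark's implementation. The function name and its arguments (`points`, `clusters`) are illustrative, and the sketch assumes at least two clusters and a data set small enough to hold a full distance matrix in memory.

```r
# Mean Silhouette score using squared Euclidean distance (base R only).
# `points` is a numeric matrix with one row per observation and `clusters`
# is a vector of cluster ids; both names are illustrative assumptions.
silhouette_sq_euclidean <- function(points, clusters) {
  # Pairwise squared Euclidean distances between all rows.
  d2 <- as.matrix(dist(points))^2
  ids <- unique(clusters)
  scores <- vapply(seq_len(nrow(points)), function(i) {
    own <- clusters == clusters[i]
    # A point alone in its cluster gets a score of 0 by convention.
    if (sum(own) == 1) return(0)
    # a: mean distance to the other members of the same cluster
    # (the distance from the point to itself is 0, so divide by n - 1).
    a <- sum(d2[i, own]) / (sum(own) - 1)
    # b: smallest mean distance to the members of any other cluster.
    others <- setdiff(ids, clusters[i])
    b <- min(vapply(others, function(k) mean(d2[i, clusters == k]), numeric(1)))
    (b - a) / max(a, b)
  }, numeric(1))
  mean(scores)
}

# Two tight, well-separated clusters should score close to 1.
pts <- rbind(c(0, 0), c(0, 1), c(10, 10), c(10, 11))
score <- silhouette_sq_euclidean(pts, c(0, 0, 1, 1))
```

Each per-point score compares the mean distance to the point's own cluster (a) with the mean distance to the nearest other cluster (b), so well-separated clusters push the average toward 1.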
ml_clustering_evaluator(x, features_col = "features", prediction_col = "prediction", metric_name = "silhouette", uid = random_string("clustering_evaluator_"), ...)
x: A spark_connection object or a tbl_spark containing features and prediction columns. The latter should be the output of
features_col: Name of the features column.
prediction_col: Name of the prediction column.
metric_name: The performance metric. Currently supports "silhouette".
uid: A character string used to uniquely identify the ML estimator.
...: Optional arguments; currently unused.
The calculated performance metric.
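A short usage sketch follows. It assumes the sparklyr package is installed and a Spark installation (version 2.3 or later, where the clustering evaluator was introduced) is available. The helper name `evaluate_kmeans_silhouette` and the choice of the iris columns are hypothetical and only serve the illustration.

```r
# Hypothetical helper: fits k-means on the built-in iris data and returns
# the Silhouette score. `sc` must be an open Spark connection.
evaluate_kmeans_silhouette <- function(sc, k = 3) {
  # Copy iris to Spark; sparklyr replaces "." in column names with "_".
  iris_tbl <- sparklyr::sdf_copy_to(sc, iris, name = "iris_tbl", overwrite = TRUE)
  # Fit k-means on two numeric columns via the formula interface.
  model <- sparklyr::ml_kmeans(iris_tbl, ~ Petal_Length + Petal_Width, k = k)
  # ml_predict() adds the "features" and "prediction" columns.
  predictions <- sparklyr::ml_predict(model, iris_tbl)
  # Evaluate the clustering; the arguments shown are the defaults.
  sparklyr::ml_clustering_evaluator(
    predictions,
    features_col = "features",
    prediction_col = "prediction",
    metric_name = "silhouette"
  )
}
```

With a local Spark installation, the helper could be called as evaluate_kmeans_silhouette(sparklyr::spark_connect(master = "local")). Comparing the returned value across several choices of k is one common way to use the metric.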