ml_clustering_evaluator
Spark ML - Clustering Evaluator
Evaluator for clustering results. The metric computes the Silhouette measure using the squared Euclidean distance. The Silhouette measures how consistent the clusters are. It ranges from -1 to 1, where a value close to 1 means that the points in a cluster are close to the other points in the same cluster and far from the points of the other clusters.
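To make the definition concrete, the following is a minimal base R sketch of the Silhouette computed with squared Euclidean distances. It is an illustration only, not Spark's implementation: the function name `silhouette_sq_euclidean` and its arguments are hypothetical, it builds the full pairwise distance matrix in memory, and it assumes the common convention that a point alone in its cluster scores 0.

```r
# Hypothetical helper (not part of sparklyr or Spark): mean Silhouette over
# all points, using squared Euclidean distance.
# `points` is a numeric matrix (one row per point); `clusters` is an integer
# vector of cluster assignments with at least two distinct values.
silhouette_sq_euclidean <- function(points, clusters) {
  points <- as.matrix(points)
  ids <- unique(clusters)
  stopifnot(nrow(points) == length(clusters), length(ids) >= 2)

  # Pairwise squared Euclidean distances between all rows.
  d2 <- as.matrix(dist(points))^2

  scores <- vapply(seq_len(nrow(points)), function(i) {
    own <- clusters == clusters[i]
    n_own <- sum(own)
    # Assumed convention: a singleton cluster contributes a score of 0.
    if (n_own == 1) return(0)

    # a: mean distance to the other points of the same cluster
    # (d2[i, i] is 0, so dividing by n_own - 1 excludes the point itself).
    a <- sum(d2[i, own]) / (n_own - 1)

    # b: smallest mean distance to the points of any other cluster.
    b <- min(vapply(
      setdiff(ids, clusters[i]),
      function(k) mean(d2[i, clusters == k]),
      numeric(1)
    ))

    (b - a) / max(a, b)
  }, numeric(1))

  mean(scores)
}

# Two well-separated groups of two points each.
pts <- rbind(c(0, 0), c(0, 1), c(10, 10), c(10, 11))
good <- silhouette_sq_euclidean(pts, c(0L, 0L, 1L, 1L))  # close to 1
bad  <- silhouette_sq_euclidean(pts, c(0L, 1L, 0L, 1L))  # negative
```

Assigning each tight group to its own cluster gives a score near 1, while mixing the groups gives a negative score, which matches the interpretation above.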
Usage
ml_clustering_evaluator(x, features_col = "features",
prediction_col = "prediction", metric_name = "silhouette",
uid = random_string("clustering_evaluator_"), ...)
Arguments
- x
A spark_connection object or a tbl_spark containing the features and prediction columns. In the latter case, the tbl_spark should be the output of sdf_predict.
- features_col
Name of features column.
- prediction_col
Name of the prediction column.
- metric_name
The performance metric. Currently supports "silhouette".
- uid
A character string used to uniquely identify the ML evaluator.
- ...
Optional arguments; currently unused.
Value
The calculated performance metric
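Examples
A hedged usage sketch follows. It assumes a working Spark installation with a Spark version that provides the clustering evaluator (2.3.0 or later), and it uses the built-in mtcars data purely for illustration. The wrapper name `kmeans_silhouette` is hypothetical; the sparklyr calls inside it (`sdf_copy_to`, `ml_kmeans`, `ml_predict`, `ml_clustering_evaluator`) are existing sparklyr functions.

```r
# Hypothetical helper (not part of sparklyr): fits k-means on two mtcars
# columns and returns the Silhouette of the resulting cluster assignments.
kmeans_silhouette <- function(sc, k = 3) {
  # Copy the example data to Spark.
  mtcars_tbl <- sparklyr::sdf_copy_to(
    sc, mtcars, name = "mtcars_tbl", overwrite = TRUE
  )

  # Fit k-means; the one-sided formula selects the feature columns.
  model <- sparklyr::ml_kmeans(mtcars_tbl, formula = ~ mpg + wt, k = k)

  # Add the "features" and "prediction" columns expected by the evaluator.
  predictions <- sparklyr::ml_predict(model, mtcars_tbl)

  # Evaluate the clustering with the Silhouette metric.
  sparklyr::ml_clustering_evaluator(
    predictions,
    features_col = "features",
    prediction_col = "prediction",
    metric_name = "silhouette"
  )
}

# Usage, assuming a local Spark installation is available:
# sc <- sparklyr::spark_connect(master = "local")
# kmeans_silhouette(sc, k = 3)
# sparklyr::spark_disconnect(sc)
```

Comparing the returned value across several choices of k is one common way to use the metric, with higher values indicating tighter, better-separated clusters.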