ml_clustering_evaluator

Spark ML - Clustering Evaluator

Evaluator for clustering results. The metric is the Silhouette measure, computed using the squared Euclidean distance. The Silhouette validates consistency within clusters: it ranges from -1 to 1, where a value close to 1 means that the points in a cluster are close to the other points in the same cluster and far from the points in other clusters.
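To make the definition concrete, here is a minimal base R sketch of the mean Silhouette using squared Euclidean distance. It follows the textbook per-point formula s = (b - a) / max(a, b), where a is the mean distance to the other points of the same cluster and b is the smallest mean distance to any other cluster. It is not Spark's implementation; the function name silhouette_sq_euclidean and the toy data are illustrative assumptions.

```r
# Mean Silhouette with squared Euclidean distance (illustrative sketch only).
# `points`:   numeric matrix or data frame, one row per observation.
# `clusters`: vector of cluster labels, one per row of `points`.
silhouette_sq_euclidean <- function(points, clusters) {
  points <- as.matrix(points)
  labels <- unique(clusters)
  stopifnot(nrow(points) == length(clusters), length(labels) >= 2)

  # Pairwise squared Euclidean distances between all rows.
  d2 <- as.matrix(dist(points))^2

  n <- nrow(points)
  scores <- numeric(n)
  for (i in seq_len(n)) {
    own <- clusters == clusters[i]
    own[i] <- FALSE                      # exclude the point itself
    if (!any(own)) {
      scores[i] <- 0                     # convention: singleton clusters score 0
      next
    }
    a <- mean(d2[i, own])                # cohesion within the point's own cluster
    others <- setdiff(labels, clusters[i])
    b <- min(vapply(others,              # separation from the nearest other cluster
                    function(k) mean(d2[i, clusters == k]),
                    numeric(1)))
    scores[i] <- (b - a) / max(a, b)
  }
  mean(scores)
}

# Toy data: two tight groups that are far apart.
toy <- rbind(c(0, 0), c(0, 1), c(10, 0), c(10, 1))
good <- silhouette_sq_euclidean(toy, c(1, 1, 2, 2))  # close to 1
bad  <- silhouette_sq_euclidean(toy, c(1, 2, 1, 2))  # negative
```

A labelling that matches the two groups scores near 1, while one that splits each group across clusters scores below 0, which is the behaviour the description above refers to.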

Usage
ml_clustering_evaluator(x, features_col = "features",
  prediction_col = "prediction", metric_name = "silhouette",
  uid = random_string("clustering_evaluator_"), ...)
Arguments
x

A spark_connection object or a tbl_spark containing features and prediction columns. The latter should be the output of sdf_predict.

features_col

Name of features column.

prediction_col

Name of the prediction column.

metric_name

The performance metric. Currently supports "silhouette".

uid

A character string used to uniquely identify the ML evaluator.

...

Optional arguments; currently unused.

Value

The calculated performance metric.
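As a hedged illustration of how the arguments above fit together, the sketch below wraps a k-means fit and its evaluation in a helper. The helper name kmeans_silhouette, the use of the iris data, and the choice of k are assumptions made for the example. It also assumes a Spark version of 2.3 or later, where Spark's ClusteringEvaluator is available, and it uses ml_transform to produce the prediction column rather than sdf_predict.

```r
# Hypothetical helper: fit k-means on the numeric iris columns and return
# the Silhouette score. `sc` is an open spark_connection.
kmeans_silhouette <- function(sc, k = 3L) {
  # Copy the data to Spark; dots in column names become underscores.
  iris_tbl <- sparklyr::sdf_copy_to(sc, iris, name = "iris", overwrite = TRUE)

  # Assemble the numeric columns into the vector column named by features_col.
  assembled <- sparklyr::ft_vector_assembler(
    iris_tbl,
    input_cols = c("Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width"),
    output_col = "features"
  )

  # Fit k-means and add the column named by prediction_col.
  estimator <- sparklyr::ml_kmeans(
    sc, k = k, features_col = "features", prediction_col = "prediction"
  )
  model <- sparklyr::ml_fit(estimator, assembled)
  predictions <- sparklyr::ml_transform(model, assembled)

  # Evaluate the clustering with the Silhouette measure.
  sparklyr::ml_clustering_evaluator(
    predictions,
    features_col = "features",
    prediction_col = "prediction",
    metric_name = "silhouette"
  )
}

# Usage, assuming a local Spark installation:
# sc <- sparklyr::spark_connect(master = "local")
# kmeans_silhouette(sc, k = 3L)
# sparklyr::spark_disconnect(sc)
```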

Aliases
  • ml_clustering_evaluator
Documentation reproduced from package sparklyr, version 0.8.1, License: Apache License 2.0 | file LICENSE
