A set of functions to calculate performance metrics for prediction models. See also the Spark ML documentation: https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.evaluation.package
ml_binary_classification_evaluator(x, label_col = "label",
raw_prediction_col = "rawPrediction", metric_name = "areaUnderROC",
uid = random_string("binary_classification_evaluator_"), ...)
ml_binary_classification_eval(x, label_col = "label",
prediction_col = "prediction", metric_name = "areaUnderROC")
ml_multiclass_classification_evaluator(x, label_col = "label",
prediction_col = "prediction", metric_name = "f1",
uid = random_string("multiclass_classification_evaluator_"), ...)
ml_classification_eval(x, label_col = "label",
prediction_col = "prediction", metric_name = "f1")
ml_regression_evaluator(x, label_col = "label",
prediction_col = "prediction", metric_name = "rmse",
uid = random_string("regression_evaluator_"), ...)
x: A spark_connection object or a tbl_spark containing label and prediction columns. The latter should be the output of sdf_predict.
label_col: Name of the column (a string) that contains the true labels or values.
raw_prediction_col: Name of the raw prediction (a.k.a. confidence) column.
metric_name: The performance metric. See details.
uid: A character string used to uniquely identify the ML estimator.
...: Optional arguments; currently unused.
prediction_col: Name of the column that contains the predicted label or value, NOT the scored probability. The column should be of type Double.
The calculated performance metric.
The following metrics are supported:
Binary Classification: areaUnderROC (default) or areaUnderPR (not available in Spark 2.X).
Multiclass Classification: f1 (default), precision, recall, weightedPrecision, weightedRecall or accuracy; for Spark 2.X: f1 (default), weightedPrecision, weightedRecall or accuracy.
Regression: rmse (root mean squared error, default), mse (mean squared error), r2 (coefficient of determination), or mae (mean absolute error).
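To make the regression metric names concrete, here is a minimal base R sketch of their usual textbook definitions. It does not call Spark or sparklyr; the vectors label and prediction are made-up example data, and Spark's own implementation may differ in details such as how r2 is handled for models without an intercept.

```r
# Base R sketch of the four regression metrics; no Spark session is needed.
# `label` and `prediction` are hypothetical example vectors, not sparklyr objects.
label      <- c(3.0, -0.5, 2.0, 7.0)
prediction <- c(2.5,  0.0, 2.0, 8.0)

# Per-observation errors
residual <- label - prediction

mse  <- mean(residual^2)       # mean squared error
rmse <- sqrt(mse)              # root mean squared error
mae  <- mean(abs(residual))    # mean absolute error

# Coefficient of determination: 1 - (residual sum of squares / total sum of squares)
r2 <- 1 - sum(residual^2) / sum((label - mean(label))^2)

print(c(rmse = rmse, mse = mse, r2 = r2, mae = mae))
```

For rmse, mse and mae, lower values indicate a better fit; for r2, values closer to 1 are better.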
ml_binary_classification_eval() is an alias for ml_binary_classification_evaluator() for backwards compatibility.
ml_classification_eval() is an alias for ml_multiclass_classification_evaluator() for backwards compatibility.