ml-persistence


Spark ML -- Model Persistence

Save/load Spark ML objects

Usage
ml_save(x, path, overwrite = FALSE, ...)

# S3 method for ml_model
ml_save(x, path, overwrite = FALSE, type = c("pipeline_model", "pipeline"), ...)

ml_load(sc, path)

Arguments
x

An ML object, which can be an ml_pipeline_stage or an ml_model.

path

The path where the object is to be serialized/deserialized.

overwrite

Whether to overwrite the existing path; defaults to FALSE.

...

Optional arguments; currently unused.

type

Whether to save the pipeline model or the pipeline.

sc

A Spark connection.

Value

ml_save() serializes a Spark object into a format that can be read back into sparklyr or by the Scala or PySpark APIs. When called on ml_model objects, i.e. those created via the tbl_spark-formula signature, the associated pipeline model is serialized. In other words, the saved model contains both the data-processing stage (RFormulaModel) and the machine learning stage.

ml_load() reads a saved Spark object into sparklyr. It calls the appropriate Scala load method based on the saved metadata. Note that a PipelineModel object saved from a sparklyr ml_model via ml_save() is read back in as an ml_pipeline_model rather than an ml_model object.
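
Examples

A minimal save/load round-trip sketch, assuming a local Spark installation; the table name, the iris_model path, and the choice of ml_linear_regression are illustrative only and not part of this help page.

library(sparklyr)

sc <- spark_connect(master = "local")

# Fitting with the tbl_spark-formula signature returns an ml_model
# (sparklyr copies iris with underscores in place of dots in column names)
iris_tbl <- sdf_copy_to(sc, iris, name = "iris_tbl", overwrite = TRUE)
fit <- ml_linear_regression(iris_tbl, Petal_Length ~ Petal_Width)

# Saving the ml_model serializes the associated pipeline model,
# i.e. the RFormulaModel stage plus the fitted regression stage
model_path <- file.path(tempdir(), "iris_model")   # illustrative path
ml_save(fit, model_path, overwrite = TRUE)

# type = "pipeline" would save the unfitted pipeline instead:
# ml_save(fit, model_path, overwrite = TRUE, type = "pipeline")

# The saved object comes back as an ml_pipeline_model, not an ml_model
reloaded <- ml_load(sc, model_path)
ml_transform(reloaded, iris_tbl)   # apply the reloaded transformer to new data

spark_disconnect(sc)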

Aliases
  • ml-persistence
  • ml_save
  • ml_save.ml_model
  • ml_load
Documentation reproduced from package sparklyr, version 0.7.0, License: Apache License 2.0 | file LICENSE
