sparklyr (version 0.2.32)

ft_string_indexer: Feature Transformation -- StringIndexer

Description

Encode a column of labels into a column of label indices. The indices are in [0, numLabels), ordered by label frequencies, with the most frequent label assigned index 0. The transformation can be reversed with ft_index_to_string.
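The frequency-based ordering described above can be illustrated without Spark. The following is a minimal base R sketch, not sparklyr's implementation: the sample vector and the variable names are made up for illustration, and the sample avoids ties in frequency because tie-breaking behaviour is not described here.

```r
# Hypothetical sample data: "a" occurs 3 times, "c" twice, "b" once.
labels <- c("a", "b", "c", "a", "a", "c")

# Count each label, then order the distinct labels by descending frequency.
counts <- table(labels)
ordered_labels <- names(counts)[order(-as.integer(counts))]

# Assign 0-based indices: the most frequent label gets index 0.
index <- match(labels, ordered_labels) - 1L

# Reverse the encoding (the role ft_index_to_string plays in Spark).
decoded <- ordered_labels[index + 1L]

print(data.frame(label = labels, index = index))
```

Every index falls in [0, numLabels), and the decoding step recovers the original labels because the ordered label vector acts as the lookup table.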

Usage

ft_string_indexer(x, input_col = NULL, output_col = NULL, params = NULL)

Arguments

x
An object (usually a spark_tbl) coercible to a Spark DataFrame.
input_col
The name of the input column(s).
output_col
The name of the output column.
params
An optional R environment. When supplied, the index <-> label mapping generated by the string indexer is stored in this environment under the name labels.
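A possible call using the signature shown under Usage might look like the sketch below. It assumes a local Spark installation reachable through spark_connect(master = "local"); the iris data set, the output column name species_idx, and the environment name label_env are illustrative choices, not part of this page.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally and can be started from R.
sc <- spark_connect(master = "local")

# Copy a small example data set into Spark; Species is a string column.
iris_tbl <- copy_to(sc, iris, "iris")

# Environment that will receive the index <-> label mapping.
label_env <- new.env()

indexed <- ft_string_indexer(
  iris_tbl,
  input_col  = "Species",
  output_col = "species_idx",   # hypothetical output column name
  params     = label_env
)

# Labels found by the indexer, stored under the labels key.
label_env$labels

spark_disconnect(sc)
```

Passing an environment is what allows the labels to be read back after the call, since the function's return value is the transformed table itself.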

See Also

See http://spark.apache.org/docs/latest/ml-features.html for more information on the set of transformations available for DataFrame columns in Spark.

Other feature transformation routines: ft_binarizer, ft_bucketizer, ft_discrete_cosine_transform, ft_elementwise_product, ft_index_to_string, ft_one_hot_encoder, ft_quantile_discretizer, ft_sql_transformer, ft_vector_assembler, sdf_mutate