ft_one_hot_encoder
Feature Transformation -- OneHotEncoder (Transformer)
One-hot encoding maps a column of label indices to a column of binary
vectors, with at most a single one-value. This encoding allows algorithms
which expect continuous features, such as Logistic Regression, to use
categorical features. It is typically used with ft_string_indexer() to index a column first.
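As a rough illustration of the encoding itself, the base-R sketch below mimics the behaviour described above without calling Spark. The helper one_hot() and its n_categories argument are hypothetical names introduced here and are not part of sparklyr. The sketch assumes zero-based label indices, such as those produced by a string indexer, and that dropping the last category leaves it as an all-zero row.

```r
# Base-R illustration only; it does not call Spark or sparklyr.
# `one_hot` is a hypothetical helper written for this example.
one_hot <- function(index, n_categories, drop_last = TRUE) {
  # With drop_last, the last category gets no column of its own.
  width <- if (drop_last) n_categories - 1L else n_categories
  out <- matrix(0, nrow = length(index), ncol = width)
  for (i in seq_along(index)) {
    pos <- index[i] + 1L              # label indices are assumed zero-based
    if (pos <= width) out[i, pos] <- 1
  }
  out
}

idx <- c(0, 1, 2)                     # three categories, one row each

# Rows are (1, 0), (0, 1), (0, 0): the last category is all zeros.
one_hot(idx, n_categories = 3)

# Rows are (1, 0, 0), (0, 1, 0), (0, 0, 1): every category keeps a column.
one_hot(idx, n_categories = 3, drop_last = FALSE)
```

With drop_last = TRUE the last category has no column, which is why each output vector holds at most a single one-value rather than exactly one.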
Usage
ft_one_hot_encoder(x, input_col = NULL, output_col = NULL,
drop_last = TRUE, uid = random_string("one_hot_encoder_"), ...)
Arguments
- x
A spark_connection, ml_pipeline, or a tbl_spark.
- input_col
The name of the input column.
- output_col
The name of the output column.
- drop_last
Whether to drop the last category. Defaults to TRUE.
- uid
A character string used to uniquely identify the feature transformer.
- ...
Optional arguments; currently unused.
Value
The object returned depends on the class of x.
- spark_connection: When x is a spark_connection, the function returns an ml_transformer, an ml_estimator, or one of their subclasses. The object contains a pointer to a Spark Transformer or Estimator object and can be used to compose Pipeline objects.
- ml_pipeline: When x is an ml_pipeline, the function returns an ml_pipeline with the transformer or estimator appended to the pipeline.
- tbl_spark: When x is a tbl_spark, a transformer is constructed then immediately applied to the input tbl_spark, returning a tbl_spark.
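The sketch below shows how the tbl_spark and ml_pipeline cases might look in practice. It assumes a local Spark installation that sparklyr can connect to, uses the built-in iris data purely as sample input, and follows the argument names given in the Usage section; the names in an installed sparklyr version may differ. The column names species_idx and species_vec are arbitrary choices made for this example.

```r
library(sparklyr)

# Assumption: a local Spark installation is available to sparklyr.
sc <- spark_connect(master = "local")
iris_tbl <- sdf_copy_to(sc, iris, name = "iris_tbl", overwrite = TRUE)

# tbl_spark input: each transformer is applied immediately,
# so the result is another tbl_spark with the new columns added.
encoded_tbl <- iris_tbl %>%
  ft_string_indexer(input_col = "Species", output_col = "species_idx") %>%
  ft_one_hot_encoder(input_col = "species_idx", output_col = "species_vec")

# ml_pipeline input: the stages are appended to the pipeline
# and nothing is computed until the pipeline is fitted.
pipeline <- ml_pipeline(sc) %>%
  ft_string_indexer(input_col = "Species", output_col = "species_idx") %>%
  ft_one_hot_encoder(input_col = "species_idx", output_col = "species_vec")

spark_disconnect(sc)
```

Indexing the string column first matters because the encoder expects a column of label indices rather than raw strings.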
See Also
See http://spark.apache.org/docs/latest/ml-features.html for more information on the set of transformations available for DataFrame columns in Spark.
Other feature transformers: ft_binarizer, ft_bucketizer, ft_chisq_selector, ft_count_vectorizer, ft_dct, ft_elementwise_product, ft_feature_hasher, ft_hashing_tf, ft_idf, ft_imputer, ft_index_to_string, ft_interaction, ft_lsh, ft_max_abs_scaler, ft_min_max_scaler, ft_ngram, ft_normalizer, ft_pca, ft_polynomial_expansion, ft_quantile_discretizer, ft_r_formula, ft_regex_tokenizer, ft_sql_transformer, ft_standard_scaler, ft_stop_words_remover, ft_string_indexer, ft_tokenizer, ft_vector_assembler, ft_vector_indexer, ft_vector_slicer, ft_word2vec