ft_hashing_tf

x

A <code>spark_connection</code>, <code>ml_pipeline</code>, or a <code>tbl_spark</code>.

input_col

The name of the input column.

output_col

The name of the output column.

binary

Binary toggle to control term frequency counts.
If true, all non-zero counts are set to 1. This is useful for discrete
probabilistic models that model binary events rather than integer
counts. (default = <code>FALSE</code>)

num_features

Number of features. Should be greater than 0. (default = <code>2^18</code>)

uid

A character string used to uniquely identify the feature transformer.

...

Optional arguments; currently unused.

Maps a sequence of terms to their term frequencies using the hashing trick.
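
The hashing trick can be illustrated with a minimal sketch in plain R, with no Spark required. Each term is hashed to one of <code>num_features</code> buckets and the bucket's counter is incremented; with <code>binary = TRUE</code> every non-zero count is clamped to 1. The helpers <code>toy_hash</code> and <code>hashing_tf_sketch</code> are hypothetical names introduced only for illustration, and the toy hash is an assumption that is not the hash function Spark uses, so bucket indices will not match Spark's output.

```r
# Toy string hash (an illustrative assumption, NOT Spark's hash function):
# a polynomial rolling hash over the term's code points, reduced modulo
# num_features at every step so the value stays small.
toy_hash <- function(term, num_features) {
  h <- 0
  for (code in utf8ToInt(term)) {
    h <- (h * 31 + code) %% num_features
  }
  h + 1  # shift from 0-based bucket to R's 1-based index
}

# Map a character vector of terms to a dense term-frequency vector.
hashing_tf_sketch <- function(terms, num_features = 2^18, binary = FALSE) {
  stopifnot(num_features > 0)
  counts <- numeric(num_features)
  for (term in terms) {
    idx <- toy_hash(term, num_features)
    counts[idx] <- counts[idx] + 1
  }
  if (binary) {
    # Binary mode: record presence or absence instead of counts.
    counts <- as.numeric(counts > 0)
  }
  counts
}

terms <- c("spark", "is", "fast", "spark")
tf <- hashing_tf_sketch(terms, num_features = 16)
tf_binary <- hashing_tf_sketch(terms, num_features = 16, binary = TRUE)
```

Because distinct terms can land in the same bucket, a small <code>num_features</code> increases collisions; the large default of <code>2^18</code> keeps them rare for typical vocabularies.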

R interface to Apache Spark, a fast and general engine for big data
processing, see <http://spark.apache.org>. This package supports connecting to
local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end,
and provides an interface to Spark's built-in machine learning algorithms.
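
As a rough illustration of how <code>ft_hashing_tf</code> fits into that workflow, the sketch below wraps a small example in a function that takes an existing Spark connection. The function name <code>hashing_tf_demo</code>, the table name, the column names, and the sample data are all hypothetical; running it assumes sparklyr and a working Spark installation.

```r
# Hypothetical helper: tokenize a tiny text column, then hash the tokens
# into a term-frequency vector column. `sc` must be a live spark_connection.
hashing_tf_demo <- function(sc) {
  docs <- data.frame(
    text = c("spark is fast", "spark is spark"),
    stringsAsFactors = FALSE
  )

  # Copy the local data frame to Spark (table name is an assumption).
  docs_tbl <- sparklyr::sdf_copy_to(
    sc, docs, name = "hashing_tf_docs", overwrite = TRUE
  )

  # Split each string into a sequence of terms.
  tokens <- sparklyr::ft_tokenizer(
    docs_tbl, input_col = "text", output_col = "words"
  )

  # Map each sequence of terms to term frequencies over 16 buckets.
  sparklyr::ft_hashing_tf(
    tokens,
    input_col = "words",
    output_col = "tf",
    num_features = 16,
    binary = FALSE
  )
}
```

With a local cluster available, this could be called as <code>hashing_tf_demo(sparklyr::spark_connect(master = "local"))</code>; the result is a <code>tbl_spark</code> with the added <code>tf</code> column.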

Yitao Li

sparklyr

R Interface to Apache Spark

Javier Luraschi

ft_hashing_tf: Feature Transformation -- HashingTF (Transformer)
