ft_quantile_discretizer

An object (usually a <code>spark_tbl</code>) coercible to a Spark DataFrame.

input.col

output.col

n.buckets

Optional arguments; currently unused.

Takes a column of continuous features and outputs a column of binned
categorical features. The bin ranges are chosen by taking a sample of the
data and dividing it into roughly equal parts. The lowest and highest bin
bounds are -Infinity and +Infinity, so the bins cover all real values. The
transformer attempts to find n.buckets buckets (numBuckets in Spark) from a
sample of the input data, but it may find fewer, for example when the sample
contains too few distinct values.
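
The binning idea described above can be sketched in base R, without a Spark
connection. This is only an illustration of quantile-based bucketing, not
Spark's implementation: the variable names (x, smp, n_buckets, splits,
bucket), the sample size, and the zero-based bucket index are all assumptions
made for the example.

```r
# Illustrative sketch of quantile binning in base R (not Spark's algorithm).
set.seed(42)
x <- rnorm(1000)      # a continuous feature
n_buckets <- 4        # hypothetical stand-in for the n.buckets argument

# Estimate the interior split points from a sample of the data.
smp <- sample(x, 200)
probs <- seq(0, 1, length.out = n_buckets + 1)
inner <- quantile(smp, probs = probs[-c(1, n_buckets + 1)], names = FALSE)

# The outer bounds are -Inf and +Inf, so every real value falls in a bin.
# Duplicate split points collapse, which is why fewer buckets may result.
splits <- c(-Inf, unique(inner), Inf)

# Assign each value a bucket index (zero-based here, by assumption).
bucket <- cut(x, breaks = splits, labels = FALSE, right = FALSE) - 1L
table(bucket)
```

Because the split points come from a sample rather than the full data, the
bucket counts are only roughly equal. With a Spark connection, the same kind
of binning would be requested through ft_quantile_discretizer using the
input.col, output.col, and n.buckets arguments listed above.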

R interface to Apache Spark, a fast and general engine for big data
processing, see <http://spark.apache.org>. This package supports connecting to
local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end,
and provides an interface to Spark's built-in machine learning algorithms.

Javier Luraschi

ft_quantile_discretizer: Feature Transformation -- QuantileDiscretizer
