# ft_quantile_discretizer

##### Feature Transformation -- QuantileDiscretizer

Takes a column of continuous features and outputs a column of binned categorical features. The bin ranges are chosen by taking a sample of the data and dividing it into roughly equal parts. The lowest and highest bin bounds are -Infinity and +Infinity, so the bins cover all real values. The transformer attempts to find `n.buckets` partitions (Spark's numBuckets) based on a sample of the input data, but it may find fewer depending on the sampled values.
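The binning idea described above can be sketched in base R. This is only an illustration of quantile-based splits, not Spark's implementation: Spark estimates the quantiles from a sample, whereas this sketch uses exact quantiles on the full vector, and the helper name `quantile_splits` is hypothetical.

```r
# Hypothetical helper: compute bucket boundaries from evenly spaced quantiles.
# Assumption: exact quantiles on all of x; Spark works from a data sample.
quantile_splits <- function(x, n_buckets) {
  probs <- seq(0, 1, length.out = n_buckets + 1L)
  # Keep only the interior quantiles; the outer bounds are replaced below.
  inner <- quantile(x, probs = probs[-c(1, length(probs))], names = FALSE)
  # Outer bounds are -Inf and +Inf so every real value falls in some bucket.
  # unique() drops repeated split points, which is why fewer buckets can result.
  unique(c(-Inf, inner, Inf))
}

x <- c(1, 2, 2, 3, 5, 8, 13, 21, 34, 55)
splits <- quantile_splits(x, 4L)

# Zero-based bucket index for each value.
bucket <- findInterval(x, splits) - 1L

# Heavily repeated values make the interior quantiles coincide,
# so fewer than the requested 4 buckets are produced.
skewed <- c(rep(0, 8), 1, 2)
skewed_splits <- quantile_splits(skewed, 4L)
n_skewed_buckets <- length(skewed_splits) - 1L
```

The second case shows why the requested number of buckets is an upper bound rather than a guarantee.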

##### Usage

`ft_quantile_discretizer(x, input.col, output.col, n.buckets = 5L, ...)`

##### Arguments

- x
An object (usually a `spark_tbl`) coercible to a Spark DataFrame.

- input.col
The name of the input column(s).

- output.col
The name of the output column.

- n.buckets
The number of buckets to use.

- ...
Optional arguments; currently unused.
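A minimal usage sketch follows, assuming a local Spark installation is available to sparklyr and that dplyr is loaded for the pipe. The table name `mtcars_tbl` and the output column name `mpg_bucket` are illustrative choices, not part of the API.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally (for example via spark_install()).
sc <- spark_connect(master = "local")

# Copy a built-in R data set to Spark for demonstration purposes.
mtcars_tbl <- copy_to(sc, mtcars, "mtcars", overwrite = TRUE)

# Bin the continuous mpg column into at most 4 quantile-based buckets.
binned <- mtcars_tbl %>%
  ft_quantile_discretizer(
    input.col = "mpg",
    output.col = "mpg_bucket",
    n.buckets = 4L
  )

# Inspect the original values next to their bucket assignments.
binned %>% select(mpg, mpg_bucket) %>% head()

spark_disconnect(sc)
```

Because the splits are derived from a sample, the bucket assigned to a given row may differ between runs.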

##### Details

Note that the result may differ from run to run, since the underlying sampling strategy is non-deterministic.

##### See Also

See http://spark.apache.org/docs/latest/ml-features.html for more information on the set of transformations available for DataFrame columns in Spark.

Other feature transformation routines: `ft_binarizer`, `ft_bucketizer`, `ft_count_vectorizer`, `ft_discrete_cosine_transform`, `ft_elementwise_product`, `ft_index_to_string`, `ft_one_hot_encoder`, `ft_regex_tokenizer`, `ft_stop_words_remover`, `ft_string_indexer`, `ft_tokenizer`, `ft_vector_assembler`, `sdf_mutate`

*Documentation reproduced from package sparklyr, version 0.6.3, License: Apache License 2.0 | file LICENSE*