ft_imputer

x

A <code>spark_connection</code>, <code>ml_pipeline</code>, or a <code>tbl_spark</code>.

input_cols

The names of the input columns.

output_cols

The names of the output columns.

missing_value

The placeholder for the missing values. All occurrences of
<code>missing_value</code> will be imputed. Note that null values are always treated
as missing.

strategy

The imputation strategy. Currently only "mean" and "median" are
supported. If "mean", missing values are replaced with the mean value of the
feature. If "median", missing values are replaced with the approximate median
value of the feature. Default: "mean".

dataset

(Optional) A <code>tbl_spark</code>. If provided, eagerly fit the (estimator)
feature "transformer" against <code>dataset</code>. See details.

uid

A character string used to uniquely identify the feature transformer.

...

Optional arguments; currently unused.
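
The roles of <code>missing_value</code> and <code>strategy</code> can be illustrated with a minimal base R sketch. The helper <code>impute_column</code> is hypothetical and is not part of sparklyr or Spark; it uses R's exact <code>median</code>, whereas the Spark estimator uses an approximate median, and it lets <code>NA</code> stand in for Spark's null.

```r
# Hypothetical helper for illustration only; not part of sparklyr or Spark.
impute_column <- function(values, missing_value = NULL, strategy = "mean") {
  strategy <- match.arg(strategy, c("mean", "median"))

  # NA (standing in for null) and NaN are always treated as missing.
  is_missing <- is.na(values)

  # Optionally treat an additional placeholder value as missing.
  if (!is.null(missing_value) && !is.na(missing_value)) {
    is_missing <- is_missing | values == missing_value
  }

  # Compute the fill value from the observed entries only.
  observed <- values[!is_missing]
  fill <- if (strategy == "mean") mean(observed) else median(observed)

  values[is_missing] <- fill
  values
}

mean_filled   <- impute_column(c(1, NA, 3, 5))
median_filled <- impute_column(c(1, -1, 3, 8), missing_value = -1,
                               strategy = "median")
```

In this sketch the fill value is computed from the non-missing entries of the same column, which mirrors the per-column behavior described for the estimator.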

Imputation estimator for completing missing values, using either the mean or
the median of the columns in which the missing values are located. The input
columns should be of numeric type. This function requires Spark 2.2.0+.
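
A usage sketch follows, assuming a local Spark installation (2.2.0 or later) is available to sparklyr. The data frame, the table name <code>imputer_demo</code>, and the column names <code>x</code> and <code>x_imputed</code> are made up for illustration.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark (>= 2.2.0) is installed, e.g. via spark_install().
sc <- spark_connect(master = "local")

# Hypothetical toy data: one numeric column containing a missing value.
df <- data.frame(x = c(1, NA, 3, 5))
sdf <- copy_to(sc, df, name = "imputer_demo", overwrite = TRUE)

# Passing a tbl_spark fits the estimator and applies it to the same table.
imputed <- sdf %>%
  ft_imputer(input_cols = "x", output_cols = "x_imputed", strategy = "mean") %>%
  collect()

spark_disconnect(sc)
```

Writing the result to a separate output column keeps the original values available for comparison.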


ft_imputer: Feature Transformation -- Imputer (Estimator)
