RMCLab (version 0.1.0)

median_impute: Median imputation

Description

Perform median imputation: each missing cell is replaced by the median of the observed values in its column. For discrete rating-scale data, a discretization step can be carried out afterwards so that the imputed values are mapped to the rating scale of the observed values. This matters because the median of a column may lie between two answer categories when the number of observed values is even. In that case, each missing cell is imputed by randomly sampling from the largest answer category smaller than the median and the smallest answer category larger than the median.
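To make the discretization step concrete, here is a minimal base R sketch of the procedure described above. It is not the RMCLab implementation: the function name median_impute_sketch is hypothetical, and the sketch assumes that the two neighboring categories are drawn with equal probability, that the default rating scale is the sorted set of observed values, and that every column has at least one observed value.

```r
# Hypothetical helper for illustration only; NOT the RMCLab implementation.
median_impute_sketch <- function(X, values = NULL) {
  X <- as.matrix(X)
  # Assumption: the default rating scale is the sorted set of observed values
  if (is.null(values)) values <- sort(unique(X[!is.na(X)]))
  # Column-wise medians of the observed values
  medians <- apply(X, 2, median, na.rm = TRUE)
  X_imputed <- X
  X_discretized <- X
  for (j in seq_len(ncol(X))) {
    missing <- which(is.na(X[, j]))
    if (length(missing) == 0) next
    m <- medians[j]
    # Plain median imputation
    X_imputed[missing, j] <- m
    if (m %in% values) {
      # The median is already a valid answer category
      X_discretized[missing, j] <- m
    } else {
      # Neighboring answer categories on either side of the median
      lower <- max(values[values < m])
      upper <- min(values[values > m])
      # Assumption: equal probabilities, one independent draw per missing cell
      X_discretized[missing, j] <- sample(c(lower, upper), length(missing),
                                          replace = TRUE)
    }
  }
  list(medians = medians, X = X_imputed, X_discretized = X_discretized)
}

# Small usage example on made-up ratings
X <- cbind(a = c(2, 3, NA, NA), b = c(1, NA, 5, 3))
fit <- median_impute_sketch(X, values = 1:5)
fit$medians        # a = 2.5, b = 3
fit$X_discretized  # missing cells in column a become 2 or 3; in column b, 3
```

Column a has an even number of observed values, so its median falls between two categories and the imputed cells are drawn at random. Column b has an odd number, so its median is itself a valid rating and is used directly.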

Usage

median_impute(X, discretize = TRUE, values = NULL)

Value

An object of class "median_impute" with the following components:

medians

a numeric vector containing the median of the observed values for each variable.

X

a numeric matrix containing the completed (i.e., imputed) data matrix.

X_discretized

a numeric matrix containing the completed (i.e., imputed) data matrix after the discretization step. This is only returned if requested via discretize = TRUE.

The class structure is still experimental and may change in the future. Use the accessor function get_completed() to extract the completed (i.e., imputed) data matrix.

Arguments

X

a matrix or data frame with missing values.

discretize

a logical indicating whether to include a discretization step after median imputation (defaults to TRUE). For discrete rating-scale data, this ensures that the imputed values are mapped to the discrete rating scale of the observed values.

values

an optional numeric vector giving the possible values of discrete ratings. This is ignored if discretize is FALSE. Currently, the possible values are assumed to be the same for all columns. If NULL, the unique values of the observed parts of X are used.
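As an illustration of the default for values, the following base R snippet shows one plausible way to derive the candidate values from the observed cells. The sorting and the variable names are assumptions made for this sketch, not taken from the package source.

```r
# Made-up ratings on a 1-5 scale in which 2 and 4 are never observed
X <- cbind(c(1, 3, NA), c(5, NA, 3))
# Unique values of the observed parts of X (sorting is an assumption)
values <- sort(unique(X[!is.na(X)]))
values  # 1 3 5
```

Categories that never occur in the observed data are absent from the derived set, which is one reason to supply the full scale explicitly, e.g. values = 1:5.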

Author

Andreas Alfons

See Also

mode_impute()

Examples

# toy example derived from MovieLens 100K dataset
data("MovieLensToy")
# median imputation with discretization step
fit <- median_impute(MovieLensToy, values = 1:5)
# extract discretized completed matrix
X_hat <- get_completed(fit)
head(X_hat)
