healthcareai (version 2.3.0)

hcai_impute: Specify imputation methods for an existing recipe

Description

`hcai_impute` adds imputation steps to an existing recipe. Supported methods are mean (numeric only), new_category (categorical only), bagged trees, knn, and locf.
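To make the two default methods concrete, here is a minimal base-R sketch of what mean imputation and new-category imputation do conceptually. It is an illustration only, not the package's implementation: the toy data frame and the "missing" label are assumptions made for this sketch.

```r
# Conceptual sketch only; not the healthcareai implementation.
d <- data.frame(
  age = c(34, NA, 61, 47),
  hemoglobin_category = c("Low", NA, "High", "Normal"),
  stringsAsFactors = FALSE
)

# "mean": replace missing numeric values with the mean of the observed values
d$age[is.na(d$age)] <- mean(d$age, na.rm = TRUE)

# "new_category": treat missingness as its own category
# (the label "missing" is an assumption made for this sketch)
d$hemoglobin_category[is.na(d$hemoglobin_category)] <- "missing"

d
```

In a recipe these replacements are learned when the recipe is trained with prep() and applied with bake(), so the same fill values are reused on new data.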

Usage

hcai_impute(recipe, nominal_method = "new_category",
  numeric_method = "mean", numeric_params = NULL,
  nominal_params = NULL)

Arguments

recipe

A recipe object. Imputation will be added to the sequence of operations for this recipe.

nominal_method

Defaults to "new_category". Other choices are "bagimpute", "knnimpute" or "locfimpute".

numeric_method

Defaults to "mean". Other choices are "bagimpute", "knnimpute" or "locfimpute".

numeric_params

A named list of parameters to use with the chosen imputation method on numeric data. Options are bag_model (bagimpute only), bag_trees (bagimpute only), bag_options (bagimpute only), knn_K (knnimpute only), impute_with (knnimpute only), or seed_val (bag or knn). See step_bagimpute or step_knnimpute for details.

nominal_params

A named list of parameters to use with the chosen imputation method on nominal data. Options are bag_model (bagimpute only), bag_trees (bagimpute only), bag_options (bagimpute only), knn_K (knnimpute only), impute_with (knnimpute only), or seed_val (bag or knn). See step_bagimpute or step_knnimpute for details.
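The "locfimpute" choice accepted by nominal_method and numeric_method refers to last observation carried forward. The base-R sketch below shows that idea on a plain vector. The helper name `locf` is hypothetical and exists only for this illustration; how the package itself treats edge cases such as leading missing values is not covered here.

```r
# Conceptual sketch of last observation carried forward (LOCF).
# `locf` is a hypothetical helper, not a healthcareai function.
locf <- function(x) {
  # Index of the most recent non-missing value at each position (0 if none yet)
  idx <- cummax(seq_along(x) * !is.na(x))
  # Positions before the first observed value have nothing to carry forward
  idx[idx == 0] <- NA
  x[idx]
}

locf(c(NA, 1, NA, NA, 4, NA))
locf(c("Low", NA, "High", NA))
```

Because LOCF depends on row order, it is most meaningful when rows are sorted, for example by time within a patient.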

Value

An updated version of `recipe` with the new step added to the sequence of existing steps.

Examples

# NOT RUN {
library(healthcareai)  # provides hcai_impute() and missingness()
library(recipes)

n <- 100
set.seed(9)
d <- tibble::tibble(patient_id = 1:n,
            age = sample(c(30:80, NA), size = n, replace = TRUE),
            hemoglobin_count = rnorm(n, mean = 15, sd = 1),
            hemoglobin_category = sample(c("Low", "Normal", "High", NA),
                                         size = n, replace = TRUE),
            disease = ifelse(hemoglobin_count < 15, "Yes", "No")
)

# Initialize
my_recipe <- recipe(disease ~ ., data = d)

# Create recipe
my_recipe <- my_recipe %>%
  hcai_impute()
my_recipe

# Train recipe
trained_recipe <- prep(my_recipe, training = d)

# Apply recipe
data_modified <- bake(trained_recipe, new_data = d)
missingness(data_modified)


# Specify methods:
my_recipe <- my_recipe %>%
  hcai_impute(numeric_method = "bagimpute",
    nominal_method = "locfimpute")
my_recipe

# Specify methods and params:
my_recipe <- my_recipe %>%
  hcai_impute(numeric_method = "knnimpute",
    numeric_params = list(knn_K = 4))
my_recipe
# }
