
`step_best_normalize` creates a specification of a recipe step (see the `recipes` package) that transforms data using the best of a suite of normalizing transformations, chosen (by default) by cross-validation.
step_best_normalize(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  transform_info = NULL,
  transform_options = list(),
  num_unique = 5,
  skip = FALSE,
  id = rand_id("best_normalize")
)

# S3 method for step_best_normalize
tidy(x, ...)

# S3 method for step_best_normalize
axe_env(x, ...)
An updated version of `recipe` with the new step added to the sequence of existing steps (if any). For the `tidy` method, a tibble with columns `terms` (the selectors or variables selected) and `value` (the lambda estimate).
`recipe`: A formula or recipe.
`...`: One or more selector functions to choose which variables are affected by the step. See [selections()] for more details. For the `tidy` method, these are not currently used.
`role`: Not used by this step since no new variables are created.
`trained`: For recipes functionality.
`transform_info`: A numeric vector of transformation values. This is `NULL` until computed by [prep.recipe()].
`transform_options`: Options to be passed to `bestNormalize`.
`num_unique`: An integer; variables with fewer unique values than this will not be evaluated for a transformation.
`skip`: For recipes functionality.
`id`: For recipes functionality.
`x`: A `step_best_normalize` object.
The bestNormalize transformation can be used to rescale a variable so that it is more similar to a normal distribution. See `?bestNormalize` for more information; `step_best_normalize` is the implementation of `bestNormalize` in the `recipes` context.
As of version 1.7, the `butcher` package can be used to (hopefully) improve scalability of this function on bigger data sets.
See also: `bestNormalize`, `orderNorm`, [recipe()], [prep.recipe()], [bake.recipe()]
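library(recipes)
library(bestNormalize)  # provides step_best_normalize()

# Define a recipe over all iris columns and add the step for numeric ones
rec <- recipe(~ ., data = as.data.frame(iris))
bn_trans <- step_best_normalize(rec, all_numeric())

# Estimate the transformations on the training data, then apply them
bn_estimates <- prep(bn_trans, training = as.data.frame(iris))
bn_data <- bake(bn_estimates, as.data.frame(iris))

# Compare the distribution of one variable before and after
plot(density(iris[, "Petal.Length"]), main = "before")
plot(density(bn_data$Petal.Length), main = "after")

# Summarize the step before and after it has been trained
tidy(bn_trans, number = 1)
tidy(bn_estimates, number = 1)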
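The `transform_options` and `num_unique` arguments are not exercised in the example above. The following is a minimal sketch of how they might be used, assuming that `allow_orderNorm`, `k`, and `r` are arguments of `bestNormalize()` in the installed version (check `?bestNormalize`); the particular values are illustrative only, not recommendations.

```r
library(recipes)
library(bestNormalize)

# Cross-validation folds are drawn at random, so fix the seed for repeatability
set.seed(1)

rec <- recipe(~ ., data = iris)

# Assumed bestNormalize() arguments, forwarded through transform_options:
#   allow_orderNorm = FALSE  leave orderNorm out of the candidate set
#   k = 5, r = 1             5-fold cross-validation, repeated once
# num_unique = 10 skips variables with fewer than 10 unique values
bn_opts <- step_best_normalize(
  rec,
  all_numeric(),
  transform_options = list(allow_orderNorm = FALSE, k = 5, r = 1),
  num_unique = 10
)

# Estimate the transformations, then apply them to the same data
bn_fit <- prep(bn_opts, training = iris)
bn_out <- bake(bn_fit, new_data = iris)

# Summarize the transformation selected for each variable
tidy(bn_fit, number = 1)
```

Reducing `k` and `r` lowers the number of cross-validation fits, which may shorten `prep()` on larger data at the cost of a noisier choice among transformations.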