power_transform
Transforms numeric values to normality.
power_transform(x, transformer = NULL, oob_action = "na", ...)
Returns a vector of transformed values of x.
x
A vector with numeric values that should be transformed to normality.
transformer
A transformer object created using find_transformation_parameters. If NULL, a transformer is generated internally.
oob_action
Action that should be taken when out-of-bounds values are encountered in x. Examples of such values are zero or negative values for Box-Cox transformations.
na (default): replaces out-of-bounds values by NA values.
valid: replaces out-of-bounds values by the closest valid boundary values.
This argument has no effect for Yeo-Johnson transformations.
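The difference between the two actions can be illustrated with a small base R sketch. The helper replace_oob and the fixed boundary lower are hypothetical and not part of the package. How the package determines the valid boundary is not described here, so a known lower bound is assumed.

```r
# Hypothetical helper, not part of the package: handles values that fall
# below an assumed lower bound of the valid range.
replace_oob <- function(x, lower, oob_action = c("na", "valid")) {
  oob_action <- match.arg(oob_action)
  is_oob <- !is.na(x) & x < lower

  if (oob_action == "na") {
    # Replace out-of-bounds values by NA.
    x[is_oob] <- NA_real_
  } else {
    # Replace out-of-bounds values by the closest valid boundary value.
    x[is_oob] <- lower
  }

  x
}

x <- c(-1.0, 0.0, 0.5, 2.0)
x_na <- replace_oob(x, lower = 0.1, oob_action = "na")
x_valid <- replace_oob(x, lower = 0.1, oob_action = "valid")
```

With "na" the two out-of-bounds entries become missing, whereas with "valid" they are moved onto the assumed boundary and remain usable downstream.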
...
Arguments passed on to find_transformation_parameters
method
One of the following methods for power transformation:
box_cox: Transformation using the Box-Cox transformation (Box and Cox, 1964). The Box-Cox transformation requires that all data are strictly positive. Features that contain zero or negative values cannot be transformed using this transformation. In their work, Box and Cox define a shifted variant. We use this variant to shift values to a strictly positive range when negative values are present. The Box-Cox transformation relies on a single parameter lambda, which is estimated through maximisation of the log-likelihood function corresponding to a normal distribution.
yeo_johnson: Transformation using the Yeo-Johnson transformation (Yeo and Johnson, 2000). Unlike the Box-Cox transformation, the Yeo-Johnson transformation allows for negative and positive values. Like the Box-Cox transformation, this transformation relies on a single parameter lambda, which is estimated through maximisation of the log-likelihood function corresponding to a normal distribution.
none: A fall-back method that will not transform values.
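As an illustration of the two transformations and of estimating lambda by maximum likelihood, the following base R sketch may help. It is not the package's implementation: it uses the conventional (non-shifted, non-robust) definitions, the function names are hypothetical, and the search interval passed to stats::optimise is an assumption.

```r
# Conventional Box-Cox transformation for strictly positive x.
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Conventional Yeo-Johnson transformation for any real x.
yeo_johnson <- function(x, lambda) {
  y <- numeric(length(x))
  pos <- x >= 0

  if (abs(lambda) < 1e-8) {
    y[pos] <- log1p(x[pos])
  } else {
    y[pos] <- ((x[pos] + 1)^lambda - 1) / lambda
  }

  if (abs(lambda - 2) < 1e-8) {
    y[!pos] <- -log1p(-x[!pos])
  } else {
    y[!pos] <- -((1 - x[!pos])^(2 - lambda) - 1) / (2 - lambda)
  }

  y
}

# Profile log-likelihood of lambda under a normal model for the
# Box-Cox-transformed values (constant terms dropped).
box_cox_log_lik <- function(lambda, x) {
  y <- box_cox(x, lambda)
  n <- length(x)
  sigma_sq <- mean((y - mean(y))^2)
  -n / 2 * log(sigma_sq) + (lambda - 1) * sum(log(x))
}

set.seed(1)
x <- exp(stats::rnorm(1000))

# Maximise the log-likelihood over an assumed search interval.
fit <- stats::optimise(
  box_cox_log_lik,
  interval = c(-4, 6),
  x = x,
  maximum = TRUE
)
lambda_hat <- fit$maximum
```

Because x is log-normally distributed here, the estimated lambda should lie close to 0, which corresponds to a log transformation.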
robust
Flag for using a robust version of Box-Cox or Yeo-Johnson transformation, as defined by Raymaekers and Rousseeuw (2021). This version is less sensitive to the presence of outliers.
invariant
Flag for using a version of Box-Cox or Yeo-Johnson transformation that simultaneously optimises location and scale in addition to the lambda parameter.
lambda
Single lambda value, or range of lambda values that should be considered. Default: c(-4.0, 6.0). Can be NULL to force optimisation without a constraint in lambda values.
empirical_gof_normality_p_value
Significance value for the empirical goodness-of-fit test for central normality. The p-value is computed through the assess_transformation function. By setting this parameter to a numeric value other than NULL, the transformation will be rejected when the p-value of the test is below the significance value.
See also find_transformation_parameters.
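# Simulate log-normally distributed data.
x <- exp(stats::rnorm(1000))

# Transform the data to normality using the Box-Cox method.
y <- power_transform(
  x = x,
  method = "box_cox"
)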
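The transformer argument makes it possible to estimate transformation parameters on one data set and apply them to another. The following is a minimal sketch, assuming that the functions are provided by a package named power.transform (the package name is not stated here) and that find_transformation_parameters accepts x and method as named arguments. The variable names x_train and x_new are illustrative.

```r
# Assumption: the documented functions come from a package named
# power.transform. The package calls are skipped when it is not installed.
set.seed(1)
x_train <- exp(stats::rnorm(1000))
x_new <- exp(stats::rnorm(10))

if (requireNamespace("power.transform", quietly = TRUE)) {
  # Estimate the transformation parameters once.
  transformer <- power.transform::find_transformation_parameters(
    x = x_train,
    method = "box_cox"
  )

  # Apply the same transformation to new data.
  y_new <- power.transform::power_transform(
    x = x_new,
    transformer = transformer
  )
}
```

Reusing a transformer in this way keeps the transformation of new data consistent with the data on which the parameters were estimated.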