- df
(required; dataframe, tibble, or sf) A dataframe with responses
(optional) and predictors. Must have at least 10 rows for pairwise
correlation analysis, and at least 10 * (length(predictors) - 1)
rows for VIF. Default: NULL.
- response
(optional; character string) Name of a numeric response variable in df. Default: NULL.
- predictors
(optional; character vector or NULL) Names of the
predictors in df. If NULL, all columns except responses and
constant/near-zero-variance columns are used. Default: NULL.
- encoding_method
(optional; character vector or NULL) Names of the target encoding methods. One or several of: "mean", "rank", "loo". If NULL, target encoding is skipped and df is returned unmodified. Default: "loo".
- smoothing
(optional; integer vector) Argument of the method "mean". Groups with fewer cases than this number have their means pulled towards the mean of the response across all cases. Default: 0.
- overwrite
(optional; logical) If TRUE, the original predictors in df are overwritten with their encoded versions; in this case only one encoding method, smoothing, white noise, and seed are allowed. If FALSE, encoded predictors are added to df as new columns with descriptive names. Default: FALSE.
- quiet
(optional; logical) If FALSE, messages are printed. Default: FALSE.
- ...
(optional) Internal args (e.g. function_name for
validate_arg_function_name, a precomputed correlation matrix
m, or cross-validation args for preference_order).
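
To make the encoding_method and smoothing arguments more concrete, the following base R sketch shows what the "mean" and "loo" (leave-one-out) encodings typically compute. It does not call the package and is not its implementation. The toy data, the column names y and g, the fallback to the global mean for single-row groups, and the weighted-average smoothing formula are all assumptions made for illustration.

```r
# Toy data (hypothetical): numeric response y and categorical predictor g.
df <- data.frame(
  y = c(1, 2, 3, 10, 20, 30, 100),
  g = c("a", "a", "a", "b", "b", "b", "c")
)

global_mean <- mean(df$y)
group_sum   <- ave(df$y, df$g, FUN = sum)
group_n     <- ave(df$y, df$g, FUN = length)

# "mean": each category is replaced by the mean of the response in its group.
group_mean <- group_sum / group_n

# "loo": the group mean computed without the current row.
# Single-row groups have no other cases, so this sketch falls back to the
# global mean (an assumption, not documented package behaviour).
loo <- ifelse(
  group_n > 1,
  (group_sum - df$y) / (group_n - 1),
  global_mean
)

# Smoothing (assumed formulation): a weighted average of the group mean and
# the global mean, so that small groups are pulled towards the global mean.
# With smoothing = 0 the result equals the plain group mean.
smooth_mean <- function(smoothing) {
  (group_n * group_mean + smoothing * global_mean) / (group_n + smoothing)
}

encoded <- data.frame(
  df,
  g_mean          = group_mean,
  g_loo           = loo,
  g_mean_smooth_3 = smooth_mean(3)
)
print(encoded)
```

In this sketch the single-row group "c" is the one most affected by smoothing, which is the behaviour the smoothing argument describes for small groups.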