update.EffectData: Update "EffectData" Object

Description

Updates an "EffectData" object by

turning discrete values into factors (especially useful with the next option),
collapsing levels of categorical variables with many levels,
dropping empty bins,
dropping small bins,
dropping bins with missing name, or
sorting the variables by their importance, see effect_importance().

Except for sort_by, all arguments are vectorized, i.e., you can pass a vector or list of the same length as object.

Usage

# S3 method for EffectData
update(
  object,
  sort_by = c("no", "pd", "pred_mean", "y_mean", "resid_mean", "ale"),
  to_factor = FALSE,
  collapse_m = 30L,
  collapse_by = c("weight", "N"),
  drop_empty = FALSE,
  drop_below_n = 0,
  drop_below_weight = 0,
  na.rm = FALSE,
  ...
)

Value

A modified object of class "EffectData".

Arguments

object: Object of class "EffectData".
sort_by: By which statistic ("pd", "pred_mean", "y_mean", "resid_mean", "ale") should the results be sorted? The default is "no" (no sorting). The importance used for sorting is calculated after all other update steps, e.g., after collapsing or dropping rare levels.
to_factor: Should discrete features be treated as factors? In combination with collapse_m, this can be used to collapse rare values of discrete numeric features.
collapse_m: If a factor or character feature has more than collapse_m levels, rare levels are collapsed into a new level "other". Standard deviations are collapsed as the square root of the weighted average of the variances. The default is 30. Set to Inf for no collapsing.
collapse_by: How to determine "rare" levels in collapse_m? Either "weight" (default) or "N". Only matters in situations with case weights w.
drop_empty: Drop empty bins. Equivalent to drop_below_n = 1.
drop_below_n: Drop bins with N below this value. Applied after collapsing.
drop_below_weight: Drop bins with weight below this value. Applied after collapsing.
na.rm: Should missing bin centers be dropped? Default is FALSE.
...: Currently not used.
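
The rule for collapsing standard deviations described under collapse_m can be illustrated in base R. This is only a sketch of the stated formula (square root of the weighted average of the variances), not the package's internal code. The data frame `rare` and its numbers are made up, and using the bin weights as averaging weights is an assumption.

```r
# Hypothetical per-level statistics of three rare levels (made-up numbers)
rare <- data.frame(
  level  = c("a", "b", "c"),
  weight = c(2, 3, 5),
  y_sd   = c(0.5, 1.0, 2.0)
)

# Weight of the collapsed level "other": sum of the individual weights
weight_other <- sum(rare$weight)

# Collapsed standard deviation: root of the weighted average of the variances
# (assumption: the averaging weights are the bin weights)
sd_other <- sqrt(weighted.mean(rare$y_sd^2, w = rare$weight))

print(data.frame(level = "other", weight = weight_other, y_sd = sd_other))
```

Note that a value combined this way pools the within-level variances only. It does not add the variation between the means of the collapsed levels.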

Examples


fit <- lm(Sepal.Length ~ ., data = iris)
xvars <- colnames(iris)[-1]
feature_effects(fit, v = xvars, data = iris, y = "Sepal.Length", breaks = 5) |>
  update(sort_by = "pd", collapse_m = 2) |>
  plot()
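
As stated in the Description, all arguments except sort_by can be given per feature. The following sketch assumes the effectplots package is installed and that the elements of the object follow the order of `xvars`. The chosen values are purely illustrative: only the last feature (Species) is collapsed, and empty bins are dropped only for the three numeric features.

```r
library(effectplots)

fit <- lm(Sepal.Length ~ ., data = iris)
xvars <- colnames(iris)[-1]  # Sepal.Width, Petal.Length, Petal.Width, Species

eff <- feature_effects(fit, v = xvars, data = iris, y = "Sepal.Length", breaks = 5)

# One value per feature (assumed to be in the order of xvars)
eff2 <- update(
  eff,
  collapse_m = c(Inf, Inf, Inf, 2),          # collapse only Species
  drop_empty = c(TRUE, TRUE, TRUE, FALSE)    # drop empty bins of numeric features
)

plot(eff2)
```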
