embed (version 0.1.2)

dictionary: Weight of evidence dictionary

Description

Builds the weight of evidence (WoE) dictionary for a set of predictor variables against a given binary outcome. Useful for creating a WoE version of those predictors and for adjusting individual WoE values by hand.

Usage

dictionary(.data, outcome, ..., Laplace = 1e-06)

Arguments

.data

A tbl. The data.frame where the variables come from.

outcome

The bare name of the outcome variable with exactly 2 distinct values.

...

Bare names of predictor variables, or selectors accepted by dplyr::select().

Laplace

Defaults to 1e-6. The pseudocount parameter of the Laplace smoothing estimator. It prevents -Inf/Inf values when a predictor category contains only one outcome class. Set to 0 to allow Inf/-Inf.
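To illustrate what the pseudocount does, here is a minimal base R sketch. The helper woe_sketch() is hypothetical and not part of embed. The smoothing form (adding the pseudocount to each count and twice the pseudocount to each class total) and the sign convention are assumptions chosen for illustration, so the numbers may differ from what dictionary() returns.

```r
# Hypothetical helper, NOT part of embed: WoE per category with a pseudocount.
# The smoothing form and the sign convention below are assumptions.
woe_sketch <- function(x, y, laplace = 1e-6) {
  y <- factor(y)
  stopifnot(nlevels(y) == 2)
  counts <- table(x, y)   # rows: predictor categories, columns: outcome classes
  n0 <- counts[, 1]
  n1 <- counts[, 2]
  # Smoothed within-class proportions; a positive pseudocount keeps both above 0.
  p0 <- (n0 + laplace) / (sum(n0) + 2 * laplace)
  p1 <- (n1 + laplace) / (sum(n1) + 2 * laplace)
  log(p0 / p1)
}

x <- c("a", "a", "a", "b", "b", "c")
y <- c(0, 0, 1, 0, 1, 1)

# Category "c" only ever occurs with outcome 1.
smoothed   <- woe_sketch(x, y)               # finite for every category
unsmoothed <- woe_sketch(x, y, laplace = 0)  # -Inf for category "c"
```

With the default pseudocount the value for "c" is a large but finite negative number. With laplace = 0 it becomes -Inf, which is the behaviour the argument description refers to.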

Value

A tibble with summaries and WoE values for every given predictor variable, stacked into a single table.

Details

You can pass a custom dictionary to step_woe(). It must have exactly the same structure as the output of dictionary(). An easy way to build one is to modify the tibble that dictionary() returns.
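A minimal sketch of that workflow follows, assuming the dplyr, recipes and embed packages are installed. It also assumes that the returned tibble has a column named woe and that step_woe() accepts outcome = vars(...) and a dictionary argument. The row index and the replacement value are arbitrary.

```r
library(dplyr)
library(recipes)
library(embed)

# Store the binary outcome and the predictor as factors.
cars <- mtcars %>% mutate(am = factor(am), cyl = factor(cyl))

# Build the dictionary for one predictor.
woe_dict <- cars %>% dictionary("am", cyl)

# Overwrite one WoE value by hand; the structure of the tibble is unchanged.
woe_dict_custom <- woe_dict
woe_dict_custom[1, "woe"] <- 1.23

# Pass the tweaked dictionary to step_woe() instead of letting it compute one.
rec <- recipe(am ~ cyl, data = cars) %>%
  step_woe(cyl, outcome = vars(am), dictionary = woe_dict_custom)
```

Editing a copy keeps the original dictionary available for comparison, and changing only cell values preserves the column layout that step_woe() expects.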

References

Kullback, S. (1959). Information Theory and Statistics. Wiley, New York.

Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning, Second Edition. Springer, New York.

Good, I. J. (1985). Weight of evidence: A brief survey. Bayesian Statistics, 2, pp. 249-270.

Examples

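# NOT RUN {
library(dplyr)  # provides the %>% pipe
library(embed)  # provides dictionary()

# WoE dictionary for cyl and for the columns gear through carb, with am as the outcome
mtcars %>% dictionary("am", cyl, gear:carb)
# }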
