Calculate Population Stability Index (PSI)
get_psi calculates the Population Stability Index (PSI) of an independent variable.
get_psi_all loops through all specified independent variables and calculates the PSI of each.
get_psi_all(dat, x_list = NULL, target = NULL, dat_test = NULL,
  breaks_list = NULL, occur_time = NULL, start_date = NULL,
  cut_date = NULL, oot_pct = 0.7, pos_flag = NULL,
  parallel = FALSE, ex_cols = NULL, as_table = FALSE, g = 10,
  bins_no = TRUE, note = FALSE)

get_psi(dat, x, target = NULL, dat_test = NULL, occur_time = NULL,
  start_date = NULL, cut_date = NULL, pos_flag = NULL,
  breaks = NULL, breaks_list = NULL, oot_pct = 0.7, g = 10,
  as_table = TRUE, note = FALSE, bins_no = TRUE)
dat: A data.frame with independent variables and target variable.
x_list: Names of independent variables.
target: The name of the target variable.
dat_test: A data.frame of test data. Default is NULL.
breaks_list: A table containing a list of splitting points for each independent variable. Default is NULL.
occur_time: The name of the variable that represents the time at which each observation takes place.
start_date: The earliest occurrence time of observations.
cut_date: Time points for splitting data sets, e.g. splitting the Actual and Expected data sets.
oot_pct: Percentage of observations retained for the out-of-time (OOT) test (especially to calculate PSI). Default is 0.7.
pos_flag: Value of the positive class. Default is "1".
parallel: Logical, parallel computing. Default is FALSE.
ex_cols: Names of excluded variables. Regular expressions can also be used to match variable names. Default is NULL.
as_table: Logical, output results in a table. Default is TRUE for get_psi and FALSE for get_psi_all.
g: Number of initial breakpoints for equal frequency binning. Default is 10.
bins_no: Logical, add serial numbers to bins. Default is TRUE.
note: Logical, outputs info. Default is FALSE.
x: The name of an independent variable.
breaks: Splitting points for an independent variable. Default is NULL.
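The time-related arguments (occur_time, cut_date, oot_pct) describe how a single data set can be divided into an earlier Expected part and a later Actual part. The base R sketch below illustrates that idea only. The helper name split_by_time and its pct argument are hypothetical, and the assumption that pct is the share of earliest observations kept in the Expected part is not confirmed by this page; the package may define its split differently.

```r
# Hypothetical helper (not part of creditmodel): split a data.frame into an
# earlier "expected" part and a later "actual" part by occurrence time.
split_by_time <- function(dat, occur_time, cut_date = NULL, pct = 0.7) {
  times <- as.Date(dat[[occur_time]])
  if (is.null(cut_date)) {
    # Assumption: with no explicit cut date, cut at the date below which
    # roughly `pct` of the observations fall.
    cut_date <- sort(times)[ceiling(pct * length(times))]
  }
  cut_date <- as.Date(cut_date)
  list(
    expected = dat[times <= cut_date, , drop = FALSE],
    actual   = dat[times >  cut_date, , drop = FALSE],
    cut_date = cut_date
  )
}

# Toy data: ten consecutive days, one observation per day
toy <- data.frame(apply_date = as.Date("2024-01-01") + 0:9, x = 1:10)
parts <- split_by_time(toy, "apply_date", pct = 0.7)
nrow(parts$expected)  # 7
nrow(parts$actual)    # 3
```

Passing an explicit cut_date skips the quantile step and splits at that date instead.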
PSI rules for evaluating the stability of a predictor:
Less than 0.02: Very stable
0.02 to 0.1: Stable
0.1 to 0.2: Unstable
0.2 to 0.5: Change
More than 0.5: Great change
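To make the rules above concrete, here is a minimal base R sketch of the commonly used PSI formula, the sum over bins of (actual share - expected share) * log(actual share / expected share), using equal-frequency bins derived from the expected sample. The helpers psi_sketch and psi_label are hypothetical and are not part of creditmodel; the package's own binning and edge-case handling may differ. Which side of each threshold is inclusive is also an assumption, since the rules do not say.

```r
# Hypothetical helper (not the package's implementation): PSI of a numeric
# variable between an expected and an actual sample.
psi_sketch <- function(expected, actual, g = 10) {
  expected <- expected[!is.na(expected)]
  actual   <- actual[!is.na(actual)]
  # Interior equal-frequency breakpoints taken from the expected sample
  probs  <- seq(0, 1, length.out = g + 1)[-c(1, g + 1)]
  inner  <- unique(quantile(expected, probs = probs, names = FALSE))
  breaks <- c(-Inf, inner, Inf)
  # Share of observations per bin in each sample
  e_pct <- as.numeric(table(cut(expected, breaks))) / length(expected)
  a_pct <- as.numeric(table(cut(actual, breaks))) / length(actual)
  # Assumption: floor empty bins at a small value to avoid log(0)
  eps   <- 1e-6
  e_pct <- pmax(e_pct, eps)
  a_pct <- pmax(a_pct, eps)
  sum((a_pct - e_pct) * log(a_pct / e_pct))
}

# Hypothetical helper: map a PSI value to the rule labels listed above.
# Assumption: each lower bound is inclusive.
psi_label <- function(psi) {
  as.character(cut(psi,
    breaks = c(-Inf, 0.02, 0.1, 0.2, 0.5, Inf),
    labels = c("Very stable", "Stable", "Unstable", "Change", "Great change"),
    right = FALSE))
}

set.seed(1)
expected <- rnorm(1000)
shifted  <- rnorm(1000, mean = 0.5)
psi_label(psi_sketch(expected, expected))  # identical samples give PSI 0: "Very stable"
psi_sketch(expected, shifted)              # a shifted sample gives a positive PSI
```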
library(creditmodel)

# dat_test is NULL
get_psi(dat = UCICreditCard, x = "PAY_3", occur_time = "apply_date")

# dat_test is not NULL
# out-of-time train/test split
train_test = train_test_split(dat = UCICreditCard, prop = 0.7, split_type = "OOT",
                              occur_time = "apply_date", start_date = NULL, cut_date = NULL,
                              save_data = FALSE, note = FALSE)
dat_ex = train_test$train  # expected data set
dat_ac = train_test$test   # actual data set

# generate PSI table
get_psi(dat = dat_ex, dat_test = dat_ac, x = "PAY_3",
        occur_time = "apply_date", bins_no = TRUE)