drifter (version 0.2.1)

calculate_covariate_drift: Calculate Covariate Drift for two data frames

Description

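Covariate drift is defined here as the Non-Intersection Distance between two distributions. More formally, $$d(P,Q) = 1 - \sum_i \min(P_i, Q_i)$$, where $P_i$ and $Q_i$ are the proportions of observations falling into bin (or category) $i$ under each distribution. The distance ranges from 0 (identical distributions) to 1 (no overlap); the larger the distance, the more the two distributions differ.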
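To make the formula concrete, the following minimal sketch computes the Non-Intersection Distance for two already-discretized distributions. The helper `non_intersection_distance` is hypothetical and written only for illustration; it is not part of drifter and may differ from the package's internal implementation.

```r
# Hypothetical helper (not part of drifter): Non-Intersection Distance
# between two discrete distributions given as counts or proportions
# over the same bins.
non_intersection_distance <- function(p, q) {
  stopifnot(length(p) == length(q))
  p <- p / sum(p)  # normalize to proportions
  q <- q / sum(q)
  1 - sum(pmin(p, q))
}

p <- c(5, 3, 2)  # counts per bin in the "old" data
q <- c(2, 3, 5)  # counts per bin in the "new" data

non_intersection_distance(p, q)              # 1 - (0.2 + 0.3 + 0.2) = 0.3
non_intersection_distance(c(1, 0), c(0, 1))  # disjoint supports: 1
```

Normalizing inside the helper means raw counts and proportions give the same result.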

Usage

calculate_covariate_drift(data_old, data_new, bins = 20)

Arguments

data_old

data frame with `old` data

data_new

data frame with `new` data

bins

continuous variables are discretized into `bins` intervals of equal size
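As a rough illustration of how binning feeds into the distance for a continuous variable, the sketch below cuts both samples on a shared grid and compares the resulting bin proportions. The helper `binned_distance` is hypothetical, and the use of equal-width intervals over the pooled range is an assumption; the package may construct its intervals differently.

```r
# Hypothetical helper (not part of drifter): discretize two numeric
# vectors on a shared grid, then compute the Non-Intersection Distance.
# Assumption: equal-width intervals spanning the pooled range.
binned_distance <- function(x_old, x_new, bins = 20) {
  pooled <- c(x_old, x_new)
  breaks <- seq(min(pooled), max(pooled), length.out = bins + 1)
  # Proportion of observations per bin in each sample
  p <- as.numeric(table(cut(x_old, breaks, include.lowest = TRUE))) / length(x_old)
  q <- as.numeric(table(cut(x_new, breaks, include.lowest = TRUE))) / length(x_new)
  1 - sum(pmin(p, q))
}

set.seed(1)
x <- rnorm(1000)

binned_distance(x, x)      # same sample: distance is (numerically) zero
binned_distance(x, x + 2)  # shifted sample: clearly larger distance
```

Using one set of breaks for both samples keeps the bins comparable, which the formula requires.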

Value

an object of class `covariate_drift` (a data.frame) with Non-Intersection Distances

Examples

# NOT RUN {
library("drifter")
# DALEX provides the example datasets used below
library("DALEX")
# here we do not have any drift
d <- calculate_covariate_drift(apartments, apartments_test)
d
# here we do have drift
d <- calculate_covariate_drift(dragons, dragons_test)
d

# }