Split a dataset into training and validation subsets with respect to the uplift sample distribution.
SplitUplift(data, p, group)
data: a data frame of interest that contains at least the response and the treatment variables.
p: the desired proportion of observations to place in the training subset, given as a decimal between 0 and 1. The proportion is applied within each group, so the sample drawn from a group is proportional to the number of observations in that group.
group: the grouping variables. Generally, for uplift modelling, this should be a vector of the treatment and response variable names, e.g. c("treat", "y").
a training data frame containing a proportion $p$ of the observations
a validation data frame containing the remaining proportion $1-p$ of the observations
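To make the role of p and group concrete, here is a minimal base R sketch of a stratified split. It is an illustration only and not the tools4uplift implementation: the helper name stratified_split, the toy data frame, and the use of round() to set each group's sample size are all assumptions made for this example.

```r
# Hypothetical helper for illustration; NOT the tools4uplift implementation.
stratified_split <- function(data, p, group) {
  # One stratum label per row, built from the grouping columns.
  strata <- interaction(data[, group, drop = FALSE], drop = TRUE)
  # Row indices belonging to each stratum.
  rows_by_stratum <- split(seq_len(nrow(data)), strata)
  # Within each stratum, draw round(p * n) rows for training (assumed rounding rule).
  train_idx <- unlist(lapply(rows_by_stratum, function(idx) {
    idx[sample.int(length(idx), size = round(p * length(idx)))]
  }), use.names = FALSE)
  # A logical mask avoids the empty negative-index pitfall.
  is_train <- seq_len(nrow(data)) %in% train_idx
  list(train = data[is_train, , drop = FALSE],
       valid = data[!is_train, , drop = FALSE])
}

# Toy data (assumed): four treatment/response groups of sizes 80, 20, 60, 40.
set.seed(1)
toy <- data.frame(
  treat = rep(c(0, 1), each = 100),
  y     = rep(c(0, 1, 0, 1), times = c(80, 20, 60, 40))
)

parts <- stratified_split(toy, 0.8, c("treat", "y"))

# Group counts in the training subset: 80% of each original group.
train_counts <- table(parts$train$treat, parts$train$y)
print(train_counts)
```

Because the sampling is done separately inside each treatment/response group, both subsets keep roughly the same mix of treated and control responders and non-responders as the full data, which is the property the function description refers to.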
Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2019) Uplift Regression, <https://dms.umontreal.ca/~murua/research/UpliftRegression.pdf>
# NOT RUN {
library(tools4uplift)
data("SimUplift")
split <- SplitUplift(SimUplift, 0.8, c("treat", "y"))
train <- split[[1]]
valid <- split[[2]]
# }