Data-generating function that creates artificial data sets for a classification problem with two response classes, denoted as "A" and "B".
dgp_twoclass(n = 100, p = 4, noise = 16, rho = 0,
b0 = 0, b = rep(1, p), fx = identity)
A data.frame including a column named class that is a factor with two levels, "A" and "B". All other columns represent the predictor variables (signal predictors followed by noise predictors) and are named "x1", "x2", etc.
n: integer. Number of observations. The default is 100.
p: integer. Number of signal predictors. The default is 4.
noise: integer. Number of noise predictors. The default is 16.
rho: numeric value between -1 and 1 specifying the correlation between the signal predictors. The correlation between two predictors is rho^k, where k is an integer determined by the Toeplitz structure. The default is 0 (no correlation between predictors).
b0: numeric value. Baseline probability for class "B" on the logit scale. The default is 0.
b: numeric vector. Slope parameters for the signal predictors on the logit scale. The default is rep(1, p), i.e., 1 for all signal predictors.
fx: a function that is used to transform the predictors. The default is identity (equivalent to no transformation).
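The Toeplitz correlation structure described for rho can be illustrated in base R. This is a minimal sketch, assuming the exponent k is the index distance |i - j| between signal predictors i and j; the object name Sigma is illustrative, and this is not necessarily how dgp_twoclass builds the matrix internally.

```r
# Hypothetical illustration of a Toeplitz correlation matrix.
# Assumption: the correlation between signal predictors i and j is rho^|i - j|.
rho <- 0.5
p <- 4

# toeplitz() (stats package, attached by default) builds a symmetric matrix
# from its first row, here rho^0, rho^1, ..., rho^(p - 1).
Sigma <- toeplitz(rho^(0:(p - 1)))
Sigma
```

Under this assumption, rho = 0 yields the identity matrix (R evaluates 0^0 as 1), which matches the documented default of no correlation between predictors.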
See also: stability
dgp_twoclass(n = 200, p = 6, noise = 4)
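To make the roles of b0, b, and fx concrete, the following sketch simulates data with the documented return structure. It is not the implementation of dgp_twoclass: the function name sketch_twoclass is hypothetical, the predictors are assumed to be independent standard normal (the rho = 0 case), and the probability of class "B" is assumed to be the inverse logit of b0 plus the weighted sum of the transformed signal predictors.

```r
# Hypothetical sketch of a two-class logistic data-generating process.
# Assumptions (not taken from the package source):
#   * all predictors are independent standard normal (rho = 0),
#   * P(class = "B") = plogis(b0 + fx(signal predictors) %*% b).
sketch_twoclass <- function(n = 100, p = 4, noise = 16,
                            b0 = 0, b = rep(1, p), fx = identity) {
  # Signal predictors come first, followed by the noise predictors.
  x <- matrix(rnorm(n * (p + noise)), nrow = n)
  colnames(x) <- paste0("x", seq_len(p + noise))

  # Linear predictor on the logit scale uses only the signal columns.
  eta <- b0 + drop(fx(x[, seq_len(p), drop = FALSE]) %*% b)
  prob_b <- plogis(eta)

  # Draw the class labels; "A" is the first factor level.
  class <- factor(ifelse(runif(n) < prob_b, "B", "A"), levels = c("A", "B"))
  data.frame(class = class, x)
}

set.seed(1)
d <- sketch_twoclass(n = 200, p = 6, noise = 4)
str(d[, 1:3])
table(d$class)
```

With p = 6 and noise = 4, the result has one class column plus ten predictor columns, x1 through x10, of which only x1 to x6 influence the class.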