generate_data: Generate data following different missingness mechanisms
Description
This function generates simulated features and labels following one of several missingness mechanisms.
Usage
generate_data(n = 100, d = 3, type = "MAR well specified")
Value
A data frame containing the features as columns x.1, ..., x.d and the labels as column y.
Arguments
n
The number of samples to return.
d
The dimension of samples to return.
type
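The missingness mechanism used to generate the labels (default "MAR well specified"). See Details for the mechanisms covered: MAR well-specified, MAR misspecified and MNAR.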
Author
Adrien Ehrhardt
Details
This function draws the features from a uniform(0,1) distribution, then
generates the labels y according to a logistic regression whose inputs depend
on type: the features themselves, with a random -1/1 parameter for each
coordinate (MAR well-specified); the square of the features (MAR
misspecified); or the features together with some additional feature, also
drawn from U(0,1) (MNAR).
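The three mechanisms described above can be sketched in base R as follows. This is a minimal illustration and not the package's source code: the function name sketch_generate_data and its mechanism argument are hypothetical, and the absence of an intercept, the 0/1 coding of y, and the random -1/1 coefficient given to the additional MNAR feature are assumptions made for the sketch.

```r
# Hypothetical sketch of the mechanisms described in Details.
# Not the package implementation: names and several choices are assumptions.
sketch_generate_data <- function(n = 100, d = 3,
                                 mechanism = c("well", "misspecified", "mnar")) {
  mechanism <- match.arg(mechanism)

  # Features drawn from U(0, 1), one row per sample
  x <- matrix(runif(n * d), nrow = n, ncol = d)

  # One random coefficient in {-1, 1} per coordinate
  theta <- sample(c(-1, 1), size = d, replace = TRUE)

  # Linear predictor of the logistic regression (assumed: no intercept)
  eta <- switch(mechanism,
    well = x %*% theta,            # logistic in the features themselves
    misspecified = x^2 %*% theta,  # logistic in the squared features
    mnar = {
      # Additional U(0, 1) feature that influences y
      # (assumed: it also gets a random -1/1 coefficient)
      hidden <- runif(n)
      x %*% theta + sample(c(-1, 1), size = 1) * hidden
    }
  )

  # Inverse logit, then Bernoulli draws (assumed: labels coded 0/1)
  p <- as.vector(1 / (1 + exp(-eta)))
  y <- rbinom(n, size = 1, prob = p)

  # A matrix passed as x yields columns x.1, ..., x.d when d > 1
  data.frame(x = x, y = y)
}

set.seed(42)
head(sketch_generate_data(n = 5, d = 2, mechanism = "misspecified"))
```

In this sketch the additional MNAR feature is used to draw y but is not returned, so a model fitted on x.1, ..., x.d alone cannot account for it.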
References
Ehrhardt, A., Biernacki, C., Vandewalle, V., Heinrich, P. and Beben, S. (2018), Reject Inference Methods in Credit Scoring: a rational review,