scoringTools (version 0.1.3)

generate_data: Generate data following different missingness mechanisms

Description

This function generates data following different missingness mechanisms.

Usage

generate_data(n = 100, d = 3, type = "MAR well specified")

Value

A data frame containing the features as columns x.1, ..., x.d and the labels as y.

Arguments

n

The number of samples to return.

d

The dimension of each sample, i.e. the number of features.

type

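The type of data to generate, i.e. the missingness mechanism and model specification. The default is "MAR well specified"; see Details for the available mechanisms.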

Author

Adrien Ehrhardt

Details

This function draws features from a uniform distribution on (0, 1) and generates labels y from a logistic regression with a random parameter of -1 or 1 for each coordinate. Depending on type, the regression is applied to the features themselves (MAR well-specified), to their squares (MAR misspecified), or to the features plus an additional feature also drawn from U(0, 1) (MNAR).
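
The generating process described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name simulate_logistic_data, its squared argument, and the absence of an intercept are assumptions made for the example. It covers the two MAR cases and leaves out the additional feature used for MNAR.

```r
# Hypothetical helper (not part of scoringTools): a minimal sketch of the
# process described in Details, under the assumptions stated above.
simulate_logistic_data <- function(n = 100, d = 3, squared = FALSE) {
  # Features: an n x d matrix of independent U(0, 1) draws
  x <- matrix(runif(n * d), nrow = n, ncol = d)

  # One parameter per coordinate, drawn at random from {-1, 1}
  theta <- sample(c(-1, 1), size = d, replace = TRUE)

  # Well-specified case: the logistic model acts on x itself.
  # Misspecified case (assumed reading): it acts on x^2, while the
  # returned features are still x.
  design <- if (squared) x^2 else x

  # Logistic link, no intercept (an assumption of this sketch)
  p <- plogis(drop(design %*% theta))

  # Binary labels drawn from the resulting probabilities
  y <- rbinom(n, size = 1, prob = p)

  # A matrix without column names expands to x.1, ..., x.d
  data.frame(x = x, y = y)
}

set.seed(123)
sim <- simulate_logistic_data(n = 100, d = 3)
str(sim)
```

With squared = TRUE, a logistic regression fitted on x.1, ..., x.d is misspecified by construction, since the labels depend on the squared features.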

References

Ehrhardt, A., Biernacki, C., Vandewalle, V., Heinrich, P. and Beben, S. (2018), Reject Inference Methods in Credit Scoring: a rational review,

Examples

# We simulate data from financed clients
generate_data(n = 100, d = 3, type = "MAR well specified")
