sortinghat (version 0.1)

simdata: Wrapper function to generate data from a variety of data-generating families for classification studies.

Description

We provide a wrapper function to generate random variates from any of the following data-generating families, named here by their `family` values: uniform, normal, t, contaminated, guo, and friedman.

Usage

simdata(family = c("uniform", "normal", "t", "contaminated", "guo", "friedman"),
    ...)

Arguments

family
the family of distributions from which to generate data
...
optional arguments that are passed to the data-generating function

Value

  • named list containing: a matrix of the generated multivariate observations and the class labels for each observation

Details

This wrapper function is useful for simulation studies in which the performance of supervised and unsupervised learning methods is evaluated. For each data-generating model, we generate $n_k$ observations $(k = 1, \ldots, K)$ from each of $K$ multivariate distributions.

Each family returns a list containing a matrix of the multivariate observations generated as well as the class labels for each observation.

For details about an individual data-generating family, please see its respective documentation.
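
The wrapper pattern described above can be illustrated with a minimal base-R sketch. It is not the sortinghat implementation: the helper names (`sim_normal_sketch`, `sim_uniform_sketch`, `simdata_sketch`), their arguments, and the list element names `x` and `y` are all assumptions made for illustration. It only shows how a `family` argument might be dispatched to a generator that draws $n_k$ observations from each of $K$ distributions and returns the observations together with their class labels.

```r
# Hypothetical generator: class k is drawn from a p-dimensional normal
# with identity covariance and every coordinate centred at means[k].
sim_normal_sketch <- function(n = c(10, 20), means = c(0, 1), p = 2, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  stopifnot(length(n) == length(means))
  x <- do.call(rbind, lapply(seq_along(n), function(k) {
    matrix(rnorm(n[k] * p, mean = means[k]), nrow = n[k], ncol = p)
  }))
  y <- factor(rep(seq_along(n), times = n))  # class label for each row of x
  list(x = x, y = y)
}

# Hypothetical generator: class k is uniform on a unit hypercube whose
# centre is shifted by (k - 1) * delta along every coordinate.
sim_uniform_sketch <- function(n = c(10, 20), delta = 1, p = 2, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  x <- do.call(rbind, lapply(seq_along(n), function(k) {
    centre <- (k - 1) * delta
    matrix(runif(n[k] * p, min = centre - 0.5, max = centre + 0.5),
           nrow = n[k], ncol = p)
  }))
  y <- factor(rep(seq_along(n), times = n))
  list(x = x, y = y)
}

# Wrapper sketch: validate `family`, then forward `...` to the generator.
simdata_sketch <- function(family = c("uniform", "normal"), ...) {
  family <- match.arg(family)
  switch(family,
    uniform = sim_uniform_sketch(...),
    normal  = sim_normal_sketch(...)
  )
}

out <- simdata_sketch(family = "normal", n = c(10, 20), means = c(0, 1), seed = 42)
dim(out$x)    # rows = sum(n), columns = p
table(out$y)  # number of observations per class
```

Using `match.arg` means an unsupported family fails early with an informative error, and forwarding `...` keeps the wrapper independent of each generator's own arguments.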

Examples

data_normal <- simdata(family = "normal", n = c(10, 20), mean = c(0, 1), cov = diag(2), seed = 42)
data_uniform <- simdata(family = "uniform", delta = 2, seed = 42)
data_friedman <- simdata(family = "friedman", experiment = 4, seed = 42)