rmcfs (version 1.1.1)

artificial.data: Creates artificial dataset

Description

Creates a data.frame with artificial data. The last six columns are nominal and highly correlated with the feature 'class'. With the default arguments, the data set consists of objects from 3 classes, A, B and C, which contain 40, 20 and 10 objects, respectively (70 objects altogether).

For each object, 6 binary features (A1, A2, B1, B2, C1 and C2) are created, and they are 'ideally' or 'almost ideally' correlated with the class feature. If an object's 'class' equals 'A', its features A1 and A2 are set to the class value 'A'; otherwise A1 = A2 = 0. For classes 'B' and 'C' the processing is analogous, but some random corruption is introduced: for 2 observations from class 'B', the value 'B' in both attributes B1/B2 is replaced by '0', and for 4 observations from class 'C', the value 'C' in both attributes C1/C2 is replaced by '0'. The number of corrupted values for each class is defined by the corruption parameter.

The data also contains rnd.features = 500 additional random numerical features with values uniformly distributed on [0, 1].
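
The construction described above can be sketched in base R. This is a hypothetical re-implementation for illustration only, not the rmcfs source: the function name make_artificial, the column names rnd1, rnd2, ..., the column order, and the choice to draw the corrupted rows independently for each column are all assumptions.

```r
# Hypothetical sketch of the described construction; not the rmcfs source code.
make_artificial <- function(rnd.features = 500, size = c(40, 20, 10),
                            corruption = c(0, 2, 4), seed = NA) {
  if (!is.na(seed)) set.seed(seed)  # assumption: seed is passed to set.seed()
  labels <- c("A", "B", "C")
  class <- rep(labels, times = size)
  n <- length(class)

  # Random numerical features, uniform on [0, 1] (column names are made up).
  d <- as.data.frame(matrix(runif(n * rnd.features), nrow = n))
  names(d) <- paste0("rnd", seq_len(rnd.features))

  for (k in seq_along(labels)) {
    members <- which(class == labels[k])
    for (j in 1:2) {
      # The feature equals the class label for members of the class, "0" otherwise.
      col <- ifelse(class == labels[k], labels[k], "0")
      # Assumption: corrupted rows are sampled separately for each column.
      hit <- members[sample.int(length(members), corruption[k])]
      col[hit] <- "0"
      d[[paste0(labels[k], j)]] <- factor(col)
    }
  }
  d$class <- factor(class)
  d
}

d <- make_artificial(rnd.features = 5, seed = 1)
print(table(d$class))
print(table(d$B1, d$class))
```

Cross-tabulating a nominal feature against class, as in the last line, is a quick way to see the 'almost ideal' correlation: every object outside class B has B1 equal to '0', and only the corrupted class B objects deviate.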

Usage

artificial.data(rnd.features = 500, size = c(40, 20, 10), corruption = c(0, 2, 4), seed = NA)

Arguments

rnd.features
number of numerical random features.
size
size of classes A, B, and C.
corruption
defines the number of corrupted values for each pair of columns A1/A2, B1/B2, C1/C2.
seed
seed for random number generator.
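
The arguments can be combined as in the following sketch, which relies only on the signature shown under Usage. It assumes that the rmcfs package is installed and that a non-NA seed makes repeated calls reproducible; the argument values are purely illustrative.

```r
# Requires the rmcfs package; the block is skipped when it is not installed.
if (requireNamespace("rmcfs", quietly = TRUE)) {
  # Smaller data set: 20 random features, classes of 30, 20 and 10 objects,
  # with 0, 1 and 3 corrupted values for A1/A2, B1/B2 and C1/C2.
  d1 <- rmcfs::artificial.data(rnd.features = 20, size = c(30, 20, 10),
                               corruption = c(0, 1, 3), seed = 42)
  d2 <- rmcfs::artificial.data(rnd.features = 20, size = c(30, 20, 10),
                               corruption = c(0, 1, 3), seed = 42)
  print(dim(d1))
  # Assumption: calling with the same seed yields the same data.
  print(identical(d1, d2))
}
```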

Value

A data.frame containing the artificial data set described above.

Examples

  library(rmcfs)
  # Generate the default data set with 500 random numerical features
  d <- artificial.data(rnd.features = 500)
  showme(d)
