CORElearn (version 0.9.29)

regDataGen: Artificial data for testing regression algorithms

Description

The generator produces regression data with 4 discrete and 7 numeric attributes.

Usage

regDataGen(noInst, t1=0.8, t2=0.5, noise=0.1)

Arguments

noInst
Number of instances to generate.
t1, t2
Parameters controlling the shape of the distribution.
noise
Parameter controlling the amount of noise. If noise=0, there is no noise. If noise=1, the signal and the noise have the same level.

Value

Returns a data.frame with noInst rows and 11 columns. The ranges of values of the attributes and the response are:

  • a1: 0, 1
  • a2: a, b, c, d
  • a3: 0, 1 (irrelevant)
  • a4: a, b, c, d (irrelevant)
  • x1: numeric (Gaussian with different sd for each class)
  • x2: numeric (Gaussian with different sd for each class)
  • x3: numeric (Gaussian, irrelevant)
  • x4: numeric from [0,1]
  • x5: numeric from [0,1]
  • x6: numeric from [0,1]
  • response: numeric
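
The shape and column types of the returned data.frame can be checked directly. The following is a minimal sketch that assumes the CORElearn package is installed; the sample size of 50 is arbitrary.

```r
library(CORElearn)

# generate a small sample with the default t1, t2 and noise
regData <- regDataGen(noInst = 50)

# number of rows and columns of the result
dim(regData)

# class of each column: discrete attributes vs. numeric attributes
sapply(regData, class)

# summary of the numeric response
summary(regData$response)
```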

Details

The response variable is derived from x4, x5, x6 using two different functions. The choice depends on a hidden variable, which determines whether the response value follows a linear dependency $f=x_4-2x_5+3x_6$ or a nonlinear one $f=\cos(4\pi x_4)(2x_5-3x_6)$. Attributes a1, a2, x1, x2 carry some information on the hidden variable, depending on parameters t1, t2. At the extreme values t1=0.5 and t2=1, these attributes carry no information. On the other hand, if t1=0 or t1=1, then each of the attributes a1, a2 carries full information. If t2=0, then each of x1, x2 carries full information on the hidden variable. The attributes x4, x5, x6 are available with a noise level depending on parameter noise. If noise=0, there is no noise. If noise=1, then the signal and the noise have the same level.
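
The two dependencies described above can be written out in base R. This is an illustrative sketch only, not the package's implementation: the names f_linear, f_nonlinear and hidden are hypothetical, and the way the hidden variable is drawn and mapped to the two functions is an assumption made for the example. Noise on x4, x5, x6 is not modelled here.

```r
# the two response functions from the description
f_linear    <- function(x4, x5, x6) x4 - 2 * x5 + 3 * x6
f_nonlinear <- function(x4, x5, x6) cos(4 * pi * x4) * (2 * x5 - 3 * x6)

set.seed(1)
n  <- 5
x4 <- runif(n)
x5 <- runif(n)
x6 <- runif(n)

# hypothetical hidden variable: TRUE selects the linear function,
# FALSE the nonlinear one (assumed coding, for illustration only)
hidden <- sample(c(TRUE, FALSE), n, replace = TRUE)

# noise-free response for each instance
f <- ifelse(hidden, f_linear(x4, x5, x6), f_nonlinear(x4, x5, x6))
data.frame(x4, x5, x6, hidden, f)
```

In the generated data the hidden variable itself is not a column; only a1, a2, x1, x2 give partial information about it, as controlled by t1 and t2.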

See Also

classDataGen, ordDataGen, CoreModel

Examples

# load the package and prepare a regression data set
library(CORElearn)
regData <- regDataGen(noInst=200)

# build regression tree similar to CART
modelRT <- CoreModel(response ~ ., regData, model="regTree", modelTypeReg=1)
print(modelRT)
