SimpleSimulation is a support function for generating multiresolution datasets.
All simulation types have three layers, except type 4, which has four layers.
In the type-1 simulation, all individuals belong to the same homogeneous partition in the first layer.
The type-2 simulation has four homogeneous partitions in a second layer. Each partition has its own models.
The type-3 simulation has eight homogeneous partitions in a third layer. Each partition has its own models.
The type-4 simulation has one homogeneous partition in a second layer, four homogeneous partitions in a third layer, and eight homogeneous partitions in a fourth layer. Each partition has its own model.
The type-5 simulation is similar to the type-4 simulation, but Y=h(X) is an exponential function.
The type-6 simulation is similar to the type-4 simulation, but Y=h(X) is a polynomial function whose degree is set by the degree parameter.
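The layered structure described above can be illustrated with a small base-R sketch. This is not the package's generator: it is a toy stand-in that mimics the type-2 description (one cluster in the first layer, four homogeneous partitions in the second, each with its own linear model). The choice of four features, the rule that partition k depends only on feature k, and the noise level are all assumptions made for illustration.

```r
# Toy sketch only: NOT the generator used by SimpleSimulation.
set.seed(1)
indvN <- 50   # individuals per homogeneous partition (illustrative value)
nPart <- 4    # four partitions in the second layer, as described for type 2
n     <- indvN * nPart

# Layer 1: everyone in one cluster; layer 2: four partitions.
clsLayer <- cbind(rep(1L, n), rep(seq_len(nPart), each = indvN))

# Assumption: four features, and partition k depends only on feature k.
X    <- matrix(rnorm(n * nPart), nrow = n, ncol = nPart)
part <- clsLayer[, 2]
Y    <- X[cbind(seq_len(n), part)] + rnorm(n, sd = 0.05)

# Fit one linear model per partition and find its strongest feature.
strongest <- sapply(seq_len(nPart), function(k) {
  idx <- part == k
  fit <- lm(Y[idx] ~ X[idx, ])
  unname(which.max(abs(coef(fit)[-1])))   # drop the intercept first
})
```

Fitting a single model on all rows would blur the four partition-specific relationships, which is why the sketch fits each partition separately.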
SimpleSimulation(indvN = 10000, type = 1, degree = 2)
The function returns a multiresolution dataset.
DataT$X[i,d] is the value of feature d for individual i.
DataT$Y[i] is the value of the target variable for individual i, which we want to fit with the linear model DataT$Y ~ DataT$X.
clsLayer[i,j] is the cluster ID of individual i at layer j.
clsLayer[i,1] is the cluster ID in the first layer, where all individuals typically belong to a single cluster.
DataT$TrueFeature[i] is equal to d if the true feature that DataT$Y[i] depends on is DataT$X[i,d-1].
Note that d = 1 is reserved for the intercept value in a linear model.
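The offset between TrueFeature and the columns of X is easy to misread, so here is a minimal sketch of one way to interpret it. The list below is a hand-built stand-in with made-up values, not output of SimpleSimulation; only the field names X, Y, and TrueFeature come from the description above.

```r
# Hand-built stand-in with the documented field names; values are made up.
set.seed(2)
n <- 100
X <- matrix(rnorm(n * 3), nrow = n, ncol = 3)
Y <- 2 * X[, 2] + rnorm(n, sd = 0.05)     # Y depends on feature 2 only
DataT <- list(X = X, Y = Y,
              TrueFeature = rep(3L, n))   # d = 3 refers to X[, d - 1] = X[, 2]

# In the fitted model, coefficient 1 is the intercept and
# coefficient d belongs to column d - 1 of X.
fit   <- lm(DataT$Y ~ DataT$X)
d_hat <- unname(which.max(abs(coef(fit)[-1]))) + 1L   # shift past the intercept
```

Here d_hat indexes the coefficient vector of the fitted model, so it lines up with TrueFeature directly, while the matching column of X is d_hat - 1.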
indvN is the number of individuals per homogeneous partition.
type is the type of simulation dataset. There are six types (1 to 6).
degree is the degree parameter of the polynomial function for the type-6 simulation.
# Running SimpleSimulation to generate a dataset.
DataT<-SimpleSimulation(100,type=1)
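A further call using the degree argument might look like the following sketch. It uses only the argument names from the usage line above. The check with exists() is an addition for illustration, so that the snippet does nothing when the function has not been loaded, and the value degree = 3 is arbitrary.

```r
# Run only if SimpleSimulation is available in the current session.
hasFun <- exists("SimpleSimulation", mode = "function")
if (hasFun) {
  # Type 6 uses a polynomial Y = h(X); degree = 3 is an arbitrary choice.
  DataT6 <- SimpleSimulation(indvN = 100, type = 6, degree = 3)
  str(DataT6, max.level = 1)   # list the top-level components
}
```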