Usage
mixture_generator(n = 130, p = 100, ratio = 0.4, max_compl = 1,
valid = 1000, positive = 0.6, sigma_Y = 10, sigma_X = NULL,
R2 = NULL, R2Y = 0.4, meanvar = NULL, sigmavar = NULL, lambda = 3,
Amax = NULL, lambdapois = 10, gamma = FALSE, gammashape = 1,
gammascale = 0.5, tp1 = 1, tp2 = 1, tp3 = 1, pb = 0, nonlin = 0,
pnonlin = 2, scale = TRUE, Z = NULL)
Arguments
n
the number of individuals in the learning dataset
p
the number of covariates (without the response)
ratio
the proportion of covariates that are explained by other covariates (dependent covariates)
max_compl
the number of covariates in each subregression
valid
the size of the validation sample
positive
the proportion of positive coefficients in both the main regression and the subregressions
sigma_Y
standard deviation for the noise of the regression
sigma_X
standard deviation of the noise in the subregressions (the same value for all of them). Ignored if gamma = TRUE or if R2 is not NULL
R2
the strength of the subregressions
R2Y
the strength of the main regression
meanvar
vector of means for the covariates.
sigmavar
standard deviation of the covariates.
lambda
parameter of the distribution that defines the number of components in the Gaussian mixture models
Amax
the maximum number of covariates with non-zero coefficients in the regression
lambdapois
parameter of the Poisson distribution used to generate the coefficients in the subregressions
gamma
boolean; if TRUE, sigma_X is generated as a gamma-distributed vector of length p
gammashape
shape parameter of the gamma distribution (used only if gamma = TRUE)
gammascale
scale parameter of the gamma distribution (used only if gamma = TRUE)
tp1
the ratio of right-side covariates (those that explain other covariates in the subregressions) allowed to have a non-zero coefficient in the regression
tp2
the ratio of left-side covariates (those explained by other covariates in the subregressions) allowed to have a non-zero coefficient in the regression
tp3
the ratio of strictly independent covariates allowed to have a non-zero coefficient in the regression
pb
generates Y in a heuristic way that will cause some issues with correlations.
nonlin
whether to use a non-linear structure (half squared, half log). If non-zero, it is the probability of using the power pnonlin instead of the log
pnonlin
the power used when a non-linear structure is requested
scale
boolean to scale X before computing Y
Z
the adjacency matrix to obtain
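Examples
The following base R sketch illustrates the kind of data structure that n, p, ratio, max_compl, positive, sigma_X, sigma_Y and Z describe. It is not the implementation of mixture_generator. It rests on several labeled assumptions: the dependent covariates are placed in the first columns, the explanatory covariates are plain Gaussians (the function itself draws them from Gaussian mixtures), every coefficient is +1 or -1, and Z[i, j] = 1 is read as "covariate j is explained by covariate i".

```r
# Illustrative sketch only: NOT the internals of mixture_generator.
set.seed(1)

n <- 130; p <- 10          # sample size and number of covariates (small p for readability)
ratio <- 0.4               # proportion of dependent covariates
max_compl <- 1             # explanatory covariates per subregression
positive <- 0.6            # proportion of positive coefficients
sigma_X <- 0.5             # noise sd in the subregressions (hypothetical value)
sigma_Y <- 10              # noise sd in the main regression

p_dep <- round(ratio * p)            # number of dependent covariates
dep   <- seq_len(p_dep)              # assumption: dependent covariates come first
indep <- setdiff(seq_len(p), dep)    # the remaining covariates are explanatory

# Explanatory covariates: standard Gaussians (simplifying assumption).
X <- matrix(0, nrow = n, ncol = p)
X[, indep] <- rnorm(n * length(indep))

# Adjacency matrix, assumed convention: Z[i, j] = 1 if covariate i explains covariate j.
Z <- matrix(0, nrow = p, ncol = p)
for (j in dep) {
  parents <- indep[sample.int(length(indep), max_compl)]
  signs   <- ifelse(runif(max_compl) < positive, 1, -1)
  Z[parents, j] <- 1
  # Subregression: linear combination of the parents plus Gaussian noise.
  X[, j] <- X[, parents, drop = FALSE] %*% signs + rnorm(n, sd = sigma_X)
}

# Main regression on all covariates, with signed unit coefficients.
beta <- ifelse(runif(p) < positive, 1, -1)
Y <- as.vector(X %*% beta + rnorm(n, sd = sigma_Y))
```

In this sketch each dependent column of Z contains exactly max_compl ones, so sum(Z) equals p_dep * max_compl. With the package loaded, a comparable dataset would presumably be requested through the arguments documented above, for example mixture_generator(n = 130, p = 100, ratio = 0.4, max_compl = 1).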