Generates data

Usage

fngendata(
  n,
  bin.k = 0,
  bin.prob = NULL,
  cont.k = 5,
  y.gen.bin.k = 0,
  y.gen.bin.beta = NULL,
  y.gen.bin.prob = NULL,
  y.gen.cont.beta = c(2, 4, 6, 8, 10),
  y.gen.cont.mod.k = 0,
  y.gen.cont.mod.beta = matrix(c(-2, 2), 1, 2, byrow = TRUE),
  y.gen.bin.mod.prob = c(0.5),
  y.gen.cont.sp.k = 0,
  y.gen.cont.sp.groups = 2,
  y.gen.cont.sp.rho = 0.2,
  y.gen.cont.sp.dif = 1,
  intercept.beta = 0,
  Xgenerator.method = "simstudy",
  corMatrix = 100,
  rho = NULL,
  corstr = NULL,
  condnumber = 1,
  mu = 0,
  muvect = NULL,
  sd = 1,
  sdvect = NULL,
  error.dist = "normal",
  error.dist.mean = 0,
  error.dist.sd = 1,
  error.dist.snr = NULL,
  error.dist.df = 2,
  dataframe = TRUE,
  seed = NULL
)

Value

A data.frame or a list composed of a matrix of independent variable values
(X), a vector of the dependent variable values (y), a vector of coefficient
values (coefficients), a vector of non-zero coefficients (y.coefficients),
and a vector of the error values (epsilon).

Arguments

n: Number of individuals.
bin.k: Number of binary variables not used for generating y.
bin.prob: A vector of probabilities with length equal to bin.k.
cont.k: Number of continuous variables not used for generating y.
y.gen.bin.k: Number of binary variables used for generating y.
y.gen.bin.beta: A vector of coefficients with length equal to bin.k used to generate y.
y.gen.bin.prob: A vector of probabilities with length equal to y.gen.bin.k.
y.gen.cont.beta: A vector of coefficients with length equal to cont.k used to generate y.
y.gen.cont.mod.k: Experimental.
y.gen.cont.mod.beta: Experimental.
y.gen.bin.mod.prob: Experimental.
y.gen.cont.sp.k: Experimental.
y.gen.cont.sp.groups: Experimental.
y.gen.cont.sp.rho: Experimental.
y.gen.cont.sp.dif: Experimental.
intercept.beta: Value of the constant (intercept) used to generate y.
Xgenerator.method: Method used to generate the X data ("simstudy" or "svd").
corMatrix: A positive number for alphad (see rcorrmatrix), NULL, or a correlation matrix, to be used when Xgenerator.method is "simstudy".
rho: Correlation coefficient, -1 <= rho <= 1. Used when Xgenerator.method is "simstudy" and corMatrix is NULL.
corstr: Correlation structure ("ind", "cs" or "ar1") (see genCorData), to be used when Xgenerator.method is "simstudy" and corMatrix is NULL.
condnumber: A value for the condition number of the X matrix, to be used when Xgenerator.method is "svd".
mu: The mean of the variables. To be used when all variables have the same mean.
muvect: A vector of means. To be used when variables have different means. The length of muvect must be k.
sd: Standard deviation of the variables. To be used when all variables have the same standard deviation.
sdvect: A vector of standard deviations. To be used when variables have different standard deviations. The length of sdvect must be k.
error.dist: Distribution of the error: "normal" for the normal distribution or "t" for Student's t distribution.
error.dist.mean: Mean value used when error.dist is "normal".
error.dist.sd: Standard deviation value used when error.dist is "normal".
error.dist.snr: Signal-to-noise ratio. If not NULL, the value of error.dist.sd is ignored and the standard deviation is determined from the ratio instead.
error.dist.df: Degrees of freedom used when error.dist is "t".
dataframe: Logical. If TRUE, the default, returns a data.frame; otherwise returns a list.
seed: A seed for reproducibility.

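The error.dist.snr argument replaces error.dist.sd with a value derived from the requested ratio. The page does not say which definition of signal-to-noise ratio is used, so the base R sketch below assumes the common one, var(signal) / var(noise). It is only an illustration of the idea, not the package's code, and every object name in it is hypothetical.

```r
# Sketch only: ASSUMES snr = var(signal) / var(noise).
# fngendata may use a different definition internally.
set.seed(230676)

n <- 100
X <- matrix(rnorm(n * 3), nrow = n)   # three continuous predictors
beta <- c(3, 6, 9)                    # coefficients used to build y
intercept <- 1

signal <- drop(X %*% beta) + intercept  # noise-free part of y
snr <- 5                                # target signal-to-noise ratio

# Noise standard deviation implied by the target ratio (assumed definition)
error_sd <- sqrt(var(signal) / snr)

epsilon <- rnorm(n, mean = 0, sd = error_sd)
y <- signal + epsilon

# The realised ratio varies around the target because epsilon is random
c(error_sd = error_sd, empirical_snr = var(signal) / var(epsilon))
```

Under this assumed definition a larger ratio gives a smaller error standard deviation, which is consistent with error.dist.sd being ignored whenever error.dist.snr is supplied.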
Author

Jorge Cabral, jorgecabral@ua.pt

Examples

dataGCEstim <- fngendata(
  n = 100, cont.k = 2,
  y.gen.cont.beta = c(3, 6, 9),
  intercept.beta = 1,
  Xgenerator.method = "svd", condnumber = 50,
  mu = 0, sd = 1,
  error.dist = "normal", error.dist.mean = 0, error.dist.snr = 5,
  dataframe = TRUE, seed = 230676
)

summary(dataGCEstim)
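
The dataframe argument also allows a list to be returned. The sketch below repeats the call above with dataframe = FALSE and inspects the result. It assumes the function is loaded from the GCEstim package and that the list elements carry the names given in the Value section (X, y, coefficients, y.coefficients, epsilon); the object name sim is illustrative.

```r
# Sketch: same settings as the example above, but returning a list.
# Assumes fngendata comes from the GCEstim package; skipped if it is not installed.
if (requireNamespace("GCEstim", quietly = TRUE)) {
  sim <- GCEstim::fngendata(
    n = 100, cont.k = 2,
    y.gen.cont.beta = c(3, 6, 9),
    intercept.beta = 1,
    Xgenerator.method = "svd", condnumber = 50,
    mu = 0, sd = 1,
    error.dist = "normal", error.dist.mean = 0, error.dist.snr = 5,
    dataframe = FALSE, seed = 230676
  )

  # Top-level structure of the returned list
  str(sim, max.level = 1)

  # Element names as described in the Value section (assumed, not verified here)
  names(sim)
}
```

The list form is convenient when the generating coefficients or the error vector are needed alongside X and y, for example to compare estimates with the values used in the simulation.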